2023
Papers
Wang, X. M., Troje, N. F.
Interacting with people and three-dimensional objects depicted on a screen is perceptually different from interacting with them in real life. This difference resides in their corresponding perceptual spaces: The former involves pictorial space, the latter, visual space. Studies have examined the perceptual geometry of pictorial or visual space, but rarely their connection. In the current study, we connected visual and pictorial space using an exocentric pointing task and investigated how binocular disparity and motion parallax affect this connection. In a virtual environment, we displayed a pointing virtual character within a frame that functioned as a window frame or a picture frame but could also adopt intermediate states by differentially controlling binocular disparity and motion parallax. We asked participants to rotate him to point at targets located in visual space. In Experiment 1, we manipulated the virtual character's distance to the screen and found that binocular disparity determines the distance relationship between visual and pictorial space. In Experiment 2, we changed the participants' viewing angle relative to the screen and found that motion parallax determines the directional relationship between visual and pictorial space. We discuss the theoretical and practical implications of our results in the context of video-mediated telecommunication.
Troje, N. F., Chang, D. H. F.
Life motion, the active movements of people and animals, contains a wealth of information that is potentially accessible to the visual system of an observer. Biological motion point-light displays have been widely used to study both the information contained in the stimulus and the visual mechanisms that make use of it. Biological motion conveys motion-mediated dynamic shape which in turn can be used for identification and recognition of the agent, but it also contains local visual invariants that humans and other animals use as a general detection system that signals the presence of other agents in the visual environment. Here, we are reviewing more recent research on behavioural, neurophysiological, and genetic aspects of this life detection system and discuss its functional significance in the light of earlier hypotheses.
Troje, N. F.
Natural, dynamic eye contact behaviour is critical to social interaction but is dysfunctional in video conferencing. In analysing the problem, I introduce the concept of directionality and emphasize the critical role of motion parallax. I then sketch approaches towards re-establishing directionality and enabling natural, dynamic eye contact in video conferences.
Ghorbani, S., Ferstl, Y., Holden, D., Troje, N. F., Carbonneau, M. A.
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the same input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning across 19 different styles.
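As a rough illustration of the latent-space style manipulation described above (a minimal sketch only; the function name and embedding size are hypothetical and not the ZeroEGGS API):

```python
import numpy as np

def blend_styles(style_a: np.ndarray, style_b: np.ndarray, alpha: float) -> np.ndarray:
    """Linearly interpolate two style embeddings (hypothetical encoder output)."""
    return (1.0 - alpha) * style_a + alpha * style_b

# Stand-ins for embeddings a trained style encoder would produce from short example clips.
style_happy = np.random.randn(64)
style_tired = np.random.randn(64)

mixed = blend_styles(style_happy, style_tired, alpha=0.3)  # blend two styles
scaled = 1.5 * style_happy                                  # exaggerate a style by scaling
```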
Symposia and Published Abstracts
Schwarzer, G., Preissler, L., Troje, N. F.
From early in life, infants encounter real, physical objects in their environment as well as pictorial representations of those objects. When presented with both object formats at the same time, 7-month-old infants looked significantly longer at real objects than matched pictures, thus demonstrating a visual preference for real objects over pictures (DeLoache et al., 1979; Gerhard et al., 2016). However, the cause for this real-object preference is not yet fully understood. While a much-discussed cause is related to the assumption that, compared to pictures, real objects provide greater affordances for actions (Gibson, 1979; Snow et al., 2014), little is known about the relative contribution of motion parallax (MP) which infants use for depth perception of real objects from 4 months on (Nawrot et al., 2009). Therefore, we investigated whether MP plays a role in infants' preference for looking at real objects by presenting real objects paired with their pictures using iPads simulating MP. If MP is a contributing factor in infants' preference for real objects, we expected them to prefer looking at pictorial representations with MP over pictorial representations without MP, similar to how they prefer looking at real objects over pictorial representations without MP. We tested 7- to 8-month-old infants (N = 24, data collection is still ongoing) and used 4 real objects and their pictorial counterparts as stimuli. The pictorial counterparts were presented on an iPad Pro 11 using an application that allowed MP to be switched on or off. Each infant viewed each of the four objects in every possible pair combination of the three object formats (real object, iPad with MP, iPad without MP) for 15 seconds, with left-right position counterbalanced within the pair (total of 24 pairs). Accumulated looking times for each object format served as dependent measure. A repeated measures ANOVA revealed a significant effect of object format, F(2) = 11.93, p < .001, with the longest looking times to the real objects (M = 73.8226 s, SD = 23.4143 s), followed by the looking times to the iPad with MP (M = 59.3270 s, SD = 17.2040 s), and those to the iPad without MP (M = 52.6248 s, SD = 17.5361 s). Post-hoc comparisons showed significantly longer looking times to the real object compared to the iPad presentations without MP (p < .001). Looking times to the iPad presentations with MP were nearly significantly longer than those to the iPad presentations without MP (p = .06). However, looking times to the real objects still significantly exceeded those to the iPad presentations with MP (p < .05).
2022
Papers
Peng, W., Cracco, E., Troje, N. F., Brass, M.
Previous research suggests that belief in free will correlates with intentionality attribution. However, whether belief in free will is also related to more basic social processes is unknown. Based on evidence that biological motion contains intentionality cues that observers spontaneously extract, we investigate whether people who believe more in free will, or in related constructs, such as dualism and determinism, would be better at picking up such cues and therefore at detecting biological agents hidden in noise, or would be more inclined to detect intentionality cues and therefore to detect biological agents even when there are none. Signal detection theory was used to measure participants' ability to detect biological motion from scrambled background noise (d') and their response bias (c) in doing so. In two experiments, we found that belief in determinism and belief in dualism, but not belief in free will, were associated with biological motion perception. However, no causal effect was found when experimentally manipulating free will-related beliefs. In sum, our results show that biological motion perception, a low-level social process, is related to high-level beliefs about dualism and determinism.
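A minimal sketch of the signal detection measures named in this abstract (not code from the study): sensitivity d' and response bias c computed from hit and false-alarm counts, with a standard log-linear correction.

```python
from scipy.stats import norm

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return sensitivity (d') and response bias (c) from raw trial counts.
    A log-linear correction avoids infinite z-scores at rates of 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
    criterion = -0.5 * (norm.ppf(hit_rate) + norm.ppf(fa_rate))
    return d_prime, criterion

# Example: 40 signal trials with 30 hits, 40 noise trials with 10 false alarms.
print(sdt_measures(hits=30, misses=10, false_alarms=10, correct_rejections=30))
```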
Martin, L., Stein, K., Kubera, K., Troje, N. F., Fuchs, T.
Background. Motor abnormalities occur in the majority of persons with schizophrenia but are generally neglected in clinical care. Psychiatric diagnostics fail to include quantifiable motor variables and few assessment tools examine full-body movement. Methods. We assessed full-body movement during gait of 20 patients and 20 controls with motion capture technology, symptom load (PANSS, BPRS), and Neurological Soft Signs (NSS). In a data-driven analysis, participants' motion patterns were quantified and compared between groups. Resulting movement markers (MM) were correlated with the clinical assessment. Results. We identified 16 quantifiable MM of schizophrenia. While walking, patients and controls display significant differences in movement patterns related to posture, velocity, regularity of gait as well as sway, flexibility, and integration of body parts. Specifically, the adjustment of body sides, limbs, and movement direction was affected. The MM remain significant when controlling for medication load. They are systematically related to NSS. Conclusion. Results add assessment tools, analysis methods, and theory-independent MM to the growing body of research on motor abnormalities in schizophrenia.
Cui, A. -X., Troje, N. F., Cuddy, L. L.
Most listeners possess sophisticated knowledge about the music around them without being aware of it or its intricacies. Previous research shows that we develop such knowledge through exposure. This knowledge can then be assessed using behavioral and neurophysiological measures. It remains unknown, however, which neurophysiological measures accompany the development of musical long-term knowledge. In this series of experiments, we first identified a potential ERP marker of musical long-term knowledge by comparing EEG activity following musically unexpected and expected tones within the context of known music (n = 30). We then validated the marker by showing that it does not differentiate between such tones within the context of unknown music (n = 34). In a third experiment, we exposed participants to unknown music (n = 40) and compared EEG data before and after exposure to explore effects of time. Although listeners' behavior indicated musical long-term knowledge, we did not find any effects of time on the ERP marker. Instead, the relationship between behavioral and EEG data suggests musical long-term knowledge may have formed before we could confirm its presence through behavioral measures. Listeners are thus not only knowledgeable about music but seem to also be incredibly fast music learners.
Ben-Ami, S., Gupta, P., Yadav, M., Shah, P., Talwar, G., Paswan, S., Ganesh, S., Troje, N. F., Sinha, P.
The long-standing nativist vs. empiricist debate asks a foundational question in epistemology: does our knowledge arise through experience or is it available innately? Studies that probe the sensitivity of newborns and patients recovering from congenital blindness are central in informing this dialogue. One of the most robust sensitivities our visual system possesses is to "biological motion", the movement patterns of humans and other vertebrates. Various biological motion perception skills (such as distinguishing between movement of human and non-human animals, or between upright and inverted human movement) become evident within the first months of life. The mechanisms of acquiring these capabilities, and specifically the contribution of visual experience to their development, are still under debate. We had the opportunity to directly examine the role of visual experience in biological motion perception, by testing what level of sensitivity is present immediately upon onset of sight following years of congenital visual deprivation. Two congenitally blind patients who underwent sight-restorative cataract-removal surgery late in life (at the ages of 7 and 20 years) were tested before and after sight restoration. The patients were shown displays of walking humans, pigeons, and cats, and asked to describe what they saw. Visual recognition of movement patterns emerged immediately upon eye-opening following surgery, when the patients spontaneously began to identify human, but not animal, biological motion. This recognition ability was evident contemporaneously for upright and inverted human displays. These findings suggest that visual recognition of human motion patterns may not critically depend on visual experience, as it was evident upon first exposure to unobstructed sight in patients with very limited prior visual exposure, and furthermore, was not limited to the typical (upright) orientation of humans in real-life settings.
Symposia and Published Abstracts
Vembukumar, V., Troje, N. F.
Recently, the frequency of human interaction mediated by screens has exponentially increased. The prevalence of screen-based communication makes it important to understand how a person's gaze is perceived when viewed on screen. A better understanding of screen-based gaze perception would facilitate the enhancement of communication tools, thus increasing communication efficiency. An abundance of literature demonstrates that motion parallax is an important depth cue that enables participants to perceive objects more accurately in real and virtual worlds. A study conducted in virtual reality has shown that motion parallax had a greater effect in evoking the sense of presence when compared to stereopsis. The presence of motion parallax provides context to directional cues such as hand gestures or eye gaze. This study aims to examine whether the addition of simulated motion parallax increases participants' sensitivity to the gaze direction of faces on a screen. If adding motion parallax can increase the sensitivity to gaze perception, cheaper and more widely accessible solutions could be developed to integrate the depth cue into standard video communication, thereby enhancing communication efficiency. By using motion capture technology to track a user's head location, we control a virtual camera whose movement in a virtual environment corresponds with the user's own motions. This allows the image on the screen to be dynamically rendered based on their position. The study examines two conditions: one uses head tracking to simulate the view onto a 3D head behind a window framed by the computer screen. In the other condition we present static images of the same head on the screen. In the Window condition the avatar's head and eye gaze are set to varying angles and users are asked to move themselves into the line of sight of the avatar. Once they reach a location at which they perceive eye contact their head location is recorded. The angular difference between the participant's head location and their expected position is then analyzed. In the static condition, the avatar's total gaze (head gaze plus eye gaze) is shown at angles between -11° and 11° around the fronto-parallel view, and participants indicated whether they perceived the head looking to their left or right. Both datasets are modelled by normal distributions such that means (accuracy) and variance (precision) of eye gaze perception can be assessed and compared. From pilot data, in the static condition, participants had an average mean of 0.87° and an average standard deviation of 6.2°. In the motion parallax condition, the average mean was 3.5° and the average standard deviation was 3.1°. The difference between the standard deviations in the two conditions was found to be statistically significant (p = 0.02).
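The fitting procedure described above (modelling left/right judgments with a cumulative Gaussian whose mean gives accuracy and whose standard deviation gives precision) could look roughly like the sketch below; the angles and response proportions are made-up placeholders, not the pilot data.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Hypothetical gaze angles (deg) and proportions of "looking to my right" responses.
angles = np.array([-11.0, -7.0, -3.0, 0.0, 3.0, 7.0, 11.0])
p_right = np.array([0.05, 0.15, 0.35, 0.55, 0.75, 0.90, 0.97])

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian psychometric function."""
    return norm.cdf(x, loc=mu, scale=sigma)

(mu, sigma), _ = curve_fit(cum_gauss, angles, p_right, p0=[0.0, 5.0])
print(f"bias (mean) = {mu:.2f} deg, precision (SD) = {sigma:.2f} deg")
```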
Troje, N. F., Esser, M., Thaler, A.
The angular orientation of a face pictured in half-profile view is systematically overestimated by the human observer. For instance, a 35 deg view is estimated to be oriented around 45 deg. What is the cause for this perceptual orientation bias? Here, we address three related questions. (1) Is the phenomenon specific to pictorial projections or does it also occur in 3D space? (2) Can it be explained with the depth compression expected when the vantage point of the observer is closer to the picture than the point of projection? (3) Does the visual system use a shape prior that does not match the elliptical horizontal cross section of a typical head? Exp. 1 was conducted in virtual reality. We used a method of adjustment ("orient this face into a 45° position"). We found the orientation bias was smaller than expected and only marginally different between picture and 3D conditions. In Exp. 2 we presented static pictures and systematically varied the vantage point of the observer relative to the point of projection of the picture. We observed a pronounced bias which was not dependent on the vantage point. In Exp. 3, we replicated the orientation bias with a non-facial object: a coffee mug with a handle that defined its orientation. We systematically modified the shape of the mug between circular and elliptical horizontal cross sections. Mugs were then presented either as static images or as short movies with the mug rotating about its vertical axis. Participants estimated orientation almost veridically for circular shapes and displayed predictable errors for other shapes. The shape-dependent orientation biases were much smaller for the movies compared to the pictures. We conclude: The visual system adopts the heuristic of a cylindrical head shape unless explicit information about its shape is provided, e.g., through structure-from-motion.
Funkhouser, A., Troje, N. F.
Due to the recent pandemic, video conferencing services (Skype, Zoom, Teams, etc.) have boomed as a medium of communication. Compared to face-to-face communication, communication through video conferencing is less efficient and more exhausting. We hypothesize that this is caused by a lack of directionality, which is the ability to discern how someone is oriented relative to ourselves. The lack of directionality becomes particularly obvious in the context of gaze direction and other deictic behaviors. No matter where a person is in front of their screen, that person only feels looked at if the other person is looking directly into the camera. Physical cues crucial to communication efficiency, such as pointing and eye gaze, rely heavily on directionality. Consequently, communicating through screens is less desirable than in real-life settings. Studies demonstrate that motion parallax facilitates the accurate evaluation of directionality. We hypothesize that adding directionality to video calls through motion parallax improves communication efficiency. We developed an app that adds motion parallax to a video chat with head avatars. The camera on the device tracks the user's head movements and a virtual camera inside the program follows the coordinates from the physical camera to simulate motion parallax as if the user is talking to the other person's avatar through a window. Communication efficiency is evaluated using a tangram game, where two people cooperate to finish 8 unique puzzles. Participants assume two roles: instructor and student. The instructors have 45 seconds to explain the solution to the puzzle while the student is unable to see it. Then, communication between the players is disabled while the student completes the puzzle. There are two viewing conditions, one with motion parallax and one without. Participants experience both during the experiment, presented in random order. We measure the puzzle-completion time of the student. We expect them to be quicker in the condition with motion parallax than in the condition without motion parallax, as the added directionality provides certain nonverbal cues otherwise unavailable in a non-motion-parallax condition. To account for potential differences in puzzle difficulty between trials, we performed a full analysis of variance to compare conditions for each puzzle. There were two factors, a within-subjects variable labeled "viewing condition" with two levels (motion parallax and non-motion parallax) and one between-subjects factor labeled "puzzle" with eight levels (puzzles one through eight). We expected that regardless of puzzle difficulty, the motion parallax condition would be faster overall. Should results prove to be promising, the real-world applications are quite wide, as video communication services are used in most settings nowadays. Communication efficiency is critical to business, academia, and more. Improving video communication would overall bolster productivity for everyone.
Esser, M., Thaler, A., Troje, N. F.
When looking at a picture of a human face in half-profile view, human observers tend to overestimate the orientation of the face. For instance, if oriented 30° with respect to the fronto-parallel view, observers asked to estimate the orientation indicate an angle of about 45°. What is the cause for this facial orientation bias? Is it specific to faces? Does it happen only in pictures? We tested three different hypotheses. (1) In pictures, the planar shape of the medium induces depth compression of the depicted content. (2) The optic array coming from an image is interpreted as if it came from a 3D object. If so, disparities between the picture's center-of-projection and the observer's vantage point cause depth distortions. (3) Unless additional information is provided, the visual system uses a shape prior that assumes a circular horizontal cross section, rather than matching the elliptical cross section of a typical head. Experiment 1 was conducted in virtual reality where we simulated both 3D heads and images of them. We used a method of adjustment ("orient this face into a 45° position"). We found the orientation bias was much smaller than expected and only marginally different between picture and 3D conditions. In Experiment 2 we presented static pictures and used a method of constant stimuli ("is the person oriented more or less than 45° from frontal view?"). We systematically varied the center-of-projection of the picture while keeping the observer's vantage point constant. We observed a pronounced bias, but that bias did not depend on the point of projection. We hypothesized that the differences in experienced orientation bias between Experiments 1 and 2 were due to participants having access to shape-from-motion in Experiment 1, but not in Experiment 2. In Experiment 3, we replicated the orientation bias with a non-facial object: a coffee mug with a handle that defined its orientation.
Other
Wang, X. M., Troje, N. F.
Interacting with people and three-dimensional objects depicted on a screen is perceptually different from interacting with them in real life. This difference resides in their corresponding perceptual spaces: The former involves pictorial space, the latter, visual space. Studies have examined the perceptual geometry of pictorial or visual space, but rarely their connection. In the current study, we connected visual and pictorial space using an exocentric pointing task and investigated how binocular disparity and motion parallax affect this connection. In a virtual environment, we displayed a pointing virtual character within a frame that functioned as a window frame or a picture frame but could also adopt intermediate states by differentially controlling binocular disparity and motion parallax. We asked participants to rotate him to point at targets located in visual space. In Experiment 1, we manipulated the virtual character's distance to the screen and found that binocular disparity determines the distance relationship between visual and pictorial space. In Experiment 2, we changed the participants' viewing angle relative to the screen and found that motion parallax determines the directional relationship between visual and pictorial space. We discuss the theoretical and practical implications of our results in the context of video-mediated telecommunication.
Ghorbani, S., Ferstl, Y., Holden, D., Troje, N. F., Carbonneau, M. A.
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the same input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning across 19 different styles.
2021
Papers
Peng, W., Cracco, E., Troje, N. F., Brass, M.
When observing point light walkers orthographically projected onto a frontoparallel plane, the direction in which they are walking is ambiguous. Nevertheless, observers more often perceive them as facing towards than as facing away from them. This phenomenon is known as the "facing-the-viewer bias" (FTV). Two interpretations of the facing-the-viewer bias exist in the literature: a top-down and a bottom-up interpretation. Support for the top-down interpretation comes from evidence that social anxiety correlates with the FTV bias. However, the direction of the relationship between the FTV bias and social anxiety is inconsistent across studies and evidence for a correlation has mostly been obtained with relatively small samples. Therefore, the first aim of the current study was to provide a strong test of the hypothesized relationship between social anxiety and the facing-the-viewer bias in a large sample of 200 participants recruited online. In addition, a second aim was to further extend top-down accounts by investigating if the FTV bias is also related to autistic traits. Our results replicate the FTV bias, showing that people indeed tend to perceive orthographically projected point light walkers as facing toward them. However, no correlation between the FTV bias and social interaction anxiety (tau = -.01, p = .86, BF = .18) or autistic traits (tau = -.0039, p = .45, BF = .18) was found. As such, our data cannot confirm the top-down interpretation of the facing-the-viewer bias.
Ghorbani, S., Mahdaviani, K., Thaler, A., Kording, K., Cook, D. J., Blohm, G., Troje, N. F.
Large high-quality datasets of human body shape and kinematics lay the foundation for modelling and simulation approaches in computer vision, computer graphics, and biomechanics. Creating datasets that combine naturalistic recordings with high-accuracy data about ground truth body shape and pose is challenging because different motion recording systems are either optimized for one or the other. We address this issue in our dataset by using different hardware systems to record partially overlapping information and synchronized data that lends itself to transfer learning. This multimodal dataset contains 9 hours of optical motion capture data, 17 hours of video data from 4 different points of view recorded by stationary and hand-held cameras, and 6.6 hours of inertial measurement unit data recorded from 60 female and 30 male actors performing a collection of 21 everyday actions and sports movements. The processed motion capture data is also available as realistic 3D human meshes. We anticipate uses for this dataset for research on human pose estimation, action recognition, motion modelling, gait analysis, and body shape reconstruction.
Eftekharifar, S., Thaler, A., Troje, N. F.
Cybersickness is an enduring problem for users of virtual environments. While it is generally assumed that cybersickness is caused by discrepancies in perceived self-motion between the visual and vestibular systems, little is known about the relative contribution of motion parallax and binocular disparity to the occurrence of cybersickness. We investigated the role of these two depth cues in cybersickness by simulating a roller-coaster ride using a head-mounted display. Participants could see the tracks via a frame placed at the front side of the roller-coaster cart. We manipulated the state of the frame, so it behaved like: 1) a window into the virtual scene, 2) a 2D screen, 3) and 4) a window for one of the two depth cues, and a 2D screen for the other. Participants completed the Simulator Sickness Questionnaire before and after the experiment, and verbally reported their level of discomfort at repeated intervals during the ride. Additionally, participants' electrodermal activity (EDA) was recorded. The results of the questionnaire and the verbal ratings revealed the largest increase in cybersickness when the frame behaved like a window, and the least increase when the frame behaved like a 2D screen. Cybersickness scores were at an intermediate level for the conditions where the frame simulated only one depth cue. This suggests that neither motion parallax nor binocular disparity had a more prominent effect on the severity of cybersickness. The EDA responses increased at about the same rate in all conditions, suggesting that EDA is not necessarily coupled with subjectively experienced cybersickness.
Chen, K., Ye, Y., Troje, N. F., Zhou, W.
There has been accumulating evidence of human social chemo-signaling, but the underlying mechanisms remain poorly understood. Considering the evolutionarily conserved roles of oxytocin and vasopressin in reproductive and social behaviors, we examined whether the two neuropeptides are involved in the subconscious processing of androsta-4,16-dien-3-one and estra-1,3,5(10),16-tetraen-3-ol, two human chemosignals that respectively convey masculinity and femininity to the targeted recipients. Psychophysical data collected from 216 heterosexual and homosexual men across 5 experiments totaling 1,056 testing sessions consistently showed that such chemosensory communications of sex were blocked by a competitive antagonist of both oxytocin and vasopressin receptors called atosiban, administered nasally. On the other hand, intranasal oxytocin, but not vasopressin, modulated the decodings of androstadienone and estratetraenol in manners that were dose-dependent, nonmonotonic, and contingent upon the recipients' social proficiency. Taken together, these findings establish a causal link between neuroendocrine factors and subconscious chemosensory communications of sex in humans.
Chang, D. H. F., Troje, N. F., Ikegaya, Y., Fujita, I., Ban, H.
We sought to understand the spatiotemporal characteristics of biological motion perception. We presented observers with biological motion walkers that differed solely in terms of biological form or solely in terms of kinematics. We added a third walker in which we rendered the form undiagnostic and also obscured a critical kinematic feature. Participants were asked to discriminate the facing direction of the stimuli while their magnetoencephalographic responses were concurrently imaged. We found that two univariate response components can be observed within the first 400 ms, with the "early" response propagating in a feed-forward manner from early to extrastriate cortex, and the "late" response following a somewhat reversed order. Moreover, while univariate responses, particularly in inferior temporal cortex, show biological motion form-specificity only after 300 ms, multivariate patterns specific to form can be well discriminated from those for local cues as early as 100 ms after stimulus onset. By finally examining the representation similarity of fMRI and MEG patterned responses, we show that early responses to biological motion are most likely sourced to occipital cortex while later responses likely originate from extrastriate body areas. We conclude that unlike mechanisms governing the perception of biological form-from-motion, those underlying the extraction of biological kinematics may be located in more radial portions of cortex or in even deeper regions of the brain that cannot be well accessed by MEG.
Bachmann, J., Zabicki, A., Gradl, S., Kurz, J., Munzert, J., Troje, N. F., Krueger, B.
This study compared how two virtual display conditions of human body expressions influenced explicit and implicit dimensions of emotion perception and response behavior in women and men. Two avatars displayed emotional interactions (angry, sad, affectionate, happy) in a "pictorial" condition depicting the emotional interactive partners on a screen within a virtual environment and a "visual" condition allowing participants to share space with the avatars, thereby enhancing co-presence and agency. Subsequently to the stimulus presentation, explicit valence perception and response tendency (i.e. the explicit tendency to avoid or approach the situation) were assessed on rating scales. Implicit responses, i.e. postural and autonomic responses towards the observed interactions were measured by means of postural displacement and changes of skin conductance. Results showed that self-reported presence differed between pictorial and visual conditions, however, it was not correlated with skin conductance responses. Valence perception was only marginally influenced by the virtual condition and explicit response behavior not at all. There were gender-mediated effects on postural response tendencies as well as gender differences in explicit response behavior but not in valence perception. Exploratory analyses revealed a link between valence perception and preferred behavioral response in women but not in men. We conclude that the display condition seems to influence automatic motivational tendencies but not higher-level cognitive evaluations. Moreover, intragroup differences in explicit and implicit response behavior highlight the importance of individual factors beyond gender.
Proceedings
Eftekharifar, S., Thaler, A., Troje, N. F.
Exposure to nature has been shown to have a positive effect on people's mental health. Little research has compared restorative effects of simulated nature presented by different media. Here, we investigated stress recovery when viewing a computer-generated nature setting presented in visual and pictorial space in virtual reality. Participants experienced a stress induction task and were then put into one of two relaxation scenarios: they either viewed the nature scene in visual space (they were immersed into it; presence condition), or they viewed a large depiction of it in pictorial space (picture condition). Participants' affective state was assessed before and after stress induction, and after relaxation using the ZIPERS questionnaire. We additionally recorded electrodermal activity as a measure of physiological arousal. The results revealed that relaxation led to an increase in positive affect scores and a decrease in electrodermal activity only in the presence condition. The negative affect scores decreased similarly and significantly in both conditions. Our results show that restoration is more effective in visual than in pictorial space.
Bebko, A. O., Thaler, A., Troje, N. F.
Realistic virtual characters are important for many applications. The SMPL body model is based on 3D body scans and uses body shape and pose-dependent blendshapes to achieve realistic human animations [3]. Recently, a large database of SMPL animations called AMASS has been released [4]. Here, we present a tool that allows these animations to be viewed and controlled in Unity, called the BioMotionLab SMPL Unity Player (bmlSUP). This tool provides an easy interface to load and display AMASS animations in 3D immersive environments and mixed reality. We present the functionality, uses, and possible applications of this new tool.
Symposia and Published Abstracts
Ross, G., Dowling, B., Troje, N. F., Fischer, S., Graham, R.
Movement screens are frequently used to distinguish between stylistic differences in movement patterns such as pathological abnormalities or skill-related differences in sport [1]; however, abnormalities are often visually detected, resulting in poor reliability [2]. Therefore, our previous research has focused on the development of an objective movement competency scoring tool to score movement based on kinematic data [3]. Currently, the method requires optical motion capture, which is expensive and time-consuming to use, creating a barrier for adoption within industry. The purpose of this study was to assess the objective tool's performance using data that can be collected easily and inexpensively in the field using wearable sensors such as inertial measurement units (IMUs). The secondary purpose of this study was to refine the architecture of the tool to optimize classification rates. Motion capture data from 542 athletes performing seven dynamic screening movements were analyzed. A PCA-based pattern recognition technique, ensemble feature selection, and machine learning algorithms with the Euclidean norm of the segment linear accelerations and angular velocities as inputs were used to classify athletes based on skill level. Results: Depending on the movement, using metrics achievable with IMUs and a linear discriminant analysis, 75.1-84.7% of athletes were accurately classified as either elite or novice (Figure 1). We have provided proof that an objective, data-driven method can detect meaningful differences from a movement screening battery when using data that can be collected using IMUs. This provides a large methodological advance as these can be collected in the field using wearable sensors. The method offers an objective, inexpensive tool to enhance screening, assessment, and rehabilitation in sport and clinical settings. 1. Cook et al., 2014. Int. J. Sports Phys. Ther. 9, 549-563. 2. McCunn et al., 2016. Sports Med. 46, 763-781. 3. Ross et al., 2018. Med. Sci. Sports Exerc. 50, 1457-1464.
Other
Peng, W., Cracco, E., Troje, N. F., Brass, M.
Previous research suggests that belief in free will correlates positively with intention perception. However, whether belief in free will is also related to more basic social processes is unknown. Based on evidence that biological motion is an intention-carrier, we investigate if belief in free will and two related beliefs, namely belief in dualism and belief in determinism, are associated with biological motion perception. Signal Detection Theory (SDT) was used to measure participants' ability to detect biological motion from scrambled background noise (d') and their response bias (c) in doing so. In two experiments we found that belief in determinism and belief in dualism, but not belief in free will, were associated with the perception of biological motion. However, no causal relationship was found when experimentally manipulating free will-related beliefs. In general, our research suggests that basic social processes, like biological motion perception, can be predicted by high-level beliefs.
2020
Papers
Ross, G. B., Dowling, B., Troje, N. F., Fischer, S. L., Graham, R. B.
Movement screens are frequently used to distinguish differences in movement patterns such as pathological abnormalities or skill-related differences in sport; however, abnormalities are often visually detected by a human assessor, resulting in poor reliability. Therefore, our previous research has focused on the development of an objective movement assessment tool to classify elite and novice athletes' kinematic data using machine learning algorithms. Classifying elite and novice athletes can be beneficial to objectively detect differences in movement patterns between the athletes, which can then be used to provide higher quality feedback to athletes and their coaches. Currently, the method requires optical motion capture, which is expensive and time-consuming to use, creating a barrier for adoption within industry. Therefore, the purpose of this study was to assess whether machine learning could classify athletes as elite or novice using data that can be collected easily and inexpensively in the field using wearable sensors such as inertial measurement units (IMUs). A secondary purpose of this study was to refine the architecture of the tool to optimize classification rates. Motion capture data from 542 athletes performing seven dynamic screening movements were analyzed. A principal component analysis (PCA)-based pattern recognition technique and machine learning algorithms with the Euclidean norm of the segment linear accelerations and angular velocities as inputs were used to classify athletes based on skill level. Depending on the movement, using metrics achievable with IMUs and a linear discriminant analysis, 78.3-86.8% of athletes were accurately classified as elite or novice. We have provided evidence that suggests an objective, data-driven method can detect meaningful differences during a movement screening battery when using data that can be collected using IMUs, thus providing a large methodological advance as these can be collected in the field using sensors. This method offers an objective, inexpensive tool that can be easily implemented in the field to potentially enhance screening, assessment, and rehabilitation in sport and clinical settings.
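A minimal sketch of this kind of pipeline (PCA for dimensionality reduction followed by linear discriminant analysis, evaluated with cross-validation); it is not the authors' code, and the feature matrix, labels, and dimensions below are invented placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Placeholder feature matrix: one row per athlete, columns are time-normalized
# Euclidean norms of segment accelerations/angular velocities, concatenated.
X = rng.standard_normal((542, 600))
y = rng.integers(0, 2, size=542)  # placeholder labels: 0 = novice, 1 = elite

clf = make_pipeline(PCA(n_components=20), LinearDiscriminantAnalysis())
scores = cross_val_score(clf, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```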
Rajendran, S. S., Bottari, D., Shareef, I., Pitchaimuthu, K., Sourav, S., Troje, N. F., Kekunnaya, R., Röder, B.
Visual input during the first years of life is vital for the development of numerous visual functions. Previous reports have shown that, while normal development of global motion perception seems to require visual input during an early sensitive period, biological motion (BM) detection does not seem to do so. A more complex form of BM processing is the identification of human actions. Here we tested whether the identification rather than detection of BM is experience dependent. A group of human participants who had been treated for congenital cataracts (of up to 18 years' duration, CC group) had to identify ten actions performed by human line figures. In addition, they performed a coherent motion (CM) detection task, which required identifying the direction of coherent motion amidst the movement of random dots. As controls, individuals with reversed developmental cataracts were included to distinguish effects of congenital vs. later deprivation. Moreover, normally sighted controls were tested both with vision blurred to match the visual acuity of the CC individuals (vision matched group) and with full sight (sighted control group). The CC group identified biological actions with an extraordinarily high accuracy (on average ~85% correct) and was indistinguishable from the vision matched control group. By contrast, CM processing impairments of the CC group persisted even after controlling for visual acuity. These results in the same individuals demonstrate an impressive resilience of biological motion processing to aberrant early visual experience and at the same time a sensitive period for the development of coherent motion processing.
Kurz, J., Helm, F., Troje, N. F., Munzert, J.
Correctly perceiving the movements of opponents is essential in everyday life as well as in many sports. Several studies have shown a better prediction performance for detailed stimuli compared to point-light displays (PLDs). However, it remains unclear whether differences in prediction performance result from explicit information about articulation or from information about body shape. We therefore presented three different types of stimuli (PLDs, stick figures, and avatars) with different amounts of available information of soccer players' run-ups. Stimulus presentation was faded out at ball contact. Participants had to react to the perceived shot direction with a full-body movement. Results showed no differences for time to ball contact between presentation modes. However, prediction performance was significantly better for avatars and stick figures compared to PLDs, but did not differ between avatars and stick figures, suggesting that explicit information about articulation of the major joints is crucial for better prediction performance, and plays a larger role than detailed information about body shape. We also tracked eye movements and found that gaze behavior for skinned avatars differed from that for PLDs and stick figures, with no significant differences between PLDs and stick figures. This effect was due to more and longer fixations on the head when skinned avatars were presented.
Karimpur, H., Eftekharifar, S., Troje, N. F., Fiehler, K.
An essential difference between pictorial space displayed as paintings, photos or computer screens, and the visual space experienced in the real world is that the observer has a defined location, and thus valid information about distance and direction of objects, in the latter but not in the former. Thus, egocentric information should be more reliable in visual space while allocentric information should be more reliable in pictorial space. The majority of studies relied on pictorial representations (images on a computer screen) leaving it unclear whether the same coding mechanisms apply in visual space. Using a memory-guided reaching task in virtual reality (VR), we investigated allocentric coding in both visual space (on a table in VR) and pictorial space (on a monitor which is on the table in VR). Our results suggest that the brain uses allocentric information to represent objects in both pictorial and visual space. Contrary to our hypothesis, the influence of allocentric cues was stronger in visual space than in pictorial space, also after controlling for retinal stimulus size, confounding allocentric cues, and differences in presentation depth. We discuss possible reasons for stronger allocentric coding in visual than pictorial space.
Helm, F., Canal-Bruland, R., Mann, D., Troje, N. F., Munzert, J.
A broad body of literature shows that athletes can effectively anticipate an opponent's action outcomes based on movement kinematics. To make themselves less "readable" to their opponents, athletes perform disguised actions. Because disguised actions raise the level of uncertainty about potential action outcomes, observers may rely more heavily on additional sources of information such as situational probability information. The current study therefore aimed to examine the relative contributions of the kinematic and situational probability information with different levels of disguised kinematics. More specifically, we tested whether the weighting of the informational sources (kinematic vs. probabilistic) shifts relative to the certainty of the available kinematic information. To this end, the ambiguity of the kinematic information of animated avatars of handball throwers was systematically manipulated using linear morphing. In a virtual-reality environment, trained novice observers (N=23) were asked to classify as quickly and accurately as possible whether observed throws were either genuine or disguised. In addition, we also systematically manipulated information about the performer's action preferences (AP) by explicitly informing participants about the performer's AP to disguise their throw (25%, 50%, and 75%). Results showed that when the kinematics were ambiguous observers relied more heavily on the probabilistic information. For the AP 25% condition, observers were more likely to report that ambiguous throws were genuine (p < 0.001), whereas they classified the ambiguous throws as being disguised in the AP 75% condition (p < 0.001). These findings suggest that observers rely more strongly on non-kinematic (situational probability) information when the reliability of the observable movement kinematics becomes less certain.
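The linear morphing manipulation described above can be illustrated with a short sketch (placeholder data and shapes; not the stimulus-generation code used in the study).

```python
import numpy as np

def morph_kinematics(genuine: np.ndarray, disguised: np.ndarray, w: float) -> np.ndarray:
    """Linear morph between two time-aligned pose sequences.

    genuine, disguised: arrays of shape (frames, joints, 3) holding joint positions;
    w = 0 returns the genuine throw, w = 1 the fully disguised throw.
    """
    return (1.0 - w) * genuine + w * disguised

# Placeholder sequences (200 frames, 20 joints); a 50% morph is maximally ambiguous.
genuine = np.random.randn(200, 20, 3)
disguised = np.random.randn(200, 20, 3)
ambiguous = morph_kinematics(genuine, disguised, w=0.5)
```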
Ghorbani, S., Wloka, C., Etemad, A., Brubaker, M. A., Troje, N. F.
We present a probabilistic framework to generate character animations based on weak control signals, such that the synthesized motions are realistic while retaining the stochastic nature of human movement. The proposed architecture, which is designed as a hierarchical recurrent model, maps each sub-sequence of motions into a stochastic latent code using a variational autoencoder extended over the temporal domain. We also propose an objective function which respects the impact of each joint on the pose and compares the joint angles based on angular distance. We use two novel quantitative protocols and human qualitative assessment to demonstrate the ability of our model to generate convincing and diverse periodic and non-periodic motion sequences without the need for strong control signals.
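An angular-distance term of the kind mentioned above could be sketched as follows; this is illustrative only, and the quaternion representation and joint weighting are assumptions rather than the paper's implementation.

```python
import numpy as np

def quat_angular_distance(q1: np.ndarray, q2: np.ndarray) -> np.ndarray:
    """Geodesic angle (radians) between unit quaternions along the last axis."""
    dot = np.clip(np.abs(np.sum(q1 * q2, axis=-1)), 0.0, 1.0)
    return 2.0 * np.arccos(dot)

def weighted_pose_error(pred: np.ndarray, target: np.ndarray, joint_weights: np.ndarray) -> float:
    """Mean angular error over joints, weighted by each joint's impact on the pose.
    pred and target have shape (joints, 4); joint_weights has shape (joints,)."""
    return float(np.mean(joint_weights * quat_angular_distance(pred, target)))

# Example with placeholder rotations for 5 joints (~45 deg error about the x-axis each).
pred = np.tile([1.0, 0.0, 0.0, 0.0], (5, 1))
target = np.tile([0.924, 0.383, 0.0, 0.0], (5, 1))
print(weighted_pose_error(pred, target, np.ones(5)))
```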
Eftekharifar, S., Thaler, A., Troje, N. F.
The sense of presence is defined as a subjective feeling of being situated in an environment and occupying a location therein. The sense of presence is a defining feature of virtual environments. In two experiments, we aimed at investigating the relative contribution of motion parallax and stereopsis to the sense of presence, using two versions of the classic pit room paradigm in virtual reality. In Experiment 1, participants were asked to cross a deep abyss between two platforms on a narrow plank. Participants completed the task under three experimental conditions: 1) when the lateral component of motion parallax was disabled, 2) when stereopsis was disabled, and 3) when both stereopsis and motion parallax were available. As a subjective measure of presence, participants completed a presence questionnaire after each condition. Additionally, electrodermal activity (EDA) was recorded as a measure of anxiety. In Experiment 1, EDA responses were significantly higher with restricted motion parallax as compared to the other two conditions. However, no difference was observed in terms of the subjective presence scores across the three conditions. To test whether these results were due to the nature of the environment, participants in Experiment 2 experienced a slightly less stressful environment where they were asked to stand on a ledge and drop virtual balls to specified targets into the abyss. The same experimental manipulations were used as in Experiment 1. Again, the EDA responses were significantly higher when motion parallax was impaired as compared to when stereopsis was disabled. The results of the presence questionnaire revealed a reduced sense of presence with impaired motion parallax compared to the normal viewing condition. Across the two experiments, our results unexpectedly demonstrate that presence in the virtual environments is not necessarily linked to EDA responses elicited by affective situations as has been implied by earlier studies.
Bebko, A., Troje, N. F.
Advances in virtual reality (VR) technology have made it a valuable new tool for vision and perception researchers. Coding VR experiments from scratch can be difficult and time-consuming so researchers rely on software such as Unity game engine to create and edit virtual scenes. However, Unity lacks built-in tools for controlling experiments. Existing third-party add-ins require complicated scripts to define experiments. This can be difficult and requires advanced coding knowledge, especially for multifactorial experimental designs. In this paper, we describe a new free and open-source tool called the BiomotionLab Toolkit for Unity Experiments (bmlTUX) that provides a simple interface for controlling experiments in Unity. In contrast to existing tools, bmlTUX provides a graphical interface to automatically handle combinatorics, counterbalancing, randomization, mixed designs, and blocking of trial order. The toolbox works "out-of-the-box" since simple experiments can be created with almost no coding. Furthermore, multiple design configurations can be swapped with a drag-and-drop interface allowing researchers to test new configurations iteratively while maintaining the ability to easily revert to previous configurations. Despite its simplicity, bmlTUX remains highly flexible and customizable, catering to coding novices and experts alike.
Proceedings
Wang, X. M., Thaler, A., Eftekharifar, S., Bebko, A. O., Troje, N. F.
Stereopsis and motion parallax provide depth information, capable of producing more realistic user experiences after being integrated into a flat screen (e.g. immersive virtual reality). Extensive research shows that stereoscopic screens increase realism, while few studies have investigated users' responses to parallax screens without stereopsis. In this study, we examined users' evaluations of screens with only parallax or stereopsis. We found that with only parallax, the mapping between observer motion and viewpoint change should be around 0.6 for a more realistic perceptual experience, and observers were less sensitive to stereoscopic distortions as a result of a different interpupillary distance scaling.
Thaler, A., Bieg, A., Mahmood, N., Black, M. J., Mohler, B. J., Troje, N. F.
Animated virtual characters are essential to many applications. Little is known so far about biological and personality inferences made from a virtual character's body shape and motion. Here, we investigated how sex-specific differences in walking style relate to the perceived attractiveness and confidence of male and female virtual characters. The characters were generated by reconstructing body shape and walking motion from optical motion capture data. The results suggest that sexual dimorphism in walking style plays a different role in attributing biological and personality traits to male and female virtual characters. This finding has important implications for virtual character animation.
Sepas-Moghaddam, A., Ghorbani, S., Troje, N. F., Etemad, A.
Gait recognition, referring to the identification of individuals based on the manner in which they walk, can be very challenging due to the variations in the viewpoint of the camera and the appearance of individuals. Current state-of-the-art methods for gait recognition have been dominated by deep learning models, notably those based on partial feature representations. In this context, we propose a novel deep network, learning to transfer multi-scale partial gait representations using capsules to obtain more discriminative gait features. Our network first obtains multi-scale partial representations using a state-of-the-art deep partial feature extractor. It then recurrently learns the correlations and co-occurrences of the patterns among the partial features in forward and backward directions using Bidirectional Gated Recurrent Units (BGRU). Finally, a capsule network is adopted to learn deeper part-whole relationships and assigns more weights to the more relevant features while ignoring the spurious dimensions, thus obtaining final features that are more robust to both viewing and appearance changes. The performance of our method has been extensively tested on two gait recognition datasets, CASIA-B and OU-MVLP, using four challenging test protocols. The results of our method have been compared to the state-of-the-art gait recognition solutions, showing the superiority of our model, notably when facing challenging viewing and carrying conditions.
Symposia and Published Abstracts
Wang, X. M., Bebko, A. O., Thaler, A., Troje, N. F.
The human visual system seems to be well able to interpret the layout of objects in pictures but is less accurate in determining the location from which the picture was taken. Here, we used an exocentric pointing task in immersive virtual reality (VR) to infer the perceived distance between the observer and a pointing virtual character (VC). Participants adjusted orientations of the VC to a highlighted target positioned on a 2.5 m radius circle. We presented the VC inside a frontoparallel frame at the center of this circle. The frame either behaved like a picture or like a window. We also used two intermediate conditions: either stereopsis behaved as if the frame was a window and motion parallax behaved as if it was a picture, or vice versa. The VC was rendered at different distances relative to the frame (-2, -1, 0, and 1 m) as determined by projected size and perspective projection, as well as stereopsis and motion parallax information, depending on condition. Perceived distance was inferred from the adjusted pointing direction and the known location of the target. Perceived distance deviated systematically from the intended distance. The data could be modeled accurately (r^2 mean = 0.93, SD = 0.08) after taking a second parameter into account: a depth compression factor. We found that if the frame did not provide stereopsis, perceived distance varied little as a function of intended distance, was estimated to be closely behind the frame, and there was little depth compression. However, when the frame provided stereopsis, perceived distance was close to the intended distance when taking considerable depth compression into account. This study demonstrated that when viewing a picture, observers perceive depicted objects to be slightly behind the picture plane even if size, perspective, and motion parallax indicate different distances, and that stereopsis dominates perceived distance.
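A minimal sketch of the kind of model fit described above, assuming perceived distance is a linear function of intended distance with a compression factor; the numbers are placeholders, not the study's data.

```python
import numpy as np

# Intended character distances relative to the frame (m) and hypothetical perceived
# distances inferred from the adjusted pointing directions.
d_intended = np.array([-2.0, -1.0, 0.0, 1.0])
d_perceived = np.array([-0.9, -0.6, -0.2, 0.3])

# Fit d_perceived = d0 + k * d_intended, where k acts as a depth-compression factor.
k, d0 = np.polyfit(d_intended, d_perceived, 1)
residuals = d_perceived - (d0 + k * d_intended)
r_squared = 1.0 - residuals.var() / d_perceived.var()
print(f"compression factor k = {k:.2f}, offset d0 = {d0:.2f} m, r^2 = {r_squared:.2f}")
```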
Thaler, A., Bieg, A., Mahmood, N., Black, M. J., Mohler, B. J., Troje, N. F.
Human gait patterns are rich in socially relevant information. While many studies have investigated sex-specific differences in walking style, little is known about how sexual dimorphism relates to the perceived attractiveness and confidence of a person. In two studies, 40 observers (20 female, 20 male) rated the attractiveness and another 36 observers (18 female, 18 male) rated the confidence of 50 men and 50 women from the bmlRUB motion capture database, each presented in three different ways in virtual reality: (a) as a 3D virtual character with each actor's individual shape and walking motion reconstructed from optical motion capture data using the MoSh algorithm (Loper et al. 2014, SIGGRAPH Asia), (b) as a static virtual character, and (c) as a walking stick-figure (Troje 2002, JOV). Correlations between all 12 sets of ratings (2 walker sex x 2 participant sex x 3 presentation types) of the two datasets revealed that sexual dimorphism in walking style plays a different role in male and female walkers for attractiveness and confidence ratings. Sexual dimorphism dominates female attractiveness and male confidence assigned to animated virtual characters and stick-figures. The more feminine a woman walks, the more attractive she is rated; the more masculine a man walks, the more confident he is rated. Perceived male attractiveness and female confidence, on the other hand, are determined by increased vertical body movements which make the walkers appear bouncy and energetic. High ratings of the static virtual characters are characterised by tall and slim body shapes for male and female attractiveness, and female confidence, and tall and strong body shapes for male confidence (as compared to small and heavy body shapes). Sexual dimorphism seems to play a different role in attributing biological and personality traits to male and female walkers, but male and female observers agree on their ratings.
Rajendran, S. S., Bottari, D., Shareef, I., Pitchaimuthu, K., Sourav, S., Troje, N. F., Kekunnaya, R., Röder, B.
Visual input during developmental years is vital for the maturation of numerous visual functions. Previous reports have shown that, while normal development of global motion perception seems to require visual input during an early sensitive period, biological motion detection does not seem to do so. A more complex form of biological motion processing is the identification of human actions. Here we tested whether the identification, rather than the detection, of biological motion is experience dependent. A group of human participants who had been treated for congenital cataracts, in some cases only after long-lasting deprivation (up to 14 years; CC group), had to identify ten actions performed by human line figures. In addition, they performed a coherent motion detection task (CM task), which required them to identify the direction of coherent motion amidst the movement of random dots. As controls, individuals with developmental cataracts were included to control for the timing of the visual deprivation. Moreover, normally sighted controls were tested both with vision blurred to match the visual acuity of the CC individuals (vision matched group) and with full sight (sighted controls group). The CC group identified biological actions with extraordinarily high accuracy (~85%) and was indistinguishable from the vision matched group. By contrast, coherent motion processing impairments of the CC group persisted even after controlling for visual acuity. These results, obtained in the same individuals, demonstrate an impressive resilience of biological motion processing to aberrant early visual experience and, at the same time, a sensitive period for the development of global motion processing in early ontogeny.
Bebko, A. O., Troje, N. F.
Advances in virtual reality (VR) technology have provided a wealth of valuable new approaches to vision researchers. VR offers a critical new depth cue, active motion parallax, that provides the observer with a location in the virtual scene that behaves like true locations do: It changes in predictable ways as the observer moves. The contingency between observer motion and visual stimulation is critical and technically challenging and makes coding VR experiments from scratch impractical. Therefore, researchers typically use software such as the Unity game engine to create and edit virtual scenes. However, Unity lacks built-in tools for controlling experiments, and existing third-party add-ins require substantial scripting and coding knowledge to design even the simplest of experiments, especially for multifactorial designs. Here, we describe a new free and open-source tool called the BiomotionLab Toolkit for Unity Experiments (bmlTUX). Unlike existing tools, our toolkit provides a graphical interface for configuring factorial experimental designs and turning them into executable experiments. New experiments work "out-of-the-box" and can be created with fewer than twenty lines of code. The toolkit can automatically handle the combinatorics of both random and counterbalanced factors, mixed designs with within- and between-subject factors, and blocking, repetition, and randomization of trial order. A well-defined API makes it easy for users to interface their custom-developed stimulus generation with the toolkit. Experiments can store multiple configurations that can be swapped with a drag-and-drop interface. During runtime, the experimenter can interactively control the flow of trials and monitor the progression of the experiment. Despite its simplicity, bmlTUX remains highly flexible and customizable, catering to both novice and advanced coders. The toolkit simplifies the process of getting experiments up and running quickly without the hassle of complicated scripting.
Other
Wang, J. Z., Badler, N., Berthouze, N., Gilmore, R. O., Johnson, K. L., Lapedriza, A., Lu, X., Troje, N. F.
Developing computational methods for bodily expressed emotion understanding can benefit from knowledge and approaches of multiple fields, including computer vision, robotics, psychology/psychiatry, graphics, data mining, machine learning, and movement analysis. The panel, consisting of active researchers in some closely-related fields, attempts to open a discussion on the future of this new and exciting research area. This paper documents the opinions expressed by the individual panelists.
Ghorbani, S., Mahdaviani, K., Thaler, A., Kording, K., Cook, D. J., Blohm, G., Troje, N. F.
Human movements are both an area of intense study and the basis of many applications such as character animation. For many applications, it is crucial to identify movements from videos or analyze datasets of movements. Here we introduce a new human Motion and Video dataset MoVi, which we make available publicly. It contains 60 female and 30 male actors performing a collection of 20 predefined everyday actions and sports movements, and one self-chosen movement. In five capture rounds, the same actors and movements were recorded using different hardware systems, including an optical motion capture system, video cameras, and inertial measurement units (IMU). For some of the capture rounds, the actors were recorded when wearing natural clothing, for the other rounds they wore minimal clothing. In total, our dataset contains 9 hours of motion capture data, 17 hours of video data from 4 different points of view (including one hand-held camera), and 6.6 hours of IMU data. In this paper, we describe how the dataset was collected and post-processed, and we present state-of-the-art estimates of skeletal motions and full-body shape deformations associated with skeletal motion. We discuss examples of potential studies that this dataset could enable.
2019
Papers
Troje, N. F.
Virtual reality (VR) has always fascinated vision researchers and lay people as it blurs the otherwise very obvious and robust boundaries between reality itself and pictorial representations of it. In contrast to all other visual renderings, it provides the viewer with a well-defined, well-behaving location in the rendered scenery. This difference is critical. Rather than representing just a gradual improvement of rendering quality, VR technology offers an important qualitative change in the way we can study perception.
Larson, D. R., Paulter, N. G., Troje, N. F.
The detection performance of a walk-through metal detector (WTMD) is affected not only by the electromagnetic properties and size and shape of the test objects, but potentially also by the type of motion of the test object through the portal of the WTMD. This motion, it has been argued, can contribute to the uncertainty in detecting threat objects being carried through the WTMD. Typical laboratory-based testing uses a robotic system, or similar, to push a test object through the portal with a trajectory that is a straight line and has a constant velocity. This testing, although reproducible and accurate, does not test for those trajectories that are representative of natural body motion. We report the effects of nonrectilinear trajectories on the detection performance of WTMDs.
Kenny, S., Mahmood, N., Honda, C., Black, M. J., Troje, N. F.
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person's movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight, while they pushed, lifted, and threw objects. From these data, we estimated both the kinematics of the actions as well as the performer's individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. Using these stimuli we conducted three experiments in an immersive virtual reality environment. First, a group of participants detected which of two stimuli was inconsistent. Performance was very low, and results were only marginally significant. Next, a second group of participants rated perceived attractiveness, eeriness, and humanness of consistent and inconsistent stimuli, but these judgements of animation characteristics were not affected by consistency of the stimuli. Finally, a third group of participants rated properties of the objects rather than of the performers. Here, we found strong influences of shape-motion inconsistency on perceived weight and thrown distance of objects. This suggests that the visual system relies on its knowledge of shape and motion and that these components are assimilated into an altered perception of the action outcome. We propose that the visual system attempts to resist inconsistent interpretations of human animations. Actions involving object manipulations present an opportunity for the visual system to reinterpret the introduced inconsistencies as a change in the dynamics of an object rather than as an unexpected combination of body shape and body motion.
Proceedings
Mahmood, N., Ghorbani, N., Troje, N. F., Pons-Moll, G., Black, M. J.
Large datasets are the cornerstone of recent advances in computer vision using deep learning. In contrast, existing human motion capture (mocap) datasets are small and the motions limited, hampering progress on learning models of human motion. While there are many different datasets available, they each use a different parameterization of the body, making it difficult to integrate them into a single meta dataset. To address this, we introduce AMASS, a large and varied database of human motion that unifies 15 different optical marker-based mocap datasets by representing them within a common framework and parameterization. We achieve this using a new method, MoSh++, that converts mocap data into realistic 3D human meshes represented by a rigged body model; here we use SMPL, which is widely used and provides a standard skeletal representation as well as a fully rigged surface mesh. The method works for arbitrary marker sets, while recovering soft-tissue dynamics and realistic hand motion. We evaluate MoSh++ and tune its hyperparameters using a new dataset of 4D body scans that are jointly recorded with marker-based mocap. The consistent representation of AMASS makes it readily useful for animation, visualization, and generating training data for deep learning. Our dataset is significantly richer than previous human motion collections, having more than 40 hours of motion data, spanning over 300 subjects and more than 11,000 motions; it will be made publicly available to the research community.
Ghorbani, S., Etemad, A., Troje, N. F.
Optical marker-based motion capture is a vital tool in applications such as motion and behavioral analysis, animation, and biomechanics. Labelling, that is, assigning optical markers to the pre-defined positions on the body, is a time consuming and labour intensive post-processing part of current motion capture pipelines. The problem can be considered as a ranking process in which markers shuffled by an unknown permutation matrix are sorted to recover the correct order. In this paper, we present a framework for automatic marker labelling which first estimates a permutation matrix for each individual frame using a differentiable permutation learning model and then utilizes temporal consistency to identify and correct remaining labelling errors. Experiments conducted on the test data show the effectiveness of our framework.
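To illustrate the permutation view of the labelling problem in code, here is a simplified sketch, not the authors' differentiable model: the predicted soft assignment matrix is a random placeholder, the temporal-consistency step is omitted, and markers are relabelled for a single frame by solving a linear assignment problem.

# Illustrative sketch: recover marker order from a predicted (soft)
# assignment matrix by solving a linear assignment problem per frame.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_markers = 41
frame = rng.normal(size=(n_markers, 3))               # shuffled 3D marker positions (placeholder)
soft_assignment = rng.random((n_markers, n_markers))  # stand-in for a model's label-vs-marker scores

# Hungarian algorithm: maximise the total assignment score to obtain a hard permutation.
rows, cols = linear_sum_assignment(soft_assignment, maximize=True)
permutation = np.zeros((n_markers, n_markers))
permutation[rows, cols] = 1.0

# Apply the permutation so that row i of `labelled` is the marker assigned to label i.
labelled = permutation @ frame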
Symposia and Published Abstracts
Kurz, J., Reiser, M., Troje, N. F., Munzert, J.
Anticipation of left- compared to right-sided actions of an opponent seems to be more difficult. The present study aimed to investigate whether prediction accuracy and gaze behavior of left- vs. right-footed penalties differ from each other in a representative experimental setting. 29 participants (soccer goalkeepers, soccer players, non-soccer players) predicted shot direction (left/right) of left- and right-footed penalties from a goalkeeper's perspective. Stimuli were presented in life-size on a large screen (3.2 x 2.1 m) and occluded at ball contact. Participants had to perform a full-body movement towards the predicted shot direction. Accuracy was defined in terms of the correct response direction. Percentage of time of gaze was examined for five areas of interest (head, upper body, hip, supporting leg, shooting leg). A 2 (condition) x 3 (group) ANOVA for accuracy revealed a significant main effect for condition (F(1,27) = 5.3, p < .05), with higher accuracy for right- (M = 68%) compared to left-footed (M = 64%) penalties. All other main effects and interactions did not attain significance. A 2 (condition) x 3 (group) x 5 (area) ANOVA for gaze revealed a significant main effect for area (F(4,104) = 4.0, p < .05), showing the longest viewing time toward the shooting leg. All other main effects and interactions did not attain significance. The present results indicate higher accuracy for right- compared to left-footed penalties. However, gaze behavior did not differ between left- and right-footed penalties. It can be argued that information processing is different for left- vs. right-sided actions.
Ghorbani, N., Etemad, A., Troje, N. F.
Optical marker-based motion capture is a vital tool in applications such as motion and behavioral analysis, animation, and biomechanics. Labelling, that is, assigning optical markers to the pre-defined positions on the body, is a time consuming and labor-intensive postprocessing part of current motion capture pipelines. The problem can be considered as a ranking process in which markers shuffled by an unknown permutation matrix are sorted to recover the correct order. In this work, we present a framework for automatic marker labelling which first estimates a permutation matrix for each individual frame using a differentiable permutation learning model and then utilizes temporal consistency to identify and correct remaining labelling errors. Experiments conducted on the test data show the effectiveness of our framework.
Eftekharifar, S., Troje, N. F.
The sense of presence is highly intertwined with virtual reality (VR) and is defined as the subjective feeling of being in an environment even when users are physically situated in another. Several depth cues seem to be involved in creating the sense of presence in VR. Motion parallax and stereopsis are considered essential parts of immersive virtual environments. However, their relative contribution to creating the sense of presence is unclear. In two experiments, we attempted to answer this question using two versions of the classic pit room paradigm. In the first experiment, participants were asked to cross a deep abyss between two platforms on a narrow plank. Participants completed the task under three experimental conditions: 1) the lateral component of motion parallax was disabled, 2) stereopsis was disabled, 3) normal VR with both stereopsis and motion parallax. As a subjective measure of presence, participants responded to a presence questionnaire after each condition while their electrodermal activity (EDA) was also recorded as a measure of stress and anxiety in stressful environments. The importance of motion parallax was shown by EDA (F[2,54] = 6.71; p < 0.005). Questionnaire scores, however, did not show any difference among the conditions (F[2,54] = 0.04; n.s.). We applied the same experimental manipulations to a second experiment. In a slightly less stressful situation, participants were asked to stand on a ledge and drop virtual balls onto specified targets in the abyss. Results of both the presence questionnaire (F[2,36] = 6.39; p < 0.05) and EDA (F[2,36] = 8.19; p < 0.005) demonstrated the importance of motion parallax over stereopsis. Both experiments showed that in VR, motion parallax is a more important depth cue than stereopsis in terms of the fear response as measured by EDA. Presence questionnaires also revealed the importance of motion parallax to the sense of presence in the second experiment.
Eftekharifar, S., Troje, N. F.
Stereopsis and active motion parallax are two of the main perceptual factors provided by head mounted displays to create a sense of presence in virtual environments. However, their relative contribution to the sense of presence is not clear and existing results are somewhat controversial. Here, we study the contribution of stereopsis and active motion parallax using variants of the classic pit room paradigm in two experiments. In the first, participants were required to cross a deep abyss between two platforms on a narrow plank. They completed the task under three experimental conditions: (1) Standard VR (both motion parallax and stereopsis were available); (2) The lateral component of motion parallax was disabled; (3) Stereopsis was disabled. Participants responded to a presence questionnaire after each condition and their electrodermal activity (EDA) was recorded as a measure of stress and anxiety in a threatening situation. Results revealed a main effect of condition and the importance of motion parallax (F[2,54] = 6.71; p < 0.005). Questionnaire results, on the other hand, did not show an effect of experimental condition (F[2,54] = 0.04; n.s.). In the second experiment, we applied a similar paradigm in a less stressful context. Participants were standing on the ledge over the pit and dropped a ball trying to hit a target on the ground. Experimental conditions and dependent measures were similar to experiment 1. Both EDA (F[2,36] = 8.19; p < 0.005) and the presence questionnaire (F[2,36] = 6.39; p < 0.05) revealed a main effect of condition. Motion parallax affected the EDA and questionnaire scores more than stereopsis. The results from this study suggest that in VR, motion parallax is a more efficient depth cue than stereopsis in terms of the fear response as measured by EDA. Presence questionnaires also showed the importance of motion parallax in the second experiment.
Cui, A.-X., Troje, N. F., Cuddy, L. L.
Listeners are keenly aware of statistical regularities embedded in music (Kuhn & Dienes, 2005), an awareness or knowledge that develops with cultural exposure (Lantz, Cuddy, & Kim, 2014). However, such knowledge may also be acquired within the timespan of a lab study (Loui, Wessel, & Hudson Kam, 2010). Here, our goal was to uncover potential neural correlates of this acquisition. We measured perceptual and neural responses during a probe-tone task that required listeners to learn an unfamiliar pitch distribution during a 30-min exposure phase. Forty participants gave ratings to probe tones following a melodic context before and after exposure to the to-be-learned distribution. Probe tones were categorized by whether they occurred in the probe-tone context and whether they occurred during exposure. While participants gave probe-tone ratings, we recorded their EEG data using 128-electrode EGI Hydrocel Geodesic Sensor Nets. Probe tone ratings were influenced by the local tone distribution heard in the probe-tone context but also by the tone distribution of the entire music genre. After exposure, tones occurring only during exposure received higher ratings than those which never occurred in the genre. In previous work we have shown that participants' brain activity in the time window of 380 to 450 ms after probe tone onset, associated with the P3b component, captures participants' long-term knowledge about musical regularities. We thus expected a closer correspondence of this component to probe-tone ratings after exposure. However, it more closely corresponded to probe-tone ratings before exposure. Taken together, our results suggest that participants are able to gain knowledge about musical regularities after short exposure. Neural correlates of long-term knowledge begin to emerge after a short timespan. Subsequent research should aim to measure the longevity of this knowledge, and consider the implications of our results for the interpretation of the P3b component.
Chang, D. H. F., Troje, N. F., Ban, H.
Previous fMRI work has indicated that both biological form and biological kinematics information have overlapping representations in the human brain; yet, it is unclear as to whether there is a temporal distinction in terms of their relative engagement that is stimulus-dependent. We presented observers (N=21) with upright and inverted biological motion walkers that contained solely biological form (global stimulus), solely biological kinematics (local natural stimulus), or neither natural form nor kinematics information (modified stimulus) and asked them to discriminate the facing direction of the stimuli while concurrently imaging neuromagnetic responses using magnetoencephalography (Elekta Neuromag 360). For all three stimulus classes, we found early (100 ms) responses in lateral-occipital regions that preceded responses in inferior-temporal and fusiform (150-200 ms), and superior-temporal regions (350-500 ms), with response amplitudes differing among the three stimulus classes in extrastriate regions only. Specifically, amplitudes were larger for the inverted global stimulus than for the upright counterpart in fusiform cortex, in addition to surrounding inferior- and superior-temporal regions. In these same regions, amplitudes were higher for the local natural stimulus than for the modified stimulus, but only when stimuli were presented upright. Moreover, amplitudes were higher for the global stimulus than the local natural and modified stimuli, but only when stimuli were presented upside-down. We then compared the representational dissimilarity of MEG sensor patterns with ROI-multivariate response patterns acquired in a second group of observers (N=19) using fMRI (3T) and identical stimuli. Interestingly, we found a marked distinction between the onset of MEG-fMRI representational correspondence, occurring much earlier in early visual cortex (V1-V3) than in higher-order extrastriate body areas. These data suggest that biological motion perception proceeds with temporal systematicity in cortex, engaging early visual cortex prior to inferior-temporal cortex, and finally the oft-implicated superior temporal regions, with stimulus-specificity emerging in later stages.
Ben-Ami, S., Troje, N. F., Sinha, P.
The perception of biological motion is handled effortlessly by our visual system and is found even in animals and human neonates. Prior studies testing patients years after recovering from congenital blindness have revealed that this skill is spared, with subjects showing preserved behavioral and electrophysiological responses to visual displays of human coordinated movement even after prolonged periods of congenital blindness. This evidence of an early-developing and resilient sensitivity has led to the question of whether visual experience is required at all for the development of specialization for biological motion, or whether the development of neural systems for processing biological motion may be independent of visual input. We addressed this question by testing the longitudinal development of the ability to detect biological motion and to extract meaningful information from it in 18 individuals aged 7-21 years with profound congenital visual deprivation, immediately after treatment with sight-restoring surgery. Subjects were shown unmasked point-light displays and asked to identify a person by choosing between displays of actions and their inverted, spatially-scrambled or phase-scrambled versions in experiment 1, and to determine walking direction in experiment 2. We found that the ability to discriminate biological motion and the ability to determine walking direction were both correlated with visual acuity. We did not find such a correlation in age-matched controls with comparable simulated acuity reduction. Together, these results paint a picture attesting to the role of visual experience in the emergence of biological motion perception. In an additional experiment, we probed the use of local cues by sight-restored patients for assessing walking direction by manipulating each individual dot and inverting the directionality of its trajectory. We found reduced reliance of the patient group on local motion information, in contrast to healthy sighted adults and controls observing displays with comparable blur. This difference remained evident over the course of six months following surgery.
Bebko, A., Troje, N. F.
The visual space in front of our eyes and the pictorial space that we see in a photo or painting of the same scene behave differently in many ways. Here, we used virtual reality (VR) to investigate size perception of objects in visual space and their projections in the picture plane. We hypothesize that perceived changes in the size of objects that subtend identical visual angles in pictorial and visual space are due to the dual nature of pictures: The flatness and location of the picture "cross-talks" (Sedgwick, 2003) with the perception of the depicted three-dimensional space. If the picture is at distance dpic and the depicted object at dobj, size-distance relations influence perceived relative sizes. The picture is expected to be scaled by a factor c*(dobj/dpic - 1) + 1 to match the object, where c is a constant between 0 and 1. In a VR environment, eight participants toggled back and forth between a view of an object seen through a window in an adjacent room, and a picture that replaced the window. Participants adjusted the picture scale to match the size of the object through 60 trials varying dobj and dpic. A multilevel regression indicated that the above model does not hold. Rather, we found a striking asymmetry between the roles of object and picture. If dobj was greater than dpic (object behind picture), then c was 0.005 (t(7) = 7.80, p < 0.001). In contrast, if dobj was less than dpic (object in front of picture), c was 0.33 (t(7) = 3, p < 0.001). We discuss this result in the context of a number of different theories that address the particular nature by which the flatness of the picture plane influences the perception of pictorial space.
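For readers who want the scaling model above in explicit form, here is a minimal LaTeX restatement of the factor quoted in the abstract; s denotes the scale applied to the picture:

% Scale factor applied to the picture so that it matches the object,
% for picture distance d_pic, object distance d_obj, and constant c.
s = c\left(\frac{d_{\mathrm{obj}}}{d_{\mathrm{pic}}} - 1\right) + 1, \qquad 0 \le c \le 1

With c = 0 the picture is matched purely by visual angle (s = 1); with c = 1 it is scaled by the full distance ratio d_obj/d_pic, i.e., complete size constancy across the two distances.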
2018
Papers
Weech, S., Troje, N. F.
Depth-ambiguous point-light walkers are most frequently seen as facing-the-viewer. It has been argued that the facing-the-viewer bias depends on recognizing the stimulus as a person. Accordingly, reducing the social relevance of biological motion by presenting stimuli upside-down has been shown to reduce facing-the-viewer bias. Here, we replicated the experiment that reported this finding, and added stick figure walkers to the task in order to assess the effect of explicit shape information on facing bias for inverted figures. We measured the facing-the-viewer bias for upright and inverted stick figure walkers and point-light walkers presented in different azimuth orientations. Inversion of the stimuli did not reduce facing direction judgments to chance levels. In fact, we observed a significant facing away bias in the inverted stimulus conditions. Additionally, we found no difference in the pattern of data between stick figure and point-light walkers. Although the results are broadly consistent with previous findings, we do not conclude that inverting biological motion simply negates the facing-the-viewer bias; rather, inversion causes stimuli to be seen facing away from the viewer more often than not. The results support the interpretation that primarily low-level visual processes are responsible for the biases produced by both upright and inverted stimuli.
Weech, S., Moon, J., Troje, N. F.
Use of virtual reality (VR) technology is often accompanied by a series of unwanted symptoms, including nausea and headache, which are characterised as "simulator sickness". Sensory mismatch has been thought to lie at the heart of the problem and recent studies have shown that reducing cue mismatch in VR can have a therapeutic effect. Specifically, electrical stimulation of vestibular afferent nerves (galvanic vestibular stimulation; GVS) can reduce simulator sickness in VR. However, GVS poses a risk to certain populations and can also result in negative symptoms in normal, healthy individuals. Here, we tested whether noisy vestibular stimulation through bone-vibration can also reduce symptoms of simulator sickness. We carried out two experiments in which participants performed a spatial navigation task in VR and completed the Simulator Sickness Questionnaire over a series of trials. Experiment 1 was conducted using a high-end projection-based VR display, whereas Experiment 2 involved the use of a consumer head mounted display. During each trial, vestibular stimulation was either: 1) absent; 2) coupled with large angular accelerations of the projection camera; or 3) applied randomly throughout each trial. In half of the trials, participants actively navigated using a motion controller, and in the other half they were moved passively through the environment along pre-recorded motion trajectories. In both experiments we obtained lower simulator sickness scores when vestibular stimulation was coupled with angular accelerations of the camera. This effect was obtained for both active and passive movement control conditions, which did not differ. The results suggest that noisy vestibular stimulation can reduce the sensory conflict that may underlie simulator sickness, and that this effect appears to generalize across VR conditions. We propose further examination of this stimulation technique.
Wang, Y., Wang, L., Xu, Q., Liu, D., Chen, L., Troje, N. F., He, S., Jiang, Y.
The ability to detect biological motion (BM) and decipher the meaning therein is essential to human survival and social interaction. However, at the individual level, we are not equally equipped with this ability. In particular, impaired BM perception and abnormal neural responses to BM have been observed in autism spectrum disorder (ASD), a highly heritable neurodevelopmental disorder characterized by devastating social deficits. Here, we examined the underlying sources of individual differences in two abilities fundamental to BM perception (i.e., the abilities to process local kinematic and global configurational information of BM) and explored whether BM perception shares a common genetic origin with autistic traits. Using the classical twin method, we found reliable genetic influences on BM perception and revealed a clear dissociation between its two components: whereas genes account for about 50% of the individual variation in local BM processing, global BM processing is largely shaped by environment. Critically, participants' sensitivity to local BM cues was negatively correlated with their autistic traits through the dimension of social communication, with the covariation largely mediated by shared genetic effects. These findings demonstrate that the ability to process BM, especially with regard to its inherent kinetics, is heritable. They also advance our understanding of the sources of the linkage between autistic symptoms and BM perception deficits, opening up the possibility of treating the ability to process local BM information as a distinct hallmark of social cognition.
Veto, P., Uhlig, M., Troje, N. F., Einhäuser, W.
Can cognition penetrate action-to-perception transfer? Participants observed a structure-from-motion cylinder of ambiguous rotation direction. Beforehand, they experienced one of two mechanical models: an unambiguous cylinder was connected to a rod by either a belt (cylinder and rod rotating in the same direction) or by gears (both rotating in opposite directions). During ambiguous cylinder presentation, mechanics and rod were invisible, making both conditions visually identical. Observers inferred the rod's direction from their moment-by-moment subjective perceptual interpretation of the ambiguous cylinder. They reported the (hidden) rod's direction by rotating a manipulandum in either the same or the opposite direction. With respect to their effect on perceptual stability, the resulting match/non-match between perceived cylinder rotation and manipulandum rotation showed a significant interaction with the cognitive model they had previously been biased with. For the "belt" model, congruency between cylinder perception and manual action is induced by same-direction report. Here, we found that same-direction movement stabilized the perceived motion direction, replicating a known congruency effect. For the "gear" model, congruency between perception and action is, in contrast, induced by opposite-direction report. Here, no effect of perception-action congruency was found: perceptual congruency and cognitive model nullified each other. Hence, an observer's internal model of a machine's operation guides action-to-perception transfer.
Ross, G., Dowling, B., Troje, N. F., Fischer, S. L., Graham, R. B.
Movement screens are frequently used to identify abnormal movement patterns that may increase risk of injury or hinder performance. Abnormal patterns are often detected visually based on the observations of a coach or clinician. Quantitative, or data-driven, methods can increase objectivity, remove issues related to inter-rater reliability, and offer the potential to detect new and important features that may not be observable by the human eye. Applying principal components analysis (PCA) to whole-body motion data may provide an objective data-driven method to identify unique and statistically important movement patterns, an important first step to objectively characterize optimal patterns or identify abnormalities. Therefore, the primary purpose of this study was to determine if PCA could detect meaningful differences in athletes' movement patterns when performing a non-sport-specific movement screen. As a proof of concept, athlete skill level was selected a priori as a factor likely to affect movement performance. Methods: Motion capture data from 542 athletes performing seven dynamic screening movements (i.e. bird-dog, drop jump, T-balance, step-down, L-hop, hop-down, and lunge) were analyzed. A PCA-based pattern recognition technique and linear discriminant analysis with cross-validation were used to determine if skill level could be predicted objectively using whole-body motion data. Results: Depending on the movement, the validated linear discriminant analysis models accurately classified 70.66-82.91% of athletes as either elite or novice. Conclusion: We have provided proof of concept that an objective data-driven method can detect meaningful movement pattern differences during a movement screening battery based on a binary classifier (i.e. skill level in this case). Improving this method can enhance screening, assessment and rehabilitation in sport, ergonomics and medicine.
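To make the general analysis pipeline described above concrete, here is a minimal sketch in Python using scikit-learn. It illustrates the PCA-plus-discriminant-analysis approach with cross-validation, not the authors' actual code; the motion matrix X, the label vector y, and all dimensions are random placeholders.

# Minimal sketch of a PCA-based pattern recognition pipeline with
# linear discriminant analysis and cross-validation. Placeholder data are random.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(542, 3000))   # one flattened whole-body motion waveform per athlete (placeholder)
y = rng.integers(0, 2, size=542)   # 0 = novice, 1 = elite (placeholder labels)

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),        # keep components explaining 95% of the variance
    LinearDiscriminantAnalysis(),
)

scores = cross_val_score(model, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2%}")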
Chang, D. H. F., Ban, H., Ikegaya, Y., Fujita, I., Troje, N. F.
Using fMRI and multivariate analyses we sought to understand the neural representations of articulated body shape and local kinematics in biological motion. We show that in addition to a cortical network that includes areas identified previously for biological motion perception, including the posterior superior temporal sulcus, inferior frontal gyrus, and ventral body areas, the ventral lateral nucleus, a presumably motoric thalamic area is sensitive to both form and kinematic information in biological motion. Our findings suggest that biological motion perception is not achieved as an end-point of segregated cortical form and motion networks as often suggested, but instead involves earlier parts in the visual system including a subcortical network.
Bottari, D., Kekunnaya, R., Hense, M., Troje, N. F., Sourav, S., Röder, B.
The present study tested whether or not functional adaptations following congenital blindness are maintained in humans after sight restoration and whether they interfere with visual recovery. In permanently congenitally blind individuals, both intramodal plasticity (e.g. changes in auditory cortex) and crossmodal plasticity (e.g. an activation of visual cortex by auditory stimuli) have been observed. Both phenomena were hypothesized to contribute to improved auditory functions. For example, it has been shown that early permanently blind individuals outperform sighted controls in auditory motion processing and that auditory motion stimuli elicit activity in typical visual motion areas. Yet it is unknown what happens to these behavioral adaptations and cortical reorganizations when sight is restored, that is, whether compensatory auditory changes are lost and to what degree visual motion processing is reinstated. Here we employed a combined behavioral/electrophysiological approach in a group of sight-recovery individuals with a history of a transient phase of congenital blindness lasting from several months to several years. They, as well as two control groups, one with visual impairments and one normally sighted, were tested in a visual and an auditory motion discrimination experiment. Task difficulty was manipulated by varying the visual motion coherence and the signal-to-noise ratio, respectively. The congenital cataract-reversal individuals showed lower performance in the visual global motion task than both control groups. At the same time, they outperformed both control groups in auditory motion processing, suggesting that at least some compensatory behavioral adaptation as a consequence of complete blindness from birth was maintained. Alpha oscillatory activity during the visual task was significantly lower in congenital cataract-reversal individuals, and they did not show ERPs modulated by visual motion coherence as observed in both control groups. In contrast, beta oscillatory activity in the auditory task, which varied as a function of SNR in all groups, was overall enhanced in congenital cataract-reversal individuals. These results suggest that intramodal plasticity elicited by a transient phase of blindness was maintained and might mediate the prevailing auditory processing advantages in congenital cataract-reversal individuals. By contrast, auditory and visual motion processing do not seem to compete for the same neural resources. We speculate that incomplete visual recovery is due to impaired neural network tuning, which seems to depend on early visual input. The present results demonstrate a privilege of the first arriving input for shaping neural circuits mediating both auditory and visual functions.
Proceedings
Thaler, A., Wellerdiek, A. C., Leyrer, M., Volkova-Volkmar, E., Troje, N. F., Mohler, B. J.
Avatars are important for games and immersive social media applications. Although avatars are still not complete digital copies of the user, they often aim to represent a user in terms of appearance (color and shape) and motion. Previous studies have shown that humans can recognize their own motions in point-light displays. Here, we investigated whether recognition of self-motion is dependent on the avatar's fidelity and the congruency of the avatar's sex with that of the participants. Participants performed different actions that were captured and subsequently remapped onto three different body representations: a point-light figure, a male, and a female virtual avatar. In the experiment, participants viewed the motions displayed on the three body representations and responded to whether the motion was their own. Our results show that there was no influence of body representation on self-motion recognition performance; participants were equally able to recognize their own motion on the point-light figure and the virtual characters. In line with previous research, recognition performance was dependent on the action. Sensitivity was highest for uncommon actions, such as dancing and playing ping-pong, and was around chance level for running, suggesting that the degree of individuality of performing certain actions affects self-motion recognition performance. Our results show that people were able to recognize their own motions even when individual body shape cues were completely eliminated and when the avatar's sex differed from their own. This suggests that people might rely more on kinematic information than on shape and sex cues for recognizing their own motion. This finding has important implications for avatar design in game and immersive social media applications.
Eftekharifar, S., Troje, N. F.
Virtual reality, in contrast to visual stimulation on computer screens, is characterized by the illusion of presence. Stereopsis and motion parallax are two of the main perceptual factors provided by head mounted displays to create a sense of depth in virtual environments. However, the relative contribution of stereopsis and motion parallax to the sense of presence is not clear and existing results are somewhat controversial. Here, we study the contribution of stereopsis and motion parallax using the classic pit room paradigm. Participants were required to cross a deep abyss between two platforms on a narrow plank under three experimental conditions. Participants responded to a presence questionnaire after each condition and their electrodermal activity (EDA) was recorded as a measure of stress and anxiety in a threatening situation. The EDA results demonstrated the importance of motion parallax over stereopsis; however, the questionnaire scores did not differ among the three conditions.
Symposia and Published Abstracts
Veto, P., Uhlig, M., Troje, N. F., Einhäuser, W.
Theories like "common coding" suggest joint representations of action and perception, which implies a bidirectional coupling between these domains. Effects of perception on action are self-evident. Evidence for direct effects of action on perception arises from perceptual bistability: congruent movements stabilize the interpretation of an ambiguous stimulus. Can cognitive processes affect such action-to-perception transfer? Observers viewed a structure-from-motion cylinder of ambiguous motion direction. Prior to the ambiguous stimulus, we presented unambiguous versions that suggest a mechanical model on how the cylinder connects to a rod; in the "belt-drive" condition the rod rotated in the same direction as the cylinder, in the "gear-drive" condition in the opposing direction. Observers rotated a manipulandum either the same way as the rod (congruent instruction) or in the opposing way (incongruent instruction). In the "belt-drive" condition, the congruent instruction translates to congruency between perception and manual rotation. In the "gear-drive" condition, the congruent instruction translates to *in*congruency between perception and action. If the action-to-perception transfer is not influenced by the internal model of the underlying mechanics, we would find that congruent movement stabilizes the percept in both conditions. If, however, the effect depends upon cognitive assumptions, we would find a more stable percept with incongruent movement in the "gear-drive" condition. Results showed a significant interaction between the trained mechanical model and the action-to-perception transfer. While the congruency effect was present in the "belt-drive" condition, no difference in either direction was found following the "gear-drive" training. This suggests that perceptual and cognitive congruency effects nullify each other. Hence, the observers' internal model of a machine's operation influences action-to-perception transfer.
Troje, N. F., Theunissen, L.
Head-bobbing during terrestrial locomotion is observed in many bird species. However, the functional significance of this behavior is not obvious at all. Current theories focus on visual functions: A visual input that is free of self-induced optic flow during the hold phase, and increased flow velocities that improve signal-noise ratios for motion-parallax during the thrust phase of the head. I will critically review the evidence for these theories and, working with pigeons, I will present the results of experiments that failed to replicate earlier findings in their support. I will then discuss two new theories and present experimental support for them: The first concerns the possibility to monocularly estimate distance to objects and agents in situations in which normal motion parallax would not be able to provide information. The second is based on measurements of ground reaction forces during locomotion and suggests that head-bobbing reduces metabolic costs during walking.
Troje, N. F., Rosen, D., Eftekharifar, S.
The orientation of a half-profile face presented on a screen or printed out on paper tends to be overestimated. If participants are asked to orient a face half way between frontal view and profile view, they typically choose an angle somewhere between 30 and 40 degrees. In this study, we demonstrate the phenomenon itself, and we test the hypothesis that it is directly related to presenting the face in the pictorial space of the flat screen rather than in the egocentric visual space of the observer. In our experiment, we asked participants to use keyboard presses to rotate a 3D rendering of a human head to orient it at 45 deg, that is, half way between frontal and profile view. A single block consisted of 80 trials. In each of them, the head was initially presented in a random initial orientation. Employing a repeated-measures design, participants completed two such blocks in counterbalanced order. Both viewing conditions were implemented in virtual reality (HTC Vive with Lighthouse tracking). In the first, participants saw a columnar pedestal with the head mounted on top of it in the visual space before them. In the second block, the same scene was recorded with a fixed camera and projected on a virtual computer screen. The results indicated that the mean estimates for angular orientation in visual space (M = 43.01, SD = 5.96) and pictorial space (M = 37.40, SD = 6.99) differed significantly, t(15) = 5.13, p < .001 (two-tailed t-test). The fact that the overestimation of slant angles observed in pictorial representations disappears in visual space is interpreted as evidence that the observed orientation bias is a result of depth compression due to the flatness of the picture itself, which is perceived alongside the depicted contents of the picture in a "twofold" way.
Troje, N. F.
The overall size of the body, the length of its limbs, and the distribution of weight over the body all affect the kinematics and the underlying dynamics of human action. When looking at other people in order to assess who they are and what they do, the visual system applies internal models that describe the relations between body shape and body motion. What kind of knowledge do such models contain? I will present a number of behavioural studies that investigate to which extent the visual system is able to exploit relations between kinematics and body shape. Our stimuli are based on motion capture data from individuals who lift boxes of different weight, push heavy sledges along the floor, or throw small items at different distances. From these motion capture data, we reconstruct the kinematics of the body, but also the individual body shape of the actors. Hybridizing shape and motion from actors with very different body weights allows us to create stimuli with varying degrees of inconsistency between body shape and body motion. Participants are presented with these stimuli and are asked to discriminate the veridical from the hybrid stimuli, to attribute properties such as attractiveness and realism to the renderings, and to estimate physical properties of the manipulated objects. Our results show that observers are not able to discriminate veridical from hybrid stimuli and that there are no systematic differences in the way they attribute properties to the actors. However, we observe systematic differences in perceived object properties. The results demonstrate that changes in the relation between body shape and body kinematics are perceived, but that they are interpreted as changes in the physical properties of the manipulated objects rather than being perceived as explicit inconsistencies in the renderings.
Loeffler, J., Kenny, S., Ghorbani, S., Raab, M., Cañal-Bruland, R., Troje, N. F.
When we walk in our everyday lives, we constantly adjust our walking movements. We walk faster or slower, stride purposefully or stroll leisurely, depending on when we want to be where. To control our movements, we rely on multisensory information, such as visual and proprioceptive information. How do kinematic parameters change when visual and proprioceptive information are incongruent? Virtual reality environments make it possible to answer this question, for example by amplifying or attenuating the visually perceived walking speed relative to the actual walking speed. Based on theory and empirical evidence, it is assumed that the visual modality plays a decisive role in the integration of multisensory information during movement execution. From this follows the prediction that an online manipulation of the visually perceived walking speed changes movement patterns. To test this hypothesis, participants were asked to walk at normal walking speed in a virtual environment. While they walked, the visually perceived walking speed (the optic flow) was either amplified or attenuated relative to the actual walking speed. Kinematic parameters were measured both while walking with and without manipulation of the optic flow. Initial analyses show that people tend to walk faster when the optic flow is slower (relative to the actual walking speed). The results could help in designing optimal virtual reality environments, since in virtual reality the actual movement speed very often does not match the perceived movement speed and the effects on kinematic parameters are still largely unknown. In clinical settings, the manipulations used here (or similar ones) could help patients who have difficulty walking. One possibility would be to manipulate the optic flow in such a way that patients adjust their movement and, for example, automatically walk faster when the optic flow is slower.
Kurz, J., Helm, F., Troje, N. F., Munzert, J.
In sports such as tennis, handball, or soccer, the perception of the opponent's movement has a decisive influence on the actors' prediction performance. In previous studies, point-light displays or video recordings were mostly used to visualize the stimuli. A number of studies (e.g. Abernethy et al., 2001; Ward et al., 2002) have already investigated the influence of different stimulus presentation types on prediction performance. However, only one study (Ward et al., 2002) has so far examined the influence of different presentation types (point-light display vs. video recording) on gaze behavior. The aim of the current study was to build on the study by Ward et al. (2002) and to investigate the influence of stimuli with different levels of detail on prediction performance, response time, and gaze behavior in soccer penalty kicks under response conditions that were as realistic as possible. Thirteen male soccer players aged 18 to 28 years, with an average of 14.8 ± 4.4 years of experience, took part in the study. Shots (left/right) by soccer players were presented on a screen (3.2 x 2.1 m) at different levels of detail. The soccer players were shown as point-light displays, as stick figures, or as avatars (Loper, Mahmood & Black, 2014). The participants' task was to predict the shot direction (left/right) as quickly and as accurately as possible. Participants were instructed to perform an active movement towards the respective corner that was as realistic as possible. Gaze movements were recorded with a mobile eye tracker (SMI, Teltow, Germany). A one-way ANOVA (p < 0.01) showed that prediction performance was significantly worse for point-light displays (M = 63%) than for stick-figure displays (M = 70%) and avatars (M = 72%). In contrast, no significant differences in response time were found between the presentation types (F < 1). For gaze behavior, a significantly higher number of fixations (p < 0.05) and a significantly shorter fixation duration (p < 0.05) were found for avatars compared to point-light and stick-figure displays. Likewise, gaze was directed at the head for significantly longer with avatars than with point-light (p < 0.01) and stick-figure displays (p < 0.01). Our results show that there are significant differences in prediction performance and gaze behavior between avatars on the one hand and point-light and stick-figure displays on the other. Interestingly, point-light and stick-figure displays do not differ in gaze behavior but only in prediction performance. We attribute this to better processing of the available information with stick-figure displays compared to point-light displays.
Eftekharifar, S., Troje, N. F.
Using head-mounted virtual reality systems in which haptic feedback is provided by matching objects of the real world with objects of the virtual world, the demand on the accuracy of the mapping between virtual and real space depends on the accuracy of the visual-motor mapping of the user's sensorimotor system. Using a system that consists of an Oculus DK2 head-mounted display and the LEAP motion controller, by which participants can see renderings of their hands, we probed the tolerance of participants to distortions of the mapping between motor space and visual space. Participants were asked to keep their open hands symmetrically in front of them such that the two thumbs were close, but without touching each other. We then manipulated the visual-motor mapping in two different ways by either introducing a linear, homogeneous translation of both hands, or a nonlinear transformation, which corresponds to a compression or expansion of the space between the two hands. Using this technique, we moved their hands in one of six (2 x lateral, anterior-posterior, vertical) directions and asked them to indicate which one it was. The detection threshold was determined as the displacement at which they were correct in 58% (1/6 + 0.5 * 5/6) of the cases. A 2x3 ANOVA (condition x direction) revealed a main effect of condition (F(1,54) = 75, p < 0.001). Participants were more sensitive in detecting the relative displacement of the hands (4 cm) than the absolute location of the hands in space (5.3 cm). Knowing detection thresholds informs the design of haptic devices for mixed VR since it determines the tolerance of users to the amount of displacement between real and virtual objects. The results also suggest that the coordination of relative positions of the hands is more accurate than that of their absolute location.
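As a quick check on the criterion quoted above, the 58% value is the standard halfway point between chance and perfect performance for a six-alternative task; in LaTeX notation:

% Chance-corrected threshold for a 6-alternative task: halfway between guessing and perfect.
p_{\mathrm{threshold}} = \frac{1}{6} + \frac{1}{2}\left(1 - \frac{1}{6}\right) = \frac{7}{12} \approx 58.3\%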
Cui, A.-X., Malcolm, P. M., Müller, T. S., Troje, N. F., Cuddy, L. L.
Background: Musical stimuli present a unique opportunity to examine the influence of prior knowledge on statistical learning. Knowledge of the pitch distributions of music may be acquired through past informal exposure (Cui, Diercks, Troje, & Cuddy, 2016) or formal music training (Cuddy & Badertscher, 1987). Using the variance of formal training in the population, we can ask whether it corresponds to variance in statistical learning ability (Siegelman, Bogaerts, Christiansen, & Frost, 2017). Aims: Here, we examine the influence of participants' prior exposure to pitch distributional information on statistical learning. Method: Thirty-four participants listened to 160 tone sequences, each followed by a probe-tone, and judged each probe-tone's fit with the prior sequence. In one block, sequences were generated from an unfamiliar tone distribution. In the other, sequences were generated from a distribution typical of a piece written in C major, a distribution familiar to participants exposed to Western music. The four probe-tones either occurred (congruent) or did not occur (incongruent) in the sequence. Probe-tones were identical for both blocks but differed in their congruency with the distributions. Concurrently, we recorded EEG data using EGI HydroCel nets. We analysed the mean amplitude of a 40 ms time window centred on the maximal peak 380-450 ms after probe-tone onset, corresponding to the time window of the P3b component. Results: An ANOVA on the proportion of times each probe-tone was judged "fitting", with factors distribution, probe-tone, and block order, revealed an interaction between distribution and probe-tone, F(3, 78) = 79.28, p < .001. Congruent probe-tones were judged "fitting" more often. Hit and false alarm rates, corresponding to the judged fit of congruent and incongruent tones respectively, were converted to measures of sensitivity d', which was higher for the familiar than the unfamiliar distribution, t(33) = 5.62, p < .001, and response bias C, which was more conservative for the familiar than the unfamiliar distribution, t(33) = 2.97, p = .005. Years of music training and sensitivity correlated positively for the familiar distribution, r(32) = .40, p = .018, but not for the unfamiliar distribution, and neither correlated with C for either distribution, ps > .05. Analysis of the EEG data showed a significant effect of congruency at frontal electrodes for the familiar, F(1, 33) = 8.83, p = .006, but not for the unfamiliar distribution, p > .05. Conclusions: Participants were sensitive to the distributional information in the tone sequences. The difference in sensitivity between distributions supports our hypothesis that prior knowledge influences responses. Moreover, the association with music training for the familiar distribution, and the lack thereof for the unfamiliar one, shows that prior knowledge and music training influence responses in specific cases but not statistical learning itself. The exaggerated P3b component for incongruent tones in the familiar distribution suggests that this component reflects a violation of knowledge represented in long-term memory, as it was absent when participants listened to the unfamiliar distribution. This will allow us to track the P3b component in participants exposed to an unfamiliar distribution over time, in order to examine the trajectory of musical knowledge in future studies. References: Cui, A. X., Diercks, C., Troje, N. F., & Cuddy, L. L. (2016). Short and long term representation of an unfamiliar tone distribution. PeerJ, 4, e2399. Cuddy, L. L., & Badertscher, B. (1987). Recovery of the tonal hierarchy: Some comparisons across age and levels of musical experience. Perception & Psychophysics, 41(6), 609-620. Siegelman, N., Bogaerts, L., Christiansen, M. H., & Frost, R. (2017). Towards a theory of individual differences in statistical learning. Phil. Trans. R. Soc. B, 372(1711), 20160059. Keywords: statistical learning, EEG, probe-tone, musical knowledge.
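For readers unfamiliar with the sensitivity and bias measures used in this and the following abstract: under the standard equal-variance signal-detection model, d' and C are computed from hit and false-alarm rates as sketched below. The rates in the example are invented for illustration and are not values from the study.

```python
from scipy.stats import norm

def dprime_and_c(hit_rate, fa_rate):
    """Equal-variance signal-detection sensitivity d' and criterion C."""
    z_hit, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    d_prime = z_hit - z_fa
    c = -0.5 * (z_hit + z_fa)   # larger C = more conservative ("fitting" said less often)
    return d_prime, c

# Invented example: congruent probe-tones judged "fitting" on 80% of trials,
# incongruent probe-tones on 30% of trials
print(dprime_and_c(0.80, 0.30))
```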
Cui, A.-X., Malcolm, P. M., Müller, T. S., Troje, N. F., Cuddy, L. L.
As a domain-general mechanism, statistical learning interests both music and language cognition researchers (Patel, 2008). Recently, Siegelman, Bogaerts, Christiansen, and Frost (2017) proposed that statistical learning ability varies among individuals. Musical stimuli give us a unique opportunity to examine whether prior knowledge influences statistical learning. Prior knowledge of music may be acquired either through past informal exposure or through formal music training. Here, we examine the correspondence between music training and statistical learning, and whether prior exposure to pitch distributional information through music listening influences participants' responses. Twenty-eight participants listened to two blocks of 80 sequences, each sequence containing 34 isochronous tones of 150 ms each and followed by a probe-tone. Participants judged each probe-tone's fit with the prior sequence. In one block, sequences were generated from an unfamiliar tone distribution; in the other, sequences were generated from a familiar distribution analogous to that of a piece written in C major. Congruent probe-tones were tones that had occurred during the tone sequence, whereas incongruent probe-tones had not occurred. The probe-tones were physically identical for both blocks but differed in their congruency with the distributions. Congruent probe-tones were judged as "fitting" more often than incongruent probe-tones, with a stronger effect for the familiar distribution. Sensitivity as measured by d' (with hits and false alarms corresponding to congruent and incongruent tones judged as fitting, respectively) was higher for the familiar than for the unfamiliar distribution. The correlation between years of music training and sensitivity was positive only for the familiar distribution. The difference in sensitivity between distributions supports our hypothesis that prior knowledge influences participants' responses. In particular, the association with music training for the familiar distribution and the lack thereof for the unfamiliar distribution shows that prior knowledge and music training jointly influence responses but may not influence statistical learning itself. Patel, A. D. (2008). Music, language, and the brain. New York: Oxford University Press. Siegelman, N., Bogaerts, L., Christiansen, M. H., & Frost, R. (2017). Towards a theory of individual differences in statistical learning. Phil. Trans. R. Soc. B, 372(1711), 20160059.
2017
Papers
Weech, S., Troje, N. F.
Studies of the illusory sense of self-motion elicited by a moving visual surround ('vection') have revealed key insights about how sensory information is integrated. Vection usually occurs after a delay of several seconds following visual motion onset, whereas self-motion in the natural environment is perceived immediately. It has been suggested that this latency relates to the sensory mismatch between visual and vestibular signals at motion onset. Here, we tested three techniques with the potential to reduce sensory mismatch in order to shorten vection onset latency: noisy galvanic vestibular stimulation (GVS) and bone conducted vibration (BCV) at the mastoid processes, and body vibration applied to the lower back. In Experiment 1, we examined vection latency for wide field visual rotations about the roll axis and applied a burst of stimulation at the start of visual motion. Both GVS and BCV reduced vection latency by two seconds compared to the control condition, whereas body vibration had no effect on latency. In Experiment 2, the visual stimulus rotated about the pitch, roll, or yaw axis and we found a similar facilitation of vection by both BCV and GVS in each case. In a control experiment, we confirmed that air-conducted sound administered through headphones was not sufficient to reduce vection onset latency. Together the results suggest that noisy vestibular stimulation facilitates vection, likely due to an upweighting of visual information caused by a reduction in vestibular sensory reliability.
Veto, P., Einhäuser, W., Troje, N. F.
Visual illusions explore the limits of sensory processing and provide an ideal testbed to study perception. Size illusions, stimuli whose size is consistently misperceived, do not only result from sensory cues but can also be induced by cognitive factors, such as social status. Here we investigate whether the ecological relevance of biological motion can also distort perceived size. We asked observers to judge the size of point-light walkers (PLWs), configurations of dots whose movements induce the perception of human movement, and visually matched control stimuli (inverted PLWs). We find that upright PLWs are consistently judged as larger than inverted PLWs, whereas static point-light figures do not elicit the same effect. We also demonstrate the phenomenon using an indirect paradigm: observers judged a disc that followed an inverted PLW as larger than a disc following an upright PLW. We interpret this as a contrast effect: the upright PLW is perceived as larger, and thus the subsequent disc is judged smaller. Together, these results demonstrate that ecologically relevant biological-motion stimuli are perceived as larger than visually matched control stimuli. Our findings present a novel case of illusory size perception, in which ecological importance leads to a distorted perception of size.
Theunissen, L. M., Troje, N. F.
Stabilization of the head in animals with limited capacity to move their eyes is key to maintaining a stable image on the retina. In many birds, including pigeons, a prominent example of the important role of head stabilization is the characteristic head-bobbing behaviour observed during walking. Multimodal sensory feedback from the eyes, the vestibular system, and proprioceptors in body and neck is required to control head stabilization. Here, we trained unrestrained pigeons (Columba livia) to stand on a perch that was sinusoidally moved with a motion platform along all three translational and three rotational degrees of freedom. We varied the frequency of the perturbation and recorded the pigeons' responses under both light and dark conditions. Head, body, and platform movements were assessed with a high-speed motion capture system, and the data were used to compute the gain and phase of head and body movements in response to the perturbations. By comparing responses under dark and light conditions, we estimated the contribution of visual feedback to the control of the head. Our results show that the head followed the movement of the motion platform to a large extent during translations, but was almost perfectly stabilized against rotations. Visual feedback improved head stabilization only during translations, not during rotations. The body compensated for roll and pitch rotations, but did not contribute to head stabilization during translations and yaw rotations. From these results, we conclude that head stabilization in response to translations and rotations depends on different sensory feedback and that visual feedback plays only a limited role in head stabilization during standing.
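The gain and phase measures described above can be thought of as the amplitude ratio and timing offset of the head's response at the perturbation frequency. A rough sketch of one way to compute them from sampled traces (an assumption for illustration, not the authors' analysis pipeline):

```python
import numpy as np

def gain_phase(platform, response, stim_freq, fs):
    """Gain and phase of a response signal relative to a sinusoidal perturbation."""
    n = len(platform)
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    k = np.argmin(np.abs(freqs - stim_freq))      # bin closest to stimulus frequency
    P = np.fft.rfft(platform - platform.mean())[k]
    R = np.fft.rfft(response - response.mean())[k]
    gain = np.abs(R) / np.abs(P)                  # 1 = follows platform, 0 = fully stabilized
    phase = np.angle(R / P, deg=True)             # response lead (+) or lag (-) in degrees
    return gain, phase

# Hypothetical example: 0.5 Hz perturbation sampled at 200 Hz, head lagging by 30 deg
fs, f0 = 200, 0.5
t = np.arange(0, 20, 1 / fs)
platform = np.sin(2 * np.pi * f0 * t)
head = 0.2 * np.sin(2 * np.pi * f0 * t - np.radians(30))
print(gain_phase(platform, head, f0, fs))        # ~ (0.2, -30)
```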
Theunissen, L. M., Reid, T., Troje, N. F.
Pecking at small targets requires accurate spatial coordination of the head. Planning of the peck has been proposed to occur in two distinct stop phases, but although this idea has now been around for a long time, the specific functional roles of these stop phases remain unresolved. Here, we investigated the characteristics of the two stop phases using high-speed motion capture and examined their functions with two experiments. In experiment 1, we tested the hypothesis that the second stop phase is used to pre-program the final approach to a target and analyzed head movements while pigeons (Columba livia) pecked at targets of different sizes. Our results show that the duration of both stop phases significantly increased as stimulus size decreased. We also found significant positive correlations between stimulus size and the distances of the beaks to the stimulus during both stop phases. In experiment 2, we used a two-alternative forced choice task with different levels of difficulty to test the hypothesis that the first stop phase is used to decide between targets. The results indicate that the characteristics of the stop phases do not change with increasing difficulty of the choice between the two targets. Therefore, we conclude that the first stop phase is not exclusively used to decide upon a target to peck at, but also contributes to the function of the second stop phase, which is improving pecking accuracy and planning the final approach to the target.
Helm, F., Troje, N. F., Munzert, J.
This article describes the motion database for a large sample (n = 2,400) of 7-m penalty throws in team handball that includes 1,600 disguised throws. Throws were performed by both novice (n = 5) and expert (n = 5) penalty takers. The article reports the methods and materials used to capture the motion data. The database itself is accessible for download via the JLU Web Server and provides all raw files in a three-dimensional motion data format (.c3d). Additional information is given on the marker placement of the penalty taker, goalkeeper, and ball, together with details on the skill levels and/or playing history of the expert group. The database was first used by Helm, Munzert, and Troje (under review) to investigate the kinematic patterns of disguised movements. Results of this analysis are reported and discussed in their article "Kinematic Patterns Underlying Disguised Movements: The Role of Spatial and Temporal Dissimilarity".
Helm, F., Munzert, J., Troje, N. F.
This study examined the kinematic characteristics of disguised movements by applying linear discriminant analysis (LDA) and dissimilarity analyses to the motion data from 788 disguised and 792 non-disguised 7-m penalty throws performed by novice and expert handball field players. Results of the LDA showed that discrimination between throw types (disguised vs. non-disguised) was more error-prone when throws were performed by experts (spatial: 4.6%; temporal: 29.6%) compared to novices (spatial: 1.0%; temporal: 20.2%). The dissimilarity analysis revealed significantly smaller spatial dissimilarities and variations between throw types in experts compared to novices (p < .001), but also showed that these spatial dissimilarities and variations increased significantly in both groups the closer the throws came to the moment of (predicted) ball release. In contrast, temporal dissimilarities did not differ significantly between groups. Thus, our data clearly demonstrate that expertise in disguising one's own action intentions results in an ability to perform disguised penalty throws that are highly similar to genuine throws. We suggest that this expertise depends mainly on keeping spatial dissimilarities small. However, disguise becomes increasingly challenging the closer one gets to the moment at which the action outcome (i.e., ball release) becomes visible.
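As an illustration of the cross-validated classification error reported above, the following sketch runs an LDA on synthetic feature vectors. It assumes that kinematic features have already been extracted per throw; the random data are placeholders, not the study's data or code.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))      # 200 throws x 60 kinematic features (placeholder)
y = np.repeat([0, 1], 100)          # 0 = non-disguised, 1 = disguised
X[y == 1] += 0.3                    # small mean shift between throw types

accuracy = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=10)
print(f"cross-validated classification error: {100 * (1 - accuracy.mean()):.1f}%")
```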
Fini, C., Bardi, L., Troje, N. F., Committeria, G., Brass, M.
Recent results have shown that the way we categorize space varies as a function of the frame of reference. If the reference frame (RF) is another person vs. an object, the distance is judged as reduced. It has been suggested that such an effect is due to the spontaneous processing of the other's motor potentialities. To investigate the impact of movement representation on space perception, we used biological motion displays as a prime for a spatial categorization task. In Exp. 1, participants were presented with a point-light walker or a scrambled motion, and then judged the location ("Near" or "Far") of a target with a human body or an inanimate object as RF. In Exp. 2, participants were primed with point-light walkers of different speeds: a runner, a normal walker and a slow walker. In Exp. 3 they were primed with a point-light display depicting a human body sitting down on or standing up from a chair, with a human body RF either oriented or not oriented towards the target. Results showed a reduced judged distance when the human body RF was primed with a point-light walker (Exp. 1). Furthermore, we found an additional reduction of the judged distance when priming with a runner (Exp. 2). Finally, Exp. 3 showed that the human body RF has to be target oriented as a precondition for priming effects of the point-light walker.
Proceedings
Larson, D. R., Paulter, N. G., Troje, N. F.
The effect of motion through the portal of a walk-through metal detector (WTMD) has often been considered to contribute to the uncertainty in detecting threat objects being carried through the WTMD. However, typical metrological testing uses a robotic system, or similar, to push a test object through the portal with a trajectory that is a straight line and has a constant velocity. This testing, although reproducible and accurate, probes only straight-line trajectories. On the other hand, testing with clean testers, that is, people not carrying any metal objects other than the test object, is neither reproducible nor accurate because of the great variation between transits through the portal, even by the same clean tester. We report using a robotic system to accurately study the effect of non-straight-line trajectories at different velocities for the test object passing through the portal of the WTMD.
Kenny, S., Mahmood, N., Honda, C., Black, M. J., Troje, N. F.
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person's movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight, while they pushed, lifted, and threw objects. Based on a set of 67 markers, we estimated both the kinematics of the actions as well as the performer's individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. In a virtual reality environment, observers rated the perceived weight or thrown distance of the objects. They were also asked to explicitly discriminate between consistent and hybrid stimuli. Observers were unable to accomplish the latter, but hybridization of shape and motion influenced their judgements of action outcome in systematic ways. Inconsistencies between shape and motion were assimilated into an altered perception of the action outcome.
Book Chapters
Symposia and Published Abstracts
Weech, S., Troje, N. F.
Sickness and low immersion experienced in virtual reality (VR) inhibit the widespread adoption of VR technology. Both are thought to relate to errors in the process of estimating self-motion from multisensory stimuli. Recoupling multisensory cues can generate more convincing illusory self-motion (vection) and reduce sickness, but current methods rely on expensive or invasive techniques to simulate the expected sensory cues. We present an approach to reducing cue mismatch based on statistical sensory reweighting principles, and outline initial evidence showing that the addition of vestibular "noise" can facilitate vection and reduce sickness in VR. In one study we examined the effect of noisy vestibular stimulation (stochastic galvanic stimulation, or bone-conducted vibration) on the onset latency of circular vection. Both noisy stimulation methods reduced the onset latency of illusory self-motion for roll, pitch, and yaw optic flow stimuli. In a second study we applied bone-conducted vibration in a navigation task in VR and measured simulator sickness. We found that sickness was reduced significantly when the timing of noisy vestibular stimulation was coupled with the occurrence of expected vestibular cues. The results provide evidence that a low-cost, non-invasive vestibular stimulus has the potential to influence vection and simulator sickness. They also reiterate the finding that vection onset latency and sickness in VR are related to multisensory mismatch. This work constitutes the first effort to use noisy vestibular stimulation to improve the user experience in the virtual environment. Future studies are needed to explicate the mechanism through which noisy stimulation affects self-motion perception.
Cui, A.-X., Dederichs, M., Troje, N. F., Cuddy, L.
How do we acquire knowledge about a specific music system? Music-system-specific knowledge is important for our ability to differentiate between different music systems, and serves as a framework for appreciating new music. Krumhansl (1990) has proposed that we develop a representation of a music system by abstracting its statistical regularities during exposure. This idea has been supported by developmental, cross-cultural, and cognitive research. For instance, participants are able to abstract statistical regularities of an unfamiliar music system after short exposures. In a previous experiment in our lab, we exposed participants to an unfamiliar music system defined by the frequency of occurrence of different pitches. Before and after the exposure phase, probe tone ratings were obtained for different tones following tone sequences from the same music system (contexts). Crucially, some tones occurred as part of the music system during the exposure phase, but never as part of a context. Before exposure, ratings of these tones were lower than ratings of tones that occurred in the context, suggesting that participants use short-term regularities. However, ratings of these tones increased after exposure, whereas ratings of tones that never occurred as part of the music system remained similar. This suggests that, after exposure, participants had also used abstracted knowledge over and above the information that was present in the context. Thus we propose that participants are able to abstract short-term as well as long-term knowledge about an unfamiliar music system after short exposure. Several electroencephalographic (EEG) markers have been found to characterize different kinds of musical knowledge, thus presenting an opportunity to test our conclusions. The mismatch negativity (MMN) is known to indicate violation of representations of short-term regularities, i.e., short-term knowledge. On the other hand, the early right anterior negativity (ERAN) is thought to indicate violation of representations of longer-term regularities, i.e., long-term knowledge (Koelsch, 2009). Aims: Using ERPs as diagnostic measures of different kinds of knowledge, we want to describe the nature of abstracted musical knowledge in more detail. Results will further our understanding of how we gain knowledge about a specific music system. Method: We will expose participants for 30 minutes to an unfamiliar music system defined by the distribution of the system's constituent pitches. Before and after the exposure phase, ERPs are recorded with a 128-electrode dense-array EEG system for probe tones following tone sequences (contexts), which are generated from a similar but not identical distribution. Tone sequences heard during exposure and used as contexts will differ in 12% of their constituent tones, but otherwise overlap in the pitches used, suggesting that both types of tone sequences are from the same music genre. This will allow us to compare ERPs to probe tones that occur (a) both during exposure and in contexts, (b) only in contexts, (c) only during exposure, and (d) neither during exposure nor in contexts. Furthermore, ERPs collected after the exposure phase will be compared to ERPs collected before the exposure phase for each type of probe tone. Results: Our presentation at ESCOM 2017 will consist of a summary of our work to date, and will outline in particular our behavioral data that, we propose, identifies short-term and long-term representations of pitch distributions.
We will also present new EEG data from comparisons of different EEG-recording conditions. Conclusions: Data will inform us about the nature of statistical knowledge abstracted from exposure to music. Our design will allow us to draw conclusions about the time course of statistical learning, and advance our understanding of how we gain knowledge about a specific music system. References: Koelsch, S. (2009). Music syntactic processing and auditory memory: Similarities and differences between ERAN and MMN. Psychophysiology 46(1), 179-190. Krumhansl, C. L. (1990). Cognitive Foundations of Musical Pitch. Oxford University Press.
Chang, D. H. F., Ban, H., Ikegaya, Y., Fujita, I., Troje, N. F.
We report findings from both human fMRI (n = 35) and MEG (n = 10) experiments that tested neural responses to dynamic ("local", acceleration) cues in biological motion. We measured fMRI responses (3T Siemens Trio, 1.5 mm³) to point-light stimuli that were degraded according to: 1. spatial coherency (intact, horizontally scrambled with vertical order retained, horizontally scrambled with vertical order inverted); 2. local motion (intact, constant velocity); and 3. temporal structure (intact, scrambled). Results from MVPA decoding analyses revealed surprising sensitivity of the subcortical (non-visual) thalamic ventral lateral nucleus (VLN) for discriminating local naturally-accelerating biological motion from constant velocity motion, in addition to a wide cortical network that extends dorsally through the IPS and ventrally, including the STS. Retaining the vertical order of the local trajectories resulted in higher accuracies than inverting it, but phase-randomization did not affect (discrimination) responses. In a separate experiment, different subjects were presented with the same stimuli while magnetic responses were measured using a 360-channel whole-head MEG system (Neuromag 360, Elekta; 1000 Hz sampling frequency). Results revealed responses in much of the same cortical network identified using fMRI, peaking at 100-150 ms and again at 350-500 ms after stimulus onset. During the later window we also observed important functional differences, with greater activity in hMT+, LO, and STS for structure-from-motion versus the local natural acceleration stimulus, and greater early (V1-V3) and IPS activity for the local natural acceleration versus constant velocity motion. We also observed activity along the medial surface by 200 ms. The fact that medial activity arrives distinctly after early cortical activity (100-150 ms), but before the 350-500 ms window, suggests that the involvement of thalamic VLN in biological motion perception observed with fMRI may have arisen from early cortical responses, but not from higher-order extrastriate cortex.
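For orientation, the MVPA decoding analyses described above typically amount to cross-validated classification of condition labels from multivoxel patterns within a region of interest. A minimal sketch with synthetic placeholder data (not the study's fMRI data) using a linear SVM:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
patterns = rng.normal(size=(80, 500))   # 80 trials x 500 voxels in one ROI (placeholder)
labels = np.tile([0, 1], 40)            # e.g., natural acceleration vs. constant velocity
patterns[labels == 1, :50] += 0.4       # weak condition-related signal in some voxels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, patterns, labels, cv=8)
print(f"decoding accuracy: {scores.mean():.2f} (chance = 0.50)")
```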
2016
Papers
Ware, E. L. R., Saunders, D. R., Troje, N. F.
A closed-loop teleprompter system was used to isolate and manipulate social interactivity in the natural courtship interactions of pigeons, Columba livia. In Experiment 1, a live, face-to-face, real-time interaction between two courting pigeons (Live) was compared to a played-back version of the video stimulus recorded during the pair's Live interaction. We found that pigeons were behaving interactively; their behavior depended on the relationships between their own signals and those of their partner. In Experiment 2, we tested whether social interactivity relies on spatial cues present in the facing direction of a partner's display. By moving the teleprompter camera 90° away from its original location, the partner's display was manipulated to appear as if it were directed 90° away from the subject. We found no effect of spatial offset on the pigeons' behavioral response. In Experiment 3, three time delays (1 s, 3 s, and 9 s), a Live condition, and a playback condition were chosen to investigate the importance of temporal contiguity in social interactivity. Furthermore, both opposite-sex (courtship) and same-sex (rivalry) pairs were studied to investigate whether social context affects sensitivity to social interactivity. Our results showed that pigeon courtship behavior is sensitive to temporal contiguity. Behavior declined in the 9 s and playback conditions compared to the Live condition and the shorter time delays. For males only, courtship behavior also increased in the 3 s delay condition. The effect of social interactivity and time delay was not observed in rivalry interactions, suggesting that social interactivity may be specific to courtship.
Phillipou, A., Rossell, S., Gurvich, C., Castle, D., Troje, N.F., Abel, L.
Objective: Anorexia Nervosa (AN) is a psychiatric condition characterised by a distortion of body image. However, whether individuals with AN can accurately perceive the size of other individuals' bodies is unclear. Method: In the current study, 24 females with AN and 24 healthy control participants undertook two biological motion tasks while eyetracking was performed: to identify the gender and to indicate the walkers' body size. Results: AN participants tended to "hyperscan" stimuli, but did not demonstrate differences in how visual attention was directed to different body areas relative to controls. Groups also did not differ in their estimation of body size. Discussion: The hyperscanning behaviours suggest increased anxiety to disorder-relevant stimuli in AN. The lack of group difference in the estimation of body size suggests that the AN group were able to judge the body size of others accurately. The findings are discussed in terms of body image distortion specific to oneself in AN.
Lisney, T. J., Troje, N. F.
Many birds bob their head as they walk or run on the ground. The functional significance of this behaviour is unclear, but there is strong evidence that it plays a significant role in enhancing visual perception. If head-bobbing is advantageous, however, then it is a puzzle that some birds do not head-bob. As a group, gulls (Laridae) are among the birds that reportedly do not head-bob, yet here we report head-bobbing among Ring-billed Gulls (Larus delawarensis), observed and filmed in Ontario, when walking relatively slowly while foraging on the ground. This suggests that head-bobbing plays a key role in the visual detection of food items in this species. We suggest that head-bobbing may be a relatively common behaviour in foraging Ring-billed Gulls and speculate that other gulls (and indeed other birds) previously thought not to head-bob may in fact do so under certain circumstances.
Klüver, M., Hecht, H., Troje, N. F.
Why do some people appear attractive to us while others don't? Evolutionary psychology states that sexual attractiveness has evolved to assess the reproductive qualities of a potential mate. Past research in the field has identified a number of traits that can be linked directly to qualities such as immuno-competence, developmental stability, and fertility. The current study is motivated by the hypothesis that attractiveness is determined not just by individual, independent traits, but also by whether or not their pattern is internally consistent. Exploiting the domain of biological motion, we manipulate internal consistency between anthropometry and kinematics of a moving body. In two experiments, we varied internal consistency by using original point-light walkers (high internal consistency) and hybrid walkers, generated by combining anthropometric and kinematic data from different walkers (low internal consistency). As predicted, we found a significant link between internal consistency and sexual attractiveness, suggesting that internal consistency signals health and mate quality.
Cui, A.-X., Diercks, C., Troje, N. F., Cuddy, L. L.
We report on a study conducted to extend our knowledge about the process of gaining a mental representation of music. Several studies, inspired by research on the statistical learning of language, have investigated statistical learning of sequential rules underlying tone sequences. Given that the mental representation of music correlates with distributional properties of music, we tested whether participants are able to abstract distributional information contained in tone sequences to form a mental representation. For this purpose, we created an unfamiliar music genre defined by an underlying tone distribution, to which 40 participants were exposed. Our stimuli allowed us to differentiate between sensitivity to the distributional properties contained in test stimuli and long term representation of the distributional properties of the music genre overall. Using a probe tone paradigm and a two-alternative forced choice discrimination task, we show that listeners are able to abstract distributional properties of music through mere exposure into a long term representation of music. This lends support to the idea that statistical learning is involved in the process of gaining musical knowledge.
Bottari, D., Troje, N.F., Ley, P., Hense, M., Kekunnaya, R., Röder, B.
Functional brain development is characterized by sensitive periods during which experience must be available to allow for the full development of neural circuits and associated behavior. Yet, only few neural markers of sensitive period plasticity in humans are known. Here we employed electroencephalographic recordings in a unique sample of twelve humans who had been blind from birth and regained sight through cataract surgery between four months and 16 years of age. Two additional control groups were tested: a group of visually impaired individuals without a history of total congenital blindness and a group of typically sighted individuals. The EEG was recorded while participants performed a visual discrimination task involving intact and scrambled biological motion stimuli. Posterior alpha and theta oscillations were evaluated. The three groups showed indistinguishable behavioral performance and in all groups evoked theta activity varied with biological motion processing. By contrast, alpha oscillatory activity was significantly reduced only in individuals with a history of congenital cataracts. These data document on the one hand brain mechanisms of functional recovery (related to theta oscillations) and on the other hand, for the first time, a sensitive period for the development of alpha oscillatory activity in humans.
Symposia and Published Abstracts
Weech, S., Konar, Y., Troje, N. F.
Simulator sickness in virtual reality (VR) poses a massive barrier to widespread adoption of current VR technologies. Galvanic vestibular stimulation (GVS) at the mastoid processes presents an effective technique for reducing simulator sickness during simulated motion. This benefit is thought to be achieved by reducing visual and vestibular mismatches, which are also associated with the degree of immersiveness and vection experienced. However, the invasiveness of GVS means it is unlikely to find widespread adoption as a remedy for simulator sickness. Here we examined whether vestibular stimulation accomplished through bone conducted vibration (BCV) could achieve similar effects on vection to those produced by GVS. We measured vection latency and magnitude for subjects undergoing BCV and GVS. In the BCV condition we applied transient 500Hz vibration to the mastoid processes at visual motion onset. In the GVS condition we used a transient pink noise signal to bilaterally stimulate the vestibular system at visual motion onset. Overall, the results of our study strongly suggest that BCV is comparable to GVS in the degree to which visual-vestibular mismatch can be mitigated. This finding helps to pave the way towards a non-invasive method that could improve immersiveness and ameliorate simulator sickness in VR. Such a method could reshape the way that researchers and consumers operate in VR.
Veto, P., Einhäuser, W., Troje, N. F.
Size perception is distorted in several illusions, including some that rely on complex social attributes: for example, people of higher subjective importance are associated with larger size. Biological motion receives preferential visual processing over non-biological motion with similar low-level properties, a difference presumably related to a stimulus's ecological importance. Hence, we asked whether biological motion perception can also lead to size illusions. In each trial, observers (N=16) were simultaneously exposed to an upright and an inverted point-light walker from a frontal view for 250 ms. After disappearance of the walkers, two circles were flashed at their positions. The circles differed in size to varying degrees, and observers indicated with a non-speeded button press which of the circles appeared larger. We conducted paired-sample t-tests on the parameters of the psychometric curves fitted to response frequencies for upright versus inverted cued targets. We found that the circle at the location of the upright walker was perceived as smaller than the circle at the location of the inverted walker (t(15) = 2.37, p < .05). Our findings introduce a novel illusion: biological motion reduces subsequently perceived size relative to non-biological motion.
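The psychometric-curve analysis mentioned above can be illustrated by fitting a cumulative Gaussian to choice proportions as a function of the physical size difference; the point of subjective equality (PSE) and slope are the kind of parameters that would then enter paired t-tests across conditions. The data points below are invented purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def psychometric(size_diff, pse, sigma):
    """Cumulative-Gaussian psychometric function."""
    return norm.cdf(size_diff, loc=pse, scale=sigma)

# Invented choice proportions for "second circle appeared larger"
size_diff = np.array([-0.4, -0.2, -0.1, 0.0, 0.1, 0.2, 0.4])   # physical size difference
p_larger  = np.array([0.05, 0.20, 0.35, 0.55, 0.70, 0.85, 0.97])

(pse, sigma), _ = curve_fit(psychometric, size_diff, p_larger, p0=[0.0, 0.2])
print(f"PSE = {pse:.3f}, slope parameter sigma = {sigma:.3f}")
```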
Troje, N. F., Theunissen, L.
Head-bobbing during terrestrial locomotion is observed in many bird species, but the functional significance of this behaviour remains unclear. Current theories focus on visual functions: a visual input that is free of self-induced optic flow during the hold phase, and increased flow velocities that provide increased signal-to-noise ratios for motion-parallax measurements during the thrust phase of the head. I will critically review the evidence for these theories and, using pigeons, present the results of experiments that failed to replicate earlier findings in their support. As an alternative, I will discuss two new theories and experimental support for them: the first concerns the possibility of monocularly estimating distance to objects and agents in situations in which normal motion parallax would not be able to provide information; the second is based on measurements of ground reaction forces during locomotion and suggests that head-bobbing reduces the metabolic costs associated with walking.
Troje, N. F., Bieg, A., Mahmood, N., Mohler, B., Black, M.
While there is plenty of work on facial attractiveness, little is known about how the rest of the body determines the perception of another person. We were particularly interested in how the shape of the body and the way it moves contribute to attractiveness. Observers (20 male and 20 female) rated the attractiveness of 50 men and 50 women from the BML database, each displayed in one of three ways in immersive 3D virtual reality: (a) static bodies reconstructed from the motion capture data by means of MoSh (Loper et al. 2014, SIGGRAPH Asia) and displayed as detailed 3D shapes; (b) walking stick-figures (Troje 2002, JOV); (c) the same bodies as in (a), but animated with the corresponding walking movements. Correlations between all 12 sets of ratings (2 participant sexes x 2 walker sexes x 3 display types) reveal three different factors that contribute to the perception of attractiveness. The first factor is sexual dimorphism and applies to female attractiveness assigned to stick-figures and moving meshes: the more feminine a woman, the more attractive she is rated. The second is characterized by increased vertical movement, which makes the attractive walker appear bouncy, confident, and vigorous; it dominates male attractiveness assigned to stick-figures and moving meshes. The third factor is characterized by a slim, tall body shape (attractive) as compared to stout and wider shapes, and applies to ratings of static body shapes of both male and female walkers. Male and female observers agree in all cases. The way we move affects our appeal to others as much as the appearance of the static body. While sexual dimorphism dominates female attractiveness, it does not play much of a role for male attractiveness, neither in the shape nor in the motion domain.
Reid, T., Theunissen, L., Troje, N. F.
Pecking at seeds requires accurate spatial coordination of the head. Pigeons have a special sequence of head movements while pecking, which consists of two stop phases and two thrust phases. Although it has been known for a long time that pigeons stop their head during pecking (Goodale, 1983), the function of the stop phases remained unresolved. Here, we hypothesized that the stop phases are used for motor planning and expected them to be longer the smaller the target stimulus. Pigeons (Columba livia) were observed using high-speed motion capture (Qualisys, Oqus) while they pecked at white circles with diameters between 5 and 32 mm to receive a food reward. Our results show that the duration of both stop phases significantly increased as stimulus size decreased. We also found significant positive correlations between stimulus size and the distances of the beaks to the stimulus. Furthermore, head orientation was pre-adjusted to the target position after the first stop phase and finalized after the second. Therefore, we conclude that the first stop phase is not only used to decide upon a broad area to peck at, as suggested earlier, but also contributes to preparation and motor planning of the final approach to the target.
Phillipou, A., Rossell, S. L., Castle, D. J., Gurvich, C., Hughes, M. E., Nibbs, J. B., Troje, N. F., Abel, L. A.
Background: Anorexia Nervosa (AN) is associated with the highest mortality rate of any mental illness, yet the neurobiological underpinnings of the condition remain unclear. The neural circuitry involved in the production of saccadic eye movements is well understood, and saccadic eye movement tasks have been used to investigate neurobiological deficits in a range of other psychiatric populations. The present study was the first to assess saccadic control in AN. Methods: 24 females with AN and 25 healthy controls matched for age, gender and premorbid intelligence were assessed on a battery of basic saccadic eye movement and visual scanpath tasks, with eyetracking and functional magnetic resonance imaging (fMRI). Results: The AN group displayed a number of deficits in saccadic control including shorter prosaccade latencies, increased memory-guided inhibitory errors, hyperscanning of stimuli, and associated differences in brain activity, relative to healthy controls. Conclusions: The findings suggest a disinhibition of saccadic control in AN, and a potential role of gamma-aminobutyric acid (GABA) in the superior colliculus in the psychopathology of AN.
Kenny, S., Troje, N. F.
Perceiving the weight of a lifted object from visual displays of the lifting person is a non-trivial task. Runeson and Frykholm (1981), who worked with biological motion point-light displays, attributed the ability to estimate the weight of a lifted box to what they called the Kinematic Specification of Dynamics. The KSD assumes that dynamics are inferred from observed kinematic patterns by means of an internal model of the relations between body shape and body kinematics. Using MoSh, that is, Motion and Shape Capture from Sparse Markers (Loper, Mahmood, & Black, 2014), we created animated, life-like human avatars from surface motion capture data of performers lifting light and heavy boxes. For some of our stimuli, we then combined the body shape of one lifter with the kinematics of another to create hybrid lifters. In the consistent condition, stimuli were generated using the shape and movement from the same performer. In the low- and high-inconsistency conditions, the shape and movements of the stimuli were taken from different performers; however, in the former, the shape and motion were from different performers with similar body masses, and in the latter, shape was matched with motion from individuals with dissimilar body masses. Participants estimated the perceived weight of the lifted box. Results showed that participants could discriminate between box weights, although they slightly overestimated the real weights. However, we did not find the expected dependency on internal consistency. Further studies will examine the degree to which larger inconsistencies are detectable, and in which domains internal consistency matters.
Helm, F., Weech, S., Munzert, J., Troje, N. F.
The internal simulation of observed actions is thought to play a crucial role in the process of recognizing action intentions of others (Jeannerod, 2001). In many situations, performers attempt to manipulate this process in order to deceive another. For example, competitive athletes may try to convince opponents that one action is being performed, while another is actually carried out. It has been shown that the visual system is sensitive to the intentions of deceptive movements, although prediction accuracy is reduced (e.g. Grèzes et al., 2004). Real and deceptive actions typically display a degree of spatiotemporal dissimilarity in terms of motion trajectories and temporal dynamics of the movement kinematics. Currently there is no research that examines how these spatiotemporal dissimilarities influence the discriminability of deceptive movements for novice and expert observers. We addressed this question in the context of handball throws. We motion captured deceptive and non-deceptive throwing movements of novice and elite handball field players and used these to generate realistic 3D avatars. In a perceptual task, we asked novice and expert handball players to judge whether observed throws were either deceptive or non-deceptive. The results show that both groups were highly sensitive to deception in throws. Expert observers were significantly better than novices at discriminating throws from both elite and novice performers. In general, discriminability was directly related to spatiotemporal dissimilarities between deceptive and non-deceptive throws. Elite performers produced deceptive throws that were highly similar to non-deceptive throws, resulting in higher misclassifications by observers. We interpreted these findings in the context of prediction errors resulting from internal simulation of kinematically similar deceptive and non-deceptive throws. The higher sensitivity to deception displayed by experts points towards superior action simulation based on stored motor representations for practiced actions, resulting in fewer prediction errors. Future neurophysiological studies should directly address this interpretation.
Cui, A.-X., Diercks, C., Troje, N. F., Cuddy, L. L.
Background and Aims: Our research aimed to explore the role of statistical learning in the process of gaining musical knowledge. We investigated the questions a) whether participants are able to abstract pitch distributional information from novel melodies after 30 minutes of exposure, b) whether this abstraction becomes a mental representation (schema), and c) whether this representation influences responses to melodies generated from a similar, but not identical, pitch distribution. Method: We assessed statistical learning before and after exposure with the probe tone method. Participants listened to novel melodies and indicated the goodness of fit (GoF) of each of 12 chromatic probe tones to these melodies. Melodies heard during exposure and used as probe tone contexts were both based on a whole-tone scale, but differed in 12% of the presented tones. This allowed us to investigate GoF responses to probe tones that had occurred both during exposure and in probe tone contexts (EyPy), probe tones that had occurred in the probe tone contexts, but not during exposure (EnPy), probe tones that had occurred during exposure, but not in probe tone contexts (EyPn), and probe tones that had occurred neither during exposure or in probe tone contexts (EnPn). Afterwards, participants completed a discrimination task, in which they indicated which of two melodies resembled the heard melodies. Results and Conclusions: As expected, prior to exposure, GoF responses were higher for probe tones appearing in the novel melodies than probe tones not appearing in the novel melodies. This finding also held post exposure, but in addition, GoF responses for probe tone category EyPn increased significantly after exposure. This indicates that participants were able to abstract pitch distributional information during 30 min of exposure as well as when readily accessible (heard in the probe tone context), and integrate this information when making GoF responses. The average percent correct of 76% in the discrimination task demonstrates that participants were able to transfer the gained musical knowledge to another task.
Chang, D. H. F., Ban, H., Troje, N. F.
It has long been assumed that the human brain contains dedicated machinery for processing biological motion. Behaviourally, biological motion perception has been shown to implicate mechanisms that are distinct: one that governs the retrieval of global structure (body form) and another purported to process "local" information that is particularly sensitive to the gravity-defined acceleration pattern conveyed by the feet. Here, we used fMRI to dissociate the neural underpinnings of these two mechanisms. We measured responses (N=16 participants) to point-light stimuli containing solely structural information (local horizontal directionality neutralized but global structure intact), solely local information (global structure destroyed, but local information intact), and perturbed local information, presented vertically upright and inverted. Observers were asked to judge walking direction. Results from SVM (MVPA) analyses indicate widespread sensitivity to global structure from early cortex (V1-V3) extending to the inferior and superior parietal lobule, in contrast to comparatively weak sensitivity to local information in cortex. Strikingly, we found significant sensitivity to local information in the subcortical ventral lateral nucleus (VLN), an area not sensitive to global structure. These data suggest that distinct networks are engaged for biological structure versus local motion processing, the latter of which may rely primarily on earlier, subcortical systems.
2015
Papers
Ware, E. L. R., Saunders, D. R., Troje, N. F.
Visual motion, a critical cue in communication, can be manipulated and studied using video playback methods. A primary concern for the video playback researcher is the degree to which objects presented on video appear natural to the non-human subject. Here we argue that the quality of motion cues on video, as determined by the video's image presentation rate (IPR), is of particular importance in determining a subject's social response behaviour. We present an experiment testing the effect of variations in IPR on pigeon (Columba livia) response behavior towards video images of courting opposite-sex partners. Male and female pigeons were presented with three video playback stimuli, each containing a different social partner. Each stimulus was then modified to appear at one of three IPRs: 15, 30 or 60 progressive (p) frames per second. The results showed that courtship behaviour became significantly longer in duration as IPR increased. This finding implies that the IPR significantly affects the perceived quality of motion cues impacting social behaviour. In males we found that the duration of courtship also depended on the social partner viewed and that this effect interacted with the effects of IPR on behaviour. Specifically, the effect of social partner reached statistical significance only when the stimuli were displayed at 60 p, demonstrating the potential for erroneous results when insufficient IPRs are used. In addition to demonstrating the importance of IPR in video playback experiments, these findings help to highlight and describe the role of visual motion processing in communication behaviour.
Michalak, J., Rohde, K., Troje, N. F.
Background and Objectives: Several studies have shown that physical exercise such as walking has effects on depression. These studies have focused on increasing the intensity of physical activity. In the present study, we investigated whether not only the intensity but also the style of physical activity has an effect on depression-related processes. Method: Using a biofeedback technique, we induced participants (39 undergraduates) to change their walking patterns to reflect either the characteristics of depressed patients or a particularly happy walking style. The intensity of walking (i.e., walking speed) was held constant across conditions. During walking, participants first encoded and later recalled a series of emotionally loaded terms (Ramel, Goldin, Eyler, Brown, Gotlib, & McQuaid, 2007). Results: The difference between recalled positive and recalled negative words was much lower in participants who adopted a depressed walking style than in participants who walked as if they were happy. Limitations: The effects of gait manipulation were investigated in a non-clinical group of undergraduates. Conclusions: The observed change in memory bias supports the idea that, beyond the intensity of walking, the style of walking affects vulnerability to depression.
Heenan, A., Troje, N. F.
Orthographically projected biological motion stimuli are depth-ambiguous. Consequently, their projection when oriented towards the viewer is the same as when oriented away. Despite this, observers tend to interpret such stimuli as facing the viewer more often. Some have speculated that this facing-the-viewer bias may exist for sociobiological reasons: Mistaking another human as retreating when they are actually approaching could have more severe consequences than the opposite error. An implication of this theory is that the facing-towards percept may be perceived as more threatening than the facing-away percept. Given this, as well as the finding that anxious individuals have been found to display an attentional bias towards threatening stimuli, we reasoned that more anxious individuals might have stronger facing-the-viewer biases. Furthermore, since anxious individuals have been found to perform poorer on inhibition tasks, we hypothesized that inhibitory ability would mediate the relationship between anxiety and the facing-the-viewer bias (i.e., difficulty inhibiting the threatening percept). Exploring individual differences, we asked participants to complete anxiety questionnaires, to perform a Go/No-Go task, and then to complete a perceptual task that allowed us to assess their facing-the-viewer biases. As hypothesized, we found that both greater anxiety and weaker inhibitory ability were associated with greater facing-the-viewer biases. In addition, we found that inhibitory ability significantly mediated the relationship between anxiety and facing-the-viewer biases. Our results provide further support that the facing-the-viewer bias is sensitive to the sociobiological relevance of biological motion stimuli, and that the threat bias for ambiguous visual stimuli is mediated by inhibitory ability.
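The mediation result reported above follows the usual logic of estimating an indirect path from anxiety through inhibitory ability to the facing-the-viewer bias. The sketch below runs that logic with ordinary least squares on simulated scores; it is an illustration only and does not reproduce the study's data or its specific mediation procedure (e.g., bootstrapped confidence intervals).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
anxiety = rng.normal(size=n)
inhibition = -0.5 * anxiety + rng.normal(scale=0.8, size=n)   # more anxiety -> weaker inhibition (simulated)
ftv_bias = -0.6 * inhibition + rng.normal(scale=0.8, size=n)  # weaker inhibition -> larger bias (simulated)

def ols_coefs(y, predictors):
    """OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

a = ols_coefs(inhibition, [anxiety])[1]             # path a: anxiety -> mediator
b = ols_coefs(ftv_bias, [inhibition, anxiety])[1]   # path b: mediator -> bias, controlling for anxiety
c = ols_coefs(ftv_bias, [anxiety])[1]               # total effect of anxiety on bias
print(f"indirect effect a*b = {a * b:.3f}, total effect c = {c:.3f}")
```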
Cui, A.-X., Collett, M., Troje, N. F., Cuddy, L.
We investigated participants' familiarity and preference judgments towards a novel musical system. We exposed participants to tone sequences generated from a novel pitch probability profile. Afterwards, we asked participants either to identify the more familiar or to identify the preferred tone sequence in a two-alternative forced choice task. The task paired a tone sequence generated from the pitch probability profile they had been exposed to with a tone sequence generated from another pitch probability profile, at three levels of distinctiveness. We found that participants identified tone sequences as more familiar if they were generated from the same pitch probability profile to which they had been exposed. However, participants did not prefer these tone sequences. We interpret this relationship between familiarity and preference as consistent with an inverted U-shaped relationship between knowledge and affect. The fact that participants identified tone sequences as even more familiar if they were generated from a more distinctive (caricatured) version of the pitch probability profile to which they had been exposed suggests that statistical learning of the pitch probability profile is involved in the acquisition of musical knowledge.
Chen, S. C., Xiao, C., Troje, N. F., Robertson, M., Hawryshyn, C. W.
Non-visual photoreceptors with diverse photopigments allow organisms to adapt to changing light conditions. Whereas visual photoreceptors are involved in image formation, non-visual photoreceptors mainly undertake various non-image-forming tasks. They form specialised photosensory systems that measure the quality and quantity of light and enable appropriate behavioural and physiological responses. Chromatophores are dermal non-visual photoreceptors directly exposed to light and they not only receive ambient photic input but also respond to it. These specialised photosensitive pigment cells enable animals to adjust body coloration to fit environments, and play an important role in mate choice, camouflage and ultraviolet (UV) protection. However, the signalling pathway underlying chromatophore photoresponses and the physiological importance of chromatophore colour change remain under-investigated. Here, we characterised the intrinsic photosensitive system of red chromatophores (erythrophores) in tilapia. Like some non-visual photoreceptors, tilapia erythrophores showed wavelength-dependent photoresponses in two spectral regions: aggregations of inner pigment granules under UV and short wavelengths and dispersions under middle and long wavelengths. The action spectra curve suggested that two primary photopigments exert opposite effects on these light-driven processes: SWS1 (short-wavelength sensitive 1) for aggregations and RH2b (rhodopsin-like) for dispersions. Both western blot and immunohistochemistry showed SWS1 expression in integumentary tissues and erythrophores. The membrane potential of erythrophores depolarised under UV illumination, suggesting that changes in membrane potential are required for photoresponses. These results suggest that SWS1 and RH2b play key roles in mediating intrinsic erythrophore photoresponses in different spectral ranges and this chromatically dependent antagonistic photosensitive mechanism may provide an advantage to detect subtle environmental photic change.
Bottari, D., Troje, N. F., Ley, P., Hense, M., Kekunnaya, R., Röder, B.
Functional brain development is characterized by sensitive periods during which experience must be available to allow for the full development of neural circuits and associated behavior. Yet, the neural mechanisms of sensitive period plasticity in humans are not understood. Here we investigated a unique sample of humans who had been blind from birth and had undergone cataract surgery between 4 months and 16 years of age. Electrophysiological and behavioral parameters were assessed during two visual tasks (biological and global motion) for which different developmental trajectories are known. Alpha oscillations, which have been associated with the control of the excitatory/inhibitory balance of neural circuits, were markedly impaired in individuals with a history of congenital cataracts. By contrast, alpha activity was unaffected in a sample of visually impaired individuals without a history of congenital total blindness. Moreover, behavioral recovery was worse in congenital cataract-reversal individuals only in the task in which controls exhibited activity-regulating alpha oscillations (global motion), but not in the task which was unrelated to alpha activity (biological motion) in controls. These results demonstrate a main mechanism of sensitive period plasticity in humans: The development of neural circuits controlling the excitatory/inhibitory balance of neural networks is linked to a sensitive period.
Bardi, L., Di Giorgio, E., Lunghi, M., Troje, N. F., Simion, F.
The present study investigates whether the walking direction of a biological motion point-light display can trigger visuo-spatial attention in 6-month-old infants. A cueing paradigm and the recording of eye movements in a free viewing condition were employed. A control group of adults took part in the experiment. Participants were presented with a central point-light display depicting a walking human, followed by a single peripheral target. In experiment 1, the central biological motion stimulus depicting a walking human could be upright or upside-down and was facing either left or right. Results revealed that the latency of saccades toward the peripheral target was modulated by the congruency between the facing direction of the cue and the position of the target. In infants, as well as in adults, saccade latencies were shorter when the target appeared in the position signalled by the facing direction of the point-light walker (congruent trials) than when the target appeared in the contralateral position (incongruent trials). This cueing effect was present only when the biological motion cue was presented in the upright condition and not when the display was inverted. In experiment 2, a rolling point-light circle with unambiguous direction was adopted. Here, adults were influenced by the direction of the central cue. However no effect of congruency was found in infants. This result suggests that biological motion has a priority as a cue for spatial attention during development.
Symposia and Published Abstracts
Weech, S., Gale, D. J., Troje, N. F.
The 'Viewing from Above' prior (VFA) guides visual perception of ambiguous stimuli in shape-from-shading and shape-from-contour tasks (Reichel & Todd, 1990). Accordingly, the typical Necker cube tends to be reliably perceived from above, particularly for short presentations, although viewing-from-below is equally valid (e.g., Troje, 2010). Initial observations indicated that shape influences the reliability of the VFA: a 'diamond' stimulus (i.e., two attached square-based pyramids) is less likely to be perceived in one consistent orientation. We compared perception of diamond and Necker cube ambiguous line-drawings with four shapes that shared related features: point-up or point-down square-based pyramids and point-up or point-down pyramids attached to a Necker cube. The Necker cube was viewed from above (97%) significantly more often than the diamond (86%). The point-up pyramid was also viewed from above (97%) more often than the point-down pyramid (84%). We also found that the point-up pyramid attached to the Necker cube was viewed from above (98%) more often than the same stimulus when inverted (87%). We suggest that a strong VFA prior moderated by a weaker prior for global convexity is generally consistent with the results. The finding contrasts with research that found equal strength for the two priors in shape-from-shading (Langer & Bülthoff, 2001), but supports research on the unequal strength of the two priors in line drawings (Mamassian & Landy, 1998). Specifically, all stimuli tended to be viewed from above; and the point-up and point-down pyramids adhere to a convexity towards the viewer if they are perceived from above and from below, respectively. The same is true for the pyramids attached to Necker cubes. However, the finding that the diamond stimulus is seen less often from above than the other stimuli cannot be explained.
Michalak, J., Rohde, K., Mischnat, J., Troje, N. F.
Two studies on the effect of bodily manipulation on memory bias will be presented. A negative memory bias, i.e. the tendency to recall more negative than positive words, is one of the most robust findings in research on emotional processes in Major Depression. In study 1, an unobtrusive biofeedback technique was used to change the gait patterns of 39 undergraduates to either reflect the characteristics of depressed patients or a particularly happy walking style. During walking, participants first encoded and later recalled a series of emotionally loaded terms. Participants who adopted a happy walking style recalled a higher proportion of positive words, while participants who adopted a depressed walking style recalled a balanced proportion of positive and negative words. In study 2, 30 currently depressed inpatients either sat in a slumped (depressed) or in an upright (non-depressed) posture while imagining a visual scene of themselves in connection with positive or depression-related words presented to them on a computer screen. An incidental recall test of these words was conducted after a distraction task. The upright-sitting patients showed unbiased recall of positive and negative words while slumped patients showed recall biased towards more negative words. Results of both studies show that changes in the motoric system can affect one of the best-documented biases in depression.
Bottari, D., Troje, N. F., Ley, P., Hense, M., Kekunnaya, R., Röder, B.
Few models in humans allow the investigation of sensitive periods of functional development. The evaluation of the functional recovery in individuals who had suffered from congenital dense bilateral cataracts (opaque lenses that prevent patterned light from reaching the retina) after the cataracts had been surgically removed provides such a rare opportunity. Here we investigated 12 individuals with a history of congenital cataracts in a biological motion processing task while the EEG was recorded. The participants' task was to detect a moving point-light display of a cat amongst moving point-light displays of human walkers (biological motion, BM) and scrambled versions of the latter. The N170 was modulated by BM both in the cataract individuals and in controls. Indeed, both groups were indistinguishable not only with respect to the N170 amplitude modulation but also with respect to the scalp topography and the latency of this effect. In line with the neural results, the congenital cataract individuals performed indistinguishably from their controls in the EEG task and in an independent behavioural task assessing the thresholds for detecting biological motion. Since congenital cataract reversal individuals did not show modulation of the N170 for faces vs. other objects, these data suggest independent developmental trajectories for these visual functions.
Bach, M., Frommherz, V., Lagrèze, W., Troje, N.
Biological motion is a fascinating phenomenon: Not only can we recognize walking with impoverished information, but also species identity, and in humans mood and sex. We here look at a developmental aspect, namely at which age gender recognition becomes successful. Based on a data set by Troje (2002, J Vision, doi:10.1167/2.5.2), we created three sex-neutral walkers for one task: front-facing, left-walking and right-walking. For another task we created male and female walkers with three different levels of "gender strength". Each stimulus was presented twice in a blocked randomized fashion. Children in the age range 3 to 6 years participated; their parents had been fully acquainted with the study and signed a written agreement. In their kindergarten environment, the children were first familiarized with biological motion using the neutral front walker. When they understood the situation and were ready to participate further, two kinds of tasks were presented: Task 1 was to recognize the walking direction of a point walker (right-to-left vs. left-to-right), here called "walking recognition". The second task was to recognize the sex of a point walker ("girl" vs. "boy"), here called "gender recognition". The task outcome could be "non-compliant", "correct" or "incorrect". We found that the youngest age group (2-3 years) reported walking direction at chance level. This rose only slightly for 3- to 4-year-olds. Above 4 years of age walking direction was reported correctly. For gender recognition, there was a significant effect of age (p < 0.01), but gender was recognized one year later than walking direction. There was no significant effect of "gender strength" for the values we had chosen. Our results suggest that during their kindergarten period, children are still developing their biological motion capabilities; by school age, gender recognition has matured.
2014
Papers
Williamson, K. E., Jakobson, L. S., Saunders, D. R., Troje, N. F.
Biological motion perception can be assessed using tasks that require the extraction of local motion cues, the ability to perceptually group these cues to extract information about body structure, and higher-order processes required for action recognition and person identification (Troje, 2013). In the present study, we assessed the impact of prematurity and associated complications on the development of these processes in 8-11 year old children born prematurely at very low birth weight (< 1500 g) and matched, full-term controls. Preterm children exhibited difficulties in all four aspects of biological motion perception. However, intercorrelations between test scores were weak in both full-term and preterm children, a finding that supports the view that these processes are relatively independent. Preterm children also displayed more autistic-like traits than full-term peers. In preterm (but not full-term) children these traits were negatively correlated with performance in the task requiring structure-from-motion processing [r(30) = -.36, p < .05], but positively correlated with the ability to extract identity [r(30) = .45, p < .05]. These findings extend previous reports of vulnerability in systems involved in processing dynamic cues in preterm children, and suggest that a core deficit in social perception/cognition may contribute to the development of social and behavioural difficulties even in members of this population who are functioning within the normal range intellectually. The results could inform the development of screening, diagnostic, and intervention tools.
Weech, S., McAdam, M., Kenny, S., Troje, N. F.
Orthographically-projected biological motion point-light displays are generally ambiguous with respect to their orientation in depth, yet observers consistently prefer the facing-the-viewer interpretation. There has been discussion as to whether this bias can be attributed to the social relevance of biological motion stimuli or relates to local, low level stimulus properties. In the present study we address this question. In Experiment 1, we compared the facing-the-viewer bias produced by a series of four stick figures and three human silhouettes that differed in posture, gender, and the presence vs. absence of walking motion. Using a paradigm in which we asked observers to indicate the spinning direction of these figures, we found no bias when participants observed silhouettes, whereas a pronounced degree of bias was elicited by most stick figures. We hypothesized that the ambiguous surface normals on the lines and dots that comprise stick figures are prone to a visual bias that assumes surfaces to be convex. The local surface orientations of the occluding contours of silhouettes are unambiguous, and as such the convexity bias does not apply. In Experiment 2, we tested the role of local features in ambiguous surface perception by adding dots to the elbows and knees of silhouettes. We found biases consistent with the facing directions implied by a convex body surface. The results unify a number of findings regarding the facing-the-viewer bias. We conclude that the facing-the-viewer bias is established at the level of surface reconstruction from local image features rather than on a semantic level.
Heenan, A., Troje, N. F.
Biological motion stimuli, such as orthographically projected stick figure walkers, are ambiguous about their orientation in depth. The projection of a stick figure walker oriented towards the viewer, therefore, is the same as its projection when oriented away. Even though such figures are depth-ambiguous, however, observers tend to interpret them as facing towards them more often than facing away. Some have speculated that this facing-the-viewer bias may exist for sociobiological reasons: Mistaking another human as retreating when they are actually approaching could have more severe consequences than the opposite error. Implied in this hypothesis is that the facing-towards percept of biological motion stimuli is potentially more threatening. Measures of anxiety and the facing-the-viewer bias should therefore be related, as researchers have consistently found that anxious individuals display an attentional bias towards more threatening stimuli. The goal of this study was to assess whether physical exercise (Experiment 1) or an anxiety induction/reduction task (Experiment 2) would significantly affect facing-the-viewer biases. We hypothesized that both physical exercise and progressive muscle relaxation would decrease facing-the-viewer biases for full stick figure walkers, but not for bottom- or top-half-only human stimuli, as these carry less sociobiological relevance. On the other hand, we expected that the anxiety induction task (Experiment 2) would increase facing-the-viewer biases for full stick figure walkers only. In both experiments, participants completed anxiety questionnaires, exercised on a treadmill (Experiment 1) or performed an anxiety induction/reduction task (Experiment 2), and then immediately completed a perceptual task that allowed us to assess their facing-the-viewer bias. As hypothesized, we found that physical exercise and progressive muscle relaxation reduced facing-the-viewer biases for full stick figure walkers only. Our results provide further support that the facing-the-viewer bias for biological motion stimuli is related to the sociobiological relevance of such stimuli.
Heenan, A., Best, M. W., Ouellette, S. J., Meiklejohn, E., Troje, N. F., Bowie, C. R.
Stigma towards individuals diagnosed with schizophrenia continues despite increasing public knowledge about the disorder. Questionnaires are used almost exclusively to assess stigma despite self-report biases affecting their validity. The purpose of this experiment was to implicitly assess stigma towards individuals with schizophrenia by measuring visual perceptual biases immediately after participants conversed with a confederate. We manipulated both the diagnostic label attributed to the confederate (peer vs. schizophrenia) and the presence of behavioural symptoms (present vs. absent). Immediately before and after conversing with the confederate, we measured participants' facing-the-viewer (FTV) biases (the preference to perceive depth-ambiguous stick-figure walkers as facing towards them). As studies have suggested that the FTV bias is sensitive to the perception of threat, we hypothesized that FTV biases would be greater after participants conversed with someone they believed had schizophrenia, and also after they conversed with someone who presented symptoms of schizophrenia. We found partial support for these hypotheses. Participants had significantly greater FTV biases in the Peer Label/Symptoms Present condition. Interestingly, while FTV biases were lowest in the Schizophrenia Label/Symptoms Present condition, participants in this condition were most likely to believe that people with schizophrenia should face social restrictions. Our findings support that both implicit and explicit beliefs help develop and sustain stigma.
Symposia and Published Abstracts
Weech, S., Troje, N. F.
Depth-ambiguous point-light walkers are most frequently seen as facing-the-viewer (FTV). Inverting the figures considerably reduces this FTV bias (Vanrie et al., 2004). The finding has been used to argue that the FTV bias depends on recognizing the stimulus as a person, which is more difficult when the stimulus is inverted. Recent experiments indicate that the FTV bias is largely caused by a bias to perceive depth-ambiguous surfaces as convex (Weech and Troje, 2013). Based on this research, we hypothesized that the effect of inversion on FTV bias arises due to the difficulty with which coherent 3D shape is resolved from inverted point-light walkers. Without this shape, the stimulus appears 'flat' and the convexity bias does not play out. If explicit, coherent shape is provided (as in stick figures) we would expect no effect of inversion on FTV bias. We measured the FTV bias in 30 participants for upright and inverted point-light walkers and stick figures. We depicted stimuli at frontal and three-quarter views and recorded observers' perceived facing directions. We defined the FTV bias as the percentage of responses signaling a facing-towards interpretation. Participants accurately chose one of the two veridical interpretations at a rate of over 95% for both stimulus types. We found an interaction between stimulus representation and orientation: The inversion effect for stick figures (44%) was smaller than that for point-light walkers (55%). This result supports our hypothesis to a limited degree. Unexpectedly, both stimulus types generated a reliable facing-away bias when inverted. Results are consistent with the hypothesis that the lower part of the stimulus takes precedence when subjects are making judgments of facing directions, given that the knees and elbows are opposing in terms of the facing direction implied when assumed to be convex.
Kenny, S., Troje, N. F.
The degree of perspective distortion of an object depends on the ratio of its size to its distance from the rendering camera (the field-of-view, FOV). Previously, researchers have reported that sufficient amounts of linear perspective can disambiguate the direction of an otherwise depth-ambiguous point-light display (e.g., Schouten & Verfaillie, 2010). Based on their finding that the effect of FOV on the facing-the-viewer (FTV) bias is modulated by the height of the camera above ground, Troje, Kenny, and Weech (2013) hypothesised that this observation is not based on linear perspective per se, but is rather the result of a bias to see the walker's feet from above rather than from below. Here, we test explicitly whether the previously reported effects of linear perspective are caused by camera elevation changes that pit an FTV bias against a very strong viewing-from-above bias. We asked participants to indicate the perceived facing direction of point-light displays, and modified the camera elevation according to a staircase procedure targeting the 25%, 50% and 75% FTV thresholds. A univariate ANOVA showed that camera elevation caused large changes in perceived facing direction at the three FTV bias thresholds: 25% (M = -12.87°, SD = 10.14°), 50% (M = -6.27°, SD = 10.99°), or 75% (M = -1.00°, SD = 8.28°), F(2, 22) = 16.04, p < .001. Increasing amounts of negative elevation, below the horizontal plane, led to the perception of point-light displays as facing away from the viewer. Most importantly, the resulting psychometric function is identical to those obtained with linear perspective methods that incidentally modify camera elevation (Troje, Weech, & Kenny, 2013). We argue that camera elevation, not linear perspective, produces the previously observed modifications of the facing-the-viewer bias of depth-ambiguous point-light walkers.
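For readers unfamiliar with adaptive staircases that target points other than 50%, the sketch below illustrates one common choice, a Kaernbach-style weighted up-down rule driving camera elevation. The abstract does not specify the exact rule used, so the procedure, step sizes, units, and the simulated observer here are illustrative assumptions only.

    import numpy as np

    rng = np.random.default_rng(1)

    def weighted_staircase(target_p, respond, start_elevation=0.0, base_step=2.0, n_trials=80):
        # Kaernbach-style weighted up-down staircase (illustrative choice, not
        # necessarily the rule used in the study).  respond(elevation) returns True
        # for a 'facing towards' judgment.  The ratio of the two step sizes makes
        # the elevation converge to the level at which P('towards') equals target_p.
        step_up = base_step * target_p            # applied after an 'away' response
        step_down = base_step * (1 - target_p)    # applied after a 'towards' response
        elevation, track = start_elevation, []
        for _ in range(n_trials):
            if respond(elevation):
                elevation -= step_down            # make 'away' more likely next trial
            else:
                elevation += step_up              # make 'towards' more likely next trial
            track.append(elevation)
        return float(np.mean(track[-20:]))        # crude threshold estimate

    def simulated_observer(elevation, pse=-6.0, slope=0.2):
        # Hypothetical observer: probability of a 'towards' response grows with
        # camera elevation (negative elevations favour 'facing away').
        p_towards = 1.0 / (1.0 + np.exp(-slope * (elevation - pse)))
        return rng.random() < p_towards

    for p in (0.25, 0.50, 0.75):
        print(p, weighted_staircase(p, simulated_observer))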
Heenan, A., Troje, N. F.
Biological motion stimuli, depicted as orthographically projected stick-figure walkers (SFWs), do not contain any information about their orientation in depth: A fronto-parallel projection of an SFW facing the viewer is the same as its projection facing away. Despite this depth-ambiguity, observers tend to interpret SFWs as facing the viewer more often (Vanrie et al., 2004). Some researchers have speculated that this facing-the-viewer (FTV) bias has a sociobiological explanation: Mistaking another human as retreating when he/she is actually approaching is assumed to be more costly than making the opposite mistake. Indeed, there appears to be support for this, as observers tend to have greater FTV biases for male walkers than for female walkers (Brooks et al., 2008; Schouten et al., 2010). We have also observed positive correlations between anxiety and FTV biases in our lab (Heenan et al., 2012). The goal of this study was to investigate whether physical exercise, which is known to reduce anxiety, would significantly reduce FTV biases for SFWs. We employed a 3 (Stimulus Type: full SFW, bottom-half-only, top-half-only; within-subjects) x 3 (Exercise Condition: standing, walking, or jogging on a treadmill; between-subjects) mixed design. We hypothesized that physical exercise would decrease FTV biases for the full SFWs only, as bottom-half- and top-half-only SFWs carry less sociobiological relevance than full SFWs. Sixty-six participants completed anxiety questionnaires, performed the treadmill task (10 min), and then immediately completed the SFW task. As hypothesized, physical exercise reduced FTV biases for the full SFW stimuli only. Furthermore, anxiety (measured before the treadmill task) was significantly correlated with FTV biases for the standing condition only. Our results suggest that the FTV bias for biological motion stimuli may indeed have a sociobiological basis.
Heenan, A., Best, M. W., Ouellette, S. J., Meiklejohn, E., Troje, N. F., Bowie, C. R.
Background: Stigma towards individuals diagnosed with schizophrenia continues despite increasing public knowledge about the disorder. Questionnaires are used almost exclusively to assess stigma despite self-report biases affecting their validity. Perceived threat is known to be a key element in stigma, and recently researchers have argued that perceptual biases for human-like point-light walkers (PLFs) may be moderated by perceived threat. Specifically, researchers have found that a greater facing-the-viewer (FTV) bias for depth-ambiguous PLFs (i.e., a bias to see these figures as facing towards you more often) is associated with greater perceived threat (e.g., more anxious people tend to have greater FTV biases). Observing perceptual biases elicited by such figures may therefore provide an implicit method of analyzing stigma because it allows researchers to assess perceived threat without asking participants about this directly. The purpose of this experiment was to implicitly assess stigma towards individuals with schizophrenia by measuring participants' FTV biases immediately before and after participants conversed with a confederate. Methods: Participants entered the laboratory and immediately completed the perceptual bias task. The perceptual bias task consisted of short (0.5 s) presentations of rotating PLFs, after which participants reported which way they perceived the figure rotating (i.e., clockwise or counter-clockwise). Unbeknownst to participants, all PLFs were rendered rotating counter-clockwise, but could be perceived rotating in either direction because of their depth-ambiguous nature. We could thus calculate each participant's FTV bias by comparing their responses with the veridical walker positions. Once participants completed the initial perceptual bias task, they then conversed with a confederate for 10 minutes. We manipulated both the diagnostic label attributed to the confederate (peer vs. schizophrenia) and the presence of behavioural symptoms (present vs. absent). Immediately after conversing with the confederate, we again measured participants' FTV biases using the perceptual bias task. Following the completion of the PLF task, we administered an explicit measure of stigma as well (i.e., the Community Attitudes toward the Mentally Ill questionnaire, or CAMI). Results: As researchers have found that stronger FTV biases are elicited by more threatening stimuli, we hypothesized that FTV biases would be greater after participants conversed with someone they believed had schizophrenia, and also after they conversed with someone who presented symptoms of schizophrenia. We found partial support for these hypotheses. Participants had significantly greater FTV biases in the Peer Label/Symptoms Present condition. Interestingly, while FTV biases were lowest in the Schizophrenia Label/Symptoms Present condition, participants in this condition were most likely to believe that people with schizophrenia should face social restrictions. Thus, we found that participants felt significantly more threatened in the condition where they thought they were conversing with a peer, but the person was displaying symptoms of schizophrenia. However, even when they felt less threatened (i.e., as in the Schizophrenia Label/Symptoms Present condition), they still harboured negative beliefs about people with schizophrenia. Discussion: Our findings support that both implicit and explicit beliefs help develop and sustain stigma.
Our study was the first to assess the feasibility of using the FTV bias for PLFs as an implicit measure of perceived threat. The results of our study are promising, and suggest that implicit assessments of stigma may be able to provide much more information than explicit measures alone. This has implications for future stigma research, as the PLF task that we used in the present study can provide a fast and easy-to-administer implicit assessment of stigma. Future research on stigma using implicit measures is necessary in order to identify the contributions that implicitly and explicitly held beliefs have towards the development and maintenance of stigma towards individuals with schizophrenia.
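The mapping from rotation judgments to an FTV measure can be written out in a few lines. The sketch below is a minimal illustration under assumed conventions (0 deg means facing the viewer, veridical rotation counter-clockwise as seen from above), not the task code used in the study: a reported clockwise rotation implies the depth-reversed interpretation, whose orientation is the veridical orientation mirrored about the image plane.

    import numpy as np

    def ftv_bias(veridical_deg, reported_clockwise):
        # Illustrative conventions (assumptions, not the study's code): 0 deg means
        # facing the viewer; stimuli veridically rotate counter-clockwise as seen
        # from above.  Reporting clockwise rotation implies the depth-reversed
        # interpretation, i.e. the orientation mirrored about the image plane
        # (theta -> 180 - theta).  The FTV bias is the proportion of trials on
        # which the perceived orientation faces the viewer (|theta| < 90 deg).
        veridical = np.asarray(veridical_deg, dtype=float)
        flipped = np.asarray(reported_clockwise, dtype=bool)
        perceived = np.where(flipped, 180.0 - veridical, veridical)
        perceived = (perceived + 180.0) % 360.0 - 180.0   # wrap into [-180, 180)
        return float(np.mean(np.abs(perceived) < 90.0))

    # Four hypothetical trials at orientations 30, 120, 210 and 300 deg, with
    # clockwise rotation reported on the second and fourth trial -> bias of 0.5.
    print(ftv_bias([30, 120, 210, 300], [False, True, False, True]))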
Heenan, A., Baetz-Dougan, M., Tao, C., Troje, N. F.
Orthographically projected biological motion stimuli are depth-ambiguous and so their projection when oriented towards the viewer is the same as when oriented away. Despite this, observers tend to interpret such stimuli as facing the viewer more often, and some have argued that this facing-the-viewer (FTV) bias may exist for sociobiological reasons. We assessed participants' anxiety, assessed their inhibitory ability using a Go/No-Go task, and then assessed their FTV bias. We found that inhibitory ability significantly mediated the relationship between anxiety and the FTV bias (i.e., more anxious individuals had difficulty inhibiting the facing-the-viewer percept, resulting in greater FTV biases).
Fini, C., Bardi, L., Troje, N., Committeri, G., Brass, M.
Recent results suggest that the perception of extrapersonal space might be filtered not only by our own motion potentialities but also by the motor potentialities of others. We investigated whether the simulation of a walking action shapes our extrapersonal space perception. In three experiments we took advantage of biological motion displays as primes for a distance judgment task. In Exp. 1, participants were presented with a point-light walker or a scrambled motion, and judged the location ('Near' or 'Far') of a target with a human body or an inanimate object as reference frames (RFs). In Exp. 2, participants were primed with point-light walkers of different speeds: a runner, a normal and a slow walker. Finally, in Exp. 3 we displayed a sitting down or a standing up point-light motion before the presentation of a target-oriented or a non-target-oriented human body as RF. Results show a reduced perceived distance when the human body RF was primed with a point-light walker (Exp. 1). Furthermore, we found an additional reduction of the perceived distance when priming with a runner (Exp. 2). Finally, Exp. 3 shows the necessity of inferring the intention to cover the distance as a precondition for priming effects of the point-light walker.
Baetz-Dougan, M., Troje, N. F.
Risk, or food intake variability, influences animals' preference for constant or variable alternatives during foraging. A modified version of Scalar Expectancy Theory (SET) has shown predictive ability in the past, but is limited in its account of response effort. In the present study, pigeons performed a colour discrimination task in two exposure conditions. The effort condition incorporated variable pecking while the delay condition acted as a control by matching the temporal duration. We found that variable preference was reduced in the effort condition compared to the delay condition. Our results suggest that magnitude estimation and attention have implications for the modified SET.
2013
Papers
Troje, N. F., Aust, U.
Biological motion point-light displays are a rich and versatile instrument to study perceptual organization. Humans are able to retrieve information from biological motion through at least two different channels: The global articulated structure as revealed by the non-rigid, yet highly constrained deformation of the dot pattern, and the characteristics of local motion trajectories of individual dots. Here, we tested eight pigeons on a task in which they had to discriminate a left-facing from a right-facing biological motion point-light figure. Since the two stimuli were mirror-flipped versions of each other, we were not sure if the birds would be able to solve the task at all. However, all birds learned the discrimination quickly and performed at high accuracy. We then challenged them with a number of test trials introduced into the sequence of the normal training trials. Tested on backwards moving walkers, the majority of the birds indicated that they used local motion cues to solve the training task, while the remaining birds obviously used global, configural cues. Testing the pigeons on different versions of scrambled biological motion confirmed that each individual bird had made a clear decision for one of the two potentially available strategies. While we confirm a previously described local precedence in processing visual patterns, the fact that some birds used global features suggests that even the birds that relied on local cues probably possess the perceptual abilities to use global structure, but 'chose' not to use them.
Sabbah, S., Troje, N. F., Gray, S. M., Hawryshyn, C. W.
Humans use three cone photoreceptor classes for colour vision, yet many birds, reptiles and shallow-water fish are tetrachromatic and use four cone classes. Screening pigments, which narrow the spectrum of photoreceptors in birds and diurnal reptiles, render visual systems with four cone classes more efficient. To date, however, the question of tetrachromacy in shallow-water fish that, like humans, lack screening pigments, is still unsolved. We raise the possibility that tetrachromacy in fish has evolved in response to higher spectral complexity of underwater light. We compared the dimensionality of colour vision in humans and fish by examining the spectral complexity of the colour signal reflected from objects into their eyes. We show that fish require four to six cone classes to reconstruct the colour signal of aquatic objects at the accuracy level achieved by humans viewing terrestrial objects. This is because environmental light, which alters the colour signals, is more complex and contains more spectral fluctuations underwater than on land. We further show that fish cones are better suited than human cones to detect these spectral fluctuations, suggesting that the capability of fish cones to detect high-frequency fluctuations in the colour signal confers an advantage. Taken together, we propose that tetrachromacy in fish has evolved to enhance the reconstruction of complex colour signals in shallow aquatic environments. Of course, shallow-water fish might possess fewer than four cone classes; however, this would come with the inevitable loss in accuracy of signal reconstruction.
Lillicrap, T. P., Moreno-Briseño, P., Diaz, R., Tweed, D. B., Troje, N. F., Fernandez-Ruiz, J.
While sensorimotor adaptation to prisms that displace the visual field takes minutes, adapting to an inversion of the visual field takes weeks. In spite of a long history of study, the basis of this profound difference remains poorly understood. Here we describe the computational issue which underpins this phenomenon and present experiments designed to explore the mechanisms involved. We show that displacements can be mastered without altering the update rule used to adjust motor commands. In contrast, inversions flip the sign of crucial variables called sensitivity derivatives, variables which capture how changes in motor commands affect task error, and therefore require an update of the feedback learning rule itself. Models of sensorimotor learning that assume internal estimates of these variables are known and fixed predicted that when the sign of a sensitivity derivative is flipped, adaptations should become increasingly counterproductive. In contrast, models that relearn these derivatives predict that performance should initially worsen, but then improve smoothly and remain stable once the estimate of the new sensitivity derivative has been corrected. Here, we evaluated these predictions by looking at human performance on a set of pointing tasks with vision perturbed by displacing and inverting prisms. Our experimental data corroborate the classic observation that subjects reduce their motor errors under inverted vision. Subjects' accuracy initially worsened, and then improved. However, improvement was jagged rather than smooth and performance remained unstable even after 8 days of continually inverted vision, suggesting that subjects improve via an unknown mechanism, perhaps a combination of cognitive and implicit strategies. These results offer a new perspective on classic work with inverted vision.
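The computational issue can be illustrated with a toy one-dimensional simulation (our own sketch, not the authors' model): a gradient-style update of a motor command uses an internal estimate of the sensitivity derivative; if the true derivative's sign flips, as under inverting prisms, a fixed estimate drives the error up, whereas re-estimating the derivative lets the error first grow and then shrink again.

    import numpy as np

    def simulate(true_sign, relearn, n_trials=60, lr=0.2, eta=0.05):
        # Toy 1-D pointing model (illustrative assumptions only).  The sensed error
        # e depends on the motor command u through a sensitivity derivative whose
        # sign is +1 normally and -1 under inverting prisms.  The command is updated
        # with an internal estimate s_hat of that derivative; if relearn is True,
        # s_hat itself is re-estimated from observed command and error changes.
        target = 1.0
        u, s_hat = 0.0, 1.0          # estimate calibrated to the normal mapping
        prev_u = prev_e = None
        errors = []
        for _ in range(n_trials):
            e = true_sign * u - target                    # task error on this trial
            errors.append(abs(e))
            if relearn and prev_u is not None and u != prev_u:
                observed = (e - prev_e) / (u - prev_u)    # empirical derivative de/du
                s_hat += eta * (observed - s_hat)         # slowly update the estimate
            prev_u, prev_e = u, e
            u -= lr * s_hat * e                           # gradient-style command update
        return errors

    for relearn in (False, True):
        err = simulate(true_sign=-1, relearn=relearn)
        print("relearn derivative:", relearn, "| final |error|:", round(err[-1], 3))

With a fixed estimate the error grows on every trial; with a relearned estimate it worsens at first and then decays, matching the qualitative prediction described above.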
Proceedings
Kuznetsova, A., Troje, N. F., Rosenhahn, B.
Due to the rapid development of the virtual reality industry, realistic modeling and animation are becoming more and more important. In this paper, we propose a method to synthesize both human appearance and motion given semantic parameters, as well as to create realistic animation of still meshes and to synthesize appearance based on a given motion. Our approach is data-driven and allows us to correlate two databases containing shape and motion data. The synthetic output of the model is evaluated quantitatively and in terms of visual plausibility.
Book Chapters
Troje, N. F.
Biological motion perception refers to the ability to derive a wealth of information from the mere movement of a few isolated light dots that move along with a person to whom they are attached. The appeal of biological motion stems from the discrepancy between the sparseness of the visual stimulus itself and the richness of the information that it appears to provide to our visual system. The enduring interest in this phenomenon is due to the fact that it allows us to study one of the cardinal questions of perception: How does our brain turn generally noisy, incomplete, ambiguous sensory data into a consistent, stable, and predictable model of the world? The answer to this question lies in the fact that perception is not based on current sensory information alone, but also on expectations based on previously learned knowledge about the statistics of the world.
Symposia and Published Abstracts
Weech, S., Troje, N. F.
Point-light walkers (PLWs) and stick figures generally do not contain information about their in-depth orientation, yet observers prefer to report a facing-the-viewer (FTV) interpretation (Vanrie et al., 2004). It is not clear why this is the case. While silhouette figures are equally depth-cue deprived, results show that the well-known Kayahara silhouette does not elicit the same facing bias as PLWs do (Troje & McAdam, 2010). If stick figures are subject to an FTV bias but the silhouette is not, then the cause for the facing bias must rest in one of the several differences between the two stimuli. In order to isolate the critical features, we measured FTV bias while systematically manipulating a number of attributes that differ between the two stimuli: gender, posture, dynamic vs. static presentation, and display type (sticks vs. outline). We asked 10 observers to continuously report rotation directions (clockwise/counter-clockwise) of all displays and measured FTV bias as the proportion of reversals from the 'away' to the 'towards' interpretation. Results indicated that most stick figures elicited a facing bias that was not present in silhouettes. Interestingly, the static, standing stick figure, which has little variation in depth along the anterior-posterior direction, elicited no bias. These findings provide a compelling tool for explaining the FTV bias in terms of differences between silhouettes, the standing stick figure, and other variants. Our findings strongly suggest that the facing bias is driven by local stimulus features. Interpretations are discussed.
Weech, S., Troje, N. F.
Point-light walkers generally contain no information about their orientation in depth, yet observers consistently prefer the facing-the-viewer (FTV) interpretation (Vanrie et al., 2004). Some research (Schouten et al., 2011) suggests that local stimulus properties elicit the bias: Presentation of the lower half of point-light figures elicits a pronounced FTV bias, while presentation of the upper half does not. Other research suggests high-level causes: Male walkers generate a stronger FTV bias than female walkers (Brooks et al., 2004). Interestingly, no FTV biases are observed with human silhouettes (Troje & McAdam, 2010). We hypothesise that the FTV bias is due to a convexity prior (Mamassian & Landy, 1998). Accordingly, the knees afford an FTV interpretation while the elbows, specifically when pointing back (as typical for women) rather than sideways (as in males), afford a facing-away interpretation. This requires visual structures other than occluding contours, which are neither concave nor convex with respect to the line of sight. Here we asked observers to indicate perceived rotation directions (clockwise/counter-clockwise) of a silhouette of a crouching human figure with knees pointing forward and elbows pointing back. Four conditions were included: silhouettes presented with no markers, with markers at the centre of the knees, with markers at the centre of the elbows, and with markers on both knees and elbows. We measured facing bias as the proportion of depth reversals from the 'away' to the 'toward' interpretation. The silhouette alone elicited a weak facing bias, as did the silhouette with both elbow and knee markers present. As predicted, silhouettes with knee markers only elicited an FTV bias while elbow markers alone elicited a notable facing-away bias. These results help to interpret and unify various findings regarding the FTV bias, and support the idea that an experientially-driven convexity prior guides interpretations of depth-ambiguous figures.
Troje, N. F., Kenny, S., Weech, S.
Recent studies have successfully used linear perspective to disambiguate otherwise ambiguous projections of point-light displays. The observation implies that observers are sensitive to linear perspective even though the human body is amorphous, deforms during walking and does not provide obvious cues to linear perspective such as parallel lines, right angles, or texture gradients. However, a perspective camera located half a walker's height above the ground looks down at the feet and up at the head. We hypothesize that the effect of "perspective" in fact reflects a viewing-from-above bias operating on the feet, whose motion is much more salient than that of the head. Using a staircase procedure, we measured PSE and slope of the psychometric function relating percentage of perceived facing direction to the amount of perspective (quantified as the visual angle the walker subtends at the camera location) at three different vertical camera locations: at the level of the feet, at the level of the head, and halfway in between. With the camera at the height of the walker's head, visual angle at PSE is 3.8 deg and the slope of the psychometric function is 5.4 %/deg. At half that height PSE and slope are 6.7 deg and 3.2 %/deg, respectively. With the camera at floor level visual angle never converges to a stable value, reaching a PSE larger than 45 deg and a slope smaller than 0.6 %/deg at the end of an 80-trial staircase. The results imply that it is not perspective itself, but the projection of the feet seen slightly from above, which disambiguates the facing direction of the walker.
Troje, N. F., Kenny, S., Weech, S.
Orthographically projected point-light walkers are ambiguous with respect to depth. For instance, fronto-parallel views from the front and from the back of a bilaterally symmetric walker result in identical projections. In a number of recent studies it was shown that the introduction of linear perspective can gradually disambiguate perceived facing direction. This observation seems to imply that observers are sensitive to linear perspective in point-light displays. We hypothesize that effects of using an approaching perspective camera have nothing to do with linear perspective per se, but with the fact that the camera looks down at the feet, at least when its vertical location is above ground level. The two hypotheses can be distinguished experimentally: If the effect is due to linear perspective, only the distance between camera and walker matters. If it is a result of looking down at the feet, the camera elevation angle (height/distance) determines the effect. Using a staircase procedure, we measured PSE and slope of the psychometric function relating percentage of perceived facing direction to the amount of perspective (quantified in terms of field-of-view angle, i.e. the visual angle subtended by the walker) in 10 participants. Three vertical camera levels were used within-subjects: feet level, half walker height, and head level. As the camera is lowered from head- to mid-level, field-of-view angles at PSE increase from 3.8 deg to 6.7 deg, and the slope of the psychometric function decreases from 3.7 %/deg to 2.4 %/deg. If the camera is at floor level, the field-of-view angle never converges to a stable value, reaching a nominal PSE of 43 deg and a slope of only 0.55 %/deg at the end of an 80-trial staircase. The data imply that it is not perspective, but the projection of the feet seen slightly from above, that disambiguates the perceived facing direction of the walker.
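The geometric point, that the camera's elevation angle rather than perspective per se is what changes, can be made explicit with elementary trigonometry. The sketch below uses our own illustrative conventions (pinhole camera, walker's feet at the origin, an assumed walker height of 1.8 m), not the rendering code from these studies: moving the camera closer increases the field-of-view angle at every camera height, but only cameras above floor level also look down at the feet.

    import math

    def walker_view(camera_height, camera_distance, walker_height=1.8):
        # Angles in degrees for a pinhole camera viewing an upright walker whose
        # feet sit at the origin (illustrative conventions, not from the studies).
        angle_to_feet = math.atan2(-camera_height, camera_distance)
        angle_to_head = math.atan2(walker_height - camera_height, camera_distance)
        field_of_view = math.degrees(angle_to_head - angle_to_feet)  # angle subtended by the walker
        feet_elevation = math.degrees(math.atan2(camera_height, camera_distance))  # looking down at the feet
        return field_of_view, feet_elevation

    for distance in (20.0, 10.0, 5.0, 2.5):
        for label, height in (("head level", 1.8), ("mid level", 0.9), ("floor level", 0.0)):
            fov, elevation = walker_view(height, distance)
            print(f"{label:11s} distance {distance:4.1f} m: FOV {fov:5.1f} deg, feet seen from {elevation:4.1f} deg above")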
Refling, E. J., Heenan, A., Troje, N. F., MacDonald, T. K.
The purpose of this study was to examine how attachment anxiety and feelings of loneliness interact to influence perceptions of an ambiguous point-light walker. During prescreening, participants completed the Experiences in Close Relationships-Revised Questionnaire (Fraley et al., 2000). Subsequently, 143 participants were instructed to write about a time in their lives when they felt lonely (threat condition) or to write about what it is like to walk around their university campus (non-threat condition; Hicks et al., 2010). Participants were then exposed to many trials depicting a point-light walker in which they had to identify its direction of rotation. Analyses revealed that individuals high (vs. low) in attachment anxiety were more likely to view the figure as walking toward them; however, when participants were primed with loneliness, this perceptual difference was not observed. Explanations for these results are discussed as well as important theoretical implications for the literature on attachment.
Konar, Y., Troje, N. F.
In this study we tested participants from two different age groups on eight tests that are designed to assess different aspects of biological motion perception, such as perception of the direction of motion, the ability to detect a point-light walker in noise, sensitivity to local biological motion invariants, the ability to extract structure from motion, action recognition, gender discrimination, and two versions of an identity test: naming and recognition of stick figures. Results for 26 older adults (mean 64 y) are compared to a group of 30 young adults (mean 23 y) from Saunders and Troje (2011). Consistent with Bennett et al. (2006), older adults needed more coherent dots to tell direction of motion. Consistent with Pilz et al. (2010), older adults could not tolerate as many noise dots as young adults, implicating a problem with figure-ground segregation. Scrambled walkers were used to test sensitivity to local biological motion invariants, and results indicate that older adults have lower sensitivity. Finally, the recognition test was administered twice with an intermittent naming test. Older adults did not differ from young adults on block one but had a significant decrease in performance on block two. This indicates that older adults are generally capable of identifying stick figures when tested without interruptions. Older adults did not differ from young adults on the other tests, indicating that the ability to extract structure from motion, action recognition, gender discrimination, and the ability to identify walkers do not decay with age.
Cui, A.-X., Collett, M., Troje, N. F., Cuddy, L. L.
With sufficient exposure to melodies generated by a novel second-order rule system (Loui, Wessel & Hudson Kam, 2010) new melodies are recognized, thus demonstrating acquisition of the statistical regularities of the system. This study is concerned with a first-order rule system: pitch distribution. We explored an interaction between distinctiveness of the distribution and retrieval instructions (explicit or implicit) on melody recognition. Melodic sequences were created at three levels of distinctiveness using an algorithm provided by Smith and Schmuckler (2004). Sequences were randomly generated from Temperley's pitch model (2007) with the pitch profile of the Hypophrygian or Lydian mode (Huron, 2006). 82 participants with little or no formal music education were recruited. In the training phase of the experiment, all participants were exposed to 100 tone sequences generated from one of the two pitch distributions at one level of distinctiveness. The testing phase paired a melody from the exposed distribution with one from the unexposed distribution at each level of distinctiveness (10 trials per level of distinctiveness). Half of the participants were asked which melody they found more familiar (explicit retrieval instruction), the other half was asked which melody they found more pleasant (implicit retrieval instruction). Participants under familiarity instructions performed significantly higher than chance but those under pleasantness instructions did not. Moreover, under familiarity instructions but not pleasantness instructions, level of distinctiveness was significant. The most distinctive level was considered most familiar, the least distinctive level least familiar. This demonstrates that participants could generalize the exposed pitch profile and is evidence for acquisition of a first-order rule system. Findings hint at dissociation between knowledge and affect, as suggested by Loui, Wessel, and Hudson Kam (2010).
2012
Papers
Rutherford, M. D., Troje, N. F.
Biological motion is easily perceived by neurotypical observers when encoded in point-light displays. Some but not all relevant research shows significant deficits in biological motion perception among those with ASD, especially with respect to emotional displays. We tested adults with and without ASD on the perception of masked biological motion and the perception of direction from coherent and scrambled biological motion. Within the autism spectrum group, there was a large and statistically significant relationship between IQ and the ability to perceive directionality in masked biological motion. There were no group differences in sensitivity to biological motion or the ability to identify the direction of motion. Possible implications are discussed, including the use of compensatory strategies in high IQ ASD.
Livne, M., Sigal, L., Troje, N. F., Fleet, D.
It is well known that biological motion conveys a wealth of socially meaningful information. From even a brief exposure, biological motion cues enable the recognition of familiar people, and the inference of attributes such as gender, age, mental state, actions and intentions. In this paper we show that from the output of a video-based 3D human tracking algorithm we can infer physical attributes (e.g., gender and weight) and aspects of mental state (e.g., happiness or sadness). In particular, with 3D articulated tracking we avoid the need for view-based models, specific camera viewpoints, and constrained domains. The task is useful for man-machine communication, and it provides a natural benchmark for evaluating the performance of 3D pose tracking methods (vs. conventional Euclidean joint error metrics). We show results on a large corpus of motion capture data and on the output of a simple 3D pose tracker applied to videos of people walking.
Legault, I., Troje, N. F., Faubert, J.
Healthy aging is associated with a number of perceptual changes, but measures of biological motion perception have yielded conflicting results. Biological motion provides information about a walker, from gender and identity, to speed, direction, and distance. In our natural environment, as someone approaches us (closer distances), this walker spans larger areas of our field of view, the extent of which can be under-utilized with age. Yet, the effect of age on biological motion perception in such real-world scenarios remains unknown. We assessed the effect of age on discriminating walking direction in upright and inverted biological motion patterns, positioned at various distances in virtual space. Findings indicate that discrimination is worse at closer distances, an effect exacerbated by age. Older adults' performance decreases at a distance as far away as 4 m, whereas younger adults maintain their performance as close as 1 m (worse at 0.5 m). This suggests that older observers are limited in their capacity to integrate information over larger areas of the visual field, and supports the notion that age-related effects are more apparent when larger neural networks are required to process simultaneous information. This has further implications for social contexts where information from biological motion is critical.
Symposia and Published Abstracts
Williamson, K. E., Jakobson, L. S., Saunders, D. R., Troje, N. F.
Biological motion perception involves several distinct processes (Troje, 2008). Deficits in one of these (detection of structure-from-motion) have been documented in preterm children (Pavlova et al., 2008; Taylor, et al., 2009). Here, we show that 8-11 year old children born very prematurely (< 32 weeks gestation) are impaired, relative to full term controls, in the extraction of local motion cues (p = .048), structure-from-motion detection (p = .064), and action recognition (p = .036). In addition, unlike controls, they show no age-related improvement in style recognition (p = .013). These results could inform the development of screening, diagnostic, and intervention tools.
Weech, S., Troje, N. F.
The inverted-U function between complexity and attractiveness has proven perhaps the most appealing theory of aesthetics through history, yet many questions surround the idea. Why has research in aesthetics equivocally supported this function? What are the true dimensions that interact to affect the two slopes of the function? Here, we outline a combination of issues producing conflicting results, namely the use of limited stimulus subsets; poor quantification of 'complexity'; and an under-emphasis on entropy, which represents the downward slope of the function. We present an overview and relate the history of the function to recent saliency and stochastic modeling approaches.
Troje, N. F., Lau, S.
Human motion is constrained by inertial and gravitational forces. In order to move in an energetically efficient manner, our motor control systems must take these constraints into account when producing body movements. For instance, when changing walking speed, a whole array of associated kinematic parameters also change their values. These changes are subtle but systematic, and we hypothesize that the visual system knows about them and evaluates them when visually assessing another person's movements. The motion of 50 male and 50 female participants was captured while they were walking on a treadmill at three different speeds (veridical speeds). Each sample was then presented as a point-light display and played back at the same three speeds (playback speeds). In that way, we created for each of the 100 walkers a set of 9 point-light displays (3 veridical x 3 playback speeds), only three of which displayed the person at the same speed at which he/she was recorded. Observers were then asked to use a Likert scale to rate how natural the displays appeared. Significant main effects of veridical speed and playback speed were found. The highest veridical speed and the medium playback speed were perceived to be most natural. Most importantly, we found a highly significant interaction between these two factors, indicating that observers very sensitively detected the inconsistencies between veridical speed and playback speed. Displays in which veridical and playback speed matched were always rated to be the most natural. This impressive sensitivity to the very subtle dependency of the kinematics of walkers on their speed demonstrates that the visual system employs implicit knowledge about the biomechanical relations between different kinematic parameters. It also exposes the level of sophistication required for biomechanical models that can generate convincingly realistic character animation in computer graphics.
Troje, N. F., Kroker, A. M., Bobyn, K., Li, Q.
Head-bobbing in pigeons and other birds is widely considered an optokinetic response which ensures retinal image stabilization during the hold phase and rapid transition into a new position during the thrust phase. However, similar retinal image stabilization is achieved by many other birds and most other vertebrates by means of saccadic eye movements. It is not clear why some bird species developed the potentially more expensive strategy of head-bobbing. In our experiments, we conducted a detailed analysis of both the kinematics of body and head and the kinetics of the reaction forces exerted on the ground to explore the effects of head-bobbing on the biomechanics of locomotion. Pigeons were trained to walk back and forth between two feeders. In doing so, they traverse two force plates. Equipped with passively reflecting markers, we also register the kinematics of head and body by means of high-speed optical motion capture technology. We estimate energy consumption by modelling pigeon gait in terms of the dynamics of an inverse pendulum, a model that has been applied successfully to the bipedal walking of humans. According to this approach, negative work is performed when a foot first strikes the ground, which then has to be compensated with positive work in order to lift the foot off the ground half a gait cycle later. Without any means to transfer energy between these two phases, the walking animal has to come up with the metabolic energy to cover both types of work. Any means to store the negative work exerted at heel strike and exploit it as a source for the positive work required during push-off would make locomotion energetically more efficient. We present a model which indicates that head-bobbing can provide this function. It assumes that the neck functions as a spring that connects the masses of body and head. Thus, it periodically transforms the kinetic energy of the head, as it moves relative to the body, into elastic energy that it stores in its extreme positions. We show that the phase of the head-bobbing behaviour relative to the leg movements and both the amounts and the timing of the horizontal ground reaction forces measured over the gait cycle support this model. In conclusion, we show that rather than being energetically costly, as it may first seem, head-bobbing can help to reduce overall energy consumption during locomotion by transferring work between different phases of the gait cycle. Independently of improving energetic efficiency, it can be used to derive the required push-off force by means of muscles in the neck rather than in the feet. The strategy may thus be used by birds with legs that are thin and poorly equipped for forceful push-off.
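The proposed energy exchange can be illustrated with a toy simulation (our own sketch with made-up parameters, not the authors' model): a head mass coupled by a linear "neck" spring to a body moving forward at constant speed trades kinetic energy, measured relative to the body, against elastic energy stored in the spring, while the sum of the two stays nearly constant over the bobbing cycle.

    import numpy as np

    # Toy head-on-a-spring model of head-bobbing; all parameters are illustrative.
    m_head = 0.02          # kg, head mass
    k_neck = 8.0           # N/m, stiffness of the 'neck' spring
    v_body = 0.5           # m/s, roughly constant forward speed of the body
    dt, t_end = 0.001, 2.0

    x_body, x_head, v_head = 0.0, 0.0, 0.0
    kinetic, elastic = [], []
    for _ in np.arange(0.0, t_end, dt):
        x_body += v_body * dt
        stretch = x_body - x_head                   # extension of the neck spring
        v_head += (k_neck * stretch / m_head) * dt  # spring accelerates the head forward
        x_head += v_head * dt
        kinetic.append(0.5 * m_head * (v_head - v_body) ** 2)  # head KE relative to the body
        elastic.append(0.5 * k_neck * stretch ** 2)             # energy stored in the spring

    total = np.array(kinetic) + np.array(elastic)
    # Kinetic and elastic energy oscillate in antiphase while their sum stays nearly
    # constant: the spring loads while the head lags the body and unloads when the
    # head is thrust forward into its new position.
    print("peak energy:", total.max(), "J; variation over the cycle:", np.ptp(total), "J")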
Troje, N. F., Bobyn, K., Kroker, A. M., Li, Q.
Head-bobbing in pigeons and other birds is considered a strategy to stabilize the retinal image during the short periods when the head remains motionless. However, it might serve biomechanical functions, too. Specifically, it may allow storing negative work exerted when the leading foot first contacts the ground and transferring it into positive work required during push-off of the trailing foot. Based on the measurement of ground reaction forces and body kinematics in walking pigeons, we present a model that predicts the phase relations between head-bobbing and foot placement and explains head-bobbing as a means of energetically efficient walking.
Rohde, K., Troje, N. F., Michalak, J.
Embodiment theories suggest a reciprocal relationship between bodily expression and the way in which emotions are processed (Niedenthal, 2007). Investigations focusing on the role of the body in the area of clinical psychology are rare. A central assumption of the embodiment framework is that changes in the motoric system affect emotional (e.g., depressive) processing. In the present research, we employed online feedback on the participants' gait characteristics in a sample of 39 students by using motion capture technology (Troje, 2008, 2002). Participants received gait feedback based on a discriminant function (Michalak et al., 2009) which changed their gait in either a healthier or a more depressive manner. The Self-Referent Encoding Task (SRET; Ramel, Golding, Eyler, Brown, Gotlib, & McQuaid, 2007) was implemented in order to measure the effects of gait feedback on the memory of emotional material. While participants received online gait feedback, a list with 40 different positive and negative words was presented. After eight minutes of continued gait feedback the participants were asked to recall as many words as possible. Participants who received depressive gait feedback recalled significantly more negative words in comparison to participants who received happy gait feedback. Moreover, we could show that the degree of the changes in gait correlated highly (r = .48) with the memory bias: The more participants changed their gait towards a depressive gait, the more negative words were recalled. The results of this study suggest that memory bias, an important maintaining factor in depressive disorders, can be changed by alterations in motoric patterns.
Kroker, A. M., Li, Q., Troje, N. F.
Head-bobbing performed by walking pigeons is characterized by two alternating phases, the hold phase and the thrust phase. While the body moves at a more or less constant rate, the head remains almost motionless in space. When this is no longer possible, the hold phase is replaced with a sudden forward thrust during which the neck is extended and the head rapidly moves into a new position. One result of this behaviour is the stabilization of the retinal image during the hold phase. However, achieving this requires the solution to a sophisticated control problem. The motor signal required to compensate for the forward motion of the body could in principle be controlled by visual, vestibular or somatosensory signals in any combination. Any one of these sensory sources provides its own specific advantages and disadvantages. Vision is slow yet determines an absolute location. The vestibular system, on the other hand, might be much faster, but its signals need to be integrated twice to provide location. Both sensory systems are located in the head and therefore cannot measure the perturbation of the body directly. Rather, they need to be used within a feedback-based control system. Any sensor in the body, in contrast, could be used as input for a prediction-based feed-forward system. If calibrated well, a feed-forward system is potentially much faster than a feedback system. In our experiments, we used high-speed optical motion capture technology to characterize the kinematics of head and body of a pigeon walking back and forth between two feeders. Specifically, we investigated the control errors that occur at the transition from the thrust into the hold phase and then during the hold phase itself. We found two types of control errors, each operating on a different time scale. The first consists of a very small yet systematic drift in head position. The amount of the drift varies between birds (from about 0.0162 to -0.0093 m/s) but remains relatively constant for repeated measures of the same individual. The second consists of a small overshoot at the onset of the hold phase which is then followed by a series of corrective oscillatory movements of the head (mean: 15.82 Hz). We interpret this latter error to represent a close to real-time feedback control system based on vestibular input, and the former to reflect the calibration of this system by means of visual information.
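The two control errors described above (a slow drift during the hold phase and fast corrective oscillations after the thrust) suggest a simple analysis pipeline: fit a line to head position within a hold phase to estimate drift velocity, and take the dominant frequency of the residuals as the oscillation rate. The following sketch only illustrates that idea on synthetic data; the study's actual processing steps are not described in the abstract, so the function, sampling choices, and numbers here are assumptions.

```python
import numpy as np

def analyze_hold_phase(t, head_x):
    """Estimate drift velocity (slope of a linear fit, m/s) and the dominant
    frequency of the residual oscillation (Hz) for one hold phase."""
    slope, intercept = np.polyfit(t, head_x, 1)
    residual = head_x - (slope * t + intercept)
    dt = np.mean(np.diff(t))
    spectrum = np.abs(np.fft.rfft(residual - residual.mean()))
    freqs = np.fft.rfftfreq(len(residual), dt)
    dominant = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
    return slope, dominant

# Synthetic example: 5 mm/s drift plus a 16 Hz, 1 mm corrective oscillation
t = np.linspace(0.0, 0.25, 500)                     # 250 ms hold phase
head_x = 0.005 * t + 0.001 * np.sin(2 * np.pi * 16 * t)
drift, freq = analyze_hold_phase(t, head_x)
print(f"drift = {drift:.4f} m/s, oscillation = {freq:.1f} Hz")
```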
Kroker, A. M., Bobyn, K., Li, Q., Troje, N. F.
Head-bobbing in pigeons is characterized by alternating between a hold phase, during which the head remains almost motionless, and a thrust phase, during which it is moved into the next position. What sensory information is used to control head position during the hold phase? Here, we use high-speed optical motion capture to characterize the control errors. Results indicate that control occurs on two entirely different time scales. A very small constant velocity of the head is indicative of accurate feed-forward control, while superimposed oscillatory movements of the head are interpreted to represent close to real-time feedback control.
Konar, Y., Troje, N. F.
Inversion effects have been described for biological motion point-light displays but also for static depictions of a person's body. What role does the movement in the biological motion display really play? If it only provides the articulation of the body in the absence of explicitly drawn connections, then inversion effects for point-light displays should be about as strong as for static stick figures. We measured perceptual inversion effects of dynamic and static stick-figures and point-light displays and found that they are most pronounced with dynamic point-light stimuli. Results are critically discussed with respect to theories about configural processing.
Heenan, A., Troje, N. F.
Biological motion stimuli, depicted as orthographically projected point-light displays, do not contain any information about their orientation in depth. For instance, the fronto-parallel projection of a point-light walker facing the viewer is the same as the projection of a receding walker. Even though inherently ambiguous, observers tend to interpret such walkers as facing the viewer. While some have suggested that this facing-the-viewer (FTV) bias exists for sociobiological reasons, there is currently a lack of evidence to support this claim. The goal of this study was to correlate individual differences in psychological characteristics (i.e., anxiety, depression, and personality traits) with the FTV bias. We hypothesized that the FTV bias would be positively correlated with measures of anxiety, as we rationalized that more anxious individuals would be more worried about misinterpreting an approaching person as a receding one. In addition to measuring the socially loaded FTV bias, we also assessed the degree of a socially neutral bias: the tendency to perceive the walker from above rather than from below (i.e., the viewing-from-above, or VFA, bias). None of the characteristics correlated with the FTV bias, but we found that anxiety (both as a current mood state and as a personality trait) was negatively correlated with the VFA bias. More anxious individuals were less likely to perceive walker stimuli as if viewing them from above. This result is discussed in the context of other studies which seem to indicate that anxiety impacts the use of statistical priors to disambiguate visual stimuli.
Heenan, A., Refling, E. J., MacDonald, T. K., Troje, N. F.
We examined the viewing-from-above (VFA) bias when viewing stick-figure walkers in a sample of undergraduate students. Stimuli were orthographically projected and contained no information about their orientation in depth, thus making them perceptually ambiguous. Previously in our lab, we found that greater anxiety correlates with greater VFA biases. Here, we measured attachment anxiety, and we induced loneliness in half the participants. Greater anxiety was correlated with greater VFA biases for both men in the control condition and women in the induced loneliness condition. We found the inverse relationship for men in the loneliness condition and women in the control condition.
2011
Papers
Schouten, B., Troje, N. F., Verfaille, K.
Depth-ambiguous point-light walkers (PLWs) elicit a facing bias: Observers perceive a PLW as facing toward them more often than as facing away (Vanrie, Dekeyser, & Verfaillie, Perception, 33, 547–560, 2004). While the facing bias correlates with the PLW's perceived gender (Brooks et al., Current Biology, 18, R728–R729, 2008; Schouten, Troje, Brooks, van der Zwan, & Verfaillie, Attention, Perception, & Psychophysics, 72, 1256–1260, 2010), it remains unclear whether the change in perceived in-depth orientation is caused by a change in perceived gender. In Experiment 1, we show that structural and kinematic stimulus properties that lead to the same changes in perceived gender elicit opposite changes in perceived in-depth orientation, indicating that the relation between perceived gender and in-depth orientation is not causal. The results of Experiments 2 and 3 further suggest that the perceived in-depth orientation of PLWs is strongly affected by locally acting stimulus properties. The facing bias seems to be induced by stimulus properties in the lower part of the PLW.
Pica, P., Jackson, S., Blake, R., Troje, N. F.
Cross cultural studies have played a pivotal role in elucidating the extent to which behavioral and mental characteristics depend on specific environmental influences. Surprisingly, little field research has been carried out on a fundamentally important perceptual ability, namely the perception of biological motion. In this report, we present details of studies carried out with the help of volunteers from the Mundurucu indigene, a group of people native to Amazonian territories in Brazil. We employed standard biological motion perception tasks inspired by over 30 years of laboratory research, in which observers attempt to decipher the walking direction of point-light (PL) humans and animals. Do our effortless skills at perceiving biological activity from PL animations, as revealed in laboratory settings, generalize to people who have never before seen representational depictions of human and animal activity? The results of our studies provide a clear answer to this important, previously unanswered question. Mundurucu observers readily perceived the coherent, global shape depicted in PL walkers, and experienced the classic inversion effects that are typically found when such stimuli are turned upside down. In addition, their performance was in accord with important recent findings in the literature, in the abundant ease with which they extracted direction information from local motion invariants alone. We conclude that the effortless, veridical perception of PL biological motion is a spontaneous and universal perceptual ability, occurring both inside and outside traditional laboratory environments.
Michalak, J., Troje, N. F., Heidenreich, T.
According to embodiment theories, the experience of emotional states affects somatovisceral and motoric systems, whereas the experience of bodily states affects methods by which emotional information is processed. In the light of the embodiment framework, we proposed that formerly depressed individuals with a high risk of depressive relapse would display deviations in the way they walk, which might then play a role in the escalating process of depressive relapse. Moreover, we proposed that training in mindful body awareness during mindfulness-based cognitive therapy (MBCT) might have a normalizing effect on gait patterns. Gait patterns of 23 formerly depressed outpatients were compared to those of 29 never-depressed control participants. Also, gait patterns of formerly depressed patients were measured before and after MBCT to assess changes in patterns. A Fourier-based description of walking data served as the basis for the analysis of gait parameters. Before MBCT, gaits of formerly depressed patients were characterized by reduced walking speed and reduced vertical movements of the upper body. After MBCT, walking speed and lateral swaying movements of the upper body were normalized, and a trend towards normalization of vertical head movements was observed. It was concluded that MBCT has a normalizing effect on gait patterns, thus displaying not only cognitive, but also "embodied" effects.
Legenbauer, T., Vocks, V., Betz, S., Báguena Puigcerver, M. J., Benecke, A., Troje, N. F., Rüddel, H.
Various components of body image were measured to assess body image disturbances in patients with obesity. To overcome limitations of previous studies, a photo distortion technique and a biological motion distortion device were included to assess static and dynamic aspects of body image. Questionnaires assessed cognitive-affective aspects, bodily attitudes, and eating behavior. Patients with obesity and a binge eating disorder (OBE, n = 15) were compared with patients with obesity only (ONB; n = 15), to determine the nature of any differences in body image disturbances. Both groups had high levels of body image disturbances with cognitive-affective deficits. Binge eating disorder (BED) participants also had perceptual difficulties (static only). Both groups reported high importance of weight and shape for self-esteem. There were some significant differences between the groups suggesting that a comorbid BED causes further aggravation. Body image interventions in obesity treatment may be warranted.
Hohmann, T., Troje, N. F., Olmos, A., Munzert, J.
Two experiments examined whether different levels of motor and visual experience influence action perception and whether this effect depends on the type of perceptual task. Within an action recognition task (Experiment 1), professional basketball players and novice college students were asked to identify basketball dribbles from point-light displays. Results showed faster reaction times and greater accuracy in experts, but no advantage when observing either own or teammates' actions compared with unknown expert players. Within an actor recognition task (Experiment 2), the same expert players were asked to identify the model actors. Results showed poor discrimination between teammates and players from another team, but a more accurate assignment of own actions to the own team. When asked to name the actor, experts recognised themselves slightly better than teammates. Results support the hypothesis that motor experience influences action recognition. They also show that the influence of motor experience on the perception of own actions depends on the type of perceptual task.
Hirai, M., Saunders, D. R., Troje, N. F.
Directional information can be retrieved from a point-light walker (PLW) in two different ways: either from recovering the global shape of the articulated body or from signals in the local motion of individual dots. Here, we introduce a voluntary eye movement task to assess how the direction of a centrally presented, task-irrelevant PLW affects the onset latency and accuracy of saccades to peripheral targets. We then use this paradigm to design experiments to study which aspects of biological motion (the global form mediated by the motion of the walker or the local movements of critical features) drive the observed attentional effects. Putting the two cues into conflict, we show that saccade latency and accuracy were affected by the local motion of the dots representing the walker's feet, but only if they retain their familiar, predictable location within the display.
Hirai, M., Chang, D. H. F., Saunders, D. R., Troje, N. F.
The presence of information in a visual display does not guarantee its use by the visual system. Studies of inversion effects in both face recognition and biological-motion perception have shown that the same information may be used by observers when it is presented in an upright display but not used when the display is inverted. In our study, we tested the inversion effect in scrambled biological-motion displays to investigate mechanisms that validate information contained in the local motion of a point-light walker. Using novel biological-motion stimuli that contained no configural cues to the direction in which a walker was facing, we found that manipulating the relative vertical location of the walker's feet significantly affected observers' performance on a direction-discrimination task. Our data demonstrate that, by themselves, local cues can almost unambiguously indicate the facing direction of the agent in biological-motion stimuli. Additionally, we document a noteworthy interaction between local and global information and offer a new explanation for the effect of local inversion in biological-motion perception.
Symposia and Published Abstracts
Troje, N. F., Saunders, D. R.
Since Gunnar Johansson coined the term almost 40 years ago, "biological motion" has been used for a variety of different phenomena. Some of them are complementary to each other and probably constitute entirely different processing mechanisms. A clear distinction between them is crucial to design both behavioural and neuroimaging studies, to assess their results, and to compare them among different studies. I will provide a careful analysis of the multiple facets of biological motion perception and I will suggest a framework that helps to safely navigate through concepts and experimental paradigms employed in biological motion research. In particular, I will show experimental data that demonstrate the dissociation between a processing level that uses local motion invariants to detect biological motion and label it as being animate, as compared to mechanisms that use motion to derive the articulated structure of a moving body, and derive information about actor and action. I will then introduce a standardized battery of tests which is able to independently probe performance on these and a number of additional key aspects of biological motion perception, along with normative data and a test-retest reliability analysis. Applications of this test in neuropsychology and cognitive neuroscience are discussed.
Troje, N. F., Davis, M.
Perceptually bistable visual stimuli provide an interesting means to study how the visual system turns the generally ambiguous flow of sensory information into a reasonably stable model of the world. Biological motion point-light displays provide a particularly interesting class of stimuli in this respect. Even though the stimulus itself does not contain any information about its orientation in depth, fronto-parallel projections of a point-light walker are preferentially seen as if the walker is facing the viewer rather than facing away. In two different experiments, we show that the degree of this "facing-the-viewer bias" strongly depends on the amount of exposure an observer previously had with point-light displays. We measure the degree of the facing bias by asking observers to indicate the apparent spin (clockwise or counter-clockwise) of a point-light walker, a method insensitive to a potentially confounding response bias. In the first experiment, we compared the degree of the facing bias between naïve observers and graduate students who work with point-light displays on a daily basis. In the second experiment, we exposed initially naïve observers systematically to point-light displays over the course of several weeks and measured the degree of the facing bias before and after this treatment. In both cases, we observe a substantial increase in facing bias with the amount of expertise the observers had with point-light displays. We discuss these results in the context of a process which sharpens prior expectations by means of self-reinforcement in the absence of information that contradicts the developing prior.
Troje, N. F.
A few dots moving as if attached to the major joints of a human body elicit a vivid percept of a person in action. Psychologists studying visual processing and perceptual organization have been fascinated by this phenomenon since its first introduction to the community almost 40 years ago by Swedish psychologist Gunnar Johansson. The appeal of what has since been called "biological motion perception" stems from the discrepancy between the sparseness of the visual stimulus itself and the richness of the information that it appears to provide to our visual system. The enduring interest in this phenomenon is due to the fact that it allows us to study one of the cardinal questions of perception: How does our brain turn generally noisy, incomplete, ambiguous sensory data into a consistent, stable, and predictable model of the world? The solution to this question lies in the fact that perception is not just based on current sensory information, but at least to the same degree on expectations based on previously acquired knowledge about the statistics and regularities of the world. I will present my view on biological motion perception as a hierarchical, knowledge-based, hypothesis-driven process. Specifically, I will talk about the nature of the visual representations underlying this process, the role of redundant information in these representations, and the significance of internal consistency for the creation of perceptually convincing animations.
Saunders, D. R., Troje, N. F.
Tests designed to measure biological motion perception have often confounded two or more distinct perceptual abilities. These abilities include structure-from-nonrigid-motion, figure-ground segregation, and processing of local motion invariants. We have developed a battery of tests that measure these abilities independently, in addition to higher level biological motion abilities including action recognition, movement style perception, and person recognition. Seventy-five participants completed the battery, allowing for an individual-differences analysis. The lack of correlation between scores on the tests provides support for the independence of the underlying processes. In order to assess robustness of the tests to differences in the experimental environment, and to measure test-retest reliability, we had 30 additional participants complete the battery both in the lab and on their home computers. There was no effect of environment for the majority of the tests. Together, the results suggest that the test battery efficiently measures the components of biological motion perception, and performs nearly as well under uncontrolled viewing conditions. One future use of the battery is to fully characterize the perceptual deficits of special populations with respect to biological motion.
Michalak, J., Burg, J., Heidenreich, T., Troje, N. F.
In this paper we will present results of a series of studies investigating the embodiment of depression. The first series of studies analyzed gait patterns in depression. Using a motion capture system we investigated (1) whether dynamic gait patterns of currently and formerly depressed patients differ from never-depressed people and (2) whether mindfulness-based cognitive therapy (MBCT) normalizes gait patterns of formerly depressed patients. Motion data of 23 formerly depressed patients participating in MBCT, 14 currently depressed inpatients and 29 never-depressed participants were collected. The data were analyzed by means of Fourier-based descriptions and the computation of linear classifiers. Gait patterns of currently depressed patients as well as formerly depressed patients differed from those of never-depressed people. Moreover, MBCT had some normalizing effect on the way patients walk. We conclude that training in mindfulness might change proprioceptive-bodily feedback that is important in the generation of depressive states. In a second series of studies we investigated the associations between the ability to stay in contact with one's body during mindful breathing and depression-related variables. We utilized a new experimental paradigm, which is strongly oriented to breathing meditation, to assess mindfulness. Participants were required to observe their breath during predetermined time periods and to indicate each time they lose their sense of it, e.g. because of mind wandering. Results of a study with 42 undergraduates showed that the ability to stay mindfully in contact with one's body during breathing was associated with lower levels of rumination and depression. We are currently exploring the mindful breathing exercise in samples of currently and formerly depressed participants. Results of these studies will also be presented.
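Where the abstract mentions Fourier-based descriptions and linear classifiers, the general recipe (compress each periodic gait trajectory into a few Fourier coefficients and separate the groups with a linear decision rule) can be sketched as follows. Everything here is a placeholder: the feature layout, group sizes, and data are invented for illustration and do not reproduce the study's pipeline.

```python
import numpy as np

def gait_features(joint_angles, n_harmonics=2):
    """Concatenate the low-order Fourier coefficients of each joint-angle
    trajectory (rows = time samples, columns = joints) into one feature vector."""
    coeffs = np.fft.rfft(joint_angles, axis=0)[: n_harmonics + 1]
    return np.concatenate([np.abs(coeffs).ravel(), np.angle(coeffs[1:]).ravel()])

# Placeholder data: one gait cycle (100 samples x 6 joints) per subject
rng = np.random.default_rng(0)
X = np.array([gait_features(rng.standard_normal((100, 6))) for _ in range(40)])
y = np.array([0] * 20 + [1] * 20)      # e.g. 0 = never depressed, 1 = depressed

# Simple linear classifier: least-squares hyperplane on the Fourier features
A = np.c_[X, np.ones(len(X))]
w, *_ = np.linalg.lstsq(A, 2 * y - 1, rcond=None)
accuracy = np.mean((A @ w > 0) == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```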
2010
Papers
Troje, N. F., McAdam, M.
The silhouette illusion published online a number of years ago by the Japanese Flash designer Nobuyuki Kayahara has received substantial attention from the online community. One feature that seems to make it interesting is an apparent rotational bias: Observers see it spinning more often clockwise than counter-clockwise. Here, we show that this rotational bias is in fact due to the visual system's preference for viewpoints from above rather than from below.
Schouten, B., Troje, N. F., Brooks, A., van der Zwan, R., Verfaillie, K.
Under orthographic projection, biological motion point-light walkers offer no cues to the order of the dots in depth: Views from the front and from the back result in the very same stimulus. Yet observers show a bias toward seeing a walker facing the viewer (Vanrie, Dekeyser, & Verfaillie, 2004). Recently, we reported that this facing bias strongly depends on the gender of the walker (Brooks et al., 2008). The goal of the present study was, first, to examine the robustness of the effect by testing a much larger subject sample and, second, to investigate whether the effect depends on observer sex. Despite the fact that we found a significant effect of figure gender, we clearly failed to replicate the strong effect observed in the original study. We did, however, observe a significant interaction between figure gender and observer sex.
Saunders, D. R., Williamson, D., Troje, N. F.
Humans can perceive many properties of a creature in motion from the movement of the major joints alone. However it is likely that some regions of the body are more informative than others, dependent on the task. We recorded eye movements while participants performed two tasks with point-light walkers: determining the direction of walking, or determining the walker's gender. To vary task difficulty, walkers were displayed from different view angles and with different degrees of expressed gender. The effects on eye movement were evaluated by generating fixation maps, and by analyzing the number of fixations in regions of interest representing the shoulders, pelvis, and feet. In both tasks participants frequently fixated the pelvis region, but there were relatively more fixations at the shoulders in the gender task, and more fixations at the feet in the direction task. Increasing direction task difficulty increased the focus on the foot region. An individual's task performance could not be predicted by their distribution of fixations. However by showing where observers seek information, the study supports previous findings that the feet play an important part in the perception of walking direction, and that the shoulders and hips are particularly important for the perception of gender.
Perry, A., Troje, N. F., Bentin, S.
Putative contributions of a human mirror neuron system (hMNS) to the perception of social information have been assessed by measuring the suppression of EEG oscillations in the mu/alpha (8-12 Hz), beta (15-25 Hz) and low-gamma (25-25 Hz) ranges while participants processed social information revealed by point-light displays of human motion. Identical dynamic displays were presented and participants were instructed to distinguish the intention, the emotion, or the gender of a moving image of a person, while they performed an adapted odd-ball task. Relative to a baseline presenting a nonbiological but meaningful motion display, all three biological motion conditions reduced the EEG amplitude in the mu/alpha and beta ranges, but not in the low-gamma range. Suppression was larger in the intention than in the emotion and gender conditions, with no difference between the latter two. Moreover, the suppression in the intention condition was negatively correlated with an accepted measure of empathy (EQ), revealing that participants high in empathy scores manifested less suppression. For intention and emotion the suppression was larger at occipital than at central sites, suggesting that factors other than motor system were in play while processing social information embedded in the motion of point-light displays.
MacKinnon, L. M., Troje, N. F., Dringenberg, H. C.
It is unknown whether the rodent visual system can perceive biological motion, an ability present in primates, cats, and several bird species. Using a water-maze visual discrimination task, we find that rats can be trained to distinguish between left- and rightward motion of abstract point-light displays of walking humans. However, rats were unable to generalize to a novel point-light display (a walking cat), or to a display of a backward walking human, where overall body configuration and local, ballistic foot motion provide directly opposing cues regarding movement direction. Together, these experiments provide the first demonstration of the ability of rodents to extract motion direction cues from abstract, point-light displays. However, when isolated, neither the overall body configuration nor the local motion of the feet appears to provide sufficient information for rats to reliably extract movement direction in biological motion displays.
Kuhlmeier, V. A., Troje, N. F., Lee, V.
In the present study, we examined if young infants can extract information regarding the directionality of biological motion. We report that 6-month-old infants can differentiate leftward and rightward motions from a movie depicting the sagittal view of an upright human point-light walker, walking as if on a treadmill. Inversion of the stimuli resulted in no detection of directionality. These findings suggest that biological motion displays convey information for young infants beyond that which distinguishes them from nonbiological motion; aspects of the action itself are also detected. The potential visual mechanisms underlying biological motion detection, as well as the behavioral interpretations of point-light figures, are discussed.
Gurnsey, R., Troje, N. F.
There is evidence that human observers are more sensitive to the direction-of-heading of point-light walkers defined by first-order than by second-order motions. We addressed this issue by measuring the minimum direction difference (azimuth) that observers could discriminate when the dots composing the walkers were conveyed by first- or second-order motions. Sensitivity to azimuth differences for four stimulus types (two first-order and two second-order) was tested at a range of stimulus sizes and at eccentricities of 0–16° in the right visual field. We find that for most stimulus types and eccentricities any azimuth threshold can be obtained by an appropriate adjustment of stimulus size. To achieve a given azimuth threshold, second-order stimuli must be larger than the corresponding first-order stimuli. Therefore, stimulus magnification equates sensitivity to walker direction and we may say that sensitivity to walker direction is generally cue-independent. Similarly, in most cases stimulus magnification is sufficient to eliminate eccentricity-dependent variability from the azimuth thresholds. Interestingly, the magnification required to match peripheral to foveal thresholds increases faster with eccentricity for first-order stimuli than for second-order stimuli, while at the same time thresholds for first-order stimuli are lower than those for second-order stimuli at corresponding sizes and eccentricities.
Gurnsey, R., Roddy, G., Troje, N. F.
Many previous studies have used noise tolerance to quantify sensitivity to point-light walkers heading ±90° from straight-ahead. Here we measured the smallest deviations from straight-ahead that observers could detect (azimuth thresholds) in the absence of noise. Thresholds were measured at a range of stimulus sizes and eccentricities for (1) upright and (2) inverted walkers, (3) intact walkers, those without feet and those with only feet, and (4) in the presence and absence of a second, attention-absorbing task. At large stimulus sizes azimuth thresholds were very small (between 1 and 2°) except in the case of inverted walkers. Size scaling generally compensated for eccentricity dependent sensitivity loss, however in the case of inverted walkers the data were quite noisy. At large sizes walkers without feet elicited higher thresholds than those with only feet, suggesting a special role for the feet even when walkers are not viewed side-on. Unlike others, we found no evidence that competing tasks affected performance. We argue that the value of our modified direction-discrimination task lies in its focus on the limits of discrimination within the domain of interest, rather than the amount of noise needed to impair discrimination of widely separated stimulus values.
Chang, D. H. F., Harris, L. R., Troje, N. F.
We investigated the roles of egocentric, gravitational, and visual environmental reference frames for face and biological motion perception. We tested observers on face and biological motion tasks while orienting the visual environment and the observer independently with respect to gravity using the York Tumbling Room. The relative contribution of each reference frame was assessed by arranging pairs of frames to be either aligned or opposed to each other while rendering the third uninformative by orienting it sideways relative to the stimulus. The perception of both biological motion and faces was optimal when the stimulus was aligned with egocentric coordinates. However, when the egocentric reference frame was rendered uninformative, the perception of biological motion, but not faces, relied more on stimulus alignment with gravity rather than the visual environment.
Bockemühl, T., Troje, N. F., Dürr, V.
A central question in motor control is how the central nervous system (CNS) deals with redundant degrees of freedom (DoFs) inherent in the musculoskeletal system. One way to simplify control of a redundant system is to combine several DoFs into synergies. In reaching movements of the human arm, redundancy occurs at the kinematic level because there is an unlimited number of arm postures for each position of the hand. Redundancy also occurs at the level of muscle forces because each arm posture can be maintained by a set of muscle activation patterns. Both postural and force-related motor synergies may contribute to simplify the control problem. The present study analyzes the kinematic complexity of natural, unrestrained human arm movements, and detects the amount of kinematic synergy in a vast variety of arm postures. We have measured inter-joint coupling of the human arm and shoulder girdle during fast, unrestrained, and untrained catching movements. Participants were asked to catch a ball launched towards them on 16 different trajectories. These had to be reached from two different initial positions. Movement of the right arm was recorded using optical motion capture and was transformed into 10 joint angle time courses, corresponding to 3 DoFs of the shoulder girdle and 7 of the arm. The resulting time series of the arm postures were analyzed by principal components analysis (PCA). We found that the first three principal components (PCs) always captured more than 97% of the variance. Furthermore, subspaces spanned by PC sets associated with different catching positions varied smoothly across the arm's workspace. When we pooled complete sets of movements, three PCs, the theoretical minimum for reaching in 3D space, were sufficient to explain 80% of the data's variance. We assumed that the linearly correlated DoFs of each significant PC represent cardinal joint angle synergies, and showed that catching movements towards a multitude of targets in the arm's workspace can be generated efficiently by linear combinations of three of such synergies. The contribution of each synergy changed during a single catching movement and often varied systematically with target location. We conclude that unrestrained, one-handed catching movements are dominated by strong kinematic couplings between the joints that reduce the kinematic complexity of the human arm and shoulder girdle to three non-redundant DoFs.
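The PCA step described above reduces to a standard computation: center the joint-angle time courses, take the singular value decomposition, and read off how much variance the leading components capture. The sketch below runs that computation on synthetic data with three latent drivers; the numbers and data layout are assumptions for illustration, not the study's recordings.

```python
import numpy as np

def kinematic_synergies(joint_angles, n_components=3):
    """PCA of joint-angle time courses (rows = time samples, columns = DoFs):
    returns the leading component directions ('synergies') and the fraction
    of variance they jointly explain."""
    X = joint_angles - joint_angles.mean(axis=0)
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return Vt[:n_components], explained[:n_components].sum()

# Placeholder: 10 joint angles over 2000 samples, driven by 3 latent signals
rng = np.random.default_rng(1)
latent = rng.standard_normal((2000, 3))
mixing = rng.standard_normal((3, 10))
angles = latent @ mixing + 0.05 * rng.standard_normal((2000, 10))

synergies, var_explained = kinematic_synergies(angles)
print(f"first 3 PCs explain {var_explained:.1%} of the variance")
```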
Proceedings
Sigal, L., Troje, N. F., Fleet, D. J., Livne, M.
We show that, from the output of a simple 3D human pose tracker one can infer physical attributes (e.g., gender and weight) and aspects of mental state (e.g., happiness or sadness). This task is useful for man-machine communication, and it provides a natural benchmark for evaluating the performance of 3D pose tracking methods (vs. conventional Euclidean joint error metrics). Based on an extensive corpus of motion capture data, with physical and perceptual ground truth, we analyze the inference of subtle biologically-inspired attributes from cyclic gait data; we show that inference is possible even with partial observations of the body, and with motions as short as a single gait cycle. Learning models from small amounts of noisy video pose data is, however, prone to over-fitting. To mitigate this we formulate learning in terms of domain adaptation for which the mocap helps to regularize such models. While video-based 3D pose estimates are noisy, they do support the inference of human attributes.
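One simple way to picture how clean mocap can regularize attribute models fit to noisy video pose estimates is a ridge regression whose weights are shrunk toward a solution learned from the mocap corpus rather than toward zero. This is only a schematic stand-in for the domain-adaptation formulation in the paper; the data, the attribute being predicted, and the specific regularizer below are illustrative assumptions.

```python
import numpy as np

def ridge_toward(X, y, w_prior, lam=1.0):
    """Least squares with shrinkage toward a prior weight vector:
    argmin_w ||X w - y||^2 + lam * ||w - w_prior||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w_prior)

rng = np.random.default_rng(4)
w_true = rng.standard_normal(10)

# "Mocap" domain: many clean gait-feature observations -> reliable prior weights
X_mocap = rng.standard_normal((500, 10))
w_prior = np.linalg.lstsq(X_mocap, X_mocap @ w_true, rcond=None)[0]

# "Video" domain: few, noisy pose-based observations of the same attribute
X_video = rng.standard_normal((20, 10))
y_video = X_video @ w_true + 0.5 * rng.standard_normal(20)

w_plain = np.linalg.lstsq(X_video, y_video, rcond=None)[0]
w_adapted = ridge_toward(X_video, y_video, w_prior, lam=5.0)
print("weight error, video only      :", np.linalg.norm(w_plain - w_true))
print("weight error, with mocap prior:", np.linalg.norm(w_adapted - w_true))
```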
Symposia and Published Abstracts
Troje, N. F., McAdam, M.
Point-light walkers and stick-figures rendered orthographically and without self-occlusion do not contain any information as to their depth. For instance, a fronto-parallel projection could depict a walker from the front or from the back. Nevertheless, observers show a strong bias towards seeing the walker as facing the viewer (FTV; Vanrie and Verfaillie, 2006, Perception & Psychophysics, 68, 601-612). A related stimulus, the silhouette of a stationary human figure (Kayahara, 2003, http://www.procreo.jp) does not seem to show a FTV bias. We created stimuli representing gradual transitions from the silhouette figure to a stick-figure. Rotating them slowly about a vertical axis, we asked observers to indicate if they saw clockwise or counterclockwise rotation. Measuring frequency and angle of perceptual reversals, we derived an unbiased measure of the FTV bias. Results reveal that the FTV bias is not due to the presence or absence of walking behaviour and does not depend on the posture of the figure. It is a direct consequence of transitioning from a silhouette to the stick figure. The FTV bias can be explained assuming that the visual system perceives the marks (dots or sticks) to be on the surface of an opaque body and that it assumes a higher probability of them being on the front than on the back.
Troje, N. F., Aust, U.
We investigate the pigeon's ability to discriminate the direction in which a biological motion point-light walker is facing. We do that for two reasons. First, it has been shown that pigeons have great difficulty distinguishing between mirror-symmetric versions of the same static object. Here, we want to demonstrate that this is not the case for biological motion displays, which they can readily distinguish even if they are mirror-symmetric versions of one another. Second, we ask whether pigeons discriminate facing direction based on motion-mediated form, or whether they rather rely on local cues. Using a two-alternative forced choice paradigm in which pigeons had to peck on one of two stimuli presented on a screen, eight pigeons were trained to discriminate between a right- and a left-facing walker depicting either a human or a pigeon. We then tested them with non-reinforced catch trials inserted into the continuing training sessions. In Experiment 1, these catch trials were walkers played backwards. These stimuli provide conflicting cues: the global structure points in one direction while the local motion cues the other direction. Six out of the eight birds clearly chose the direction indicated by the local motion. The other two birds based their decision on the global shape. In a number of additional experiments, we presented upright and inverted spatially scrambled versions of the displays. Only the birds that had indicated using local motion cues in Experiment 1 could handle these tasks; the other two birds responded at random. The results show interesting individual differences between pigeons. While most of the birds rely on local motion to derive facing direction, others can handle global, motion-mediated shape. Each individual bird makes a clear decision for one or the other strategy, though. Results also show that pigeons have no problem distinguishing between mirror-symmetric versions of the same stimulus as long as motion is involved.
Troje, N. F.
Biological motion stick-figures rendered orthographically and without self-occlusions do not contain any information about the order of their elements in depth and are therefore consistent with at least two different in-depth interpretations. Interestingly, however, the visual system often prefers one over the other interpretation. In this study, we are investigating two different sources for such biases: the viewing-from-above bias and the facing-the-viewer bias (Vanrie et al., 2004). We measure perceived depth as a function of the azimuthal orientation of the walker, the camera elevation, and the walker's gender, which have previously been reported to also affect the facing bias (Brooks et al., 2008). We also compare dynamic walkers with static stick-figure displays. Observers are required to determine whether 0.5 s presentations of stick-figures are rotating clockwise or counter-clockwise, thereby telling us which of the two possible in-depth interpretations they are perceiving. In contrast to previous work, this measure is entirely bias-free in itself. Data collected with this method show that the facing-the-viewer bias is even stronger than previously reported and that it entirely dominates the viewing-from-above bias. Effects of walker gender could not be confirmed. Static figures which imply motion result in facing biases that are almost as strong as those obtained for dynamic walkers. The viewing-from-above bias becomes prominent for profile views of walkers, for which the facing-the-viewer bias does not apply, and for other depth-ambiguous stimuli (such as the Necker cube). In all these cases, we find a very strong bias to interpret the 2D image in terms of a 3D scene as seen from above rather than from below. We discuss our results in the context of other work on depth-ambiguous figures and look at differences between the initial percept as measured in our experiments and bistability observed during longer stimulus presentations.
Saunders, D. R., Williamson, D. K., Troje, N. F.
Even when a display of a person walking is presented only as dots following the motion of the major joints, human observers can readily determine both the facing direction of locomotion and higher-level properties, including the gender of the individual. We investigated the spatial concentration of direction and gender cues, by tracking the eye movements of 16 participants while they judged either property of a point-light display. The walkers had different levels of ambiguity of both their direction and their gender, which affected the difficulty of the tasks. Fixation locations were recorded throughout the 2 s presentation times. We analyzed the fixation data in two ways: first by creating fixation maps for the different conditions, and second by finding the average number of fixations that fell into three ROIs, representing the shoulders, pelvis and feet. In accordance with past literature emphasizing the role of lateral shoulder sway in gender identification, participants on average fixated more on the shoulders in the gender task than in the direction task. Analysis of individual differences showed that more fixations in the shoulder region predicted slightly better performance in the gender task. On the other hand, the number of fixations on the pelvis, an area also known to contain gender information, was not significantly different between tasks. In accordance with studies showing that the motion of the feet contains cues to direction, participants fixated significantly more often on the feet in the direction task. The feet were rarely fixated in the gender task. In general, task difficulty did not have an effect on fixation patterns, except in the case of walkers viewed from the side, which produced on average slightly fewer feet fixations in the direction task.
Roddy, G., Saunders, D., Troje, N. F., Gurnsey, R.
Thornton et al. (2002, Perception) showed that detection of point-light walkers (biological motion) embedded in noise is impaired in the presence of a competing change detection task (dual-task). Gurnsey et al. (2010, Journal of Vision) argued that the presence of masking noise dots was critical to this effect. Therefore, we measured walker direction discrimination thresholds with variable noise levels under single- and dual-task conditions (change detection). For most subjects, as predicted, direction thresholds were identical under single- and dual-task conditions in the absence of noise, but thresholds increased with noise in the dual-task condition.

Hebb Award Abstract: Thornton et al. (2002) asked observers to judge whether point-light walkers were heading ±90° from the line of sight while simultaneously performing a change detection task. They found walker direction discrimination was impaired in the presence of the competing task, suggesting that attention is critical to the detection of biological motion. However, a subsequent study by Gurnsey et al. (2010) showed that neither the addition of a colour discrimination task nor a radial frequency discrimination task impaired walker azimuth thresholds (the absolute angular difference from straight ahead). Although it is arguable that colour is separable from the form or motion information needed to discriminate walker direction, the same cannot be said about the radial frequency task. It is possible that the discrepancy between these two results is related to Thornton et al.'s (2002) use of masking dots. Because the standard direction discrimination task is trivially easy in the absence of noise, Thornton et al. embedded their walkers in noise to limit performance. (This is common practice in the biological motion literature.) In the study by Gurnsey et al. (2010), azimuth thresholds at large sizes were ±1.5° on average. If biological motion is susceptible to resource competition, then ±90° walkers should make fewer demands on attention than ±1.5° walkers. It may be that in the Thornton et al. study the observers' difficulty was with segregating the walker from noise rather than with encoding the properties of the walker itself. In the present task we measured walker azimuth thresholds with and without noise in single- and dual-task conditions (as per Thornton et al., 2002). For four of our six observers we found that the competing task had no effect on thresholds in the absence of noise, but thresholds increased linearly with noise in the dual-task condition. This result is consistent with the idea that noise impairs segmentation rather than sensitivity to properties of the walker. Three of these four subjects had extensive experience judging walkers in noise, and all four were avid video gamers. Two of our observers were impaired in the dual task even when no noise was present and exhibited much higher thresholds overall in the dual-task condition. One of these subjects had extensive experience judging point-light walkers without noise. Although we have evidence that noise alters the ability to segment walkers from noise, it seems that subject variables have a very large effect on performance in dual-task experiments.
Michalak, J., Troje, N. F., Heidenreich, T.
Theoretical background: Embodiment theories postulate a close reciprocal interaction between the motor system and emotional processes. In a series of studies we investigated (1) whether the gait patterns of currently depressed and never-depressed individuals differ; (2) whether such gait characteristics also appear in non-depressed individuals after induction of sad mood; (3) whether formerly depressed patients also show abnormalities in their gait patterns; and (4) whether mindfulness-based cognitive therapy (MBCT) has a normalizing influence on the gait patterns of formerly depressed patients. Methods: Gait patterns of 14 currently depressed patients, 23 formerly depressed patients who participated in an MBCT course, and 29 never-depressed participants were analyzed. Results: The gait patterns of formerly depressed and currently depressed patients differ from those of never-depressed participants. MBCT has a partially normalizing effect on gait patterns. Discussion: Depressed individuals show abnormalities in the motor system. MBCT changes these abnormalities and may thus have beneficial effects on proprioceptive feedback, which could play an important role in the relapse process.
Michalak, J., Troje, N. F., Heidenreich, T.
According to embodiment theories, emotional states affect somatovisceral and motoric systems, whereas bodily states affect methods by which emotional information is processed. In the present research we investigated (1) whether dynamic gait patterns of currently and formerly depressed patients differ from never-depressed people and (2) whether mindfulness-based cognitive therapy (MBCT) normalizes gait patterns of formerly depressed patients. Motion data of 23 formerly depressed patients participating in MBCT, 14 currently depressed inpatients and 29 never-depressed participants were collected with an optical motion capture system. The data were analyzed by means of Fourier-based descriptions and the computation of linear classifiers. Gait patterns of currently depressed patients as well as formerly depressed patients differed from those of never-depressed people. Moreover, MBCT had some normalizing effect on the way patients walk. We conclude that training in mindfulness might change proprioceptive-bodily feedback that is important in the generation of depressive states.
McAdam, M., Troje, N. F.
The silhouette illusion depicts a rotating dancer. Published online (Kayahara, 2003), it has since travelled the internet. Like any silhouette, the display is depth-ambiguous. Consequently, the direction of rotation is ambiguous as well. The online community has noticed that perceived rotation direction is biased toward one direction, and a number of hypotheses have been provided to explain this. Here, we systematically test the hypothesis that our visual system prefers to see the silhouette from above rather than from below. We varied camera elevation and show that the resulting biases do indeed account for the ones observed in the original illusion.
König, A., Schölmerich, A., Troje, N. F.
Meta-analyses and reviews on sexual recidivism consistently report that sexual preference for children is one of the strongest predictors for reoffending in child molesters. The aim of our study is to introduce a new, powerful implicit stimulus for the detection of sexual preference in child molesters with and without the clinical diagnosis of paedophilia. During the early 1970s, Gunnar Johansson (1973) introduced point-light walkers (PLWs) as a new stimulus to perception research. He found that 12 lights fixed at the major joints of a moving person are sufficient to identify socially relevant features and the kind of actions displayed. Looking at the vast amount of literature on biological motion perception, we think that the ecological power combined with the ambiguity of PLWs makes them appropriate as a new stimulus for forensic research dealing with sexual preference. The current study was performed in order to determine differences in attractiveness ratings of PLWs between child molesters with (n = 44) and without (n = 21) the ICD-10 diagnosis of paedophilia and a control group of non-sex offenders (n = 68). We assume that sexual preference for underage girls and boys, respectively, is associated with higher attractiveness ratings for female or male walking patterns of children. We used logistic regression analysis to differentiate between different subgroups of child molesters and non-sex offenders. Attractiveness ratings for female walking patterns of children were the strongest predictor for the diagnosis of paedophilia. Overall correct classification rates varied from 74.6-76.1%, positive predictive values from 65.0-80.0% and negative predictive values from 68.8-77.6%. Our results on predictive power are at least comparable to the findings of phallometric measures (Abel et al., 1998) and implicit association methods (Gray et al., 2005). Nevertheless, no single psychometric measure can ever be sufficient to verify or rule out an individual clinical diagnosis of paedophilia.
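The classification figures quoted above (overall correct classification rate, positive and negative predictive values) follow directly from the confusion matrix of a fitted logistic regression. The sketch below shows that bookkeeping on invented placeholder data (random ratings and labels); it is not the study's data or model specification, and it assumes scikit-learn is available.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: four attractiveness ratings per participant (e.g. child
# female, child male, adult female, adult male walkers) and a binary diagnosis.
rng = np.random.default_rng(2)
ratings = rng.uniform(1.0, 7.0, size=(133, 4))
diagnosis = (ratings[:, 0] + 0.5 * rng.standard_normal(133) > 4.5).astype(int)

model = LogisticRegression().fit(ratings, diagnosis)
pred = model.predict(ratings)

tp = np.sum((pred == 1) & (diagnosis == 1))
tn = np.sum((pred == 0) & (diagnosis == 0))
fp = np.sum((pred == 1) & (diagnosis == 0))
fn = np.sum((pred == 0) & (diagnosis == 1))

print("correct classification rate:", (tp + tn) / len(diagnosis))
print("positive predictive value  :", tp / (tp + fp))
print("negative predictive value  :", tn / (tn + fn))
```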
Hirai, M., Saunders, D. R., Troje, N. F.
Our visual system can extract directional information even from spatially scrambled point-light displays (Troje & Westhoff, 2006). In three experiments, we measured saccade latencies to investigate how local features in biological motion affect attentional processes. Participants made voluntary saccades to targets appearing on the left or the right of a central fixation point, which were congruent, neutral or incongruent with respect to the facing direction of a centrally presented point-light display. In Experiment 1, we presented two kinds of human point-light walker stimuli (coherent and spatially scrambled) with three different viewpoints (left-facing, frontal view, right-facing) at two different stimulus durations (200 and 500 ms) to sixteen observers. The saccade latency in the incongruent condition was significantly longer than that in the congruent condition for the 200-ms coherent point-light walker stimuli, but not for the spatially scrambled stimuli. In Experiment 2, a new group of observers (N = 12) were presented with two point-light walker displays. The only difference with respect to Experiment 1 was that, in the scrambled version of the stimulus, the location of the dots representing the feet was kept constant. In contrast to the results of Experiment 1, the saccade latency in the incongruent condition was significantly longer than that in the congruent condition irrespective of the stimulus type. In Experiment 3, we put into conflict the facing direction indicated by the local motion of the feet and the facing direction indicated by the global structure of the walker by presenting newly recruited observers (N = 12) with backward-walking point-light walkers. In agreement with the results of Experiment 2, the modulation of saccade latency was dependent on the direction of feet motion, irrespective of the postural structure of the walker. These results suggest that the local motion of the feet determines reflexive orientation responses.
Chang, D. H. F., Troje, N. F.
The walking direction of a biological entity is conveyed by both global structure-from-motion information and local motion signals. Global and local cues also carry distinct inversion effects. In particular, the local motion-based inversion effect is carried by the feet of the walker. Here, we searched for a "super foot", defined as the motion of a single dot that conveys maximal directional information and carries a large inversion effect, by using a psychophysical procedure driven by a multi-objective evolutionary algorithm (MOEA). We report on two rounds of searches involving the evolution of 25-27 generations each (1000 trials/generation) conducted via a web-based interface. The search involved an eight-dimensional space spanned by amplitudes and phases of a 2nd-order Fourier representation of the dot's motion in the image plane. On each trial, observers were presented with multiple copies of a "foot" chosen from a population of feet stimuli for the current generation and were required to indicate whether the perceived stimulus was right- or left-facing. The stimuli were shown at upright and inverted orientations. Upon completion of a generation, each stimulus was evaluated for its "fitness" based upon its ability to convey direction and carry an inversion effect, derived from observer accuracy rates. The fittest stimuli were then selected to form a subsequent generation for testing via methods of crossover and mutation. We show that the MOEA was effective at driving increases in accuracy rates for the upright stimuli and increases in the inversion effect, quantified as the difference between upright and inverted stimuli, across generations. We show further that the two rounds of searches, beginning at different points in space, converge towards the same region. We characterize the "super foot" in relation to current theories about the importance of gravity-constrained dynamics for biological motion perception.
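The evolutionary search can be pictured as a small genetic algorithm over the eight Fourier parameters, with selection driven by a fitness score. In the study the fitness came from observers' direction judgements for upright and inverted stimuli; the sketch below substitutes an arbitrary synthetic objective and collapses the multi-objective search to a single score, so it only illustrates the selection/crossover/mutation loop, not the actual experiment.

```python
import numpy as np

rng = np.random.default_rng(3)

def dot_trajectory(params, t):
    """2nd-order Fourier motion of one dot in the image plane.
    params = [ax1, px1, ax2, px2, ay1, py1, ay2, py2] (amplitudes and phases)."""
    ax1, px1, ax2, px2, ay1, py1, ay2, py2 = params
    x = ax1 * np.sin(2 * np.pi * t + px1) + ax2 * np.sin(4 * np.pi * t + px2)
    y = ay1 * np.sin(2 * np.pi * t + py1) + ay2 * np.sin(4 * np.pi * t + py2)
    return x, y

def fitness(params):
    """Stand-in for the observer-derived fitness (upright accuracy plus the
    upright-minus-inverted difference); here just an arbitrary synthetic score."""
    x, y = dot_trajectory(params, np.linspace(0.0, 1.0, 60))
    return -np.var(x - y)

def evolve(pop_size=20, n_generations=25, sigma=0.1):
    pop = rng.standard_normal((pop_size, 8))
    for _ in range(n_generations):
        scores = np.array([fitness(p) for p in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]   # keep the fittest half
        idx = rng.integers(len(parents), size=(pop_size, 2)) # pick parent pairs
        mask = rng.random((pop_size, 8)) < 0.5               # uniform crossover
        pop = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        pop = pop + sigma * rng.standard_normal(pop.shape)   # Gaussian mutation
    return pop[np.argmax([fitness(p) for p in pop])]

best = evolve()
print("fittest parameter vector:", np.round(best, 2))
```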
2009
Papers
van der Zwan, R., MacHatch, C., Kozlowski, D., Troje, N. F., Blanke, O., Brooks, A.
The movement of an organism typically provides an observer with information in more than one sensory modality. The integration of information modalities reduces the likelihood that the observer will be confronted with a scene that is perceptually ambiguous. With that in mind, observers were presented with a series of point-light walkers, each of which varied in the strength of the gender information they carried. Presenting those stimuli with auditory walking sequences containing ambiguous gender information had no effect on observers' ratings of visually perceived gender. When the visual stimuli were paired with auditory cues that were unambiguously female, observers' judgments of walker gender shifted such that ambiguous walkers were judged to look more female. To show that this is a perceptual rather than a cognitive effect, we induced visual gender after-effects with and without accompanying female auditory cues. The pairing of gender-neutral visual stimuli with unambiguous female auditory cues during adaptation elicited male after-effects. These data suggest that biological motion processing mechanisms can integrate auditory and visual cues to facilitate the extraction of higher-order features like gender. Possible neural substrates are discussed.
Saunders, D. R., Suchan, J., Troje, N. F.
Biological-motion perception consists of a number of different phenomena. They include global mechanisms that support the retrieval of the coherent shape of a walker, but also mechanisms which derive information from the local motion of its parts about facing direction and animacy, independent of the particular shape of the display. A large body of the literature on biological-motion perception is based on a synthetic stimulus generated by an algorithm published by James Cutting in 1978 (Perception, 7, 393-405). Here we show that this particular stimulus lacks a visual invariant inherent to the local motion of the feet of a natural walker, which in more realistic motion patterns indicates the facing direction of a walker independent of its shape. Comparing Cutting's walker to a walker derived from motion-captured data of real human walkers, we find no difference between the two displays in a detection task designed such that observers had to rely on global shape. In a direction discrimination task, however, in which only local motion was accessible to the observer, performance on Cutting's walker was at chance, while direction could still be retrieved from the stimuli derived from the real walker.
Murphy, P., Brady, N., Fitzgerald, M., Troje, N. F.
A central feature of autistic spectrum disorders (ASDs) is a difficulty in identifying and reading human expressions, including those present in the moving human form. One previous study, by Blake et al. (2003), reports decreased sensitivity for perceiving biological motion in children with autism, suggesting that perceptual anomalies underlie problems in social cognition. We revisited this issue using a novel psychophysical task. 16 adults with ASDs and 16 controls were asked to detect the direction of movement of human point-light walkers which were presented in both normal and spatially scrambled forms in a background of noise. Unlike conventional direction discrimination tasks, in which walkers walk 'on the spot' while facing left or right, we added translatory motion to the stimulus so that the walkers physically moved across the screen. Therefore, while a cue of coherent, translatory motion was available in both the normal and scrambled walker forms, the normal walker alone contained information about the configuration and kinematics of the human body. There was a significant effect of walker type, with reduced response times and errors when the normal walker was present. Most importantly, these improvements were the same for both participant groups, suggesting that people with ASDs do not have difficulty integrating local visual information into a global percept of the moving human form. The discrepancy between these and previous findings of impaired biological motion perception in ASDs is discussed with reference to differences in the age and diagnosis of the participants, and the nature of the task.
Michalak, J., Troje, N., Fischer, J., Vollmar, P., Heidenreich, T., Schulte, D.
Objective: To analyze gait patterns associated with sadness and depression. Embodiment theories suggest a reciprocal relationship between bodily expression and the way in which emotions are processed. Methods: In Study 1, the gait patterns of 14 inpatients suffering from major depression were compared with those of matched never-depressed participants. In Study 2, we employed musical mood induction to induce sad and positive mood in a sample of 23 undergraduates. A Fourier-based description of walking data served as the basis for the computation of linear classifiers and for the analysis of gait parameters. Results: Gait patterns associated with sadness and depression are characterized by reduced walking speed, arm swing, and vertical head movements. Moreover, depressed and sad walkers displayed larger lateral swaying movements of the upper body and a more slumped posture. Conclusion: The results of the present study indicate that a specific gait pattern characterizes individuals in dysphoric mood.
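The Fourier-plus-linear-classifier pipeline mentioned above can be sketched as follows. This is a generic illustration rather than the study's actual analysis code: the gait data are random stand-ins, the number of harmonics is assumed, and a plain Fisher linear discriminant stands in for whatever linear classifier was used.

```python
import numpy as np

def fourier_features(trajectories, gait_freq, fps, n_harmonics=2):
    """Describe each marker coordinate of a walk by the Fourier coefficients
    of its first few harmonics of the gait frequency.
    trajectories: array of shape (n_frames, n_coords)."""
    n_frames, _ = trajectories.shape
    t = np.arange(n_frames) / fps
    feats = []
    for k in range(1, n_harmonics + 1):
        basis = np.exp(-2j * np.pi * k * gait_freq * t)   # (n_frames,)
        coeffs = basis @ trajectories / n_frames           # (n_coords,) complex
        feats.extend([coeffs.real, coeffs.imag])
    return np.concatenate(feats)

def fisher_discriminant(X_a, X_b):
    """Two-class linear classifier (Fisher's linear discriminant):
    returns a weight vector and decision threshold for groups a and b."""
    mu_a, mu_b = X_a.mean(axis=0), X_b.mean(axis=0)
    Sw = np.cov(X_a, rowvar=False) + np.cov(X_b, rowvar=False)
    w = np.linalg.pinv(Sw) @ (mu_a - mu_b)
    threshold = w @ (mu_a + mu_b) / 2
    return w, threshold

# Synthetic stand-ins for two groups of walkers (e.g. depressed vs. never-depressed).
rng = np.random.default_rng(1)
fps, gait_freq = 120, 1.0
walks_a = [rng.normal(0, 1, (240, 45)) for _ in range(14)]   # 15 markers x 3 coords
walks_b = [rng.normal(0, 1, (240, 45)) for _ in range(14)]
X_a = np.array([fourier_features(w, gait_freq, fps) for w in walks_a])
X_b = np.array([fourier_features(w, gait_freq, fps) for w in walks_b])
w, threshold = fisher_discriminant(X_a, X_b)
is_group_a = X_a @ w > threshold          # classify the training walks
```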
Jiménez Ortega, L., Stoppa, K., Güntürkün, O., Troje, N. F.
Many birds show a characteristic forward and backward head movement, called head bobbing, while walking, running, and sometimes during landing after flight. During the hold phase, the head of the bird remains stable in space, while during the thrust phase, the head is rapidly moved forward. Three main functions for head bobbing have been proposed: it might have a biomechanical cause, it might serve depth perception via motion parallax, or it might be an optokinetic response that primarily serves image stabilization for improved vision during the hold phase. To investigate vision during the different phases, and in particular to test for visual suppression during the saccadic thrust phase, we tested pigeons on a shape discrimination task, presenting the stimuli exclusively either in the hold phase, in the thrust phase, or at random times. Visual stimuli were presented either in a frontal or in a lateral position. Results clearly demonstrate that shape discrimination is as good during the thrust phase as it is during the hold phase.
Chang, D. H. F., Troje, N. F.
The perception of biological motion is subserved by both a global process that retrieves structural information and a local process that is sensitive to individual limb motions. Here, we present an experiment aimed to characterize these two mechanisms psychophysically. Naive observers were tested on one of two tasks. In a walker detection task designed to address global processing, observers were asked to discriminate coherent from scrambled walkers presented in separate intervals. In an alternate direction discrimination task designed to address primarily local processing, observers were asked to discriminate walking direction from both coherent and spatially scrambled displays. In both tasks, we investigated performance-specificity to human (versus non-human) motion and the effects of mask density and learning on task performance. Performance in the walker detection task was best for the human walker, was susceptible to learning, and was heavily hindered by increasing mask densities. In contrast, performance on the direction discrimination task, in particular for the scrambled walkers, was unaffected by walker type, did not show a learning trend, and was relatively robust to masking noise. These findings suggest that the visual system processes global and local information contained in biological motion via distinct neural mechanisms that have very different properties.
Chang, D. H. F., Troje, N. F.
The ability to derive the facing direction of a spatially scrambled point-light walker relies on the motions of the feet and is impaired if they are inverted. We exploited this local inversion effect in three experiments that employed novel stimuli derived from only fragments of full foot trajectories. In Experiment 1, observers were presented with stimuli derived from a single fragment or a pair of counterphase fragments of the foot trajectory of a human walker in a direction discrimination task. We show that direction can be retrieved for displays as short as 100 ms and is retrieved in an orientation-dependent manner only for stimuli derived from the paired fragments. In Experiment 2, we investigated direction retrieval from stimuli derived from paired fragments of other foot motions. We show that the inversion effect is correlated with the difference in vertical acceleration between the constituent fragments of each stimulus. In Experiment 3, we compared direction retrieval from the veridical human walker stimuli with stimuli that were identical but had accelerations removed. We show that the inversion effect disappears for the stimuli containing no accelerations. The results suggest that the local inversion effect is carried by accelerations contained in the foot motions.
Proceedings
Zeiler, M. D., Taylor G. W., Troje, N. F., Hinton, G. E.
In an effort to better understand the complex courtship behaviour of pigeons, we have built a model learned from motion capture data. We employ a Conditional Restricted Boltzmann Machine (CRBM) with binary latent features and real-valued visible units. The units are conditioned on information from previous time steps to capture dynamics. We validate a trained model by quantifying the characteristic "head-bobbing" present in pigeons. We also show how to predict missing data by marginalizing out the hidden variables and minimizing free energy.
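A minimal sketch of the prediction step described above is given below: a Gaussian-visible, binary-hidden CRBM whose biases are shifted by the recent past, with a missing frame filled in by gradient descent on the free energy. The parameters here are random rather than trained, and all dimensions are illustrative; the intent is only to show what "marginalizing out the hidden variables and minimizing free energy" amounts to.

```python
import numpy as np

rng = np.random.default_rng(2)
n_vis, n_hid, order = 12, 20, 3       # illustrative sizes, not the paper's

# Random (untrained) CRBM parameters; in the real model these are learned
# from the motion capture data.
W = rng.normal(0, 0.1, (n_vis, n_hid))            # visible-hidden weights
A = rng.normal(0, 0.1, (n_vis * order, n_vis))    # past -> visible (autoregressive)
B = rng.normal(0, 0.1, (n_vis * order, n_hid))    # past -> hidden
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def free_energy_and_grad(v, v_past):
    """Free energy of a Gaussian-visible, binary-hidden CRBM with the hidden
    units summed out, plus its gradient with respect to the visible frame v.
    v_past is the concatenation of the previous `order` frames."""
    dyn_vis = b_vis + v_past @ A                  # dynamic visible bias
    dyn_hid = b_hid + v_past @ B                  # dynamic hidden bias
    pre = v @ W + dyn_hid
    F = 0.5 * np.sum((v - dyn_vis) ** 2) - np.sum(np.logaddexp(0.0, pre))
    grad = (v - dyn_vis) - W @ (1.0 / (1.0 + np.exp(-pre)))
    return F, grad

def predict_frame(v_past, steps=200, lr=0.05):
    """Fill in a missing frame by minimising the free energy given the past."""
    v = b_vis + v_past @ A                        # start from the dynamic mean
    for _ in range(steps):
        _, grad = free_energy_and_grad(v, v_past)
        v = v - lr * grad
    return v

past = rng.normal(0, 1, n_vis * order)            # stand-in for the previous poses
next_pose = predict_frame(past)
```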
Book Chapters
Symposia and Published Abstracts
Troje, N. F., Rutherford, M. D.
A number of recent studies on biological motion perception in people with autism have produced conflicting results with respect to the question whether biological motion perception is impaired in people with autism spectrum disorder (ASD) or not. We designed two experiments which probe two different aspects of biological motion and tested a group of 13 adult, high functioning people with autism as well as a group of age matched control subjects. The first experiment required observers to indicate whether a display showing a mask of scrambled walkers also contained a coherent walker or not. Solving this task requires the observer to perceptually organize the dots constituting the walker into a coherent percept. The second task required the observer to indicate perceived facing direction of a walker presented in sagittal view. In the critical condition, the walker was scrambled. Solving this task requires intact processing of the cues contained in the local motion of individual dots which signals direction and animacy to normal observers. In both experiments, stimuli were shown both upright and inverted and the degree of the inversion effect which observers experience was quantified. In both tasks, human and non-human (cat, pigeon) walkers were employed. Results reproduced general effects of inversion, masking, and the nature of the walker, all of which have been shown earlier. However, they did not reveal any main group effect nor any interactions that involved the between-subject factor. In both experiments, the overall performance, the degree of the inversion effect, the sensitivity to mask density, and differences in the processing of human vs. non-human walkers were the same between the two groups. However, for the ASD group, in the direction task, we found a significant positive correlation between subjects' IQ and overall performance and a negative correlation between subjects' IQ and sensitivity to stimulus inversion.
Roddy, G., Troje, N. F., Gurnsey, R.
Previous research has shown that stimulus magnification is sufficient to equate sensitivity to biological motion across the visual field. However, this research used point-light walkers with fixed direction differences that make it impossible to judge whether the limits of walker direction discrimination change with eccentricity. We addressed this question by measuring walker direction-discrimination thresholds at a range of sizes from 0° to 16°. We found asymptotic thresholds, at all eccentricities, to be ±1.14 degrees from straight ahead. The psychometric functions at each eccentricity were shifted versions of each other on a log size axis. Therefore, when we divided stimulus size at each eccentricity (E) by an appropriate F = 1 + E/E2 (where E2 is the eccentricity at which stimulus size must double to achieve equivalent-to-foveal performance) all thresholds collapsed onto a single psychometric function. Therefore, stimulus magnification was sufficient to equate sensitivity to walker direction across the visual field. The average E2 value required to achieve this was 1.02. We also examined the role of attention in eccentricity-dependent sensitivity loss using a dual-task procedure in which participants were asked to judge first the colour (red or green) then the direction of the point-light walker. The difficulty of the colour judgment, and hence the level of attentional-engagement, was controlled by maintaining colour contrast at threshold levels. The dual-task returned a single E2 value of 1.20 for walker direction discrimination, suggesting that there is no effect of splitting attention between colour and direction at either fixation or 16°. Although there were no costs of splitting attention in the present study, it may be that such costs would be seen when subjects have to divide attention either between (i) two different spatial aspects of a stimulus or (ii) two different locations.
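The size-scaling rule used above reduces to a one-line computation. The sketch below applies it with the E2 value of about 1 reported in the abstract; the example stimulus size is arbitrary.

```python
# Eccentricity scaling: dividing stimulus size S at eccentricity E by
# F = 1 + E / E2 maps it onto an equivalent foveal size, so thresholds
# measured at different eccentricities collapse onto one psychometric function.
def equivalent_foveal_size(size_deg: float, eccentricity_deg: float, e2: float) -> float:
    return size_deg / (1.0 + eccentricity_deg / e2)

# With E2 around 1.0 as reported above, a 16-deg walker shown at 16 deg
# eccentricity behaves roughly like a 1-deg walker at fixation.
print(equivalent_foveal_size(16.0, 16.0, e2=1.02))   # ~0.96 deg
```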
Perry, A., Troje, N. F., Bentin, S.
Motor actions suppress the EEG activity over the sensory-motor cortex in a frequency range between 8 and 13 Hz, a range labeled mu rhythms. Mu suppression is induced not only by actual movements but also while the participant observes actions executed by someone else. This characteristic of mu rhythms putatively associates them with the mirror-neuron system, which in humans has been implicated in social skills and theory of mind (ToM). Further evidence for an association between mu rhythms and social skills comes both from studies of individuals with autism spectrum disorders and from a few studies with typical participants. These studies showed different mu rhythm modulations depending on the degree of social content of an observed human action. We further explored the basic relation between mu rhythms and social interaction. Specifically, using point-light biological motion, we manipulated the observer's task while keeping the stimuli identical across tasks. In separate blocks, EEG was recorded while observers were instructed to process either the gender, the emotion, or the intention of a moving pattern revealing the same biological motion of humans. The participants also completed two questionnaires: the Interpersonal Reactivity Index and the Empathy Quotient. Mu suppression was found in all conditions relative to a baseline consisting of a moving circle. The suppression was modulated by task, strengthening the proposed association between mu rhythms and social interaction skills. Significant correlations between mu suppression and the scores on the personality scales unveiled theory-based individual variability in the activation of the mu-suppression mechanism.
Legault, I., Troje, N. F., Faubert, J.
Human ability to perceive biological motion patterns is well established. Furthermore, it has been shown that older observers can be quite efficient at detecting biological motion. Recently, Legault & Faubert (VSS 2008) showed that young adults' biological motion perception is influenced by distance in virtual space. Observers performed well when a 1.8 meter biological motion target was located 1 meter or further away, but performance decreased dramatically at shorter distances (less than a meter). The purpose of the present study was to determine whether there is a difference between younger and older adults' performance when biological motion patterns are presented at different distances in virtual space. To create our setup, we used a fully immersive virtual reality environment (CAVE), giving the observers an immersive 3D experience. We used a biological motion pattern composed of 13 dots, walking left or right on a treadmill. The size of the walker was 1.80 meters and it was shown at virtual distances from the observer of 0.50, 1, 2, 4 and 16 meters. The observer's task was to identify the walker's direction (left or right) in upright and inverted conditions. The walker was presented in a scrambled mask which was generated by randomly selecting dots with biological motion patterns and repositioning them in 3D space. Threshold mask density was determined using an adaptive staircase procedure. The results showed that older adults are influenced by distance and that their performance begins to decrease at a distance of 4 meters, compared to young adults who perform well down to a distance of 1 meter. In other words, biological motion detection in noise, in upright and inverted conditions, depends on how far the walker is positioned in 3D virtual space, and the critical distance at which biological motion judgements break down depends strongly on observer age.
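The adaptive staircase mentioned above is not specified further, so the sketch below shows a generic 2-down/1-up staircase on mask density with a simulated observer. The step size, reversal count, and the 70.7%-correct convergence point are properties of this generic rule, not necessarily of the procedure actually used in the study.

```python
import numpy as np

def run_staircase(observer, start_density=20.0, step=2.0,
                  n_reversals=10, rng=np.random.default_rng(3)):
    """Generic 2-down/1-up staircase on mask density: two consecutive correct
    responses raise the density (harder), one error lowers it (easier).
    Returns the mean of the later reversal densities as the threshold estimate."""
    density, correct_streak, last_dir = start_density, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if observer(density, rng):
            correct_streak += 1
            if correct_streak < 2:
                continue
            direction, correct_streak = +1, 0
        else:
            direction, correct_streak = -1, 0
        if last_dir and direction != last_dir:
            reversals.append(density)
        last_dir = direction
        density = max(0.0, density + direction * step)
    return np.mean(reversals[len(reversals) // 2:])

# Simulated observer whose accuracy falls with mask density (for illustration).
def fake_observer(density, rng, true_threshold=30.0):
    p_correct = 0.5 + 0.5 / (1.0 + np.exp((density - true_threshold) / 4.0))
    return rng.random() < p_correct

print(run_staircase(fake_observer))
```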
Gurnsey, R., Troje, N. F.
Purpose: Size scaling compensates for eccentricity-dependent sensitivity loss in a point-light-walker (PLW) direction discrimination task (Gurnsey et al., Vision Research, 2008) and PLW direction discrimination thresholds reach similar asymptotically low levels at large sizes for eccentricities of 0 to 16° (Gurnsey et al., submitted). Here we ask how PLW direction discrimination thresholds change as a function of stimulus size and eccentricity for first and second order stimuli.
Methods: On each trial a PLW was shown moving left or right at an angle (±θ°) from straight ahead. An adaptive threshold procedure was used to determine the threshold angle θ at a range of stimulus sizes (uniform magnifications) at eccentricities from 0 to 16° in the right visual field. Second order walkers comprised uniform luminance dots embedded in dynamic noise (SO1) or vice versa (SO2). First order walkers were structurally identical to the second order walkers but had a higher mean luminance in the uniform luminance region; FO1 and FO2, respectively.
Results: Within each condition, dividing stimulus size at each eccentricity (E) by an appropriate F = 1 + E/E2 (where E2 is the eccentricity at which stimulus size must double to achieve equivalent-to-foveal performance) collapsed all thresholds onto a single psychometric function. The average E2 values were: E2(SO1) = 2.85, E2(SO2) = 2.03, E2(FO1) = 1.50 and E2(FO2) = 0.80; asymptotic thresholds averaged ±3.91°, ±3.83°, ±3.90° and ±4.17°, respectively. However, SO1 stimuli could not be discriminated at 8 and 16° and had to be much larger at fixation in order for thresholds to be measured.
Conclusions: Second order signals can elicit PLW direction discrimination thresholds similar to first order signals. For second order stimuli, noise dots in a uniform background convey information about walker direction at much smaller sizes than do uniform dots in a noise background.
Chang, D. H. F., Troje, N. F.
Traditional studies of acceleration perception have measured acceleration sensitivity in terms of the ratio of final to initial velocity or the proportion of change in velocity relative to the average velocity. From these studies, it is unclear how sensitivity to visual acceleration is affected by stimulus properties such as motion orientation, base velocity, and size. Here, we measured visual sensitivity to acceleration by parameterizing acceleration as it is defined: the change in velocity per unit time. Observers (n = 18) were asked to discriminate an accelerated stimulus from a constant velocity stimulus equated for mean velocity and size. Acceleration was adjusted according to the QUEST staircase procedure, and thresholds, defined as the acceleration discriminated at the 82%-correct level, were obtained for positive and negative acceleration, horizontal and vertical motion, two base velocities, and two trajectory sizes. Consistent with previous findings, thresholds, if expressed as the proportion of velocity change relative to the base velocity, were relatively constant across base velocities and sizes. Critically, we show that absolute acceleration thresholds varied in a manner analogous to Weber's law. We show also that thresholds were better for motions along the horizontal axis than the vertical axis, but only at the high base velocity and smaller size. Furthermore, acceleration sensitivity was not affected by the sign of acceleration or stimulus direction within the principal axes. These findings are discussed in the context of predictions of acceleration sensitivity from previous data for the perception of animate and inanimate motions.
Chang, D. H. F., Harris, L. R., Troje, N. F.
The perception of both biological motion and faces is widely reported to be orientation-dependent (i.e., is impaired when the stimulus is inverted). For the perception of biological motion, there are at least two distinct inversion effects: one that is based upon the retrieval of form from motion and another that is based on the local motion of the limbs. The orientation of a visual stimulus can however be described with respect to a variety of allocentric (e.g., gravity, visual-environment) and egocentric (e.g., head-based) frames of reference. In the standard experimental setting, such reference frames are all aligned. Here, we investigated the role of different reference frames for the perception of faces and biological motion by testing observers (n = 12) on face recognition and biological motion tasks inside the York University "tumbling room". Independent rotations of the room, observer, and stimulus enabled comparisons of retinal, visual-environmental, and gravitational frames of reference. The biological motion task required observers to discriminate the facing direction (left or right) of a treadmill point-light walker shown in sagittal view. The face recognition task required observers to determine whether two consecutively presented faces were of the same or different identities. Performances on both the biological motion and face recognition tasks were best when the stimulus was aligned with the observer, as compared to gravity or the room. Interestingly, performances were also better when the stimulus was aligned with gravity as compared to the room, but for the biological motion task only. The results suggest that the perception of both biological motion and faces operates in large part in accordance with an egocentric frame of reference.
Chang, D. H. F., Harris, L. R., Troje, N. F.
We investigated the reference frames for the face and biological motion inversion effects by testing observers on face recognition and biological motion direction discrimination tasks inside the York University "tumbling room". Rotations of the room and the observer enabled comparisons of retinal, visual-environmental, and gravitational frames of reference. Performances on the biological motion task were best when the stimulus was aligned with the observer, and better when the stimulus was aligned with gravity rather than the room. Face recognition was best when the stimulus was aligned with the observer but did not differ between congruency with gravity or the room. The results suggest that the perception of both biological motion and faces operates in large part in accordance with an egocentric frame of reference.
2008
Papers
Thompson, B., Troje, N. F., Hansen, B. C., Hess, R. F.
Although a number of low-level visual deficits in amblyopia have been identified, it is still unclear to what extent these deficits extend throughout the visual processing hierarchy. Biological motion perception can be a useful measure of local and global visual processing since the point-light stimuli that are often used to study this ability carry both local motion and global form information. To investigate the integrity of the biological motion processing system in amblyopia, we employed both detection and discrimination tasks with coherent or scrambled point-light walkers either alone or embedded in different types of point-light masks. These manipulations allowed for control over the amount of form and/or motion information available to the observers that could be used for task performance. We found that amblyopic eyes could process both the global form and local motion components of point-light walkers, indicating intact processing for these stimuli. However, amblyopic eyes did show an increased susceptibility to the addition of masking dots suggesting that segregation of signal from noise is deficient in amblyopia.
Provost, M. P., Troje, N. F., Quinsey, V. L.
Strategic pluralism suggests that women engage in short-term sexual relationships when the benefits to doing so outweigh the costs. We investigated attraction to indicators of good genes (namely, masculinity as demonstrated by point-light walkers) in women varying in menstrual cycle status and sociosexual orientation. When women are fertile, they have the ability to gain genetic benefits from a male partner and should also be attracted to high levels of masculinity in men as a signal of genetic benefits. Sociosexual orientation is an individual difference that indicates openness to short-term mating and, thus, should influence aspects of mating strategy. Women with an unrestricted sociosexual orientation, as compared to women with a restricted sociosexual orientation, are more likely to engage in short-term relationships and obtain fewer nongenetic resources from their mates. Thus, they should place heavy emphasis on male masculinity as a sign of genetic benefits available from their mates. In this study, women indicated the walker most attractive to them on a constructed continuum of male and female point-light walkers. In Study 1, fertile women, as compared to nonfertile women, showed a greater attraction to masculinity. In Study 2, women demonstrated a strong positive relationship between sociosexuality and attraction to masculinity.
Provost, M. P., Quinsey, V. L., Troje, N. F.
We investigated variations in gait between women at high and at low conception probability, and how men rated those variations. Women participated in a motion capture study where we recorded the kinematics of their walking patterns. Women who were not using hormonal contraception (n = 19) repeated the study during the late follicular stage and the luteal stage of their menstrual cycle. Using a discriminant function analysis, we found significant differences in walking behavior between naturally cycling women at their follicular and luteal phases, with 71% of the walks classified correctly. However, there was no difference between walks of women in their follicular stage and women using hormonal birth control (n = 23). We compared structural and kinematic characteristics of the women's walking patterns that appeared to be characteristic of women in the specific conception risk groups, but found no significant differences. In a second study, 35 men rated the walks of women not using hormonal contraception as slightly more attractive during the luteal stage of the cycle compared to the late follicular stage. Thus, for women not using hormonal birth control, it appears that some information regarding female fertility is encoded in gait.
König, A., Schölmerich, A., Troje, N. F.
Differences between boys and girls in anatomic structural features and dynamic movement patterns increase during childhood. We studied anatomic structural properties and dynamic characteristics in a cross-sectional sample of 27 girls and 27 boys ranging from 4 to 16 years of age. The subjects walked and were filmed to create digitized 3-D-point-light models. Linear discriminant functions based on dynamic information classified gender and age of individual walkers above chance level, the accuracy increasing with age. In addition, discriminant functions based on anatomic structural information could identify gender only within separated age-groups. Correlative interactions between age-specific anatomic body structure and dynamic aspects appear to differ for both genders during different developmental phases. These results have implications for anthropometric norms and the development of movement patterns.
Jiménez Ortega, L., Stoppa, K., Güntürkün, O., Troje, N. F.
The retina of the pigeon has two areas of enhanced vision: the red field looking into the frontal binocular field and the yellow field projecting into the lateral monocular field. The entire retina projects to the tectofugal pathway, whereas the monocular areas mainly project to the thalamofugal pathway. In the present study we examine how the information received in different retinal areas and hemispheres is integrated within the pigeon brain. The pigeons' task was to discriminate between two shapes by pecking on one of the two keys located at one end of an experimental alley, while walking back and forth between two feeders. Intraocular transfer between the red and the yellow field was tested by moving the stimulus from the frontal to the lateral visual field in consecutive steps and vice versa. When the stimuli were perceived at the border between the red and the yellow field, the pigeons showed a drastic decrease in performance that we interpret to result from a switch from the tectofugal to the thalamofugal system. There were virtually no traces of intraocular transfer of information from the tectofugal to the thalamofugal pathway, although, in a second experiment, a weak intraocular transfer of information from the thalamofugal to the tectofugal system was observed. In a third experiment, interocular transfer of information between the yellow fields of the two eyes was tested. In eight out of nine birds, no interocular transfer was found. In addition, pigeons had more difficulty learning the task in the monocular right visual field than in the monocular left visual field, suggesting the existence of an asymmetric organization of the thalamofugal system in the pigeon brain.
Gurnsey, R., Roddy, G., Ouhnana, M., Troje, N. F.
There is conflicting evidence about whether stimulus magnification is sufficient to equate the discriminability of point-light walkers across the visual field. We measured the accuracy with which observers could report the directions of point-light walkers moving ±4° from the line of sight, and the accuracy with which they could identify five different point-light walkers. In both cases accuracy was measured over a sevenfold range of sizes at eccentricities from 0° to 16° in the right visual field. In most cases observers (N = 6) achieved 100% accuracy at the largest stimulus sizes (20° height) at all eccentricities. In both tasks the psychometric functions at each eccentricity were shifted versions of each other on a log-size axis. Therefore, by dividing stimulus size at each eccentricity (E) by an appropriate F = 1 + E/E2 (where E2 represents the eccentricity at which stimulus size must double to achieve equivalent-to-foveal performance) all data could be fit with a single function. The average E2 value was .91 (SEM = .19, N = 6) in the walker-direction discrimination task and 1.34 (SEM = .21, N = 6) in the walker identification task. We conclude that size scaling is sufficient to equate discrimination and identification of point-light walkers across the visual field.
Freitag, C. M., Konrad, C., Häberlen, M., Kleser, C., von Gontard, A., Reith, W., Troje, N. F., Krick, C.
In individuals with autism or autism-spectrum-disorder (ASD), conflicting results have been reported regarding the processing of biological motion tasks. As biological motion perception and recognition might be related to impaired imitation, gross motor skills and autism specific psychopathology in individuals with ASD, we performed a functional MRI study on biological motion perception in a sample of 15 adolescent and young adult individuals with ASD and typically developing, age, sex and IQ matched controls. Neuronal activation during biological motion perception was compared between groups, and correlation patterns of imitation, gross motor and behavioral measures with neuronal activation were explored. Differences in local gray matter volume between groups as well as correlation patterns of psychopathological measures with gray matter volume were additionally compared. On the behavioral level, recognition of biological motion was assessed by a reaction time (RT) task. Groups differed strongly with regard to neuronal activation and RT, and differential correlation patterns with behavioral as well as with imitation and gross motor abilities were elicited across and within groups. However, contrasting with the initial hypothesis, additional differences between groups were observed during perception and recognition of spatially moving point lights in general irrespective of biological motion. Results either point towards difficulties in higher-order motion perception or in the integration of complex motion information in the association cortex. This interpretation is supported by differences in gray matter volume as well as correlation with repetitive behavior bilaterally in the parietal cortex and the right medial temporal cortex. The specific correlation of neuronal activation during biological motion perception with hand-finger imitation, dynamic balance and diadochokinesis abilities emphasizes the possible relevance of difficulties in biological motion perception or impaired self-other matching for action imitation and gross motor difficulties in individuals with ASD.
Chang, D. H. F., Troje, N. F.
We present three experiments that investigated the perception of animacy and direction from local biological motion cues. Coherent and scrambled point-light displays of humans, cats, and pigeons that were upright or inverted were embedded in a random dot mask and presented to naive observers. Observers assessed the animacy of the walker on a six-point Likert scale in Experiment 1, discriminated the direction of walking in Experiment 2, and completed both the animacy rating and the direction discrimination tasks in Experiment 3. We show that like the ability to discriminate direction, the perception of animacy from scrambled displays that contain solely local cues is orientation specific and can be well-elicited within exposure times as short as 200 ms. We show further that animacy ratings attributed to our stimuli are linearly correlated with the ability to discriminate their direction of walking. We conclude that the mechanisms responsible for processing local biological motion signals not only retrieve locomotive direction but also aid in assessing the presence of animate agents in the visual environment.
Brooks, A., Schouten, B., Troje, N. F., Verfaillie, K., Blanke, O., van der Zwan, R.
The sensitivity of the mammalian visual system to biological motion cues has been shown to be general and acute [1], [2] and [3]. Human observers, in particular, can deduce higher-order information, such as the orientation of a figure (which way it is facing), its gender, emotional state, and even personality traits, on the basis only of sparse motion cues. Even when the stimulus information is confined to point lights attached to the major joints of an actor (so-called point-light figures), observers can use information about the way the actor is moving to tell what they are doing, whether they are a male or female, and how they are feeling [4], [5] and [6]. Here we report the novel finding that stimulus manipulations that made such walkers appear more female also had the effect of making the walkers appear more often as if they were walking away from rather than towards observers. Using frontal-view (or rear-view) point-light displays of human walkers, we asked observers to judge whether they seemed to be walking towards or away from the viewing position. Independent of their own gender, observers reliably reported those figures they perceived to be male as looking like they were approaching (as reported in [7]), but those they perceived to be female as walking away. Furthermore, figures perceived to be gender-neutral also appeared more often, although not exclusively, to be walking towards observers.
Aaen-Stockdale, C., Thompson, B., Hess, R. F., Troje, N. F.
Previous work investigating whether biological motion is supported by local second-order motion has been contradictory, with different groups finding either a difference or no difference in performance compared to that obtained with first-order stimuli. Here we show psychophysically, using randomized-polarity and contrast-modulated stimuli, that detection of second-order biological motion walkers is worse for stimuli defined by second-order cues, but this difference is explained by a difference in visibility of the local motion in the stimuli. By mixing first-order and second-order dots within the same stimulus, we show that, when the two types of dot are equally visible, first-order noise dots can mask a second-order walker, and vice-versa. We also show that direction-discrimination of normal, inverted and scrambled walkers follow the same pattern for second-order as that obtained with first-order stimuli. These results are consistent with biological motion being processed by a mechanism that is cue-invariant.
Book Chapters
Symposia and Published Abstracts
Williamson, K. E., Jakobson, L. S., Troje, N. F.
Recently, Troje (2008; Troje & Westhoff, 2006) has suggested that the local motion contained in upright, scrambled biological motion displays can trigger a simple "life detection" mechanism. The goal of the present study was to further characterize this mechanism. In two experiments, we assessed participants' ability to make accurate direction-facing judgments about point-light displays presented very briefly in central vision. In both experiments, the walkers varied in terms of the amount of the configural information that was available in the displays, and with regard to their orientation (upright or inverted) and facing direction. In the first experiment (in which stimuli were unmasked) we found that heading could be discerned from upright, scrambled displays even with brief (170 ms) exposure durations. In the second experiment, we showed that local motion cues could support accurate heading judgments, regardless of the species depicted (human, cat or pigeon). In contrast, when viewers had to rely solely on global cues to make their heading judgments, their performance was disproportionately better with upright human displays. Exposure times in this experiment were 500 ms, and all stimuli were masked. Whether they had to rely on local or global cues to make their heading judgments, viewers in Experiment 2 (unlike those in Experiment 1) tended to show a bias to report seeing a right-facing walker. We speculate that the right-facing bias may be more apparent when longer exposure durations are used, or in situations where greater attentional resources are required (as is the case when a target must be disembedded from a mask). The right-facing bias is discussed in relation to the literature on attentional biases and specialized scanning habits associated with reading.
Ware, E. L. R., Saunders, D. R., Troje, N. F.
Pigeons (Columba livia) and their mutual courtship display represent a good animal model of interactive communication. Here we ask the question: Is pigeon courtship behaviour based on anticipation of social reactions and controlled by social feedback? In other words, is it inter-subjective in nature? Using a closed-circuit TV setup that allows the manipulation of real-time interaction between two pigeons, we manipulated social contingency and inter-individual timing, to test for the perception of social influence and social synchrony respectively. To test social synchrony perception we delayed the interaction by 0, 1, 3, or 9 sec. To test social contingency perception subjects courted another pigeon either in real-time (contingent interaction) or on pre-recorded video (non-contingent). We then repeated this experiment, extending the duration of each condition from 2 to 6 minutes. In all cases, our dependent measure was courtship intensity. Our results show that pigeons adjust their display intensity depending on the presence of social contingencies during mutual visual display, an effect observed only when conditions last 6 minutes. Pigeons, representing a highly social species, appear to possess true inter-subjectivity, a capacity that is known to be omnipresent in humans.
Troje, N. F.
Innate sensitivity to characteristics of social stimuli, such as faces and biological motion, may facilitate learning about caregivers and other conspecifics. To date, most research has focused on early preferences for faces. Within two hours of birth newborns look preferentially towards three blobs arranged as facial features (e.g., Mondloch et al., 1999); likewise, visually inexperienced chicks preferentially approach the head and neck region of a hen (Johnson & Horn, 1988). These early preferences are not tuned to species-specific details of faces but serve two important functions: they facilitate rapid recognition of conspecifics (i.e., potential care-givers) and, at least in humans, allow for the later development of expert face processing (Le Grand et al., 2001). The present symposium will examine whether similar developmental principles apply to biological motion. The first two speakers will present evidence that both dark-reared chicks (Regolin) and human newborns (Simion) demonstrate a preference for biological motion over other patterns of motion. Like face perception, this preference is not species-specific; newborns look preferentially towards walking hens and chicks show a preference for walking cats. The ability of both chicks and human newborns to detect biological motion is impaired when stimuli are inverted, indicating that perception of biological motion is constrained by core knowledge of gravity. Despite similar patterns of sensitivity to faces and biological motion in early development, the two systems are differentially affected by early visual deprivation. The third presentation (Maurer) will draw on studies of children treated for bilateral congenital cataract to show that, unlike most visual functions, sensitivity to biological motion develops normally in the absence of early visual experience. The discussant (Troje), who is well known for his extensive studies of adults' sensitivity to biological motion, will consider the implications of these findings for understanding social development and the origins of a "life detector".
Troje, N. F.
If biological motion point-light displays are presented upside down, performance on most tasks is strongly impaired. We have recently shown that this inversion effect has two entirely different and independent causes. One is due to the inversion of the familiar upright shape. The second is related to a visual filter tuned to the gravity-constrained local motion of the feet of a human or animal in locomotion. Here, we are investigating whether the two inversion effects operate in retinal coordinates or in gravitational coordinates. We designed two different tasks isolating the structure-from-motion aspect of biological motion, on the one hand, and the mechanism tuned to local motion, on the other hand, and conducted experiments in which either the stimulus or the observer were turned upside down. The results clearly indicate that both inversion effects operate in retinal coordinates and are not affected by vestibular input. Apparently, a heuristic that gravity is aligned with retinal coordinates is replacing a reality check that would require visual-vestibular sensory integration.
Thurman, S., Pyles, J., Troje, N. F.
There is a growing body of literature investigating point-light biological motion perception. Based solely on the kinematics of a handful of dots representing the body and major joints of a human actor, observers can extract complex information such as gender from point-light displays. Many previous studies have used artificially generated point-light animations to investigate critical features for gender discrimination (Cutting, 1978; Mather & Murdoch, 1994). Here we investigate the diagnostic cues for gender discrimination of natural point-light walkers using a technique similar to temporal "bubbles" (Thurman & Grossman, 2007), an adaptation of the "bubbles" technique (Gosselin & Schyns, 2001). We presented three full cycles of a point-light walker, randomly chosen from a set of 25 male and 25 female actors (Troje, 2002), while observers made forced-choice gender discriminations. On each trial, we removed a randomly chosen subset of frames from the animation and assessed performance as a function of frames present and absent. We reason that performance is best when a non-critical interval is removed, but declines when a critical interval is removed. Hence, our experiment identifies the temporal windows and diagnostic features that most often lead to a correct gender discrimination. Preliminary results suggest that hip sway, as reflected by the distance between the hip dots over time in the profile view, is a primary critical feature for discriminating gender in natural point-light displays. This result is consistent with previous studies using artificially generated point-light animations (Barclay et al., 1978; Cutting 1978). This interpretation is supported by the observation that male walkers in our data set with high levels of hip movement are consistently misclassified as female, and that females with low hip movement are typically misclassified as male.
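The frame-removal logic of the temporal "bubbles" analysis can be sketched as follows. The responses here are simulated and the numbers are arbitrary; the point is only how the presence or absence of each frame is related to accuracy in order to obtain a diagnostic profile over the gait cycle.

```python
import numpy as np

rng = np.random.default_rng(4)
n_frames, n_trials, n_removed = 90, 2000, 30   # illustrative numbers

# On each trial a random subset of frames is removed from the walker movie;
# the observer's gender judgement is scored correct or incorrect.
present = np.ones((n_trials, n_frames), dtype=bool)
for i in range(n_trials):
    present[i, rng.choice(n_frames, size=n_removed, replace=False)] = False

# Simulated responses: accuracy suffers when "critical" frames 40-49 are missing.
critical = np.zeros(n_frames, dtype=bool)
critical[40:50] = True
p_correct = 0.6 + 0.3 * present[:, critical].mean(axis=1)
correct = rng.random(n_trials) < p_correct

# Diagnostic profile: how much does accuracy drop when a given frame is absent?
acc_present = np.array([correct[present[:, f]].mean() for f in range(n_frames)])
acc_absent = np.array([correct[~present[:, f]].mean() for f in range(n_frames)])
diagnostic = acc_present - acc_absent          # peaks over the critical interval
```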
Saunders, D. R., Gurnsey, R., Troje, N. F.
Biological motion perception involves two distinct visual mechanisms. One is based on deriving the global shape of the agent, while the other is based on the local motion of individual dots. In this study, we present a new method that allows us to further characterize the two mechanisms. By measuring azimuth discrimination thresholds for point-light walkers that contain either only local information or only global structure, we avoid a number of confounds which make other methods less reliable. Our results confirm the dissociation between the two proposed mechanisms.
Murphy, P., Brady, N., Troje, N. F.
The question of whether individuals with autism spectrum disorder (ASD) are impaired in the perception of biological motion is as yet unresolved. Here adults with high-functioning autism and neurotypical controls judged the direction of motion of a normal or spatially scrambled point-light walker which, on each trial, walked from the centre of the screen either leftward or rightward at approximately 3 deg/s in a strip of scrambled walker noise of variable density. The walker appeared with variable onset time between 0 and 500 ms after the noise onset. While the ASD group showed slower reaction times and more errors in judging the direction of motion, their performance was otherwise comparable to controls; specifically, they showed superior performance for normal over scrambled walkers, for delayed over immediate onset, and they showed comparable increases in reaction time and error with noise density. Finally, error rates differed on leftward and rightward trials, an effect which distinguished ASD and control performance: this is discussed with reference to hemispheric asymmetry in local and global processing.
Michalak, J., Troje, N. F., Schulte, D., Heidenreich, T.
Objectives: (1) Do dynamic gait patterns of currently and formerly depressed patients differ from those of never-depressed people? (2) Does mindfulness-based cognitive therapy (MBCT) normalize gait patterns of formerly depressed patients? Methods: Gait patterns of 30 formerly depressed patients participating in MBCT, 14 currently depressed inpatients, and 30 never-depressed participants were analyzed by Fourier-based descriptions and computation of linear classifiers. Results: Gait patterns of currently depressed patients and formerly depressed patients differ from those of never-depressed people. MBCT has some normalizing effect on the way patients walk. Conclusions: Mindfulness might change proprioceptive-bodily feedback important in the generation of depressive states.
Kuhlmeier, V., Troje, N. F., Lee, V.
Background and Aims: Biological motion can be conveyed through capturing videos of humans walking with point lights attached to their major joints, thus eliminating the appearance of mass, depth, and bodily features. Studies consistently show that adult observers immediately identify biological motion displays as a human walking (Johansson, 1973). Additionally, for adults, scrambled biological motion, which is completely devoid of structural information, not only retains information about the direction of a walking human, but also is subject to a pronounced inversion effect such that the direction of inverted walkers is difficult to determine (Troje & Westhoff, 2006). Infants as young as 3 months of age are also sensitive to biological motion (e.g., Bertenthal et al., 1984, 1985); we tested here whether they can also detect the direction of walking, and if this is subject to an inversion effect. Procedure: 6-month-old infants were habituated to movies of an upright (Experiment 1) or inverted (Experiment 2) point-light walker who walked as if on a treadmill (i.e., there was no actual displacement across the screen). For half of the infants, the walk was to the right, and for half the walk was to the left. At test, infants saw the familiar direction of walking on one trial, and the new direction on a second, with order counterbalanced. Looking time to each display was recorded and analyzed. Results and Conclusions: When presented with an upright walker, infants were sensitive to a switch in the direction of walking from habituation to test. Infants looked longer at the new direction than the old in test trials (familiar direction M=7.64s, new direction M=14.03s; t(19)=2.86, p=.01). In contrast, when the walker was inverted, infants did not seem to notice the switch in direction, and looking times were equal across both trials (familiar direction M=10.06s, new direction M=8.61s; t(18)=.510, p=.616). Thus, young infants not only seem to recognize the difference between upright and inverted walkers and visually prefer the former (Bertenthal et al., 1984), but they also can detect the direction of motion of upright walkers. The detection of direction does not, however, extend to inverted walkers, suggesting the existence of an inversion effect.
König, A., Schölmerich, A., Troje, N. F.
Summary: Human gait serves not only locomotion; beyond that, it is an effective medium of social communication. Thanks to highly specialized brain areas, our perceptual system is able to efficiently encode a wealth of relevant information from biological motion patterns, even from strongly reduced stimuli. A few seconds are usually enough to identify the gender, age, and emotional state of another person. The point-light display method developed by Gunnar Johansson in the 1970s allows the dynamic information of gait to be separated from structural anatomical properties. In this way, prototypical child and adult 3D models of both sexes, each consisting of 15 point lights, were generated and presented by computer to a sample of child sexual abusers (n = 65) and control participants (n = 68). Depending on the age of the point-light walkers, significant group differences emerged, among other findings, in the attractiveness ratings of the stimuli. Based on the participants' attractiveness, gender, and age ratings, first classification results with respect to different participant characteristics (e.g., sexual orientation, ICD-10 diagnosis of pedophilia, victim gender, and victim age) will be presented using Classification and Regression Tree analysis (CART).
Holland, G., Mody, S., Troje, N. F.
A significant amount of past research has studied person identification from point light displays of walking humans, investigating parameters such as viewing angle and the differential contributions of structural and kinematic information. However, little is known about the ability of human observers to generalize identity across different activities. In this study we use a same/different paradigm to compare observers' ability to identify point light displays within and across activities. We drew from a database of 100 motion-captured humans, each of which encompassed both walking and running activities. Subjects were shown successive paired stimuli and had to indicate whether the stimuli represented the same or different person. In either case, the two displays were at slightly different viewpoints. Two independent factors were examined: stimulus pairing (walker/walker, runner/runner, walker/runner) and information content (structural only, kinematic only, full information). For all information contents, subjects performed significantly better than chance for stimulus pairings of matching activities (walker/walker, runner/runner) (t(5)=2.71, p < 0.05). The main effect of Pairing was significant (F(2, 30)=35.7, p < 0.001), with the walker/runner pairing being the most difficult. Information was not a significant factor. However, there was a significant interaction between Pairing and Information (F(4, 30)=4.03, p < 0.01) that manifested in performance on the runner/runner task in particular being better for full information than for structural or kinematic only. Results are discussed in light of a principal components-based linear model that estimates a runner time series from a given walker time series by equating principal component coordinates.
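The principal-components-based walker-to-runner mapping mentioned in the last sentence can be sketched roughly as below. The data are random stand-ins, and the sketch glosses over the alignment of the two PC bases (sign and ordering), which a real model fitted to paired walking and running data would have to handle.

```python
import numpy as np

def pca_fit(X, n_components):
    """Plain PCA: returns the mean and principal axes (as rows) of data X."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

rng = np.random.default_rng(5)
n_people, dim, k = 100, 600, 5                     # illustrative sizes

walk_data = rng.normal(0, 1, (n_people, dim))      # stand-ins for flattened
run_data = rng.normal(0, 1, (n_people, dim))       # walking / running time series

walk_mean, walk_axes = pca_fit(walk_data, k)
run_mean, run_axes = pca_fit(run_data, k)

def estimate_run_from_walk(walk):
    """Project a walk into walker PC space and rebuild it with the runner
    mean and axes, i.e. equate the principal-component coordinates."""
    coords = walk_axes @ (walk - walk_mean)
    return run_mean + coords @ run_axes

predicted_run = estimate_run_from_walk(walk_data[0])
```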
Chang, D. H. F., Troje, N. F.
The ability to discriminate direction from spatially scrambled point-light displays relies on the orientation of the foot dot motions (Troje & Westhoff, 2006). We present two experiments that investigated this local motion-based inversion effect by testing direction discrimination from novel biological motion displays that exaggerate and display solely foot-specific information. In Experiment 1, we isolated the foot motion of a treadmill human walker, human runner, cat, and pigeon and presented observers (n = 20) with 1000 ms displays consisting of 10 copies of two foot dots that traced 150 ms segments at counterphase positions of the gait cycle. For each foot type, we derived left and right signalling displays from five such segment pairs that collectively sampled the entire gait cycle and presented them at both upright and inverted orientations. Direction discrimination accuracies varied with foot type, orientation, and segment pair. Significantly, the decrease in accuracies due to inversion was most substantial for the runner stimuli which exhibit the most pronounced vertical velocity changes and smallest for the cat stimuli which carry little vertical motion. In Experiment 2, a new group of observers (n = 20) were presented with the natural human walker stimuli of Experiment 1 and with stimuli that were spatiotemporally-matched to the natural stimuli but moved with constant velocities. Here, overall discrimination accuracies did not differ per foot type, decreased with inversion, and varied with segment pair. Critically, performances were higher for upright than for inverted displays for the natural stimuli only. Upright and inverted versions of the constant velocity stimuli did not differ. The results suggest that the local inversion effect in biological motion perception is carried by the velocity gradients of the foot motions. We conjecture that the visual system is sensitive to characteristic velocity changes exhibited by biological movements in a gravity-driven environment.
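One plausible way to construct constant-velocity counterparts of a natural foot trajectory, as described above, is to re-sample the same spatial path at equal arc-length steps so that the dot covers it at uniform speed. The sketch below does exactly that; it is an illustration of the idea, not necessarily the authors' stimulus-generation procedure.

```python
import numpy as np

def constant_velocity_version(trajectory):
    """Re-sample a dot trajectory (n_frames, 2) so that it traverses the same
    spatial path in the same number of frames but at constant speed,
    removing the velocity changes (accelerations) of the original motion."""
    deltas = np.diff(trajectory, axis=0)
    arc = np.concatenate([[0.0], np.cumsum(np.linalg.norm(deltas, axis=1))])
    targets = np.linspace(0.0, arc[-1], len(trajectory))   # equal arc-length steps
    x = np.interp(targets, arc, trajectory[:, 0])
    y = np.interp(targets, arc, trajectory[:, 1])
    return np.stack([x, y], axis=1)

# Example: a foot-like trajectory with a fast swing and a slower stance phase.
t = np.linspace(0, 2 * np.pi, 120)
foot = np.stack([np.cos(t) + 0.3 * np.cos(2 * t), 0.4 * np.abs(np.sin(t))], axis=1)
uniform_foot = constant_velocity_version(foot)
```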
Chang, D.H.F., Troje, N. F.
We tested direction discrimination from biological motion stimuli that display only fragments of full foot trajectories at upright or inverted orientations. Results from observers presented with displays derived from counterphase fragments of different types of foot motions showed an inversion effect that was largest for stimuli derived from the human runner which exhibit pronounced vertical accelerations. Results from new observers presented with veridical human walker stimuli and stimuli that were identical but had accelerations removed showed an inversion effect for the veridical stimuli only. These findings suggest that the local inversion effect is carried by acceleration cues in foot motions.
2007
Papers
Zhang, Z., Troje, N. F.
In this paper, we present and evaluate a method of reconstructing three-dimensional (3D) periodic human motion from two-dimensional (2D) motion sequences. Using Fourier decomposition, we construct a compact representation for periodic human motion. A low-dimensional linear motion model is learned from a training set of 3D Fourier representations by means of Principal Components Analysis. Two-dimensional test data are projected onto this model with two approaches: least-squares minimization and calculation of a maximum a posteriori probability using Bayes' rule. We present two different experiments in which both approaches are applied to 2D data obtained from 3D walking sequences projected onto a plane. In the first experiment, we assume the viewpoint is known. In the second experiment, the horizontal viewpoint is unknown and is recovered from the 2D motion data. The results demonstrate that using the linear model not only can missing motion data be reconstructed, but unknown view angles for 2D test data can also be retrieved.
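The least-squares variant of the pipeline described above can be sketched in a few lines of Python. The sketch is illustrative only: the Fourier-coefficient arrays, their dimensions, and the projection matrix are placeholder assumptions, and only the model fit and the least-squares projection step are shown, not the authors' actual implementation.

import numpy as np

# Hypothetical data: each row is one 3D training walker's Fourier coefficients
# (means plus a few harmonics per marker coordinate), flattened to a vector.
rng = np.random.default_rng(0)
train_3d = rng.normal(size=(50, 120))   # 50 training walkers, 120 coefficients
test_2d = rng.normal(size=80)           # one 2D test sequence, 80 coefficients

# Learn the low-dimensional linear motion model with PCA.
mean_3d = train_3d.mean(axis=0)
_, _, vt = np.linalg.svd(train_3d - mean_3d, full_matrices=False)
components = vt[:10]                    # keep the first 10 principal components

# Assumed known camera operator mapping a 3D coefficient vector to its
# 2D counterpart (here just a random placeholder matrix).
projection = rng.normal(size=(80, 120))

# Least-squares estimate of the PC coordinates that best explain the 2D data:
# minimise || projection @ (mean_3d + components.T @ a) - test_2d ||^2 over a.
design = projection @ components.T
a, *_ = np.linalg.lstsq(design, test_2d - projection @ mean_3d, rcond=None)

# Reconstructed 3D Fourier representation of the test walker.
reconstruction_3d = mean_3d + components.T @ a
print(reconstruction_3d.shape)          # (120,)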
Westhoff, C., Troje, N. F.
We examined the role of kinematic information for person identification. Observers learned to name seven walkers shown as point-light displays, which were normalized by their size, shape, and gait frequency, either under a frontal, half-profile, or profile view. In two experiments we analyzed the impact of individual harmonics as created by a Fourier analysis of a walking pattern, as well as the relative importance of the amplitude and the phase spectra in walkers shown from different viewpoints. The first harmonic contains most of the individual information, but performance was also above chance level when only the second harmonic was available. Normalization of the amplitude of a walking pattern resulted in a severe deterioration of performance, whereas the relative phase of the point-lights is only used from a frontal viewpoint. No overall advantage for a single learning viewpoint was found, and there is considerable generalization to novel testing viewpoints.
Vocks, S., Legenbauer, T., Troje, N. F., Rüddel, H., Schulte, D.
The aim of the present study was to find out whether in bulimia nervosa the perceptual component of a disturbed body image is restricted to the overestimation of one's own body dimensions (static body image) or can be extended to a misperception of one's own motion patterns (dynamic body image). Method: Participants with bulimia nervosa (n = 30) and normal controls (n = 55) estimated their body dimensions by means of a photo distortion technique and their walking patterns using a biological motion distortion device. Results: Not only did participants with bulimia nervosa overestimate their own body dimensions, but they also perceived their own motion patterns as corresponding to a higher BMI than did controls. Static body image was correlated with shape/weight concerns and drive for thinness, whereas dynamic body image was associated with social insecurity and body image avoidance. Conclusion: In bulimia nervosa, body image disturbances can be extended to a dynamic component.
Thompson, B., Hansen, B. C., Hess, R. F., Troje, N. F.
Biological motion perception, having both evolutionary and social importance, is performed by the human visual system with a high degree of sensitivity. It is unclear whether peripheral vision has access to the specialized neural systems underlying biological motion perception; however, given the motion component, one would expect peripheral vision to be, if not specialized, at least highly accurate in perceiving biological motion. Here we show that the periphery can indeed perceive biological motion. However, the periphery suffers from an inability to detect biological motion signals when they are embedded in dynamic visual noise. We suggest that this peripheral deficit is not due to biological motion perception per se, but to signal/noise segregation.
Liedvogel, M., Feenders, G., Wada, K., Troje, N. F., Jarvis, E. D., Mouritsen, H.
Cluster N is a cluster of forebrain regions found in night-migratory songbirds that shows high activation of activity-dependent gene expression during night-time vision. We have suggested that Cluster N may function as a specialized night-vision area in night-migratory birds and that it may be involved in processing light-mediated magnetic compass information. Here, we investigated these ideas. We found a significant lateralized dominance of Cluster N activation in the right hemisphere of European robins (Erithacus rubecula). Activation predominantly originated from the contralateral (left) eye. Garden warblers (Sylvia borin) tested under different magnetic field conditions and under monochromatic red light did not show significant differences in Cluster N activation. In the fairly sedentary Sardinian warbler (Sylvia melanocephala), which belongs to the same phylogenetic clade, Cluster N showed prominent activation levels, similar to those observed in garden warblers and European robins. Thus, it seems that Cluster N activation occurs at night in all species within predominantly migratory groups of birds, probably because such birds have the capability of switching between migratory and sedentary lifestyles. The activation studies suggest that although Cluster N is lateralized, as is the dependence on magnetic compass orientation, either Cluster N is not involved in magnetic processing or the magnetic modulations of the primary visual signal, forming the basis for the currently supported light-dependent magnetic compass mechanism, are relatively small such that activity-dependent gene expression changes are not sensitive enough to pick them up.
Folta, K., Troje, N. F., Güntürkün, O.
Neurons of the pigeon's diencephalic n. rotundus were demonstrated to show visual responses of short and long latency, representing ascending signals of the retino-tecto-rotundal system and descending signals from telencephalo-tecto-rotundal fibers. Pigeons thus provide an ideal model to investigate the convergence of ascending and descending visual processing streams at the single-cell level. Although it is known that rotundal responses of long latency show distinct response characteristics depending on whether the stimulus is presented monocularly or binocularly, the mechanisms underlying these response differences are still unclear. While it is possible that the simultaneity of eye stimulation produces a change of processing, it is also possible that the relative timing and order between ipsilateral and contralateral signals are the decisive variable. To test between both possibilities, we recorded from cells in the pigeon's n. rotundus while providing monocular or binocular visual stimulation and varying the delay and order of eye presentations. We found that the precise temporal interaction and order of ascending and descending inputs to the tectum determine whether late responses show burst or tonic characteristics. When descending signals reached the tectum before the ascending signals, rotundal cells showed late responses that were characterized by burst activity patterns. When ascending input reached the tectum first, responses with tonic characteristics were observed. These effects might be mediated by intratectal mechanisms, the nucleus ventrolateralis thalami, or the bed nuclei of the tectothalamic tract and might constitute the neural basis of a bihemispheric gating function.
Book Chapters
Symposia and Published Abstracts
Williamson, K., Jakobson, L., Troje, N.F.
Gunnar Johansson (1973) was the first to demonstrate that human observers can perceive animate activity solely from information about the movements of dots attached to the joints of an otherwise invisible figure. From even brief exposure to these dynamic "point light" displays, viewers are able to extract surprisingly detailed information, including information about the actor's gender and mental state (e.g., Troje, 2002). Recently, Troje and Westhoff (2005) have suggested that several independent processes are involved in biological motion perception, the most basic of which is a simple form of "life detection" that is automatically triggered by low-level, local motion cues. In support of this, they found that, even when configural information was disrupted through spatial scrambling of the dots comprising a point light walker, participants were still able to judge the direction the walker was facing quite accurately, provided the moving dots were presented in their normal (upright) orientation. Stimuli in this study were presented in central vision and viewing times were unlimited. In the present study, we replicated this basic result using very brief exposure durations (200 and 170 ms). We also went on to show that viewers could achieve above-chance direction discrimination performance with upright, scrambled displays when the targets were presented in peripheral vision. Performance in the periphery, however, varied as a function of the side of presentation and the direction the walker was facing. Specifically, while participants were better at processing right-facing (compared to left-facing) walkers in the right visual field, they were equally accurate at processing right- and left-facing walkers in their left visual field. This interaction was seen both with configural and scrambled displays. This interesting result is discussed with reference to recent studies examining hemispheric differences in spatial attention and body representation, and in the operation of the "mirror neuron" system.
Ware, E. L. R., Troje, N. F.
Joint action is behaviour that requires coordination between animals to achieve a desired goal. We investigated the perception of social feedback in mutual courtship of the pigeon, Columba livia, in a double closed-loop teleconferencing setup. Pigeons could interact in real time with the life-sized video image of the other bird. We manipulated social feedback in two ways: (1) by altering temporal contiguity, introducing delays of 1 s, 3 s, and 10 s, and (2) by altering social contingency, playing back a video of the subject's partner from a previous interaction. Courtship intensity decreased in all three temporal delay conditions. Courtship intensity did not differ between a zero-delay condition and the non-contingent playback condition. We conclude that pigeon courtship is sensitive to temporal contiguity in social feedback, implying visual coordination in joint action. Pigeons did not show social contingency perception, and may lack a representation of social causality.
Troje, N. F., Chang, D. H. F.
Visual perception of biological motion is a complex process that involves several independent mechanisms. Particularly, two such mechanisms have to be distinguished. One responds to the local motion of the feet of a moving animal and signals both the presence and the facing direction of the animal. The second integrates the global configuration of a set of moving dots into the coherent, articulated shape of a human or animal body. We hypothesize that the first one is evolutionary old, not specific to human motion, and not sensitive to learning, while the second requires individual learning and is therefore specific to human motion. Here, we conducted two experiments. The first one required an observer to derive the direction in which a stationary walker was facing. The walker depicted either a human walker, a walking pigeon or a walking cat masked by a varying number of stationary flickering dots. Walkers were shown either spatially intact or scrambled. Five blocks of 60 trials each were run to probe for learning effects. The second experiment was a 2AFC detection experiment. In each trial, two displays were shown. One contained only a mask of scrambled walkers while the other one also contained a coherent walker. Walkers depicted a human, a pigeon, or a cat. Again, five blocks with 60 trials each were run to test for learning effects. Results confirmed our hypotheses: For the first task which focused on the local mechanism, we found effects of the number of masking dots, and an effect of scrambling, but neither an effect of the nature of the walker, nor an effect of learning. In contrast, for the second task (requiring global shape-from-motion processing) we found much better performance for the human walker as compared to the non-human walkers, and a strong effect of learning.
Thompson, B., Hansen, B.C., Hess, R.F., Troje, N.F.
Background. It is currently believed that the cortical deficit associated with amblyopic vision extends beyond striate cortex into extrastriate areas. Biological motion perception has been localized to a specific extrastriate cortical region (STS) which receives input from both dorsal and ventral visual processing streams. We used a variety of biological motion perception tasks to assess the function of this extrastriate region in amblyopia. Methods. Amblyopic observers viewed biological motion stimuli with either their amblyopic or fellow fixing eye. A range of tasks were used to better characterize the ability of amblyopic eyes to perceive biological motion. Detection of a point light walker was measured using both scrambled walker masks and linear motion masks to modulate task difficulty. Walking direction discrimination was also measured using both scrambled walkers, which provided only motion information, and unscrambled walkers. These stimuli were embedded in linear dot masks of various densities. Results. Amblyopic eyes showed a deficit in biological motion detection. Amblyopic eyes did not however show a similar deficit for walking direction discrimination and could perform this task with both unscrambled and scrambled walkers. Conclusion. Amblyopic eyes are impaired at segregating a point light walker from a noise mask. However the ability to extract information from the biological motion of the walker dots showed little impairment.
Saunders, D. R., Suchan, J., Troje, N. F.
In 1978, James Cutting published an algorithm to generate point-light displays that resemble the movements of the joints of a human walker. The method has since been used frequently to create stimuli for research on biological motion perception. More recently, Troje and Westhoff (2006) found that the pattern of local movement of the feet is used to derive the direction in which a point-light walker is facing, even when structural information is removed. The finding of previous studies that direction cannot be determined from a scrambled version of Cutting's walker may be explained by the significantly different foot motion of Cutting's walker compared with motion-captured humans. To compare the two stimuli, 14 participants performed a detection task and a direction task. Walkers consisted of 11 points presenting a sagittal view. In the detection task, walkers were embedded in a scrambled walker mask consisting of 50, 100, or 200 dots. Participants had to decide which of two successive intervals contained the walker. In the direction task, participants judged whether the walking figure was oriented towards the left or the right. The mask consisted of randomly appearing stationary dots (50, 200, or 750) with limited lifetime. Half of the walkers were spatially coherent and half of them were scrambled. Observers performed equally well for the two walkers in the detection task. However, in the direction task, the error rate for Cutting's walker was significantly higher than for the motion-captured walker. Most of the difference came from the scrambled walker condition, where the error rate increased from 39% to 48%. We conclude that Cutting's walker lacks critical features that signal direction in real walking motion, and suggest that studies which have presented the local motion of the Cutting walker as a stimulus need to be revisited.
Piotrowski, A., Jakobson, L., Troje, N.F.
Biological motion perception refers to the ability to perceive and interpret the movements of animate objects, in the absence of form cues (Johansson, 1973). To date, only one study has examined this ability in elderly observers (Norman et al., 2004). These authors reported that, particularly at longer exposure durations (400 ms), biological motion perception was well-preserved in older adults. Norman et al. assessed participants' ability to recognize specific activities that, in some instances, were partially occluded. In Experiment 1 of the present study, a walker was displayed (for 200 ms) in a frontal view on half of the trials; on the remaining trials a spatially scrambled walker was presented. Stimuli were shown either in isolation, or in a scrambled walker mask comprised of 25 to 150 dots. The task on each trial was to decide whether a coherent walker was present. In Experiment 2, a walker appeared on each trial, in a profile view, and participants indicated whether it was facing left or right. The stimulus was presented alone or in a field of masking dots, as described above. Healthy elderly and young adults performed essentially at ceiling levels on both tasks when no mask was added. The presence of masking dots, however, had a much more deleterious effect on the performance of elderly participants than on that of young adults. Indeed, elderly participants' performance fell to chance levels as more masking dots were added. While the two groups differed in terms of education, and scores on both the Token Test and Digit Symbol, these differences could not account for the impairment seen in elderly participants on the biological motion perception tasks. We conclude that healthy elderly show a marked impairment in their ability to perceive biological motion in the presence of visual noise, at short exposure durations.
Konrad, C., Häberlen, M., v. Gontard, A., Reith, W., Troje, N. F., Krick, C., Freitag, C.
Introduction: Impairments in the perception of biological motion have repeatedly been described in people with autism spectrum disorders. As part of a larger study, differences in brain activation in temporo-parietal and frontal regions were found in adolescents with autism spectrum disorders during the observation of complex biological motion. The aim of this morphometric study was to investigate possible anatomical foundations of the disturbed processing of observed complex movements. Method: Thirteen male and two female adolescents with autism spectrum disorders according to DSM-IV and an equal number of age-, sex-, and IQ-matched control participants (mean age 18 years) were examined in a 1.5 Tesla MRI scanner (Siemens Sonata, Erlangen, Germany). In addition to functional measurements (see Freitag et al., DGPPN 2007), anatomical images were acquired as sagittal T1-weighted MPRAGE sequences with isotropic voxels of 1 mm edge length. Grey matter volume was analysed using optimized voxel-based morphometry (VBM). Results/Discussion: Adolescents with autism spectrum disorders showed a significant reduction of grey matter volume in the region of the right intraparietal sulcus, which was in close spatial relationship to functional activations during the observation of biological motion. A significant reduction of grey matter volume was also found bilaterally in the middle frontal gyrus. Early structural parietal changes could be one cause of the impaired perception of complex biological motion in adolescents with autism spectrum disorders.
König, A., Schölmerich, A., Troje, N. F.
Over the course of childhood, differences between boys and girls in body build and movement dynamics become visible. The development of body proportions and gait dynamics was analysed in a cross-sectional sample of 27 girls and 27 boys aged 4 to 16 years. To synthesize digital point-light models, the participants' gait patterns were recorded in three-dimensional space. Linear discriminant functions of the motion information allow individual gait patterns to be classified by age and sex with above-chance accuracy, with precision increasing with age. If the discriminant function is computed as an age-specific function within three age groups, correct classification based on structural anatomical information emerges in addition to the dynamic identification. Correlations between age-specific body proportions and dynamic aspects appear in different developmental phases for girls and boys. These results shed new light on previous findings from anthropometric research and on the development of sex-specific movement patterns.
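For readers unfamiliar with the discriminant-function approach described above, the following Python sketch (using scikit-learn) shows how a cross-validated linear discriminant classification of sex from gait features could look in principle. The feature matrix here is a random placeholder rather than the study's data, and the choice of descriptors and folds is purely illustrative.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical feature matrix: one row per child, columns are gait descriptors
# (e.g. Fourier amplitudes and phases of the marker trajectories).
rng = np.random.default_rng(2)
features = rng.normal(size=(54, 20))            # 27 girls + 27 boys
sex = np.array([0] * 27 + [1] * 27)             # 0 = girl, 1 = boy

# Cross-validated accuracy of a linear discriminant function, analogous to
# the above-chance classification by sex reported in the abstract.
clf = LinearDiscriminantAnalysis()
accuracy = cross_val_score(clf, features, sex, cv=6).mean()
print(f"cross-validated accuracy: {accuracy:.2f}")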
Hohmann, T., Munzert, J., Troje, N. F.
Experts detect important action-directing cues in the environment faster and more accurately (Williams & Ward, 2003). The presence or absence of relations between biological motion and a discrete environmental object influences movement perception (Shipley & Cohen, 2000). The aim of the present study was to examine the difference between experts and novices when anticipating different kinds of basketball dribble while looking at PLDs under different conditions. Sixteen experts and 18 novices observed speed, spin, cross-over, behind-the-back, and between-the-legs dribbling in randomized order for a total of 60 trials per block (5 dribbles x 3 models x 2 examples of each kind of dribbling x 2). There were three blocks with different conditions (player only, player with ball, player with the sound of the ball). Participants had to press a button as soon as they thought they knew which kind of dribble would follow. The dependent measure was a combination of the reaction time and the number of correct/wrong responses. Results of an ANOVA revealed a significant difference between experts and novices in the anticipation of dribbling movements (F = 5.576, p < .001). Recognition performance also differed significantly between the five types of dribble (F = 62.69, p < .001). Spin dribbling was recognized faster and better than all other types. Experts were especially better than novices at recognizing cross-over and behind-the-back dribbling. Performance did not differ significantly between the three conditions in either group (F = .911, p = .386). This was unexpected, because it had been assumed that the presentation of the ball would particularly help observers to anticipate the following movement faster. It is concluded that the perception of biological motion is guided predominantly by the movement itself and not by other environmental cues, independent of expertise.
Hess, R. F., Thompson, B., Hansen, B., Troje, N. F.
Biological motion perception, having both evolutionary and social importance, is performed by the human visual system with a high degree of sensitivity. It is unclear whether peripheral vision has access to the specialized neural systems underlying biological motion perception; however, given the motion component, one would expect peripheral vision to be, if not specialized, at least highly accurate in perceiving biological motion. Here we show that the periphery can indeed perceive biological motion. However, the periphery suffers from an inability to detect biological motion signals when they are embedded in dynamic visual noise. We suggest that this peripheral deficit is not due to biological motion perception per se, but to signal/noise segregation.
Halevina, A., Troje, N.F.
Biological motion point-light walkers convey information about the sex of a walker. As has been shown earlier, retrieving this information depends on the viewpoint: Frontal views are easier to classify than profile views. However, what happens if a walker is shown from a varying viewpoint, as is the case when we see a walker walking on a circle? Multiple viewpoints should facilitate activation of a three-dimensional representation which might help classification. On the other hand, the additional rotation might mask intrinsic (that is, relative) motion diagnostic for the sex of the walker and therefore hinder classification. In the current study, observers had to indicate the perceived sex of point-light displays of individual walkers shown either in frontal view (0 deg), half-profile view (30 deg), profile view (90 deg), or in a condition in which the viewpoint rotated from -50 to 50 deg over the display time of 2 sec. In addition, we manipulated the information provided. Walkers contained either only structural information, only kinematic information, or all information. The results replicated earlier findings showing that performance at frontal and half-profile view is much better than at profile view and that kinematic information is required for sex classification whereas structural information has very little diagnostic value. In addition, we could show that rotating views of a walker clearly result in worse classification than frontal or half-profile views, but are classified much better than profile-view walkers. We conclude that three-dimensional representations do not facilitate sex classification from biological motion. Diagnostic information about sex is primarily contained in the kinematics within the fronto-parallel plane, and the motion due to rotation of the walker aggravates retrieval of this information.
Freitag, C., Konrad, C., Häberlen, M., v. Gontard, A., Reith, W., Troje, N. F., Krick, C.
Introduction: A dysfunction of the dorsal visual processing stream has been reported in several studies of individuals with autism spectrum disorders (ASD). Since motion perception is processed, among other pathways, in the dorsal visual stream, and one study has described impaired motion perception in children with autism, the present investigation examined neural activation during the observation of simple and complex biological motion in individuals with ASD and matched controls. Method: The BOLD response during the observation of two stimuli, a simple and a complex biological motion, was compared between 13 male and 2 female adolescents with ASD and 13 male and 2 female age- and IQ-matched control participants (mean age 18 years, range 14-28). Results: Activation elicited by the simple motion stimulus did not differ between the groups. For the complex biological motion, clear differences emerged in temporo-parietal and frontal regions. Discussion: The results point to a fundamental impairment in the perception of complex biological motion, which appears to be accompanied by reduced attention-related activation.
Chang, D.H.F., Troje, N.F.
Directional information can be extracted from scrambled point-light displays that are devoid of all structural cues, prompting the suggestion of a distinct local mechanism in biological motion perception that may serve as a general "life detector" (Troje & Westhoff, 2006). We investigated this hypothesis by testing the perception of both animacy and direction from point-light stimuli. Coherent and scrambled point-light displays of humans, cats, and pigeons that were upright or inverted were embedded in a random dot mask and presented in sagittal view to two groups of naïve observers (n = 12/grp). The first group assessed the animacy of the walker on a six-point Likert scale and the second group discriminated the direction of walking. Across blocks, stimulus duration varied from 200 to 1000 ms. Coherent stimuli appeared more animate than scrambled stimuli (p < 0.001) and inversion decreased animacy ratings (p < 0.001), although more substantially for coherent than for scrambled walkers (p = 0.007). Similarly, discrimination accuracies were higher for coherent versus scrambled stimuli (p < 0.001) and inversion decreased performance (p < 0.001), but more substantially for coherent than for scrambled walkers (p = 0.004). Neither animacy ratings nor discrimination accuracies differed with animal type (ps > 0.200) or stimulus duration (ps > 0.300). The results indicate that, like the ability to discriminate direction, the perception of animacy from scrambled displays is orientation-specific. We suggest that the responsible mechanism uses a dynamic, gravity-dependent framework to assess the presence of life in the environment and is remarkably robust, operating efficiently at limited exposure times.
Chang, D.H.F., Troje, N.F.
Directional information can be extracted from upright scrambled point-light displays that are devoid of all structural cues, prompting the suggestion of a distinct local mechanism in biological motion perception that may serve as a "life detector" (Troje & Westhoff, 2006). Whether the proposed mechanism conveys information beyond direction is unknown. We present three experiments that investigated the perception of both animacy and direction from point-light stimuli. Coherent and scrambled point-light displays of humans, cats, and pigeons that were upright or inverted were embedded in a random dot mask and presented in sagittal view to three groups of naïve observers. Observers assessed the animacy of the walker on a six-point Likert scale in Experiment 1, discriminated the direction of walking in Experiment 2, and completed both the animacy rating and direction discrimination tasks in Experiment 3. Stimulus duration varied from 200 to 1000 ms across blocks in Experiments 1 and 2, and was fixed at 500 ms in Experiment 3. Coherent stimuli appeared more animate than scrambled stimuli and inversion decreased animacy ratings, although more substantially for coherent than for scrambled walkers. Similarly, discrimination accuracies were higher for coherent versus scrambled stimuli and inversion decreased performance, but more substantially for coherent than for scrambled walkers. Neither animacy ratings nor discrimination accuracies differed with animal type or stimulus duration. Experiment 3 further showed a linear correlation between animacy ratings and discrimination accuracies. The results indicate that, like the ability to discriminate direction, the perception of animacy from scrambled displays is orientation-specific. The linear relationship between the animacy and direction data suggests that they address a similar mechanism within our context. We propose that the responsible mechanism uses a dynamic, gravity-dependent framework to interpret terrestrial articulated locomotion and is remarkably robust, operating efficiently at limited exposure times.
2006
Papers
Watanabe, S., Troje, N. F.
The purpose of the present study is to examine the applicability of a computer-generated, virtual animal to study animal cognition. Pigeons were trained to discriminate between movies of a real pigeon and a rat. Then, they were tested with movies of the computer-generated (CG) pigeon. Subjects showed generalization to the CG pigeon, however, they also responded to modified versions in which the CG pigeon was showing impossible movement, namely hopping and walking without its head bobbing. Hence, the pigeons did not attend to these particular details of the display. When they were trained to discriminate between the normal and the modified version of the CG pigeon, they were able to learn the discrimination. The results of an additional partial occlusion test suggest that the subjects used head movement as a cue for the usual vs. unusual CG pigeon discrimination.
Vocks, S., Legenbauer, T., Troje, N. F., Schulte, D.
Abstract. Theoretical background: In eating disorders, a negative body image can manifest itself in an overestimation of one's own body dimensions (perceptual component), negative thoughts and feelings about one's own body (cognitive-affective component), and body-related avoidance and checking behaviour (behavioural component). Objective: We examined whether these three components of a disturbed body image can be improved by a cognitive-behavioural body image therapy programme. Method: 24 patients with eating disorders were assessed before and after a ten-session body image therapy and again after a three-month follow-up period. Results: While the perceptual component of body image was not affected by the body image therapy, clear improvements were found on the cognitive-affective and behavioural level. The patients' eating disorder symptoms and general distress were also reduced. These effects remained stable over the follow-up period. Conclusions: The findings provide evidence for the effectiveness of the body image therapy.
Troje, N. F., Westhoff, C.
If biological-motion point-light displays are presented upside down, adequate perception is strongly impaired. Reminiscent of the inversion effect in face recognition, it has been suggested that the inversion effect in biological motion is due to impaired configural processing in a highly trained expert system. Here, we present data that are incompatible with this view. We show that observers can readily retrieve information about direction from scrambled point-light displays of humans and animals. Even though all configural information is entirely disrupted, perception of these displays is still subject to a significant inversion effect. Inverting only parts of the display reveals that the information about direction, as well as the associated inversion effect, is entirely carried by the local motion of the feet. We interpret our findings in terms of a visual filter that is tuned to the characteristic motion of the limbs of an animal in locomotion and hypothesize that this mechanism serves as a general detection system for the presence of articulated terrestrial animals.
Troje, N. F., Geyer, H., Sadr, J., Nakayama, K.
Human visual perception is highly adaptive. While this has been known and studied for a long time in domains such as color vision, motion perception, or the processing of spatial frequency, a number of more recent studies have shown that adaptation and adaptation aftereffects also occur in high-level visual domains like shape perception and face recognition. Here, we present data that demonstrate a pronounced aftereffect in response to adaptation to the perceived gender of biological motion point-light walkers. A walker that is perceived to be ambiguous in gender under neutral adaptation appears to be male after adaptation with an exaggerated female walker and female after adaptation with an exaggerated male walker. We discuss this adaptation aftereffect as a tool to characterize and probe the mechanisms underlying biological motion perception.
Rotman, G., Troje, N. F., Johansson, R. S., Flanagan, J. R.
We previously showed that, when observers watch an actor performing a predictable block-stacking task, the coordination between the observer's gaze and the actor's hand is similar to the coordination between the actor's gaze and hand. Both the observer and the actor direct gaze to forthcoming grasp and block landing sites and shift their gaze to the next grasp or landing site at around the time the hand contacts the block or the block contacts the landing site. Here we compare observers' gaze behavior in a block manipulation task when the observers did and when they did not know, in advance, which of two blocks the actor would pick up first. In both cases, observers managed to fixate the target ahead of the actor's hand and showed proactive gaze behavior. However, these target fixations occurred later, relative to the actor's movement, when observers did not know the target block in advance. In perceptual tests, in which observers watched animations of the actor reaching partway to the target and had to guess which block was the target, we found that the time at which observers were able to correctly do so was very similar to the time at which they would make saccades to the target block. Overall, our results indicate that observers use gaze in a fashion that is appropriate for hand movement planning and control. This in turn suggests that they implement representations of the manual actions required in the task and representations that direct task-specific eye movements.
Loidolt, M., Aust, U., Steurer, M., Troje, N. F., Huber, L.
A go/no-go procedure was used to train pigeons to discriminate pictures of human faces differing only in shape, with either static images or movies of human faces dynamically rotating in depth. On the basis of experimental findings in humans and some earlier studies on three-dimensional object perception in pigeons, we expected dynamic stimulus presentation to support the pigeon's perception of the complex morphology of a human face. However, the performance of the subjects presented with movies was either worse than (AVI format movies) or did not differ from (uncompressed dynamic presentation) that of the subjects trained with a single or with multiple static images of the faces. Furthermore, generalization tests to other presentation conditions and to novel static views revealed no promoting effect of dynamic training. Except for the subjects trained on multiple static views, performance dropped to chance level with views outside the training range. These results are in contrast to some prior reports from the literature, since they suggest that pigeons, unlike humans, have difficulty using the additional structural information provided by the dynamic presentation and integrating the multiple views into a three-dimensional object.
Kersten, M., Steward, J., Ellis, R., Troje, N. F.
We present empirical studies that consider the effects of stereopsis and simulated aerial perspective on depth perception in translucent volumes. We consider a purely absorptive lighting model, in which light is not scattered or reflected, but is simply absorbed as it passes through the volume. A purely absorptive lighting model is used, for example, when rendering digitally reconstructed radiographs (DRRs), which are synthetic X-ray images reconstructed from CT volumes. Surgeons make use of DRRs in planning and performing operations, so an improvement of depth perception in DRRs may help diagnosis and surgical planning.
Jokisch, D., Daum, I., Troje, N. F.
We investigated the influence of viewing angle on performance in recognising the identity of one's own person and familiar individuals such as friends or colleagues from walking patterns. Viewpoint-dependent recognition performance was tested in two groups of twelve persons who knew each other very well. Participants' motion data were acquired by recording their walking patterns in three-dimensional space with the use of a motion capture system. Size-normalised point-light displays of biological motion of these walking patterns, including one's own, were presented to the same group members on a computer screen in frontal view, half-profile view, and profile view. Observers were requested to assign the person's name to the individual gait pattern. No feedback was given. Whereas recognition performance of one's own walking patterns was viewpoint independent, recognition rate for other familiar individuals was better for frontal and half-profile view than for profile view. These findings are discussed in the context of the theory of common coding of motor and visual body representations.
Book Chapters
Other Contributions
Symposia and Published Abstracts
Troje, N. F., Szabo, S.
The arithmetic mean of the same number of male and female biological motion point-light walkers represented in a morphable, linear walking space is perceived to be male. The perceptually neutral walker corresponds to a point in the female part of the space. It is not clear, though, if this “male bias” is a genuine phenomenon or an artifact of the specific walker space and its underlying metric. Here, we present a number of experiments in which observers reported the perceived sex of a series of walkers while we varied the range and the distribution from which the walkers were sampled. We observe a pronounced range effect: If we sample from a distribution which is shifted towards the female part such that it is now centered around the walker that appeared to be sexually neutral before, observers adapt to the range and still perceive more walkers to be male than female. On the other hand, if we sample from a larger range with the same mean the observed “male bias” changes only marginally. We conclude that the male bias is not an artifact of the motion space used but a genuine phenomenon. We discuss different possible causes and particularly the question of whether the “male bias” is stimulus-specific or rather a more general phenomenon.
Troje, N. F.
Biological motion perception has long been treated as one single phenomenon. Science often moves forward by trying to simplify complex ideas. However, I will suggest that in the case of biological motion perception the “unitary view” has hindered research and slowed down progress. In my talk I will identify a number of dissociable processing levels that need to be distinguished carefully to improve our understanding of the complex phenomenon of biological motion perception.
Troje, N. F.
According to a classic theory by Morton and Johnson [1], humans and chicks (and probably other animals as well) share an important principle in the design of the developing face and conspecific recognition system. In humans, an innate mechanism (CONSPEC) based on a very coarse template of a face is responsible for guiding attention to face-like stimuli, ensuring that a second mechanism (CONLERN) receives ample input to learn about the subtle cues that carry information about the identity of another person. In newly hatched chicks, the two mechanisms have similar roles and control filial imprinting on their mothers.
Based on experiments investigating the inversion effect in biological motion [2], I will present evidence for the dissociation of two mechanisms for biological motion perception. One of them attracts attention to a particular signature in biological motion that is largely independent of the particular nature of the animal generating it. This mechanism works well in the visual periphery and functions as a general "life detector". I will argue that it is evolutionarily old and that humans share it with other animals (including chicks [3]). The other mechanism is based on learning. It relies on the first one, which makes sure that sufficient visual input is provided for this learning process. Evolutionarily, it is a more recent acquisition and its particular implementation might strongly vary between species.
[1] Morton, J., and Johnson, M.H. (1991). CONSPEC and CONLERN: a two-process theory of infant face recognition. Psychological Review 98, 164-181.
[2] Troje, N.F., and Westhoff, C. (2006). The inversion effect in biological motion perception: evidence for a "life detector"? Curr Biol 16, 821-824.
[3] Vallortigara, G., and Regolin, L. (2006). Gravity bias in the interpretation of biological motion by inexperienced chicks. Curr Biol 16, R279-280.
Saunders, D. R., Troje, N. F.
Animal courtship in many species is not only an evaluation process, but also a two-way communicative interaction. Although pigeon courtship has been extensively described, it is not known whether male and female pigeons coordinate their body movements as a form of communication. We used simultaneous motion capture recordings of both partners during courtship to examine correlations between kinematic measurements reflecting behaviors at several different time scales. The behaviors included head bobbing, movement speed, body orientation, and direction of turning. Each was modeled using semi-Markov processes previously used for modeling parallel streams of human nonverbal behaviors during conversation. The analysis results in a quantitative ethogram of the choreography of pigeon courtship and elucidates the different levels of coordination, as well as the role of leading, following, and behavioral synchronization between the two partners.
Sadr, J., Troje, N.F., Nakayama, K.
People are more than faces, and much of our perception of others derives from visual appraisal of bodies and their movement -- rich sources of information as to gender, identity, etc. We find that, even ignoring overt courtship displays (eg, dancing), the mere act of walking, a ubiquitous human activity, provides observers a compelling percept of attractiveness. Previously, we demonstrated the influence of sexual dimorphism and prototypicality on attractiveness of human gait; here we extend this to examine the role of symmetry. To do so, we obtain attractiveness ratings for motion-captured women, displayed as point-light walkers, and for their perfectly symmetric counterparts. Our results show that making symmetric an individual's body and movement can indeed increase attractiveness, although this benefit might not be seen for less attractive individuals. Moreover, a key feature of our approach (Troje, 2002) is the ability to independently manipulate the symmetry of either the body or its movement and thus investigate the contribution of each to attractiveness. Whereas previously examined anatomical asymmetries may be quite small and difficult to measure and to perceive visually, we propose that asymmetries in movement may be more readily observed and salient. Our results thus far indicate that, at least for more attractive individuals, symmetry of movement has a greater bearing on attractiveness than does anatomic symmetry. In conclusion, we suggest that explicitly and independently manipulating anatomic and kinematic symmetry (and sexual dimorphism, prototypicality, etc) of motion-captured individuals provides an important complement to existing correlational and video-based methods in the study of person perception.
Puca, R.M., Rinkenauer, G., Troje, N.F.
Point-light walkers are ambiguous stimuli when no explicit depth cues are provided. Recently it has been shown that frontal views of such walkers are more often interpreted as facing towards the viewer than facing away from the viewer (Vanrie, Dekeyser & Verfaillie, 2003). As human walkers are social stimuli, we investigated the influence of social motives on participants' perception. Participants were shown moving point-light figures from 80 different male and female walkers and they had to decide for each walker whether he or she seemed to move towards or away from them. Again moving-towards interpretations were more frequent than moving-away interpretations. This result, however, was qualified by participants' dispositional need for affiliation and their gender. Among women the preference for the towards-interpretation could only be shown when their need for affiliation was high but not when it was low. In contrast, men's need for affiliation had no impact on their perception. High affiliation women decided, however, more often than high affiliation men that the walkers moved towards them. The influence of motivation on perception is discussed.
Michalak, J., Troje, N.F., Schulte, D., Heidenreich, T.
Human gait patterns convey a great deal of information about the walking person. For example, it is possible to derive information about gender or mood merely from dynamic gait features. Mindfulness-based approaches stress the importance of getting in contact with the here-and-now experience of the body. Moreover, recent theories of emotion highlight the role of proprioceptive-bodily information in the generation of emotional states. Therefore, the investigation of bodily processes might be a relevant target for the study of mindfulness-based approaches. In our current research we analyzed the gait patterns of 30 patients participating in mindfulness-based cognitive therapy (MBCT), 18 acutely depressed inpatients, and 30 never-depressed participants using Fourier-based descriptions and the computation of linear classifiers. Two research questions guided our work: (1) Do the dynamic gait patterns of acutely and formerly depressed patients differ from those of never-depressed people? (2) Does MBCT normalize the gait patterns of formerly depressed patients?
König, A., Schölmerich, A., Troje, N.F.
Human gait contains a huge amount of socially relevant information and even highly reduced visual stimuli such as point-light walkers (PLWs) are sufficient to transmit information about a walking person's gender and age. The aim of our study is to develop a new perception-based method for the detection of paedophilic interests using PLWs as stimulus material. The advantage of PLWs is that they are ambiguous and do not show any explicit sexual content.
Seven prototypical female and 7 male PLWs covering an age range from 4 to 30 years were presented to a group of child molesters (n = 21) and a control group (n = 30). The experiment consisted of a gender-rating, an attractiveness-rating, and an age-rating block. In each block all PLWs were presented twice for 4 s in a random order. Discriminant analysis is able to classify 82.4% of all cases correctly as child molesters or control participants, using individual attractiveness and gender ratings as predictive variables (χ²(6, N = 51) = 28.55, p = .00007). Currently we are collecting additional data to investigate interactions of "victim age" and "victim gender" with visual ratings of child molesters.
Bockemühl, T., Troje, N.F., Dürr, V.
A central question in motor control is how the CNS deals with redundant degrees of freedom inherent in musculoskeletal systems. The human arm is a prime example of such a system. It shows a vast variety of behavior and plays a key role in most aspects of our life. However, because of its biomechanical complexity, the arm poses a formidable control problem to our CNS. In this study we analyzed movements performed by human subjects, who were asked to catch a ball launched towards them on 16 different trajectories. Subjects had to initiate movements from 2 different starting positions. Motor activity of the right arm was recorded using optical motion capture and was transformed into 10 joint angle time courses by a model-based optimization algorithm. The resulting time series of arm postures were analyzed by principal components analysis (PCA). We generally found that more than 90% of movement variance was captured mostly by as few as 2 principal components (PCs). Furthermore, subspaces spanned by PC sets associated with different catching positions varied smoothly across the arm's workspace. When we pooled complete sets of movements, 3 PCs were still sufficient to explain 80% of the data's variance. This indicates strong kinematic couplings between the joints of the arm. We hypothesize that flexible and context-dependent behavior like multijoint human arm movement does not necessarily require complex neural algorithms. Instead, we show that catching movements towards diverse targets can be generated efficiently by linear combinations of a small set of cardinal movement synergies.
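The variance-explained analysis described above follows a standard PCA recipe. The Python sketch below uses a placeholder joint-angle recording (random numbers, not the study's data) to show how the fraction of movement variance captured by the first few principal components can be computed.

import numpy as np

# Hypothetical recording: 10 joint angles sampled at 500 time points
# during a single catching movement.
rng = np.random.default_rng(3)
joint_angles = rng.normal(size=(500, 10))

# Centre the data and compute the principal components of the postures.
centred = joint_angles - joint_angles.mean(axis=0)
_, singular_values, _ = np.linalg.svd(centred, full_matrices=False)

# Fraction of movement variance captured by the first k components.
variance = singular_values ** 2
explained = np.cumsum(variance) / variance.sum()
for k in (1, 2, 3):
    print(f"first {k} PCs explain {explained[k - 1]:.1%} of the variance")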
2005
Papers
Zhang, Z., Troje, N. F.
Based on a three-dimensional (3D) linear model and the Bayesian rule, a method is explored to identify human walkers from two-dimensional (2D) motion sequences taken from different viewpoints. Principal component analysis constructs the 3D linear model from a set of Fourier-represented examples. The set of coefficients derived from projecting a 2D motion sequence onto the 3D model by means of a maximum a posteriori estimate is used as a signature of a walker. Simulating an identification experiment on a set of walking data, we show that these signatures are invariant across viewpoints and can be used for viewpoint-independent person identification.
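As a toy illustration of the final identification step, assuming the model fitting has already produced one coefficient vector ("signature") per walker, the following Python sketch assigns a probe signature to the nearest gallery entry. All numbers, dimensions, and the nearest-neighbour rule are placeholder assumptions for illustration, not the paper's actual procedure.

import numpy as np

# Hypothetical gallery of walker signatures: one row of fitted model
# coefficients per known walker.
rng = np.random.default_rng(4)
gallery = rng.normal(size=(20, 10))      # 20 known walkers, 10 coefficients each

# Signature recovered from a 2D probe sequence seen from a new viewpoint
# (here simply a noisy copy of walker 7's gallery signature).
probe = gallery[7] + 0.1 * rng.normal(size=10)

# Identify the walker as the gallery entry with the closest signature.
distances = np.linalg.norm(gallery - probe, axis=1)
print("identified as walker", int(np.argmin(distances)))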
Watson, T. L., Johnston, A., Hill, H. C. H., Troje, N. F.
Natural face and head movements were mapped onto a computer-rendered three-dimensional average of 100 laser-scanned heads in order to isolate movement information from spatial cues and nonrigid movements from rigid head movements (Hill and Johnston, 2001). Experiment 1 investigated whether subjects could recognize, from a rotated view, facial motion that had previously been presented at a full-face view, using a delayed match-to-sample experimental paradigm. Experiment 2 compared recognition for views that were either between or outside initially presented views. Experiment 3 compared discrimination at full face, three-quarters, and profile after learning at each of these views. A significant face inversion effect in Experiments 1 and 2 indicated subjects were using face-based information rather than more general motion or temporal cues for optimal performance. In each experiment recognition performance only ever declined with a change in viewpoints between sample and test views when rigid motion was present. Nonrigid, face-based motion appears to be encoded in a viewpoint-invariant, object-centred manner, whereas rigid head movement is encoded in a more view-specific manner.
Troje, N. F., Westhoff, C., Lavrov, M.
Human observers are able to identify a person based on his or her gait. However, little is known about the underlying mechanisms and the kind of information used to accomplish such a task. In this study, participants learned to discriminate seven male walkers shown as point-light displays from frontal, half-profile, or profile view. The displays were gradually normalized with respect to size, shape, and walking frequency, and identification performance was measured. All observers quickly learned to discriminate the walkers, but there was an overall advantage in favor of the frontal view. No effect of size normalization was found, but performance deteriorated when shape or walking frequency was normalized. Presenting the walkers from novel viewpoints resulted in a further decrease in performance. However, even after applying all normalization steps and rotating the walker by 90º, recognition performance was still nearly three times higher than chance level.
Jokisch, D., Troje, N. F., Koch, B., Schwarz, M., Daum, I.
Perception of biological motion (BM) is a fundamental property of the human visual system. It is as yet unclear which role the cerebellum plays with respect to the perceptual analysis of BM represented as point-light displays. Imaging studies investigating BM perception revealed inconsistent results concerning cerebellar contribution. The present study aimed to explore the role of the cerebellum in the perception of BM by testing the performance of BM perception in patients suffering from circumscribed cerebellar lesions and comparing their performance with an age-matched control group. Perceptual performance was investigated in an experimental task testing the threshold to detect BM masked by scrambled motion and a control task testing the detection of motion direction of coherent motion masked by random noise. Results show clear evidence for a differential contribution of the cerebellum to the perceptual analysis of coherent motion compared with BM. Whereas the ability to detect BM masked by scrambled motion was unaffected in the patient group, their ability to discriminate the direction of coherent motion in random noise was substantially affected. We conclude that intact cerebellar function is not a prerequisite for a preserved ability to detect BM. Because the dorsal motion pathway as well as the ventral form pathway contribute to the visual perception of BM, the question of whether cerebellar dysfunction affecting the dorsal pathway is compensated for by the unaffected ventral pathway or whether perceptual analysis of BM is performed completely without cerebellar contribution remains to be determined.
Jokisch, D., Daum, I., Suchan, B., Troje, N. F.
In the present study, we investigated how different processing stages involved in the perceptual analysis of biological motion (BM) are reflected by modulations in event-related potentials (ERP) in order to elucidate the time course and location of neural processing of BM. Data analysis was carried out using conventional averaging techniques as well as source localization with low resolution brain electromagnetic tomography (LORETA). ERPs were recorded in response to point-light displays of a walking person, an inverted walking person and displays of scrambled motion. Analysis yielded a pronounced negativity with a peak at 180 ms after stimulus onset which was more pronounced for upright walkers than for inverted walkers and scrambled motion. A later negative component between 230 and 360 ms after stimulus onset had a larger amplitude for upright and inverted walkers as compared to scrambled walkers. In the later component, negativity was more pronounced in the right hemisphere revealing asymmetries in BM perception. LORETA analysis yielded evidence for sources specific to BM within the right fusiform gyrus and the right superior temporal gyrus for the second component, whereas sources for BM in the early component were located in areas associated with attentional aspects of visual processing. The early component might reflect the pop-out effect of a moving dot pattern representing the highly familiar form of a human figure, whereas the later component might be associated with the specific analysis of motion patterns providing biologically relevant information.
Hill, H. C. H., Troje, N. F., Johnston, A.
Is it possible to exaggerate the different ways in which people talk, just as we can caricature their faces? In this paper, we exaggerate animated facial movement to investigate how the emotional manner of speech is conveyed. Range-specific exaggerations selectively emphasized emotional manner whereas domain-specific exaggerations of differences in duration did not. Range-specific exaggeration relative to a time-locked average was more effective than absolute exaggeration of differences from the static, neutral face, despite smaller absolute differences in movement. Thus, exaggeration is most effective when the average used captures shared properties, allowing task-relevant differences to be selectively amplified. Playing the stimuli backwards showed that the effects of exaggeration were temporally reversible, although emotion-consistent ratings for stimuli played forwards were higher overall. Comparison with silent video showed that these stimuli also conveyed the intended emotional manner, that the relative rating of animations depends on the emotion, and that exaggerated animations were always rated at least as highly as video. Explanations in terms of key frame encoding and muscle-based models of facial movement are considered, as are possible methods for capturing timing-based cues.
Proceedings
Holman, D., Vertegaal, R., Troje, N.
In this paper, we present PaperWindows, a prototype windowing environment that simulates the use of digital paper displays. By projecting windows on physical paper, PaperWindows allows the capturing of physical affordances of paper in a digital world. The system uses paper as an input device by tracking its motion and shape with a Vicon Motion Capturing System. We discuss the design of a number of interaction techniques for manipulating information on paper displays.
Symposia and Published Abstracts
Westhoff, C., Troje, N.F.
The ability to identify other people from their movements is a familiar everyday observation. The first psychophysical experiments on person identification using so-called point-light displays (PLDs), in which a person is represented by only a small number of light points, were carried out as early as the late 1970s. With modern recording methods (motion capture) and the application of Fourier analysis, it is possible to decompose the gait patterns of individual people into structural and kinematic components while losing very little individual information. This technique was used in two experiments to examine the influence of various parameters on the identification of individual gait patterns. In Study 1, a group of participants learned to name PLDs of seven different male walkers. Over the course of the experiment, information about individual size, body structure, and walking frequency was removed from the PLDs and replaced by the means of the whole group (normalization). Body structure and walking frequency, but not body size, were shown to have a significant influence on recognition performance. Study 2 examined the role of kinematic factors. A first experiment showed that the first and second harmonics of the Fourier analysis contribute sufficiently to identification, whereas higher-order harmonics need not be taken into account. In a second experiment, the PLDs were normalized with respect to their amplitude and phase spectra. Both spectra contribute significantly to the recognition of individual gait patterns, but the amplitude spectrum showed a considerably stronger influence.
Vocks, S., Legenbauer, T., Kiszkenow, S., Troje, N.F., Schulte, D.
Body image disturbances contribute substantially to the development and maintenance of anorexia and bulimia nervosa. Nevertheless, interventions aimed at improving body image have mostly been neglected in eating disorder therapy. Body image has also often been assessed only unidimensionally; in fact, it should be regarded as a multidimensional construct composed of four components whose expression varies between individuals. It comprises a perceptual component (overestimation of one's own body dimensions), a cognitive component (negative evaluations of one's own body), an affective component (negative body-related feelings), and a behavioural component (body-related avoidance and checking behaviour). We therefore conducted a study evaluating a body image therapy that integrates intervention modules addressing all four body image components. Accordingly, therapy outcome is assessed multidimensionally.
Troje, N. F., Westhoff, C.
Spatially scrambled point-light displays of humans or animals in locomotion contain unambiguous information about the direction in which the agent is facing. Observers are well able to retrieve this information, but only if the displays are presented right-side up. Even though spatial integrity is not required for direction discrimination, the temporal relations between dots may still be important. Here, we report the results of an experiment in which we manipulated the temporal integrity of spatially scrambled point-light displays in two different ways. In the first condition, we applied random offsets to the phase of the single dots. Whereas this manipulation changes the “beat” of the pattern, which defines the particular gait of the agent, it leaves its general rhythmicity intact. In the second condition, we also changed the playback speed individually for each dot, which results in completely uncorrelated dot movements.
The results show a small but consistent effect of temporal integrity on the strength of the inversion effect: as the degree of temporal scrambling increases, the inversion effect decreases. Even though temporal integrity apparently plays a role, observers can still determine the direction of the upright, fully scrambled point-light agent with an accuracy that remains much higher than for the spatially and temporally intact, but inverted, walker.
The results can be modeled by means of a simple linear model and they are discussed in terms of a basic, yet reliable and form invariant visual filter designed for the general detection of animate motion in the visual environment.
Troje, N. F.
Animate motion patterns are a rich source of biologically significant information. They quickly and reliably signal the presence of an animate agent and convey its identity, actions and intentions. Observing other people in motion, our visual system is able to retrieve information about sex, age and weight of a person and can even detect signatures that identify a familiar individual. The relevant information is encoded on different levels in the motion patterns ranging from purely local motion information to complex global correlation patterns. In my talk, I will present empirical and computational data from two different studies. In the first, we investigated the nature of the general saliency of biological motion. Based on findings from experiments designed to explore the cause of the inversion effect in biological motion, we propose a simple, yet reliable and fast sensory filter that can detect the presence of a living animal in the visual environment. The proposed mechanism is based on local motion trajectories and does not require any form processing. On the other hand, I will present a computational framework that explores the complex correlation pattern of a moving body to retrieve information about a person's sex and other attributes that define his or her identity. It is based on a morphable representation of human walking data which in turn makes it possible to formulate linear classifiers for the attributes of interest. Results on sex classification obtained from our model are being compared to behavioural data and the relevance of our framework as a model for visual information processing in the human brain will be discussed.
Troje, N. F.
Biological motion has long been treated as one single phenomenon. Here, we want to suggest a dissociation between a mechanism responsible for a non-specific, shape independent detection of biological events, on the one hand, and a mechanism that retrieves shape from motion and recovers the articulation and general structure of a specific body, on the other hand. Evidence for this dissociation is presented in terms of experiments designed to explore the nature of the inversion effect in biological motion. As long as it is presented upright, scrambled biological motion still contains information about the facing direction of a walker. We isolate the invariants responsible for this inversion effect as being representative for the ballistic movement of limbs under conditions of gravity. The consequences of using scrambled motion to mask biological motion in detection experiments and the usage of scrambled motion as control stimulus in imaging studies are discussed.
Sadr, J., Troje, N.F., Nakayama, K.
While the study of facial attractiveness has explored a number of factors such as familiarity, symmetry, and sexual dimorphism, perhaps the most popular notion to emerge has been that the mean of a population is what is considered most attractive. In contrast to this concept of "averageness," however, exaggerations of sex differences have been shown to play a key role in attractiveness -- a finding now mirrored in the domain of biological motion (i.e., point-light walkers) where, in men's ratings of female walkers, attractiveness correlates very well with a gender axis (Troje, 2003). We should like to clarify that this is not due to merely approaching a hypothetical average female walker but more specifically to the relative display of sexually dimorphic characteristics, even to the detriment of averageness.
As with averages of faces, synthetic walkers made by averaging two or more individuals do generally appear to be attractive. This is certainly the case with the full population average, and, indeed, even averaging all walkers of below-average attractiveness can yield a walker that is above-average. However, even the maximally average walker is nevertheless less attractive than a number of real, individual walkers. Moreover, the most attractive individuals are not necessarily nearer to the average; direction of deviation from the mean may be more meaningful than distance, so that, e.g., far-from-average walkers may be very attractive if they exaggerate female characteristics.
Thus, the most average walker is not the most attractive, the most attractive walkers are not the most average, and walkers equidistant from the average may be very attractive or unattractive depending on their relative expression of sexually dimorphic traits. For biological motion, then, the perception of attractiveness (and perhaps of gender) might be guided not simply by prototypes anchored at averages of categories but by representations specifically attuned to salient variation between categories.
Sadr, J., Troje, N.F., Nakayama, K.
In a simple, sparse display of a few moving dots on a plain background, one's perception of a human actor can be remarkably compelling and rich; individual dots spontaneously cohere into a single object that relays extensive information about both actor and action. Here we show that subjects' high-level categories (male vs. female) and related judgments (ratings of femininity or attractiveness) suggest representations of vectors or axes exploiting differences between classes, not simply averages, norms, or prototypes defined within classes. Thus, for example, the most average female is not the one judged most feminine or most attractive.
Nathaniel, T., Güntürkün, O., Manns, M., Troje, N.F.
Head-bobbing in pigeons comprises a hold phase and a thrust phase. Visual feedback required for fine tuning the head movement probably relies on continuous retinal image flow. In two experiments, we investigated the temporal properties for acquiring such optic flow information by measuring locomotion behaviour under stroboscopic illumination. In Experiment 1, pigeons were trained to actively walk back and forth between two food hoppers. Locomotion behaviour was measured by relating travelled distance and subtended rotation angles to the number of steps performed. In Experiment 2, birds were restrained and moved passively on the belt of a treadmill while we measured the number of performed head-bobbing cycles. In both experiments, strobe frequencies varied in steps from 1.0 Hz to 100 Hz on an equidistant logarithmic scale. Locomotion behaviour was normal at strobe frequencies above 20 Hz. Between 8 Hz and 20 Hz, pigeons displayed a significant level of activity. However, all movements were executed on the spot, leading to rotation but not to translation. Within the same range of strobe frequencies, head-bobbing in passively moved birds was suppressed, but rotational saccadic head-movements were unimpaired. At frequencies below 8 Hz, activity ceased. We discuss the observed behaviour in the context of corollary discharge processing in the rotundus/triangularis complex of the pigeon's brain.
Loidolt, M., Troje, N.F., Huber, L.
Whereas pigeons learn to discriminate moving point-light displays of different animate objects (human versus pigeon, Exp. 1) as well as different movement categories (pecking versus walking, Exp. 2) rather easily, it is a much harder task for them to discriminate the movement direction (to the left versus to the right) of a stationary walking point-light pigeon (Exp. 3). Whereas in the first and second experiments discrimination can be explained by extraction of form or posture, a different process must have been at work in the third. Once the pigeons had acquired the discrimination in Exp. 3, they were presented (1) with versions of the training stimuli in which certain points were kept static (i.e., those forming the head, the torso, or the feet), so that only parts of the point-light pigeon were presented dynamically, and (2) with the original training stimuli presented upside down. For most pigeons, dynamic presentation of the feet was most informative for detection of movement direction. When the dynamic stimuli were presented upside down, discrimination broke down completely. The present findings are in keeping with recent results in human subjects showing that assumptions about the direction of gravity play a role in interpreting biological motion (Shipley, 2003; Troje, 2004).
Jokisch, D., Daum, I., Koch, B., Schwarz, M., Troje, N.F.
Perception of biological motion is a fundamental property of the human visual system. It is as yet unclear which role the cerebellum plays with respect to the perceptual analysis of biological motion represented as point-light displays. Imaging studies investigating biological motion perception revealed inconsistent results concerning cerebellar contribution. The present study aims to explore the role of the cerebellum in the perception of biological motion by testing the performance of biological motion perception in patients suffering from circumscribed cerebellar lesions and comparing their performance with an age-matched control group.
Perceptual performance was investigated in an experimental task testing the threshold to detect biological motion masked by scrambled motion and a control task testing detection of motion direction of coherent motion masked by random noise. Results show clear evidence for a differential contribution of the cerebellum to the perceptual analysis of coherent motion compared to biological motion. Whereas the ability to detect biological motion masked by scrambled motion was unaffected in the patient group, their ability to discriminate the direction of coherent motion in random noise was substantially affected. We conclude that intact cerebellar function is not a prerequisite for a preserved ability to detect biological motion. Since the dorsal motion pathway as well as the ventral form pathway contribute to the visual perception of biological motion, the question remains open whether cerebellar dysfunction affecting the dorsal pathway is compensated for by the unaffected ventral pathway or whether perceptual analysis of biological motion is performed completely without cerebellar contribution.
Bockemühl, T., Dürr, V., Troje, N.F.
Everyday situations demand high flexibility from the neural mechanisms underlying motor control. Movement of the human arm is one example of the control of a multi-jointed limb with redundant degrees of freedom. In this study, we analyzed the movement patterns performed by the right arm of 9 different subjects. The task was to catch an approaching ball traversing the workspace of the arm on 16 different trajectories. Movement started from one of two different initial postures. Subjects were instructed to catch the ball at a convenient point. Only successful catches were evaluated. Time courses of 9 joint angles of arm and shoulder were computed from Cartesian marker coordinates obtained from an optical motion capture system. We subsequently applied principal components analysis to averaged sets of angular time courses for each of the 32 combinations of start posture and catch position. Although our kinematic model comprises 9 degrees of freedom, we can show that 2 principal components (PCs) mostly account for over 90% of the variance. The PCs thereby vary smoothly across the arm's workspace. When pooling all movements from one initial posture to any of the 16 catching positions, 2 to 3 PCs still explain 75-80% of the data's variance. This indicates a strong coupling between the observed joint angles. Based on these findings, we hypothesize that flexible and variable behavior like multi-jointed human arm movement does not necessarily require complex neural algorithms. Instead, we show that catching movements within the entire workspace can be generated efficiently by linear combinations of a small set of basic movements.
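The dimensionality reduction described in the preceding abstract can be illustrated with a small sketch. The code below runs principal components analysis on a matrix of joint-angle time courses and reports the variance explained by the first two components; the synthetic data and array sizes are assumptions made for illustration only, not the authors' recordings.

    import numpy as np

    # Synthetic stand-in for motion-capture data: T time samples of 9 joint angles.
    rng = np.random.default_rng(0)
    T, n_angles = 500, 9
    t = np.linspace(0.0, 2.0 * np.pi, T)
    # Two underlying "basic movements" mixed into all 9 angles, plus a little noise.
    latent = np.stack([np.sin(t), np.cos(2.0 * t)], axis=1)          # shape (T, 2)
    mixing = rng.normal(size=(2, n_angles))                          # shape (2, 9)
    angles = latent @ mixing + 0.05 * rng.normal(size=(T, n_angles))

    # Principal components analysis via SVD of the mean-centred data.
    centred = angles - angles.mean(axis=0)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    print("variance explained by the first two PCs:", explained[:2].sum())

    # Any catching movement is then approximated by a linear combination
    # of the first two principal components ("basic movements").
    scores = centred @ vt[:2].T
    approximation = scores @ vt[:2] + angles.mean(axis=0)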
2004
Papers
Collin, C. A., Liu, C.H., Troje, N.F., McMullen, P.A., Chaudhuri, A.
Previous studies have suggested that face identification is more sensitive to variations in spatial frequency content than object recognition, but none have compared how sensitive the 2 processes are to variations in spatial frequency overlap (SFO). The authors tested face and object matching accuracy under varying SFO conditions. Their results showed that object recognition was more robust to SFO variations than face recognition and that the vulnerability of faces was not due to reliance on configural processing. They suggest that variations in sensitivity to SFO help explain the vulnerability of face recognition to changes in image format and the lack of a middle-frequency advantage in object recognition.
Proceedings
Zhang, Z., Troje, N.F.
In this report, we present and evaluate a method for reconstructing three-dimensional (3D) periodic human motion from two-dimensional (2D) motion sequences. Based on a Fourier decomposition of a training set of 3D data, we construct a linear, morphable representation. Using this representation, a low-dimensional linear model is learned by means of Principal Component Analysis (PCA). Two-dimensional test data are then projected onto this model, and the resulting 3D reconstructions are evaluated. We present two different simulations. In the first experiment, we assume the 2D projection matrix to be known. In the second experiment, the horizontal viewpoint is unknown and is recovered from the data.
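The reconstruction step sketched in the abstract above amounts to a linear least-squares problem: given the mean and principal components learned from 3D training data and a known projection (as in the first simulation), the model coefficients are estimated from the 2D observation and used to synthesize a 3D pose. The dimensions and the orthographic projection below are illustrative assumptions, not the exact setup of the paper.

    import numpy as np

    rng = np.random.default_rng(1)

    # Assumed learned model: mean 3D pose and k principal components (illustrative sizes).
    n_markers, k = 15, 4
    d3 = 3 * n_markers
    mean3d = rng.normal(size=d3)
    components = np.linalg.qr(rng.normal(size=(d3, k)))[0]    # orthonormal (d3, k) basis

    # Known orthographic projection keeping x and y of every marker.
    P = np.zeros((2 * n_markers, d3))
    for i in range(n_markers):
        P[2 * i, 3 * i] = 1.0          # x coordinate
        P[2 * i + 1, 3 * i + 1] = 1.0  # y coordinate

    # A 2D observation generated from some "true" model coefficients.
    true_c = rng.normal(size=k)
    obs2d = P @ (mean3d + components @ true_c)

    # Recover the coefficients by least squares and reconstruct the 3D pose.
    A = P @ components
    c_hat, *_ = np.linalg.lstsq(A, obs2d - P @ mean3d, rcond=None)
    pose3d = mean3d + components @ c_hat
    print("coefficient recovery error:", np.linalg.norm(c_hat - true_c))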
Symposia and Published Abstracts
Westhoff, C., Troje, N. F.
With rigid objects, the visual system shows some degree of viewpoint invariance. Information obtained about an object from one view can be used to identify the object from a novel view. Little is known whether we can also generalize to new views of non-rigid objects, such as human bodies. Here, we examine this question using biological motion. Stimuli were Fourier-represented point-light walkers decomposed into an average posture and the first five harmonics. We investigated the role of these harmonics for person identification from biological motion, and measured performance when viewing angles were varied between learning and test. Three groups of observers were trained to identify seven male walkers, previously unknown to them, shown from different views: 0 deg (frontal view), 30 deg, or 90 deg. Non-reinforced test stimuli were generated by first computing an average walker and then replacing either only its first, second, or third to fifth harmonics with the respective harmonics of the individual walkers. In the test session these walkers were shown either from the same viewing angle as in the training sessions, or from one of the two other viewpoints. Results show that walkers can be identified best if shown from the same view as during training. There was also a significant transfer to other angles, but performance declined with increasing difference between learning view and test view. There was a marginal effect of test view, with the 30 deg view producing best performance. The first harmonic contributes most information to the identification of the walkers. The second harmonic alone is still sufficient for recognition, but this is not the case for the higher order harmonics. There were no significant interactions between the type of harmonic and the training view or the test viewpoint, respectively. We conclude that there is a clear viewpoint effect for recognition of biological motion. Still, the visual system is able to generalize to a considerable degree across viewpoints. This research is funded by the Volkswagen Foundation.
Westhoff, C., Troje, N. F.
Object recognition across changing viewpoints requires sophisticated neural processing. We examine recognition of point-light walkers and investigate the contribution of different Fourier components to viewpoint generalization performance. Observers were trained to identify seven walkers in one of three learning groups differing according to the viewpoint: frontal, half-profile, or profile view. Test stimuli were generated by replacing either the first, second, or third to fifth harmonics of an average walker with the respective harmonics of the individual walkers, and were shown from all three viewpoints. No differences in performance were found between the learning groups. The test view had a marginal effect with the half-profile view being recognized best. Walkers can be identified best if shown from the same angle as during training, but there is also significant transfer to other views. There were no significant interactions between the type of harmonic and the training view or the test viewpoint, respectively.
Westhoff, C., Troje, N. F.
Human observers are able to learn to discriminate individuals shown as point light walkers. The walking pattern of a person can be represented very accurately using discrete Fourier analysis. Here, we represent walking data by decomposing them into an average posture and the first five harmonics, and we investigated their role for person identification from biological motion. Observers learned to identify seven male walkers from one of three different viewpoints: 0 deg (frontal view), 30 deg, or 90 deg. The displays were previously normalized with respect to the shape of the walker and his walking frequency. Non-reinforced test stimuli were generated by first computing an average walker and then replacing either only the first, second, or third to fifth harmonics of this average walker with the respective harmonics of the individual original walkers. Results show that the first harmonic contributes most information to the identification of the walkers. The second harmonic alone is still sufficient for the task, but this is not the case for the higher order harmonics. No differences in performance were found between the three viewpoint groups. Walkers can be identified best if shown from the same viewing angle as in the training sessions, but there is also a significant transfer to other angles. There was a marginal effect of the test view, with the 90 deg view producing the worst performance. There were no significant interactions between the type of harmonic and the training view or the test viewpoint, respectively. We conclude that individual walking patterns can be represented adequately with the use of two Fourier-harmonics and that the visual system is not able to detect subtle differences between point light walkers beyond this representation.
This research is funded by the Volkswagen Foundation.
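The Fourier representation used in these identification studies can be written compactly. The notation below is one plausible way to spell out "average posture plus the first five harmonics"; the symbols are ours, not necessarily those of the original papers.

    % Position of marker j at time t: average posture plus five harmonics
    % of the fundamental walking frequency \omega.
    p_j(t) = p_j^{(0)} + \sum_{k=1}^{5} \left[ a_{jk}\sin(k\omega t) + b_{jk}\cos(k\omega t) \right]

Test stimuli of the kind described above are then created by swapping selected coefficients (for example, the coefficients of a single harmonic) between an average walker and an individual walker.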
Vocks, S., Legenbauer, T., Troje, N.F., Zumfelde, M., Hildenbrand, S.
Background: As has been shown before, professional ballet dancers are at an enhanced risk of developing body image disturbances and eating disorders because of the high pressure to maintain a low body weight. However, there are almost no data on the dynamic body image, which might be an important factor especially in this group. Studies with non-professional ballet dancers are very rare, though they might be at a similar risk of developing an eating disorder.
Method: We compared a group of 25 non-professional ballet dancers with 33 control persons concerning several aspects of body image and eating behaviors with the "Multidimensional Body-Self Relations Questionnaire", the "Eating Disorder Examination Questionnaire", the "Eating Disorder Inventory" and the "Body Image Avoidance Questionnaire". In addition to those questionnaire measures, we included a digital distortion technique to get information with respect to the static and dynamic body image. Results: Results indicate that ballerinas show a higher preoccupation with overweight (p<.05) and food (p<.05) than controls, while eating behaviors were not different in both groups. Additionally, ballerinas scored higher on fitness orientation (p<.05) and fitness evaluation (p<.001). No group differences were found for the other components of body image as measured with the digital distortion technique.
Conclusion: These results indicate that non-professional ballerinas have at most a slightly enhanced risk of developing eating disorders and body image disturbances as compared with professional ballet dancers. We assume that the pressure to be thin is not as strong on non-professionals as on professionals.
Vocks, S., Legenbauer, T., Troje, N.F., Hupe, C., Rüddel, H., Stadtfeld-Oerteld, P., Rudolph, M., Schulte, D.
Background: Besides pathological eating behaviour, body image disturbances are a main characteristic of anorexia and bulimia nervosa. In the past, only the static body image has been examined; dynamic aspects such as the perception and evaluation of one's own motion patterns have not yet been studied in eating disorders.
Method: To assess static body image, patients with anorexia and bulimia nervosa (n=22) and a healthy control group (n=58) estimated their 'real', 'felt' and 'ideal' figure with a digital distortion technique. Assessment of the dynamic body image was realized by a computer programme based on the 'Biomotion technique' (Troje, 2002). Patients were asked to adjust motion patterns shown on the screen along a body mass index axis so that they best represented their 'real', 'felt' and 'ideal' motions.
Results: Concerning static body image, patients with eating disorders show a significantly stronger overestimation of their 'real' (p=.014) and 'felt' (p<.001) body dimensions than control subjects, while for the 'ideal' body image, no differences were found. Additionally, for dynamic body image, there was a trend towards a significant group difference with respect to the estimation of 'real' (p=.077) and a highly significant difference for the estimation of 'felt' (p<.001) motions. Patients estimated their motion patterns in the direction of a higher body mass index. Again, the 'ideal' motion pattern did not differ between the two groups.
Discussion: It was demonstrated for the first time that in patients with anorexia and bulimia nervosa, the body image disturbance includes a dynamic component in addition to the static aspect.
References: Troje, N.F. (2002). Decomposing biological motion. Journal of Vision, 2(5), 371-387.
Vocks, S., Legenbauer, T., Troje, N.F., Hupe, C., Rüddel, H., Stadtfeld-Oerteld, P., Rudolph, M., Schulte, D.
Background: Besides pathological eating behaviour, body image disturbances are a main characteristic of anorexia and bulimia nervosa. In the past, however, only the static body image has been studied; dynamic aspects such as the perception and evaluation of one's own movements in patients with eating disorders have not yet been considered. Method: To assess the static body image, patients with anorexia and bulimia nervosa (n=22) and a healthy control group (n=58) adjusted a digital photograph of themselves in a tight-fitting suit on a computer to match their 'real', 'felt' and 'ideal' figure. The dynamic body image was assessed with a computer program developed by our group, based on the 'Biomotion technique' (Troje, 2002). Participants adjusted a motion pattern presented on the screen along a body mass index axis so that it corresponded to their 'real', 'felt' and 'ideal' movements. Results: Regarding the static body image, the patients with eating disorders showed a significantly stronger overestimation of their 'real' (p=.014) and 'felt' (p<.001) body dimensions than the control group, whereas no differences emerged for the 'ideal' body image. For the dynamic body image, there was a trend towards a significant group difference in the estimation of the 'real' (p=.077) and a highly significant difference in the estimation of the 'felt' movement patterns (p<.001), with the patients with anorexia and bulimia nervosa rating their movements towards a higher body mass index. The 'ideal' movement pattern did not differ between the two groups. Conclusion: It was demonstrated for the first time that patients with anorexia and bulimia nervosa show a disturbance not only of the static but also of the dynamic body image. Reference: Troje, N.F. (2002). Decomposing biological motion. Journal of Vision, 2(5), 371-387.
Troje, N. F.
Biological motion point-light displays produce a strong inversion effect which shows similarities to the well-known inversion effect in face recognition. In face recognition, the inversion effect has been interpreted in terms of a distinction between configural processing and featural processing. There is evidence that turning faces upside-down hinders configural processing. The inversion effect for biological motion may also be based on impaired configural processing. The effects of turning a walker upside-down should then be comparable to the effects observed with scrambled motion stimuli, i.e. with stimuli that keep local motion intact but displace the trajectories of the single dots randomly. An alternative explanation for the inversion effect in biological motion is based on recent findings that assumptions about the direction of gravity play a role in interpreting biological motion. If inverted gravity is the reason for the inversion effect, upright scrambled motion should produce weaker effects than intact but inverted motion. We tested the two alternative predictions using a task in which subjects had to determine the apparent walking direction of human and animal point-light walkers shown in sagittal view. The displays remained stationary on the screen, i.e. they looked as if recorded on a treadmill, and they were masked with a dynamic random-dot background. Both response times and accuracy clearly showed that scrambling the motion had only a very minor impact on direction perception. Even though subjects had no idea what kind of animal they might be seeing, they could indicate its walking direction quickly and accurately. On the other hand, turning the displays upside-down had a strong impact on the perceived direction, with performance being almost at chance. We conclude that the inversion effect in biological motion is not due to a disturbance of configural processing but is rather a result of prior assumptions about the direction of gravity becoming invalid.
This research is funded by the Volkswagen Foundation.
Troje, N. F.
Besides many other achievements, Etienne Jules Marey was a pioneer in data visualization. He realized very early that complex physiological data can only be understood if they can be represented and displayed in a way that suits human perception. The ease with which our visual system can interpret complex animate motion patterns in order to retrieve information about both a performed action and the performing actor has fascinated many researchers since. I will outline the impact that Marey's work had on the study of visual perception of biological motion and sketch the main results obtained in this field. I will then introduce a framework that can be used to analyse human motion patterns with respect to biological and psychological attributes of the agent. I will further show how the same framework can be used to synthesize human motion patterns with well-defined attributes for use in computer animation and robotics, and I will finally discuss it as a model for visual motion processing in the human brain.
Troje, N. F.
Animate motion contains information about the identity of an agent as well as about his or her actions, intentions, emotions, and personality. The human visual system is highly sensitive to biological motion and has evolved a remarkable capability to extract this socially relevant information from the way a person moves. The effortlessness with which we can adequately interpret other people's motion contrasts sharply with the fact that we still know very little about the mechanisms underlying information encoding in, and retrieval from, biological motion patterns. In this contribution, I want to outline a framework that enables us to successfully search for the effects of a number of different traits on human gait patterns. The proposed model is based on the statistics of a database of motion capture data. Based on a linearization of the motion data, a motion space is defined which is spanned by the first few principal components obtained from the database of input walkers. Using biological and psychological traits attributed to the input walkers, linear discriminant functions are computed which define vectors in the motion space that generalize the respective trait. The framework will be used to explore variance in human walking patterns with respect to sex, age, attractiveness, a number of different emotions, and identity. In doing so, I will stress the fact that successful retrieval of the respective information from the input data is only possible if we treat the moving person in a holistic way which retains the complex correlations between different moving parts of the articulated body as well as between the kinematics and static, structural properties of the body. The framework used to analyze animate motion is a generative model, which can also be used for motion synthesis. In order to create psychologically convincing motion, it is crucial to preserve the correlative nature of animate motion. The human visual system is extremely sensitive to violations of the complex correlation patterns, and I will present examples of how it uses implicit knowledge about the biomechanics (and general physics) of articulated motion causing these correlations.
Jokisch, D., Daum, I., Troje, N.F.
The human visual system is very sensitive to the detection of animate motion patterns. We can efficiently recognize human action patterns and attribute many features of psychological, biological and social relevance to other persons. An experimental approach for studying information from biological motion (BM) with reduced interference from non-dynamic cues is to represent the main joints of a person's body by bright dots against a dark background.
In the present study we investigated the influence of viewing angle on recognition performance for walking patterns of familiar persons represented as point-light displays (PLD). Classical work on individual recognition (Cutting & Kozlowski, 1977; Beardsworth & Buckner, 1981) provided empirical evidence that cues from BM contain information that enables observers to recognize familiar persons and to identify their own walking pattern when it is represented as a PLD.
We tested viewpoint-dependent recognition performance in two groups of twelve persons each, who knew each other very well. Motion data of the participants were acquired by recording their walking patterns in 3D space using a motion capture system. The locations of the major joints were computed from the trajectories of the original markers. Size-normalized PLDs of these walking patterns were presented to the same group members on a computer screen in three different orientations (frontal view, half-profile view and profile view). Before the experiment, observers were shown a list of all occurring names, including their own. Observers were requested to press a button if they had recognized the person's gait pattern and to indicate afterwards the person's name by clicking on the corresponding name button in a list containing all names. Observers did not receive feedback on their responses.
Analysis revealed a significant effect of viewing angle on recognition performance. Displays presented in frontal view and in half-profile view were correctly identified significantly more often than displays presented in profile view. Whereas recognition performance was significantly above chance level, BM failed to provide a highly reliable cue for individual identification if the walking patterns of familiar persons had not been seen as PLDs before.
We conclude that individual features of gait dynamics can be extracted more efficiently when seen in frontal or half-profile view. This viewpoint-dependent recognition effect might be due to the fact that attention towards another person is triggered when that person approaches us, resulting in increased exposure to frontal views of gait patterns.
Jokisch, D., Daum, I., Troje, N.F.
The human visual system is very sensitive to the detection of animate motion patterns. We can efficiently recognize human action patterns and attribute many features of psychological, biological and social relevance to other persons. In the present study we investigated the influence of viewing angle on recognition performance for walking patterns of one's own person and of familiar individuals such as friends or colleagues, represented as point-light displays (PLD). We tested viewpoint-dependent recognition performance in two groups of twelve persons who knew each other very well. Participants' motion data were acquired by recording their walking patterns in 3D space using a motion capture system. Locations of major joints were computed from the trajectories of the original markers. Size-normalized PLDs of these walking patterns were presented to the same group members on a computer screen in frontal view, half-profile view and profile view. Before the experiment, observers were shown a list of names of all people to be presented, including their own. Observers were requested to press a button if they had recognized the person's gait pattern and to indicate afterwards the person's name by clicking on the corresponding name button in the list of names. No feedback was given to the observers. Whereas recognition performance for one's own walking patterns was viewpoint independent, the recognition rate for other familiar individuals was better for frontal and half-profile views than for profile view. We conclude that the viewpoint-dependent recognition effect for other people might be due to attention being triggered when another person approaches us, resulting in increased exposure to frontal and half-profile views of gait patterns. The finding of viewpoint-independent recognition for one's own movement patterns might be related to a crossmodal transfer from motor to visual representations. This research is funded by the Volkswagen Foundation.
Jiménez Ortega, L., Troje, N.F.
Many birds show a characteristic backward and forward head movement called head-bobbing. It is composed of a hold phase, during which the head remains static in space, and a thrust phase, during which the head is moved forward. Three main functions for head-bobbing have been proposed: a biomechanical function, image stabilization, and depth perception through motion parallax. However, its function is not yet well understood. Although head-bobbing behaviour has often been discussed in the literature, the birds that bob their heads have not been systematically listed. It has been reported that head-bobbing occurs in at least 8 of the 27 orders of birds and in 28 species such as pigeons, doves, hens, starlings, pheasants, coots, rails, sandpipers, phalaropes, parrots, magpies and quail. At the moment, we are collecting data about head-bobbing birds through exhaustive field observation of different species of birds. So far we have found head-bobbing birds in 10 orders, 25 families and more than 60 species. In contrast, we have observed non-head-bobbing birds in 9 orders, 20 families and almost 100 species. The list indicates that around 40% of birds show head-bobbing. We discuss whether head-bobbing is a monophyletic or a polyphyletic trait and we analyze the ecological and behavioural factors underlying head-bobbing behaviour, such as predatory pressure, source of feeding, habitat, etc. Based on these findings, we will discuss the functional significance of avian head-bobbing.
2003
Papers
Troje, N. F.
Both face recognition and biological-motion perception are strongly orientation-dependent. Recognition performance decreases if the stimuli are rotated with respect to their normal upright orientation. Here, the question whether this effect operates in egocentric coordinates or in environmental coordinates is examined. In addition to the use of rotated stimuli, the observers were also rotated and tested both with a same-different face-recognition task and with a biological-motion detection task. A strong orientation effect was found that depended only on the stimulus orientation relative to the observer. This result clearly indicates that orientation effects in both stimulus domains operate in an egocentric frame of reference. This finding is discussed in terms of the particular requirement of extracting sophisticated information for social recognition and communication from faces and biological motion.
Jokisch, D., Troje, N. F.
Animals as well as humans adjust their gait patterns in order to minimize the energy required for their locomotion. A particularly important factor is the constant force of earth's gravity. In many dynamic systems, gravity defines a relation between temporal and spatial parameters. The stride frequency of an animal that moves efficiently in terms of energy consumption depends on its size. In two psychophysical experiments, we investigated whether human observers can employ this relation in order to retrieve size information from point-light displays of dogs moving with varying stride frequencies across the screen. In Experiment 1, observers had to adjust the apparent size of a walking point-light dog by placing it at different depths in a three-dimensional depiction of a complex landscape. In Experiment 2, the size of the dog could be adjusted directly. Results show that displays with high stride frequencies are perceived to be smaller than displays with low stride frequencies and that this correlation perfectly reflects the predicted inverse quadratic relation between stride frequency and size. We conclude that biological motion can serve as a cue to retrieve the size of an animal and, therefore, to scale the visual environment.
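The "inverse quadratic relation" referred to above follows from pendulum-like limb dynamics under gravity; the following lines are a standard textbook sketch consistent with the abstract, not equations quoted from the paper.

    % Natural frequency of a pendulum-like limb of length L under gravity g:
    f \propto \sqrt{g / L}
    % hence size scales with the inverse square of stride frequency:
    L \propto g / f^{2}

Doubling the stride frequency therefore implies an animal roughly four times smaller, which is the relation the observers' size judgments reproduced.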
Guski, R., Troje, N. F.
We report three experiments in which visual or audiovisual displays depicted a surface (target) set into motion shortly after one or more events occurred. A visual motion was used as an initial event, followed directly either by the target motion or by one of three marker events: a collision sound, a blink of the target stimulus, or the blink together with the sound. The delay between the initial event and the onset of the target motion was varied systematically. The subjects had to rate the degree of perceived causality between these events. The results of the first experiment showed a systematic decline of causality judgments with an increasing time delay. Causality judgments increased when additional auditory or visual information marked the onset of the target motion. Visual blinks of the target and auditory clacks produced similar causality judgments. The second experiment tested several models of audiovisual causal processing by varying the position of the sound within the visual delay period. No systematic effect of the sound position occurred. The third experiment showed a subjective shortening of delays filled by a clack sound, as compared with unfilled delays. However, this shortening cannot fully explain the increased tolerance for delays containing the clack sound. Taken together, the results are consistent with the interpretation that the main source of the causality judgments in our experiments is the impression of a plausible unitary event and that perfect synchrony is not necessary in this case.
Other Contributions
Symposia and Published Abstracts
Troje, N. F., Geyer, H.
The human visual system shows an impressive sensitivity to subtleties in animate motion patterns carrying biologically relevant information. Frontal views of biological motion point-light walkers can be classified with respect to the gender of the walker with high accuracy. Here, we document pronounced adaptation effects that alter the perceived gender of a point-light walker.
Stimuli were generated using a morphing technique which provides smooth transitions along a linear discriminant function classifying a set of 80 walkers according to their sex [1]. Thirteen different walkers were sampled along the male-female walking axis covering a total range of 7 standard deviations of the walker distribution. In a first experiment we determined the location of a perceptually neutral walker on the male-female axis. Five different presentation times between 350 and 7000 ms were used. Using a two-alternative forced-choice procedure, subjects had to indicate for each display whether it showed a man or a woman. The data were fitted by logistic psychometric functions. A morph half way between an average male and an average female walker was rated to be male. To obtain a perceptually neutral walker, about one standard deviation of femaleness has to be added. These results were independent of the presentation time.
In a second experiment, observers were first presented with 7000 ms point-light displays of either an exaggerated male walker, an exaggerated female walker or a perceptually neutral walker. After this adaptation period, they were tested with short presentations of walkers sampled along the male-female walking axis. The presentation time of the test stimulus was either 350, 700, or 1400 ms. Adaptation results in a pronounced shift of perceived gender of the test stimulus. A neutral walker is perceived to be female after adaptation with the exaggerated male and male after adaptation with the exaggerated female walker.
Our data demonstrate that adaptation can occur not only within low-level vision processes but also at high-level information processing stages. In the case of biological motion perception, the aftereffects affect stages at which the complex information from the series of single moving light dots is integrated into a coherent percept of a walking person.
This research is funded by the Volkswagen Foundation.
Troje, N. F. (2002). Decomposing biological motion: A framework for analysis and synthesis of human gait patterns. Journal of Vision, 2:371-387, http://journalofvision.org/2/5/2
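The psychometric-function fit mentioned in this abstract can be sketched as follows: the proportion of "female" responses at each position along the male-female axis is fitted with a logistic function whose midpoint estimates the perceptually neutral walker. The axis positions and response proportions below are invented for illustration.

    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(x, x0, k):
        # Proportion of "female" responses as a function of position x on the
        # male-female axis; x0 is the point of subjective equality (neutral walker).
        return 1.0 / (1.0 + np.exp(-k * (x - x0)))

    # Hypothetical data: positions in standard deviations of the walker distribution
    # (negative = male direction) and observed proportions of "female" responses.
    x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
    p_female = np.array([0.02, 0.08, 0.25, 0.40, 0.75, 0.93, 0.99])

    (x0, k), _ = curve_fit(logistic, x, p_female, p0=(0.0, 1.0))
    print(f"perceptually neutral walker at {x0:+.2f} SD, slope {k:.2f}")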
Troje, N. F., Bach, M.
Adaptation is a very general and basic phenomenon in biological information processing, covering a broad range from gain control to "fatigue". Adaptation provides an active mechanism for efficient data compression by removal of redundancy: encoding changes of properties rather than the properties themselves allows the visual system to acquire, transmit, process and store information in a highly economical manner while minimizing losses. However, besides its functional significance, adaptation has also proven to be a valuable scientific instrument to non-invasively investigate, characterize and isolate sensory information processing pathways. In the visual domain, adaptation has traditionally been used mainly to study early visual processing. During the last few years, however, it has become evident that adaptation and corresponding after-effects also play a major role in high-level cognitive processing and that adaptation can be employed to study phenomena such as face recognition or biological motion perception. In this symposium, we want to trace this development spanning the whole range between low-level vision and high-level cognitive processes, on the one hand, while emphasizing the dualistic nature of adaptation as a neural mechanism and as an investigative tool, on the other hand. J. Zanker will open the series of presentations by providing a general introduction into the concepts of spatio-temporal visual signal coding that lead to the phenomena of after-effects in time as well as to simultaneous contrast enhancement in space. Using the example of motion boundaries in time and space, he will illustrate this point in more detail by comparing results from a computational motion detection model to psychophysical observations. The next two talks will provide illustrative examples of the use of adaptation for probing the properties of low-level visual filters. In the contribution of M. Fahle, selective adaptation is used as a tool to study the effects of perceptual learning on the characteristics of orientation-selective visual filters. M. Bach uses a complex double adaptation paradigm to isolate direction-specific motion responses in the VEP from direction-unspecific flicker responses. M. Greenlee's contribution is particularly interesting because he shows that contrast gain control, a mechanism that implements adaptation to varying light intensities, can itself be highly adaptive, therefore demonstrating "second-order adaptation" in the visual system. In the last two contributions it is shown that adaptation is not only a low-level visual phenomenon. D. Leopold presents data on aftereffects in face recognition and N. Troje finds similar effects for biological motion perception.
Troje, N. F.
Human motion contains a wealth of information about actions and intentions, but also about identity and personal attributes of the moving person. Our visual system can retrieve information about a person's gender, age, emotional state and personality traits and we can individually recognize a good friend -- solely based on his or her walking patterns. What our visual system seems to solve so effortlessly is still a riddle in vision research and an unsolved problem in computer vision. Little is known about exactly how biologically and psychologically relevant information is encoded in visual motion patterns. Here, I will outline a general framework that can be used to retrieve information from biological motion patterns and I will present a number of examples which will demonstrate the capabilities as well as the limits of the proposed approach.
The approach is based on transforming biological motion data into a representation that subsequently allows for analysis using linear statistics and pattern recognition techniques. The required linearization is achieved by using a Fourier-based approach. The DC part of the transform encodes structural information, i.e. the geometry of the moving body, whereas its dynamic parts encode the motion itself. The major property of the resulting representation is that it is morphable: linear combinations of existing walking patterns result in proper walking patterns which represent smooth transitions between the original walkers.
Using this Fourier-based representation, we apply principal component analysis to reduce the dimensionality of the resulting space to an extent that allows efficient linear classifiers to be computed. The classifiers can be trained on different properties of interest. These properties are either obtained from the subjects themselves (like the sex of a walker) or they can reflect the perception of observers who were asked to rate point-light displays of the walkers (for instance, according to their attractiveness). I will show examples of extracting information about sex, weight, attractiveness and a number of emotional and personality traits. I will furthermore compare some of the computational results with psychophysically obtained data from human observers in order to discuss the relevance of our approach as a model for the mechanisms underlying the recognition of animate motion in the human visual system.
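A minimal sketch of this analysis pipeline, under the assumption that each walker is already available as a vector of Fourier coefficients: PCA reduces the dimensionality and a linear discriminant in the reduced space defines an axis for the attribute of interest (here sex). The synthetic data and the use of scikit-learn are conveniences of the sketch, not a description of the original implementation.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(2)

    # Synthetic stand-in: 80 walkers, each a high-dimensional vector of Fourier
    # coefficients (average posture plus harmonic amplitudes and phases).
    n_walkers, n_features = 80, 300
    sex = np.repeat([0, 1], n_walkers // 2)                  # 0 = male, 1 = female
    class_shift = rng.normal(size=n_features)
    walkers = rng.normal(size=(n_walkers, n_features)) + 0.5 * np.outer(sex, class_shift)

    # Reduce dimensionality, then fit a linear classifier in PC space.
    pca = PCA(n_components=10).fit(walkers)
    scores = pca.transform(walkers)
    lda = LinearDiscriminantAnalysis().fit(scores, sex)
    print("training accuracy:", lda.score(scores, sex))

    # The discriminant direction mapped back into the original coefficient space
    # defines an axis along which walkers can be morphed or caricatured.
    attribute_axis = lda.coef_ @ pca.components_             # shape (1, n_features)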
Troje, N. F.
Biological motion contains plenty of visual information about several attributes of biological and psychological significance. In particular, we can accurately determine the sex of a walker from the way he or she moves. Furthermore, motion patterns can vary to a large extent in perceived sexual attractiveness. In this study, we investigate the relation between perceived attractiveness and gender using dynamic point-light displays from 40 male and 40 female walkers. Using a linear, morphable stimulus space, we determined discriminant functions based on attractiveness ratings of 12 male and 12 female participants. In a first block, observers were shown displays of the other sex and asked to rate the walkers in terms of their sexual attractiveness. In the second block, they were presented with walkers of their own sex and asked to rate their assumed attractiveness to the other sex. The resulting discriminant functions are visualized in terms of caricatured walking displays and compared with the linear discriminant function that best classifies the sex of a walker. The results show that female attractiveness as rated by male observers highly correlates with gender -- i.e. with the projection of a walker onto the linear sex classifier. In contrast, female attractiveness as rated by female observers is virtually independent of gender and rather appears to display a vivacious, energetic character. Male attractiveness as rated by male and female observers shows a similar tendency. Whereas male observers assume themselves to be rated attractive by females when displaying masculinity, the discriminant function based on the ratings of female observers is in fact almost perpendicular to the gender discriminant function.
For an online demonstration, see http://www.biomotionlab.de/Demos/attractivity.html.
Patton, T., Yelda, S., Buschmann, J.-U., Troje, N.F., Shimizu, T.
Male pigeons show species-specific courtship displays in front of female pigeons. The present study examined whether visual information, without auditory or tactile input, could trigger such courtship displays. The study also examined which technical measures could be used to present the visual stimuli. We studied the behaviors of male pigeons in response to video-taped and computer-animated stimuli presented on a computer monitor. The subjects did show courtship displays in front of video-taped and computer-animated females. However, they showed little or no such behavior when empty cages or upside-down views of females were presented on the monitor. Thus, subjects selectively reacted to the visual stimuli, suggesting that the artificial pigeons can be used as stimuli to represent a viable potential mate.
Jokisch, D., Troje, N.F., Kress, T., Daum, I.
The human visual system is very sensitive to animate motion patterns. Humans can efficiently detect another living being in a visual scene and retrieve many features of psychological, biological and social relevance. By representing the main joints of a person's body by bright dots against a dark background, observers can easily recognize a human walker and determine his/her gender, recognize various action patterns and identify individual persons. The importance of the perception of biologically relevant motion patterns is reflected by the identification of a specific neural circuitry as shown by brain imaging studies. Whereas basic principles of the neural basis of perception of biological motion are understood, many issues concerning the temporal characteristics of the processing of such information are as yet unclear. In the present study we investigated how inversion of biological motion stimuli affects components of event-related potentials (ERP). ERPs were recorded in response to point-light displays of an upright walking person, point-light displays of an inverted walking person and displays of scrambled motion, in which the moving dots had the same motion vectors as in the biological motion displays but with their initial starting positions randomized. Analysis yielded an N170 component at parieto-occipital electrodes, which was more pronounced for upright walkers than for inverted walkers and scrambled motion. A later component in the time window between 300 and 400 ms after stimulus onset had a larger amplitude for upright walkers and inverted walkers as compared to scrambled walkers. We hypothesize that the N170 component reflects the holistic recognition of prototypical configurations of a human body, whereas the later component is associated with the integration of the dots' interrelations into a coherent percept.
Jokisch, D., Troje, N.F., Kress, T., Daum, I.
The human visual system is very sensitive to the detection of animate motion patterns. Humans can efficiently detect another living being in a visual scene and retrieve many features of psychological, biological and social relevance.
An experimental approach for studying information from biological motion without interference from form is to represent the main joints of a person's body by bright dots against a dark background. Using this procedure, observers can easily recognize a human walker and determine his/her gender, recognize various action patterns and identify individual persons. The importance of the perception of biologically relevant motion patterns is reflected by the identification of a specific neural network as shown by brain imaging studies. Whereas basic principles of the neural basis of perception of biological motion are understood, many issues concerning the temporal characteristics of the processing of this kind of information are as yet unclear.
In the present study we investigated how different processing stages involved in the perception of biological motion are reflected by modulations in event related potentials (ERP). ERPs were recorded in response to point-light displays of a walking person, point-light displays of an inverted walking person and displays of scrambled motion, in which the moving dots had the same motion vectors as in biological motion displays with their initial starting positions being randomized.
Analysis yielded an N200 component at parieto-occipital electrodes, which was more pronounced for upright walkers than for inverted walkers and scrambled motion. A later component in the time window between 300 and 400 ms after stimulus onset had a larger amplitude for upright walkers and inverted walkers as compared to scrambled walkers. We hypothesize that the N200 component reflects the holistic recognition of a human person, whereas the later component is associated with the integration of dot patterns into a coherent percept.
Jokisch, D., Kress, T., Daum, I., Troje, N.F.
The human visual system is very sensitive to the detection of animate motion patterns. We can efficiently detect another living being in a visual scene, recognize human action patterns and attribute many features of psychological, biological and social relevance to other persons. An experimental approach for studying information from biological motion (BM) with reduced interference from non-dynamic cues is to represent the main joints of a person's body by bright dots against a dark background. In the present study we investigated how different processing stages involved in the perception of BM are reflected by modulations in event related potentials (ERP) in order to elucidate the time course and location of neural processing in BM perception. ERPs were recorded in response to point-light displays of a walking person, an inverted walking person and displays of scrambled motion. Analysis yielded a pronounced negativity with a peak at 180 ms after stimulus onset (N180) at parieto-occipital electrodes which was more pronounced for upright walkers than for inverted walkers and scrambled motion. A later negative component between 230 and 360 ms after stimulus onset (N2) had a larger amplitude for upright and inverted walkers as compared to scrambled walkers and revealed a shift of maximum negativity to temporo-parietal areas. In the later component, negativity was more pronounced over the right hemisphere, revealing asymmetries in BM perception. The early component might reflect the global recognition of a human person, whereas the later component might be associated with the local integration of dot patterns into a coherent percept. We conclude that the visual processing needed to perform the highly demanding task of discriminating between BM and scrambled motion can be achieved within a time period of merely 180 ms. This evidence for fast, efficient processing underlines the importance of the perception of BM and provides further evidence for a specific neural network involved in processing biologically relevant motion signals. Furthermore, the right-hemispheric dominance associated with the perception of BM shows clear analogies to asymmetries in face perception and probably reflects the social relevance of animate motion perception.
Jiménez-Ortega, L., Troje, N. F.
Computing the distance of an object from motion parallax involves comparing the displacement or velocity of the observer's eye with the displacement or velocity of the object's retinal image that this movement induces. Motion parallax computation therefore requires knowledge about the relative speed between the observer and the object. In many situations this information is not available to the observer -- either because the observer or the object moves with an unknown velocity. Theoretically, one could still determine distance by moving with two different speeds and employing only knowledge about the difference between them. We refer to this mechanism as "differential motion parallax" and we assume that many birds use this mechanism to monocularly measure distance in the lateral visual field. Here, we examine whether humans are capable of using differential motion parallax. Observers had to indicate whether a central, horizontal array of small squares was in front of or behind a plane represented by two flanking horizontal arrays. We measured depth discrimination thresholds for the monocularly viewed patterns with and without adding a constant, but from trial to trial unpredictably varying, motion component to the stimulus. Since motion parallax was the only cue, subjects had to make lateral translational movements with their upper body in order to solve the task. If the stimulus did not move, subjects demonstrated high accuracy (on average 0.2% of the viewing distance). Adding a constant speed of the same magnitude to both the central pattern and the flanking arrays only slightly impaired this performance. However, when constant, randomly chosen speeds were added independently to both patterns, the threshold increased dramatically, suggesting that the human visual system is not able to take advantage of differential motion parallax. This research is funded by the Volkswagen Foundation.
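The logic of differential motion parallax that the study tests can be illustrated with a small numerical sketch (the experiment suggests that human observers apparently cannot exploit it). All numbers, names and the small-angle approximation below are illustrative assumptions, not values from the study.

import numpy as np

def retinal_angular_velocity(v_eye, v_object, distance):
    """Angular velocity (rad/s) of an object in the lateral visual field for a
    laterally translating eye; small-angle approximation: relative speed / distance."""
    return (v_eye - v_object) / distance

true_distance = 2.0          # metres (hypothetical viewing distance)
v_object = 0.07              # unknown constant velocity added to the stimulus (m/s)
v1, v2 = 0.10, 0.25          # two different lateral head speeds of the observer (m/s)

omega1 = retinal_angular_velocity(v1, v_object, true_distance)
omega2 = retinal_angular_velocity(v2, v_object, true_distance)

# Standard motion parallax would need the absolute relative speed (unknown here).
# Differential motion parallax only needs the difference between the two eye speeds:
estimated_distance = (v2 - v1) / (omega2 - omega1)
print(estimated_distance)    # -> 2.0, independent of the unknown v_object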
Huber, L., Loidolt, M., Troje, N.F.
Animals are consistently faced with the problem of recognizing objects from various perspectives in the natural environment, either because of their own movements or because of the movements of some external object. For instance, depth rotation dramatically alters the information present in any two-dimensional view of an object; yet humans readily recognize most objects despite enormous variations in vantage point. Do non-human animals also show generalization over rotation in depth? Here we report an experiment that tested perspective processing in the pigeon when presented with pictures of human heads rotated in depth. In the training, the discrimination performance of pigeons presented with pictures of two male heads rotating in depth did not differ from the performance of pigeons trained on a single view of each head or from the performance of pigeons trained on multiple static views of the heads. In the transfer test, novel views of the heads were then tested for recognition. The pigeons of all three groups showed viewpoint dependence, with their performance declining systematically with degree of rotation from the nearest training view. These findings suggest that pigeons -- in contrast to humans -- do not integrate the multiple views of the rotating heads into the percept of a 3D object but rather encode the heads in the specific poses in which they are seen. This implies that the heads are represented as a collection of familiar views that reflect the idiosyncratic morphological aspects of these particular viewpoints.
2002
Papers
Troje, N. F.
Biological motion contains information about the identity of an agent as well as about his or her actions, intentions, and emotions. The human visual system is highly sensitive to biological motion and capable of extracting socially relevant information from it. Here we investigate the question of how such information is encoded in biological motion patterns and how such information can be retrieved. A framework is developed that transforms biological motion into a representation allowing for analysis using linear methods from statistics and pattern recognition. Using gender classification as an example, simple classifiers are constructed and compared to psychophysical data from human observers. The analysis reveals that the dynamic part of the motion contains more information about gender than motion-mediated structural cues. The proposed framework can be used not only for analysis of biological motion but also to synthesize new motion patterns. A simple motion modeler is presented that can be used to visualize and exaggerate the differences in male and female walking patterns.
Proceedings
Watson, T. L., Johnston, A., Hill, H. C., Troje, N. F.
To investigate viewpoint dependence in dynamic faces, an avatar was animated using actors' movements. In Experiment 1 subjects were shown a full-face animation. They were then asked to judge which of two rotated moving avatars matched the first. Test view, orientation and the type of motion were manipulated. In a second experiment subjects were shown two views of the same facial animation and were asked which of the two avatars was the same as the initial animation. Initial views could be rotated to 15° and 45° or 45° and 75° while test views were presented at 30° or 60°. Learnt view, test view, orientation and type of movement (rigid + non-rigid vs non-rigid) were manipulated. Both experiments and movement conditions produced an advantage for upright over inverted matching, demonstrating that subjects were encoding facial information. Non-rigid movement alone showed no effect of view for both experiments, demonstrating viewpoint invariance. Rigid and non-rigid movement presented together produced a decline in performance for larger test rotations in Experiment 1, while Experiment 2 produced a differential advantage for the 30° test rotation when initially viewed upright faces were rotated to 15° and 45°; however, no difference was found in the 45° and 75° condition or with inverted faces. These experiments suggest that non-rigid facial movement is represented in a viewpoint invariant manner whereas the addition of rigid head movements encourages a more viewpoint dependent encoding when the initial orientation of the head is not rotated further than the half profile (45°).
Troje, N. F.
A framework is outlined that can be employed to obtain gender and other characteristics of the agent from human motion patterns and subsequently use this information to synthesize motion with particular, well-defined biological and psychological attributes. The proposed model is based on the statistics of a data base of motion capture data. Based on linearization of the motion data, a motion space is defined which is spanned by the first few principal components obtained from the data base of input walkers. Using biological and psychological traits attributed to the input walkers, linear discriminant functions are computed which define vectors in the motion space that generalize the respective trait. These vectors are in turn used to generate walking patterns with the respective properties.
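A minimal sketch of such a linear pipeline is given below, assuming placeholder data and off-the-shelf PCA and discriminant routines; it illustrates the general idea (project walkers into a low-dimensional motion space, fit a trait axis, move along it to exaggerate), not the published implementation.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
walkers = rng.normal(size=(40, 300))       # stand-in: one linearized motion vector per walker
gender = rng.integers(0, 2, size=40)       # 0 = female, 1 = male (stand-in labels)

pca = PCA(n_components=5)
coeffs = pca.fit_transform(walkers)        # motion space spanned by the first few PCs

lda = LinearDiscriminantAnalysis().fit(coeffs, gender)
axis = lda.coef_[0] / np.linalg.norm(lda.coef_[0])   # trait (gender) axis in motion space

# Synthesis: move the average walker along the trait axis and map back into the
# original representation; larger factors give caricatured gaits.
mean_coeffs = coeffs.mean(axis=0)
for factor in (-4.0, 0.0, 4.0):
    new_walker = pca.inverse_transform((mean_coeffs + factor * axis)[None, :])[0]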
Symposia and Published Abstracts
Troje, N. F., Geyer, H.
The human visual system shows an impressive sensitivity to subtleties in animate motion patterns carrying biologically relevant information. Frontal views of biological motion point-light walkers can be classified with respect to the gender of the walker with high accuracy. Here, we document pronounced adaptation effects that alter the perceived gender of a point-light walker. Stimuli were generated using a morphing technique which provides smooth transitions between male and female walking patterns. Observers were first presented with 5 walking cycles of either an exaggerated male walker, an exaggerated female walker or a neutral walker. Subsequently, they were tested with 700 ms presentations of walkers sampled along the male-female walking axis. Their task was to indicate whether the test walker was a man or a woman. A psychometric function was fitted to the data. Adaptation to the male walker results in a pronounced shift of perceived gender of the test stimulus. A neutral walker is perceived to be female after adaptation with the exaggerated male and male after adaptation with the exaggerated female walker. This demonstrates that adaptation can occur not only within low-level vision processes but also at high-level information processing stages.
This research is funded by the Volkswagen Foundation.
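The psychometric-function fit mentioned above can be sketched as follows; the morph levels and response proportions are invented for illustration, and a cumulative Gaussian is assumed as the functional form.

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def psychometric(x, mu, sigma):
    # probability of responding "male" at morph level x
    return norm.cdf(x, loc=mu, scale=sigma)

morph_level = np.linspace(-1.0, 1.0, 9)    # -1 = exaggerated female, +1 = exaggerated male
p_male = np.array([0.02, 0.05, 0.10, 0.30, 0.55, 0.80, 0.90, 0.97, 0.99])   # made-up proportions

(mu, sigma), _ = curve_fit(psychometric, morph_level, p_male, p0=(0.0, 0.3))
# mu marks the point of subjective gender neutrality; a shift of mu after adapting
# to an exaggerated male vs. female walker quantifies the adaptation effect.
print(f"PSE = {mu:.2f}, slope parameter = {sigma:.2f}")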
Troje, N. F., Förster, A.
Both face recognition and biological motion perception are strongly orientation dependent. Recognition performance decreases if the stimuli are rotated with respect to their normal upright orientation. Here, we examine the question of whether this effect operates in egocentric coordinates or in external coordinates. Two different tasks were employed. In the first task observers had to indicate whether two successively presented images of human faces were same or different. In the second task subjects had to indicate whether a display of 50 moving dots contained a point-light walker or not. The stimuli were either shown right side up or rotated 90 degrees clockwise. The observer was either sitting upright or lying on his left side. In the face recognition task, error rates were affected neither by the observer's position nor by the stimulus orientation. A strong interaction (F(1,7)=23.7, p<0.005) between the two factors indicated that performance is only determined by the relative orientation of stimulus and observer. Performance is best if the stimulus has the same orientation as the observer. The same result is obtained for the biological motion task (interaction: F(1,7)=22.6, p<0.005). We conclude that the frame of reference within which both inversion effects operate is an egocentric, probably a retinal, frame of reference.
This research is funded by the Volkswagen Foundation.
Troje, N. F.
Biological motion patterns contain information about the identity of a person, including biological attributes such as gender and age as well as personality traits. The human visual system is highly sensitive to biological motion and capable of extracting this information from it. How is this information encoded in biological motion patterns and how can it be retrieved from them? I will outline a framework that transforms biological motion into a representation within which it can be analysed by means of linear methods from statistics and pattern recognition. Based on this framework, the attributes that transmit information about gender and about basic personality traits are analysed. Furthermore, an ideal observer model for gender classification is constructed and compared to the performance of human observers. Although the artificial classifier generally performs much better than human observers, it turns out that it picks up on similar features as human observers do.
Troje, N. F.
Biological motion contains information about the identity of an agent as well as about his or her actions, intentions, and emotions. The human visual system is highly sensitive to biological motion and capable of extracting socially relevant information from it. Here we investigate the question of how such information is encoded in biological motion patterns and how such information can be retrieved. A framework is developed that transforms biological motion into a representation allowing for analysis using linear methods from statistics and pattern recognition. Using gender classification as an example, simple classifiers are constructed and compared to psychophysical data from human observers. The analysis reveals that the dynamic part of the motion contains more information about gender than motion-mediated structural cues. The proposed framework can be used not only for analysis of biological motion but also to synthesize new motion patterns. A simple motion modeler is presented that can be used to visualize and exaggerate the differences in male and female walking patterns.
Richwien, S., Troje, N. F.
A point-light display of a walking figure conveys information about both the geometry and structure of the moving body and about the dynamics of the movement itself. Human observers can estimate the sex of a point-light walker with reasonable accuracy. In this study we examine whether information about gender is carried by motion-mediated structural information or rather by dynamic cues.
Three groups of observers were presented with point-light displays of 80 individual walkers and had to attribute a gender to each of them. To the first 8 observers the walkers were shown in frontal view (0 deg), another 8 observers saw them in half profile (30 deg) and yet another 8 observers were presented with profile views (90 deg). After a first block of sex ratings of the veridical walkers a second block was shown which contained modifications of the walkers. Those stimuli were either normalized by their structure, thus containing only dynamic information to be used for gender classification or they were normalized by their dynamics, thus containing only structural information.
Both factors affected the number of misclassifications (Orientation: F(2,21)=31.9, p<0.001; Information: F(2,42)=27.2, p<0.001). The interaction was not significant. With respect to the orientation of the walker, performance was best in the 0 deg condition (27% errors on average) and worst in the profile view condition (44% errors). Error rates for the veridical walkers averaged 30% across all orientations. Performance on the dynamic-only stimuli was much better (32% errors) than on the structure-only stimuli (39% errors).
Dynamic information seems to be more important for gender discrimination than structural information and is best accessible in the frontal view.
This research is funded by the Volkswagen Foundation.
Pinnow, M., Engelke, D., Troje, N.F., Schölmerich, A.
Faces, like most objects, change appearance when they are rotated. Adult viewers identify human faces from different viewpoints with remarkable accuracy. While the general interest of infants in faces is well documented, we know little about their ability to identify rotated faces.
Our study investigated three- and six-month-old infants' recognition performance using images of laser-scanned three-dimensional head models preserving the natural texture (Troje & Bülthoff, 1996) as stimuli. 27 infants were habituated in six consecutive trials to the full-face view of either a female or a male face. Following habituation, infants were presented with two dishabituation trials, during which all infants viewed the habituated face from a new angle and a new face from the identical angle. The rotation angles used in this study varied from ±10 deg to ±50 deg.
The 6-month-old infants looked longer at the new face than at the habituated face at the 10 and 20 deg rotations when compared with the 3-month-olds, whereas higher degrees of rotation yielded no differences between the new and the rotated face for either age group.
Our results must be viewed with caution, since many infants did not show the expected drop in fixation time during habituation. Currently, we are modifying the protocol to gather additional data.
Troje, N. F., & Bülthoff, H. H. (1996). Face recognition under varying poses: The role of texture and shape. Vision Research, 36(12), 1761-1771.
Lavrov, M., Troje, N.F.
A familiar person can be recognized by the way he or she moves. We investigated this ability using point-light displays of seven different walkers shown from three different viewpoints. Each observer was presented with only one viewpoint and was trained to name the walkers. During training, diagnostic information was gradually reduced by normalizing the stimuli with respect to their size, their shape, and their speed. We evaluated the influence of those manipulations on the learning curve and on the performance in separate non-reinforced test sessions. Finally, the fully normalized walkers were shown in different orientations in order to test the observers' ability to generalize to new viewpoints. Starting at chance level (14% correct responses), subjects learned quickly to associate the correct names with the stimuli (77% correct on average across all observers after 60 presentations of each walker). Normalizing the walkers by their size impaired performance neither in the learning curve nor in the test sessions. Normalization of the walkers with respect to their shape did cause a slight drop in performance (from 88% to 77%). An additional normalization with respect to walking speed had a stronger effect (from 86% to 71%). After relearning, observers still reached a performance of 85% correct. Performance in the test sessions was somewhat lower. The results confirmed those obtained from the learning curves: size is not used as a cue and shape plays only a minor role. The last test, on viewpoint generalisation, showed that although performance drops considerably even with fully normalized walkers, a walker rotated by 90 deg is still correctly identified in 40% of all trials. This research is funded by the Volkswagen Foundation.
Hill, H., Pollick, F.E., Kamachi, M., Troje, N.F., Watson, T., Johnston, A.
The ways in which we move our faces and bodies are the source of much biologically important information. Such movements can be exaggerated by extending techniques developed for static facial caricature into the temporal domain. Spatial exaggeration of movement is accomplished by first time-normalising the sequences to be exaggerated and then exaggerating the differences between individual frames and an average frame. We can also exaggerate the temporal properties of movement by reversing and extrapolating the time-normalisation step. Previous findings from a variety of domains where exaggeration has been shown to enhance the perception of task-relevant information are reviewed, together with new data showing that spatial exaggeration of emotional utterances relative to an averaged utterance can enhance their perceived happiness, sadness, or angriness. Conversely, it is argued that exaggerating emotional or other individual differences may actually interfere with the recovery of information common to all the sequences being averaged, in this case the lexical content. Lastly, motion exaggeration, like facial caricature, may reflect general underlying principles involved in the encoding and discrimination of biological movement, an as yet poorly understood process.
Cunningham, D. W., Thornton, I. M., Troje, N. F., Bülthoff, H. H.
Biological motion contains many forms of information. Observers are usually able to tell 'what' action is being performed (e.g. walking versus running), 'how' it is being performed (e.g. quickly versus slowly), and by 'whom' (e.g. a young versus an old actor). We used visual search to explore the perception of gender-from-motion. In the first experiment, we used computer-animated, fully rendered human figures in which the structural and dynamic information for gender were factorially combined. In separate blocks, observers were asked to locate a figure walking with a male or female gait among distractors having the same form but opposite motion. In the second experiment, point-light walkers moved along random paths in a 3-D virtual environment. Observers were asked to locate a figure walking with a male or female gait among distractors with the opposite motion. In both experiments, the set size was varied between 1 and 4, and targets were present on 50% of the trials. The results suggest that (a) visual search can be used to explore gender-from-motion, (b) extraction of gender-from-motion is fairly inefficient (search slopes often exceed 500 ms per item), and (c) there appears to be an observer-gender by target-gender interaction, with male observers producing lower RTs for female targets and vice versa.
2001
Papers
Diekamp, B., Hellmann, B., Troje, N. F., Wang, S. R., Güntürkün, O.
A direct projection of the nucleus of the basal optic root (nBOR) onto the nucleus rotundus (Rt) in the pigeon would link the accessory optic system to the ascending tectofugal pathway and could thus combine self- and object-motion processes. In this study, injections of retrograde tracers into the Rt revealed some cells in central nBOR to project onto the ipsilateral Rt. In contrast, injections into the diencephalic component of the ascending thalamofugal pathway resulted in massive labeling of neurons in dorsal nBOR. Single unit recordings showed that visual nBOR units could be activated by antidromic stimulation through the Rt. Successful collision tests applied to nBOR cells revealed that the connection between nBOR and Rt is direct. These data provide strong evidence for a direct and differential projection of nBOR subcomponents onto the thalamic relays of the two ascending visual pathways.
Other Contributions
Symposia and Published Abstracts
Troje, N. F., Jokisch, D.
Humans and animals move through a physical world according to physical laws. One constant quantity that relates temporal and spatial parameters in this context is gravity. This holds, for example, for pendulum motion, in which, under constant gravity, pendulum length and oscillation period stand in a fixed relationship. Biological motion contains elements of pendulum motion as well as other motion components in which spatial and temporal quantities are related in well-defined ways. In our experiments we investigated whether the visual system can use this information to determine the absolute size of a moving animal. We showed our participants point-light displays of real dogs and of synthetic quadruped gaits, played back at different speeds, and asked them to report the animals' size. An animal with slow movements is indeed perceived to be larger than an animal with fast movements. The relationship between movement speed and perceived size reflects the physical relationship between inertial and gravitational forces.
Troje, N. F.
The human visual system is highly sensitive to animate motion patterns. It is able to classify and identify motion patterns on several levels ranging from the recognition of an action to the identification of the actor. How is such information encoded in visual motion data? We present a computational model that transforms visual motion data into a representation that allows identification of diagnostic invariants, and we test the model for its ability to discriminate between human walking and running. The model is based on an algorithm that transforms visual motion data such that they can be successfully analyzed with linear methods from statistics and pattern recognition. The input data can be either three-dimensional trajectories of feature points on the body or their two-dimensional projections. The transformation is based on a linear decomposition of postural data into a few components whose coefficients change with sinusoidal temporal patterns. Different aspects of this representation are diagnostic for different aspects of the motion patterns. We employ a linear discriminant function to classify different gait patterns. We tested the model for its ability to discriminate between 10 walking and 10 running sequences using both 3D motion capture data as well as 2D projections from 12 different viewpoints. The algorithm is robust with respect to viewpoint and with respect to the position of the marker points on the body. Classification errors are below 4%. Discrimination is mainly based on phase relations between the postural components. The algorithm does not employ any a priori knowledge about the articulation of the body or the labels of the features. Together with a video-based tracking algorithm, it can therefore be extended to work directly on video data. Hence, it is not only a powerful model for biological information processing but also has implications for both computer vision and computer graphics.
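The decomposition into postural components with sinusoidal time courses can be sketched roughly as follows. This is an illustrative stand-in with synthetic data, not the published algorithm.

import numpy as np
from sklearn.decomposition import PCA
from scipy.optimize import curve_fit

def sinusoid(t, amp, freq, phase, offset):
    return amp * np.sin(2 * np.pi * freq * t + phase) + offset

rng = np.random.default_rng(3)
t = np.linspace(0.0, 2.0, 200)                      # 2 s of synthetic motion
n_coords = 45                                       # e.g. 15 markers x 3 coordinates
postures = np.column_stack([np.sin(2 * np.pi * 1.0 * t + 0.2 * k) + 0.05 * rng.normal(size=t.size)
                            for k in range(n_coords)])

pca = PCA(n_components=2)
scores = pca.fit_transform(postures)                # time courses of the first postural components

features = []
for k in range(scores.shape[1]):
    p0 = (scores[:, k].std() * np.sqrt(2), 1.0, 0.0, scores[:, k].mean())
    (amp, freq, phase, offset), _ = curve_fit(sinusoid, t, scores[:, k], p0=p0)
    features.extend([amp, freq, phase])
# The resulting amplitudes, frequencies and, above all, phase relations between the
# components are the kind of parameters a linear discriminant could use to separate,
# for example, walking from running sequences.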
Troje, N. F.
Animate motion contains not only information about the actions of a person but also about his or her identity, intentions and emotions. We can recognize a good friend by the way he or she moves and we can attribute age, gender and other characteristics to an unfamiliar person. How is such information encoded in visual motion data and how can it be retrieved by the visual system? We present an algorithm that transforms visual motion data such that they can be successfully approached with linear methods from statistics and pattern recognition. The transformation is based on a linear decomposition of postural data into a few components that change with sinusoidal temporal patterns. The components repeat consistently across subjects such that linear combinations of existing motion data result in smooth, meaningful interpolations. Examples are presented how this model can be used to discriminate between different types of motion (here: walking vs. running) and how it can be used to classify instances of the same type of motion (e.g. walking) in terms of characteristics of the actor (here: gender classification). Since the transformation is reversible the model can also be used for synthesis and modeling of animate motion. Therefore, it serves not only as a model for biological information processing but has also implications for both computer vision and computer graphics.
Troje, N. F.
Efficient detection of the presence of another animate creature has an immense evolutionary significance. Animate motion, however, not only signals the presence of another creature but also carries information about its individual characteristics, actions and intentions. Humans are well able to recognize a good friend by the way he or she moves and an unknown person can easily be classified according to attributes such as gender and age based only on his or her gait patterns.
How is this kind of information encoded in animate motion patterns and how can it be retrieved from them? We present a computational model for the analysis of human gait. This model is used as a tool to analyze animate motion patterns in order to extract diagnostic invariants. It also serves as a model for animate motion perception in the visual system.
Animate motion can be described as a set of feature points (e.g. joints of an articulated body) that change their position in time. The core of the method described here is an algorithm that maps animate motion defined as a set of feature trajectories into a linear space such that linear operations become meaningful. Analogous to methods used to linearize image information (Troje and Vetter, 1998; Vetter and Troje, 1997), linearization of motion data is achieved by dissociating the information into a range-specific part and a domain-specific part (Ramsay and Silverman, 1997). For motion data, the domain-specific part contains information about the timing of the motion, whereas the range-specific part contains information about the position of the features.
The resulting linearized representation allows access to methods from linear statistics and classical pattern recognition that can be used to decode information from the motion patterns. Since the representation is lossless and thus invertible it can also be used to synthesize new, artificial motion patterns with well defined and parameterized features. This not only allows us to generate well-controlled animate motion stimuli for psychophysical and neuroethological investigations but also has applications in computer animation industry.
Ramsay, J. O. and Silverman, B. W. (1997). Functional Data Analysis. New York: Springer.
Troje, N. F. and Vetter, T. (1998). Representations of human faces. In Downward processing in the perception representation mechanism (ed. C. Taddei-Ferretti and C. Musio), pp. 189-205. Singapore: World Scientific.
Vetter, T. and Troje, N. F. (1997). Separation of texture and shape in images of faces for image coding and synthesis. Journal of the Optical Society of America A 14, 2152-2161.
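A toy illustration of the range/domain dissociation described above: two gait cycles with different timing are resampled onto a common phase axis (the domain-specific, timing part), leaving posture-over-phase curves (the range-specific part) that can then be combined linearly. The data and the simple linear resampling are placeholders, not the published procedure.

import numpy as np

def resample_to_common_phase(signal, n_phase=100):
    """Map a single gait cycle onto n_phase equally spaced phase values in [0, 1)."""
    phase_in = np.linspace(0.0, 1.0, signal.size, endpoint=False)
    phase_out = np.linspace(0.0, 1.0, n_phase, endpoint=False)
    return np.interp(phase_out, phase_in, signal)

# Two synthetic knee-angle cycles recorded with different durations / sample counts.
slow_cycle = np.sin(2 * np.pi * np.linspace(0, 1, 140, endpoint=False))
fast_cycle = np.sin(2 * np.pi * np.linspace(0, 1, 90, endpoint=False))

# Domain-specific part: the original cycle durations (timing), illustrative values.
durations = {"slow": 1.4, "fast": 0.9}   # seconds

# Range-specific part: posture as a function of normalized phase.
slow_norm = resample_to_common_phase(slow_cycle)
fast_norm = resample_to_common_phase(fast_cycle)

# After normalization, linear operations such as averaging or morphing become meaningful.
average_posture_curve = 0.5 * (slow_norm + fast_norm)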
Troje, N. F.
Pigeons as well as many other birds show a characteristic head-bobbing motion during walking, pecking and other behaviours: hold-phases during which the retraction of the head compensates the forward motion of the body alternate with thrust-phases during which the head is saccadically extended forward. Based on the observed stabilization of the head during the hold-phase of the walking pigeon, head-bobbing has been interpreted as an optokinetic response.
Head-bobbing also occurs during flight. In that case, however, the bird travels with a speed that cannot be compensated by the retraction of the head. Our hypothesis is that head-bobbing is not only an optokinetic reaction that serves image stabilization but that it is also used to derive depth information from differential motion parallax. A flying bird has the problem that it does not have direct access to information about its own velocity over ground -- a prerequisite needed to compute depth from standard motion parallax. Proprioceptive systems can provide information about velocity relative to the surrounding air, but motion of the medium itself (wind, convection) can add a further velocity component. Differential motion parallax computation might be the solution to that problem: the eye is moved with two different velocities. Comparing the difference between these velocities with the difference between the corresponding retinal image velocities can reveal distance.
We tested this hypothesis by measuring the pigeon's head motion during the approach to a perch. In this situation precise distance information is particularly important to the bird. However, to induce optic flow of the landing site on the retina, the motion of the eye has to have a component perpendicular to the line along which the pigeon travels. The data show that head-bobbing does indeed contain such a component. The birds move their head not on a straight line from one hold position to the next but on a curved, U-shaped trajectory that would induce substantial optical flow of the landing site on the retina. This research is funded by the Volkswagen Foundation.
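The underlying geometry can be stated compactly under simplifying assumptions (landing site viewed roughly side-on, head excursions small relative to the viewing distance). A head velocity component perpendicular to the line of sight produces a retinal angular velocity

\[ \omega = \frac{v_{\perp}}{d}, \]

so translation purely along the line of travel yields almost no flow of the landing site, whereas the curved, U-shaped head path does. Executing that path with two different speeds then allows distance to be recovered from differences alone,

\[ d = \frac{v_{\perp,1} - v_{\perp,2}}{\omega_{1} - \omega_{2}}, \]

without knowledge of the absolute velocity over ground. This is a simplified, small-angle formulation added for illustration.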
Thornton, I.M., Cunningham, D.W., Troje, N.F., Bülthoff, H.H.
Johansson's (1973) point-light walkers remain one of the most compelling demonstrations of how motion can determine the perception of form. Most studies of biological motion perception have presented isolated point-light figures in unstructured environments. Recently we have begun to explore the perception of human motion using more naturalistic displays and tasks. Here, we report new findings on the perception of gender using a visual search paradigm. Three-dimensional walking sequences were captured from human actors (4 male, 4 female) using a 7-camera motion capture system. These sequences were processed to produce point-light computer models which were displayed in a simple virtual environment. The figures appeared in a random location and walked on a random path within the bounds of an invisible virtual arena. Walkers could face and move in all directions, moving both in the frontal parallel plane and in depth. In separate blocks observers searched for a male or a female target among distractors of the opposite gender. Set size was varied between 1 and 4. Targets were present on 50% of trials. Preliminary results suggest that both male and female observers can locate targets of the opposite gender faster than targets of the same gender.
Loidolt, M., Huber, L., Troje, N.F.
In laboratory experiments, pigeons (Columba livia) have been shown to categorize a variety of visual stimulus classes. Our aim is to investigate which aspects of the class members pigeons employ for successful discrimination. Recently, we found that the pigeons' ability to discriminate between photorealistic frontal images of human faces on the basis of sex was predominantly based on information contained in the visual texture of the images rather than in their configural properties. The subjects' substantial failure to utilize the shape information contained in human faces was surprising. However, in these experiments we only used static images, so an important attribute of natural stimuli was excluded: parallactic motion. At least in humans, motion is one of the most efficient cues for shape perception. Here we report experiments conducted in order to investigate whether or not motion supports the use of shape information in pigeons. Using laser-scanned 3D models of human faces, we compared the classification performance of three groups of pigeons. One group was trained to discriminate faces rotating around the vertical axis. A second group was trained with a single static view of each face. The subjects of a third group were trained with multiple static images showing the faces from all the viewpoints visible in the rotating face stimuli, so they were presented with all the information inherent in these stimuli except coherent motion itself. The results of these experiments will be presented and their implications for further work on this topic will be discussed.
Keywords: visual categorization, pigeon, motion, configural shape, perception
Jokisch, D., Troje, N.F.
Gravity is a constant force acting on all motion in the environment. In many cases gravity defines a fixed relationship between temporal and spatial parameters of a movement (e.g. between the length and the period of a pendulum). Terrestrial locomotion of animals is also subject to this relationship. Locomotion requires precious energy, and in order to keep its consumption low, gait patterns are characterized by a periodic shifting of the total energy between kinetic, potential and elastic states. In a psychophysical experiment we investigated to what extent the human visual system is able to extract speed-mediated size information from biological motion patterns of animals. Using a motion capture system (Vicon 512, Oxford Metrics), we recorded the 3D trajectories of marker points attached to the joints of three dogs of different sizes moving through the capture volume at moderate speed. Point-light displays of these biological motion patterns were generated by animating sagittal-view projections of the marker positions on a computer monitor. In addition to the original playback speed, two slower and two faster playback speeds were shown. The position of the animation on the screen remained unchanged, creating the impression that the dog was walking on a treadmill. The size of all animations was kept constant (8.5 degrees of visual angle). The participants' task was to report the absolute size of the dog (shoulder height) in centimetres. Both the true size of the dog and the playback speed had a significant effect on the size estimates in the expected direction. Animations of larger dogs were indeed perceived as larger than those of smaller dogs (F(1,30)=30.8, p<0.001). In animations presented at reduced speed the dogs were perceived as larger than in those presented at increased speed (F(4,60)=37.4, p<0.001). We conclude that human visual perception can use the effect of gravitational forces as a cue for size perception in biological motion patterns. Further studies are needed to determine the exact psychophysical relationship between playback speed and perceived size.
Jokisch, D., Midford, P.E., Troje, N.F.
Gravity is a constant force that affects motion in the physical world. This is particularly true for animate motion, since animals try to energetically optimize their gait by shifting energy between different states such as kinetic, potential, and elastic energy. Constant gravity defines a fixed relation between temporal and spatial measures in motions such as pendulum motion and ballistic motion. In a psychophysical experiment, we tested whether the human visual system can retrieve size information mediated by gravity from dynamic point-light displays of animal locomotion. We used a motion capture system (Vicon 512, Oxford Metrics) to acquire the 3D trajectories of marker points attached to the major joints of three dogs of different sizes. Biological motion point-light displays were created by animating these data on a computer screen in a sagittal view. Each animation was presented for one second on a monitor at five different playback speeds. The position of the animation remained in the center of the screen. The retinal sizes of all dog animations were identical (8.5 degrees of visual angle). Subjects had to estimate the absolute size of the dog in terms of its shoulder height in centimeters. Both the true size of the dog and the playback speed significantly affected the size estimate in the expected direction. Animations from larger dogs were perceived to be larger than animations from smaller dogs (F(1,30)=30.8, p<0.001). Dogs presented in slow motion were perceived to be larger than dogs in fast motion (F(4,60)=37.4, p<0.001). We conclude that human visual perception is able to use gravitational acceleration as a cue for size perception in biological motion displays. Further research is necessary to find out the exact psychophysical relationship between speed of biological motion and perceived size.
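The quantitative logic of the speed manipulation follows from the pendulum relation mentioned above. Treating limb motion, very roughly, as a gravitational pendulum of length L,

\[ T = 2\pi\sqrt{\tfrac{L}{g}} \quad\Longrightarrow\quad L = \frac{g\,T^{2}}{4\pi^{2}} , \]

playing a sequence back at k times its original speed scales all apparent periods to T/k, so an observer assuming terrestrial gravity should infer linear dimensions scaled by 1/k²: slowed playback (k < 1) implies a larger animal, in qualitative agreement with the reported size judgements. This is a simplified illustration, not the exact psychophysical relationship the abstract leaves open.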
Buschmann, J-U.F., Troje, N.F.
The retinal image is a two-dimensional projection of the three-dimensional world. Information about the third dimension, i.e. the depth of an object or a scene, originates from several cues. One of them is the shading caused by directional light. Here, we demonstrate a compelling visual illusion that is probably caused by a misinterpretation of shading information. A three-dimensional head model was stretched along the axis perpendicular to the symmetry plane by a factor of cos(25)/cos(35) and was compressed along the horizontal axis in the symmetry plane by a factor of sin(25)/sin(35). If this head is rendered in orthographic projection with an orientation of 35° (with respect to the frontal view) using only ambient light, it results in exactly the same image as the projection of the undistorted head projected with 25° orientation. Instead of ambient light we use directional light. Surprisingly, rather than revealing the distortion in depth, directional illumination causes the projected image of the distorted head to appear considerably wider. We quantitatively measured this illusion by determining the point of subjective equivalence (PSE) using a fixed-interval nulling paradigm. From both images (the undistorted and the distorted head) we derived a series of images with constant height but with a width that ranged in nine steps from 93% to 107% of the original width. In a series of 180 trials these images were simultaneously shown together with the reference head (undistorted, 25° orientation, 100% width). Subjects had to indicate which of the heads appeared to be wider. Psychometric functions were fitted to the data and PSEs were determined. The mean PSE across 15 subjects was 3.2%. As revealed by the psychometric function for the undistorted head, identical images that differ in their width by that amount were correctly discriminated at a rate of 90%. We discuss the observed illusion in the context of shape-from-shading models.
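The particular distortion factors follow directly from orthographic projection. In head-centred coordinates with x perpendicular to the symmetry plane and z along the horizontal axis within it, a rotation of the head by θ about the vertical axis maps a point to the horizontal image coordinate u = x cos θ + z sin θ. With the scalings a = cos 25°/cos 35° applied to x and b = sin 25°/sin 35° applied to z, rendering the distorted head at 35° gives

\[ u' = a\,x\cos 35^{\circ} + b\,z\sin 35^{\circ} = x\cos 25^{\circ} + z\sin 25^{\circ}, \]

which is exactly the image coordinate of the undistorted head at 25°; vertical coordinates are unaffected by the rotation, so the two renderings coincide under purely ambient illumination. (The coordinate convention here is an assumption chosen for illustration.)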
Bergert, S., Hausmann, M., Troje, N.F., Güntürkün, O.
In an experiment on the lateralization of face perception, we presented faces to 16 men and 16 women in a visual half-field paradigm with a same-different task. With this experiment we addressed the question of how the features SHAPE and TEXTURE influence the lateralization of face perception. To this end we used faces that differed with respect to these two attributes. We expected an advantage of the right hemisphere for SHAPE and of the left hemisphere for TEXTURE. However, there was no direct interaction between stimulus attribute (SHAPE/TEXTURE) and visual half-field (RVF/LVF), but rather an interaction with the sex of the participant. Men made fewer errors for SHAPE in the RVF and fewer errors for TEXTURE in the LVF. In women the lateralization pattern was reversed. These results suggest that there is no simple sex difference in functional cerebral asymmetry; rather, the asymmetry is determined by stimulus attributes.
2000
Papers
Troje, N. F., Frost, B. J.
The head movement of a walking pigeon Columba livia is characterized by two alternating phases, a thrust phase and a hold phase. While the head is rapidly thrust forward during the thrust phase, it has been shown repeatedly that it remains virtually motionless with respect to translation along a horizontal axis (roll axis) during the hold phase. It has been shown that the stabilization during the hold phase is under visual control. This has led to the view that the pigeon's head-bobbing is an optokinetic response to stabilize the retinal image during the hold phase. However, it has never been shown explicitly that the head is really held stable in space with respect to other translatory or rotatory dimensions. Using videography, we show here that this is in fact the case: except for a small but systematic slip that presumably serves as an error signal for retinal image stabilization, the head of the pigeon remains locked in space not only with respect to the horizontal (roll) axis but also with respect to vertical translation (along the yaw axis) and with respect to rotation around the pitch and yaw axes.
Huber, L., Troje, N. F., Loidolt, M., Aust, U., Grass, D.
Recently (Troje, Huber, Loidolt, Aust, & Fieder, 1999), we found that pigeons discriminated between large sets of photorealistic frontal images of human faces on the basis of sex. This ability was predominantly based on information contained in the visual texture of those images rather than in their configural properties. The pigeons could learn the distinction even when differences of shape and average intensity were completely removed. Here, we probed more specifically the pigeons' flexibility and efficiency in utilizing the class-distinguishing information contained in complex natural classes. First, we used principal component as well as discriminant function analysis in order to determine which aspects of the male and female images could support successful categorization. We then conducted various tests involving systematic transformations and reduction of the feature content to examine whether or not the pigeons' categorization behaviour comes under the control of category-level feature dimensions -- that is, those stimulus aspects that most accurately divide the stimulus classes into the experimenter-defined categories of "Male" and "Female". Enhanced classification ability in the presence of impoverished test faces that varied only along one of the first three principal components provided evidence that the pigeons used these class-distinguishing stimulus aspects as a basis for generalization to new instances.
Symposia and Published Abstracts
Troje, N. F., Hausmann, M.
The information contained in the image of a human face can be subdivided into components attributed to visual texture ("texture") and components attributed to configural relations ("shape"). Here, we present evidence for a dissociation of texture and shape processing in a same-different face recognition task by showing that hemispheric laterality is different for those two domains. Ten students participated in two experiments. In both experiments the two images of one trial differed only in their visual texture or they differed only in their shape. In Exp. 1 the first image was shown 6.3 degrees either to the left or to the right of a fixation point and the second image was shown centrally (lateralized encoding). In Exp. 2 the first image was shown centrally whereas the second was presented peripherally (lateralized retrieval). In Exp. 1 shape processing shows a left-hemispheric dominance whereas texture processing shows a right-hemispheric dominance. In Exp. 2 the opposite is the case.
1999
Papers
Troje, N. F., Kersten, D.
The question of whether object representations in the human brain are object-centered or viewer-centered has motivated a variety of experiments with divergent results. A key issue concerns the visual recognition of objects seen from novel views. If recognition performance depends on whether a particular view has been seen before, it can be interpreted as evidence for a viewer-centered representation. Previous experiments used unfamiliar objects to provide the experimenter with complete control over the observer's previous experience with the object. In this study, we tested whether human recognition shows viewpoint dependence for the highly familiar faces of well-known colleagues and for the observer's own face. We found that observers are poorer at recognizing their own profile, whereas there is no difference in response time between frontal and profile views of other faces. This result shows that extensive experience and familiarity with one's own face is not sufficient to produce viewpoint invariance. Our result provides strong evidence for viewer-centered representation in human visual recognition even for highly familiar objects.
Symposia and Published Abstracts
Troje, N. F., Frost, B. J.
The head movement of a walking pigeon Columba livia is characterized by two alternating phases, a thrust phase and a hold phase. While the head is rapidly thrust forward during the thrust phase, it has been shown repeatedly that it remains virtually motionless with respect to translation along a horizontal axis (roll axis) during the hold phase. It has been shown that the stabilization during the hold phase is under visual control. This has led to the view that the pigeon's head-bobbing is an optokinetic response to stabilize the retinal image during the hold phase. However, it has never been shown explicitly that the head is really held stable in space with respect to other translatory or rotatory dimensions. Using videography, we show here that this is in fact the case: except for a small but systematic slip that presumably serves as an error signal for retinal image stabilization, the head of the pigeon remains locked in space not only with respect to the horizontal (roll) axis but also with respect to vertical translation (along the yaw axis) and with respect to rotation around the pitch and yaw axes.
1998
Papers
Troje, N. F., Siebeck, U.
Changing the position of a light source illuminating a human face induces an apparent shift of the perceived orientation of that face. The direction of this apparent shift is opposite to the shift of the light source. We demonstrated this illumination-induced apparent orientation shift (IAOS), quantified it in terms of the physical orientation shift needed to compensate for it, and evaluated the results in the context of possible mechanisms underlying orientation judgment. Results indicate that IAOS depends not only on the angle between the two light source positions, but also on the mean orientation of the face. Availability of cues coded in the visual texture of the face did not affect IAOS. The most effective cue was the location of the visible outline of the face. IAOS seems to be due to a shift of this outline when shadowed areas of the face merge with the black background. We conclude that an important mechanism for orientation judgment is based on a comparison of the visible parts left and right of the profile line.
Troje, N. F., Bülthoff, H. H.
The role of bilateral symmetry in face recognition is investigated in two psychophysical experiments using a same/different paradigm. The results of Experiment 1 confirm the hypothesis that the ability to identify mirror-symmetric patterns is used for viewpoint generalization by approximating the view symmetric to the learned view with its mirror-reversed image. The results of Experiment 2 show that the match between the virtual view and the test image is performed directly between the images. Performance drops dramatically if the symmetry between the intensity patterns of the learning and the testing view is disturbed by an asymmetric illumination, although the symmetry between the spatial arrangement of high-level features is retained. The experimental results are discussed in terms of their relation to existing approaches to object recognition.
Braje, W., Kersten, D., Tarr, M. J., Troje, N. F.
How do observers recognize faces despite dramatic image variations that arise from changes in illumination? This paper examines 1) whether face recognition is sensitive to illumination direction, and 2) whether cast shadows improve performance by providing information about illumination, or hinder performance by introducing spurious edges. In Experiment 1, observers judged whether 2 sequentially-presented faces, illuminated from the same or different directions, were the same or different individuals. Cast shadows were present for half of the observers. Performance was impaired by a change in the illumination direction and by the presence of shadows. In Experiment 2, observers learned to name 8 faces under one illumination direction (left/right) and one cast-shadow condition (present/absent); they were later tested under novel illumination and shadow conditions. Performance declined for unfamiliar illumination directions, but not for unfamiliar shadow conditions. The finding that face recognition is illumination dependent is consistent with the use of image-based representations. The results indicate that face recognition processes are sensitive to either the direction of lighting or the resultant pattern of shading, and that cast shadows can hinder recognition, possibly by masking informative features or leading to spurious contours.
Proceedings
Troje, N. F., Vetter, T.
Several models for parameterized face representations have been proposed in recent years. A simple coding scheme treats the image of a face as a long vector with each entry coding for the intensity of one single pixel in the image (e.g. Sirovich & Kirby, 1987). Although simple and straightforward, such pixel-based representations have several disadvantages. We propose a representation for images of faces that separates texture and 2D shape by exploiting pixel-by-pixel correspondence between the images. The advantages of this representation compared to pixel-based representations are demonstrated by means of the quality of low-dimensional reconstructions derived from principal component analysis and by means of the performance that a simple linear classifier can achieve for sex classification.
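The correspondence-based separation and the linear classifier can be sketched schematically as follows. The arrays are random placeholders standing in for the outputs of a correspondence (optic-flow) computation, which is outside the scope of the sketch, and logistic regression is used as a generic linear classifier.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_faces, n_pixels = 200, 64 * 64

# Placeholder inputs that would come from a pixel-by-pixel correspondence algorithm:
shape = rng.normal(size=(n_faces, 2 * n_pixels))    # (dx, dy) displacement per pixel
texture = rng.normal(size=(n_faces, n_pixels))      # shape-normalized intensities
sex = rng.integers(0, 2, n_faces)                   # stand-in labels

# Separate low-dimensional models for shape and texture.
shape_coeffs = PCA(n_components=20).fit_transform(shape)
texture_coeffs = PCA(n_components=20).fit_transform(texture)

# A simple linear classifier on the combined coefficients; with real data this is
# where the benefit of separating texture and 2D shape over raw pixels would show up.
features = np.hstack([shape_coeffs, texture_coeffs])
clf = LogisticRegression(max_iter=1000).fit(features, sex)
print("training accuracy:", clf.score(features, sex))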
Troje, N. F.
Human faces are approximately bilaterally symmetric. We study the ability to generalize to novel views of human faces, focusing on the role of that symmetry. Our hypothesis is that the ability to identify mirror symmetric images is used for viewpoint generalization by approximating the symmetric view of a learned view by its mirror symmetric image. Two psychophysical experiments are performed using a same/different paradigm. Experiment 1 shows that generalization to the symmetric view is better than generalization to otherwise different views. If the symmetric view is replaced by the mirror reversed learning view, performance further increases. Experiment 2 shows that the match between the learned view and the testing image is performed directly on the level of the images. Performance drops significantly if the symmetry between the intensity patterns of learning and testing view is disturbed by an asymmetric illumination, although the symmetry between the spatial arrangement of high-level features is retained. We show that a simple image-based model can explain important aspects of the data and we show how this model can be extended towards a general algorithm for image comparison. Experimental results are discussed in terms of their relation to existing approaches to object recognition.
Technical Reports
Troje, N. F., Kersten, D.
The question of whether object representations in the human brain are object-centered or viewer-centered has motivated a variety of experiments with divergent results. A key issue concerns the visual recognition of objects seen from novel views. If recognition performance depends on whether a particular view has been seen before, it can be interpreted as evidence for a viewer-centered representation. Previous experiments used unfamiliar objects to provide the experimenter with complete control over the observer's previous experience with the object. In this study, we tested whether human recognition shows viewpoint dependence for the highly familiar faces of well-known colleagues and for the observer's own face. We found that observers are poorer at recognizing their own profile, whereas there is no difference in response time between frontal and profile views of other faces. This result shows that extensive experience and familiarity with one's own face is not sufficient to produce viewpoint invariance. Our result provides strong evidence for viewer-centered representation in human visual recognition even for highly familiar objects.
Troje, N. F., Huber, L., Loidolt, M., Aust, U., Fieder, M.
Pigeons are known to be able to categorize a wide variety of visual stimulus classes. However, the characteristics of the perceptually relevant features employed to reach such good performance remain unclear. Here, we investigate the relative contributions of texture and shape information to categorization decisions about complex natural classes. We trained three groups of pigeons to discriminate between sets of photorealistic frontal images of human faces according to sex and subsequently tested them on different stimulus sets. Only the pigeons that were presented with texture information were successful at the discrimination task. Pigeons seem to possess a sophisticated texture processing system but are less capable of discriminating shapes. The results are discussed in terms of the possible evolutionary advantages of utilizing texture as a very general and potent perceptual dimension in the birds' visual environment.
Other Contributions
Symposia and Published Abstracts
Loidolt, M., Aust, U., Huber, L., Troje, N.F., Fieder, M.
Pigeons (Columba livia) are known for their ability to categorize a large number of highly diverse visual stimulus classes in learning experiments. To this day, however, it is not clear which features in the images are perceptually relevant for the pigeon. Using a categorization task with complex natural images, we investigated the role played by the shape and texture information contained in the images. The pigeons (n=24) were trained to discriminate images of human faces according to sex. We used a kind of image representation that allows texture and 2D shape information in the faces to be separated ("correspondence-based representation"; see Vetter, T. & Troje, N.F. 1995. In: Mustererkennung. Springer, 118-125). The first part of our study consisted of a comparison of the classification performance of three groups of pigeons (8 animals each), each of which was trained with a different version of the face images that differed with respect to texture and shape information. One group (ORIGINAL) was trained with the original images, the second group (TEXTURE) with images of faces that all had identical shape and differed only in texture information, and the third group (SHAPE) with images of faces that differed only in shape but all carried the same texture information. 100 images (50 male and 50 female faces) were used for the classification training, and 100 further images for the subsequent classification tests. The results clearly show that the pigeons mainly used the texture information contained in the images. While the animals of the ORIGINAL and TEXTURE groups quickly reached a high level of classification performance in training and showed good transfer in the generalization test with new images, the results of the SHAPE group were significantly worse. We were able to show not only that texture features provide pigeons with sufficient information to classify a complex natural category with ease, but also that, in contrast to humans, they clearly prefer texture features over shape features. Apparently, pigeons have a highly developed texture recognition system, which could be important for viewpoint-independent object recognition.
Huber, L., Troje, N.F., Loidolt, M., Aust, U.
Studies based on anthropocentric cognitive models of categorisation and computer simulations have failed to explain the ways in which animals analyse and group natural stimuli. Among animal researchers, it is commonly agreed that stimuli that may appear complex to the experimenter might be classified by the subjects using very simple image properties. Furthermore, a fundamental problem posed by natural categorisation is how organisms such as the pigeon, which are devoid of language and presumably also of further high-level capacities, can rapidly extract abstract invariances from stimulus classes containing instances so variable that neither the class rule nor the exemplar can be physically described. An example of such a class is the human face. Despite innumerable attempts, it remains to be determined how humans sort faces into psychologically relevant classes. However, it is possible that birds categorise this class of stimuli in much simpler, though still successful ways. Our pigeons rapidly sorted two hundred photo-realistic frontal views of human faces according to sex. Using a correspondence-based representation of faces, we dissociated the information about an item into two parts: one coding for the spatial configuration of its features, and the other for its particular appearance. The results of the original discrimination training clearly indicated that pigeons preferred to exploit the surface properties of faces to their spatial properties. Furthermore, subsequent transfer tests revealed that within the surface domain they used overall brightness, colour, brightness gradients (ratio between the top and bottom half of the face), and shading.
1997
Papers
Vetter, T., Troje, N. F.
Human faces differ in shape and texture. Image representations based on such a separation have been reported by several authors [for review, see Beymer and Poggio (1996)]. This paper investigates such a representation of human faces based on a separation of texture and two-dimensional shape information. Texture and shape were separated using pixel-by-pixel correspondence between the different images, which was established through algorithms known from optical flow computation. The paper demonstrates the improvement of the proposed representation over well-established pixel-based techniques in terms of coding efficiency and in terms of the ability to generalize to new images of faces. The evaluation is performed by calculating different distance measures between the original image and its reconstruction and by measuring the time human subjects need to discriminate them.
O'Toole, A., Vetter, T., Troje, N. F., Bulthoff, H. H.
The sex of a face is perhaps its most salient feature. A principal components analysis (PCA) was applied separately to the three-dimensional (3-D) structure and gray-level image (GLI) data from laser-scanned human heads. Individual components from both analyses captured information related to the sex of the face. Notably, single projection coefficients characterized complex differences between the 3-D structure of male and female heads and between male and female GLI maps. In a series of simulations, the quality of the information available in the 3-D head versus GLI data for predicting the sex of the face was compared. The results indicated that the 3-D head data supported more accurate sex classification than the GLI data, across a range of PCA-compressed (dimensionally reduced) representations of the heads. This kind of dual face representation can give insight into the nature of the information available to humans for categorizing and remembering faces.
Symposia and Published Abstracts
Troje, N. F., Siebeck, U.
Purpose: Judging the orientation of bilateral objects, e.g. faces, is a task that humans perform every day. In order to achieve some understanding of the mechanisms underlying this ability, we investigated the influence of directional light on perceived orientation. Changing the illuminant's position induces a substantial apparent orientation shift. We documented this phenomenon and measured its size as a function of the mean orientation of the face and the angle between the two light directions. To ensure that the phenomenon is based on the perceived orientation of the face rather than on gaze direction, we repeated the experiment using surface models with constant albedo.
Methods: Using a 2AFC paradigm, the participants decided whether a sequence of two images appeared to rotate to the left or to the right. The images showed the same face in different orientations and illuminated from different directions. Between the two images, a distractor was shown for 2 sec. The physical rotation nullifying the illumination-induced apparent orientation shift was determined. Results: Depending on the set of parameters used, an apparent orientation shift of up to 10 degs can be induced. The effect is minimal when the face is shown in a frontal view and maximal if the face is oriented 30 to 45 degs from frontal. The increase of the apparent orientation shift with the angle between the two light directions saturates at an angle of 60 degs between the light directions. The phenomenon does not require photorealistically textured stimuli. Using surface models with constant albedo yields similar results.
Conclusions: The results are discussed in the context of possible strategies for orientation judgement. The finding that the magnitude of the effect is dependent on the mean orientation of the face implies that the effect is unlikely to be based only on local surface attitude judgements. Rather, we favor a model which assumes that first the symmetry plane of the face is detected, and then a comparison is made between the visible parts on both sides of this plane to estimate orientation.
Siebeck, U., Troje, N.F.
How do people estimate the orientation of other people's faces? We observed that two images of a face seen from the same orientation, but illuminated from different angles, appeared to have different orientations. The first experiment was designed to document and quantify this phenomenon with respect to the average orientation of the face. The images were rendered on a black background that made it impossible to discriminate the shadowed facial parts from the background. We determined the physical orientation shift necessary to compensate for the illumination-induced effect. Results showed that the measured illumination-induced apparent orientation shift (IAOS) correlates positively with the average orientation of the face and reaches values of up to 9°. This correlation implies that the mechanism is not based on local surface attitude judgements. We propose a model in which the symmetry plane of the face is detected, and then a comparison is made between the visible parts on both sides of this plane. The effect of the shadow occluding parts of the faces would then be responsible for the apparent orientation shift. To test this hypothesis we repeated the first experiment using a background colour that allowed subjects to perceive the true outline of the faces. We found that the IAOS was reduced to values of less than 2° and no longer depended on the average orientation of the faces. The results imply that orientation may be judged by comparing the size of the visible parts of the left and right halves of the face.
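The proposed comparison between the visible parts of the two face halves can be caricatured as follows; this is purely our own toy illustration with an arbitrary gain factor, not the authors' model.

```python
# Toy sketch: estimate head orientation from the visible areas on either
# side of the facial symmetry plane.  A shadow hiding part of one half
# biases the estimate, which is the proposed source of the
# illumination-induced apparent orientation shift (IAOS).
import numpy as np

def orientation_from_halves(visible_mask, axis_col, gain=90.0):
    """Crude orientation estimate (degrees) from a binary mask of visible
    face pixels, given the image column of the facial symmetry axis.
    The gain factor is an arbitrary placeholder."""
    left = visible_mask[:, :axis_col].sum()
    right = visible_mask[:, axis_col:].sum()
    asymmetry = (right - left) / max(right + left, 1)   # in [-1, 1]
    return gain * asymmetry

# Example: a face mask with part of the left half occluded by shadow.
mask = np.ones((64, 64), dtype=bool)
mask[:, :10] = False                 # shadowed region on the left half
print(orientation_from_halves(mask, axis_col=32))
```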
1996
Papers
Troje, N. F., Bulthoff, H. H.
Although remarkably robust, face recognition is not perfectly invariant to pose and viewpoint changes. It has long been known that both profile and full-face views result in poorer recognition performance than a ¾ view. However, little data exist which investigate this phenomenon in detail. The present work provides such data using a high angular resolution and a large range of poses. Since there are inconsistencies in the literature concerning these issues, we emphasize the different roles of the learning view and the testing view in the recognition experiment. We also emphasize the roles of information contained in the texture and in the shape of a face. Our stimuli were generated from laser-scanned head models and contained either the natural texture or only Lambertian shading and no texture. The results of our same/different face recognition experiments are: (1) only the learning view but not the testing view affects recognition performance. (2) For textured faces the optimal learning view is closer to the full-face view than for the shaded faces. (3) For shaded faces, we find a significantly better recognition performance for the symmetric view. The results can be interpreted in terms of different strategies to recover invariants from texture and from shading.
Kersten, D., Troje, N. F., Bulthoff, H. H.
We show a cylindrical projection of the human head. This projection is ambiguous with respect to head pose. Viewing such a projection produces perceptual competition for a few discrete views.
Technical Reports
Troje, N. F., Vetter, T.
Several models for parameterized face representations have been proposed in recent years. A simple coding scheme treats the image of a face as a long vector with each entry coding for the intensity of one single pixel in the image (e.g. Sirovich & Kirby 1987). Although simple and straightforward, such pixel-based representations have several disadvantages. We propose a representation for images of faces that separates texture and 2D shape by exploiting pixel-by-pixel correspondence between the images. The advantages of this representation compared to pixel-based representations are demonstrated by means of the quality of low-dimensional reconstructions derived from principal component analysis and by means of the performance that a simple linear classifier can achieve for sex classification.
Symposia and Published Abstracts
Troje, N.F., Vetter, T.
In human perception, as well as in machine vision, a crucial step in solving any object recognition task is an appropriate description of the object class under consideration. We emphasise this issue when considering the object class 'human faces'. We discuss different representations that can be characterised by the degree of alignment between the images they provide for. The representations used span the whole range between a purely pixel-based image representation and a sophisticated model-based representation derived from the pixel-to-pixel correspondence between the faces [Vetter and Troje, 1995, in Mustererkennung Eds G Sagerer, S Posch, F Kummert (Berlin: Springer)]. The usefulness of these representations for sex classification was compared. This was done by first applying a Karhunen-Loève transformation to the representation to orthogonalise the data. A linear classifier was trained by means of a gradient-descent procedure. The classification error in a completely cross-validated simulation ranged from 15% in the simplest version of the pixel-based representation to 2.5% for the correspondence-based representation. However, even with intermediate representations very good performance was achieved.
Troje, N. F., Bülthoff, H. H.
Purpose: Recently, we investigated the ability of human observers to generalize to novel views of a learned face (Troje & Bülthoff, 1996, Vision Research, in press). Among other results, we made the observation that generalization to views that are symmetric with respect to the frontal view is much better than to otherwise different views. Here, we present new psychophysical experiments investigating the nature of this performance. In particular, our question is whether this performance is based on the bilateral symmetry of the 3D object or on the resulting mirror symmetry of the images.
Methods: Two experiments were performed. Both used a SAME/DIFFERENT recognition paradigm in which two images of faces were shown in immediate succession. The subject then decided whether or not the two images showed the same person. The images were made from 3D head models and showed the face without its natural texture, only applying a Lambertian shading model. This allows decoupling the symmetry of the view from the mirror symmetry of the image. Symmetric views yield mirror symmetric images only if the simulated light source in both images comes from the direction of the camera or if it is also symmetric with respect to this direction. However, if the light comes from the same side, the images taken from symmetric viewpoints are no longer mirror symmetric. In Experiment 1, training and testing views were either identical or symmetric. The light direction was also either identical or symmetric, yielding four different conditions. In Experiment 2, the light always came from the direction of the camera. Training and testing views were either identical, symmetric or otherwise different. In a fourth condition, the flipped, perfectly mirror symmetric image was used for testing instead of the symmetric view.
Results: Experiment 1. If both viewpoint and illumination were symmetric, performance was much better (p<0.005) than when only the view or the illumination changed. Experiment 2. Generalization to the perfectly mirror symmetric image was even better (p<0.05) than to the symmetric view. Conclusions: The better generalization to the symmetric view of a head model is not based on knowledge about the almost bilaterally symmetric three-dimensional structure of the head but rather on the simple image operation of identifying mirror symmetric images.
O'Toole, A., Vetter, T., Troje, N.F., Bülthoff, H.H.
Purpose: We compared the quality of information available in 3D surface models versus texture maps for classifying human faces by sex. Methods: 3D surface models and texture maps from laser scans of 130 human heads (65 male, 65 female) were analyzed with separate principal components analyses (PCAs). Individual principal components (PCs) from the 3D head data characterized complex structural differences between male and female heads. Likewise, individual PCs in the texture analysis contrasted characteristically male vs. female texture patterns (e.g., presence/absence of facial hair shadowing). More formally, representing faces with only their projection coefficients onto the PCs, and varying the subspace from 1 to 50 dimensions, we trained a series of perceptrons to predict the sex of the faces using either the 3D or texture data. A "leave-one-out" technique was applied to measure the generalizability of the perceptron's sex predictions. Results: While very good sex generalization performance was obtained for both representations, even with very low dimensional subspaces (e.g., 76.1% correct with only one 3D projection coefficient), the 3D data supported more accurate sex classification across nearly the entire range of subspaces tested. For texture, 93.8% correct sex generalization was achieved with a minimum subspace of 20 projection coefficients. For 3D data, 96.9% correct generalization was achieved with 17 projection coefficients. Conclusions: These data highlight the importance of considering the kinds of information available in different face representations with respect to the task demands.
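The evaluation procedure described above can be sketched roughly as follows; the data here are synthetic placeholders, the feature dimensionality and the tested subspace sizes are arbitrary, and a least-squares linear unit stands in for the perceptron actually used.

```python
# Sketch: leave-one-out sex-classification accuracy as a function of the
# PCA subspace dimensionality, using synthetic stand-in data.
import numpy as np

rng = np.random.default_rng(1)
n, d = 130, 200                                   # 130 heads, placeholder feature dim
sex = rng.integers(0, 2, n)                       # 0/1 labels
X = rng.normal(size=(n, d)) + 0.4 * sex[:, None]  # synthetic "head" data

def leave_one_out_accuracy(X, y, k):
    correct = 0
    for i in range(len(y)):
        train = np.delete(np.arange(len(y)), i)
        mean = X[train].mean(axis=0)
        Xc = X[train] - mean
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        P = Vt[:k].T                              # first k principal components
        Ztr, Zte = Xc @ P, (X[i] - mean) @ P      # projection coefficients
        # Linear unit fit by least squares (stand-in for the perceptron).
        w, *_ = np.linalg.lstsq(Ztr, 2.0 * y[train] - 1.0, rcond=None)
        correct += np.sign(Zte @ w) == np.sign(2.0 * y[i] - 1.0)
    return correct / len(y)

for k in (1, 5, 20):
    print(k, leave_one_out_accuracy(X, sex, k))
```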
Braje, W.F., Kersten, D., Tarr, M., Troje, N.F.
Purpose: How do observers recognize objects despite dramatic image variations that arise from changes in illumination? Some evidence suggests that changes in illumination direction influence object recognition (Kersten et al., ARVO 1995). We examine whether illumination dependency extends to face recognition. A corollary issue is whether cast shadows improve performance by providing information about light source direction, or hinder performance by introducing spurious edges that must be discounted prior to recognition. Methods: The 3-D geometry and color texture maps of 80 human faces were digitized using a 3-D laser scanner (Cyberware™). Each face was rendered with the light source located in the upper right or upper left quadrant in front of the face, and with cast shadows present or absent. On each trial, 2 faces were presented sequentially, each followed by a mask (a random array of face features). The two faces were illuminated from either the same or different directions. Observers judged whether the two faces were the same or different individuals. Half of the observers viewed faces with cast shadows, and half viewed faces without cast shadows. Results: When there was no change in the direction of illumination, recognition was faster (877 vs. 930 ms, p<.001) and more accurate (92% vs. 85%, p<.001) than when there was a change. Recognition was slower when cast shadows were present than when they were absent (973 vs. 833 ms, p<.05). Conclusions: Our finding that face recognition is illumination dependent is consistent with the use of "information-rich" representations for recognition. The results indicate that 1) face recognition processes are sensitive to either the direction of lighting or the resultant pattern of shading, and 2) rather than aiding recognition by providing information about the illuminant, cast shadows hinder recognition, possibly by masking out informative features or leading to spurious contours.
1995
Proceedings
Vetter, T., Troje, N. F.
Human faces differ in shape and texture. This paper describes a representation of grey-level images of human faces based on an automated separation of two-dimensional shape and texture. The separation was done using the point correspondence between the different images, which was established through algorithms known from optical flow computation. A linear description of the separated texture and shape spaces allows a smooth modeling of human faces. Pictures of faces along the principal axes of a small data set of 50 faces are shown. We also show face reconstructions based on this small example set.
O'Toole, A., Bulthoff, H. H., Troje, N.F., Vetter, T.
We describe a computational model of face recognition that makes use of the overlapping texture and shape information visible in different views of faces. The model operates on view-dependent data from three-dimensional laser scans of human heads, which were registered onto a three-dimensional head model. We show that the overlapping visible regions of heads can support accurate recognition even with pose differences of as much as 90 degrees (full face to profile view) between the learning and testing view.
Technical Reports
Vetter, T., Troje, N. F.
Human faces differ in shape and texture. This paper describes a representation of grey-level images of human faces based on an automated separation of two-dimensional shape and texture. The separations were done using the point correspondence between the different images, which was established through algorithms known from optical flow computation. A linear description of the separated texture and shape spaces allows a smooth modeling of human faces. Images of faces along the principal axes of a small data set of 50 faces are shown. We also reconstruct images of faces using the 49 remaining faces in our data set. These reconstructions are the projections of an image into the space spanned by the textures and shapes of the other faces.
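The reconstruction described in the last two sentences amounts to a linear projection, which the following sketch illustrates with random stand-in vectors; the dimensions and data are arbitrary placeholders, and real texture and shape vectors would come from the correspondence-based representation.

```python
# Sketch: reconstruct one face vector as its least-squares projection into
# the space spanned by the remaining faces (synthetic stand-in data).
import numpy as np

rng = np.random.default_rng(2)
n_faces, dim = 50, 500
faces = rng.normal(size=(n_faces, dim))       # stand-ins for texture (or shape) vectors

target = faces[0]
basis = faces[1:]                             # the 49 remaining faces
# Least-squares coefficients of the target in the span of the other faces.
coeffs, *_ = np.linalg.lstsq(basis.T, target, rcond=None)
reconstruction = basis.T @ coeffs
rel_err = np.linalg.norm(target - reconstruction) / np.linalg.norm(target)
print("relative reconstruction error:", rel_err)
```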
O'Toole, A. J., Bulthoff, H. H., Troje, N. F., Vetter, T.
We describe a computational model of face recognition that makes use of the overlapping texture and shape information visible in different views of faces. The model operates on view-dependent data from three-dimensional laser scans of human heads, which provided three-dimensional surface data as well as surface image detail in the form of a texture map. View-dependent information from these surface and texture representations was registered onto separate three-dimensional head models. We used an auto-associative memory model as a pattern completion device to fill in parts of the head from a learned view when a test view with partially overlapping information was used as a memory key. We show that the overlapping visible regions of heads for both surface and texture data can support accurate recognition, even with pose differences of as much as 90 degrees (full face to profile view) between the learning and test view.
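The pattern-completion step can be illustrated with a small sketch: a pseudo-inverse linear auto-associator stores a set of pattern vectors, and a test key with missing entries is completed by repeatedly recalling from the memory and clamping the known (overlapping) part. This is only a schematic illustration with random stand-in vectors and arbitrary dimensions, not the model used in the paper.

```python
# Sketch: linear auto-associative memory (pseudo-inverse learning) used as
# a pattern completion device on synthetic stand-in "view" vectors.
import numpy as np

rng = np.random.default_rng(3)
d, n = 100, 10
patterns = rng.normal(size=(d, n))            # stored pattern vectors (columns)
U, _, _ = np.linalg.svd(patterns, full_matrices=False)
recall = U @ U.T                              # W = P P+, projection onto the memory subspace

true = patterns[:, 0]
known = np.arange(d) < 60                     # the overlapping, visible part of the key
x = np.where(known, true, 0.0)                # missing part initialised to zero
for _ in range(200):
    x = recall @ x                            # recall from the auto-associator
    x[known] = true[known]                    # clamp the known part of the key
err = np.linalg.norm(x[~known] - true[~known]) / np.linalg.norm(true[~known])
print("completion error on the missing part:", err)
```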
Symposia and Published Abstracts
1994
Papers
Chittka, L., Shmida, A., Troje, N., Menzel, R.
Based on the measurements of 1063 flower reflection spectra, we show that flower colours fall into distinct clusters in the colour space of a bee. It is demonstrated that this clustering is caused by a limited variability in the floral spectral reflectance curves. There are as few as 10 distinct types of such curves, five of which constitute 85% of all measurements. UV reflections are less frequent and always lower in intensity than reflections in other parts of the spectrum. A further cluster of colour loci is formed in the centre of the colour space. It contains the colour loci of green leaves, several other background materials and only very few flowers. We propose a system to classify the reflection functions of flowers, and a set of colour names for bee colours.
Other Contributions
Symposia and Published Abstracts
1993
Papers
Troje, N. F.
Wavelength discrimination in the flower-visiting blowfly Lucilia spec. was investigated in an attempt to elucidate the mechanisms underlying colour vision in this insect. The flies were subjected to a classical conditioning procedure in which they had to discriminate between a rewarded and an unrewarded monochromatic light stimulus. The results reveal large wavelength ranges within which no discrimination occurs, between which, however, a very distinct discrimination is found. The first range consists of the UV region up to 400 nm (UV). The second range comprises wavelengths between 400 nm and 515 nm (BLUE) and the third range all wavelengths longer than 515 nm (YELLOW). A simple model consisting of two colour-opponent subsystems (R7p/R8p and R7y/R8y) can explain these results. Each of the two subsystems is assumed to evaluate only whether the sign of the difference between the excitations of R7 and R8 is positive or negative. For the whole system there are thus four possible conditions: p+y+, p+y-, p-y+, p-y-. Three of them correspond to the experimentally obtained wavelength ranges. The fourth condition (p-y-) might represent a still hypothetical PURPLE category in which the stimulus is made of both short and long wavelengths.
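To make the decision structure of this two-subsystem model concrete, here is a minimal sketch (our own illustration, not the original analysis); the excitation values are arbitrary placeholders, and no particular assignment of sign pairs to the UV, BLUE, YELLOW and PURPLE categories is implied.

```python
# Sketch of the opponent decision structure: each subsystem (R7p/R8p and
# R7y/R8y) reports only the sign of the difference between its two
# receptor excitations, so any stimulus falls into one of four categories.
def opponent_category(r7p, r8p, r7y, r8y):
    sign_p = '+' if r7p > r8p else '-'
    sign_y = '+' if r7y > r8y else '-'
    return f"p{sign_p}y{sign_y}"          # p+y+, p+y-, p-y+ or p-y-

# Made-up excitation quadruples (R7p, R8p, R7y, R8y) chosen only to show
# that the four sign pairs exhaust the model's possible responses.
for excitations in [(0.9, 0.2, 0.8, 0.1),
                    (0.3, 0.7, 0.6, 0.4),
                    (0.6, 0.4, 0.2, 0.8),
                    (0.1, 0.3, 0.2, 0.9)]:
    print(excitations, "->", opponent_category(*excitations))
```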
Symposia and Published Abstracts
1991
Other Contributions
1990
Other Contributions