1. How small changes to one eye's retinal image can transform the perceived shape of a very familiar object. Proc Natl Acad Sci U S A 2024; 121:e2400086121. PMID: 38621132; PMCID: PMC11046684; DOI: 10.1073/pnas.2400086121.
Abstract
Vision can provide useful cues about the geometric properties of an object, like its size, distance, pose, and shape. But how the brain merges these properties into a complete sensory representation of a three-dimensional object is poorly understood. To address this gap, we investigated a visual illusion in which humans misperceive the shape of an object due to a small change in one eye's retinal image. We first show that this illusion affects percepts of a highly familiar object under completely natural viewing conditions. Specifically, people perceived their own rectangular mobile phone to have a trapezoidal shape. We then investigate the perceptual underpinnings of this illusion by asking people to report both the perceived shape and pose of controlled stimuli. Our results suggest that the shape illusion results from distorted cues to object pose. In addition to yielding insights into object perception, this work informs our understanding of how the brain combines information from multiple visual cues in natural settings. The shape illusion can occur when people wear everyday prescription spectacles; thus, these findings also provide insight into the cue combination challenges that some spectacle wearers experience on a regular basis.
2. Detecting visual texture patterns in binary sequences through pattern features. J Vis 2023; 23:1. PMID: 37910088; PMCID: PMC10627294; DOI: 10.1167/jov.23.13.1.
Abstract
We measured human ability to detect texture patterns in a signal detection task. Observers viewed sequences of 20 blue or yellow tokens placed horizontally in a row. They attempted to discriminate sequences generated by a random generator ("a fair coin") from sequences produced by a disrupted Markov sequence (DMS) generator. The DMSs were generated in two stages: first, a sequence was generated using a Markov chain with probability pr = 0.9 that a token would be the same color as the token to its left. The Markov sequence was then disrupted by flipping each token from blue to yellow, or vice versa, with probability pd (the probability of disruption). Disruption played the role of noise in signal detection terms. We can frame what observers are asked to do as detecting Markov texture patterns disrupted by noise. The experiment included three conditions differing in pd (0.1, 0.2, 0.3). Ninety-two observers participated, each in only one condition. Overall, human observers' sensitivities to texture patterns (d' values) were markedly less than those of an optimal Bayesian observer. We considered the possibility that observers based their judgments not on the entire texture sequence but on specific features of the sequences, such as the length of the longest repeating subsequence. We compared human performance with that of multiple optimal Bayesian classifiers based on such features. We identify the single- and multiple-feature models that best match the performance of observers across conditions and develop a pattern feature pool model for the signal detection task considered.
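For concreteness, the two-stage generator described in this abstract can be sketched as follows. This is a plausible reading of the described procedure, not the authors' code; the function name and seed are illustrative.

```python
import random

def generate_dms(n=20, pr=0.9, pd=0.2, rng=None):
    """Disrupted Markov sequence: a Markov chain, then token-wise flips.

    Stage 1 builds an n-token sequence in which each token repeats the
    color of the token to its left with probability pr. Stage 2 disrupts
    it by flipping each token independently with probability pd.
    """
    rng = rng or random.Random()
    seq = [rng.choice([0, 1])]          # 0 = blue, 1 = yellow
    for _ in range(n - 1):
        same = rng.random() < pr        # keep the color of the left token
        seq.append(seq[-1] if same else 1 - seq[-1])
    # Disruption stage: the "noise" in signal detection terms.
    return [1 - t if rng.random() < pd else t for t in seq]

seq = generate_dms(rng=random.Random(1))
```

A seeded `random.Random` makes the illustration reproducible; observers in the study would see sequences from either this generator or a fair coin and judge which produced them.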
3. The role of consciously timed movements in shaping and improving auditory timing. Proc Biol Sci 2023; 290:20222060. PMID: 36722075; PMCID: PMC9890119; DOI: 10.1098/rspb.2022.2060.
Abstract
Our subjective sense of time is intertwined with a plethora of perceptual, cognitive and motor functions, and likewise, the brain is equipped to expertly filter, weight and combine these signals for seamless interactions with a dynamic world. Until relatively recently, the literature on time perception has excluded the influence of simultaneous motor activity, yet it has been found that motor circuits in the brain are at the core of most timing functions. Several studies have now identified that concurrent movements exert robust effects on perceptual timing estimates, but critically have not assessed how humans consciously judge the duration of their own movements. This creates a gap in our understanding of the mechanisms driving movement-related effects on sensory timing. We sought to address this gap by administering a sensorimotor timing task in which we explicitly compared the timing of isolated auditory tones and arm movements, or both simultaneously. We contextualized our findings within a Bayesian cue combination framework, in which separate sources of temporal information are weighted by their reliability and integrated into a unitary time estimate that is more precise than either unisensory estimate. Our results revealed differences in accuracy between auditory, movement and combined trials, and (crucially) that combined trials were the most accurately timed. Under the Bayesian framework, we found that participants' combined estimates were more precise than isolated estimates, yet were sub-optimal when compared with the model's prediction, on average. These findings elucidate previously unknown qualities of conscious motor timing and propose computational mechanisms that can describe how movements combine with perceptual signals to create unified, multimodal experiences of time.
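The reliability-weighted (Bayesian) combination rule invoked in this abstract can be sketched in a few lines. The cue values and noise levels below are illustrative assumptions, not data from the study.

```python
def combine_cues(estimates, sigmas):
    """Precision-weighted average of independent Gaussian cue estimates.

    Each cue is weighted by its reliability (inverse variance), and the
    combined estimate is more precise than either unisensory estimate.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    combined = sum(w * x for w, x in zip(weights, estimates)) / total
    combined_sigma = (1.0 / total) ** 0.5
    return combined, combined_sigma

# Example: an auditory interval estimate (600 ms, sigma 80 ms) combined
# with a movement-duration estimate (650 ms, sigma 60 ms).
est, sigma = combine_cues([600.0, 650.0], [80.0, 60.0])
```

The combined estimate falls between the two cues, closer to the more reliable (movement) cue, with a smaller standard deviation than either alone; sub-optimality in the study means observers' combined precision fell short of this prediction.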
4. Newly learned shape-color associations show signatures of reliability-weighted averaging without forced fusion or a memory color effect. J Vis 2022; 22:8. PMID: 36580296; PMCID: PMC9804025; DOI: 10.1167/jov.22.13.8.
Abstract
Reliability-weighted averaging of multiple perceptual estimates (or cues) can improve precision. Research suggests that newly learned statistical associations can be rapidly integrated in this way for efficient decision-making. Yet, it remains unclear if the integration of newly learned statistics into decision-making can directly influence perception, rather than taking place only at the decision stage. In two experiments, we implicitly taught observers novel associations between shape and color. Observers made color matches by adjusting the color of an oval to match a simultaneously presented reference. As the color of the oval changed across trials, so did its shape, according to a novel mapping of axis ratio to color. Observers showed signatures of reliability-weighted averaging: a precision improvement in both experiments, and reweighting of the newly learned shape cue with changes in uncertainty in Experiment 2. To ask whether this was accompanied by perceptual effects, Experiment 1 tested for forced fusion by measuring color discrimination thresholds with and without incongruent novel cues. Experiment 2 tested for a memory color effect, with observers adjusting the color of ovals with different axis ratios until they appeared gray. There was no evidence for forced fusion, and we observed the opposite of a memory color effect. Overall, our results suggest that the ability to quickly learn novel cues and integrate them with familiar cues is not immediately (within the short duration of our experiments, and in the domain of color and shape) accompanied by common perceptual effects.
5. Is source elevation an auditory distance cue? A preliminary study. Perception 2022; 51:3010066221114589. PMID: 35989643; DOI: 10.1177/03010066221114589.
Abstract
The aim of this work was to evaluate whether the angular elevation of a sound source generates auditory cues that improve auditory distance perception (ADP), in a way similar to that previously reported for the visual modality. For this purpose, we compared ADP curves obtained with sources located at the height of the listeners' ears and at ground level. Our hypothesis was that participants can interpret the relation between the elevation and distance of ground-level sources (which are linked geometrically), so we expected them to perceive their distances more accurately than those of sources at ear level. However, the responses obtained with sources located at ground level were almost identical to those obtained at the height of the listeners' ears, showing that, under the conditions of our experiment, auditory elevation cues do not influence auditory distance perception.
6. The role of viscosity in flavor preference: plasticity and interactions with taste. Chem Senses 2022; 47:bjac018. PMID: 35972847; PMCID: PMC9380780; DOI: 10.1093/chemse/bjac018.
Abstract
The brain combines gustatory, olfactory, and somatosensory information to create our perception of flavor. Within the somatosensory modality, texture attributes such as viscosity appear to play an important role in flavor preference. However, research into the role of texture in flavor perception is relatively sparse, and the contribution of texture cues to hedonic evaluation of flavor remains largely unknown. Here, we used a rat model to investigate whether viscosity preferences can be manipulated through association with nutrient value, and how viscosity interacts with taste to inform preferences for taste + viscosity mixtures. To address these questions, we measured preferences for moderately viscous solutions prepared with xanthan gum using two-bottle consumption tests. By experimentally exposing animals to viscous solutions with and without nutrient value, we demonstrate that viscosity preferences are susceptible to appetitive conditioning. By independently varying the viscosity and taste content of solutions, we further show that taste and viscosity cues both contribute to preferences for taste + viscosity mixtures. How the two modalities were combined depended on relative palatability, with mixture preferences falling in between component preferences, suggesting that hedonic aspects of taste and texture inputs are centrally integrated. Together, these findings provide new insight into how texture aspects of flavor inform hedonic perception and impact food choice behavior.
7. Bayesian causal inference in visuotactile integration in children and adults. Dev Sci 2021; 25:e13184. PMID: 34698430; PMCID: PMC9285718; DOI: 10.1111/desc.13184.
Abstract
If cues from different sensory modalities share the same cause, their information can be integrated to improve perceptual precision. While it is well established that adults exploit sensory redundancy by integrating cues in a Bayes optimal fashion, whether children under 8 years of age combine sensory information in a similar fashion is still under debate. If children differ from adults in the way they infer causality between cues, this may explain mixed findings on the development of cue integration in earlier studies. Here we investigated the role of causal inference in the development of cue integration, by means of a visuotactile localization task. Young children (6-8 years), older children (9.5-12.5 years) and adults had to localize a tactile stimulus, which was presented to the forearm simultaneously with a visual stimulus at either the same or a different location. In all age groups, responses were systematically biased toward the position of the visual stimulus, but relatively more so when the distance between the visual and tactile stimulus was small rather than large. This pattern of results was better captured by a Bayesian causal inference model than by alternative models of forced fusion or full segregation of the two stimuli. Our results suggest that already from a young age the brain implicitly infers the probability that a tactile and a visual cue share the same cause and uses this probability as a weighting factor in visuotactile localization.
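A minimal sketch of the model family compared here (Bayesian causal inference versus forced fusion or full segregation), assuming illustrative noise parameters and using a wide Gaussian as a stand-in for the spatial prior under independent causes:

```python
import math

def gauss(x, mu, sigma):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def infer_touch_location(x_vis, x_tact, sigma_vis=1.0, sigma_tact=2.0,
                         p_common=0.5, disparity_scale=20.0):
    """Model-averaged tactile location estimate (all parameters illustrative).

    The inferred probability of a common cause weights a fused
    (reliability-weighted) estimate against the segregated tactile cue.
    """
    # Under a common cause, the visuotactile disparity is pure sensor noise.
    like_c1 = gauss(x_vis - x_tact, 0.0, math.hypot(sigma_vis, sigma_tact))
    # Under independent causes, the disparity is broadly distributed; a wide
    # Gaussian stands in for the spatial prior here (an assumption).
    like_c2 = gauss(x_vis - x_tact, 0.0, disparity_scale)
    post_c1 = (like_c1 * p_common) / (like_c1 * p_common + like_c2 * (1 - p_common))
    # Fused estimate: weight vision by its relative reliability.
    w_vis = sigma_tact ** 2 / (sigma_vis ** 2 + sigma_tact ** 2)
    fused = w_vis * x_vis + (1 - w_vis) * x_tact
    return post_c1 * fused + (1 - post_c1) * x_tact
```

As in the data, the pull of the tactile estimate toward the visual stimulus is proportionally stronger for small visuotactile disparities than for large ones, which neither forced fusion (constant pull) nor full segregation (no pull) predicts.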
8. Echoes of L1 Syllable Structure in L2 Phoneme Recognition. Front Psychol 2021; 12:515237. PMID: 34354620; PMCID: PMC8329372; DOI: 10.3389/fpsyg.2021.515237.
Abstract
Learning to move from auditory signals to phonemic categories is a crucial component of first, second, and multilingual language acquisition. In L1 and simultaneous multilingual acquisition, learners build up phonological knowledge to structure their perception within a language. For sequential multilinguals, this knowledge may support or interfere with acquiring language-specific representations for a new phonemic categorization system. Syllable structure is a part of this phonological knowledge, and language-specific syllabification preferences influence language acquisition, including early word segmentation. As a result, we expect to see language-specific syllable structure influencing speech perception as well. Initial evidence of an effect appears in Ali et al. (2011), who argued that cross-linguistic differences in McGurk fusion within a syllable reflected listeners' language-specific syllabification preferences. Building on a framework from Cho and McQueen (2006), we argue that this could reflect the Phonological-Superiority Hypothesis (differences in L1 syllabification preferences make some syllabic positions harder to classify than others) or the Phonetic-Superiority Hypothesis (the acoustic qualities of speech sounds in some positions make it difficult to perceive unfamiliar sounds). However, their design does not distinguish between these two hypotheses. The current study extends the work of Ali et al. (2011) by testing Japanese, and adding audio-only and congruent audio-visual stimuli to test the effects of syllabification preferences beyond just McGurk fusion. Eighteen native English speakers and 18 native Japanese speakers were asked to transcribe nonsense words in an artificial language. English allows stop consonants in syllable codas while Japanese heavily restricts them, but both groups showed similar patterns of McGurk fusion in stop codas. This is inconsistent with the Phonological-Superiority Hypothesis. However, when visual information was added, the phonetic influences on transcription accuracy largely disappeared. This is inconsistent with the Phonetic-Superiority Hypothesis. We argue from these results that neither acoustic informativity nor interference of a listener's phonological knowledge is superior, and sketch a cognitively inspired rational cue integration framework as a third hypothesis to explain how L1 phonological knowledge affects L2 perception.
9. J. J. Gibson's "Ground Theory of Space Perception". Iperception 2021; 12:20416695211021111. PMID: 34377427; PMCID: PMC8334293; DOI: 10.1177/20416695211021111.
Abstract
J. J. Gibson's ground theory of space perception is contrasted with Descartes' theory, which reduces all of space perception to the perception of distance and angular direction, relative to an abstract viewpoint. Instead, Gibson posits an embodied perceiver, grounded by gravity, in a stable layout of realistically textured, extended surfaces and more delimited objects supported by these surfaces. Gibson's concept of optical contact ties together this spatial layout, locating each surface relative to the others and specifying the position of each object by its location relative to its surface of support. His concept of surface texture-augmented by perspective structures such as the horizon-specifies the scale of objects and extents within this layout. And his concept of geographical slant provides surfaces with environment-centered orientations that remain stable as the perceiver moves around. Contact-specified locations on extended environmental surfaces may be the unattended primitives of the visual world, rather than egocentric or allocentric distances. The perception of such distances may best be understood using Gibson's concept of affordances. Distances may be perceived only as needed, bound through affordances to the particular actions that require them.
10. Cue combination in goal-oriented navigation. Q J Exp Psychol (Hove) 2021; 74:1981-2001. PMID: 33885351; PMCID: PMC8902265; DOI: 10.1177/17470218211015796.
Abstract
This study examined cue combination of self-motion and landmark cues in goal-localisation. In an immersive virtual environment, before walking a two-leg path, participants learned the locations of three goal objects (one at the path origin, that is, home) and landmarks. After walking the path without seeing landmarks or goals, participants indicated the locations of the home and non-home goals in four conditions: (1) path integration only, (2) landmarks only, (3) both path integration and the landmarks, and (4) path integration and rotated landmarks. The ratio of the length between the testing position (P) and the turning point (T) over the length between T and the three goals (G) (i.e., PT/TG) was manipulated. The results showed cue combination consistently for participants' heading estimates but not for goal-localisation. In Experiments 1 and 2 (using distal landmarks), cue combination for goal estimates appeared at a small length ratio (PT/TG = 0.5) but disappeared at a large length ratio (PT/TG = 2). In Experiments 3 and 4 (using proximal landmarks), cue combination disappeared for the home goal at a medium length ratio (PT/TG = 1), yet appeared for the non-home goal at a large length ratio (PT/TG = 2) and only disappeared at a very large length ratio (PT/TG = 3). These findings are explained by a model stipulating that cue combination occurs in self-localisation (e.g., heading estimates), which leads to one estimate of the goal location; proximal landmarks produce another goal-location estimate; and these two goal estimates are then combined, which may only occur for non-home goals.
11. Combining cues to judge distance and direction in an immersive virtual reality environment. J Vis 2021; 21:10. PMID: 33900366; PMCID: PMC8083085; DOI: 10.1167/jov.21.4.10.
Abstract
When we move, the visual direction of objects in the environment can change substantially. Compared with our understanding of depth perception, the problem the visual system faces in computing this change is relatively poorly understood. Here, we tested the extent to which participants' judgments of visual direction could be predicted by standard cue combination rules. Participants were tested in virtual reality using a head-mounted display. In a simulated room, they judged the position of an object at one location, before walking to another location in the room and judging, in a second interval, whether an object was at the expected visual direction of the first. By manipulating the scale of the room across intervals, which was subjectively invisible to observers, we put two classes of cue into conflict, one that depends only on visual information and one that uses proprioceptive information to scale any reconstruction of the scene. We find that the sensitivity to changes in one class of cue while keeping the other constant provides a good prediction of performance when both cues vary, consistent with the standard cue combination framework. Nevertheless, by comparing judgments of visual direction with those of distance, we show that judgments of visual direction and distance are mutually inconsistent. We discuss why there is no need for any contradiction between these two conclusions.
12. Inhibition of saccade initiation improves saccade accuracy: The role of local and remote visual distractors in the control of saccadic eye movements. J Vis 2021; 21:17. PMID: 33729451; PMCID: PMC7980046; DOI: 10.1167/jov.21.3.17.
Abstract
When a distractor appears close to the target location, saccades are less accurate. However, the presence of a further distractor, remote from those stimuli, increases the saccade response latency and improves accuracy. Explanations for this are either that the second, remote distractor impacts directly on target selection processes or that the remote distractor merely impairs the ability to initiate a saccade and changes the time at which unaffected target selection processes are accessed. In order to tease these two explanations apart, here we examine the relationship between latency and accuracy of saccades to a target and close distractor pair while a remote distractor appears at variable distance. Accuracy improvements are found to follow a similar pattern, regardless of the presence of the remote distractor, which suggests that the effect of the remote distractor is not the result of a direct impact on the target selection process. Our findings support the proposal that a remote distractor impairs the ability to initiate a saccade, meaning the competition between target and close distractor is accessed at a later time, thus resulting in more accurate saccades.
13. Synergy of spatial frequency and orientation bandwidth in texture segregation. J Vis 2021; 21:5. PMID: 33560290; PMCID: PMC7873498; DOI: 10.1167/jov.21.2.5.
Abstract
Defining target textures by increased bandwidths in spatial frequency and orientation, we observed strong cue combination effects in a combined texture figure detection and discrimination task. Performance for double-cue targets was better than predicted by independent processing of either cue and even better than predicted from linear cue integration. Application of a texture-processing model revealed that the oversummative cue combination effect is captured by calculating a low-level summary statistic (ΔCE_m), which describes the differential contrast energy to target and reference textures, from multiple scales and orientations, and integrating this statistic across channels with a winner-take-all rule. Modeling detection performance using a signal detection theory framework showed that the observers' sensitivity to single-cue and double-cue texture targets, measured in d′ units, could be reproduced with plausible settings for filter and noise parameters. These results challenge models assuming separate channeling of elementary features and their later integration, since oversummative cue combination effects appear as an inherent property of local energy mechanisms, at least for spatial frequency and orientation bandwidth-modulated textures.
14.
Abstract
In 1979, James Gibson completed his third and final book, "The Ecological Approach to Visual Perception". That book can be seen as the synthesis of the many radical ideas he proposed over the previous 30 years: the concept of information and its sufficiency, the necessary link between perception and action, the need to see perception in relation to an animal's particular ecological niche, and the meanings (affordances) offered by the visual world. One of the fundamental concepts that lies behind all of Gibson's thinking is that of optic flow: the constantly changing patterns of light that reach our eyes and the information they provide. My purpose in writing this paper has been to evaluate the legacy of Gibson's conceptual ideas and to consider how they have influenced and changed the way we study perception.
15.
Abstract
To form a more reliable percept of the environment, the brain needs to estimate its own sensory uncertainty. Current theories of perceptual inference assume that the brain computes sensory uncertainty instantaneously and independently for each stimulus. We evaluated this assumption in four psychophysical experiments, in which human observers localized auditory signals that were presented synchronously with spatially disparate visual signals. Critically, the visual noise changed dynamically over time continuously or with intermittent jumps. Our results show that observers integrate audiovisual inputs weighted by sensory uncertainty estimates that combine information from past and current signals consistent with an optimal Bayesian learner that can be approximated by exponential discounting. Our results challenge leading models of perceptual inference where sensory uncertainty estimates depend only on the current stimulus. They demonstrate that the brain capitalizes on the temporal dynamics of the external world and estimates sensory uncertainty by combining past experiences with new incoming sensory signals.
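The exponential-discounting approximation mentioned above can be sketched as an exponentially weighted running estimate of sensory variance; the decay constant and the simulated noise jump are illustrative assumptions, not the study's fitted parameters.

```python
def update_variance(prev_var, squared_error, decay=0.9):
    """Exponentially discount past evidence when estimating sensory variance.

    Recent squared noise samples count more than older ones, so the
    estimate tracks a dynamic environment without forgetting the past.
    """
    return decay * prev_var + (1 - decay) * squared_error

# Simulate an intermittent jump in visual noise: the true variance steps
# from 1.0 to 9.0, and the estimate adapts gradually rather than
# instantaneously, blending past experience with new signals.
estimate = 1.0
history = []
for _ in range(5):
    estimate = update_variance(estimate, 9.0)
    history.append(estimate)
```

The gradual climb of `history` toward the new noise level is the qualitative signature that distinguishes such a learner from models that compute sensory uncertainty independently for each stimulus.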
16.
Abstract
Colloquially referred to as "taste," flavor is in reality a thoroughly multisensory experience. Yet, a mechanistic understanding of the multisensory computations underlying flavor perception and food choice is lacking. Here, we used a multisensory flavor choice task in rats to test specific predictions of the statistically optimal integration framework, which has previously yielded much insight into cue integration in other multisensory systems. Our results confirm three key predictions of this framework in the unique context of flavor choice behavior, providing novel mechanistic insight into multisensory flavor processing.

NEW & NOTEWORTHY: The authors demonstrate that rats make choices about which flavor solution (i.e., taste-odor mixture) to consume by weighting the individual taste and odor components according to the reliability of the information they provide about which solution is the preferred one. A similar weighting operation underlies multisensory cue combination in other domains and offers novel insight into the computations underlying multisensory flavor perception and food choice behavior.
17. Specular highlights improve color constancy when other cues are weakened. J Vis 2020; 20:4. PMID: 33170203; PMCID: PMC7674000; DOI: 10.1167/jov.20.12.4.
Abstract
Previous studies suggest that to achieve color constancy, the human visual system makes use of multiple cues, including a priori assumptions about the illumination ("daylight priors"). Specular highlights have been proposed to aid constancy, but the evidence for their usefulness is mixed. Here, we used a novel cue-combination approach to test whether the presence of specular highlights or the validity of a daylight prior improves illumination chromaticity estimates, inferred from achromatic settings, to determine whether and under which conditions either cue contributes to color constancy. Observers made achromatic settings within three-dimensional rendered scenes containing matte or glossy shapes, illuminated by either daylight or nondaylight illuminations. We assessed both the variability of these settings and their accuracy, in terms of the standard color constancy index (CCI). When a spectrally uniform background was present, neither CCIs nor variability improved with specular highlights or daylight illuminants (Experiment 1). When a Mondrian background was introduced, CCIs decreased overall but were higher for scenes containing glossy, as opposed to matte, shapes (Experiments 2 and 3). There was no overall reduction in variability of settings and no benefit for scenes illuminated by daylights. Taken together, these results suggest that the human visual system indeed uses specular highlights to improve color constancy but only when other cues, such as from the local surround, are weakened.
18.
Abstract
One of the most important tasks for humans is the attribution of causes and effects in all walks of life. The first systematic study of the visual perception of causality, often referred to as phenomenal causality, was done by Albert Michotte using his now well-known launching events paradigm. Launching events are the seeming collision and seeming transfer of movement between two objects: abstract, featureless stimuli ("objects") in Michotte's original experiments. Here, we study the relation between causal ratings for launching events in Michotte's setting and launching collisions in a photorealistically computer-rendered setting. We presented launching events with differing temporal gaps, the same launching processes with photorealistic billiard balls, as well as photorealistic billiard balls with realistic motion dynamics, that is, an initial rebound of the first ball after collision and a short sliding phase of the second ball due to momentum and friction. We found that providing the normal launching stimulus with realistic visuals led to lower causal ratings, but realistic visuals together with realistic motion dynamics evoked higher ratings. Two-dimensional versus three-dimensional presentation, on the other hand, did not affect phenomenal causality. We discuss our results in terms of intuitive physics as well as cue conflict.
19.
Abstract
Existing theories suggest that reacting to dynamic stimuli is made possible by relying on internal estimates of kinematic variables. For example, to catch a bouncing ball the brain relies on the position and speed of the ball. However, when kinematic information is unreliable one may additionally rely on temporal cues. In the bouncing ball example, when visibility is low one may benefit from the temporal information provided by the sound of the bounces. Our work provides evidence that humans rely on such temporal cues and automatically integrate them with kinematic information to optimize their performance. This finding reveals a hitherto unappreciated role of the brain's timing mechanisms in sensorimotor function. To coordinate movements with events in a dynamic environment, the brain has to anticipate when those events occur. A classic example is the estimation of time to contact (TTC), that is, when an object reaches a target. It is thought that TTC is estimated from kinematic variables. For example, a tennis player might use an estimate of distance (d) and speed (v) to estimate TTC (TTC = d/v). However, the tennis player may instead estimate TTC as twice the time it takes for the ball to move from the serve line to the net line. This latter strategy does not rely on kinematics and instead computes TTC solely from temporal cues. Which of these two strategies do humans use to estimate TTC? Considering that both speed and time estimates are inherently uncertain, and given the human brain's ability to combine different sources of information, we hypothesized that humans estimate TTC by integrating speed information with temporal cues. We evaluated this hypothesis systematically using psychophysics and Bayesian modeling. Results indicated that humans rely on both speed information and temporal cues and integrate them to optimize their TTC estimates when both cues are present. These findings suggest that the brain's timing mechanisms are actively engaged when interacting with dynamic stimuli.
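The optimal integration hypothesized above follows standard inverse-variance (reliability-weighted) cue combination. A minimal Python sketch, with purely hypothetical estimates and variances rather than the paper's fitted values:

```python
def combine_ttc(ttc_kinematic, var_kinematic, ttc_temporal, var_temporal):
    """Inverse-variance (reliability-weighted) combination of two
    time-to-contact estimates, the standard Bayesian cue-integration
    rule for independent Gaussian noise. All inputs are illustrative."""
    w_kin = (1 / var_kinematic) / (1 / var_kinematic + 1 / var_temporal)
    ttc = w_kin * ttc_kinematic + (1 - w_kin) * ttc_temporal
    # Combined variance never exceeds either single cue's variance
    var = 1 / (1 / var_kinematic + 1 / var_temporal)
    return ttc, var

# Hypothetical: kinematic (d/v) estimate of 1.2 s, noisier temporal-cue estimate of 1.0 s
ttc, var = combine_ttc(1.2, 0.04, 1.0, 0.09)
```

The combined estimate falls between the single-cue estimates, pulled toward the more reliable (lower-variance) cue, and its variance is lower than that of either cue alone.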
|
20
|
Abstract
Human perceptual decisions are often described as optimal. Critics of this view have argued that claims of optimality are overly flexible and lack explanatory power. Meanwhile, advocates for optimality have countered that such criticisms single out a few selected papers. To elucidate the issue of optimality in perceptual decision making, we review the extensive literature on suboptimal performance in perceptual tasks. We discuss eight different classes of suboptimal perceptual decisions, including improper placement, maintenance, and adjustment of perceptual criteria; inadequate tradeoff between speed and accuracy; inappropriate confidence ratings; misweightings in cue combination; and findings related to various perceptual illusions and biases. In addition, we discuss conceptual shortcomings of a focus on optimality, such as definitional difficulties and the limited value of optimality claims in and of themselves. We therefore advocate that the field drop its emphasis on whether observed behavior is optimal and instead concentrate on building and testing detailed observer models that explain behavior across a wide range of tasks. To facilitate this transition, we compile the proposed hypotheses regarding the origins of suboptimal perceptual decisions reviewed here. We argue that verifying, rejecting, and expanding these explanations for suboptimal behavior - rather than assessing optimality per se - should be among the major goals of the science of perceptual decision making.
|
21
|
Combination of Interaural Level and Time Difference in Azimuthal Sound Localization in Owls. eNeuro 2018; 4:eN-NWR-0238-17. PMID: 29379866; PMCID: PMC5779116; DOI: 10.1523/eneuro.0238-17.2017.
Abstract
A function of the auditory system is to accurately determine the location of a sound source. The main cues for sound location are interaural time (ITD) and level (ILD) differences. Humans use both ITD and ILD to determine the azimuth. Thus far, the conception of sound localization in barn owls was that their facial ruff and asymmetrical ears generate a two-dimensional grid of ITD for azimuth and ILD for elevation. We show that barn owls also use ILD for azimuthal sound localization when ITDs are ambiguous. For high-frequency narrowband sounds, midbrain neurons can signal multiple locations, leading to the perception of an auditory illusion called a phantom source. Owls respond to such an illusory percept by orienting toward it instead of the true source. Acoustical measurements close to the eardrum reveal a small ILD component that changes with azimuth, suggesting that ITD and ILD information could be combined to eliminate the illusion. Our behavioral data confirm that perception was robust against ambiguities if ITD and ILD information was combined. Electrophysiological recordings of ILD sensitivity in the owl's midbrain support the behavioral findings, indicating that rival brain hemispheres drive the decision to orient to either true or phantom sources. Thus, the basis for disambiguation, and reliable detection of sound source azimuth, relies on similar cues across species, as a similar response to combinations of ILD and narrowband ITD has been observed in humans.
|
22
|
Stereoscopic Slant Contrast and the Perception of Inducer Slant at Brief Stimulus Presentations. Perception 2017; 47:171-184. PMID: 29117775; DOI: 10.1177/0301006617739755.
Abstract
Slant contrast refers to a stereoscopic phenomenon in which the perceived slant of a test object is affected by the disparity of a surrounding inducer object. Slant contrast has been proposed to involve cue conflict, but it is unclear whether this idea is useful in explaining slant contrast at short stimulus presentations (<1 s). We measured both slant contrast and perceived inducer slant while varying the presentation duration (100-800 ms) of stereograms with several spatial configurations. In three psychophysical experiments, we found that (a) both slant contrast and perceived inducer slant increased as a function of stimulus duration, and (b) slant contrast was relatively stable across different test and inducer shapes at each short stimulus duration, whereas perceived inducer slant increased when cue conflict was reduced. These results suggest that at brief, but not long, stimulus presentations, the cue conflict between disparity and perspective plays a smaller role in slant contrast than other depth cues.
|
23
|
Abstract
Astoundingly, looking at a photograph with one eye can yield an experience of depth of the depicted objects similar to that from viewing the real objects with both eyes. Édouard Claparède (1873–1940) was one of the first to report this phenomenon in a French paper published in 1904. I give some historical and theoretical context to the phenomenon, provide some biographical information about Claparède, and provide a translation into English of Claparède’s paper.
|
24
|
Bayesian Analysis of Perceived Eye Level. Front Comput Neurosci 2016; 10:135. PMID: 28018204; PMCID: PMC5156681; DOI: 10.3389/fncom.2016.00135.
Abstract
To accurately perceive the world, people must efficiently combine internal beliefs and external sensory cues. We introduce a Bayesian framework that explains the role of internal balance cues and visual stimuli on perceived eye level (PEL)—a self-reported measure of elevation angle. This framework provides a single, coherent model explaining a set of experimentally observed PEL over a range of experimental conditions. Further, it provides a parsimonious explanation for the additive effect of low fidelity cues as well as the averaging effect of high fidelity cues, as also found in other Bayesian cue combination psychophysical studies. Our model accurately estimates the PEL and explains the form of previous equations used in describing PEL behavior. Most importantly, the proposed Bayesian framework for PEL is more powerful than previous behavioral modeling; it permits behavioral estimation in a wider range of cue combination and perceptual studies than models previously reported.
|
25
|
Optimal cue combination and landmark-stability learning in the head direction system. J Physiol 2016; 594:6527-6534. PMID: 27479741; DOI: 10.1113/jp272945.
Abstract
Maintaining a sense of direction requires combining information from static environmental landmarks with dynamic information about self-motion. This is accomplished by the head direction system, whose neurons - head direction cells - encode specific head directions. When the brain integrates information in sensory domains, this process is almost always 'optimal' - that is, inputs are weighted according to their reliability. Evidence suggests cue combination by head direction cells may also be optimal. The simplicity of the head direction signal, together with the detailed knowledge we have about the anatomy and physiology of the underlying circuit, therefore makes this system a tractable model with which to discover how optimal cue combination occurs at a neural level. In the head direction system, cue interactions are thought to occur on an attractor network of interacting head direction neurons, but attractor dynamics predict a winner-take-all decision between cues, rather than optimal combination. However, optimal cue combination in an attractor could be achieved via plasticity in the feedforward connections from external sensory cues (i.e. the landmarks) onto the ring attractor. Short-term plasticity would allow rapid re-weighting that adjusts the final state of the network in accordance with cue reliability (reflected in the connection strengths), while longer term plasticity would allow long-term learning about this reliability. Although these principles were derived to model the head direction system, they could potentially serve to explain optimal cue combination in other sensory systems more generally.
|
26
|
Inflections of the Bayesian Paradigm in Perceptual Psychology. Perception 2016; 45:1412-1425. PMID: 27669709; DOI: 10.1177/0301006616669959.
Abstract
Bayesian modeling has gained a conspicuous position in contemporary perceptual psychology. It can be examined from two viewpoints: a formal one, concerning the logical attributes of and the algebraic operations on the components of the models, and a substantive one, concerning the empirical meaning of those components. We maintain that, while there is homogeneity between Bayesian models of visual perception in their formal setup, remarkable differences can be found in their substantive aspect, that is, how the question "Where do probabilities come from?" is answered when designing the models. In particular, we focus on an inflection that we call "congenial" because it consistently embodies the inversion idea of the Bayes' rule in terms of optical inversion and highlight delicate issues that face this inflection for a consistent realization of the scientific program it represents. We also suggest ideas concerning the organization of the Bayesian area within perceptual psychology, which appears variegated, with the congenial inflection in a central position, and a fringe of disputable classification along the border.
|
27
|
Abstract
There are many similarities between binocular disparity and motion parallax as sources of information about the structure and layout of 3-D objects and surfaces. The former can be thought of as a transformation that maps one eye's image onto the other while the latter is a transformation that maps the changes in one eye's image over time. There are many empirical similarities in the ways we use the two sources of information but there are also significant differences. A consideration of those differences leads to the conclusion that, rather than seeing motion parallax as a close analogue of binocular stereopsis, motion parallax is better thought of as a special case of the kinetic depth effect in which the depth order of the depicted 3-D object or surface can be disambiguated by vertical perspective information.
|
28
|
The Effectiveness of Vertical Perspective and Pursuit Eye Movements for Disambiguating Motion Parallax Transformations. Perception 2016; 45:1279-1303. PMID: 27343187; DOI: 10.1177/0301006616655815.
Abstract
In the kinetic depth effect, the direction of the perceived depth and the direction of apparent rotation of a 3-D structure are linked, and typically ambiguous, whereas depth from motion parallax during both observer- and object-movement is stable and unambiguous. Rogers and Rogers demonstrated that vertical perspective transformations play an important role in disambiguating the direction of the perceived depth in parallax-defined surfaces, but more recently Nawrot et al. have proposed that pursuit eye movements provide the crucial disambiguating information. Theoretical considerations suggest that pursuit eye movements could not, in principle, provide the necessary information because 3-D objects or surfaces may rotate during observer- or object-movement. The empirical evidence presented here shows that vertical perspective transformations are sufficient for the unambiguous perception of parallax depth, whereas pursuit eye movements are not necessary and may not even be sufficient.
|
29
|
Abstract
In a series of recent experiments, we found that if rats are presented with two temporal cues, each signifying that reward will be delivered after a different duration elapses (e.g., tone-10 seconds / light-20 seconds), they will behave as if they have computed a weighted average of these respective durations. In the current article, we argue that this effect, referred to as "temporal averaging", can be understood within the context of Bayesian Decision Theory. Specifically, we propose and provide preliminary data showing that, when averaging, rats weight different durations based on the relative variability of the information their respective cues provide.
|
30
|
Cue Combination of Conflicting Color and Luminance Edges. Iperception 2015; 6:2041669515621215. PMID: 27551364; PMCID: PMC4975110; DOI: 10.1177/2041669515621215.
Abstract
Abrupt changes in the color or luminance of a visual image potentially indicate object boundaries. Here, we consider how these cues to the visual "edge" location are combined when they conflict. We measured the extent to which localization of a compound edge can be predicted from a simple maximum likelihood estimation model using the reliability of chromatic (L-M) and luminance signals alone. Maximum likelihood estimation accurately predicted the pattern of results across a range of contrasts. Predictions consistently overestimated the relative influence of the luminance cue; although L-M is often considered a poor cue for localization, it was used more than expected. This need not indicate that the visual system is suboptimal but that its priors about which cue is more useful are not flat. This may be because, although strong changes in chromaticity typically represent object boundaries, changes in luminance can be caused by either a boundary or a shadow.
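The maximum likelihood prediction tested above derives each cue's weight from its single-cue reliability. A minimal sketch, with illustrative localization-noise values rather than the study's measurements:

```python
def mle_edge_weights(sigma_luminance, sigma_chromatic):
    """Predicted MLE weights for combining luminance and chromatic (L-M)
    edge-location estimates: each weight is proportional to the cue's
    inverse variance. Sigma values passed in below are hypothetical."""
    r_lum = 1 / sigma_luminance ** 2
    r_chr = 1 / sigma_chromatic ** 2
    w_lum = r_lum / (r_lum + r_chr)
    return w_lum, 1 - w_lum

# If luminance localizes an edge twice as precisely as L-M:
w_lum, w_chr = mle_edge_weights(sigma_luminance=1.0, sigma_chromatic=2.0)  # w_lum = 0.8
```

In these terms, the paper's finding is that the chromatic weight observers actually used was larger than the value this reliability-only prediction computes.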
|
31
|
Perceptual biases and cue weighting in perception of 3D slant from texture and stereo information. J Vis 2015; 15:15.2.14. PMID: 25761332; DOI: 10.1167/15.2.14.
Abstract
Multiple cues are typically available for perceiving the 3D slant of surfaces, and slant perception has been used as a test case for investigating cue integration. Previous evidence suggests that texture and stereo slant cues contribute in an optimal Bayesian manner. We tested whether a Bayesian model could also account for perceptual underestimation of slant from texture. One explanation proposed by Todd, Christensen, and Guckes (2010) is that slant from texture is based on an inaccurate optical variable. An alternative Bayesian explanation is that perceptual underestimation is due to the influence of frontal cues and/or a frontal prior, which is weighted according to the reliability of slant cues. We measured slant perception using a hand-alignment task for conditions that provided only texture, only stereo, or combined texture and stereo cues. Slant estimates from monocular texture showed large biases toward frontal, with proportionally more underestimation at low slants than high slants. Slant estimates from stereo alone were more accurate, and adding texture information did not reduce accuracy. These results are consistent with a frontal influence that is decreasingly weighted as slant information becomes more reliable. We also included conditions with small cue conflicts to measure the relative weighting of texture and stereo cues. Consistent with previous studies, texture had a significant effect on slant estimates in binocular conditions, and the relative weighting of texture increased with slant. In some cases, perceived slant from combined stereo and texture cues was higher than from either cue in isolation. Both the perceptual biases and the cue weights were generally consistent with a Bayesian model that optimally integrates texture and stereo slant cues with frontal cues and/or a frontal prior.
|
32
|
Abstract
The context in which an object is found can facilitate its recognition. Yet, it is not known how effective this contextual information is relative to the object's intrinsic visual features, such as color and shape. To address this, we performed four experiments using rendered scenes with novel objects. In each experiment, participants first performed a visual search task, searching for a uniquely shaped target object whose color and location within the scene was experimentally manipulated. We then tested participants' tendency to use their knowledge of the location and color information in an identification task when the objects' images were degraded due to blurring, thus eliminating the shape information. In Experiment 1, we found that, in the absence of any diagnostic intrinsic features, participants identified objects based purely on their locations within the scene. In Experiment 2, we found that participants combined an intrinsic feature, color, with contextual location in order to uniquely specify an object. In Experiment 3, we found that when an object's color and location information were in conflict, participants identified the object using both sources of information equally. Finally, in Experiment 4, we found that participants used whichever source of information (either color or location) was more statistically reliable in order to identify the target object. Overall, these experiments show that the context in which objects are found can play as important a role as intrinsic features in identifying the objects.
|
33
|
Abstract
To extend our understanding of the early visual hierarchy, we investigated the long-range integration of first- and second-order signals in spatial vision. In our first experiment we performed a conventional area summation experiment where we varied the diameter of (a) luminance-modulated (LM) noise and (b) contrast-modulated (CM) noise. Results from the LM condition replicated previous findings with sine-wave gratings in the absence of noise, consistent with long-range integration of signal contrast over space. For CM, the summation function was much shallower than for LM suggesting, at first glance, that the signal integration process was spatially less extensive than for LM. However, an alternative possibility was that the high spatial frequency noise carrier for the CM signal was attenuated by peripheral retina (or cortex), thereby impeding our ability to observe area summation of CM in the conventional way. To test this, we developed the "Swiss cheese" stimulus of Meese and Summers (2007) in which signal area can be varied without changing the stimulus diameter, providing some protection against inhomogeneity of the retinal field. Using this technique and a two-component subthreshold summation paradigm we found that (a) CM is spatially integrated over at least five stimulus cycles (possibly more), (b) spatial integration follows square-law signal transduction for both LM and CM and (c) the summing device integrates over spatially-interdigitated LM and CM signals when they are co-oriented, but not when cross-oriented. The spatial pooling mechanism that we have identified would be a good candidate component for a module involved in representing visual textures, including their spatial extent.
|
34
|
Reliability-dependent contributions of visual orientation cues in parietal cortex. Proc Natl Acad Sci U S A 2014; 111:18043-8. PMID: 25427796; DOI: 10.1073/pnas.1421131111.
Abstract
Creating accurate 3D representations of the world from 2D retinal images is a fundamental task for the visual system. However, the reliability of different 3D visual signals depends inherently on viewing geometry, such as how much an object is slanted in depth. Human perceptual studies have correspondingly shown that texture and binocular disparity cues for object orientation are combined according to their slant-dependent reliabilities. Where and how this cue combination occurs in the brain is currently unknown. Here, we search for neural correlates of this property in the macaque caudal intraparietal area (CIP) by measuring slant tuning curves using mixed-cue (texture + disparity) and cue-isolated (texture or disparity) planar stimuli. We find that texture cues contribute more to the mixed-cue responses of CIP neurons that prefer larger slants, consistent with theoretical and psychophysical results showing that the reliability of texture relative to disparity cues increases with slant angle. By analyzing responses to binocularly viewed texture stimuli with conflicting texture and disparity information, some cells that are sensitive to both cues when presented in isolation are found to disregard one of the cues during cue conflict. Additionally, the similarity between texture and mixed-cue responses is found to be greater when this cue conflict is eliminated by presenting the texture stimuli monocularly. The present findings demonstrate reliability-dependent contributions of visual orientation cues at the level of the CIP, thus revealing a neural correlate of this property of human visual perception.
|
35
|
Abstract
Humans and animals can integrate sensory evidence from various sources to make decisions in a statistically near-optimal manner, provided that the stimulus presentation time is fixed across trials. Little is known about whether optimality is preserved when subjects can choose when to make a decision (reaction-time task), or when sensory inputs have time-varying reliability. Using a reaction-time version of a visual/vestibular heading discrimination task, we show that behavior is clearly sub-optimal when quantified with traditional optimality metrics that ignore reaction times. We created a computational model that accumulates evidence optimally across both cues and time, and trades off accuracy with decision speed. This model quantitatively explains subjects' choices and reaction times, supporting the hypothesis that subjects do, in fact, accumulate evidence optimally over time and across sensory modalities, even when the reaction time is under the subject's control.
|
36
|
Boundary segmentation from dynamic occlusion-based motion parallax. J Vis 2014; 14:14.4.15. PMID: 24762951; DOI: 10.1167/14.4.15.
Abstract
Active observer movement results in retinal image motion that is highly dependent on the scene layout. This retinal motion, often called motion parallax, can yield significant information about the boundaries between objects and their relative depth differences. Previously we examined segmentation from shear-based motion parallax, which consists of only relative motion information. Here, we examine segmentation from dynamic occlusion-based motion parallax, which contains both relative motion and accretion-deletion. We utilized random dots whose motion was modulated with vertical low spatial frequency envelopes and synchronized to head movements (Head Sync), or recreated using previously recorded head movement data for the same stationary observer (Playback). Observers judged the orientation of a boundary between regions of oppositely moving dots in a 2AFC task. The results demonstrate that observers perform more poorly when the stimulus motion is synchronized to head movement, particularly at smaller relative depths, even though that head movement provides significant information about depth. Both expansion-compression and accretion-deletion in isolation could support segmentation, albeit with reduced performance. Therefore, unlike our previous results for depth ordering, expansion-compression and accretion-deletion contribute similarly to segmentation. Furthermore, human observers do not appear to utilize depth information to improve segmentation performance.
|
37
|
A neural hierarchy for illusions of time: duration adaptation precedes multisensory integration. J Vis 2013; 13:13.14.4. PMID: 24306853; DOI: 10.1167/13.14.4.
Abstract
Perceived time is inherently malleable. For example, adaptation to relatively long or short sensory events leads to a repulsive aftereffect such that subsequent events appear to be contracted or expanded (duration adaptation). Perceived visual duration can also be distorted via concurrent presentation of discrepant auditory durations (multisensory integration). The neural loci of both distortions remain unknown. In the current study we use a psychophysical approach to establish their relative positioning within the sensory processing hierarchy. We show that audiovisual integration induces marked distortions of perceived visual duration. We proceed to use these distorted durations as visual adapting stimuli yet find subsequent visual duration aftereffects to be consistent with physical rather than perceived visual duration. Conversely, the concurrent presentation of adapted auditory durations with nonadapted visual durations results in multisensory integration patterns consistent with perceived, rather than physical, auditory duration. These results demonstrate that recent sensory history modifies human duration perception prior to the combination of temporal information across sensory modalities and provides support for adaptation mechanisms mediated by duration selective neurons situated in early areas of the visual and auditory nervous system (Aubie, Sayegh, & Faure, 2012; Duysens, Schaafsma, & Orban, 1996; Leary, Edwards, & Rose, 2008).
|
38
|
Depth perception from dynamic occlusion in motion parallax: roles of expansion-compression versus accretion-deletion. J Vis 2013; 13:13.12.10. PMID: 24130259; DOI: 10.1167/13.12.10.
Abstract
Motion parallax, or differential retinal image motion from observer movement, provides important information for depth perception. We previously measured the contribution of shear motion parallax to depth, which is only composed of relative motion information. Here, we examine the roles of relative motion and accretion-deletion information in dynamic occlusion motion parallax. Observers performed two-alternative forced choice depth-ordering tasks in response to low spatial frequency patterns of horizontal random dot motion that were synchronized to the observer's head movements. We examined conditions that isolated or combined expansion-compression and accretion-deletion across a range of simulated relative depths. At small depths, expansion-compression provided reliable depth perception while accretion-deletion had a minor contribution: When the two were in conflict, the perceived depth was dominated by expansion-compression. At larger depths in the cue-conflict experiment, accretion-deletion determined the depth-ordering performance. Accretion-deletion in isolation did not yield any percept of depth even though, in theory, it provided sufficient information for depth ordering. Thus, accretion-deletion can substantially enhance depth perception at larger depths but only in the presence of relative motion. The results indicate that expansion-compression contributes to depth from motion parallax across a broad range of depths whereas accretion-deletion contributes primarily at larger depths.
|
39
|
Abstract
Reach-to-grasp movements require information about the distance and size of target objects. Calibration of this information could be achieved via feedback information (visual and/or haptic) regarding terminal accuracy when target objects are grasped. A number of reports suggest that the nervous system alters reach-to-grasp behavior following either a visual or haptic error signal indicating inaccurate reaching. Nevertheless, the reported modification is generally partial (reaching is changed less than predicted by the feedback error), a finding that has been ascribed to slow adaptation rates. It is possible, however, that the modified reaching reflects the system's weighting of the visual and haptic information in the presence of noise rather than calibration per se. We modeled the dynamics of calibration and showed that the discrepancy between reaching behavior and the feedback error results from an incomplete calibration process. Our results provide evidence for calibration being an intrinsic feature of reach-to-grasp behavior.
|
40
|
Abstract
The spatial resolution of disparity perception is poor compared to luminance perception, yet we do not notice that depth edges are more blurry than luminance edges. Is this because the two cues are combined by the visual system? Subjects judged the locations of depth-defined or luminance-defined edges, which were separated by up to 5.6 min of arc. The perceived edge location was a function of the depth-defined edge and the luminance-defined edge, with the luminance edge tending to play a larger role. Our data are compatible with but not completely explained by an optimal cue-combination model that gives more reliable cues a heavier weight. Both edge cues (depth and luminance) contribute to the final percept, with an adaptive weighting depending on the task and the acuity with which each cue is perceived.
|
41
|
Lateralization of kin recognition signals in the human face. J Vis 2010; 10:9. PMID: 20884584; PMCID: PMC4453869; DOI: 10.1167/10.8.9.
Abstract
When human subjects view photographs of faces, their judgments of identity, gender, emotion, age, and attractiveness depend more on one side of the face than the other. We report an experiment testing whether allocentric kin recognition (the ability to judge the degree of kinship between individuals other than the observer) is also lateralized. One hundred and twenty-four observers judged whether or not pairs of children were biological siblings by looking at photographs of their faces. In three separate conditions, (1) the right hemi-face was masked, (2) the left hemi-face was masked, or (3) the face was fully visible. The d' measures for the masked left hemi-face and masked right hemi-face were 1.024 and 1.004, respectively (no significant difference), and the d' measure for the unmasked face was 1.079, not significantly greater than that for either of the masked conditions. We conclude, first, that there is no superiority of one or the other side of the observed face in kin recognition, second, that the information present in the left and right hemi-faces relevant to recognizing kin is completely redundant, and last that symmetry cues are not used for kin recognition.
|
42
|
Abstract
It is often assumed that the space we perceive is Euclidean, although this idea has been challenged by many authors. Here we show that, if spatial cues are combined as described by Maximum Likelihood Estimation, Bayesian, or equivalent models, as appears to be the case, then Euclidean geometry cannot describe our perceptual experience. Rather, our perceptual spatial structure would be better described as belonging to an arbitrarily curved Riemannian space.
|
43
|
The Mixture of Bernoulli Experts: a theory to quantify reliance on cues in dichotomous perceptual decisions. J Vis 2009; 9:6.1-19. [PMID: 19271876 PMCID: PMC2757636 DOI: 10.1167/9.1.6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2008] [Accepted: 09/02/2008] [Indexed: 11/24/2022] Open
Abstract
The appearances of perceptually bistable stimuli can by definition be reported with confidence, so these stimuli may be useful to investigate how visual cues are learned and combined to construct visual appearance. However, interpreting experimental data (percent of trials seen one way or the other) requires a theoretically motivated measure of cue effectiveness. Here we describe a simple Bayesian theory for dichotomous perceptual decisions: the Mixture of Bernoulli Experts or MBE. In this theory, a cue's subjective reliability is the product of a weight and an estimate of the cue's ecological validity. The theory (1) justifies the use of probit analysis to measure the system's reliance on a cue and (2) enables hypothesis testing. To illustrate, we used apparent 3D rotation direction in perceptually ambiguous Necker cube movies to test whether the visual system relied on a newly recruited cue (position of the stimulus within the visual field) to the same extent when a long-trusted cue (binocular disparity) was present or not present in the display. For six trainees, reliance on the newly recruited cue was similar whether or not the long-trusted cue was present, suggesting that the visual system assumed the new cue to be conditionally independent.
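The mixture form described in this abstract can be sketched schematically: each cue is an "expert" voting for one interpretation, the decision follows expert i with probability equal to its subjective reliability, and leftover probability mass goes to an unbiased guessing expert. This is an assumed simplification (binary cue votes, uniform guessing mass), not the paper's exact parameterization:

```python
def mbe_choice_probability(cue_votes, reliabilities):
    """Schematic Mixture of Bernoulli Experts for a dichotomous decision.

    cue_votes[i] is True if cue i favors interpretation A.
    reliabilities[i] is the probability the decision follows expert i
    (in the theory, a weight times the cue's ecological validity).
    Residual mass (1 - sum of reliabilities) is split evenly between the
    two outcomes by a guessing expert.
    """
    residual = 1.0 - sum(reliabilities)
    assert residual >= 0.0, "reliabilities must sum to at most 1"
    p_A = 0.5 * residual  # guessing expert's contribution to choosing A
    for favors_A, r in zip(cue_votes, reliabilities):
        p_A += r * (1.0 if favors_A else 0.0)
    return p_A

# Illustrative case: a long-trusted disparity cue (r = 0.6) favoring A and
# a newly recruited position cue (r = 0.2) favoring the other percept.
p_A = mbe_choice_probability([True, False], [0.6, 0.2])  # 0.6 + 0.5 * 0.2
```

Fitting such mixing weights to percent-seen data is what lets the theory quantify reliance on a new cue with and without a trusted cue present.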
|
44
|
Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. J Vis 2008; 8:33.1-30. [PMID: 18484839 PMCID: PMC2879326 DOI: 10.1167/8.3.33] [Citation(s) in RCA: 374] [Impact Index Per Article: 23.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2007] [Accepted: 01/31/2008] [Indexed: 11/24/2022] Open
Abstract
Three-dimensional (3D) displays have become important for many applications including vision research, operation of remote devices, medical imaging, surgical training, scientific visualization, virtual prototyping, and more. In many of these applications, it is important for the graphic image to create a faithful impression of the 3D structure of the portrayed object or scene. Unfortunately, 3D displays often yield distortions in perceived 3D structure compared with the percepts of the real scenes the displays depict. A likely cause of such distortions is the fact that computer displays present images on one surface. Thus, focus cues (accommodation and blur in the retinal image) specify the depth of the display rather than the depths in the depicted scene. Additionally, the uncoupling of vergence and accommodation required by 3D displays frequently reduces one's ability to fuse the binocular stimulus and causes discomfort and fatigue for the viewer. We have developed a novel 3D display that presents focus cues that are correct or nearly correct for the depicted scene. We used this display to evaluate the influence of focus cues on perceptual distortions, fusion failures, and fatigue. We show that when focus cues are correct or nearly correct, (1) the time required to identify a stereoscopic stimulus is reduced, (2) stereoacuity in a time-limited task is increased, (3) distortions in perceived depth are reduced, and (4) viewer fatigue and discomfort are reduced. We discuss the implications of this work for vision research and the design and use of displays.
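The vergence-accommodation conflict this abstract describes is conventionally quantified in diopters (inverse meters): the eyes converge on the simulated depth while accommodation stays at the screen, so the conflict is the dioptric difference between the two distances. A small illustration; the viewing distances are assumptions chosen for the example, not values from the paper:

```python
def va_conflict_diopters(vergence_distance_m, focal_distance_m):
    """Vergence-accommodation conflict on a single-surface stereo display.

    vergence_distance_m: distance (m) of the simulated object the eyes
        converge on.
    focal_distance_m: distance (m) of the display surface the eyes must
        accommodate to.
    Returns the absolute dioptric mismatch between the two demands.
    """
    return abs(1.0 / vergence_distance_m - 1.0 / focal_distance_m)

# Illustrative case: screen at 0.5 m, simulated object at 0.33 m.
conflict = va_conflict_diopters(0.33, 0.5)
# A display with correct focus cues drives this mismatch toward zero.
```

Expressing the mismatch in diopters rather than meters matches how accommodation demand scales, which is why near-screen content produces much larger conflicts than distant content at the same metric separation.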
|