1
Choi R, Feldman J, Singh M. Perceptual Biases in the Interpretation of Non-Rigid Shape Transformations from Motion. Vision (Basel) 2024; 8:43. PMID: 39051229; PMCID: PMC11270375; DOI: 10.3390/vision8030043.
Abstract
Most existing research on the perception of 3D shape from motion has focused on rigidly moving objects. However, many natural objects deform non-rigidly, leading to image motion with no rigid interpretation. We investigated potential biases underlying the interpretation of non-rigid shape from motion. We presented observers with stimuli that were consistent with two qualitatively different interpretations: a two-part 3D object whose smaller part changed in length dynamically as the whole object rotated back and forth. In two experiments, we studied the misperception (i.e., perceptual reinterpretation) of this non-rigid length change. In Experiment 1, observers misperceived the length change as a change in part orientation (i.e., the smaller part was seen as articulating with respect to the larger part). In Experiment 2, the stimuli were similar, except that the silhouette of the part was visible in the image. Here, the non-rigid length change was reinterpreted as a rigidly attached part with an "illusory" non-orthogonal horizontal angle relative to the larger part. We developed a model that incorporates this perceptual reinterpretation and predicts observer data. We propose that the visual system may be biased towards part-wise rigid interpretations of non-rigid motion, likely due to the ecological significance of the movements of humans and other animals, which are generally constrained to move approximately part-wise rigidly. That is, not all non-rigid deformations are created equal: the visual system's prior expectations may bias it to interpret motion in terms of biologically plausible shape transformations.
Affiliation(s)
- Ryne Choi
- Department of Psychology and Rutgers Center for Cognitive Science (RuCCS), Rutgers University, Piscataway, NJ 08854, USA
2
Kemp JT, Cesanek E, Domini F. Perceiving depth from texture and disparity cues: Evidence for a non-probabilistic account of cue integration. J Vis 2023; 23:13. PMID: 37486299; PMCID: PMC10382782; DOI: 10.1167/jov.23.7.13.
Abstract
Bayesian inference theories have been extensively used to model how the brain derives three-dimensional (3D) information from ambiguous visual input. In particular, the maximum likelihood estimation (MLE) model combines estimates from multiple depth cues according to their relative reliability to produce the most probable 3D interpretation. Here, we tested an alternative theory of cue integration, termed the intrinsic constraint (IC) theory, which postulates that the visual system derives the most stable, not the most probable, interpretation of the visual input amid variations in viewing conditions. The vector sum model provides a normative approach for achieving this goal, in which individual cue estimates are components of a multidimensional vector whose norm determines the combined estimate. Individual cue estimates are not accurate, but are related to distal 3D properties through a deterministic mapping. In three experiments, we show that the IC theory accounts for 3D cue integration better than MLE models. In Experiment 1, we show systematic biases in the perception of depth from texture and depth from binocular disparity. Critically, we demonstrate that the vector sum model predicts an increase in perceived depth when these cues are combined. In Experiment 2, we illustrate the IC theory's radical reinterpretation of the just noticeable difference (JND) and test the vector sum model's prediction of the classic finding of smaller JNDs for combined-cue versus single-cue stimuli. In Experiment 3, we confirm the vector sum model's prediction that biases found in cue integration experiments cannot be attributed to flatness cues, as the MLE model predicts.
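The MLE scheme that this paper tests against has a standard textbook form: each cue estimate is weighted by its inverse variance, and the combined estimate is predicted to have lower variance (hence a smaller JND) than either cue alone. A minimal sketch of that prediction; the estimates and variances below are purely illustrative, not values from the study:

```python
# Maximum-likelihood (reliability-weighted) cue combination for two
# depth cues: weighted average with inverse-variance weights.

def mle_combine(est_texture, var_texture, est_stereo, var_stereo):
    """Return the MLE combined depth estimate and its predicted variance."""
    w_t = (1 / var_texture) / (1 / var_texture + 1 / var_stereo)
    w_s = 1 - w_t
    combined = w_t * est_texture + w_s * est_stereo
    # The combined variance is smaller than either single-cue variance:
    # this is the classic MLE prediction of smaller combined-cue JNDs.
    combined_var = (var_texture * var_stereo) / (var_texture + var_stereo)
    return combined, combined_var

est, var = mle_combine(est_texture=4.0, var_texture=1.0,
                       est_stereo=6.0, var_stereo=2.0)
# weights: 2/3 texture, 1/3 stereo; estimate ~4.67, variance ~0.67
```

Under MLE the combined variance always falls below the smaller single-cue variance, which is exactly the combined-cue JND advantage that Experiment 2 reinterprets.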
Affiliation(s)
- Jovan T Kemp
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Evan Cesanek
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Fulvio Domini
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Italian Institute of Technology, Rovereto, Italy
3
Domini F. The case against probabilistic inference: a new deterministic theory of 3D visual processing. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210458. PMID: 36511407; PMCID: PMC9745883; DOI: 10.1098/rstb.2021.0458.
Abstract
How the brain derives 3D information from inherently ambiguous visual input remains the fundamental question of human vision. The past two decades of research have addressed this question as a problem of probabilistic inference, the dominant model being maximum-likelihood estimation (MLE). This model assumes that independent depth-cue modules derive noisy but statistically accurate estimates of 3D scene parameters, which are combined through a weighted average, with cue weights adjusted according to the system's representation of each module's output variability. Here I demonstrate that the MLE model fails to account for important psychophysical findings and, importantly, misinterprets the just noticeable difference, a hallmark measure of stimulus discriminability, as an estimate of perceptual uncertainty. I propose a new theory, termed Intrinsic Constraint, which postulates that the visual system derives not the most probable interpretation of the visual input, but rather the most stable interpretation amid variations in viewing conditions. This goal is achieved with the Vector Sum model, which represents individual cue estimates as components of a multi-dimensional vector whose norm determines the combined output. This model accounts for the psychophysical findings cited in support of MLE while predicting existing and new findings that contradict the MLE model. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
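The Vector Sum model's core readout is simple enough to state in a few lines: individual cue estimates form the components of a vector, and the combined output is that vector's Euclidean norm, which can never be smaller than any single component. A minimal sketch with illustrative component values (not taken from the article):

```python
import math

def vector_sum_readout(cue_components):
    """Combined depth readout as the Euclidean norm of the cue vector
    (the Vector Sum model of the Intrinsic Constraint theory)."""
    return math.sqrt(sum(c * c for c in cue_components))

single = vector_sum_readout([3.0])         # one cue alone: 3.0
combined = vector_sum_readout([3.0, 4.0])  # adding a second cue: 5.0
# The norm exceeds either component, so adding cues increases the
# combined readout: the superadditivity pattern the theory predicts.
```

Because the norm is monotone in the number of nonzero components, this readout predicts greater perceived depth for combined-cue stimuli, in contrast to the MLE weighted average, which always lies between the single-cue estimates.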
Affiliation(s)
- Fulvio Domini
- CLPS, Brown University, 190 Thayer Street Providence, Rhode Island 02912-9067, USA
4
Tanrıkulu ÖD, Froyen V, Feldman J, Singh M. The interpretation of dynamic occlusion: Combining contour geometry and accretion/deletion of texture. Vision Res 2022; 199:108075. PMID: 35689958; DOI: 10.1016/j.visres.2022.108075.
Abstract
Conventional accounts of motion perception mostly treat accretion/deletion (the appearance or disappearance of texture at a boundary between regions) as an essentially decisive cue to relative depth: the accreting/deleting surface is interpreted as being behind adjacent surfaces. Under certain circumstances, however, accretion/deletion can be perceived in a radically different way: the accreting or deleting surface is seen as rotating in depth in front of adjacent surfaces. This alternative interpretation poses a problem for conventional accounts of motion interpretation, which cannot explain the phenomenon, in part because they ignore the role of contour geometry. In two experiments, we examined the combined role of contour convexity and accretion/deletion in determining the perception of relative depth by parametrically manipulating the strength of each cue. Our results show that convexity plays the more substantial role, often dominating the 3D percept, even when the salience of the convexity cue is substantially weakened on a contour where texture is accreting/deleting at high rates. These results highlight the need to rethink theories of perceptual organization in the critical case of moving stimuli.
Affiliation(s)
- Ö Dağlar Tanrıkulu
- Department of Psychology, Center for Cognitive Science, Rutgers University, United States; Cognitive Science Program, Williams College, United States.
- Vicky Froyen
- Department of Psychology, Center for Cognitive Science, Rutgers University, United States
- Jacob Feldman
- Department of Psychology, Center for Cognitive Science, Rutgers University, United States
- Manish Singh
- Department of Psychology, Center for Cognitive Science, Rutgers University, United States
5
Chen S, Li Y, Pan JS. Monocular Perception of Equidistance: The Effects of Viewing Experience and Motion-generated Information. Optom Vis Sci 2022; 99:470-478. PMID: 35149634; DOI: 10.1097/opx.0000000000001878.
Abstract
Significance: Using static depth information, normally sighted observers monocularly perceived equidistance with high accuracy. With dynamic depth information and/or monocular viewing experience, they also perceived it with high precision. Monocular patients, who are adapted to monocular viewing, should therefore be able to perceive equidistance and perform related tasks.
Purpose: This study investigated whether normally sighted observers could accurately and precisely perceive equidistance with one eye in different viewing environments, with varying optical information and monocular viewing experience.
Methods: Sixteen normally sighted observers monocularly perceived the distance (5 to 30 m) between a target and the self and replicated it either in hallways, which contained ample static monocular depth information but a limited field of view, or on a lawn, which contained less depth information but a large field of view. Participants remained stationary or walked 5 m before performing the task, as a manipulation of the availability of dynamic depth information. Eight observers wore eye patches for 3 hours before the experiment to gain monocular viewing experience; the others did not. Both accuracy and precision were measured.
Results: As long as static monocular depth information was available, equidistance perception was accurate, despite slight underestimation. Precision was improved by prior monocular walking and/or monocular viewing experience. Accuracy and precision were not affected by the viewing environment.
Conclusions: Using static and dynamic monocular depth information and/or with monocular experience, normally sighted observers judged equidistance with reliable accuracy and precision. This implies that patients with monocular vision, who are better adapted than the participants of this study, should also be able to perceive equidistance and perform distance-dependent tasks in natural viewing environments.
Affiliation(s)
- Shenying Chen
- Department of Psychology, Sun Yat-sen University, Guangzhou, China
- Yusi Li
- Department of Psychology, Sun Yat-sen University, Guangzhou, China
6
Campagnoli C, Hung B, Domini F. Explicit and implicit depth-cue integration: Evidence of systematic biases with real objects. Vision Res 2021; 190:107961. PMID: 34757304; DOI: 10.1016/j.visres.2021.107961.
Abstract
In previous studies using VR, we found evidence that 3D shape estimation conforms to a superadditivity rule of depth-cue combination, by which adding depth cues leads to greater perceived depth and, in principle, to depth overestimation. Superadditivity can be quantitatively accounted for by a normative theory of cue integration, by adapting a model termed Intrinsic Constraint (IC). As for its qualitative nature, it remains unclear whether superadditivity represents the genuine readout of depth-cue integration, as predicted by IC, or alternatively a byproduct of artificial virtual displays, which carry flatness cues that can bias depth estimates in a Bayesian fashion, or even just a way for observers to express that a scene "looks deeper" with more depth cues by explicitly inflating their depth judgments. In the present study, we addressed this question by testing whether the IC model's prediction of superadditivity generalizes to real-world settings. We asked participants to judge the perceived 3D shape of cardboard prisms in a matching task. To control for the potential interference of explicit reasoning, we also asked participants to reach to grasp the same objects, and we analyzed the in-flight grip size throughout the reach. We designed a novel technique to control binocular and monocular 3D cues independently, allowing depth information to be added or removed seamlessly. Even with real objects, participants exhibited a clear superadditivity effect in both tasks. Furthermore, the magnitude of this effect was accurately predicted by the IC model. These results confirm that superadditivity is an inherent feature of depth estimation.
Affiliation(s)
- Carlo Campagnoli
- School of Psychology, University of Leeds, Leeds, UK; Department of Psychology, Princeton University, Princeton, NJ, USA.
- Bethany Hung
- The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Fulvio Domini
- Department of Cognitive, Linguistic and Psychological Science, Brown University, Providence, RI, USA
7
The Z-Box illusion: dominance of motion perception among multiple 3D objects. Psychol Res 2021; 86:1683-1697. PMID: 34480245; DOI: 10.1007/s00426-021-01589-0.
Abstract
In the present article, we examine a novel illusion of motion, the Z-Box illusion, in which the presence of a bounding object influences the perception of motion of an ambiguous stimulus that appears within it. Specifically, the stimuli are a structure-from-motion (SFM) particle orb and a wireframe cube. The orb could be perceived as rotating clockwise or counterclockwise, while the cube could only be perceived as moving in one direction. Both stimuli were presented on a two-dimensional (2D) display with inferred three-dimensional (3D) properties. In a single experiment, we examine motion perception of the particle orb, both in isolation and when it appears within a rotating cube. Participants indicated the orb's direction of motion and whether the direction changed at any point during the trial. Accuracy was the critical measure, while motion direction, the number of particles in the orb, and the presence of the wireframe cube were all manipulated. The results suggest that participants could perceive the orb's true rotation in the absence of the cube so long as it was made up of at least ten particles. The presence of the cube dominated perception: participants consistently perceived congruent motion of the orb and cube, even when the two moved in objectively different directions. These findings are considered as they relate to prior research on motion perception, computational modelling of motion perception, structure from motion, and 3D object perception.
8
Cesanek E, Taylor JA, Domini F. Persistent grasping errors produce depth cue reweighting in perception. Vision Res 2020; 178:1-11. PMID: 33070029; DOI: 10.1016/j.visres.2020.09.007.
Abstract
When a grasped object is larger or smaller than expected, haptic feedback automatically recalibrates motor planning. Intriguingly, haptic feedback can also affect 3D shape perception through a process called depth cue reweighting. Although signatures of cue reweighting also appear in motor behavior, it is unclear whether this motor reweighting is the result of upstream perceptual reweighting, or a separate process. We propose that perceptual reweighting is directly related to motor control; in particular, that it is caused by persistent, systematic movement errors that cannot be resolved by motor recalibration alone. In Experiment 1, we inversely varied texture and stereo cues to create a set of depth-metamer objects: when texture specified a deep object, stereo specified a shallow object, and vice versa, such that all objects appeared equally deep. The stereo-texture pairings that produced this perceptual metamerism were determined for each participant in a matching task (Pre-test). Next, participants repeatedly grasped these depth metamers, receiving haptic feedback that was positively correlated with one cue and negatively correlated with the other, resulting in persistent movement errors. Finally, participants repeated the perceptual matching task (Post-test). In the condition where haptic feedback reinforced the texture cue, perceptual changes were correlated with changes in grasping performance across individuals, demonstrating a link between perceptual reweighting and improved motor control. Experiment 2 showed that cue reweighting does not occur when movement errors are rapidly corrected by standard motor adaptation. These findings suggest a mutual dependency between perception and action, with perception directly guiding action, and actions producing error signals that drive motor and perceptual learning.
Affiliation(s)
- Evan Cesanek
- Department of Cognitive, Linguistic, & Psychological Sciences, Brown University, Providence, RI, United States.
- Jordan A Taylor
- Department of Psychology, Princeton University, Princeton, NJ, United States
- Fulvio Domini
- Department of Cognitive, Linguistic, & Psychological Sciences, Brown University, Providence, RI, United States
9
Perceiving blurry scenes with translational optic flow, rotational optic flow or combined optic flow. Vision Res 2019; 158:49-57. PMID: 30796993; DOI: 10.1016/j.visres.2018.11.008.
Abstract
Perceiving the spatial layout of objects is crucial in visual scene perception. Optic flow provides information about spatial layout, and this information is not affected by image blur because motion detection uses low spatial frequencies in image structure. Therefore, perceiving scenes with blurry vision should be effective when optic flow is available. Furthermore, when blurry images and optic flow interact, optic flow specifies spatial relations and calibrates the blurry images; the calibrated image structure then preserves the spatial relations specified by optic flow after motion stops. Thus, perception of blurry scenes should remain stable when both optic flow and blurry images are available. We investigated the types of optic flow that facilitate recognition of blurry scenes and evaluated the stability of performance. Participants identified scenes in blurry videos when viewing single frames and the entire videos, which contained translational flow (Experiment 1), rotational flow (Experiment 2), or both (Experiment 3). When first viewing the blurry still images, participants identified only a few scenes. When viewing the blurry video clips, their performance improved with translational flow, whether it was available alone or in combination with rotational flow. Participants were still able to identify scenes from the static blurry images one week later. Therefore, translational flow interacts with blurry image structure to yield effective and stable scene perception. These results imply that observers with blurry vision may be able to identify their surroundings when they locomote.
10
Abstract
Pose estimation of objects in real scenes is critically important for biological and machine visual systems, but little is known about how humans infer 3D poses from 2D retinal images. We show unexpectedly strong agreement in the 3D poses that different observers estimate from pictures. We further show that all observers apply the same inferential rule from all viewpoints, utilizing the geometrically derived back-transform from retinal images to actual 3D scenes. Pose estimates are altered by a fronto-parallel bias and by image distortions that appear to tilt the ground plane. We used pictures of single sticks or pairs of joined sticks taken from different camera angles. Observers viewed these from five directions and matched the perceived pose of each stick by rotating an arrow on a horizontal touchscreen. The projection of each 3D stick to the 2D picture, and then onto the retina, is described by an invertible trigonometric expression. The inverted expression yields the back-projection for each object pose, camera elevation, and observer viewpoint. We show that a model that uses the back-projection, modulated by just two free parameters, explains 560 pose estimates per observer. By considering changes in retinal image orientations due to the position and elevation of limbs, the model also explains perceived limb poses in a complex scene of two bodies lying on the ground. These inferential rules parsimoniously explain both perceptual invariance and dramatic distortions in the poses of real and pictured objects, and show the benefits of incorporating the projective geometry of light into mental inferences about 3D scenes.
11
Pan JS, Bingham N, Chen C, Bingham GP. Breaking camouflage and detecting targets require optic flow and image structure information. Appl Opt 2017; 56:6410-6418. PMID: 29047842; DOI: 10.1364/ao.56.006410.
Abstract
Use of motion to break camouflage extends back to the Cambrian [In the Blink of an Eye: How Vision Sparked the Big Bang of Evolution (New York: Basic Books, 2003)]. We investigated the ability to break camouflage and to continue to see camouflaged targets after motion stops, which is crucial for the survival of hunting predators. With camouflage, visual targets and distracters cannot be distinguished using static image structure (i.e., appearance) alone. Motion generates another source of optical information, optic flow, which breaks camouflage and specifies target locations. Optic flow calibrates image structure with respect to the spatial relations among targets and distracters, and the calibrated image structure makes previously camouflaged targets perceptible in a temporally stable fashion after motion stops. We investigated this proposal in laboratory experiments comparing how many camouflaged targets were identified with optic flow information alone versus with combined optic flow and image structure information. Our results show that the combination of motion-generated optic flow and target-projected image structure information yielded efficient and stable perception of camouflaged targets.
12
Butkiewicz T, Stevens AH. Effectiveness of Structured Textures on Dynamically Changing Terrain-like Surfaces. IEEE Trans Vis Comput Graph 2016; 22:926-934. PMID: 26529737; DOI: 10.1109/tvcg.2015.2467962.
Abstract
Previous perceptual research and human factors studies have identified several effective methods for texturing 3D surfaces to ensure that their curvature is accurately perceived by viewers. However, most of these studies examined the application of these techniques to static surfaces. This paper explores the effectiveness of applying these techniques to dynamically changing surfaces. When these surfaces change shape, common texturing methods, such as grids and contours, induce a range of different motion cues, which can draw attention and provide information about the size, shape, and rate of change. A human factors study was conducted to evaluate the relative effectiveness of these methods when applied to dynamically changing pseudo-terrain surfaces. The results indicate that, while no technique is most effective for all cases, contour lines generally perform best, and that the pseudocontour lines induced by banded color scales convey the same benefits.
13
Nawrot M, Ratzlaff M, Leonard Z, Stroyan K. Modeling depth from motion parallax with the motion/pursuit ratio. Front Psychol 2014; 5:1103. PMID: 25339926; PMCID: PMC4186274; DOI: 10.3389/fpsyg.2014.01103.
Abstract
The perception of unambiguous scaled depth from motion parallax relies on both retinal image motion and an extra-retinal pursuit eye movement signal. The motion/pursuit ratio represents a dynamic geometric model linking these two proximal cues to the ratio of depth to viewing distance. An important step in understanding the visual mechanisms serving the perception of depth from motion parallax is to determine the relationship between these stimulus parameters and empirically determined perceived depth magnitude. Observers compared perceived depth magnitude of dynamic motion parallax stimuli to static binocular disparity comparison stimuli at three different viewing distances, in both head-moving and head-stationary conditions. A stereo-viewing system provided ocular separation for stereo stimuli and monocular viewing of parallax stimuli. For each motion parallax stimulus, a point of subjective equality (PSE) was estimated for the amount of binocular disparity that generates the equivalent magnitude of perceived depth from motion parallax. Similar to previous results, perceived depth from motion parallax had significant foreshortening. Head-moving conditions produced even greater foreshortening due to the differences in the compensatory eye movement signal. An empirical version of the motion/pursuit law, termed the empirical motion/pursuit ratio, which models perceived depth magnitude from these stimulus parameters, is proposed.
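To a first approximation, the motion/pursuit law behind this model states that the ratio of retinal image motion rate to smooth pursuit rate equals depth relative to the fixation distance, d/f ≈ dθ/dα. A minimal sketch of that geometric relation; the rates and distance below are illustrative, not data from the study:

```python
def depth_from_motion_pursuit(retinal_motion_rate, pursuit_rate,
                              fixation_distance):
    """Relative depth from the motion/pursuit ratio: d ~ f * (dθ/dα),
    where dθ is the retinal motion rate and dα the pursuit rate."""
    if pursuit_rate == 0:
        raise ValueError("pursuit rate must be nonzero")
    return fixation_distance * (retinal_motion_rate / pursuit_rate)

# A point whose image drifts at 0.5 deg/s while the eye pursues at
# 5 deg/s, with fixation at 1.0 m, lies about 0.1 m from fixation.
d = depth_from_motion_pursuit(0.5, 5.0, 1.0)
```

The empirical motion/pursuit ratio proposed in the paper modulates this geometric quantity to capture perceived (foreshortened) depth magnitude rather than physical depth.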
Affiliation(s)
- Mark Nawrot
- Department of Psychology, Center for Visual and Cognitive Neuroscience, North Dakota State University, Fargo, ND, USA
- Michael Ratzlaff
- Department of Psychology, Center for Visual and Cognitive Neuroscience, North Dakota State University, Fargo, ND, USA
- Zachary Leonard
- Department of Psychology, Center for Visual and Cognitive Neuroscience, North Dakota State University, Fargo, ND, USA
- Keith Stroyan
- Math Department, University of Iowa, Iowa City, IA, USA
14
15
Stroyan K, Nawrot M. Visual depth from motion parallax and eye pursuit. J Math Biol 2012; 64:1157-1188. PMID: 21695531; PMCID: PMC3348271; DOI: 10.1007/s00285-011-0445-1.
Abstract
A translating observer viewing a rigid environment experiences "motion parallax": the relative movement on the observer's retina of variously positioned objects in the scene. This retinal image motion provides a cue to the relative depth of objects in the environment; however, retinal motion alone cannot mathematically determine the objects' relative depth. Visual perception of depth from lateral observer translation uses both retinal image motion and eye movement. In Nawrot and Stroyan (Vision Res 49:1969-1978, 2009) we showed mathematically that the ratio of the rate of retinal motion to the rate of smooth eye pursuit determines depth relative to the fixation point in central vision. We also reported psychophysical experiments indicating that this ratio is the important quantity for perception. Here we analyze the motion/pursuit cue for the more general, and more complicated, case in which objects are distributed across the horizontal viewing plane beyond central vision. We show how the mathematical motion/pursuit cue varies across points in the plane and over time as an observer translates. If time-varying retinal motion and smooth eye pursuit are the only signals used for this visual process, it is important to know what can mathematically be derived about depth and structure from them. Our analysis shows that the motion/pursuit ratio yields an excellent description of depth and structure under these broader stimulus conditions, provides a detailed quantitative hypothesis of the visual processes underlying the perception of depth and structure from motion parallax, and provides a computational foundation for analyzing the dynamic geometry of future experiments.
Affiliation(s)
- Keith Stroyan
- Mathematics Department, University of Iowa, Iowa City, IA 52242, USA
16
Fantoni C, Caudek C, Domini F. Perceived surface slant is systematically biased in the actively-generated optic flow. PLoS One 2012; 7:e33911. PMID: 22479473; PMCID: PMC3316515; DOI: 10.1371/journal.pone.0033911.
Abstract
Humans make systematic errors in the 3D interpretation of the optic flow in both passive and active vision. These systematic distortions can be predicted by a biologically-inspired model which disregards self-motion information resulting from head movements (Caudek, Fantoni, & Domini 2011). Here, we tested two predictions of this model: (1) A plane that is stationary in an earth-fixed reference frame will be perceived as changing its slant if the movement of the observer's head causes a variation of the optic flow; (2) a surface that rotates in an earth-fixed reference frame will be perceived to be stationary, if the surface rotation is appropriately yoked to the head movement so as to generate a variation of the surface slant but not of the optic flow. Both predictions were corroborated by two experiments in which observers judged the perceived slant of a random-dot planar surface during egomotion. We found qualitatively similar biases for monocular and binocular viewing of the simulated surfaces, although, in principle, the simultaneous presence of disparity and motion cues allows for a veridical recovery of surface slant.
Affiliation(s)
- Carlo Fantoni
- Center for Neuroscience and Cognitive Systems@UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy.
17. Warren WH. Does this computational theory solve the right problem? Marr, Gibson, and the goal of vision. Perception 2012; 41:1053-60. PMID: 23409371; PMCID: PMC3816718; DOI: 10.1068/p7327.
Abstract
David Marr's book Vision attempted to formulate a thoroughgoing formal theory of perception. Marr borrowed much of the "computational" level from James Gibson: a proper understanding of the goal of vision, the natural constraints, and the available information are prerequisite to describing the processes and mechanisms by which the goal is achieved. Yet, as a research program leading to a computational model of human vision, Marr's program did not succeed. This article asks why, using the perception of 3D shape as a morality tale. Marr presumed that the goal of vision is to recover a general-purpose Euclidean description of the world, which can be deployed for any task or action. On this formulation, vision is underdetermined by information, which in turn necessitates auxiliary assumptions to solve the problem. But Marr's assumptions did not actually reflect natural constraints, and consequently the solutions were not robust. We now know that humans do not in fact recover Euclidean structure--rather, they reliably perceive qualitative shape (hills, dales, courses, ridges), which is specified by the second-order differential structure of images. By recasting the goals of vision in terms of our perceptual competencies, and doing the hard work of analyzing the information available under ecological constraints, we can reformulate the problem so that perception is determined by information and prior knowledge is unnecessary.
Affiliation(s)
- William H Warren
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA.
18. Domini F, Shah R, Caudek C. Do we perceive a flattened world on the monitor screen? Acta Psychol (Amst) 2011; 138:359-66. PMID: 21986481; DOI: 10.1016/j.actpsy.2011.07.007.
Abstract
The current model of three-dimensional perception hypothesizes that the brain integrates the depth cues in a statistically optimal fashion through a weighted linear combination with weights proportional to the reliabilities obtained for each cue in isolation (Landy, Maloney, Johnston, & Young, 1995). Even though many investigations support this theoretical framework, some recent empirical findings are at odds with this view (e.g., Domini, Caudek, & Tassinari, 2006). Failures of linear cue integration have been attributed to cue-conflict and to unmodelled cues to flatness present in computer-generated displays. We describe two cue-combination experiments designed to test the integration of stereo and motion cues, in the presence of consistent or conflicting blur and accommodation information (i.e., when flatness cues are either absent, with physical stimuli, or present, with computer-generated displays). In both conditions, we replicated the results of Domini et al. (2006): The amount of perceived depth increased as more cues were available, also producing an over-estimation of depth in some conditions. These results can be explained by the Intrinsic Constraint model, but not by linear cue combination.
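The weighted linear combination referred to above (Landy et al., 1995) has a standard form: each cue is weighted by its reliability, the inverse of its variance. A minimal sketch, with illustrative numbers and names:

```python
def combine_cues(estimates, variances):
    """Reliability-weighted (MLE) combination of independent depth cues.
    Returns the combined estimate and its variance."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]       # weights sum to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, 1.0 / total                       # variance is below any single cue's

# Stereo signals 10 cm of depth (variance 1), motion signals 14 cm (variance 3):
estimate, variance = combine_cues([10.0, 14.0], [1.0, 3.0])
print(estimate, variance)  # 11.0 0.75
```

The findings summarized above, with more perceived depth as cues are added and over-estimation in some conditions, are precisely what this scheme cannot produce: the weighted average always lies between the single-cue estimates.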
Affiliation(s)
- Fulvio Domini
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI 02912, USA.
19. Caudek C, Fantoni C, Domini F. Bayesian modeling of perceived surface slant from actively-generated and passively-observed optic flow. PLoS One 2011; 6:e18731. PMID: 21533197; PMCID: PMC3077406; DOI: 10.1371/journal.pone.0018731.
Abstract
We measured perceived depth from the optic flow (a) when showing a stationary physical or virtual object to observers who moved their head at a normal or slower speed, and (b) when simulating the same optic flow on a computer and presenting it to stationary observers. Our results show that perceived surface slant is systematically distorted, for both the active and the passive viewing of physical or virtual surfaces. These distortions are modulated by head translation speed, with perceived slant increasing directly with the local velocity gradient of the optic flow. This empirical result allows us to determine the relative merits of two alternative approaches aimed at explaining perceived surface slant in active vision: an "inverse optics" model that takes head motion information into account, and a probabilistic model that ignores extra-retinal signals. We compare these two approaches within a Bayesian framework. The "inverse optics" Bayesian model produces veridical slant estimates if the optic flow and the head translation velocity are measured with no error; because of the influence of a "prior" for flatness, the slant estimates become systematically biased as the measurement errors increase. The Bayesian model, which ignores the observer's motion, always produces distorted estimates of surface slant. Interestingly, the predictions of this second model, not those of the first one, are consistent with our empirical findings. The present results suggest that (a) in active vision perceived surface slant may be the product of probabilistic processes which do not guarantee the correct solution, and (b) extra-retinal signals may be mainly used for a better measurement of retinal information.
Affiliation(s)
- Corrado Caudek
- Department of Psychology, Università degli Studi di Firenze, Firenze, Italy.
20. Integration of disparity and velocity information for haptic and perceptual judgments of object depth. Acta Psychol (Amst) 2011; 136:300-10. PMID: 21237442; DOI: 10.1016/j.actpsy.2010.12.003.
Abstract
Do reach-to-grasp (prehension) movements require a metric representation of three-dimensional (3D) layouts and objects? We propose a model relying only on direct sensory information to account for the planning and execution of prehension movements in the absence of haptic feedback and when the hand is not visible. In the present investigation, we isolate relative motion and binocular disparity information from other depth cues and we study their efficacy for reach-to-grasp movements and visual judgments. We show that (i) the amplitude of the grasp increases when relative motion is added to binocular disparity information, even if depth from disparity information is already veridical, and (ii) similar distortions of derived depth are found for haptic tasks and perceptual judgments. With a quantitative test, we demonstrate that our results are consistent with the Intrinsic Constraint model and do not require 3D metric inferences (Domini, Caudek, & Tassinari, 2006). By contrast, the linear cue integration model (Landy, Maloney, Johnston, & Young, 1995) cannot explain the present results, even if the flatness cues are taken into account.
21.
Abstract
Many organisms and objects deform nonrigidly when moving, requiring perceivers to separate shape changes from object motions. Surprisingly, the abilities of observers to correctly infer nonrigid volumetric shapes from motion cues have not been measured, and structure from motion models predominantly use variants of rigidity assumptions. We show that observers are equally sensitive at discriminating cross-sections of flexing and rigid cylinders based on motion cues, when the cylinders are rotated simultaneously around the vertical and depth axes. A computational model based on motion perspective (i.e., assuming perceived depth is inversely proportional to local velocity) predicted the psychometric curves better than shape from motion factorization models using shape or trajectory basis functions. Asymmetric percepts of symmetric cylinders, arising because of asymmetric velocity profiles, provided additional evidence for the dominant role of relative velocity in shape perception. Finally, we show that inexperienced observers are generally incapable of using motion cues to detect inflation/deflation of rigid and flexing cylinders, but this handicap can be overcome with practice for both nonrigid and rigid shapes. The empirical and computational results of this study argue against the use of rigidity assumptions in extracting 3D shape from motion and for the primacy of motion deformations computed from motion shears.
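The motion-perspective rule that best predicted the psychometric curves in this study can be sketched directly: perceived depth at each image point is taken as inversely proportional to the local image speed. The scaling constant and the names below are illustrative assumptions, not the authors' code.

```python
def perceived_depth_profile(local_speeds, k=1.0):
    """Predicted perceived depth at each image location: k / |local speed|.
    Faster-moving regions are assigned smaller depths (seen as nearer)."""
    return [k / abs(v) for v in local_speeds]

# An asymmetric velocity profile yields an asymmetric depth percept,
# even when the generating cylinder is physically symmetric:
print(perceived_depth_profile([0.5, 1.0, 2.0]))  # [2.0, 1.0, 0.5]
```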
22. Jain A, Backus BT. Experience affects the use of ego-motion signals during 3D shape perception. J Vis 2010; 10(14):30. PMID: 21191132; DOI: 10.1167/10.14.30.
Abstract
Experience has long-term effects on perceptual appearance (Q. Haijiang, J. A. Saunders, R. W. Stone, & B. T. Backus, 2006). We asked whether experience affects the appearance of structure-from-motion stimuli when the optic flow is caused by observer ego-motion. Optic flow is an ambiguous depth cue: a rotating object and its oppositely rotating, depth-inverted dual generate similar flow. However, the visual system exploits ego-motion signals to prefer the percept of an object that is stationary over one that rotates (M. Wexler, F. Panerai, I. Lamouret, & J. Droulez, 2001). We replicated this finding and asked whether this preference for stationarity, the "stationarity prior," is modulated by experience. During training, two groups of observers were exposed to objects with identical flow, but that were either stationary or moving as determined by other cues. The training caused identical test stimuli to be seen preferentially as stationary or moving by the two groups, respectively. We then asked whether different priors can exist independently at different locations in the visual field. Observers were trained to see objects either as stationary or as moving at two different locations. Observers' stationarity bias at the two respective locations was modulated in the directions consistent with training. Thus, the utilization of extraretinal ego-motion signals for disambiguating optic flow signals can be updated as the result of experience, consistent with the updating of a Bayesian prior for stationarity.
Affiliation(s)
- Anshul Jain
- SUNY Eye Institute and Graduate Center for Vision Research, SUNY College of Optometry, New York, NY 10036, USA.
23. Di Luca M, Domini F, Caudek C. Inconsistency of perceived 3D shape. Vision Res 2010; 50:1519-31. PMID: 20470815; DOI: 10.1016/j.visres.2010.05.006.
Abstract
Internal consistency of local depth, slant, and curvature judgments was studied by asking participants to match two 3D surfaces rendered by different mixtures of 3D cues (velocity, texture, and shading). We found that perceptual judgments were not consistent with each other, with cue-specific distortions. Adding multiple cues did not eliminate the inconsistencies of the judgments. These results can be predicted by the Intrinsic Constraint (IC) model, according to which local metric perceptual estimates are a monotonically increasing function of the signal-to-noise ratio of the optimal combination of direct 3D-shape information (Domini, Caudek, & Tassinari, 2006).
Affiliation(s)
- M Di Luca
- Max Planck Institute for Biological Cybernetics, Tuebingen, Germany.
24. Domini F, Caudek C. Matching perceived depth from disparity and from velocity: Modeling and psychophysics. Acta Psychol (Amst) 2010; 133:81-9. PMID: 19963200; DOI: 10.1016/j.actpsy.2009.10.003.
Abstract
We asked observers to match in depth a disparity-only stimulus with a velocity-only stimulus. The observers' responses revealed systematic biases: the two stimuli appeared to be matched in depth when they were produced by the projection of different distal depth extents. We discuss two alternative models of depth recovery that could account for these results. (1) Depth matches could be obtained by scaling the image signals by constants not specified by optical information, and (2) depth matches could be obtained by equating the stimuli in terms of their signal-to-noise ratios (see Domini & Caudek, 2009). We show that the systematic failures of shape constancy revealed by observers' judgments are well accounted for by the hypothesis that the apparent depth of a stimulus is determined by the magnitude of the retinal signals relative to the uncertainty (i.e., internal noise) arising from the measurement of those signals.
25. Fernandez JM, Farell B. A new theory of structure-from-motion perception. J Vis 2009; 9(11):23.1-20. PMID: 20053086; DOI: 10.1167/9.11.23.
Abstract
Humans can recover 3-D structure from the projected 2-D motion field of a rotating object, a phenomenon called structure from motion (SFM). Current models of SFM perception are limited to the case in which objects rotate about a frontoparallel axis. However, as our recent psychophysical studies showed, frontoparallel axes of rotation are not representative of the general case. Here we present the first model to address the problem of SFM perception for the general case of rotations around an arbitrary axis. The SFM computation is cast as a two-stage process. The first stage computes the structure perpendicular to the axis of rotation. The second stage corrects for the slant of the axis of rotation. For cylinders, the computed object shape is invariant with respect to the observer's viewpoint (that is, perceived shape doesn't change with a change in the direction of the axis of rotation). The model uses template matching to estimate global parameters such as the angular speed of rotation, which are then used to compute the local depth structure. The model provides quantitative predictions that agree well with current psychophysical data for both frontoparallel and non-frontoparallel rotations.
Affiliation(s)
- Julian M Fernandez
- Institute for Sensory Research, Syracuse University, Syracuse, NY 13224, USA.
26. Warren PA, Rushton SK. Perception of scene-relative object movement: Optic flow parsing and the contribution of monocular depth cues. Vision Res 2009; 49:1406-19. PMID: 19480063; DOI: 10.1016/j.visres.2009.01.016.
Abstract
We have recently suggested that the brain uses its sensitivity to optic flow in order to parse retinal motion into components arising due to self and object movement (e.g. Rushton, S. K., & Warren, P. A. (2005). Moving observers, 3D relative motion and the detection of object movement. Current Biology, 15, R542-R543). Here, we explore whether stereo disparity is necessary for flow parsing or whether other sources of depth information, which could theoretically constrain flow-field interpretation, are sufficient. Stationary observers viewed large field of view stimuli containing textured cubes, moving in a manner that was consistent with a complex observer movement through a stationary scene. Observers made speeded responses to report the perceived direction of movement of a probe object presented at different depths in the scene. Across conditions we varied the presence or absence of different binocular and monocular cues to depth order. In line with previous studies, results consistent with flow parsing (in terms of both perceived direction and response time) were found in the condition in which motion parallax and stereoscopic disparity were present. Observers were poorer at judging object movement when depth order was specified by parallax alone. However, as more monocular depth cues were added to the stimulus the results approached those found when the scene contained stereoscopic cues. We conclude that both monocular and binocular static depth information contribute to flow parsing. These findings are discussed in the context of potential architectures for a model of the flow parsing mechanism.
Affiliation(s)
- Paul A Warren
- School of Psychology and Communications Research Centre, Cardiff University, Cardiff, CF10 3AT Wales, UK.
27. Nawrot M, Stroyan K. The motion/pursuit law for visual depth perception from motion parallax. Vision Res 2009; 49:1969-78. PMID: 19463848; PMCID: PMC2735858; DOI: 10.1016/j.visres.2009.05.008.
Abstract
One of vision's most important functions is specification of the layout of objects in the 3D world. While the static optical geometry of retinal disparity explains the perception of depth from binocular stereopsis, we propose a new formula to link the pertinent dynamic geometry to the computation of depth from motion parallax. Mathematically, the ratio of retinal image motion (motion) and smooth pursuit of the eye (pursuit) provides the necessary information for the computation of relative depth from motion parallax. We show that this could have been obtained with the approaches of Nakayama and Loomis [Nakayama, K., & Loomis, J. M. (1974). Optical velocity patterns, velocity-sensitive neurons, and space perception: A hypothesis. Perception, 3, 63-80] or Longuet-Higgins and Prazdny [Longuet-Higgins, H. C., & Prazdny, K. (1980). The interpretation of a moving retinal image. Proceedings of the Royal Society of London Series B, 208, 385-397] by adding pursuit to their treatments. Results of a psychophysical experiment show that changes in the motion/pursuit ratio have a much better relationship to changes in the perception of depth from motion parallax than do changes in motion or pursuit alone. The theoretical framework provided by the motion/pursuit law provides the quantitative foundation necessary to study this fundamental visual depth perception ability.
Affiliation(s)
- Mark Nawrot
- Center for Visual Neuroscience, Department of Psychology, North Dakota State University, Fargo, ND 58104, USA.
28. Fernandez JM, Farell B. Is perceptual space inherently non-Euclidean? J Math Psychol 2009; 53:86-91. PMID: 20161280; PMCID: PMC2702877; DOI: 10.1016/j.jmp.2008.12.006.
Abstract
It is often assumed that the space we perceive is Euclidean, although this idea has been challenged by many authors. Here we show that, if spatial cues are combined as described by Maximum Likelihood Estimation, Bayesian, or equivalent models, as appears to be the case, then Euclidean geometry cannot describe our perceptual experience. Rather, our perceptual spatial structure would be better described as belonging to an arbitrarily curved Riemannian space.
Affiliation(s)
| | - Bart Farell
- Institute for Sensory Research, Syracuse University, Syracuse, NY, 13244, USA
29. Tassinari H, Domini F. The intrinsic constraint model for stereo-motion integration. Perception 2008; 37:79-95. PMID: 18399249; DOI: 10.1068/p5501.
Abstract
How the visual system integrates the information provided by several depth cues is central for vision research. Here, we present a model for how the human visual system combines disparity and velocity information. The model provides a depth interpretation to a subspace defined by the covariation of the two signals. We show that human performance is consistent with the predictions of the model, and compare them with those of another theoretical approach, the modified weak-fusion model. We discuss the validity of each approach as a model for human perception of 3-D shape from multiple cues to depth.
Affiliation(s)
- Hadley Tassinari
- Department of Cognitive and Linguistic Sciences, Brown University, Box 1978, Providence, RI 02912, USA
30. Colas F, Droulez J, Wexler M, Bessière P. A unified probabilistic model of the perception of three-dimensional structure from optic flow. Biol Cybern 2007; 97:461-477. PMID: 17987312; DOI: 10.1007/s00422-007-0183-z.
Abstract
Human observers can perceive the three-dimensional (3-D) structure of their environment using various cues, an important one of which is optic flow. The motion of any point's projection on the retina depends both on the point's movement in space and on its distance from the eye. Therefore, retinal motion can be used to extract the 3-D structure of the environment and the shape of objects, in a process known as structure-from-motion (SFM). However, because many combinations of 3-D structure and motion can lead to the same optic flow, SFM is an ill-posed inverse problem. The rigidity hypothesis is a constraint supposed to formally solve the SFM problem and to account for human performance. Recently, however, a number of psychophysical results, with both moving and stationary human observers, have shown that the rigidity hypothesis alone cannot account for human performance in SFM tasks, but no model is known to account for the new results. Here, we construct a Bayesian model of SFM based mainly on one new hypothesis, that of stationarity, coupled with the rigidity hypothesis. The predictions of the model, calculated using a new and powerful methodology called Bayesian programming, account for a wide variety of experimental findings.
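How a stationarity prior can settle an ambiguous flow field is easy to illustrate. The sketch below is a generic Bayesian model comparison, not the paper's Bayesian program, and the probabilities are made-up numbers.

```python
def posterior(likelihoods, priors):
    """Posterior over candidate scene interpretations: normalize likelihood * prior."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    z = sum(joint)  # normalizing constant
    return [j / z for j in joint]

# A stationary object and its depth-inverted, rotating dual explain the observed
# optic flow equally well (equal likelihoods), so the stationarity prior decides:
print(posterior(likelihoods=[0.5, 0.5], priors=[0.8, 0.2]))  # [0.8, 0.2]
```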
Affiliation(s)
- Francis Colas
- LPPA, Collège de France, 11, place Marcelin Berthelot, 75231 Paris Cedex 05, France.
31.
Abstract
The present study investigates how observers assign depth in point-light figures by manipulating spatiotemporal characteristics of the stimuli. Previous research on the perception of point-light walkers revealed bistability (i.e., that a point-light walker is perceived as either facing the viewer or facing away from the viewer) and the presence of a perceptual bias (i.e., a tendency to perceive the figure as facing the viewer). Here, we study the generality of these phenomena by having observers indicate the global depth orientation of different ambiguous point-light actions. Results demonstrate bistability for all actions, but the presence of a preferred interpretation depends strongly on the performed action, showing that the process of depth assignment takes into account the movements the point-light figure performs. Two additional experiments, using unfamiliar movement patterns without strong semantic correlates, show that purely kinematic aspects of an action also strongly affect depth assignment. Together, the results reveal the perception of depth in point-light figures to be a flexible process involving both bottom-up and top-down components.
Affiliation(s)
- Jan Vanrie
- Katholieke Universiteit Leuven, Leuven, Belgium
32. Domini F, Caudek C, Tassinari H. Stereo and motion information are not independently processed by the visual system. Vision Res 2006; 46:1707-23. PMID: 16412492; DOI: 10.1016/j.visres.2005.11.018.
Abstract
Many visual tasks are carried out by using multiple sources of sensory information to estimate environmental properties. In this paper, we present a model for how the visual system combines disparity and velocity information. We propose that, in a first stage of processing, the best possible estimate of the affine structure is obtained by computing a composite score from the disparity and velocity signals. In a second stage, a maximum likelihood Euclidean interpretation is assigned to the recovered affine structure. In two experiments, we show that human performance is consistent with the predictions of our model. The present results are also discussed in the framework of another theoretical approach to the depth-cue combination process, termed Modified Weak Fusion.
Affiliation(s)
- Fulvio Domini
- Department of Cognitive and Linguistic Sciences, Brown University, USA.
33. Naji JJ, Freeman TCA. Perceiving depth order during pursuit eye movement. Vision Res 2004; 44:3025-34. PMID: 15474575; DOI: 10.1016/j.visres.2004.07.007.
Abstract
Pursuit eye movements alter retinal motion cues to depth. For instance, the sinusoidal retinal velocity profile produced by a translating, corrugated surface resembles a sinusoidal shear during pursuit. One way to recover the correct spatial phase of the corrugation's profile (i.e. which part is near and which part is far) is to combine estimates of shear with extra-retinal estimates of translation. In support of this hypothesis, we found the corrugation's spatial phase appeared ambiguous when retinal shear was viewed without translation, but unambiguous when translated and viewed with or without a pursuit eye movement. The eyes lagged the sinusoidal translation by a small but persistent amount, raising the possibility that retinal slip could serve as the disambiguating cue in the eye-moving condition. A yoked control was therefore performed in which measured horizontal slip was fed back into a fixated shearing stimulus on a trial-by-trial basis. The results showed that the corrugation's phase was only seen unambiguously during the real eye movement. This supports the idea that extra-retinal estimates of eye velocity can help disambiguate ordinal depth structure within moving retinal images.
Affiliation(s)
- Jenny J Naji
- School of Psychology, Cardiff University, Tower Building, Park Place, CF10 3AT, Wales, UK
34.
Abstract
A common approach to explaining the perception of form is through the use of static features. The weakness of this approach points naturally to dynamic definitions of form. Considering dynamical form, however, leads inevitably to the need to explain how events are perceived as time-extended--a problem with primacy even over that of qualia. Optic flow models, energy models, and models reliant on a rigidity constraint are examined. The reliance of these models on the instantaneous specification of form at an instant, t, or across a series of such instants forces consideration of the primary memory supporting both the perception of time-extended events and the time-extension of consciousness. This cannot be reduced to an integration over space and time. The difficulty of defining the basis for this memory is highlighted in considerations of dynamic form in relation to scales of time. Ultimately, the possibility is raised that psychology must follow physics in a more profound approach to time and motion.
Affiliation(s)
- Stephen E Robbins
- Center for Advanced Product Engineering, Metavante Corporation, 10850 West Park Place, Milwaukee, WI 53224, USA.