1
Kemp JT, Cesanek E, Domini F. Perceiving depth from texture and disparity cues: Evidence for a non-probabilistic account of cue integration. J Vis 2023; 23:13. [PMID: 37486299] [PMCID: PMC10382782] [DOI: 10.1167/jov.23.7.13]
Abstract
Bayesian inference theories have been extensively used to model how the brain derives three-dimensional (3D) information from ambiguous visual input. In particular, the maximum likelihood estimation (MLE) model combines estimates from multiple depth cues according to their relative reliability to produce the most probable 3D interpretation. Here, we tested an alternative theory of cue integration, termed the intrinsic constraint (IC) theory, which postulates that the visual system derives the most stable, not the most probable, interpretation of the visual input amid variations in viewing conditions. The vector sum model provides a normative approach for achieving this goal, in which individual cue estimates are components of a multidimensional vector whose norm determines the combined estimate. Individual cue estimates are not accurate, but are related to distal 3D properties through a deterministic mapping. In three experiments, we show that the IC theory can more adeptly account for 3D cue integration than MLE models. In Experiment 1, we show systematic biases in the perception of depth from texture and depth from binocular disparity. Critically, we demonstrate that the vector sum model predicts an increase in perceived depth when these cues are combined. In Experiment 2, we illustrate the IC theory's radical reinterpretation of the just noticeable difference (JND) and test the related vector sum model prediction of the classic finding of smaller JNDs for combined-cue versus single-cue stimuli. In Experiment 3, we confirm the vector sum prediction that biases found in cue integration experiments cannot be attributed to flatness cues, as the MLE model predicts.
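The two combination rules contrasted in this abstract can be sketched numerically. The depth values and noise levels below are hypothetical, chosen only to show that a reliability-weighted (MLE) average stays between the single-cue estimates, while the vector sum norm exceeds both, as the IC account predicts for combined-cue stimuli:

```python
import math

# Hypothetical single-cue depth estimates (in cm) and noise SDs;
# the numbers are illustrative, not taken from the study.
d_texture, d_disparity = 3.0, 4.0
sd_texture, sd_disparity = 1.0, 0.5

# MLE: reliability-weighted average. Weights are inverse variances,
# so the less noisy disparity cue dominates; the combined estimate
# always lies between the two single-cue estimates.
w_texture = (1 / sd_texture**2) / (1 / sd_texture**2 + 1 / sd_disparity**2)
d_mle = w_texture * d_texture + (1 - w_texture) * d_disparity

# IC / vector sum: the single-cue estimates are components of a
# vector whose norm is the combined estimate, so adding a cue
# increases perceived depth.
d_vector_sum = math.hypot(d_texture, d_disparity)

print(d_mle)         # 3.8 -- between the single-cue estimates
print(d_vector_sum)  # 5.0 -- larger than either single-cue estimate
```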
Affiliation(s)
- Jovan T Kemp
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Evan Cesanek
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Fulvio Domini
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Italian Institute of Technology, Rovereto, Italy
2
Domini F. The case against probabilistic inference: a new deterministic theory of 3D visual processing. Philos Trans R Soc Lond B Biol Sci 2023; 378:20210458. [PMID: 36511407] [PMCID: PMC9745883] [DOI: 10.1098/rstb.2021.0458]
Abstract
How the brain derives 3D information from inherently ambiguous visual input remains the fundamental question of human vision. The past two decades of research have addressed this question as a problem of probabilistic inference, the dominant model being maximum-likelihood estimation (MLE). This model assumes that independent depth-cue modules derive noisy but statistically accurate estimates of 3D scene parameters that are combined through a weighted average. Cue weights are adjusted based on the system's representation of each module's output variability. Here I demonstrate that the MLE model fails to account for important psychophysical findings and, importantly, misinterprets the just noticeable difference, a hallmark measure of stimulus discriminability, to be an estimate of perceptual uncertainty. I propose a new theory, termed Intrinsic Constraint, which postulates that the visual system does not derive the most probable interpretation of the visual input, but rather, the most stable interpretation amid variations in viewing conditions. This goal is achieved with the Vector Sum model, which represents individual cue estimates as components of a multi-dimensional vector whose norm determines the combined output. This model accounts for the psychophysical findings cited in support of MLE, while predicting existing and new findings that contradict the MLE model. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
Affiliation(s)
- Fulvio Domini
- CLPS, Brown University, 190 Thayer Street, Providence, Rhode Island 02912-9067, USA
3
Lappin JS, Bell HH. Form and Function in Information for Visual Perception. Iperception 2022; 12:20416695211053352. [PMID: 35003612] [PMCID: PMC8728782] [DOI: 10.1177/20416695211053352]
Abstract
Visual perception involves spatially and temporally coordinated variations in diverse physical systems: environmental surfaces and symbols, optical images, electro-chemical activity in neural networks, muscles, and bodily movements—each with a distinctly different material structure and energy. The fundamental problem in the theory of perception is to characterize the information that enables both perceptual awareness and real-time dynamic coordination of these diverse physical systems. Gibson's psychophysical and ecological conception of this problem differed from that of mainstream science both then and now. The present article aims to incorporate Gibson's ideas within a general conception of information for visual perception. We emphasize the essential role of spatiotemporal form, in contrast with symbolic information. We consider contemporary understanding of surface structure, optical images, and optic flow. Finally, we consider recent evidence about capacity limitations on the rate of visual perception and implications for the ecology of vision.
4
Abstract
Shape is an interesting property of objects because it is used in ordinary discourse in ways that seem to have little connection to how it is typically defined in mathematics. The present article describes how the concept of shape can be grounded in Euclidean and non-Euclidean geometry and related to human perception. It considers the formal methods that have been proposed for measuring the differences among shapes and how the performance of those methods compares with the shape difference thresholds of human observers. It discusses how different types of shape change can be perceptually categorized. It also evaluates the specific data structures that have been used to represent shape in models of both human and machine vision, and it reviews the psychophysical evidence about the extent to which those models are consistent with human perception. Based on this review of the literature, we argue that shape is not one thing but rather a collection of many object attributes, some of which are more perceptually salient than others. Because the relative importance of these attributes can be context dependent, there is no obvious single definition of shape that is universally applicable in all situations.
Affiliation(s)
- James T Todd
- Department of Psychology, The Ohio State University, Columbus, OH, USA
5
Billino J, Drewing K. Age Effects on Visuo-Haptic Length Discrimination: Evidence for Optimal Integration of Senses in Senior Adults. Multisens Res 2018; 31:273-300. [PMID: 31264626] [DOI: 10.1163/22134808-00002601]
Abstract
Demographic changes in most developed societies have fostered research on functional aging. While cognitive changes have been characterized elaborately, understanding of perceptual aging lags behind. We investigated age effects on the mechanisms by which multiple sources of sensory information are merged into a common percept. We studied visuo-haptic integration in a length discrimination task. A total of 24 young (20-25 years) and 27 senior (69-77 years) adults compared standard stimuli to appropriate sets of comparison stimuli. Standard stimuli were explored under visual, haptic, or visuo-haptic conditions. The task procedure allowed us to introduce an intersensory conflict via anamorphic lenses. Comparison stimuli were exclusively explored haptically. We derived psychometric functions for each condition, determining points of subjective equality and discrimination thresholds. We notably evaluated visuo-haptic perception against different models of multisensory processing, i.e., the Maximum-Likelihood-Estimate (MLE) model of optimal cue integration, a suboptimal integration model, and a cue switching model. Our results support robust visuo-haptic integration across the adult lifespan. We found suboptimal weighted averaging of sensory sources in young adults; senior adults, however, exploited differential sensory reliabilities more efficiently to optimize thresholds. Indeed, evaluation of the MLE model indicates that young adults underweighted visual cues by more than 30%; in contrast, visual weights of senior adults deviated only by about 3% from predictions. We suggest that close-to-optimal multisensory integration might contribute to successful compensation for age-related sensory losses and provides a critical resource. Differentiation between multisensory integration during healthy aging and age-related pathological challenges to the sensory systems awaits further exploration.
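The MLE benchmark against which performance is compared here can be sketched from two single-cue discrimination thresholds. The JND values below are invented for illustration; MLE treats each JND as proportional to that cue's noise SD and predicts both the optimal weights and a bisensory threshold lower than either unisensory one:

```python
import math

# Hypothetical unisensory JNDs (in mm); not the study's data.
jnd_visual, jnd_haptic = 4.0, 6.0
var_v, var_h = jnd_visual**2, jnd_haptic**2

# Optimal weights: each cue is weighted by its relative reliability
# (inverse variance), so the more precise visual cue gets more weight.
w_visual = (1 / var_v) / (1 / var_v + 1 / var_h)

# Predicted bisensory threshold under optimal integration.
jnd_bisensory = math.sqrt(var_v * var_h / (var_v + var_h))

print(round(w_visual, 3))       # 0.692
print(round(jnd_bisensory, 2))  # 3.33 -- below both unisensory JNDs
```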
Affiliation(s)
- Jutta Billino
- Department of Psychology, Justus-Liebig-Universität, Otto-Behaghel-Str. 10F, 35394 Giessen, Germany
- Knut Drewing
- Department of Psychology, Justus-Liebig-Universität, Otto-Behaghel-Str. 10F, 35394 Giessen, Germany
6
Abstract
Local solid shape applies to the surface curvature of small surface patches—essentially regions of approximately constant curvatures—of volumetric objects that are smooth volumetric regions in Euclidean 3-space. This should be distinguished from local shape in pictorial space. The difference is categorical. Although local solid shape has naturally been explored in haptics, results in vision are not forthcoming. We describe a simple experiment in which observers judge shape quality and magnitude of cinematographic presentations. Without prior training, observers readily use continuous shape index and Casorati curvature scales with reasonable resolution.
Affiliation(s)
- Jan Koenderink
- University of Leuven (KU Leuven), Belgium; Faculteit Sociale Wetenschappen, Universiteit Utrecht, The Netherlands
- Andrea van Doorn
- Faculteit Sociale Wetenschappen, Universiteit Utrecht, The Netherlands
7
Papenmeier F, Schwan S. If you watch it move, you'll recognize it in 3D: Transfer of depth cues between encoding and retrieval. Acta Psychol (Amst) 2016; 164:90-5. [PMID: 26765253] [DOI: 10.1016/j.actpsy.2015.12.010]
Abstract
Viewing objects with stereoscopic displays provides additional depth cues through binocular disparity that support object recognition. Until now, it was unknown whether this results from the representation of specific stereoscopic information in memory or from a more general representation of an object's depth structure. We therefore investigated whether continuous object rotation, acting as a depth cue during encoding, results in a memory representation that can subsequently be accessed by stereoscopic information during retrieval. In Experiment 1, we found such transfer effects from continuous object rotation during encoding to stereoscopic presentations during retrieval. In Experiments 2a and 2b, we found that the continuity of object rotation is important: only continuous rotation and/or stereoscopic depth, but not multiple static snapshots presented without stereoscopic information, caused the extraction of an object's depth structure into memory. We conclude that an object's depth structure, and not specific depth cues, is represented in memory.
8
Abstract
What are the geometric primitives of binocular disparity? The Venetian blind effect and other converging lines of evidence indicate that stereoscopic depth perception derives from disparities of higher-order structure in images of surfaces. Image structure entails spatial variations of intensity, texture, and motion, jointly structured by observed surfaces. The spatial structure of binocular disparity corresponds to the spatial structure of surfaces. Independent spatial coordinates are not necessary for stereoscopic vision. Stereopsis is highly sensitive to structural disparities associated with local surface shape. Disparate positions on retinal anatomy are neither necessary nor sufficient for stereopsis.
9
Jovanovic B, Drewing K. The influence of intersensory discrepancy on visuo-haptic integration is similar in 6-year-old children and adults. Front Psychol 2014; 5:57. [PMID: 24523712] [PMCID: PMC3906500] [DOI: 10.3389/fpsyg.2014.00057]
Abstract
When participants are given the opportunity to simultaneously feel an object and see it through a magnifying or reducing lens, adults estimate object size to be in between visual and haptic size. Studies with young children, however, seem to demonstrate that their estimates are dominated by a single sense. In the present study, we examined whether the age difference observed in previous studies can be accounted for by the large discrepancy between felt and seen size in the stimuli used in those studies. In addition, we studied the processes involved in combining the visual and haptic inputs. Adults and 6-year-old children judged objects that were presented to vision, to haptics, or simultaneously to both senses. The seen object length was reduced or magnified by different lenses. In the condition inducing large intersensory discrepancies, children's judgments in visuo-haptic conditions were almost entirely dominated by vision, whereas adults weighted vision by only ~40%. Neither the adults' nor the children's discrimination thresholds were predicted by models of visuo-haptic integration. With smaller discrepancies, the children's visual weight approximated that of the adults, and both the children's and the adults' discrimination thresholds were well predicted by an integration model, which assumes that both visual and haptic inputs contribute to each single judgment. We conclude that children integrate seemingly corresponding multisensory information in similar ways as adults do, but focus on a single sense when information from different senses is strongly discrepant.
Affiliation(s)
- Bianca Jovanovic
- Department for Developmental Psychology, Institute for Psychology, Justus-Liebig University Giessen, Germany
- Knut Drewing
- Department for Developmental Psychology, Institute for Psychology, Justus-Liebig University Giessen, Germany
10
Cellini C, Kaim L, Drewing K. Visual and haptic integration in the estimation of softness of deformable objects. Iperception 2013; 4:516-31. [PMID: 25165510] [PMCID: PMC4129386] [DOI: 10.1068/i0598]
Abstract
Softness perception intrinsically relies on haptic information. However, through everyday experiences we learn correspondences between felt softness and the visual effects of exploratory movements that are executed to feel softness. Here, we studied how visual and haptic information is integrated to assess the softness of deformable objects. Participants discriminated between the softness of two softer or two harder objects using only-visual, only-haptic or both visual and haptic information. We assessed the reliabilities of the softness judgments using the method of constant stimuli. In visuo-haptic trials, discrepancies between the two senses' information allowed us to measure the contribution of the individual senses to the judgments. Visual information (finger movement and object deformation) was simulated using computer graphics; input in visual trials was taken from previous visuo-haptic trials. Participants were able to infer softness from vision alone, and vision considerably contributed to bisensory judgments (∼35%). The visual contribution was higher than predicted from models of optimal integration (senses are weighted according to their reliabilities). Bisensory judgments were less reliable than predicted from optimal integration. We conclude that the visuo-haptic integration of softness information is biased toward vision, rather than being optimal, and might even be guided by a fixed weighting scheme.
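The way a visual contribution of roughly 35% can be read out from discrepancy trials can be sketched as follows; the softness values and the PSE are made-up numbers, not data from this study:

```python
# In cue-conflict trials, vision and haptics specify different values,
# and the point of subjective equality (PSE) of the bisensory judgment
# falls between them; its relative position gives the visual weight.
softness_visual = 60.0   # hypothetical visually specified softness
softness_haptic = 40.0   # hypothetical haptically specified softness
pse_bisensory = 47.0     # hypothetical PSE from visuo-haptic trials

# Visual weight: how far the bisensory percept is pulled toward vision.
w_visual = (pse_bisensory - softness_haptic) / (softness_visual - softness_haptic)

print(w_visual)  # 0.35, i.e. a ~35% visual contribution
```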
Affiliation(s)
- Cristiano Cellini
- Department of General Psychology, Justus-Liebig-University of Giessen, Otto-Behaghel-Strasse 10F, 35394 Giessen, Germany
- Lukas Kaim
- Department of General Psychology, Justus-Liebig-University of Giessen, Otto-Behaghel-Strasse 10F, 35394 Giessen, Germany
- Knut Drewing
- Department of General Psychology, Justus-Liebig-University of Giessen, Otto-Behaghel-Strasse 10F, 35394 Giessen, Germany
11
Object recognition using metric shape. Vision Res 2012; 69:23-31. [PMID: 22884632] [DOI: 10.1016/j.visres.2012.07.013]
12
Abstract
How do retinal images lead to perceived environmental objects? Vision involves a series of spatial and material transformations--from environmental objects to retinal images, to neurophysiological patterns, and finally to perceptual experience and action. A rationale for understanding functional relations among these physically different systems occurred to Gustav Fechner: Differences in sensation correspond to differences in physical stimulation. The concept of information is similar: Relationships in one system may correspond to, and thus represent, those in another. Criteria for identifying and evaluating information include (a) resolution, or the precision of correspondence; (b) uncertainty about which input (output) produced a given output (input); and (c) invariance, or the preservation of correspondence under transformations of input and output. We apply this framework to psychophysical evidence to identify visual information for perceiving surfaces. The elementary spatial structure shared by objects and images is the second-order differential structure of local surface shape. Experiments have shown that human vision is directly sensitive to this higher-order spatial information from interimage disparities (stereopsis and motion parallax), boundary contours, texture, shading, and combined variables. Psychophysical evidence contradicts other common ideas about retinal information for spatial vision and object perception.
13
Warren WH. Does this computational theory solve the right problem? Marr, Gibson, and the goal of vision. Perception 2012; 41:1053-60. [PMID: 23409371] [PMCID: PMC3816718] [DOI: 10.1068/p7327]
Abstract
David Marr's book Vision attempted to formulate a thoroughgoing formal theory of perception. Marr borrowed much of the "computational" level from James Gibson: a proper understanding of the goal of vision, the natural constraints, and the available information are prerequisite to describing the processes and mechanisms by which the goal is achieved. Yet, as a research program leading to a computational model of human vision, Marr's program did not succeed. This article asks why, using the perception of 3D shape as a morality tale. Marr presumed that the goal of vision is to recover a general-purpose Euclidean description of the world, which can be deployed for any task or action. On this formulation, vision is underdetermined by information, which in turn necessitates auxiliary assumptions to solve the problem. But Marr's assumptions did not actually reflect natural constraints, and consequently the solutions were not robust. We now know that humans do not in fact recover Euclidean structure; rather, they reliably perceive qualitative shape (hills, dales, courses, ridges), which is specified by the second-order differential structure of images. By recasting the goals of vision in terms of our perceptual competencies, and doing the hard work of analyzing the information available under ecological constraints, we can reformulate the problem so that perception is determined by information and prior knowledge is unnecessary.
Affiliation(s)
- William H Warren
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA
14
The selectivity of neurons in the macaque fundus of the superior temporal area for three-dimensional structure from motion. J Neurosci 2010; 30:15491-508. [PMID: 21084605] [DOI: 10.1523/jneurosci.0820-10.2010]
Abstract
Motion is a potent cue for the perception of three-dimensional (3D) shape in primates, but little is known about its underlying neural mechanisms. Guided by recent functional magnetic resonance imaging results, we tested neurons in the fundus of the superior temporal sulcus (FST) area of two macaque monkeys (Macaca mulatta, one male) using motion-defined surface patches with various 3D shapes such as slanted planes, saddles, or cylinders. The majority of the FST neurons (>80%) were selective for stimuli depicting specific shapes, and all the surfaces tested were represented among the selective FST neurons. Importantly, this selectivity tolerated changes in speed, position, size, or between binocular and monocular presentations. This tolerance demonstrates that the 3D structure-from-motion (3D-SFM) selectivity of FST neurons is a higher-order selectivity, which cannot be reduced to a lower-order speed selectivity. The 3D-SFM selectivity of FST neurons was unaffected by removal of the opposed-motion cue that supplemented the speed gradient cue in the standard stimuli. When tested with the same standard stimuli, fewer neurons in the middle temporal/visual 5 (MT/V5) area were selective than FST neurons. In addition, selective MT/V5 neurons represented fewer types of surfaces and were less tolerant of stimulus changes than FST neurons. Overall, these results indicate that FST neurons code motion-defined 3D shape fragments, underscoring the central role of FST in processing 3D-SFM.
15
Lee YL, Bingham GP. Large perspective changes yield perception of metric shape that allows accurate feedforward reaches-to-grasp and it persists after the optic flow has stopped! Exp Brain Res 2010; 204:559-73. [PMID: 20563715] [DOI: 10.1007/s00221-010-2323-2]
Abstract
Lee et al. (Percept Psychophys 70:1032-1046, 2008a) investigated whether visual perception of metric shape could be calibrated when used to guide feedforward reaches-to-grasp. It could not. Seated participants viewed target objects (elliptical cylinders) in normal lighting using stereo vision and free head movements that allowed small (approximately 10°) perspective changes. The authors concluded that poor perception of metric shape was the reason reaches-to-grasp should be visually guided online. However, Bingham and Lind (Percept Psychophys 70:524-540, 2008) showed that large perspective changes (≥45°) yield good perception of metric shape. We therefore repeated Lee et al.'s study with the addition of information from large perspective changes. The results were accurate feedforward reaches-to-grasp reflecting accurate perception of both metric shape and metric size. Large perspective changes occur when one locomotes into a workspace in which reaches-to-grasp are subsequently performed. Does the resulting perception of metric shape persist after the large perspective changes have ceased? Experiments 2 and 3 tested reaches-to-grasp with delays (Exp. 2, 5-s delay; Exp. 3, approximately 16-s delay) and multiple objects to be grasped after a single viewing. Perception of metric shape and metric size persisted, yielding accurate reaches-to-grasp. We advocate the study of nested actions using a dynamic approach to perception/action.
16
Exploratory pressure influences haptic shape perception via force signals. Atten Percept Psychophys 2010; 72:823-38. [PMID: 20348586] [DOI: 10.3758/app.72.3.823]
17
Haptic shape perception from force and position signals varies with exploratory movement direction and the exploring finger. Atten Percept Psychophys 2009; 71:1174-84. [PMID: 19525546] [DOI: 10.3758/app.71.5.1174]
Abstract
We investigated how exploratory movement influences signal integration in active touch. Participants judged the amplitude of a bump specified by redundant signals: When a finger slides across a bump, the finger's position follows the bump's geometry (position signal); simultaneously, it is exposed to patterns of forces depending on the gradient of the bump (force signal). We varied amplitudes specified by force signals independently of amplitudes specified by position signals. Amplitude judgment was a weighted linear function of the amplitudes specified by both signals, under different exploratory conditions. The force signal's contribution to the judgment was higher when the participants explored with the index finger, as opposed to the thumb, and when they explored along a tangential axis, as opposed to a radial one (pivot congruent with the shoulder joint). Furthermore, for tangential, as compared with radial, axis exploration, amplitude judgments were larger (and more accurate), and amplitude discrimination was better. We attribute these exploration-induced differences to biases in estimating bump amplitude from force signals. Given the choice, the participants preferred tangential explorations with the index finger, a behavior that resulted in good discrimination performance. A role for an active explorer, as well as biases that depend on exploration, should be taken into account when signal integration models are extended to active touch.
18
Warren WH. How do animals get about by vision? Visually controlled locomotion and orientation after 50 years. Br J Psychol 2009; 100:277-81. [PMID: 19351453] [DOI: 10.1348/000712609x414150]
19
Drewing K, Wiecki TV, Ernst MO. Material properties determine how force and position signals combine in haptic shape perception. Acta Psychol (Amst) 2008; 128:264-73. [PMID: 18359467] [DOI: 10.1016/j.actpsy.2008.02.002]
Abstract
When integrating estimates from redundant sensory signals, humans seem to weight these estimates according to their reliabilities. In the present study, human observers used active touch to judge the curvature of a shape. The curvature was specified by positional and force signals: When a finger slides across a surface, the finger's position follows the surface geometry (position signal). At the same time, it is exposed to patterns of forces depending on the gradient of the surface (force signal; Robles-de-la-Torre, G., & Hayward, V. (2001). Force can overcome object geometry in the perception of shape through active touch. Nature, 412, 445-448). We show that variations in the surface's material properties (compliance, friction) influence the sensorily available position and force signals, as well as the noise associated with these signals. Along with this, material properties affect the weights given to the position and force signals for curvature judgements. Our findings are consistent with the notion of an observer who weights signal estimates according to their reliabilities. That is, signal weights shifted with the signal noise, which in the present case resulted from active exploration.
Affiliation(s)
- Knut Drewing
- Institute for Psychology, Justus-Liebig University, Giessen, Germany
20
Norman JF, Todd JT, Norman HF, Clayton AM, McBride TR. Visual discrimination of local surface structure: slant, tilt, and curvedness. Vision Res 2006; 46:1057-69. [PMID: 16289208] [DOI: 10.1016/j.visres.2005.09.034]
Abstract
In four experiments, observers were required to discriminate interval or ordinal differences in slant, tilt, or curvedness between designated probe points on randomly shaped curved surfaces defined by shading, texture, and binocular disparity. The results reveal that discrimination thresholds for judgments of slant or tilt typically range between 4 degrees and 10 degrees; that judgments of one component are unaffected by simultaneous variations in the other; and that the individual thresholds for either the slant or tilt components of orientation are approximately equal to those obtained for judgments of the total orientation difference between two probed regions. Performance was much worse, however, for judgments of curvedness, and these judgments were significantly impaired when there were simultaneous variations in the shape index parameter of curvature.
Affiliation(s)
- J Farley Norman
- Department of Psychology, Western Kentucky University, Bowling Green, KY 42101-1030, USA.
21
Fan J, Liu F. Visual perception of surface wrinkles. Percept Mot Skills 2006; 101:925-34. [PMID: 16491698 DOI: 10.2466/pms.101.3.925-934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
To study the relationship between the visually perceived magnitude of wrinkles and the geometrical parameters of surfaces, four potentially relevant parameters of the surface profile were considered: the variance (sigma2), the cutting frequency (Fc), the effective disparity curvature (Dce) of the wrinkled surface over the eyeball distance of the observer, and the frequency component of the disparity curvature (Dcf). Analysis of garment seams with varying amounts of pucker showed that, while the logarithm of each of these four parameters has a strong linear relationship with the visually perceived magnitude of wrinkles, following Fechner's law, the relationships of the effective disparity curvature (Dce) and the frequency component of the disparity curvature (Dcf) with visual perception appeared strongest. This modeling may provide an objective method for measuring the magnitude of surface wrinkles.
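The Fechner-law relationship reported here, perceived magnitude growing linearly with the logarithm of a physical parameter, can be fit with a one-line regression. A sketch with made-up data (the parameter values are illustrative, not the garment-seam measurements):

```python
import numpy as np

def fit_fechner(stimulus, perceived):
    """Fit P = a + b * log(S) and return (a, b), per Fechner's law."""
    b, a = np.polyfit(np.log(stimulus), np.asarray(perceived, dtype=float), 1)
    return a, b

# Illustrative surface-profile parameter values and mean wrinkle ratings.
S = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
P = 2.0 + 1.5 * np.log(S)      # noiseless data generated from the law
a, b = fit_fechner(S, P)       # recovers a ≈ 2.0, b ≈ 1.5
```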
Affiliation(s)
- Jintu Fan
- Institute of Textiles and Clothing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China.
22
Drewing K, Ernst MO. Integration of force and position cues for shape perception through active touch. Brain Res 2006; 1078:92-100. [PMID: 16494854 DOI: 10.1016/j.brainres.2005.12.026] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2004] [Revised: 12/09/2005] [Accepted: 12/09/2005] [Indexed: 11/25/2022]
Abstract
This article systematically explores cue integration within active touch. Our research builds upon a recently made distinction between position and force cues for haptic shape perception: when sliding a finger across a bumpy surface, the finger follows the surface geometry (position cue). At the same time, the finger is exposed to forces related to the slope of the surface (force cue). Experiment 1 independently varied force and position cues to the curvature of 3D arches. Perceived curvature could be well described as a weighted average of the two cues. Experiment 2 found a higher weight for the position cue with more convex high arches and a higher weight for the force cue with less convex shallow arches, probably mediated by a change in relative cue reliability. Both findings are in good agreement with the maximum-likelihood estimation (MLE) model for cue integration and, thus, carry this model over to the domain of active haptic perception.
Affiliation(s)
- Knut Drewing
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
23
Hanada M. Effects of circular motion on judgment of rotation direction and depth order in visual motion. Jpn Psychol Res 2005. [DOI: 10.1111/j.1468-5884.2005.00294.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
24
Foo P, Warren WH, Duchon A, Tarr MJ. Do humans integrate routes into a cognitive map? Map- versus landmark-based navigation of novel shortcuts. J Exp Psychol Learn Mem Cogn 2005; 31:195-215. [PMID: 15755239 DOI: 10.1037/0278-7393.31.2.195] [Citation(s) in RCA: 132] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Do humans integrate experience on specific routes into metric survey knowledge of the environment, or do they depend on a simpler strategy of landmark navigation? The authors tested this question using a novel shortcut paradigm during walking in a virtual environment. The authors find that participants could not take successful shortcuts in a desert world but could do so with dispersed landmarks in a forest. On catch trials, participants were drawn toward the displaced landmarks whether the landmarks were clustered near the target location or along the shortcut route. However, when landmarks appeared unreliable, participants fell back on coarse survey knowledge. Like honeybees (F. C. Dyer, 1991), humans do not appear to derive accurate cognitive maps from path integration to guide navigation but, instead, depend on landmarks when they are available.
Affiliation(s)
- Patrick Foo
- Department of Cognitive and Linguistic Sciences, Brown University.
25
Fan J. Visual perception of surface wrinkles. Percept Mot Skills 2005. [DOI: 10.2466/pms.101.7.925-934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
26
Di Luca M, Domini F, Caudek C. Spatial integration in structure from motion. Vision Res 2004; 44:3001-13. [PMID: 15474573 DOI: 10.1016/j.visres.2004.07.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2003] [Revised: 09/22/2003] [Indexed: 11/26/2022]
Abstract
In three experiments we investigated whether the perception of 3D structure from optic flow involves a process of spatial integration. The observer's task was to judge the 3D orientation of local velocity field patches. In two conditions, the patches were presented either in isolation or as part of a global optic flow. In Experiment 1, the global optic flow was a linear velocity field. In Experiment 2, the patches were embedded in a randomly perturbed linear velocity field. In Experiment 3, the local patches belonged to a smoothly curved surface. The results of these three experiments lead to two main conclusions: (1) a process linking spatially separated patches into global entities does affect the perception of local surface orientation induced by optic flow, and (2) linearity or smoothness of the global velocity field are not necessary conditions for spatial integration.
Affiliation(s)
- Massimiliano Di Luca
- Department of Cognitive and Linguistic Sciences, Brown University, P.O. Box 1978, Providence, RI 02912, USA
27
Vuong QC, Domini F, Caudek C. Evidence for patchwork approximation of shape primitives. Percept Psychophys 2004; 66:1246-59. [PMID: 15751479 DOI: 10.3758/bf03196849] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Investigators have proposed that qualitative shapes are the primitive information of spatial vision: They preserve an approximately one-to-one mapping between surfaces, images, and perception. Given their importance, we examined how the visual system recovers these primitives from sparse disparity fields that do not provide sufficient information for their recovery. We hypothesized that the visual system interpolates sparse disparities with planes, resulting in a patchwork approximation of the implicitly defined shapes. We presented observers with stereo displays simulating planar or smooth curved surfaces having different curvatures. The observers' task was to detect whether dots deviated from these surfaces or to discriminate planar from curved or planar from scrambled surfaces. Consistent with our hypothesis, increasing curvature had detrimental effects on observers' performance (Experiments 1-3). Importantly, this patchwork approximation leads to the recovery of the proposed shape primitives, since observers were more accurate at discriminating planar-from-curved than planar-from-scrambled surfaces with matched disparity range (Experiment 4).
Affiliation(s)
- Quoc C Vuong
- Brown University, Providence, Rhode Island, USA.
28
Wickelgren EA, Bingham GP. Perspective distortion of trajectory forms and perceptual constancy in visual event identification. Percept Psychophys 2004; 66:629-41. [PMID: 15311662 DOI: 10.3758/bf03194907] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Previous studies have shown that people can use the information in trajectory forms to recognize visual events. A trajectory form is composed of the path of motion and the change in speed along that path. In past studies, however, only sensitivity to trajectory forms viewed from a single perspective was examined. The optical components change when an event is viewed from different perspectives, and the projected form of the trajectory is transformed. Does event recognition exhibit constancy despite these changes? In Experiment 1, participants were familiarized with five different trajectory forms viewed from a single perspective. Then the participants had to identify the same events viewed from different perspectives: from the side, at an angle, and entirely in depth. The participants exhibited perceptual constancy. Experiment 2 revealed, however, that both the change in optical components and the perspective transformations affected recognition.
Affiliation(s)
- Emily A Wickelgren
- Department of Psychology, California State University, Sacramento, California 95819-6007, USA.
29
30
Todd JT, Norman JF. The visual perception of 3-D shape from multiple cues: are observers capable of perceiving metric structure? Percept Psychophys 2003; 65:31-47. [PMID: 12699307 DOI: 10.3758/bf03194781] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Three experiments are reported in which observers judged the three-dimensional (3-D) structures of virtual or real objects defined by various combinations of texture, motion, and binocular disparity under a wide variety of conditions. The tasks employed in these studies involved adjusting the depth of an object to match its width, adjusting the planes of a dihedral angle so that they appeared orthogonal, and adjusting the shape of an object so that it appeared to match another at a different viewing distance. The results obtained on all of these tasks revealed large constant errors and large individual differences among observers. There were also systematic failures of constancy over changes in viewing distance, orientation, or response task. When considered in conjunction with other, similar reports in the literature, these findings provide strong evidence that human observers do not have accurate perceptions of 3-D metric structure.
Affiliation(s)
- James T Todd
- Department of Psychology, Ohio State University, Columbus, Ohio 43210, USA.
31
Abstract
Temporal integration was investigated in the minimal conditions necessary to perform a structure-from-motion (SFM) task. Observers were asked to discriminate three-dimensional (3D) surface orientations in conditions in which the stimulus displays simulated velocity fields providing, in each frame transition, either sufficient (3 moving dots) or insufficient information (1 or 2 moving dots) to perform the task. When only two moving dots were shown in each frame transition of the stimulus displays (Experiment 1), we found that performance decreased as dot lifetime increased. A facilitation effect of the overall display duration was also found. The negative effect of dot lifetime on performance contrasts with what was found in Experiment 2 with three dots in each frame transition, where performance improved with increasing dot lifetime up to 170 ms and then reached a plateau. Finally, for an optimal dot lifetime of 150 ms, we found that performance was still above chance when each frame transition specified the motion of only one dot (Experiment 3). These results indicate that temporal recruitment alone can support the recovery of 3D information from sparse motion signals, thus providing a strong indication of the importance of temporal integration in the perceptual analysis of the optic flow. Our results reveal, moreover, that temporal integration in SFM has different characteristics depending on whether, in each frame transition, the stimulus displays provide sufficient (3 or more moving dots) or insufficient information (1 or 2 moving dots) to specify the higher-order properties of the optic flow necessary for 3D surface recovery.
Affiliation(s)
- Corrado Caudek
- Department of Psychology, University of Trieste, Via S Anastasio 12, 34134 Trieste, Italy.
32
Muchisky MM, Bingham GP. Trajectory forms as a source of information about events. Percept Psychophys 2002; 64:15-31. [PMID: 11916299 DOI: 10.3758/bf03194554] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The ability to use trajectory forms as visual information about events was tested. A trajectory form is defined as the variation in velocity along a path of motion. In Experiment 1, we tested the ability to detect trajectory form differences between simulations of a freely swinging pendulum and a hand-moved pendulum. The trajectory form of the freely swinging pendulum was symmetric around the midpoint, whereas that of the hand-moved pendulum was not. In Experiment 2, we isolated trajectory form information by varying the amplitudes of events while holding their periods constant. Straight-path versions of the harmonic events from Experiment 1 were tested. In Experiment 3, we tested sensitivity to symmetrical peakening or flattening of trajectory forms. Participants detected small differences in all three experiments. In Experiment 4, we tested the ability to identify specific events based only on differences in trajectory forms. Participants were able to identify four different events. We investigated properties of trajectory forms that might potentially be detected and used as information, and we found that curvature yielded good results.
33
Abstract
Oscillation thresholds were evaluated for detecting motion and discriminating relative motion. Three horizontally aligned Gaussian blobs oscillated horizontally, with the center in-phase or out-of-phase with the two flankers. Motion thresholds were well below those for static bisection, and involved small contrast changes (<0.25%). Remarkably, acuity was better for discriminating phase relations than for detecting rigid motion, averaging 8.7 and 11.0 arcsec, respectively, for 100 arcmin between blobs. Phase discrimination acuities were robust over separations of 20-320 arcmin and temporal frequencies of 1.5-6 Hz. Motion phase relations must be coherent among spatially separate retinal signals, carrying information about intrinsic image structure.
Affiliation(s)
- J S Lappin
- Vanderbilt Vision Research Center, 301 Wilson Hall, Vanderbilt University, 111 21st Ave. South, Nashville, TN 37240-0009, USA.
34
Atkins JE, Fiser J, Jacobs RA. Experience-dependent visual cue integration based on consistencies between visual and haptic percepts. Vision Res 2001; 41:449-61. [PMID: 11166048 DOI: 10.1016/s0042-6989(00)00254-6] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
We study the hypothesis that observers can use haptic percepts as a standard against which the relative reliabilities of visual cues can be judged, and that these reliabilities determine how observers combine depth information provided by these cues. Using a novel visuo-haptic virtual reality environment, subjects viewed and grasped virtual objects. In Experiment 1, subjects were trained under motion relevant conditions, during which haptic and visual motion cues were consistent whereas haptic and visual texture cues were uncorrelated, and texture relevant conditions, during which haptic and texture cues were consistent whereas haptic and motion cues were uncorrelated. Subjects relied more on the motion cue after motion relevant training than after texture relevant training, and more on the texture cue after texture relevant training than after motion relevant training. Experiment 2 studied whether or not subjects could adapt their visual cue combination strategies in a context-dependent manner based on context-dependent consistencies between haptic and visual cues. Subjects successfully learned two cue combination strategies in parallel, and correctly applied each strategy in its appropriate context. Experiment 3, which was similar to Experiment 1 except that it used a more naturalistic experimental task, yielded the same pattern of results as Experiment 1 indicating that the findings do not depend on the precise nature of the experimental task. Overall, the results suggest that observers can involuntarily compare visual and haptic percepts in order to evaluate the relative reliabilities of visual cues, and that these reliabilities determine how cues are combined during three-dimensional visual perception.
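The idea of judging a visual cue's reliability against a haptic standard can be sketched as follows; the specific error-to-weight rule and the numbers are illustrative assumptions, not the authors' model:

```python
import numpy as np

def weights_from_haptic(cue_a, cue_b, haptic):
    """Weight two visual cues by their consistency with a haptic standard:
    the inverse mean squared deviation from touch serves as reliability."""
    err_a = float(np.mean((np.asarray(cue_a) - haptic) ** 2))
    err_b = float(np.mean((np.asarray(cue_b) - haptic) ** 2))
    rel_a, rel_b = 1.0 / err_a, 1.0 / err_b
    total = rel_a + rel_b
    return rel_a / total, rel_b / total

# "Motion relevant" training: the motion cue tracks touch closely,
# while the texture cue is uncorrelated with it (made-up values).
haptic = np.array([1.0, 2.0, 3.0, 4.0])
motion = haptic + 0.1
texture = haptic + np.array([0.8, -0.9, 1.1, -0.7])
w_motion, w_texture = weights_from_haptic(motion, texture, haptic)
# the haptically consistent cue ends up dominating (w_motion near 1)
```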
Affiliation(s)
- J E Atkins
- Department of Brain and Cognitive Sciences and the Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
35
Abstract
Previous investigators have shown that observers' visual cue combination strategies are remarkably flexible in the sense that these strategies adapt on the basis of the estimated reliabilities of the visual cues. However, these researchers have not addressed how observers acquire these estimated reliabilities. This article studies observers' abilities to learn cue combination strategies. Subjects made depth judgments about simulated cylinders whose shapes were indicated by motion and texture cues. Because the two cues could indicate different shapes, it was possible to design tasks in which one cue provided useful information for making depth judgments, whereas the other cue was irrelevant. The results of Experiment 1 suggest that observers' cue combination strategies are adaptable as a function of training; subjects adjusted their cue combination rules to use a cue more heavily when the cue was informative on a task than when the cue was irrelevant. Experiment 2 demonstrated that experience-dependent adaptation of cue combination rules is context-sensitive. On trials with presentations of short cylinders, one cue was informative, whereas on trials with presentations of tall cylinders, the other cue was informative. The results suggest that observers can learn multiple cue combination rules and can learn to apply each rule in the appropriate context. Experiment 3 demonstrated a possible limitation on the context-sensitivity of adaptation of cue combination rules. One cue was informative on trials with presentations of cylinders at a left oblique orientation, whereas the other cue was informative on trials with presentations of cylinders at a right oblique orientation. The results indicate that observers did not learn to use different cue combination rules in different contexts under these circumstances. These results are consistent with the hypothesis that observers' visual systems are biased to learn to perceive views of bilaterally symmetric objects that differ solely by a symmetry transformation in the same way. Taken in conjunction with the results of Experiment 2, this means that the visual learning mechanism underlying cue combination adaptation is biased such that some sets of statistics are more easily learned than others.
Affiliation(s)
- R A Jacobs
- Center for Visual Science, University of Rochester, NY 14627, USA.
36
Orban GA, Sunaert S, Todd JT, Van Hecke P, Marchal G. Human cortical regions involved in extracting depth from motion. Neuron 1999; 24:929-40. [PMID: 10624956 DOI: 10.1016/s0896-6273(00)81040-5] [Citation(s) in RCA: 142] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
We used functional magnetic resonance imaging (fMRI) to investigate brain regions involved in extracting three-dimensional structure from motion. A factorial design included two-dimensional and three-dimensional structures undergoing rigid and nonrigid motions. As predicted from monkey data, the human homolog of MT/V5 was significantly more active when subjects viewed three-dimensional (as opposed to two-dimensional) displays, irrespective of their rigidity. Human MT/V5+ (hMT/V5+) is part of a network with right hemisphere dominance involved in extracting depth from motion, including a lateral occipital region, five sites along the intraparietal sulcus (IPS), and two ventral occipital regions. Control experiments confirmed that this pattern of activation is most strongly correlated with perceived three-dimensional structure, in as much as it arises from motion and cannot be attributed to numerous two-dimensional image properties or to saliency.
Affiliation(s)
- G A Orban
- Katholieke Universiteit te Leuven, Faculty of Medicine, Laboratorium voor Neuro- en Psychofysiologie, Leuven, Belgium.
37
Todd JT, Perotti VJ. The visual perception of surface orientation from optical motion. Percept Psychophys 1999; 61:1577-89. [PMID: 10598471 DOI: 10.3758/bf03213119] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Observers viewed monocular animations of rotating dihedral angles and were required to indicate their perceived structures by adjusting the magnitude and orientation of a stereoscopic dihedral angle. The motion displays were created by directly manipulating various aspects of the image velocity field, including the mean translation, the horizontal and vertical velocity gradients, and the manner in which these gradients changed over time. The adjusted orientation of each planar facet was decomposed into components of slant and tilt. Although the tilt component was estimated with a high degree of accuracy, the judgments of slant exhibited large systematic errors. The magnitude of perceived slant was determined primarily by the magnitude of the velocity gradient scaled by its direction. The results also indicate that higher order temporal derivatives of the moving elements had little effect on observers' judgments.
Affiliation(s)
- J T Todd
- Department of Psychology, Ohio State University, Columbus 43210, USA.
38
Abstract
We report the results of a depth-matching experiment in which subjects were asked to adjust the height of an ellipse until it matched the depth of a simulated cylinder defined by texture and motion cues. In one-third of the trials the shape of the cylinder was primarily given by motion information, in another one-third of the trials it was given by texture information, and on the remaining trials it was given by both sources of information. Two optimal cue combination models are described where optimality is defined in terms of Bayesian statistics. The parameter values of the models are set based on subjects' responses on trials when either the motion cue or the texture cue was informative. These models provide predictions of subjects' responses on trials when both cues were informative. The results indicate that one of the optimal models provides a good fit to the subjects' data, and the second model provides an exceptional fit. Because the predictions of the optimal models closely match the experimental data, we conclude that observers' cue-combination strategies are indeed optimal, at least under the conditions studied here.
Affiliation(s)
- R A Jacobs
- Center for Visual Science, University of Rochester, NY 14627, USA.
39
Fine I, Jacobs RA. Modeling the combination of motion, stereo, and vergence angle cues to visual depth. Neural Comput 1999; 11:1297-330. [PMID: 10423497 DOI: 10.1162/089976699300016250] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Three models of visual cue combination were simulated: a weak fusion model, a modified weak model, and a strong model. Their relative strengths and weaknesses are evaluated on the basis of their performances on the tasks of judging the depth and shape of an ellipse. The models differ in the amount of interaction that they permit among the cues of stereo, motion, and vergence angle. Results suggest that the constrained nonlinear interaction of the modified weak model allows better performance than either the linear interaction of the weak model or the unconstrained nonlinear interaction of the strong model. Further examination of the modified weak model revealed that its weighting of motion and stereo cues was dependent on the task, the viewing distance, and, to a lesser degree, the noise model. Although the dependencies were sensible from a computational viewpoint, they were sometimes inconsistent with psychophysical experimental data. In a second set of experiments, the modified weak model was given contradictory motion and stereo information. One cue was informative in the sense that it indicated an ellipse, while the other cue indicated a flat surface. The modified weak model rapidly reweighted its use of stereo and motion cues as a function of each cue's informativeness. Overall, the simulation results suggest that relative to the weak and strong models, the modified weak fusion model is a good candidate model of the combination of motion, stereo, and vergence angle cues, although the results also highlight areas in which this model needs modification or further elaboration.
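The contrast between the weak and modified weak models can be sketched schematically. In this hedged illustration the weak model uses a fixed linear weighting, while the modified weak model lets the stereo weight fall with viewing distance; the specific fall-off rule and all constants are invented for illustration, not taken from the paper:

```python
def weak_fusion(depth_stereo, depth_motion, w_stereo=0.5):
    """Weak fusion: fixed linear combination of independent cue estimates."""
    return w_stereo * depth_stereo + (1.0 - w_stereo) * depth_motion

def modified_weak_fusion(depth_stereo, depth_motion, viewing_distance_cm):
    """Modified weak fusion: limited, constrained cue interaction. Here the
    stereo weight decreases with viewing distance, reflecting the drop in
    disparity reliability at far distances (illustrative rule only)."""
    w_stereo = 1.0 / (1.0 + (viewing_distance_cm / 100.0) ** 2)
    return w_stereo * depth_stereo + (1.0 - w_stereo) * depth_motion

# With conflicting cue estimates, the modified weak model leans on stereo
# up close and on motion far away; the weak model's weighting never changes.
near = modified_weak_fusion(10.0, 14.0, viewing_distance_cm=50.0)
far = modified_weak_fusion(10.0, 14.0, viewing_distance_cm=400.0)
# near ≈ 10.8 (stereo-dominated), far ≈ 13.8 (motion-dominated)
```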
Affiliation(s)
- I Fine
- Center for Visual Science, University of Rochester, Rochester, NY 14627, USA