1
Lew WH, Coates DR. Impact of monocular vs. binocular contrast and blur on the range of functional stereopsis. Vision Res 2023; 212:108309. [PMID: 37595435 DOI: 10.1016/j.visres.2023.108309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 08/20/2023]
Abstract
Stereopsis depends on the smallest stereo threshold (lower limit) and the upper fusion limit. While studies have shown that the lower limit worsens with reduced contrast and blur, more strongly in monocular than in binocular conditions, the effect on the upper limit remains uncertain. Here, we assess the impact of contrast and blur on the range of the disparity sensitivity function (DSF) in a stereo letter recognition task. Subjects had to identify the stereo letters embedded in a random dot stereogram, and adaptive staircases were used to estimate the two limits. Five subjects performed the experiment at baseline contrast (100%), with different contrast (32% and 10%) and blur (+0.75DS and +1.25DS) in monocular and binocular degradation. We proposed three possible outcomes: 1) the range collapses in both directions 2) the lower limit threshold reduces, but the upper limit is not affected 3) the threshold for both limits increases and the range remains the same. We found that the curve for both limits was lowpass in shape, resulting in a smaller range at higher SFs. The results were similar to the first prediction, where the threshold for the lower limit increased while the upper limit was reduced at lower contrast and higher blur. The shrinkage of DSF is significant in monocular conditions. However, with blur, there was inter-subject variability. A simple cross-correlation stereo-matching algorithm was used to quantify the effect of contrast and blur. The results were consistent with the behavioral result that the range of DSF decreases with image degradation.
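The abstract above mentions a simple cross-correlation stereo-matching algorithm but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' code. It works on a single row of a random-dot pattern, and every function name and parameter (`random_dot_row`, `degrade`, `noise_sd`, and so on) is a hypothetical choice made for illustration. Because Pearson correlation is unchanged by contrast scaling alone, the sketch assumes a fixed additive noise so that lowering contrast in one eye reduces the correlation peak.

```python
import random


def random_dot_row(n, seed=0):
    """One row of a random-dot pattern: each pixel is -1 (dark) or +1 (bright)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def shift(row, d):
    """Circularly shift a row by d pixels (positive d moves content rightward)."""
    n = len(row)
    return [row[(i - d) % n] for i in range(n)]


def degrade(row, contrast, noise_sd, seed=0):
    """Scale contrast, then add Gaussian noise (an assumed stand-in for internal noise)."""
    rng = random.Random(seed)
    return [contrast * x + rng.gauss(0.0, noise_sd) for x in row]


def correlation(a, b):
    """Pearson correlation between two equal-length rows."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def estimate_disparity(left, right, max_disp):
    """Return (disparity, peak correlation) over all candidate shifts."""
    best_d, best_c = 0, -2.0
    for d in range(-max_disp, max_disp + 1):
        c = correlation(shift(left, d), right)
        if c > best_c:
            best_d, best_c = d, c
    return best_d, best_c


left = random_dot_row(256, seed=1)
right = shift(left, 5)  # right-eye row carries a 5-pixel disparity
clean_d, clean_peak = estimate_disparity(left, right, max_disp=10)

# Monocular degradation: only the right-eye row loses contrast and gains noise.
noisy_right = degrade(right, contrast=0.32, noise_sd=0.2, seed=2)
noisy_d, noisy_peak = estimate_disparity(left, noisy_right, max_disp=10)
print(clean_d, round(clean_peak, 3), noisy_d, round(noisy_peak, 3))
```

In this toy setting the correlation peak falls as the degraded eye's signal-to-noise ratio drops, which is one plausible way a matching algorithm could lose usable disparity range. Blur could be added as a smoothing step before matching, but it is left out to keep the sketch short.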
Affiliation(s)
- Wei Hau Lew
- University of Houston College of Optometry, Houston, TX, United States
- Daniel R Coates
- University of Houston College of Optometry, Houston, TX, United States

2
Oluk C, Bonnen K, Burge J, Cormack LK, Geisler WS. Stereo slant discrimination of planar 3D surfaces: Frontoparallel versus planar matching. J Vis 2022; 22:6. [PMID: 35467704 PMCID: PMC9055558 DOI: 10.1167/jov.22.5.6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Accepted: 02/19/2022] [Indexed: 11/24/2022] Open
Abstract
Binocular stereo cues are important for discriminating 3D surface orientation, especially at near distances. We devised a single-interval task where observers discriminated the slant of a densely textured planar test surface relative to a textured planar surround reference surface. Although surfaces were rendered with correct perspective, the stimuli were designed so that the binocular cues dominated performance. Slant discrimination performance was measured as a function of the reference slant and the level of uncorrelated white noise added to the test-plane images in the left and right eyes. We compared human performance with an approximate ideal observer (planar matching [PM]) and two subideal observers. The PM observer uses the image in one eye and back projection to predict a test image in the other eye for all possible slants, tilts, and distances. The estimated slant, tilt, and distance are determined by the prediction that most closely matches the measured image in the other eye. The first subideal observer (local planar matching [LPM]) applies PM over local neighborhoods and then pools estimates across the test plane. The second suboptimal observer (local frontoparallel matching [LFM]) uses only location disparity. We find that the ideal observer (PM) and the first subideal observer (LPM) outperforms the second subideal observer (LFM), demonstrating the additional benefit of pattern disparities. We also find that all three model observers can account for human performance, if two free parameters are included: a fixed small level of internal estimation noise, and a fixed overall efficiency scalar on slant discriminability.
Affiliation(s)
- Can Oluk
- Center for Perceptual Systems and Department of Psychology, University of Texas at Austin, Austin, TX, USA
- Kathryn Bonnen
- School of Optometry, Indiana University Bloomington, Bloomington, IN, USA
- Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Lawrence K Cormack
- Center for Perceptual Systems and Department of Psychology, University of Texas at Austin, Austin, TX, USA
- Wilson S Geisler
- Center for Perceptual Systems and Department of Psychology, University of Texas at Austin, Austin, TX, USA

3
Abbasi S, Tavakoli M, Boveiri HR, Mosleh Shirazi MA, Khayami R, Khorasani H, Javidan R, Mehdizadeh A. Medical image registration using unsupervised deep neural network: A scoping literature review. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023]

4
Goutcher R, Wilcox LM. Surface slant impairs disparity discontinuity discrimination. Vision Res 2020; 180:37-50. [PMID: 33360607 DOI: 10.1016/j.visres.2020.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/26/2020] [Revised: 11/20/2020] [Accepted: 12/01/2020] [Indexed: 11/24/2022]
Abstract
Binocular disparity signals are highly informative about the three-dimensional structure of visual scenes, including aiding the detection of depth discontinuities between surfaces. Here, we examine factors affecting sensitivity to such surface discontinuities. Participants were presented with random dot stereograms depicting two planar surfaces slanted in opposite directions and were asked to judge the sign of the depth discontinuity created where those surfaces met. Although the judgement was focussed on the adjacent edges, the precision of depth discontinuity discrimination depended upon the slant of the two surfaces: increasing surface slants to ±60° increased discontinuity discrimination thresholds by, on average, a factor of 5. Control experiments examining discontinuity discrimination across surfaces with identical slants showed either biases in discontinuity judgements or reduced threshold elevation. These results suggest that sensitivity to depth discontinuities is affected by processing limitations in both local absolute disparity measurement mechanisms and mechanisms selective for disparity differences. As further evidence in support of this conclusion, we show that our results are well-described by a model of discontinuity discrimination based on the encoding of local differences in relative disparity.
Affiliation(s)
- Ross Goutcher
- Psychology, Faculty of Natural Sciences, University of Stirling, Stirling FK9 4LA, UK
- Laurie M Wilcox
- Department of Psychology, Centre for Vision Research, York University, Toronto, ON, Canada

5
Ding J, Levi DM. A unified model for binocular fusion and depth perception. Vision Res 2020; 180:11-36. [PMID: 33359897 DOI: 10.1016/j.visres.2020.11.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/28/2020] [Revised: 11/13/2020] [Accepted: 11/16/2020] [Indexed: 11/27/2022]
Abstract
We describe a new unified model to explain both binocular fusion and depth perception, over a broad range of depths. At each location, the model consists of an array of paired spatial frequency filters, with different relative horizontal shifts (position disparity) and interocular phase disparities of 0, 90, ±180, or -90°. The paired filters with different spatial profiles (non-zero phase disparity) compute interocular misalignment and provide phase-disparity energy (binocular fusion energy) to drive selection of the appropriate filters along the position disparity space until the misalignment is eliminated and sensory fusion is achieved locally. The paired filters with identical spatial profiles (0 phase disparity) compute the position-disparity energy. After sensory fusion, the combination of position and possible residual phase disparity energies is calculated for binocular depth perception. Binocular fusion occurs at multiple scales following a coarse-to-fine process. At a given location, the apparent depth is the weighted sum of fusion shifts combined with residual phase disparity in all spatial-frequency channels, and the weights depend on stimulus spatial frequency and stimulus contrast. To test the theory, we measured disparity minimum and maximum thresholds (Dmin and Dmax) at three spatial frequencies and with different intraocular contrast levels. The stimuli were Random-Gabor-Patch (RGP) stereograms consisting of Gabor patches with random positions and phases, but with a fixed spatial frequency. The two eyes viewed identical arrays of patches except that one eye's array could be shifted horizontally and could differ in contrast. Our experiments and modeling reveal two contrast normalization mechanisms: (1) Energy Normalization (EN): Binocular energy is normalized with monocular energy after the site of binocular combination. This predicts constant Dmin thresholds when varying stimulus contrast in the two eyes; (2) DSKL model Interocular interactions: Monocular contrasts are normalized before the binocular combination site through interocular contrast gain-control and gain-enhancement mechanisms. This predicts contrast dependent Dmax thresholds. We tested a range of models and found that a model consisting of a second-order pathway with DSKL interocular interactions and a first-order pathway with EN at each spatial-frequency band can account for both the Dmin and Dmax data very well. Simulations show that the model makes reasonable predictions of suprathreshold depth perception.
Affiliation(s)
- Jian Ding
- School of Optometry and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720-2020, United States
- Dennis M Levi
- School of Optometry and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720-2020, United States

6
Goutcher R, Hibbard PB. Impairment of cyclopean surface processing by disparity-defined masking stimuli. J Vis 2020; 20:1. [PMID: 32040160 PMCID: PMC7331773 DOI: 10.1167/jov.20.2.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
Binocular disparity signals allow for the estimation of three-dimensional shape, even in the absence of monocular depth cues. The perception of such disparity-defined form depends, however, on the linkage of multiple disparity measurements over space. Performance limitations in cyclopean tasks thus inform us about errors arising in disparity measurement and difficulties in the linkage of such measurements. We used a cyclopean orientation discrimination task to examine the perception of disparity-defined form. Participants were presented with random-dot sinusoidal modulations in depth and asked to report whether they were clockwise or counter-clockwise rotated. To assess the effect of different noise structures on measurement and linkage processes, task performance was measured in the presence of binocular, random-dot masks, structured as either antiphase depth sinusoids, or as random distributions of dots in depth. For a fixed number of surface dots, the ratio of mask-to-surface dots was varied to obtain thresholds for orientation discrimination. Antiphase masks were found to be more effective than random depth masks, requiring a lower mask-to-surface dot ratio to inhibit performance. For antiphase masks, performance improved with decreased cyclopean frequency, increased disparity amplitude, and/or an increase in the total number of stimulus dots. Although a cross-correlation model of disparity measurement could account for antiphase mask performance, random depth masking effects were consistent with limitations in relative disparity processing. This suggests that performance is noise-limited for antiphase masks and complexity-limited for random masks. We propose that use of differing mask types may prove effective in understanding these distinct forms of impairment.

7
Metlapally S, Bharadwaj SR, Roorda A, Nilagiri VK, Yu TT, Schor CM. Binocular cross-correlation analyses of the effects of high-order aberrations on the stereoacuity of eyes with keratoconus. J Vis 2019; 19:12. [PMID: 31185094 PMCID: PMC6559754 DOI: 10.1167/19.6.12] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022] Open
Abstract
Stereoacuity losses are induced by increased magnitudes and interocular differences in high-order aberrations (HOAs). This study used keratoconus as a model to investigate the impact of HOAs on disparity processing and stereoacuity. HOAs and stereoacuity were quantified in subjects with keratoconus (n = 21) with HOAs uncorrected (wearing spectacles) or minimized (wearing rigid gas-permeable contact lenses) and in control subjects without keratoconus (n = 5) for 6-mm pupil diameters. Disparity signal quality was estimated using metrics derived from binocular cross-correlation functions of stereo pairs convolved with point-spread functions from these HOAs. Metrics computed for all subjects were compared with stereoacuities. The effects of contrast losses and phase shifts on disparity signal quality were studied independently by manipulating the amplitude and phase components of optical transfer functions. The magnitudes, orientations, interocular relationships in magnitude, and shape of the point-spread function affected the cross-correlation metrics that determine disparity signal quality. Stereoacuity covaries strongly with cross-correlation metrics and moderately with image-quality metrics. Both phase distortions and contrast losses due to HOAs significantly influence computations of binocular disparity. HOA-induced stereoacuity reductions are attributable to disparity blur and noise from image properties that reduce the height and kurtosis of the peak stimulus disparity match of the cross-correlation. Phase distortions and contrast losses due to HOAs are both partly responsible for the greater stereoacuity losses seen with spectacles compared to rigid gas-permeable contact lenses in keratoconus.
Affiliation(s)
- Shrikant R Bharadwaj
- Brien Holden Institute of Optometry and Vision Sciences, LV Prasad Eye Institute, Telangana, India
- Vinay Kumar Nilagiri
- Brien Holden Institute of Optometry and Vision Sciences, LV Prasad Eye Institute, Telangana, India

8
Read JCA, Cumming BG. The psychophysics of stereopsis can be explained without invoking independent ON and OFF channels. J Vis 2019; 19:7. [PMID: 31173632 PMCID: PMC6690401 DOI: 10.1167/19.6.7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022] Open
Abstract
Early vision proceeds through distinct ON and OFF channels, which encode luminance increments and decrements respectively. It has been argued that these channels also contribute separately to stereoscopic vision. This is based on the fact that observers perform better on a noisy disparity discrimination task when the stimulus is a random-dot pattern consisting of equal numbers of black and white dots (a “mixed-polarity stimulus,” argued to activate both ON and OFF stereo channels), than when it consists of all-white or all-black dots (“same-polarity,” argued to activate only one). However, it is not clear how this theory can be reconciled with our current understanding of disparity encoding. Recently, a binocular convolutional neural network was able to replicate the mixed-polarity advantage shown by human observers, even though it was based on linear filters and contained no mechanisms which would respond separately to black or white dots. Here, we show that a subtle feature of the way the stimuli were constructed in all these experiments can explain the results. The interocular correlation between left and right images is actually lower for the same-polarity stimuli than for mixed-polarity stimuli with the same amount of disparity noise applied to the dots. Because our current theories suggest stereopsis is based on a correlation-like computation in primary visual cortex, this postulate can explain why performance was better for the mixed-polarity stimuli. We conclude that there is currently no evidence supporting separate ON and OFF channels in stereopsis.
Affiliation(s)
- Jenny C A Read
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Bruce G Cumming
- Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, MD, USA

9
Goutcher R, Connolly E, Hibbard PB. Surface continuity and discontinuity bias the perception of stereoscopic depth. J Vis 2018; 18:13. [DOI: 10.1167/18.12.13] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022] Open
Affiliation(s)
- Ross Goutcher
- Psychology, Faculty of Natural Sciences, University of Stirling, Stirling, UK
- Eilidh Connolly
- Psychology, Faculty of Natural Sciences, University of Stirling, Stirling, UK

10
Leimkuhler T, Kellnhofer P, Ritschel T, Myszkowski K, Seidel HP. Perceptual Real-Time 2D-to-3D Conversion Using Cue Fusion. IEEE Transactions on Visualization and Computer Graphics 2018; 24:2037-2050. [PMID: 28504938 DOI: 10.1109/tvcg.2017.2703612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
We propose a system to infer binocular disparity from a monocular video stream in real-time. Different from classic reconstruction of physical depth in computer vision, we compute perceptually plausible disparity, that is numerically inaccurate, but results in a very similar overall depth impression with plausible overall layout, sharp edges, fine details and agreement between luminance and disparity. We use several simple monocular cues to estimate disparity maps and confidence maps of low spatial and temporal resolution in real-time. These are complemented by spatially-varying, appearance-dependent and class-specific disparity prior maps, learned from example stereo images. Scene classification selects this prior at runtime. Fusion of prior and cues is done by means of robust MAP inference on a dense spatio-temporal conditional random field with high spatial and temporal resolution. Using normal distributions allows this in constant-time, parallel per-pixel work. We compare our approach to previous 2D-to-3D conversion systems in terms of different metrics, as well as a user study and validate our notion of perceptually plausible disparity.

11
Gao Y, Li J, Li J, Wang S. Modeling the convergence accommodation of stereo vision for binocular endoscopy. Int J Med Robot 2017; 14. [PMID: 29052314 DOI: 10.1002/rcs.1866] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/29/2017] [Revised: 07/21/2017] [Accepted: 09/01/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND The stereo laparoscope is an important tool for achieving depth perception in robot-assisted minimally invasive surgery (MIS). METHODS A dynamic convergence accommodation algorithm is proposed to improve the viewing experience and achieve accurate depth perception. Based on the principle of the human vision system, a positional kinematic model of the binocular view system is established. The imaging plane pair is rectified to ensure that the two rectified virtual optical axes intersect at the fixation target to provide immersive depth perception. RESULTS Stereo disparity was simulated with the roll and pitch movements of the binocular system. The chessboard test and the endoscopic peg transfer task were performed, and the results demonstrated the improved disparity distribution and robustness of the proposed convergence accommodation method with respect to the position of the fixation target. CONCLUSIONS This method offers a new solution for effective depth perception with the stereo laparoscopes used in robot-assisted MIS.
Affiliation(s)
- Yuanqian Gao
- School of Mechanical Engineering, Tianjin University, China
- Jinhua Li
- School of Mechanical Engineering, Tianjin University, China
- Jianmin Li
- School of Mechanical Engineering, Tianjin University, China
- Shuxin Wang
- School of Mechanical Engineering, Tianjin University, China

12
Cammack P, Harris JM. Depth perception in disparity-defined objects: finding the balance between averaging and segregation. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0258. [PMID: 27269601 PMCID: PMC4901452 DOI: 10.1098/rstb.2015.0258] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 03/09/2016] [Indexed: 11/20/2022] Open
Abstract
Deciding what constitutes an object, and what background, is an essential task for the visual system. This presents a conundrum: averaging over the visual scene is required to obtain a precise signal for object segregation, but segregation is required to define the region over which averaging should take place. Depth, obtained via binocular disparity (the differences between two eyes’ views), could help with segregation by enabling identification of object and background via differences in depth. Here, we explore depth perception in disparity-defined objects. We show that a simple object segregation rule, followed by averaging over that segregated area, can account for depth estimation errors. To do this, we compared objects with smoothly varying depth edges to those with sharp depth edges, and found that perceived peak depth was reduced for the former. A computational model used a rule based on object shape to segregate and average over a central portion of the object, and was able to emulate the reduction in perceived depth. We also demonstrated that the segregated area is not predefined but is dependent on the object shape. We discuss how this segregation strategy could be employed by animals seeking to deter binocular predators. This article is part of the themed issue ‘Vision in our three-dimensional world’.
Affiliation(s)
- P Cammack
- School of Psychology and Neuroscience, University of St Andrews, St Andrews KY16 9JP, UK
- J M Harris
- School of Psychology and Neuroscience, University of St Andrews, St Andrews KY16 9JP, UK

13
Guan P, Banks MS. Stereoscopic depth constancy. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0253. [PMID: 27269596 PMCID: PMC4901447 DOI: 10.1098/rstb.2015.0253] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Accepted: 03/09/2016] [Indexed: 01/03/2023] Open
Abstract
Depth constancy is the ability to perceive a fixed depth interval in the world as constant despite changes in viewing distance and the spatial scale of depth variation. It is well known that the spatial frequency of depth variation has a large effect on threshold. In the first experiment, we determined that the visual system compensates for this differential sensitivity when the change in disparity is suprathreshold, thereby attaining constancy similar to contrast constancy in the luminance domain. In a second experiment, we examined the ability to perceive constant depth when the spatial frequency and viewing distance both changed. To attain constancy in this situation, the visual system has to estimate distance. We investigated this ability when vergence, accommodation and vertical disparity are all presented accurately and therefore provided veridical information about viewing distance. We found that constancy is nearly complete across changes in viewing distance. Depth constancy is most complete when the scale of the depth relief is constant in the world rather than when it is constant in angular units at the retina. These results bear on the efficacy of algorithms for creating stereo content. This article is part of the themed issue ‘Vision in our three-dimensional world’.
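The abstract above notes that attaining constancy across viewing distance requires an estimate of distance. The standard viewing geometry shows why: the same physical depth interval produces a much smaller retinal disparity at a larger distance. The sketch below illustrates this for two points on the midline. It is not taken from the paper, and the function names and the 63 mm interpupillary distance are assumed values chosen for illustration.

```python
import math


def disparity_rad(distance_m, depth_interval_m, ipd_m=0.063):
    """Relative disparity (radians) between midline points at distance_m
    and distance_m + depth_interval_m, for an assumed interpupillary distance."""
    vergence_near = 2.0 * math.atan(ipd_m / (2.0 * distance_m))
    vergence_far = 2.0 * math.atan(ipd_m / (2.0 * (distance_m + depth_interval_m)))
    return vergence_near - vergence_far


def to_arcmin(angle_rad):
    """Convert radians to minutes of arc."""
    return math.degrees(angle_rad) * 60.0


# The same 1 cm depth interval viewed from three distances.
for distance in (0.5, 1.0, 2.0):
    print(distance, round(to_arcmin(disparity_rad(distance, 0.01)), 2))
```

For small depth intervals the disparity scales roughly with the inverse square of distance, so halving the viewing distance roughly quadruples the disparity. Perceived depth can stay constant only if the disparity is rescaled by an estimate of distance.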
Affiliation(s)
- Phillip Guan
- UC Berkeley-UCSF Graduate Program in Bioengineering, Berkeley and San Francisco, CA 94720, USA
- Martin S Banks
- UC Berkeley-UCSF Graduate Program in Bioengineering, Berkeley and San Francisco, CA 94720, USA; School of Optometry, Vision Science Program, UC Berkeley, Berkeley, CA 94720, USA

14
Neurons in Striate Cortex Signal Disparity in Half-Matched Random-Dot Stereograms. J Neurosci 2017; 36:8967-76. [PMID: 27559177 DOI: 10.1523/jneurosci.0642-16.2016] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/26/2016] [Accepted: 06/17/2016] [Indexed: 11/21/2022] Open
Abstract
Human stereopsis can operate in dense "cyclopean" images containing no monocular objects. This is believed to depend on the computation of binocular correlation by neurons in primary visual cortex (V1). The observation that humans perceive depth in half-matched random-dot stereograms, although these stimuli have no net correlation, has led to the proposition that human depth perception in these stimuli depends on a distinct "matching" computation possibly performed in extrastriate cortex. However, recording from disparity-selective neurons in V1 of fixating monkeys, we found that they are in fact able to signal disparity in half-matched stimuli. We present a simple model that explains these results. This reinstates the view that disparity-selective neurons in V1 provide the initial substrate for perception in dense cyclopean stimuli, and strongly suggests that separate correlation and matching computations are not necessary to explain existing data on mixed correlation stereograms.
SIGNIFICANCE STATEMENT The initial step in stereoscopic 3D vision is generally thought to be a correlation-based computation that takes place in striate cortex. Recent research has argued that there must be an additional matching computation involved in extracting stereoscopic depth in random-dot stereograms. This is based on the observation that humans can perceive depth in stimuli with a mean binocular correlation of zero (where a correlation-based mechanism should not signal depth). We show that correlation-based cells in striate cortex do in fact signal depth here because they convert fluctuations in the correlation level into a mean change in the firing rate. Our results reinstate the view that these cells provide a sufficient substrate for the perception of stereoscopic depth.
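The statement above that half-matched stimuli have no net correlation can be checked with a toy example. The sketch below builds a one-dimensional dot pattern in which half of the dots have the same contrast polarity in both eyes and half have opposite polarity. It is an illustration only, not the authors' stimulus code or model, and `half_matched_pair`, `mean_product`, and the window size of 20 dots are hypothetical choices.

```python
import random


def half_matched_pair(n_dots, seed=0):
    """Left and right dot polarities (+1 or -1); a random half of the dots
    is contrast-reversed (anticorrelated) in the right eye."""
    rng = random.Random(seed)
    left = [rng.choice((-1, 1)) for _ in range(n_dots)]
    flipped = [True] * (n_dots // 2) + [False] * (n_dots - n_dots // 2)
    rng.shuffle(flipped)
    right = [-x if f else x for x, f in zip(left, flipped)]
    return left, right, flipped


def mean_product(a, b):
    """Mean of pairwise products; for +1/-1 dots this acts as the correlation."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


left, right, flipped = half_matched_pair(1000, seed=3)

# Correlation over the whole pattern, and over each subset separately.
net = mean_product(left, right)
matched = mean_product([l for l, f in zip(left, flipped) if not f],
                       [r for r, f in zip(right, flipped) if not f])
anticorrelated = mean_product([l for l, f in zip(left, flipped) if f],
                              [r for r, f in zip(right, flipped) if f])

# Correlation within consecutive local windows of 20 dots.
window = 20
local = [mean_product(left[i:i + window], right[i:i + window])
         for i in range(0, len(left), window)]
print(net, matched, anticorrelated, min(local), max(local))
```

The net correlation is zero by construction, yet the local windows generally depart from zero because each window contains an unequal mix of matched and anticorrelated dots. This is the kind of fluctuation in correlation level the abstract refers to. How V1 neurons convert such fluctuations into a change in mean firing rate is not modeled here.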

15
Development of Relative Disparity Sensitivity in Human Visual Cortex. J Neurosci 2017; 37:5608-5619. [PMID: 28473649 DOI: 10.1523/jneurosci.3570-16.2017] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 11/19/2016] [Revised: 04/21/2017] [Accepted: 04/25/2017] [Indexed: 12/31/2022] Open
Abstract
Stereopsis is the primary cue underlying our ability to make fine depth judgments. In adults, depth discriminations are supported largely by relative rather than absolute binocular disparity, and depth is perceived primarily for horizontal rather than vertical disparities. Although human infants begin to exhibit disparity-specific responses between 3 and 5 months of age, it is not known how relative disparity mechanisms develop. Here we show that the specialization for relative disparity is highly immature in 4- to 6-month-old infants but is adult-like in 4- to 7-year-old children. Disparity-tuning functions for horizontal and vertical disparities were measured using the visual evoked potential. Infant relative disparity thresholds, unlike those of adults, were equal for vertical and horizontal disparities. Their horizontal disparity thresholds were a factor of ∼10 higher than adults, but their vertical disparity thresholds differed by a factor of only ∼4. Horizontal relative disparity thresholds for 4- to 7-year-old children were comparable with those of adults at ∼0.5 arcmin. To test whether infant immaturity was due to spatial limitations or insensitivity to interocular correlation, highly suprathreshold horizontal and vertical disparities were presented in alternate regions of the display, and the interocular correlation of the interdigitated regions was varied from 0% to 100%. This manipulation regulated the availability of coarse-scale relative disparity cues. Adult and infant responses both increased with increasing interocular correlation by similar magnitudes, but adult responses increased much more for horizontal disparities, further evidence for qualitatively immature stereopsis based on relative disparity at 4-6 months of age.
SIGNIFICANCE STATEMENT Stereopsis, our ability to sense depth from horizontal image disparity, is among the finest spatial discriminations made by the primate visual system. Fine stereoscopic depth discriminations depend critically on comparisons of disparity relationships in the image that are supported by relative disparity cues rather than the estimation of single, absolute disparities. Very young human and macaque infants are sensitive to absolute disparity, but no previous study has specifically studied the development of relative disparity sensitivity, a hallmark feature of adult stereopsis. Here, using high-density EEG recordings, we show that 4- to 6-month-old infants display both quantitative and qualitative response immaturities for relative disparity information. Relative disparity responses are adult-like no later than 4-7 years of age.

16
Hornsey RL, Hibbard PB, Scarfe P. Binocular Depth Judgments on Smoothly Curved Surfaces. PLoS One 2016; 11:e0165932. [PMID: 27824895 PMCID: PMC5100889 DOI: 10.1371/journal.pone.0165932] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/29/2015] [Accepted: 10/20/2016] [Indexed: 12/04/2022] Open
Abstract
Binocular disparity is an important cue to depth, allowing us to make very fine discriminations of the relative depth of objects. In complex scenes, this sensitivity depends on the particular shape and layout of the objects viewed. For example, judgments of the relative depths of points on a smoothly curved surface are less accurate than those for points in empty space. It has been argued that this occurs because depth relationships are represented accurately only within a local spatial area. A consequence of this is that, when judging the relative depths of points separated by depth maxima and minima, information must be integrated across separate local representations. This integration, by adding more stages of processing, might be expected to reduce the accuracy of depth judgements. We tested this idea directly by measuring how accurately human participants could report the relative depths of two dots, presented with different binocular disparities. In the first, Two Dot condition the two dots were presented in front of a square grid. In the second, Three Dot condition, an additional dot was presented midway between the target dots, at a range of depths, both nearer and further than the target dots. In the final, Surface condition, the target dots were placed on a smooth surface defined by binocular disparity cues. In some trials, this contained a depth maximum or minimum between the target dots. In the Three Dot condition, performance was impaired when the central dot was presented with a large disparity, in line with predictions. In the Surface condition, performance was worst when the midpoint of the surface was at a similar distance to the targets, and relatively unaffected when there was a large depth maximum or minimum present. These results are not consistent with the idea that depth order is represented only within a local spatial area.
Affiliation(s)
- Rebecca L. Hornsey
  - Department of Psychology, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
- Paul B. Hibbard
  - Department of Psychology, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom
- Peter Scarfe
  - School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Whiteknights Road, Reading, RG6 6AL, United Kingdom
17
A Single Mechanism Can Account for Human Perception of Depth in Mixed Correlation Random Dot Stereograms. PLoS Comput Biol 2016; 12:e1004906. [PMID: 27196696 PMCID: PMC4873186 DOI: 10.1371/journal.pcbi.1004906] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2015] [Accepted: 04/08/2016] [Indexed: 11/19/2022] Open
Abstract
In order to extract retinal disparity from a visual scene, the brain must match corresponding points in the left and right retinae. This computationally demanding task is known as the stereo correspondence problem. The initial stage of the solution to the correspondence problem is generally thought to consist of a correlation-based computation. However, recent work by Doi et al. suggests that human observers can see depth in a class of stimuli where the mean binocular correlation is 0 (half-matched random dot stereograms). Half-matched random dot stereograms are made up of an equal number of correlated and anticorrelated dots, and the binocular energy model, a well-known model of V1 binocular complex cells, fails to signal disparity here. This has led to the proposition that a second, match-based computation must be extracting disparity in these stimuli. Here we show that a straightforward modification to the binocular energy model, adding a point output nonlinearity, is by itself sufficient to produce cells that are disparity-tuned to half-matched random dot stereograms. We then show that a simple decision model using this single mechanism can reproduce psychometric functions generated by human observers, including reduced performance to large disparities and rapidly updating dot patterns. The model makes predictions about how performance should change with dot size in half-matched stereograms and temporal alternation in correlation, which we test in human observers. We conclude that a single correlation-based computation, based directly on already-known properties of V1 neurons, can account for the literature on mixed correlation random dot stereograms.
18
This issue at a glance. J Curr Ophthalmol 2016; 28:3-4. [PMID: 27239594 PMCID: PMC4881224 DOI: 10.1016/j.joco.2016.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
19
Stereoacuity after photorefractive keratectomy in myopia. J Curr Ophthalmol 2016; 28:17-20. [PMID: 27239597 PMCID: PMC4881238 DOI: 10.1016/j.joco.2016.01.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2015] [Accepted: 01/31/2016] [Indexed: 11/24/2022] Open
Abstract
Purpose: Stereopsis, as a part of visual function, is the ability of differentiating between the two eyes' views (binocular disparity), due to the eyes' different positions. The aim of this study was to compare stereoscopic vision before and after photorefractive keratectomy (PRK) in myopia. Methods: In a prospective interventional case series study clinical trial, forty-eight myopic individuals (age range: 18–34 years) who had undergone PRK surgery by a Bausch & Lomb Technolas 217z excimer laser were included. In all patients, stereoscopic vision was assessed using TNO test charts at 40 cm distance preoperatively and at 3 and 6 months postoperatively. Results: A total of 48 cases (96 eyes, 69% female) with a mean age of 26.70 ± 4.89 years (range: 18–34 years) were treated. Uncorrected visual acuity (UCVA) was improved and refraction was corrected significantly after PRK surgery. The stereoscopic vision in patients was 246.56 ± 98.43 s of arc before PRK surgery. Postoperatively, the stereoacuities were recorded as 365.38 ± 112.65 s of arc and 343.51 ± 88.96 s of arc at 3 and 6 months, respectively. These differences were statistically significant (p < 0.001). Conclusion: PRK was successful and safe in improving refractive error and UCVA, but it may deteriorate the stereoscopic vision. It may be due to an increase in higher order aberrations.
20
Abstract
The sense of time is foundational for perception and action, yet it frequently departs significantly from physical time. In the paper we review recent progress on temporal contextual effects, multisensory temporal integration, temporal recalibration, and related computational models. We suggest that subjective time arises from minimizing prediction errors and adaptive recalibration, which can be unified in the framework of predictive coding, a framework rooted in Helmholtz's 'perception as inference'.
Affiliation(s)
- Zhuanghua Shi
  - Department of Psychology, University of Munich, Munich, Germany
- David Burr
  - Neuroscience Institute, National Research Council, Pisa, Italy; Department of Neuroscience, University of Florence, Florence, Italy
21
Hibbard PB, Goutcher R, Hunter DW. Encoding and estimation of first- and second-order binocular disparity in natural images. Vision Res 2016; 120:108-20. [PMID: 26731646 PMCID: PMC4802249 DOI: 10.1016/j.visres.2015.10.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2014] [Revised: 10/26/2015] [Accepted: 10/26/2015] [Indexed: 11/23/2022]
Abstract
First- and second-order responses to natural binocular images are correlated. Second-order mechanisms can improve the accuracy of disparity estimation. Second-order mechanisms can extend the depth range of binocular stereopsis.
The first stage of processing of binocular information in the visual cortex is performed by mechanisms that are bandpass-tuned for spatial frequency and orientation. Psychophysical and physiological evidence have also demonstrated the existence of second-order mechanisms in binocular processing, which can encode disparities that are not directly accessible to first-order mechanisms. We compared the responses of first- and second-order binocular filters to natural images. We found that the responses of the second-order mechanisms are to some extent correlated with the responses of the first-order mechanisms, and that they can contribute to increasing both the accuracy, and depth range, of binocular stereopsis.
Affiliation(s)
- Paul B Hibbard
  - Department of Psychology, University of Essex, Colchester CO4 3SQ, UK; School of Psychology and Neuroscience, University of St Andrews, St Mary's Quad, South Street, St Andrews, KY16 9JP Scotland, UK
- Ross Goutcher
  - Psychology, School of Natural Sciences, University of Stirling, Stirling FK9 4LA, Scotland, UK
- David W Hunter
  - School of Psychology and Neuroscience, University of St Andrews, St Mary's Quad, South Street, St Andrews, KY16 9JP Scotland, UK
22
Allenmark F, Moutsopoulou K, Waszak F. A new look on S-R associations: How S and R link. Acta Psychol (Amst) 2015; 160:161-9. [PMID: 26253594 DOI: 10.1016/j.actpsy.2015.07.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2015] [Revised: 07/28/2015] [Accepted: 07/28/2015] [Indexed: 10/23/2022] Open
Abstract
Humans can learn associations between stimuli and responses which allow for faster, more efficient behavior when the same response is required to the same stimulus in the future. This is called stimulus-response (S-R) priming. Perceptual representations are known to be modular and hierarchical, i.e. different brain areas represent different perceptual features and higher brain areas represent increasingly abstract properties of the stimulus. In this study we investigated how perceptually specific the stimulus in S-R priming is. In particular we wanted to test whether basic visual features play a role in the S-R associations. We used a novel stimulus: images of objects built from basic visual features. Participants performed a classification task on the objects. We found no significant effect on reaction times of switching vs. repeating perceptual features between presentations of the same object. This suggests that S-R associations involve a perceptually non-specific stimulus representation.
23
Martins JA, Rodrigues JMF, du Buf H. Luminance, Colour, Viewpoint and Border Enhanced Disparity Energy Model. PLoS One 2015; 10:e0129908. [PMID: 26107954 PMCID: PMC4480855 DOI: 10.1371/journal.pone.0129908] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2015] [Accepted: 05/14/2015] [Indexed: 11/19/2022] Open
Abstract
The visual cortex is able to extract disparity information through the use of binocular cells. This process is reflected by the Disparity Energy Model, which describes the role and functioning of simple and complex binocular neuron populations, and how they are able to extract disparity. This model uses explicit cell parameters to mathematically determine preferred cell disparities, like spatial frequencies, orientations, binocular phases and receptive field positions. However, the brain cannot access such explicit cell parameters; it must rely on cell responses. In this article, we implemented a trained binocular neuronal population, which encodes disparity information implicitly. This allows the population to learn how to decode disparities, in a similar way to how our visual system could have developed this ability during evolution. At the same time, responses of monocular simple and complex cells can also encode line and edge information, which is useful for refining disparities at object borders. The brain should then be able, starting from a low-level disparity draft, to integrate all information, including colour and viewpoint perspective, in order to propagate better estimates to higher cortical areas.
Affiliation(s)
- Jaime A. Martins
  - Vision Laboratory (FCT), ISR-LARSyS, University of the Algarve, Faro, Portugal
- Hans du Buf
  - Vision Laboratory (FCT), ISR-LARSyS, University of the Algarve, Faro, Portugal
24
Reynaud A, Gao Y, Hess RF. A normative dataset on human global stereopsis using the quick Disparity Sensitivity Function (qDSF). Vision Res 2015; 113:97-103. [PMID: 26028556 DOI: 10.1016/j.visres.2015.04.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2014] [Revised: 02/23/2015] [Accepted: 04/22/2015] [Indexed: 10/23/2022]
Abstract
Global stereopsis results from the lateral displacement of distributed textured elements between the eyes. In this study, we investigate how the key parameters of the disparity sensitivity function such as its peak sensitivity and spatial bandwidth are distributed across a pool of normal observers and how large the individual differences are. For this purpose, we adapted the quick Contrast Sensitivity Function (qCSF, Lesmes et al., 2010) to the quick Disparity Sensitivity Function (qDSF). We show that this new method is accurate and allows a rapid measurement of disparity sensitivity for a range of different disparity spatial frequencies. Our results confirm that there is a greater variability in human disparity sensitivity tuning compared to other common visual features, for example, 1st or 2nd order contrast sensitivity.
Affiliation(s)
- Alexandre Reynaud
  - McGill Vision Research, Dept. Ophthalmology, McGill University, Montreal, PQ, Canada
- Yi Gao
  - McGill Vision Research, Dept. Ophthalmology, McGill University, Montreal, PQ, Canada
- Robert F Hess
  - McGill Vision Research, Dept. Ophthalmology, McGill University, Montreal, PQ, Canada
25
Sprague WW, Cooper EA, Tošić I, Banks MS. Stereopsis is adaptive for the natural environment. SCIENCE ADVANCES 2015; 1:e1400254. [PMID: 26207262 PMCID: PMC4507831 DOI: 10.1126/sciadv.1400254] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/23/2014] [Accepted: 04/14/2015] [Indexed: 05/16/2023]
Abstract
Humans and many animals have forward-facing eyes providing different views of the environment. Precise depth estimates can be derived from the resulting binocular disparities, but determining which parts of the two retinal images correspond to one another is computationally challenging. To aid the computation, the visual system focuses the search on a small range of disparities. We asked whether the disparities encountered in the natural environment match that range. We did this by simultaneously measuring binocular eye position and three-dimensional scene geometry during natural tasks. The natural distribution of disparities is indeed matched to the smaller range of correspondence search. Furthermore, the distribution explains the perception of some ambiguous stereograms. Finally, disparity preferences of macaque cortical neurons are consistent with the natural distribution.
Affiliation(s)
- William W. Sprague
  - Vision Science Graduate Group, University of California, Berkeley, Berkeley, CA 94720, USA
  - School of Optometry, University of California, Berkeley, Berkeley, CA 94720, USA
- Emily A. Cooper
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
- Ivana Tošić
  - Ricoh Innovations Corp., Menlo Park, CA 94025, USA
- Martin S. Banks
  - Vision Science Graduate Group, University of California, Berkeley, Berkeley, CA 94720, USA
  - School of Optometry, University of California, Berkeley, Berkeley, CA 94720, USA
26
Muryy AA, Fleming RW, Welchman AE. Key characteristics of specular stereo. J Vis 2014; 14:14. [PMID: 25540263 PMCID: PMC4278431 DOI: 10.1167/14.14.14] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2014] [Accepted: 10/21/2014] [Indexed: 11/24/2022] Open
Abstract
Because specular reflection is view-dependent, shiny surfaces behave radically differently from matte, textured surfaces when viewed with two eyes. As a result, specular reflections pose substantial problems for binocular stereopsis. Here we use a combination of computer graphics and geometrical analysis to characterize the key respects in which specular stereo differs from standard stereo, to identify how and why the human visual system fails to reconstruct depths correctly from specular reflections. We describe rendering of stereoscopic images of specular surfaces in which the disparity information can be varied parametrically and independently of monocular appearance. Using the generated surfaces and images, we explain how stereo correspondence can be established with known and unknown surface geometry. We show that even with known geometry, stereo matching for specular surfaces is nontrivial because points in one eye may have zero, one, or multiple matches in the other eye. Matching features typically yield skew (nonintersecting) rays, leading to substantial ortho-epipolar components to the disparities, which makes deriving depth values from matches nontrivial. We suggest that the human visual system may base its depth estimates solely on the epipolar components of disparities while treating the ortho-epipolar components as a measure of the underlying reliability of the disparity signals. Reconstructing virtual surfaces according to these principles reveals that they are piece-wise smooth with very large discontinuities close to inflection points on the physical surface. Together, these distinctive characteristics lead to cues that the visual system could use to diagnose specular reflections from binocular information.
Affiliation(s)
- Alexander A. Muryy
  - School of Psychology, University of Birmingham, Edgbaston, Birmingham, UK
  - Department of Psychology, University of Southampton, Highfield Campus, Southampton, UK
- Andrew E. Welchman
  - School of Psychology, University of Birmingham, Edgbaston, Birmingham, UK
  - Department of Psychology, University of Cambridge, Downing Street, Cambridge, UK
27
Doi T, Fujita I. Cross-matching: a modified cross-correlation underlying threshold energy model and match-based depth perception. Front Comput Neurosci 2014; 8:127. [PMID: 25360107 PMCID: PMC4197649 DOI: 10.3389/fncom.2014.00127] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2014] [Accepted: 09/22/2014] [Indexed: 11/25/2022] Open
Abstract
Three-dimensional visual perception requires correct matching of images projected to the left and right eyes. The matching process is faced with an ambiguity: part of one eye's image can be matched to multiple parts of the other eye's image. This stereo correspondence problem is complicated for random-dot stereograms (RDSs), because dots with an identical appearance produce numerous potential matches. Despite such complexity, human subjects can perceive a coherent depth structure. A coherent solution to the correspondence problem does not exist for anticorrelated RDSs (aRDSs), in which luminance contrast is reversed in one eye. Neurons in the visual cortex reduce disparity selectivity for aRDSs progressively along the visual processing hierarchy. A disparity-energy model followed by threshold nonlinearity (threshold energy model) can account for this reduction, providing a possible mechanism for the neural matching process. However, the essential computation underlying the threshold energy model is not clear. Here, we propose that a nonlinear modification of cross-correlation, which we term “cross-matching,” represents the essence of the threshold energy model. We placed half-wave rectification within the cross-correlation of the left-eye and right-eye images. The disparity tuning derived from cross-matching was attenuated for aRDSs. We simulated a psychometric curve as a function of graded anticorrelation (graded mixture of aRDS and normal RDS); this simulated curve reproduced the match-based psychometric function observed in human near/far discrimination. The dot density was 25% for both simulation and observation. We predicted that as the dot density increased, the performance for aRDSs should decrease below chance (i.e., reversed depth), and the level of anticorrelation that nullifies depth perception should also decrease. We suggest that cross-matching serves as a simple computation underlying the match-based disparity signals in stereoscopic depth perception.
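The abstract describes cross-matching only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that the half-wave rectification is applied to each pointwise left-right product before summation, and it uses a one-dimensional row of bright, dark, and background values with circular shifts. The function names, the ternary stimulus, and the row length are illustrative choices that do not come from the paper.

```python
import numpy as np

def cross_correlation(left, right, shifts):
    """Plain cross-correlation: sum of pointwise products at each candidate shift."""
    return np.array([np.sum(left * np.roll(right, -s)) for s in shifts])

def cross_matching(left, right, shifts):
    """Assumed form of cross-matching: half-wave rectify each product, then sum."""
    return np.array(
        [np.sum(np.maximum(left * np.roll(right, -s), 0.0)) for s in shifts]
    )

def make_dot_row(n, density, rng):
    """Hypothetical 1-D stimulus: +1 (bright dot), -1 (dark dot), 0 (background)."""
    half = density / 2.0
    return rng.choice([-1.0, 0.0, 1.0], size=n, p=[half, 1.0 - density, half])

rng = np.random.default_rng(0)
left = make_dot_row(512, 0.25, rng)   # 25% dot density, as in the abstract
true_shift = 4                        # illustrative disparity in samples
shifts = np.arange(-10, 11)

right_corr = np.roll(left, true_shift)   # correlated pair
right_anti = -right_corr                 # anticorrelated pair (contrast reversed)

for label, right in [("correlated", right_corr), ("anticorrelated", right_anti)]:
    cc = cross_correlation(left, right, shifts)
    cm = cross_matching(left, right, shifts)
    at_true = int(np.where(shifts == true_shift)[0][0])
    print(label, "correlation:", cc[at_true], "cross-matching:", cm[at_true])
```

In this toy setting, the correlated pair peaks at the true shift under both measures. For the anticorrelated pair, plain cross-correlation gives a full-size negative value at the true shift, whereas the rectified sum there is zero because every product is non-positive.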
Affiliation(s)
- Takahiro Doi
  - Laboratory for Cognitive Neuroscience, Center for Information and Neural Networks, Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
- Ichiro Fujita
  - Laboratory for Cognitive Neuroscience, Center for Information and Neural Networks, Graduate School of Frontier Biosciences, Osaka University, Suita, Japan; Center for Information and Neural Networks, National Institute of Information and Communications Technology, Suita, Japan
28
Abstract
According to the geometric relational expression of binocular stereopsis, for a given viewing distance the magnitude of the perceived depth of objects would be the same, as long as the disparity magnitudes were the same. However, we found that this is not necessarily the case for random-dot stereograms that depict parallel, overlapping, transparent stereoscopic surfaces (POTS). The data from five experiments indicated that (1) the magnitude of perceived depth between the two outer surfaces of a three- or a four-POTS configuration can be smaller than that for an identical pair of stereo surfaces of a two-POTS configuration for the range of disparities that we used (5.2-19.4 arcmin); (2) this phenomenon can be observed irrespective of the total dot density of a POTS configuration, at least for the range that we used (1.1-3.3 dots/deg²); and (3) the magnitude of perceived depth between the two outer surfaces of a POTS configuration can be reduced as the total number of stereo surfaces is increased, up to four surfaces. We explained these results in terms of a higher-order process or processes, with an output representing perceived depth magnitude, which is weakened when the number of its surfaces is increased.
29
Abstract
What are the geometric primitives of binocular disparity? The Venetian blind effect and other converging lines of evidence indicate that stereoscopic depth perception derives from disparities of higher-order structure in images of surfaces. Image structure entails spatial variations of intensity, texture, and motion, jointly structured by observed surfaces. The spatial structure of binocular disparity corresponds to the spatial structure of surfaces. Independent spatial coordinates are not necessary for stereoscopic vision. Stereopsis is highly sensitive to structural disparities associated with local surface shape. Disparate positions on retinal anatomy are neither necessary nor sufficient for stereopsis.
30
Wilkinson N, Paikan A, Gredebäck G, Rea F, Metta G. Staring us in the face? An embodied theory of innate face preference. Dev Sci 2014; 17:809-25. [PMID: 24946990 DOI: 10.1111/desc.12159] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2013] [Accepted: 11/01/2013] [Indexed: 11/28/2022]
Abstract
Human expertise in face perception grows over development, but even within minutes of birth, infants exhibit an extraordinary sensitivity to face-like stimuli. The dominant theory accounts for innate face detection by proposing that the neonate brain contains an innate face detection device, dubbed 'Conspec'. Newborn face preference has been promoted as some of the strongest evidence for innate knowledge, and forms a canonical stage for the modern form of the nature-nurture debate in psychology. Interpretation of newborn face preference results has concentrated on monocular stimulus properties, with little mention or focused investigation of potential binocular involvement. However, the question of whether and how newborns integrate the binocular visual streams bears directly on the generation of observable visual preferences. In this theoretical paper, we employ a synthetic approach utilizing robotic and computational models to draw together the threads of binocular integration and face preference in newborns, and demonstrate cases where the former may explain the latter. We suggest that a system-level view considering the binocular embodiment of newborn vision may offer a mutually satisfying resolution to some long-running arguments in the polarizing debate surrounding the existence and causal structure of newborns' 'innate knowledge' of faces.
Affiliation(s)
- Nick Wilkinson
  - iCub Facility, Istituto Italiano di Tecnologia, Genova, Italy
31
Wilkinson N, Metta G. Bilateral gain control; an "innate predisposition" for all sorts of things. Front Neurorobot 2014; 8:9. [PMID: 24611045 PMCID: PMC3933809 DOI: 10.3389/fnbot.2014.00009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2013] [Accepted: 02/05/2014] [Indexed: 12/02/2022] Open
Abstract
Empirical studies have revealed remarkable perceptual organization in neonates. Newborn behavioral distinctions have often been interpreted as implying functionally specific modular adaptations, and are widely cited as evidence supporting the nativist agenda. In this theoretical paper, we approach newborn perception and attention from an embodied, developmental perspective. At the mechanistic level, we argue that a generative mechanism based on mutual gain control between bilaterally corresponding points may underlie a number of functionally defined “innate predispositions” related to spatial-configural perception. At the computational level, bilateral gain control implements beamforming, which enables spatial-configural tuning at the front end sampling stage. At the psychophysical level, we predict that selective attention in newborns will favor contrast energy which projects to bilaterally corresponding points on the neonate subject's sensor array. The current work extends and generalizes previous work to formalize the bilateral correlation model of newborn attention at a high level, and demonstrate in minimal agent-based simulations how bilateral gain control can enable a simple, robust and “social” attentional bias.
Affiliation(s)
- Giorgio Metta
  - iCub Facility, Istituto Italiano di Tecnologia, Genova, Italy; Centre for Robotics and Neural Systems, University of Plymouth, Plymouth, UK
32
Goutcher R, Hibbard PB. Mechanisms for similarity matching in disparity measurement. Front Psychol 2014; 4:1014. [PMID: 24409163 PMCID: PMC3884144 DOI: 10.3389/fpsyg.2013.01014] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2013] [Accepted: 12/20/2013] [Indexed: 11/13/2022] Open
Abstract
Early neural mechanisms for the measurement of binocular disparity appear to operate in a manner consistent with cross-correlation-like processes. Consequently, cross-correlation, or cross-correlation-like procedures have been used in a range of models of disparity measurement. Using such procedures as the basis for disparity measurement creates a preference for correspondence solutions that maximize the similarity between local left and right eye image regions. Here, we examine how observers’ perception of depth in an ambiguous stereogram is affected by manipulations of luminance and orientation-based image similarity. Results show a strong effect of coarse-scale luminance similarity manipulations, but a relatively weak effect of finer-scale manipulations of orientation similarity. This is in contrast to the measurements of depth obtained from a standard cross-correlation model. This model shows strong effects of orientation similarity manipulations and weaker effects of luminance similarity. In order to account for these discrepancies, the standard cross-correlation approach may be modified to include an initial spatial frequency filtering stage. The performance of this adjusted model most closely matches human psychophysical data when spatial frequency filtering favors coarser scales. This is consistent with the operation of disparity measurement processes where spatial frequency and disparity tuning are correlated, or where disparity measurement operates in a coarse-to-fine manner.
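The adjusted model described in this abstract, cross-correlation preceded by a spatial frequency filtering stage, can be illustrated with a one-dimensional sketch. The details here are assumptions and not the authors' model: a Gaussian low-pass filter applied in the Fourier domain, circular shifts, and normalized cross-correlation as the similarity measure. The function names and the `cutoff_cycles` parameter are hypothetical.

```python
import numpy as np

def lowpass(signal, cutoff_cycles):
    """Gaussian low-pass filter applied in the Fourier domain (circular).

    cutoff_cycles is the Gaussian's standard deviation, in cycles per signal length.
    """
    freqs = np.fft.fftfreq(signal.size) * signal.size
    gain = np.exp(-0.5 * (freqs / cutoff_cycles) ** 2)
    return np.real(np.fft.ifft(np.fft.fft(signal) * gain))

def normalized_correlation(a, b):
    """Pearson-style similarity between two equal-length signals."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def estimate_disparity(left, right, max_shift, cutoff_cycles):
    """Filter both signals to coarse scales, then pick the best-matching shift."""
    left_f = lowpass(left, cutoff_cycles)
    right_f = lowpass(right, cutoff_cycles)
    shifts = np.arange(-max_shift, max_shift + 1)
    scores = np.array(
        [normalized_correlation(left_f, np.roll(right_f, -s)) for s in shifts]
    )
    return int(shifts[np.argmax(scores)]), scores

rng = np.random.default_rng(0)
left = rng.standard_normal(512)      # illustrative luminance profile
right = np.roll(left, 5)             # same profile displaced by 5 samples
shift, scores = estimate_disparity(left, right, max_shift=12, cutoff_cycles=20.0)
print("estimated shift:", shift)
```

Lowering `cutoff_cycles` restricts matching to coarser scales, which broadens the correlation peak. The abstract reports that favoring coarser scales brought the model closest to the human data, but this sketch does not attempt to reproduce that comparison.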
Affiliation(s)
- Ross Goutcher
  - Psychology, School of Natural Sciences, University of Stirling, Stirling, Scotland, UK
- Paul B Hibbard
  - Department of Psychology, University of Essex, Colchester, UK
33
Vlaskamp BNS, Guan P, Banks MS. The venetian-blind effect: a preference for zero disparity or zero slant? Front Psychol 2013; 4:836. [PMID: 24273523 PMCID: PMC3822326 DOI: 10.3389/fpsyg.2013.00836] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2013] [Accepted: 10/21/2013] [Indexed: 11/21/2022] Open
Abstract
When periodic stimuli such as vertical sinewave gratings are presented to the two eyes, the initial stage of disparity estimation yields multiple solutions at multiple depths. The solutions are all frontoparallel when the sinewaves have the same spatial frequency; they are all slanted when the sinewaves have quite different frequencies. Despite multiple solutions, humans perceive only one depth in each visual direction: a single frontoparallel plane when the frequencies are the same and a series of small slanted planes (Venetian blinds) when the frequencies are quite different. These percepts are consistent with a preference for solutions that minimize absolute disparity or overall slant. The preference for minimum disparity and minimum slant are identical for gaze at zero eccentricity; we dissociated the predictions of the two by measuring the occurrence of Venetian blinds when the stimuli were viewed in eccentric gaze. The results were generally quite consistent with a zero-disparity preference (Experiment 1), but we also observed a shift toward a zero-slant preference when the edges of the stimulus had zero slant (Experiment 2). These observations provide useful insights into how the visual system constructs depth percepts from a multitude of possible depths.
Affiliation(s)
- Björn N. S. Vlaskamp
- Vision Science Program, School of Optometry, University of Berkeley, Berkeley, CA, USA
- Philips Research, Eindhoven, Netherlands
- Phillip Guan
- The UC Berkeley UCSF Graduate Program in Bioengineering, University of California, Berkeley, San Francisco, CA, USA
- Martin S. Banks
- Vision Science Program, School of Optometry, University of Berkeley, Berkeley, CA, USA
- The UC Berkeley UCSF Graduate Program in Bioengineering, University of California, Berkeley, San Francisco, CA, USA
|
34
|
Abstract
To interact rapidly and effectively with our environment, our brain needs access to a neural representation--or map--of the spatial layout of the external world. However, the construction of such a map poses major challenges to the visual system, given that the images on our retinae depend on where the eyes are looking, and shift each time we move our eyes, head, and body to explore the world. Much research has been devoted to how the stability is achieved, with the debate often polarized between the utility of spatiotopic maps (that remain solid in external coordinates), as opposed to transiently updated retinotopic maps. Our research suggests that the visual system uses both strategies to maintain stability. fMRI, motion-adaptation, and saccade-adaptation studies demonstrate and characterize spatiotopic neural maps within the dorsal visual stream that remain solid in external rather than retinal coordinates. However, the construction of these maps takes time (up to 500 ms) and attentional resources. To solve the immediate problems created by individual saccades, we postulate the existence of a separate system to bridge each saccade with neural units that are 'transiently craniotopic'. These units prepare for the effects of saccades with a shift of their receptive fields before the saccade starts, then relaxing back into their standard position during the saccade, compensating for its action. Psychophysical studies investigating localization of stimuli flashed briefly around the time of saccades provide strong support for these neural mechanisms, and show quantitatively how they integrate information across saccades. This transient system cooperates with the spatiotopic mechanism to provide a useful map to guide interactions with our environment: one rapid and transitory, bringing into play the high-resolution visual areas; the other slow, long-lasting, and low-resolution, useful for interacting with the world.
Affiliation(s)
- David C Burr
- Department of Psychology, University of Florence, via San Salvi 12, 50135 Florence, Italy.
|
35
|
Hou F, Huang CB, Liang J, Zhou Y, Lu ZL. Contrast gain-control in stereo depth and cyclopean contrast perception. J Vis 2013; 13:13.8.3. [PMID: 23820024 DOI: 10.1167/13.8.3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022] Open
Abstract
Although human observers can perceive depth from stereograms with considerable contrast difference between the images presented to the two eyes (Legge & Gu, 1989), how contrast gain control functions in stereo depth perception has not been systematically investigated. Recently, we developed a multipathway contrast gain-control model (MCM) for binocular phase and contrast perception (Huang, Zhou, Lu, & Zhou, 2011; Huang, Zhou, Zhou, & Lu, 2010) based on a contrast gain-control model of binocular phase combination (Ding & Sperling, 2006). To extend the MCM to simultaneously account for stereo depth and cyclopean contrast perception, we manipulated the contrasts (ranging from 0.08 to 0.4) of the dynamic random dot stereograms (RDS) presented to the left and right eyes independently and measured both disparity thresholds for depth perception and perceived contrasts of the cyclopean images. We found that both disparity threshold and perceived contrast depended strongly on the signal contrasts in the two eyes, exhibiting characteristic binocular contrast gain-control properties. The results were well accounted for by an extended MCM model, in which each eye exerts gain control on the other eye's signal in proportion to its own signal contrast energy and also gain control over the other eye's gain control; stereo strength is proportional to the product of the signal strengths in the two eyes after contrast gain control, and perceived contrast is computed by combining contrast energy from the two eyes. The new model provided an excellent account of our data (r² = 0.945), as well as some challenging results in the literature.
Affiliation(s)
- Fang Hou
- Laboratory of Brain Processes, Department of Psychology, The Ohio State University, Columbus, OH, USA.
|
36
|
Conjunctions between motion and disparity are encoded with the same spatial resolution as disparity alone. J Neurosci 2012; 32:14331-43. [PMID: 23055504 DOI: 10.1523/jneurosci.3495-11.2012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022] Open
Abstract
Neurons in cortical area MT respond well to transparent streaming motion in distinct depth planes, such as caused by observer self-motion, but do not contain subregions excited by opposite directions of motion. We therefore predicted that spatial resolution for transparent motion/disparity conjunctions would be limited by the size of MT receptive fields, just as spatial resolution for disparity is limited by the much smaller receptive fields found in primary visual cortex, V1. We measured this using a novel "joint motion/disparity grating," on which human observers detected motion/disparity conjunctions in transparent random-dot patterns containing dots streaming in opposite directions on two depth planes. Surprisingly, observers showed the same spatial resolution for these as for pure disparity gratings. We estimate the limiting receptive field diameter at 11 arcmin, similar to V1 and much smaller than MT. Higher internal noise for detecting joint motion/disparity produces a slightly lower high-frequency cutoff of 2.5 cycles per degree (cpd) versus 3.3 cpd for disparity. This suggests that information on motion/disparity conjunctions is available in the population activity of V1 and that this information can be decoded for perception even when it is invisible to neurons in MT.
|
37
|
Vidal-Naquet M, Gepshtein S. Spatially invariant computations in stereoscopic vision. Front Comput Neurosci 2012; 6:47. [PMID: 22811665 PMCID: PMC3397313 DOI: 10.3389/fncom.2012.00047] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/11/2011] [Accepted: 06/26/2012] [Indexed: 11/13/2022] Open
Abstract
Perception of stereoscopic depth requires that visual systems solve a correspondence problem: find parts of the left-eye view of the visual scene that correspond to parts of the right-eye view. The standard model of binocular matching implies that similarity of left and right images is computed by inter-ocular correlation. But the left and right images of the same object are normally distorted relative to one another by the binocular projection, in particular when slanted surfaces are viewed from close distance. Correlation often fails to detect correct correspondences between such image parts. We investigate a measure of inter-ocular similarity that takes advantage of spatially invariant computations similar to the computations performed by complex cells in biological visual systems. This measure tolerates distortions of corresponding image parts and yields excellent performance over a much larger range of surface slants than the standard model. The results suggest that, rather than serving as disparity detectors, multiple binocular complex cells take part in the computation of inter-ocular similarity, and that visual systems are likely to postpone commitment to particular binocular disparities until later stages in the visual process.
|
38
|
Mitsudo H. A minimal algorithm for computing the likelihood of binocular correspondence. Japanese Psychological Research 2012. [DOI: 10.1111/j.1468-5884.2011.00504.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
|
39
|
Allenmark F, Read J. Spatial stereoresolution for depth corrugations may be set in primary visual cortex. BMC Neurosci 2011. [PMCID: PMC3240371 DOI: 10.1186/1471-2202-12-s1-p263] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open
|
40
|
Allenmark F, Read JCA. Spatial stereoresolution for depth corrugations may be set in primary visual cortex. PLoS Comput Biol 2011; 7:e1002142. [PMID: 21876667 PMCID: PMC3158043 DOI: 10.1371/journal.pcbi.1002142] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 09/02/2010] [Accepted: 06/16/2011] [Indexed: 11/18/2022] Open
Abstract
Stereo “3D” depth perception requires the visual system to extract binocular disparities between the two eyes' images. Several current models of this process, based on the known physiology of primary visual cortex (V1), do this by computing a piecewise-frontoparallel local cross-correlation between the left and right eye's images. The size of the “window” within which detectors examine the local cross-correlation corresponds to the receptive field size of V1 neurons. This basic model has successfully captured many aspects of human depth perception. In particular, it accounts for the low human stereoresolution for sinusoidal depth corrugations, suggesting that the limit on stereoresolution may be set in primary visual cortex. An important feature of the model, reflecting a key property of V1 neurons, is that the initial disparity encoding is performed by detectors tuned to locally uniform patches of disparity. Such detectors respond better to square-wave depth corrugations, since these are locally flat, than to sinusoidal corrugations which are slanted almost everywhere. Consequently, for any given window size, current models predict better performance for square-wave disparity corrugations than for sine-wave corrugations at high amplitudes. We have recently shown that this prediction is not borne out: humans perform no better with square-wave than with sine-wave corrugations, even at high amplitudes. The failure of this prediction raised the question of whether stereoresolution may actually be set at later stages of cortical processing, perhaps involving neurons tuned to disparity slant or curvature. Here we extend the local cross-correlation model to include existing physiological and psychophysical evidence indicating that larger disparities are detected by neurons with larger receptive fields (a size/disparity correlation). 
We show that this simple modification succeeds in reconciling the model with human results, confirming that stereoresolution for disparity gratings may indeed be limited by the size of receptive fields in primary visual cortex.
Stereo depth perception requires the brain to detect displacements of features between the two eyes' images. Several current models use local cross-correlation between the two eyes' images, looking for small patches that are the most similar between the two images. There is evidence that cells in primary visual cortex are doing something very similar. This model captures many aspects of human depth perception, notably why we can see depth variation on much coarser scales than luminance variation. This suggests that the spatial resolution for depth perception is set in primary visual cortex. However, the model as currently implemented cannot explain why humans are as good at detecting sine-waves in depth as they are at detecting square-waves, a fact that we have previously raised as a challenge to the model. Here we show that if we introduce a size/disparity correlation, such that larger patches are used when searching for larger displacements of features between the two images, then simple models based on local cross-correlation can explain human performance for both sine- and square-wave depth corrugations, without needing to invoke more complicated disparity processing. This supports the proposal that spatial resolution for depth perception is set in primary visual cortex.
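The idea of a local cross-correlation with a size/disparity correlation can be sketched as follows. This is only an illustration under stated assumptions, not the published model: the images are one-dimensional, the window half-width grows linearly with the absolute candidate disparity, and the names `patch_correlation`, `estimate_disparity`, `base_half_width`, and `growth` are hypothetical.

```python
import math
import random


def patch_correlation(left, right, x, d, half_width):
    """Normalized cross-correlation between a left-eye window centred on x
    and a right-eye window centred on x + d."""
    a = left[x - half_width: x + half_width + 1]
    b = right[x + d - half_width: x + d + half_width + 1]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0 else 0.0


def estimate_disparity(left, right, x, max_d, base_half_width=4, growth=1.0):
    """Return the candidate disparity with the highest local correlation.

    Assumed size/disparity rule: larger candidate disparities are tested
    with larger windows (half-width = base_half_width + growth * |d|).
    The caller must keep every window inside the image bounds.
    """
    best_d, best_score = 0, -2.0
    for d in range(-max_d, max_d + 1):
        half_width = base_half_width + int(growth * abs(d))
        score = patch_correlation(left, right, x, d, half_width)
        if score > best_score:
            best_d, best_score = d, score
    return best_d


if __name__ == "__main__":
    random.seed(1)
    left = [random.gauss(0.0, 1.0) for _ in range(200)]   # random-noise "stereogram" row
    right = left[-5:] + left[:-5]                         # right[i] = left[i - 5]
    # Should recover the imposed uniform shift of 5 pixels.
    print(estimate_disparity(left, right, x=100, max_d=8))
```

The linear growth rule is the simplest choice that ties window size to disparity magnitude; the abstract does not specify the functional form, so any monotonic rule could be substituted.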
Affiliation(s)
- Fredrik Allenmark
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom.
|
41
|
Blake R, Wilson H. Binocular vision. Vision Res 2010; 51:754-70. [PMID: 20951722 DOI: 10.1016/j.visres.2010.10.009] [Citation(s) in RCA: 122] [Impact Index Per Article: 8.7] [Received: 07/01/2010] [Revised: 10/05/2010] [Accepted: 10/06/2010] [Indexed: 10/18/2022]
Abstract
This essay reviews major developments, empirical and theoretical, in the field of binocular vision during the last 25 years. We limit our survey primarily to work on human stereopsis, binocular rivalry and binocular contrast summation, with discussion where relevant of single-unit neurophysiology and human brain imaging. We identify several key controversies that have stimulated important work on these problems. In the case of stereopsis those controversies include position vs. phase encoding of disparity, dependence of disparity limits on spatial scale, role of occlusion in binocular depth and surface perception, and motion in 3D. In the case of binocular rivalry, controversies include eye vs. stimulus rivalry, role of "top-down" influences on rivalry dynamics, and the interaction of binocular rivalry and stereopsis. Concerning binocular contrast summation, the essay focuses on two representative models that highlight the evolving complexity in this field of study.
Affiliation(s)
- Randolph Blake
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea.
|
42
|
Read JCA. Vertical binocular disparity is encoded implicitly within a model neuronal population tuned to horizontal disparity and orientation. PLoS Comput Biol 2010; 6:e1000754. [PMID: 20421992 PMCID: PMC2858673 DOI: 10.1371/journal.pcbi.1000754] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/03/2009] [Accepted: 03/22/2010] [Indexed: 11/23/2022] Open
Abstract
Primary visual cortex is often viewed as a "cyclopean retina", performing the initial encoding of binocular disparities between left and right images. Because the eyes are set apart horizontally in the head, binocular disparities are predominantly horizontal. Yet, especially in the visual periphery, a range of non-zero vertical disparities do occur and can influence perception. It has therefore been assumed that primary visual cortex must contain neurons tuned to a range of vertical disparities. Here, I show that this is not necessarily the case. Many disparity-selective neurons are most sensitive to changes in disparity orthogonal to their preferred orientation. That is, the disparity tuning surfaces, mapping their response to different two-dimensional (2D) disparities, are elongated along the cell's preferred orientation. Because of this, even if a neuron's optimal 2D disparity has zero vertical component, the neuron will still respond best to a non-zero vertical disparity when probed with a sub-optimal horizontal disparity. This property can be used to decode 2D disparity, even allowing for realistic levels of neuronal noise. Even if all V1 neurons at a particular retinotopic location are tuned to the expected vertical disparity there (for example, zero at the fovea), the brain could still decode the magnitude and sign of departures from that expected value. This provides an intriguing counter-example to the common wisdom that, in order for a neuronal population to encode a quantity, its members must be tuned to a range of values of that quantity. It demonstrates that populations of disparity-selective neurons encode much richer information than previously appreciated. It suggests a possible strategy for the brain to extract rarely-occurring stimulus values, while concentrating neuronal resources on the most commonly-occurring situations.
Affiliation(s)
- Jenny C A Read
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom.
|
43
|
Pollard, Mayhew, and Frisby's 1985 Paper. Perception 2009; 38:879-84. [DOI: 10.1068/pmkpol] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
44
|
Vlaskamp BNS, Filippini HR, Banks MS. Image-size differences worsen stereopsis independent of eye position. J Vis 2009; 9:17.1-13. [PMID: 19271927 DOI: 10.1167/9.2.17] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 07/18/2008] [Accepted: 11/20/2008] [Indexed: 11/24/2022] Open
Abstract
With the eyes in forward gaze, stereo performance worsens when one eye's image is larger than the other's. Near, eccentric objects naturally create retinal images of different sizes. Does this mean that stereopsis exhibits deficits for such stimuli? Or does the visual system compensate for the predictable image-size differences? To answer this, we measured discrimination of a disparity-defined shape for different relative image sizes. We did so for different gaze directions, some compatible with the image-size difference and some not. Magnifications of 10-15% caused a clear worsening of stereo performance. The worsening was determined only by relative image size and not by eye position. This shows that no neural compensation for image-size differences accompanies eye-position changes, at least prior to disparity estimation. We also found that a local cross-correlation model for disparity estimation performs like humans in the same task, suggesting that the decrease in stereo performance due to image-size differences is a byproduct of the disparity-estimation method. Finally, we looked for compensation in an observer who has constantly different image sizes due to differing eye lengths. She performed best when the presented images were roughly the same size, indicating that she has compensated for the persistent image-size difference.
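Why a local cross-correlation estimator might suffer from image-size differences can be sketched in one dimension. The code below is a hedged illustration, not the model used in the paper: it magnifies a random-noise row about its centre by linear interpolation and compares the best windowed correlation with and without magnification. The names `magnify`, `correlation`, and `peak_correlation`, the window size, and the noise stimulus are all assumptions made for the example.

```python
import math
import random


def magnify(signal, factor):
    """Stretch a 1D image by `factor` about its centre using linear interpolation."""
    n = len(signal)
    centre = (n - 1) / 2
    out = []
    for i in range(n):
        src = centre + (i - centre) / factor        # source position in the original
        src = min(max(src, 0.0), n - 1.0)           # clamp to the valid range
        lo = int(math.floor(src))
        hi = min(lo + 1, n - 1)
        t = src - lo
        out.append((1 - t) * signal[lo] + t * signal[hi])
    return out


def correlation(a, b):
    """Pearson correlation between two equal-length windows."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0 else 0.0


def peak_correlation(left, right, centre, half_width, max_d):
    """Highest windowed correlation over candidate shifts -max_d..max_d."""
    a = left[centre - half_width: centre + half_width + 1]
    return max(
        correlation(a, right[centre + d - half_width: centre + d + half_width + 1])
        for d in range(-max_d, max_d + 1)
    )


if __name__ == "__main__":
    random.seed(2)
    left = [random.gauss(0.0, 1.0) for _ in range(401)]
    for factor in (1.0, 1.10, 1.15):
        right = magnify(left, factor)
        print(factor, round(peak_correlation(left, right, 200, 30, 5), 3))
```

With no magnification the windows match exactly; once one image is stretched, pixels away from the window centre no longer line up at any single shift, so the correlation peak drops. This is consistent in spirit with the abstract's suggestion that the performance loss is a byproduct of the disparity-estimation method.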
|