1
Dai H, Micheyl C. Separating the contributions of primary and unwanted cues in psychophysical studies. Psychol Rev 2012; 119:770-88. [PMID: 22844984] [DOI: 10.1037/a0029343]
Abstract
A fundamental issue in the design and interpretation of experimental studies of perception is whether the participants in these experiments could perform the perceptual task assigned to them using a feature, or cue, other than the one intended by the experimenter. An approach frequently used by auditory- and visual-perception researchers to guard against this possibility is to apply random variations to the stimuli across presentations or trials, so as to make the "unwanted" cue unreliable for the participants. However, the theoretical basis of this widespread practice is not well developed. In this article, we describe a 2-channel model based on general principles of psychophysical signal detection theory, which can be used to assess the respective contributions of the unwanted cue and of the primary cue to performance or thresholds measured in perceptual discrimination experiments involving stimulus randomization. Example applications of the model to the analysis of results from representative studies in the auditory- and visual-perception literature are provided. In several cases, the model-based analyses indicate that the randomization procedure was less effective than the authors of these studies originally assumed. These findings underscore the importance of quantifying the potential influence of unwanted cues on the results of psychophysical experiments, even when stimulus randomization is used.
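The quadrature combination at the heart of two-channel signal-detection analyses of this kind can be sketched as follows (a generic SDT sketch under independent Gaussian channels, not the authors' exact model; the function names are mine):

```python
import math

def combined_dprime(dprime_primary, dprime_unwanted):
    """Sensitivity of an observer who optimally combines two independent
    Gaussian channels: the d' values add in quadrature."""
    return math.sqrt(dprime_primary**2 + dprime_unwanted**2)

def residual_unwanted_dprime(dprime_observed, dprime_primary):
    """Back out the unwanted-cue sensitivity consistent with an observed d',
    given an independent estimate of the primary-cue sensitivity."""
    return math.sqrt(max(dprime_observed**2 - dprime_primary**2, 0.0))
```

Under these assumptions, an unwanted cue can inflate measured d' even after randomization, which is why the residual contribution needs to be quantified rather than assumed away.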
Affiliation(s)
- Huanping Dai, Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ 85721, USA.
2
Abstract
We report a novel multisensory decision task designed to encourage subjects to combine information across both time and sensory modalities. We presented subjects (humans and rats) with multisensory event streams consisting of a series of brief auditory and/or visual events. Subjects judged whether the event rate of these streams was high or low. We have three main findings. First, subjects can combine multisensory information over time to improve judgments about whether a fluctuating rate is high or low. Importantly, the improvement we observed was frequently close to, or better than, the statistically optimal prediction. Second, subjects showed a clear multisensory enhancement both when the inputs in each modality were redundant and when they provided independent evidence about the rate. This latter finding suggests a model in which event rates are estimated separately for each modality and fused at a later stage. Finally, because a similar multisensory enhancement was observed in both humans and rats, we conclude that the ability to optimally exploit sequentially presented multisensory information is not restricted to a particular species.
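The statistically optimal prediction referred to here is the standard independent-noise fusion rule; a minimal sketch (assuming Gaussian, independent noise on the auditory and visual rate estimates):

```python
import math

def optimal_sigma(sigma_a, sigma_v):
    """Predicted noise of the fused rate estimate when independent
    auditory and visual estimates are combined optimally: the fused
    variance is the harmonic combination of the single-cue variances."""
    return math.sqrt((sigma_a**2 * sigma_v**2) / (sigma_a**2 + sigma_v**2))
```

The fused noise is always at or below the better single cue, which is the benchmark observed performance is compared against.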
3
Seydell A, Knill DC, Trommershäuser J. Adapting internal statistical models for interpreting visual cues to depth. J Vis 2010; 10:1.1-27. [PMID: 20465321] [DOI: 10.1167/10.4.1]
Abstract
The informativeness of sensory cues depends critically on statistical regularities in the environment. However, statistical regularities vary between different object categories and environments. We asked whether and how the brain changes the prior assumptions about scene statistics used to interpret visual depth cues when stimulus statistics change. Subjects judged the slants of stereoscopically presented figures by adjusting a virtual probe perpendicular to the surface. In addition to stereoscopic disparities, the aspect ratio of the stimulus in the image provided a "figural compression" cue to slant, whose reliability depends on the distribution of aspect ratios in the world. As we manipulated this distribution from regular to random and back again, subjects' reliance on the compression cue relative to stereoscopic cues changed accordingly. When we randomly interleaved stimuli from shape categories (ellipses and diamonds) with different statistics, subjects gave less weight to the compression cue for figures from the category with more random aspect ratios. Our results demonstrate that relative cue weights vary rapidly as a function of recently experienced stimulus statistics and that the brain can use different statistical models for different object categories. We show that subjects' behavior is consistent with that of a broad class of Bayesian learning models.
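Reliability-weighted fusion of the kind these Bayesian models predict can be sketched as follows (illustrative only; the names are mine, and "reliability" stands for inverse variance):

```python
def fused_slant(slant_stereo, r_stereo, slant_compression, r_compression):
    """Reliability-weighted average of a stereoscopic and a figural-
    compression slant cue; the compression weight is its normalized
    reliability (inverse variance)."""
    w = r_compression / (r_stereo + r_compression)
    return (1.0 - w) * slant_stereo + w * slant_compression
```

When experienced aspect ratios become more random, the effective reliability of the compression cue falls, and the fused estimate shifts toward the stereoscopic cue, which is the weight change the experiment measures.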
Affiliation(s)
- Anna Seydell, Department of General and Experimental Psychology, University of Giessen, Germany.
4
Abstract
This letter presents an improved cue integration approach to reliably separate coherent moving objects from their background scene in video sequences. The proposed method uses a probabilistic framework to unify bottom-up and top-down cues in a parallel, "democratic" fashion. The algorithm makes use of a modified Bayes rule in which each pixel's posterior probabilities of figure or ground layer assignment are derived from likelihood models of three bottom-up cues and a prior model provided by a top-down cue. Each cue is treated as independent evidence for figure-ground separation. The cues compete with and complement each other dynamically, adjusting their relative weights from frame to frame according to cue quality measured against the overall integration. At the same time, the likelihood or prior models of individual cues adapt toward the integrated result. These mechanisms enable the system to organize under the influence of visual scene structure without manual intervention. A novel contribution here is the incorporation of a top-down cue, which improves the system's robustness and accuracy and helps handle difficult and ambiguous situations, such as abrupt lighting changes or occlusion among multiple objects. Results on various video sequences are demonstrated and discussed. (Video demos are available at http://organic.usc.edu:8376/~tangx/neco/index.html .)
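One way to write down a per-pixel pooling of this general shape is a log-linear combination of weighted cue likelihoods with a top-down prior (a hypothetical sketch of the idea, not the paper's exact modified Bayes rule; names and the exponent weighting are mine):

```python
def figure_posterior(prior_fig, likelihoods_fig, likelihoods_gnd, weights):
    """Posterior probability that a pixel belongs to the figure layer.

    Combines a top-down prior with bottom-up cue likelihoods; each cue's
    weight acts as an exponent, so a weight near zero effectively mutes
    a low-quality cue.
    """
    p_fig = prior_fig
    p_gnd = 1.0 - prior_fig
    for lf, lg, w in zip(likelihoods_fig, likelihoods_gnd, weights):
        p_fig *= lf ** w
        p_gnd *= lg ** w
    return p_fig / (p_fig + p_gnd)
```

With all weights at 1 this reduces to naive Bayes over independent cues; lowering a weight toward 0 removes that cue's influence, which mirrors the frame-to-frame weight adjustment described above.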
Affiliation(s)
- Xiangyu Tang, Computer Science Department, University of Southern California, Los Angeles, CA 90089, U.S.A.
- Christoph von der Malsburg, Frankfurt Institute for Advanced Studies, 60438 Frankfurt am Main, Germany, and Computer Science Department, University of Southern California, Los Angeles, CA 90089, U.S.A.
5
Harris LR, Duke PA, Kopinska A. Flash lag in depth. Vision Res 2006; 46:2735-42. [PMID: 16469350] [DOI: 10.1016/j.visres.2006.01.001]
Abstract
The perceived position of a moving target at a particular point in time, indicated by a flash, is often judged to be different from its actual location. Here, we show that the position of a target moving in depth is also systematically mislocalized. We used three types of targets moving in depth at speeds ranging from 2 to 16 cm/s: (i) a realistically rendered target that included concordant looming, disparity, and perspective cues; (ii) a random dot surface whose depth was defined by disparity, without concordant perspective or looming cues; and (iii) a surface of dynamic random dots whose depth was defined by disparity, with no consistent motion visible monocularly. Subjects viewed the targets moving either towards or away from them and indicated whether the targets appeared nearer or farther than a continuously present reference depth at the moment a flash was presented. A staircase procedure was used to null, and thus measure, any perceptual displacement from the reference depth. A flash lag in depth was found in which the target appeared ahead of its true position, displaced by a constant amount of time that depended on the stimulus type and the direction of motion (towards or away). The time displacement varied from 76 ms (for the realistic target moving away from the observer) to 263 ms (for dynamic random dots moving towards the observer). These effects may depend on the confidence with which subjects were able to judge the location of our various targets, with greater confidence leading to a smaller temporal displacement.
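Because the lag is constant in time, the spatial error it implies scales with target speed; a one-line sketch of that arithmetic (function name is mine):

```python
def flash_lag_displacement_cm(speed_cm_per_s, lag_ms):
    """Spatial mislocalization in depth implied by a constant temporal lag:
    displacement = speed x lag."""
    return speed_cm_per_s * (lag_ms / 1000.0)
```

For example, the largest reported lag of 263 ms at the fastest speed of 16 cm/s implies a depth error of roughly 4.2 cm, while a 76 ms lag at 2 cm/s implies only about 1.5 mm.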
Affiliation(s)
- Laurence R Harris, Department of Psychology, Centre for Vision Research, York University, Toronto, Ontario, Canada M3J 1P3.
6
Pizlo Z, Li Y, Francis G. A new look at binocular stereopsis. Vision Res 2005; 45:2244-55. [PMID: 15924939] [DOI: 10.1016/j.visres.2005.02.011]
Abstract
We report a new phenomenon illustrating that the role of binocular disparity in 3D shape perception depends critically on whether the parts of a display are interpreted as belonging to a single object. The nature of this phenomenon was studied in four experiments. In the first two experiments the subjects were shown a sequence of stereoscopic images of a cube in which binocular disparity indicated that the individual parts moved towards or away from one eye. However, when the parts of the cube were perceived as elements of a single object, they appeared to move rigidly, and the direction of motion was orthogonal to that predicted by the binocular disparities. The third experiment generalized these results to more complex polyhedra. The last experiment showed that constraints related to motion, such as rigidity, are important, but not critical, for this phenomenon to occur. All these results imply that the interpretation of which parts constitute a single object affects the importance (weight) of binocular disparity and may even eliminate its contribution altogether; the percept of a 3D shape is dominated by a priori constraints, and depth cues play a secondary role.
Affiliation(s)
- Zygmunt Pizlo, Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907-2081, USA.
7
van Ee R, Adams WJ, Mamassian P. Bayesian modeling of cue interaction: bistability in stereoscopic slant perception. J Opt Soc Am A 2003; 20:1398-406. [PMID: 12868644] [DOI: 10.1364/josaa.20.001398]
Abstract
Our two eyes receive different views of a visual scene, and the resulting binocular disparities enable us to reconstruct its three-dimensional layout. However, the visual environment is also rich in monocular depth cues. We examined the resulting percept when observers view a scene in which there are large conflicts between the surface slant signaled by binocular disparities and the slant signaled by monocular perspective. For a range of disparity-perspective cue conflicts, many observers experience bistability: they are able to perceive two distinct slants and to flip between the two percepts in a controlled way. We present a Bayesian model that describes the quantitative aspects of perceived slant on the basis of the likelihoods of both perspective and disparity slant information, combined with prior assumptions about the shape and orientation of objects in the scene. Our Bayesian approach can be regarded as an overarching framework that allows researchers to study all aspects of cue integration, including perceptual decisions, in a unified manner.
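A minimal numerical sketch of evaluating such a posterior over slant, combining a disparity likelihood, a perspective likelihood, and a prior (Gaussian components and parameter values chosen purely for illustration; the bistability reported in the paper requires non-Gaussian likelihood structure that this toy omits):

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density, used here for both likelihoods and the prior."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def slant_posterior(slant, disp_slant, disp_sigma, persp_slant, persp_sigma,
                    prior_sigma=60.0):
    """Unnormalized posterior over slant (degrees): disparity likelihood
    x perspective likelihood x a broad zero-slant prior."""
    return (gauss(slant, disp_slant, disp_sigma)
            * gauss(slant, persp_slant, persp_sigma)
            * gauss(slant, 0.0, prior_sigma))
```

Scanning this function over candidate slants shows how the posterior trades off the two cues against the prior; in the cue-conflict regime the paper studies, the full model's posterior can support two competing interpretations.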
Affiliation(s)
- Raymond van Ee, Helmholtz Institute, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands.
8
Landy MS, Kojima H. Ideal cue combination for localizing texture-defined edges. J Opt Soc Am A 2001; 18:2307-20. [PMID: 11551065] [DOI: 10.1364/josaa.18.002307]
Abstract
Many visual tasks can be carried out using several sources of information. The most accurate estimates of scene properties require the observer to use all available information and to combine the information sources in an optimal manner. Two experiments are described that required observers to judge the relative locations of two texture-defined edges (a vernier task). The edges were signaled by a change across the edge in two texture properties [either frequency and orientation (Experiment 1) or contrast and orientation (Experiment 2)]. The reliability of each cue was controlled by varying the distance over which the change (in frequency, orientation, or contrast) occurred, a kind of "texture blur." In some conditions, the position of the edge signaled by one cue was shifted relative to the other ("perturbation analysis"). An ideal-observer model, previously used in studies of depth perception and color constancy, was fitted to the data. Although the fit can be rejected relative to some more elaborate models, especially given the large quantity of data, this model accounts for most trends in the data. A second, suboptimal model that switches between the available cues from trial to trial does a poor job of accounting for the data.
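The ideal-observer account rests on inverse-variance weighting, and perturbation analysis reads a cue's weight off the shift it induces in the combined estimate; a sketch (function names are mine):

```python
def mle_weights(sigmas):
    """Ideal-observer cue weights: inverse variances, normalized to sum to 1."""
    inv_var = [1.0 / s**2 for s in sigmas]
    total = sum(inv_var)
    return [iv / total for iv in inv_var]

def predicted_shift(weight_perturbed, delta):
    """Perturbation analysis: shifting one cue's edge by delta moves the
    combined edge estimate by weight * delta."""
    return weight_perturbed * delta
```

Blurring one cue raises its sigma, lowers its weight, and shrinks the shift it can induce, which is the pattern the fitted model is tested against.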
Affiliation(s)
- M S Landy, Department of Psychology and Center for Neural Science, New York University, New York, NY 10003, USA.
9
Abstract
Sensory integration, or sensor fusion (the integration of information from different modalities, cues, or sensors), is among the most fundamental problems of perception in biological and artificial systems. We propose a new architecture for adaptively integrating different cues in a self-organized manner. In Democratic Integration, the cues agree on a result, and each cue adapts toward the agreed-upon result. In particular, discordant cues are quickly suppressed and recalibrated, while cues that have been consistent with the result in the recent past are given higher weight in the future. The architecture is tested in a face-tracking scenario. Experiments show its robustness to sudden changes in the environment, as long as the changes disrupt only a minority of cues at the same time, although all cues may be disrupted at one time or another.
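The fuse-then-adapt loop can be illustrated with a one-dimensional toy (my own simplified quality measure and relaxation step; the paper's saliency-map formulation differs in detail):

```python
def democratic_step(estimates, weights, tau=10.0):
    """One Democratic Integration step in 1-D: fuse the cue estimates,
    score each cue by its agreement with the fused result, and relax the
    weights toward the normalized quality scores with time constant tau."""
    fused = sum(w * x for w, x in zip(weights, estimates))
    # Quality: closeness of each cue's estimate to the fused result.
    qualities = [1.0 / (1.0 + abs(x - fused)) for x in estimates]
    total = sum(qualities)
    qnorm = [q / total for q in qualities]
    # Discrete-time relaxation of each weight toward its quality score.
    new_weights = [w + (q - w) / tau for w, q in zip(weights, qnorm)]
    return fused, new_weights
```

Iterating this step suppresses a discordant cue while keeping the weights normalized, which is the self-organizing behavior described above.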
Affiliation(s)
- J Triesch, Department of Computer Science, University of Rochester, Rochester, NY 14627, USA.