1. Sauer Y, Künstle DE, Wichmann FA, Wahl S. An objective measurement approach to quantify the perceived distortions of spectacle lenses. Sci Rep 2024; 14:3967. PMID: 38368485; PMCID: PMC10874444; DOI: 10.1038/s41598-024-54368-3.
Abstract
The eye's natural aging influences our ability to focus on close objects. Without optical correction, all adults will suffer from blurry near vision starting in their 40s. As a result, different optical corrections are necessary for near and far vision. Current state-of-the-art glasses, Progressive Addition Lenses (PALs), offer a gradual change of correction across the field of view for any distance. However, an inevitable side effect of PALs is geometric distortion, which causes the swim effect, a phenomenon of unstable perception of the environment that leads to discomfort for many wearers. Unfortunately, little is known about the relationship between lens distortions and their perceptual effects, that is, between the complex physical distortions on the one hand and their subjective severity on the other. We show that perceived distortion can be measured as a psychophysical scaling function using a VR experiment with accurately simulated PAL distortions. Despite the multi-dimensional space of physical distortions, the measured perception is well represented as a one-dimensional scaling function; distortions are perceived as less severe with negative far correction, suggesting an advantage for short-sighted people. Beyond that, our results demonstrate that psychophysical scaling with ordinal embedding methods can investigate complex perceptual phenomena like lens distortions, which affect geometry, stereo, and motion perception. Our approach provides a new perspective on lens design based on modeling visual processing that could be applied beyond distortions. We anticipate that future PAL designs could be improved using our method to minimize subjectively discomforting distortions rather than merely optimizing physical parameters.
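The paper's simulation is based on accurate optical models of real PAL designs. As a rough illustration of the kind of geometric distortion field involved, the toy sketch below warps a grid of visual-field points with a magnification gradient toward the near zone plus a lateral shear; all function names, parameters, and values here are invented for illustration and are not taken from the paper.

```python
import numpy as np

def pal_like_distortion(points, skew=0.1, mag_gradient=0.05):
    """Apply a toy PAL-like distortion to 2D visual-field points.

    Magnification grows toward the lower (near) part of the lens and the
    lateral periphery is sheared -- a crude stand-in for the skew
    distortion behind the 'swim effect'. Coordinates are in normalized
    lens units with (0, 0) at the lens center.
    """
    x, y = points[:, 0], points[:, 1]
    # magnification increasing toward the near zone (negative y)
    m = 1.0 + mag_gradient * np.clip(-y, 0.0, None)
    # horizontal shear growing toward the lateral periphery
    x_d = m * x + skew * x * np.abs(y)
    y_d = m * y
    return np.column_stack([x_d, y_d])

# distort a regular grid, as one would to visualize lens distortions
u = np.linspace(-1, 1, 5)
grid = np.array([(a, b) for b in u for a in u])
distorted = pal_like_distortion(grid)
```

Visualizing `grid` against `distorted` shows straight lines bending in the lower periphery, the hallmark of PAL distortion.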
Affiliation(s)
- Yannick Sauer: University of Tübingen, Tübingen, Germany; Carl Zeiss Vision International GmbH, Aalen, Germany
- David-Elias Künstle: University of Tübingen, Tübingen, Germany; Tübingen AI Center, Tübingen, Germany
- Siegfried Wahl: University of Tübingen, Tübingen, Germany; Carl Zeiss Vision International GmbH, Aalen, Germany
2. Wichmann FA, Kornblith S, Geirhos R. Neither hype nor gloom do DNNs justice. Behav Brain Sci 2023; 46:e412. PMID: 38054281; DOI: 10.1017/s0140525x23001711.
Abstract
Neither the hype exemplified in some exaggerated claims about deep neural networks (DNNs), nor the gloom expressed by Bowers et al. do DNNs as models in vision science justice: DNNs rapidly evolve, and today's limitations are often tomorrow's successes. In addition, providing explanations as well as prediction and image-computability are model desiderata; one should not be favoured at the expense of the other.
Affiliation(s)
- Felix A Wichmann: Neural Information Processing Group, University of Tübingen, Tübingen, Germany
3.
Abstract
Deep neural networks (DNNs) are machine learning algorithms that have revolutionized computer vision due to their remarkable successes in tasks like object classification and segmentation. The success of DNNs as computer vision algorithms has led to the suggestion that DNNs may also be good models of human visual perception. In this article, we review evidence regarding current DNNs as adequate behavioral models of human core object recognition. To this end, we argue that it is important to distinguish between statistical tools and computational models and to understand model quality as a multidimensional concept in which clarity about modeling goals is key. Reviewing a large number of psychophysical and computational explorations of core object recognition performance in humans and DNNs, we argue that DNNs are highly valuable scientific tools but that, as of today, DNNs should only be regarded as promising, but not yet adequate, computational models of human core object recognition behavior. On the way, we dispel several myths surrounding DNNs in vision science.
Affiliation(s)
- Felix A Wichmann: Neural Information Processing Group, University of Tübingen, Tübingen, Germany
4. Huber LS, Geirhos R, Wichmann FA. The developmental trajectory of object recognition robustness: Children are like small adults but unlike big deep neural networks. J Vis 2023; 23:4. PMID: 37410494; DOI: 10.1167/jov.23.7.4.
Abstract
In laboratory object recognition tasks based on undistorted photographs, both adult humans and deep neural networks (DNNs) perform close to ceiling. Unlike adults, whose object recognition performance is robust against a wide range of image distortions, DNNs trained on standard ImageNet (1.3M images) perform poorly on distorted images. However, the last 2 years have seen impressive gains in DNN distortion robustness, predominantly achieved through ever-increasing large-scale datasets, orders of magnitude larger than ImageNet. Although this simple brute-force approach is very effective in achieving human-level robustness in DNNs, it raises the question of whether human robustness, too, is simply due to extensive experience with (distorted) visual input during childhood and beyond. Here we investigate this question by comparing the core object recognition performance of 146 children (aged 4-15 years) against adults and against DNNs. We find, first, that already 4- to 6-year-olds show remarkable robustness to image distortions and outperform DNNs trained on ImageNet. Second, we estimated the number of images children had been exposed to during their lifetime. Compared with various DNNs, children's high robustness requires relatively little data. Third, when recognizing objects, children, like adults but unlike DNNs, rely heavily on shape but not on texture cues. Together our results suggest that the remarkable robustness to distortions emerges early in the developmental trajectory of human object recognition and is unlikely the result of a mere accumulation of experience with distorted visual input. Even though current DNNs match human performance regarding robustness, they seem to rely on different and more data-hungry strategies to do so.
Affiliation(s)
- Lukas S Huber: Department of Psychology, University of Bern, Bern, Switzerland; Neural Information Processing Group, University of Tübingen, Tübingen, Germany (https://orcid.org/0000-0002-7755-6926)
- Robert Geirhos: Neural Information Processing Group, University of Tübingen, Tübingen, Germany (https://orcid.org/0000-0001-7698-3187)
- Felix A Wichmann: Neural Information Processing Group, University of Tübingen, Tübingen, Germany (https://orcid.org/0000-0002-2592-634X)
5. Künstle DE, von Luxburg U, Wichmann FA. Estimating the perceived dimensionality of psychophysical stimuli using a triplet accuracy and hypothesis testing procedure. J Vis 2022. DOI: 10.1167/jov.22.14.3331.
Affiliation(s)
- David-Elias Künstle: University of Tübingen; International Max Planck Research School for Intelligent Systems, Tübingen
- Ulrike von Luxburg: University of Tübingen; Max Planck Institute for Intelligent Systems, Tübingen
6. Schönmann I, Künstle DE, Wichmann FA. Using an Odd-One-Out Design Affects Consistency, Agreement and Decision Criteria in Similarity Judgement Tasks Involving Natural Images. J Vis 2022. DOI: 10.1167/jov.22.14.3232.
7. Geirhos R, Narayanappa K, Mitzkus B, Thieringer T, Bethge M, Wichmann FA, Brendel W. The bittersweet lesson: data-rich models narrow the behavioural gap to human vision. J Vis 2022. DOI: 10.1167/jov.22.14.3273.
Affiliation(s)
- Robert Geirhos: University of Tübingen; International Max Planck Research School for Intelligent Systems
8.
Abstract
Vision researchers are interested in mapping complex physical stimuli to perceptual dimensions. Such a mapping can be constructed using multidimensional psychophysical scaling or ordinal embedding methods. Both methods infer coordinates that agree as much as possible with the observer's judgments so that perceived similarity corresponds with distance in the inferred space. However, a fundamental problem of all methods that construct scalings in multiple dimensions is that the inferred representation can only reflect perception if the scale has the correct dimension. Here we propose a statistical procedure to overcome this limitation. The critical elements of our procedure are i) measuring the scale's quality by the number of correctly predicted triplets and ii) performing a statistical test to assess if adding another dimension to the scale improves triplet accuracy significantly. We validate our procedure through extensive simulations. In addition, we study the properties and limitations of our procedure using "real" data from various behavioral datasets from psychophysical experiments. We conclude that our procedure can reliably identify (a lower bound on) the number of perceptual dimensions for a given dataset.
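The core of the procedure, deciding whether an extra dimension yields significantly more correctly predicted held-out triplets, can be illustrated with a simplified one-sided binomial test. This is a sketch, not the paper's exact test statistic or cross-validation scheme, and the counts below are made up.

```python
import math

def binomial_p_value(k, n, p0):
    """One-sided exact binomial test: P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(math.comb(n, i) * p0**i * (1 - p0)**(n - i)
               for i in range(k, n + 1))

def dimension_improves(correct_d, correct_d_plus_1, n_test, alpha=0.05):
    """Test whether a (d+1)-dimensional scale predicts held-out triplets
    significantly better than the d-dimensional one, taking the
    d-dimensional triplet accuracy as the null success probability."""
    p0 = correct_d / n_test
    return binomial_p_value(correct_d_plus_1, n_test, p0) < alpha

# made-up example: accuracy rises from 78.0% to 90.5% on 1,000 held-out
# triplets when a second dimension is added
print(dimension_improves(780, 905, 1000))   # True: significant gain
print(dimension_improves(780, 790, 1000))   # False: within chance fluctuation
```

One would iterate such a test for d = 1, 2, ... and stop at the first dimension whose gain is no longer significant, yielding a lower bound on the perceived dimensionality.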
9.
Abstract
Color constancy is our ability to perceive constant colors across varying illuminations. Here, we trained deep neural networks to be color constant and evaluated their performance with varying cues. Inputs to the networks consisted of two-dimensional images of simulated cone excitations derived from three-dimensional (3D) rendered scenes of 2,115 different 3D shapes, with spectral reflectances of 1,600 different Munsell chips, illuminated under 278 different natural illuminations. The models were trained to classify the reflectance of the objects. Testing was done with four new illuminations with equally spaced CIEL*a*b* chromaticities, two along the daylight locus and two orthogonal to it. High levels of color constancy were achieved with different deep neural networks, and constancy was higher along the daylight locus. When gradually removing cues from the scene, constancy decreased. Both ResNets and classical ConvNets of varying degrees of complexity performed well. However, DeepCC, our simplest sequential convolutional network, represented colors along the three color dimensions of human color vision, while ResNets showed a more complex representation.
Affiliation(s)
- Alban Flachot: Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
- Arash Akbarinia: Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
- Heiko H Schütt: Center for Neural Science, New York University, New York, NY, USA
- Roland W Fleming: Experimental Psychology, Justus Liebig University, Giessen, Germany
- Felix A Wichmann: Neural Information Processing Group, University of Tübingen, Germany
- Karl R Gegenfurtner: Abteilung Allgemeine Psychologie, Justus Liebig University, Giessen, Germany
10.
Affiliation(s)
- Robert Geirhos: University of Tübingen; International Max Planck Research School for Intelligent Systems
11.
12. Geirhos R, Jacobsen JH, Michaelis C, Zemel R, Brendel W, Bethge M, Wichmann FA. Unintended cue learning: Lessons for deep learning from experimental psychology. J Vis 2020. DOI: 10.1167/jov.20.11.652.
Affiliation(s)
- Robert Geirhos: University of Tuebingen; International Max Planck Research School for Intelligent Systems
- Claudio Michaelis: University of Tuebingen; International Max Planck Research School for Intelligent Systems
13.
Abstract
In this article, we address the problem of measuring and analyzing sensation, the subjective magnitude of one's experience. We do this in the context of the method of triads: the sensation of the stimulus is evaluated via relative judgments of the following form: “Is stimulus S_i more similar to stimulus S_j or to stimulus S_k?” We propose to use ordinal embedding methods from machine learning to estimate the scaling function from the relative judgments. We review two relevant and well-known methods in psychophysics that are partially applicable in our setting: nonmetric multidimensional scaling (NMDS) and the method of maximum likelihood difference scaling (MLDS). Considering various scaling functions, we perform an extensive set of simulations to demonstrate the performance of the ordinal embedding methods. We show that, in contrast to existing approaches, our ordinal embedding approach makes it possible, first, to obtain reasonable scaling functions from comparatively few relative judgments and, second, to estimate multidimensional perceptual scales. In addition to the simulations, we analyze data from two real psychophysics experiments using ordinal embedding methods. Our results show that for one-dimensional perceptual scales our ordinal embedding approach works as well as MLDS, while in higher dimensions only the ordinal embedding methods can produce a desirable scaling function. To make our methods widely accessible, we provide an R implementation and general rules of thumb on how to use ordinal embedding in the context of psychophysics.
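As a toy illustration of the embedding idea (not the estimators evaluated in the article, which uses dedicated ordinal embedding solvers; the stimulus continuum and triplet counts below are invented), the sketch converts noiseless triplet judgments into rank-based dissimilarities and embeds them with classical MDS:

```python
import numpy as np

rng = np.random.default_rng(1)

def rank_dissimilarity(triplets, n):
    """Dissimilarity estimate from triplets (i, j, k), each meaning
    'j is more similar to i than k is': d[i, j] is the fraction of
    comparisons against i in which j was the farther item."""
    wins = np.zeros((n, n))
    counts = np.zeros((n, n))
    for i, j, k in triplets:
        wins[i, k] += 1            # k was judged farther from i
        counts[i, j] += 1
        counts[i, k] += 1
    d = wins / np.maximum(counts, 1)
    return (d + d.T) / 2

def classical_mds(d, dim=1):
    """Embed a dissimilarity matrix via double-centering and
    eigendecomposition (Torgerson's classical MDS)."""
    n = len(d)
    c = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * c @ (d ** 2) @ c
    vals, vecs = np.linalg.eigh(b)
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))

# hypothetical compressive perceptual scale over 10 stimulus levels
truth = np.sqrt(np.linspace(0.0, 1.0, 10))
triplets = []
for _ in range(2000):
    i, j, k = rng.choice(10, size=3, replace=False)
    if abs(truth[i] - truth[j]) > abs(truth[i] - truth[k]):
        j, k = k, j                # reorder so j is the closer item
    triplets.append((i, j, k))
scale = classical_mds(rank_dissimilarity(triplets, 10), dim=1)[:, 0]
```

The recovered one-dimensional coordinates order the stimuli along the underlying continuum (up to sign and scale, the usual indeterminacies of embedding methods).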
Affiliation(s)
- Siavash Haghiri: Department of Computer Science, University of Tübingen, Germany
- Ulrike von Luxburg: Department of Computer Science, University of Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany
14. Meding K, Bruijns SA, Schölkopf B, Berens P, Wichmann FA. Phenomenal Causality and Sensory Realism. Iperception 2020; 11:2041669520927038. PMID: 32537119; PMCID: PMC7268924; DOI: 10.1177/2041669520927038.
Abstract
One of the most important tasks for humans is the attribution of causes and effects in all walks of life. The first systematic study of the visual perception of causality, often referred to as phenomenal causality, was done by Albert Michotte using his now well-known launching events paradigm. Launching events are the seeming collision and seeming transfer of movement between two objects (abstract, featureless stimuli in Michotte's original experiments). Here, we study the relation between causal ratings for launching events in Michotte's setting and launching collisions in a photorealistically computer-rendered setting. We presented launching events with differing temporal gaps, the same launching processes with photorealistic billiard balls, as well as photorealistic billiard balls with realistic motion dynamics, that is, an initial rebound of the first ball after collision and a short sliding phase of the second ball due to momentum and friction. We found that providing the normal launching stimulus with realistic visuals led to lower causal ratings, but realistic visuals together with realistic motion dynamics evoked higher ratings. Two-dimensional versus three-dimensional presentation, on the other hand, did not affect phenomenal causality. We discuss our results in terms of intuitive physics as well as cue conflict.
Affiliation(s)
- Kristof Meding: Neural Information Processing Group, Eberhard Karls Universität Tübingen; Empirical Inference Department, Max Planck Institute for Intelligent Systems, Tübingen, Germany
- Bernhard Schölkopf: Empirical Inference Department, Max Planck Institute for Intelligent Systems, Tübingen, Germany
- Philipp Berens: Institute for Ophthalmic Research, Eberhard Karls Universität Tübingen
- Felix A Wichmann: Neural Information Processing Group, Eberhard Karls Universität Tübingen
15. Geirhos R, Rubisch P, Rauber J, Temme CRM, Michaelis C, Brendel W, Bethge M, Wichmann FA. Inducing a human-like shape bias leads to emergent human-level distortion robustness in CNNs. J Vis 2019. DOI: 10.1167/19.10.209c.
Affiliation(s)
- Robert Geirhos: University of Tübingen; International Max Planck Research School for Intelligent Systems
- Jonas Rauber: University of Tübingen; International Max Planck Research School for Intelligent Systems
- Claudio Michaelis: University of Tübingen; International Max Planck Research School for Intelligent Systems
- Matthias Bethge: University of Tübingen; Bernstein Center for Computational Neuroscience Tübingen; Max Planck Institute for Biological Cybernetics
- Felix A Wichmann: University of Tübingen; Bernstein Center for Computational Neuroscience Tübingen
16. Schütt HH, Wichmann FA. A divisive model of midget and parasol ganglion cells explains the contrast sensitivity function. J Vis 2019. DOI: 10.1167/19.10.79a.
Affiliation(s)
- Heiko H Schütt: Neural Information Processing Group, University of Tübingen; Center for Neuroscience, New York University; Zuckerman Institute, Columbia University
17. Lang B, Aguilar G, Maertens M, Wichmann FA. The influence of observer lapses on maximum-likelihood difference scaling. J Vis 2019. DOI: 10.1167/19.10.87b.
18. Schütt HH, Rothkegel LOM, Trukenbrod HA, Engbert R, Wichmann FA. Disentangling bottom-up versus top-down and low-level versus high-level influences on eye movements over time. J Vis 2019; 19:1. PMID: 30821809; DOI: 10.1167/19.3.1.
Abstract
Bottom-up and top-down as well as low-level and high-level factors influence where we fixate when viewing natural scenes. However, the importance of each of these factors and how they interact remains a matter of debate. Here, we disentangle these factors by analyzing their influence over time. For this purpose, we develop a saliency model that is based on the internal representation of a recent early spatial vision model to measure the low-level, bottom-up factor. To measure the influence of high-level, bottom-up features, we use a recent deep neural network-based saliency model. To account for top-down influences, we evaluate the models on two large data sets with different tasks: first, a memorization task and, second, a search task. Our results lend support to a separation of visual scene exploration into three phases: the first saccade, an initial guided exploration characterized by a gradual broadening of the fixation density, and a steady state that is reached after roughly 10 fixations. Saccade-target selection during the initial exploration and in the steady state is related to similar areas of interest, which are better predicted when including high-level features. In the search data set, fixation locations are determined predominantly by top-down processes. In contrast, the first fixation follows a different fixation density and contains a strong central fixation bias. Nonetheless, first fixations are guided strongly by image properties, and as early as 200 ms after image onset, fixations are better predicted by high-level information. We conclude that any low-level, bottom-up factors are mainly limited to the generation of the first saccade. All saccades are better explained when high-level features are considered, and later, this high-level, bottom-up control can be overruled by top-down influences.
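Model comparisons of this kind typically score a predicted fixation density by the likelihood it assigns to observed fixation locations. The sketch below is a minimal, generic stand-in for the information-theoretic benchmarks used in the saliency literature, not code from the paper:

```python
import numpy as np

def information_gain(density, fixations):
    """Average log-likelihood gain (bits per fixation) of observed
    fixation locations under a predicted density map, relative to a
    uniform baseline.

    density: 2D non-negative array (unnormalized fixation density).
    fixations: (N, 2) integer array of (row, col) pixel indices.
    """
    p = density / density.sum()
    ll_model = np.log2(p[fixations[:, 0], fixations[:, 1]])
    ll_uniform = -np.log2(p.size)
    return float(np.mean(ll_model) - ll_uniform)

# a density peaked where the fixations actually land scores above zero;
# a uniform map scores exactly zero
peaked = np.ones((20, 20))
peaked[5, 5] = 50.0
fix = np.array([[5, 5], [5, 5], [5, 6]])
gain_peaked = information_gain(peaked, fix)
gain_uniform = information_gain(np.ones((20, 20)), fix)
```

Evaluating such a score separately for each fixation index (first, second, ...) is one way to trace how predictive power changes over the course of scene exploration.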
Affiliation(s)
- Heiko H Schütt: Neural Information Processing Group, Universität Tübingen, Tübingen, Germany; Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
- Lars O M Rothkegel: Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
- Hans A Trukenbrod: Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
- Ralf Engbert: Experimental and Biological Psychology and Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Felix A Wichmann: Neural Information Processing Group, Universität Tübingen, Tübingen, Germany
19. Trukenbrod HA, Barthelmé S, Wichmann FA, Engbert R. Spatial statistics for gaze patterns in scene viewing: Effects of repeated viewing. J Vis 2019; 19:5. PMID: 31173630; DOI: 10.1167/19.6.5.
Abstract
Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function, a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within 4° is stronger than expected from chance. Furthermore, the pair correlation function reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.
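For illustration, here is a naive pair correlation estimate for a 2D point pattern. It ignores the edge corrections and scanpath conditioning used in the paper, and the window size, cluster layout, and bin widths are arbitrary:

```python
import numpy as np

def pair_correlation(points, radii, dr, area):
    """Naive estimator of the pair correlation function g(r) for a 2D
    point pattern in a window of the given area (edge effects ignored).
    Under complete spatial randomness g(r) is about 1; g(r) > 1 at some
    distance r indicates aggregation of point pairs at that distance."""
    n = len(points)
    lam = n / area                              # intensity (points per unit area)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    dists = dists[~np.eye(n, dtype=bool)]       # drop self-pairs
    g = np.empty(len(radii))
    for m, r in enumerate(radii):
        shell = np.sum((dists >= r) & (dists < r + dr))
        # expected number of ordered pairs in the annulus under randomness
        expected = n * lam * 2 * np.pi * (r + dr / 2) * dr
        g[m] = shell / expected
    return g

rng = np.random.default_rng(0)
# synthetic 'fixations' aggregated around a few scene locations
centers = rng.uniform(0, 10, size=(5, 2))
fix = centers[rng.integers(0, 5, 400)] + rng.normal(scale=0.3, size=(400, 2))
radii = np.arange(0.1, 3.0, 0.2)
g = pair_correlation(fix, radii, dr=0.2, area=100.0)
```

For the clustered pattern above, g(r) is far above 1 at short distances and decays toward 1 at larger ones, the qualitative signature of aggregated fixation locations.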
Affiliation(s)
- Simon Barthelmé: Centre National de la Recherche Scientifique, Gipsa-lab, Grenoble Institut National Polytechnique, France
- Felix A Wichmann: Eberhard Karls University of Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience Tübingen, Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany
20
|
Wallis TS, Funke CM, Ecker AS, Gatys LA, Wichmann FA, Bethge M. Image content is more important than Bouma's Law for scene metamers. eLife 2019; 8:42512. [PMID: 31038458 PMCID: PMC6491040 DOI: 10.7554/elife.42512] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2018] [Accepted: 03/09/2019] [Indexed: 11/16/2022] Open
Abstract
We subjectively perceive our visual field with high fidelity, yet peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). Prior work showed that humans could not discriminate images synthesised to match the responses of a mid-level ventral visual stream model when information was averaged in receptive fields with a scaling of about half their retinal eccentricity. This result implicated ventral visual area V2, approximated ‘Bouma's Law’ of crowding, and has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our perceptual experience. However, this experiment never assessed natural images. We find that humans can easily discriminate real and model-generated images at V2 scaling, requiring scales at least as small as V1 receptive fields to generate metamers. We speculate that explaining why scenes look as they do may require incorporating segmentation and global organisational constraints in addition to local pooling.

eLife digest: As you read this digest, your eyes move to follow the lines of text. But now try to hold your eyes in one position, while reading the text on either side and below: it soon becomes clear that peripheral vision is not as good as we tend to assume. It is not possible to read text far away from the center of your line of vision, but you can see ‘something’ out of the corner of your eye. You can see that there is text there, even if you cannot read it, and you can see where your screen or page ends. So how does the brain generate peripheral vision, and why does it differ from what you see when you look straight ahead? One idea is that the visual system averages information over areas of the peripheral visual field. This gives rise to texture-like patterns, as opposed to images made up of fine details. Imagine looking at an expanse of foliage, gravel or fur, for example. Your eyes cannot make out the individual leaves, pebbles or hairs. Instead, you perceive an overall pattern in the form of a texture. Our peripheral vision may also consist of such textures, created when the brain averages information over areas of space. Wallis, Funke et al. have now tested this idea using an existing computer model that averages visual input in this way. By giving the model a series of photographs to process, Wallis, Funke et al. obtained images that should in theory simulate peripheral vision. If the model mimics the mechanisms that generate peripheral vision, then healthy volunteers should be unable to distinguish the processed images from the original photographs. But in fact, the participants could easily discriminate the two sets of images. This suggests that the visual system does not solely use textures to represent information in the peripheral visual field. Wallis, Funke et al. propose that other factors, such as how the visual system separates and groups objects, may instead determine what we see in our peripheral vision. This knowledge could ultimately benefit patients with eye diseases such as macular degeneration, a condition that causes loss of vision in the center of the visual field and forces patients to rely on their peripheral vision.
Affiliation(s)
- Thomas SA Wallis: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Christina M Funke: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Alexander S Ecker: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany; Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Leon A Gatys: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Felix A Wichmann: Neural Information Processing Group, Faculty of Science, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Matthias Bethge: Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Germany
21. Rothkegel LOM, Schütt HH, Trukenbrod HA, Wichmann FA, Engbert R. Searchers adjust their eye-movement dynamics to target characteristics in natural scenes. Sci Rep 2019; 9:1635. PMID: 30733470; PMCID: PMC6367441; DOI: 10.1038/s41598-018-37548-w.
Abstract
When searching a target in a natural scene, it has been shown that both the target's visual properties and similarity to the background influence whether and how fast humans are able to find it. So far, it was unclear whether searchers adjust the dynamics of their eye movements (e.g., fixation durations, saccade amplitudes) to the target they search for. In our experiment, participants searched natural scenes for six artificial targets with different spatial frequency content throughout eight consecutive sessions. High-spatial frequency targets led to smaller saccade amplitudes and shorter fixation durations than low-spatial frequency targets if target identity was known. If a saccade was programmed in the same direction as the previous saccade, fixation durations and successive saccade amplitudes were not influenced by target type. Visual saliency and empirical fixation density at the endpoints of saccades which maintain direction were comparatively low, indicating that these saccades were less selective. Our results suggest that searchers adjust their eye movement dynamics to the search target efficiently, since previous research has shown that low-spatial frequencies are visible farther into the periphery than high-spatial frequencies. We interpret the saccade direction specificity of our effects as an underlying separation into a default scanning mechanism and a selective, target-dependent mechanism.
Affiliation(s)
- Lars O M Rothkegel
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
- Heiko H Schütt
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
- Neural Information Processing Group, University of Tübingen, Sand 6, 72076, Tübingen, Germany
- Hans A Trukenbrod
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
- Felix A Wichmann
- Neural Information Processing Group, University of Tübingen, Sand 6, 72076, Tübingen, Germany
- Max Planck Institute for Intelligent Systems, Max-Planck-Ring 4, 72076, Tübingen, Germany
- Ralf Engbert
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany

22
Abstract
When watching the image of a natural scene on a computer screen, observers initially move their eyes toward the center of the image, a reliable experimental finding termed the central fixation bias. This systematic tendency in eye guidance likely masks attentional selection driven by image properties and top-down cognitive processes. Here, we show that the central fixation bias can be reduced by delaying the initial saccade relative to image onset. In four scene-viewing experiments we manipulated observers' initial gaze position and delayed their first saccade by a specific time interval relative to the onset of an image. We analyzed the distance to image center over time and show that the central fixation bias of initial fixations was significantly reduced after delayed saccade onsets. We additionally show that selection of the initial saccade target strongly depended on the first saccade latency. A previously published model of saccade generation was extended with a central activation map on the initial fixation whose influence declined with increasing saccade latency. This extension was sufficient to replicate the central fixation bias from our experiments. Our results suggest that the central fixation bias is generated by default activation as a response to the sudden image onset and that this default activation pattern decreases over time. Thus, it may often be preferable to use a modified version of the scene viewing paradigm that decouples image onset from the start signal for scene exploration to explicitly reduce the central fixation bias.
Affiliation(s)
- Heiko H Schütt
- University of Potsdam, Potsdam, Germany; Eberhard Karls University, Tübingen, Germany
- Felix A Wichmann
- Eberhard Karls University, Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Tübingen, Germany

23
Abstract
A large part of classical visual psychophysics was concerned with the fundamental question of how pattern information is initially encoded in the human visual system. From these studies a relatively standard model of early spatial vision emerged, based on spatial frequency and orientation-specific channels followed by an accelerating nonlinearity and divisive normalization: contrast gain-control. Here we implement such a model in an image-computable way, allowing it to take arbitrary luminance images as input. Testing our implementation on classical psychophysical data, we find that it explains contrast detection data including the ModelFest data, contrast discrimination data, and oblique masking data, using a single set of parameters. Leveraging the advantage of an image-computable model, we test our model against a recent dataset using natural images as masks. We find that the model explains these data reasonably well, too. To explain data obtained at different presentation durations, our model requires different parameters to achieve an acceptable fit. In addition, we show that contrast gain-control with the fitted parameters results in a very sparse encoding of luminance information, in line with notions from efficient coding. Translating the standard early spatial vision model to be image-computable resulted in two further insights: First, the nonlinear processing requires a denser sampling of spatial frequency and orientation than optimal coding suggests. Second, the normalization needs to be fairly local in space to fit the data obtained with natural image masks. Finally, our image-computable model can serve as tool in future quantitative analyses: It allows optimized stimuli to be used to test the model and variants of it, with potential applications as an image-quality metric. In addition, it may serve as a building block for models of higher level processing.
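The core of the standard early-vision model described above, an accelerating nonlinearity followed by divisive normalization (contrast gain-control), can be sketched in a few lines. The exponents and the semi-saturation constant below are illustrative assumptions, not the parameters fitted in the paper.

```python
import numpy as np

def gain_control(channels, p=2.4, q=2.0, sigma=0.1):
    """Accelerating nonlinearity with divisive normalization across
    channels (contrast gain-control). Parameter values are illustrative."""
    channels = np.abs(np.asarray(channels, dtype=float))
    # excitatory drive: accelerating power function of each channel
    excitation = channels ** p
    # inhibitory pool: summed activity of all channels plus a constant
    pool = sigma ** q + np.sum(channels ** q, axis=0, keepdims=True)
    return excitation / pool

# A strong masking channel suppresses the response to a fixed target:
target_alone = gain_control([[0.5], [0.0]])[0, 0]
target_masked = gain_control([[0.5], [0.9]])[0, 0]
```

Because the pool grows with total channel activity, the same target contrast evokes a smaller response in the presence of a mask; this suppressive interaction is what lets a single mechanism account for both detection and masking data.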
Affiliation(s)
- Heiko H Schütt
- Neural Information Processing Group, University of Tübingen, Tübingen, Germany; Department of Experimental and Biological Psychology, University of Potsdam, Germany
- Felix A Wichmann
- Neural Information Processing Group, University of Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany

24
Wallis TSA, Funke CM, Ecker AS, Gatys LA, Wichmann FA, Bethge M. A parametric texture model based on deep convolutional features closely matches texture appearance for humans. J Vis 2017; 17:5. [PMID: 28983571] [DOI: 10.1167/17.12.5]
Abstract
Our visual environment is full of texture ("stuff" like cloth, bark, or gravel, as distinct from "things" like dresses, trees, or paths), and humans are adept at perceiving subtle variations in material properties. To investigate image features important for texture perception, we psychophysically compare a recent parametric model of texture appearance (convolutional neural network [CNN] model) that uses the features encoded by a deep CNN (VGG-19) with two other models: the venerable Portilla and Simoncelli model and an extension of the CNN model in which the power spectrum is additionally matched. Observers discriminated model-generated textures from original natural textures in a spatial three-alternative oddity paradigm under two viewing conditions: when test patches were briefly presented to the near-periphery ("parafoveal") and when observers were able to make eye movements to all three patches ("inspection"). Under parafoveal viewing, observers were unable to discriminate 10 of 12 original images from CNN model images, and remarkably, the simpler Portilla and Simoncelli model performed slightly better than the CNN model (11 textures). Under foveal inspection, matching CNN features captured appearance substantially better than the Portilla and Simoncelli model (nine compared to four textures), and including the power spectrum improved appearance matching for two of the three remaining textures. None of the models we test here could produce indiscriminable images for one of the 12 textures under the inspection condition. While deep CNN (VGG-19) features can often be used to synthesize textures that humans cannot discriminate from natural textures, there is currently no uniformly best model for all textures and viewing conditions.
Affiliation(s)
- Thomas S A Wallis
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, and the Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Christina M Funke
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, and the Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Alexander S Ecker
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, and Bernstein Center for Computational Neuroscience, Tübingen, Germany, and Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Leon A Gatys
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, and the Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Felix A Wichmann
- Neural Information Processing Group, Faculty of Science, Eberhard Karls Universität Tübingen, Bernstein Center for Computational Neuroscience, and the Max Planck Institute for Intelligent Systems, Empirical Inference Department, Tübingen, Germany
- Matthias Bethge
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Bernstein Center for Computational Neuroscience, Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, and the Max Planck Institute for Biological Cybernetics, Tübingen, Germany

25
Aguilar G, Wichmann FA, Maertens M. Comparing sensitivity estimates from MLDS and forced-choice methods in a slant-from-texture experiment. J Vis 2017; 17:37. [PMID: 28135347] [DOI: 10.1167/17.1.37]
Abstract
Maximum likelihood difference scaling (MLDS) is a method for the estimation of perceptual scales based on the judgment of differences in stimulus appearance (Maloney & Yang, 2003). MLDS has recently also been used to estimate near-threshold discrimination performance (Devinck & Knoblauch, 2012). Using MLDS as a psychophysical method for sensitivity estimation is potentially appealing, because MLDS has been reported to need less data than forced-choice procedures, and particularly naive observers report to prefer suprathreshold comparisons to JND-style threshold tasks. Here we compare two methods, MLDS and two-interval forced-choice (2-IFC), regarding their capability to estimate sensitivity assuming an underlying signal-detection model. We first examined the theoretical equivalence between both methods using simulations. We found that they disagreed in their estimation only when sensitivity was low, or when one of the assumptions on which MLDS is based was violated. Furthermore, we found that the confidence intervals derived from MLDS had a low coverage; i.e., they were too narrow, underestimating the true variability. Subsequently we compared MLDS and 2-IFC empirically using a slant-from-texture task. The amount of agreement between sensitivity estimates from the two methods varied substantially across observers. We discuss possible reasons for the observed disagreements, most notably violations of the MLDS model assumptions. We conclude that in the present example MLDS and 2-IFC could equally be used to estimate sensitivity to differences in slant, with MLDS having the benefit of being more efficient and more pleasant, but having the disadvantage of unsatisfying coverage.
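The signal-detection model underlying MLDS can be simulated directly. The square-root perceptual scale and the noise level below are hypothetical choices for illustration, not estimates from the slant-from-texture data.

```python
import numpy as np

rng = np.random.default_rng(0)

def psi(x):
    """Hypothetical perceptual scale (compressive power function)."""
    return np.sqrt(x)

def triad_judgement(a, b, c, noise_sd=0.05):
    """One MLDS-style triad trial: does the (b, c) interval appear
    larger than the (a, b) interval? The decision variable is the
    scale difference plus Gaussian judgement noise."""
    d = (psi(c) - psi(b)) - (psi(b) - psi(a)) + rng.normal(0.0, noise_sd)
    return d > 0

# When the scale difference clearly favours (b, c), the simulated
# observer almost always judges that interval as larger:
responses = [triad_judgement(0.1, 0.2, 0.5) for _ in range(2000)]
p_larger = float(np.mean(responses))
```

Fitting MLDS amounts to inverting this generative model: given many triad (or quadruple) responses, the scale values and the judgement-noise standard deviation are recovered by maximum likelihood, which is why the estimated scale can also be read as a sensitivity measure.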
Affiliation(s)
- Guillermo Aguilar
- Modelling of Cognitive Processes Group, Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Felix A Wichmann
- AG Neuronale Informationsverarbeitung, Mathematisch-Naturwissenschaftliche Fakultät, Eberhard Karls Universität, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Tübingen, Germany; Max-Planck-Institut für Intelligente Systeme, Tübingen, Germany
- Marianne Maertens
- Modelling of Cognitive Processes Group, Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany

26
Schütt HH, Rothkegel LOM, Trukenbrod HA, Reich S, Wichmann FA, Engbert R. Likelihood-based parameter estimation and comparison of dynamical cognitive models. Psychol Rev 2017; 124:505-524. [PMID: 28447811] [DOI: 10.1037/rev0000068]
Abstract
Dynamical models of cognition play an increasingly important role in driving theoretical and experimental research in psychology. Therefore, parameter estimation, model analysis and comparison of dynamical models are of essential importance. In this article, we propose a maximum likelihood approach for model analysis in a fully dynamical framework that includes time-ordered experimental data. Our methods can be applied to dynamical models for the prediction of discrete behavior (e.g., movement onsets); in particular, we use a dynamical model of saccade generation in scene viewing as a case study for our approach. For this model, the likelihood function can be computed directly by numerical simulation, which enables more efficient parameter estimation including Bayesian inference to obtain reliable estimates and corresponding credible intervals. Using hierarchical models, inference is even possible for individual observers. Furthermore, our likelihood approach can be used to compare different models. In our example, the dynamical framework is shown to outperform nondynamical statistical models. Additionally, the likelihood-based evaluation differentiates model variants that produced indistinguishable predictions on hitherto used statistics. Our results indicate that the likelihood approach is a promising framework for dynamical cognitive models.
Affiliation(s)
- Heiko H Schütt
- Neural Information Processing Group, University of Tübingen
- Lars O M Rothkegel
- Department of Experimental and Biological Psychology, University of Potsdam
- Hans A Trukenbrod
- Department of Experimental and Biological Psychology, University of Potsdam
- Ralf Engbert
- Department of Experimental and Biological Psychology, University of Potsdam

27
Wichmann FA, Janssen DHJ, Geirhos R, Aguilar G, Schütt HH, Maertens M, Bethge M. Methods and measurements to compare men against machines. Electronic Imaging 2017. [DOI: 10.2352/issn.2470-1173.2017.14.hvei-113]
28
Jäkel F, Singh M, Wichmann FA, Herzog MH. An overview of quantitative approaches in Gestalt perception. Vision Res 2016; 126:3-8. [PMID: 27353224] [DOI: 10.1016/j.visres.2016.06.004]
Abstract
Gestalt psychology is often criticized as lacking quantitative measurements and precise mathematical models. While this is true of the early Gestalt school, today there are many quantitative approaches in Gestalt perception and the special issue of Vision Research "Quantitative Approaches in Gestalt Perception" showcases the current state-of-the-art. In this article we give an overview of these current approaches. For example, ideal observer models are one of the standard quantitative tools in vision research and there is a clear trend to try and apply this tool to Gestalt perception and thereby integrate Gestalt perception into mainstream vision research. More generally, Bayesian models, long popular in other areas of vision research, are increasingly being employed to model perceptual grouping as well. Thus, although experimental and theoretical approaches to Gestalt perception remain quite diverse, we are hopeful that these quantitative trends will pave the way for a unified theory.
Affiliation(s)
- Frank Jäkel
- Institute of Cognitive Science, University of Osnabrück, Germany
- Manish Singh
- Department of Psychology and Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Felix A Wichmann
- Neural Information Processing Group, Faculty of Science, and Bernstein Center for Computational Neuroscience Tübingen, University of Tübingen, Germany; Max Planck Institute for Intelligent Systems, Empirical Inference Department, Tübingen, Germany
- Michael H Herzog
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

29
Abstract
Most of the visual field is peripheral, and the periphery encodes visual input with less fidelity compared to the fovea. What information is encoded, and what is lost in the visual periphery? A systematic way to answer this question is to determine how sensitive the visual system is to different kinds of lossy image changes compared to the unmodified natural scene. If modified images are indiscriminable from the original scene, then the information discarded by the modification is not important for perception under the experimental conditions used. We measured the detectability of modifications of natural image structure using a temporal three-alternative oddity task, in which observers compared modified images to original natural scenes. We consider two lossy image transformations, Gaussian blur and Portilla and Simoncelli texture synthesis. Although our paradigm demonstrates metamerism (physically different images that appear the same) under some conditions, in general we find that humans can be capable of impressive sensitivity to deviations from natural appearance. The representations we examine here do not preserve all the information necessary to match the appearance of natural scenes in the periphery.
30
Mohr J, Seyfarth J, Lueschow A, Weber JE, Wichmann FA, Obermayer K. BOiS-Berlin Object in Scene Database: Controlled Photographic Images for Visual Search Experiments with Quantified Contextual Priors. Front Psychol 2016; 7:749. [PMID: 27242646] [PMCID: PMC4876128] [DOI: 10.3389/fpsyg.2016.00749]
Affiliation(s)
- Johannes Mohr
- Neural Information Processing Group, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- Julia Seyfarth
- Neural Information Processing Group, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- Andreas Lueschow
- Cognitive Neurophysiology Group, Department of Neurology, Campus Benjamin Franklin, Charité-University Medicine Berlin, Berlin, Germany
- Joachim E Weber
- Cognitive Neurophysiology Group, Department of Neurology, Campus Benjamin Franklin, Charité-University Medicine Berlin, Berlin, Germany
- Felix A Wichmann
- Neural Information Processing Group, Faculty of Science, Bernstein Center for Computational Neuroscience Tübingen, University of Tübingen, Tübingen, Germany; Empirical Inference Department, Max Planck Institute for Intelligent Systems, Tübingen, Germany
- Klaus Obermayer
- Neural Information Processing Group, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany

31
Schütt HH, Harmeling S, Macke JH, Wichmann FA. Painfree and accurate Bayesian estimation of psychometric functions for (potentially) overdispersed data. Vision Res 2016; 122:105-123. [PMID: 27013261] [DOI: 10.1016/j.visres.2016.02.002]
32
Betz T, Shapley R, Wichmann FA, Maertens M. Testing the role of luminance edges in White's illusion with contour adaptation. J Vis 2015; 15:14. [PMID: 26305862] [PMCID: PMC6897287] [DOI: 10.1167/15.11.14]
Abstract
White's illusion is the perceptual effect that two equiluminant gray patches superimposed on a black-and-white square-wave grating appear different in lightness: A test patch placed on a dark stripe of the grating looks lighter than one placed on a light stripe. Although the effect does not depend on the aspect ratio of the test patches, and thus on the amount of border that is shared with either the dark or the light stripe, the context of each patch must, in a yet to be specified way, influence their lightness. We employed a contour adaptation paradigm (Anstis, 2013) to test the contribution of each of the test patches' edges to the perceived lightness of the test patches. We found that adapting to the edges that are oriented parallel to the grating slightly increased the lightness illusion, whereas adapting to the orthogonal edges abolished, or for some observers even reversed, the lightness illusion. We implemented a temporal adaptation mechanism in three spatial filtering models of lightness perception, and show that the models cannot account for the observed adaptation effects. We conclude that White's illusion is largely determined by edge contrast across the edge orthogonal to the grating, whereas the parallel edge has little or no influence. We suggest mechanisms that could explain this asymmetry.
33
Abstract
In humans and in foveated animals visual acuity is highly concentrated at the center of gaze, so that choosing where to look next is an important example of online, rapid decision-making. Computational neuroscientists have developed biologically-inspired models of visual attention, termed saliency maps, which successfully predict where people fixate on average. Using point process theory for spatial statistics, we show that scanpaths contain, however, important statistical structure, such as spatial clustering on top of distributions of gaze positions. Here, we develop a dynamical model of saccadic selection that accurately predicts the distribution of gaze positions as well as spatial clustering along individual scanpaths. Our model relies on activation dynamics via spatially-limited (foveated) access to saliency information, and, second, a leaky memory process controlling the re-inspection of target regions. This theoretical framework models a form of context-dependent decision-making, linking neural dynamics of attention to behavioral gaze data.
Affiliation(s)
- Ralf Engbert
- University of Potsdam, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
- Hans A Trukenbrod
- University of Potsdam, Germany; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
- Simon Barthelmé
- Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany; University of Geneva, Geneva, Switzerland
- Felix A Wichmann
- Eberhard Karls University of Tübingen, Germany; Bernstein Center for Computational Neuroscience Tübingen, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany

34
Abstract
Visual perception of object attributes such as surface lightness is crucial for successful interaction with the environment. How the visual system assigns lightness to image regions is not yet understood. It has been shown that the context in which a surface is embedded influences its perceived lightness, but whether that influence involves predominantly low-, mid-, or high-level visual mechanisms has not been resolved. To answer this question, we measured whether perceptual attributes of target image regions affected their perceived lightness when they were placed in different contexts. We varied the sharpness of the edge while keeping total target flux fixed. Targets with a sharp edge were consistent with the perceptual interpretation of a surface, and in that case, observers perceived significant brightening or darkening of the target. Targets with blurred edges rather appeared to be spotlights instead of surfaces; for targets with blurred edges, there was much less of a contextual effect on target lightness. The results indicate that the effect of context on the lightness of an image region is not fixed but is strongly affected by image manipulations that modify the perceptual attributes of the target, implying that a mid-level scene interpretation affects lightness perception.
Affiliation(s)
- Marianne Maertens
- Modelling of Cognitive Processes Group, Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany
- Felix A Wichmann
- Modelling of Cognitive Processes Group, Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany; AG Neuronale Informationsverarbeitung, Mathematisch-Naturwissenschaftliche Fakultät, Eberhard Karls Universität, Tübingen, Germany; Bernstein Center für Computational Neuroscience, Tübingen, Germany; Max-Planck-Institut für Intelligente Systeme, Abteilung Empirische Inferenz, Tübingen, Germany; Center for Neural Science, New York University, New York, NY, USA
- Robert Shapley
- Modelling of Cognitive Processes Group, Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany; AG Neuronale Informationsverarbeitung, Mathematisch-Naturwissenschaftliche Fakultät, Eberhard Karls Universität, Tübingen, Germany; Bernstein Center für Computational Neuroscience, Tübingen, Germany; Max-Planck-Institut für Intelligente Systeme, Abteilung Empirische Inferenz, Tübingen, Germany; Center for Neural Science, New York University, New York, NY, USA

35
Betz T, Shapley R, Wichmann FA, Maertens M. Noise masking of White's illusion exposes the weakness of current spatial filtering models of lightness perception. J Vis 2015; 15:1. [PMID: 26426914] [PMCID: PMC6894438] [DOI: 10.1167/15.14.1]
Abstract
Spatial filtering models are currently a widely accepted mechanistic account of human lightness perception. Their popularity can be ascribed to two reasons: They correctly predict how human observers perceive a variety of lightness illusions, and the processing steps involved in the models bear an apparent resemblance with known physiological mechanisms at early stages of visual processing. Here, we tested the adequacy of these models by probing their response to stimuli that have been modified by adding narrowband noise. Psychophysically, it has been shown that noise in the range of one to five cycles per degree (cpd) can drastically reduce the strength of some lightness phenomena, while noise outside this range has little or no effect on perceived lightness. Choosing White's illusion (White, 1979) as a test case, we replicated and extended the psychophysical results, and found that none of the spatial filtering models tested was able to reproduce the spatial frequency specific effect of narrowband noise. We discuss the reasons for failure for each model individually, but we argue that the failure is indicative of the general inadequacy of this class of spatial filtering models. Given the present evidence we do not believe that spatial filtering models capture the mechanisms that are responsible for producing many of the lightness phenomena observed in human perception. Instead we think that our findings support the idea that low-level contributions to perceived lightness are primarily determined by the luminance contrast at surface boundaries.
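Narrowband noise of the kind used in these masking experiments can be generated by band-pass filtering white noise in the Fourier domain. The image size, pixels-per-degree value, and band limits below are assumed values for illustration, not the stimulus parameters of the study.

```python
import numpy as np

def narrowband_noise(size, ppd, center_cpd, bandwidth_cpd, seed=0):
    """White noise restricted to a narrow annulus of spatial frequencies
    (in cycles per degree, cpd) via an ideal band-pass filter in the
    Fourier domain."""
    rng = np.random.default_rng(seed)
    white = rng.normal(0.0, 1.0, (size, size))
    freqs = np.fft.fftfreq(size, d=1.0 / ppd)   # frequency axis in cpd
    fx, fy = np.meshgrid(freqs, freqs)
    radial = np.hypot(fx, fy)                   # radial frequency of each component
    band = np.abs(radial - center_cpd) < bandwidth_cpd / 2.0
    return np.real(np.fft.ifft2(np.fft.fft2(white) * band))

# e.g. a mask centred on 3 cpd, inside the 1-5 cpd range that
# reduces the illusion:
mask = narrowband_noise(size=128, ppd=32, center_cpd=3.0, bandwidth_cpd=2.0)
```

Because the filter is an indicator function on radial frequency, all power outside the chosen band (including the DC component) is removed exactly, so the mask's effect can be attributed to a known spatial-frequency range.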
36
Abstract
In the perceptual sciences, experimenters study the causal mechanisms of perceptual systems by probing observers with carefully constructed stimuli. It has long been known, however, that perceptual decisions are not only determined by the stimulus, but also by internal factors. Internal factors could lead to a statistical influence of previous stimuli and responses on the current trial, resulting in serial dependencies, which complicate the causal inference between stimulus and response. However, the majority of studies do not take serial dependencies into account, and it has been unclear how strongly they influence perceptual decisions. We hypothesize that one reason for this neglect is that there has been no reliable tool to quantify them and to correct for their effects. Here we develop a statistical method to detect, estimate, and correct for serial dependencies in behavioral data. We show that even trained psychophysical observers suffer from strong history dependence. A substantial fraction of the decision variance on difficult stimuli was independent of the stimulus but dependent on experimental history. We discuss the strong dependence of perceptual decisions on internal factors and its implications for correct data interpretation.
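A minimal simulation shows how such history dependence can be detected. The stimulus and history weights below are invented for illustration, and the detection step is a simple conditional analysis on weak stimuli rather than the full regression model developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated yes/no observer: each decision depends on the stimulus and,
# with (assumed) weight h, on the previous response.
n_trials = 5000
beta, h = 1.5, 0.8                      # stimulus and history weights
stim = rng.normal(0.0, 1.0, n_trials)
resp = np.zeros(n_trials, dtype=int)
for t in range(n_trials):
    history = 2 * resp[t - 1] - 1 if t > 0 else 0
    p_yes = 1.0 / (1.0 + np.exp(-(beta * stim[t] + h * history)))
    resp[t] = int(rng.random() < p_yes)

# On weak (near-threshold) stimuli, the previous response predicts the
# current one, i.e. the stimulus alone does not explain the behaviour.
weak = np.abs(stim) < 0.3
prev = np.roll(resp, 1)
p_yes_after_yes = resp[weak & (prev == 1)].mean()
p_yes_after_no = resp[weak & (prev == 0)].mean()
```

The gap between the two conditional response rates is the signature of serial dependence; regressing responses on both stimulus and history terms, as the paper proposes, additionally yields an estimate of the history weight and a corrected stimulus sensitivity.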
Affiliation(s)
- Ingo Fründ
- Bernstein Center for Computational Neuroscience, Technical University Berlin, Germany; Center for Vision Research, York University, Toronto, ON, Canada
- Felix A Wichmann
- Neural Information Processing Group, Eberhard Karls Universität, Max Planck Institute for Intelligent Systems, Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Jakob H Macke
- Max Planck Institute for Biological Cybernetics, Bernstein Center for Computational Neuroscience, Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany; Gatsby Computational Neuroscience Unit, University College London, UK

37
Abstract
Pattern detection is the bedrock of modern vision science. Nearly half a century ago, psychophysicists advocated a quantitative theoretical framework that connected visual pattern detection with its neurophysiological underpinnings. In this theory, neurons in primary visual cortex constitute linear and independent visual channels whose output is linked to choice behavior in detection tasks via simple read-out mechanisms. This model has proven remarkably successful in accounting for threshold vision. It is fundamentally at odds, however, with current knowledge about the neurophysiological underpinnings of pattern vision. In addition, the principles put forward in the model fail to generalize to suprathreshold vision or perceptual tasks other than detection. We propose an alternative theory of detection in which perceptual decisions develop from maximum-likelihood decoding of a neurophysiologically inspired model of population activity in primary visual cortex. We demonstrate that this theory explains a broad range of classic detection results. With a single set of parameters, our model can account for several summation, adaptation, and uncertainty effects, thereby offering a new theoretical interpretation for the vast psychophysical literature on pattern detection.
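The decoding step of such a theory can be illustrated with a toy population: independent Poisson neurons with Gaussian tuning curves, read out by maximum likelihood. All numbers here (17 neurons, tuning width, gain) are arbitrary assumptions for the sketch, not the neurophysiologically inspired V1 model of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
prefs = np.linspace(-40.0, 40.0, 17)          # preferred orientations (deg)

def rates(theta, gain=20.0, width=15.0, base=1.0):
    """Mean firing rates of the population for orientation theta."""
    return base + gain * np.exp(-0.5 * ((theta - prefs) / width) ** 2)

def ml_decode(spike_counts, candidates=np.linspace(-40.0, 40.0, 161)):
    """Maximum-likelihood estimate under independent Poisson noise
    (log-likelihood up to a term that does not depend on theta)."""
    ll = [np.sum(spike_counts * np.log(rates(c)) - rates(c))
          for c in candidates]
    return candidates[int(np.argmax(ll))]

# Decode a batch of simulated trials at a fixed true orientation:
true_theta = 10.0
estimates = [ml_decode(rng.poisson(rates(true_theta))) for _ in range(50)]
mean_estimate = float(np.mean(estimates))
```

Once the population model and read-out are fixed, predictions for detection, summation, and uncertainty effects follow from comparing the likelihoods of candidate stimuli, which is the sense in which a single decoder can replace task-specific read-out rules.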
Collapse
Affiliation(s)
- Robbe L T Goris
- Center for Neural Science, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA.
38
Schönfelder VH, Wichmann FA. Identification of stimulus cues in narrow-band tone-in-noise detection using sparse observer models. J Acoust Soc Am 2013; 134:447-463. [PMID: 23862820 DOI: 10.1121/1.4807561] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
The classical psychophysical paradigm of narrow-band tone-in-noise (TiN) detection has been under investigation for more than 70 years, yet no conclusive answer has been given as to which auditory stimulus features listeners rely on. Here, individual observer models were fit to a large trial-by-trial behavioral data set using a modern statistical analysis procedure. Relative perceptual weights were estimated for a set of auditory features including sound energy, representations of the spectra as well as summary statistics of both fine structure and envelope. The fitted models captured the behavior of all listeners on a single-trial level. The estimated perceptual weights were stable across signal levels. They suggest that responses of observers depended on stimulus energy, though that cue was not always dominant, as well as on band-pass detectors applied to the fine structure spectrum. A subset of the observers exhibited an additional dependence on sound envelope which was best captured by two envelope descriptors: average slope and extrema count. For some listeners, a concurrent analysis of sequential dependencies showed interactions between the current and several preceding decisions. There was no unique answer regarding the strategy individual listeners employ during TiN detection, and implications thereof are discussed.
Affiliation(s)
- Vinzenz H Schönfelder
- Department for Modeling of Cognitive Processes, MAR 5-3, Technical University Berlin, Marchstrasse 23, 10587 Berlin, Germany.
39
Schönfelder VH, Wichmann FA. Sparse regularized regression identifies behaviorally-relevant stimulus features from psychophysical data. J Acoust Soc Am 2012; 131:3953-3969. [PMID: 22559369 DOI: 10.1121/1.3701832] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
As a prerequisite to quantitative psychophysical models of sensory processing, it is necessary to learn to what extent decisions in behavioral tasks depend on specific stimulus features, the perceptual cues. Based on relative linear combination weights, this study demonstrates how stimulus-response data can be analyzed in this regard using an L1-regularized multiple logistic regression, a modern statistical procedure developed in machine learning. This method prevents complex models from over-fitting to noisy data. In addition, it enforces "sparse" solutions, a computational approximation to the postulate that a good model should contain the minimal set of predictors necessary to explain the data. In simulations, behavioral data from a classical auditory tone-in-noise detection task were generated. The proposed method is shown to precisely identify observer cues from a large set of covarying, interdependent stimulus features, a setting where standard correlational and regression methods fail. The proposed method succeeds for a wide range of signal-to-noise ratios and for deterministic as well as probabilistic observers. Furthermore, the detailed decision rules of the simulated observers were reconstructed from the estimated linear model weights, allowing predictions of responses on the basis of individual stimuli.
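The core of the approach described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: a simulated observer whose responses depend on only two of many covarying stimulus features, and an L1-regularized logistic regression fitted by proximal gradient descent (ISTA) that recovers exactly those cues. All numbers and the optimizer are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated detection experiment: 20 correlated stimulus features, but the
# observer's decisions depend on only two of them (features 0 and 3).
n_trials, n_feat = 2000, 20
shared = rng.normal(size=(n_trials, 1))
X = 0.5 * shared + rng.normal(size=(n_trials, n_feat))  # covarying features
true_w = np.zeros(n_feat)
true_w[0], true_w[3] = 2.0, -1.5                        # the two real cues
p_resp = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.random(n_trials) < p_resp).astype(float)       # observer responses

def fit_l1_logreg(X, y, lam=0.02, lr=0.1, n_iter=3000):
    """L1-regularized logistic regression via proximal gradient descent
    (ISTA): a gradient step on the mean log-likelihood followed by
    soft-thresholding, which drives irrelevant weights toward zero."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p_hat = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p_hat - y) / len(y)
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w

w_hat = fit_l1_logreg(X, y)
cue_idx = sorted(int(i) for i in np.argsort(np.abs(w_hat))[-2:])
print(cue_idx)   # the two genuine cues should carry the largest weights
```

The soft-thresholding step is what makes the solution sparse: features whose gradient never exceeds the regularization strength stay at zero, which is why this method copes with interdependent features where plain correlation analysis fails.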
Affiliation(s)
- Vinzenz H Schönfelder
- Department for Modeling of Cognitive Processes, Technical University Berlin, FR 6-4, Franklinstr 28/29, 10587 Berlin, Germany.
40
Abstract
Measuring sensitivity is at the heart of psychophysics. Often, sensitivity is derived from estimates of the psychometric function. This function relates response probability to stimulus intensity. In estimating these response probabilities, most studies assume stationary observers: Responses are expected to be dependent only on the intensity of a presented stimulus and not on other factors such as stimulus sequence, duration of the experiment, or the responses on previous trials. Unfortunately, a number of factors such as learning, fatigue, or fluctuations in attention and motivation will typically result in violations of this assumption. The severity of these violations is yet unknown. We use Monte Carlo simulations to show that violations of these assumptions can result in underestimation of confidence intervals for parameters of the psychometric function. Even worse, collecting more trials does not eliminate this misestimation of confidence intervals. We present a simple adjustment of the confidence intervals that corrects for the underestimation almost independently of the number of trials and the particular type of violation.
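The effect described above is easy to demonstrate in a small Monte Carlo sketch. The simulation below is illustrative only (it is not the authors' simulation, and the numbers are assumptions): a non-stationary observer whose performance level fluctuates between experimental sessions produces proportion-correct estimates with much more spread than the binomial model behind standard confidence intervals assumes.

```python
import numpy as np

rng = np.random.default_rng(1)

n_trials, n_experiments = 100, 5000
p_mean = 0.75

# Stationary observer: every trial is an independent Bernoulli(p_mean),
# exactly what standard confidence intervals assume.
stationary = rng.binomial(n_trials, p_mean, size=n_experiments) / n_trials

# Non-stationary observer: attention/motivation fluctuates, so a whole
# session runs at either p = 0.65 or p = 0.85 (same mean performance).
p_session = np.where(rng.random(n_experiments) < 0.5, 0.65, 0.85)
nonstationary = rng.binomial(n_trials, p_session) / n_trials

sd_stat = float(stationary.std())
sd_nonstat = float(nonstationary.std())
print(round(sd_nonstat / sd_stat, 1))  # factor by which binomial CIs are too narrow
```

Note that the extra variance comes from the *correlated* fluctuation across trials within a session; if the lapses were independent from trial to trial, the responses would remain exactly binomial. This is also why collecting more trials per session does not remove the misestimation.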
Affiliation(s)
- Ingo Fründ
- Modellierung Kognitiver Prozesse, Technische Universität Berlin and Bernstein Center for Computational Neuroscience, Berlin, Germany.
41
Macke JH, Wichmann FA. Estimating predictive stimulus features from psychophysical data: The decision image technique applied to human faces. J Vis 2010; 10:22. [PMID: 20616129 DOI: 10.1167/10.5.22] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
One major challenge in the sensory sciences is to identify the stimulus features on which sensory systems base their computations, and which are predictive of a behavioral decision: they are a prerequisite for computational models of perception. We describe a technique (decision images) for extracting predictive stimulus features using logistic regression. A decision image not only defines a region of interest within a stimulus but is a quantitative template which defines a direction in stimulus space. Decision images thus enable the development of predictive models, as well as the generation of optimized stimuli for subsequent psychophysical investigations. Here we describe our method and apply it to data from a human face classification experiment. We show that decision images are able to predict human responses not only in terms of overall percent correct but also in terms of the probabilities with which individual faces are (mis-)classified by individual observers. We show that the most predictive dimension for gender categorization is aligned neither with the axis defined by the two class means nor with the first principal component of all faces, two hypotheses frequently entertained in the literature. Our method can be applied to a wide range of binary classification tasks in vision or other psychophysical contexts.
Affiliation(s)
- Jakob H Macke
- Max-Planck-Institut für biologische Kybernetik, Tübingen, Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Germany.
42
Wichmann FA, Drewes J, Rosas P, Gegenfurtner KR. Animal detection in natural scenes: critical features revisited. J Vis 2010; 10:6.1-27. [PMID: 20465326 DOI: 10.1167/10.4.6] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2008] [Accepted: 12/19/2009] [Indexed: 11/24/2022] Open
Abstract
S. J. Thorpe, D. Fize, and C. Marlot (1996) showed how rapidly observers can detect animals in images of natural scenes, but it is still unclear which image features support this rapid detection. A. B. Torralba and A. Oliva (2003) suggested that a simple image statistic based on the power spectrum allows the absence or presence of objects in natural scenes to be predicted. We tested whether human observers make use of power spectral differences between image categories when detecting animals in natural scenes. In Experiments 1 and 2 we found performance to be essentially independent of the power spectrum. Computational analysis revealed that the ease of classification correlates with the proposed spectral cue without being caused by it. This result is consistent with the hypothesis that in commercial stock photo databases a majority of animal images are pre-segmented from the background by the photographers and this pre-segmentation causes the power spectral differences between image categories and may, furthermore, help rapid animal detection. Data from a third experiment are consistent with this hypothesis. Together, our results make it exceedingly unlikely that human observers make use of power spectral differences between animal- and no-animal images during rapid animal detection. In addition, our results point to potential confounds in the commercially available "natural image" databases whose statistics may be less natural than commonly presumed.
Affiliation(s)
- Felix A Wichmann
- Modelling of Cognitive Processes, Berlin Institute of Technology & Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany.
43
Jäkel F, Schölkopf B, Wichmann FA. Does Cognitive Science Need Kernels? Trends Cogn Sci 2009; 13:381-8. [DOI: 10.1016/j.tics.2009.06.002] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2008] [Revised: 06/11/2009] [Accepted: 06/15/2009] [Indexed: 11/16/2022]
44
Goris RLT, Wichmann FA, Henning GB. A neurophysiologically plausible population code model for human contrast discrimination. J Vis 2009; 9:15. [PMID: 19761330 DOI: 10.1167/9.7.15] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2008] [Accepted: 04/30/2009] [Indexed: 11/24/2022] Open
Abstract
The pedestal effect is the improvement in the detectability of a sinusoidal grating in the presence of another grating of the same orientation, spatial frequency, and phase, usually called the pedestal. Recent evidence has demonstrated that the pedestal effect is differently modified by spectrally flat and notch-filtered noise: The pedestal effect is reduced in flat noise but virtually disappears in the presence of notched noise (G. B. Henning & F. A. Wichmann, 2007). Here we consider a network consisting of units whose contrast response functions resemble those of the cortical cells believed to underlie human pattern vision and demonstrate that, when the outputs of multiple units are combined by simple weighted summation (a heuristic decision rule that resembles optimal information combination and produces a contrast-dependent weighting profile), the network produces contrast-discrimination data consistent with psychophysical observations: The pedestal effect is present without noise, reduced in broadband noise, but almost disappears in notched noise. These findings follow naturally from the normalization model of simple cells in primary visual cortex, followed by response-based pooling, and suggest that in processing even low-contrast sinusoidal gratings, the visual system may combine information across neurons tuned to different spatial frequencies and orientations.
Affiliation(s)
- Robbe L T Goris
- Laboratory of Experimental Psychology, University of Leuven, Leuven, Belgium.
45
Kienzle W, Franz MO, Schölkopf B, Wichmann FA. Center-surround patterns emerge as optimal predictors for human saccade targets. J Vis 2009; 9:7.1-15. [PMID: 19757885 DOI: 10.1167/9.5.7] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2008] [Accepted: 02/22/2009] [Indexed: 11/24/2022] Open
Abstract
The human visual system is foveated; that is, outside the central visual field, resolution and acuity drop rapidly. Nonetheless much of a visual scene is perceived after only a few saccadic eye movements, suggesting an effective strategy for selecting saccade targets. It has been known for some time that local image structure at saccade targets influences the selection process. However, the question of what the most relevant visual features are is still under debate. Here we show that center-surround patterns emerge as the optimal solution for predicting saccade targets from their local image structure. The resulting model, a one-layer feed-forward network, is surprisingly simple compared to previously suggested models which assume much more complex computations such as multi-scale processing and multiple feature channels. Nevertheless, our model is equally predictive. Furthermore, our findings are consistent with neurophysiological hardware in the superior colliculus. Bottom-up visual saliency may thus not be computed cortically as has been thought previously.
Affiliation(s)
- Wolf Kienzle
- Empirical Inference Department, Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
46
Goris RLT, Wagemans J, Wichmann FA. Modelling contrast discrimination data suggest both the pedestal effect and stochastic resonance to be caused by the same mechanism. J Vis 2008; 8:17.1-21. [PMID: 19146300 DOI: 10.1167/8.15.17] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2008] [Accepted: 07/28/2008] [Indexed: 11/24/2022] Open
Abstract
Computational models of spatial vision typically make use of a (rectified) linear filter, a nonlinearity and dominant late noise to account for human contrast discrimination data. Linear-nonlinear cascade models predict an improvement in observers' contrast detection performance when low, subthreshold levels of external noise are added (i.e., stochastic resonance). Here, we address the issue whether a single contrast gain-control model of early spatial vision can account for both the pedestal effect, i.e., the improved detectability of a grating in the presence of a low-contrast masking grating, and stochastic resonance. We measured contrast discrimination performance without noise and in both weak and moderate levels of noise. Making use of a full quantitative description of our data with few parameters combined with comprehensive model selection assessments, we show the pedestal effect to be more reduced in the presence of weak noise than in moderate noise. This reduction rules out independent, additive sources of performance improvement and, together with a simulation study, supports the parsimonious explanation that a single mechanism underlies the pedestal effect and stochastic resonance in contrast perception.
Affiliation(s)
- Robbe L T Goris
- Laboratory for Experimental Psychology, University of Leuven, Tiensestraat, Leuven, Belgium.
47
Rosas P, Wichmann FA, Wagemans J. Texture and object motion in slant discrimination: failure of reliability-based weighting of cues may be evidence for strong fusion. J Vis 2007; 7:3. [PMID: 17685786 DOI: 10.1167/7.6.3] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2005] [Accepted: 12/30/2006] [Indexed: 11/24/2022] Open
Abstract
Different types of texture produce differences in slant-discrimination performance (P. Rosas, F. A. Wichmann, & J. Wagemans, 2004). Under the assumption that the visual system is sensitive to the reliability of different depth cues (M. O. Ernst & M. S. Banks, 2002; L. T. Maloney & M. S. Landy, 1989), it follows that the texture type should affect the influence of the texture cue in depth-cue combination. We tested this prediction by combining different texture types with object motion in a slant-discrimination task in two experiments. First, we used consistent cues to observe whether our subjects behaved as if linearly combining independent estimates from texture and motion in a statistically optimal fashion (M. O. Ernst & M. S. Banks, 2002). Only 4% of our results were consistent with such an optimal combination of uncorrelated estimates, whereas about 46% of the data were consistent with an optimal combination of correlated estimates from cues. Second, we measured the weights for the texture and motion cues using perturbation analysis. The results showed a large influence of the motion cue and an increasing weight for the texture cue for larger slants. However, in general, the texture weights did not follow the reliability of the textures. Finally, we fitted the correlation coefficients of estimates individually for each texture, motion condition, and observer. This allowed us to fit our data from both experiments to an optimal cue combination model with correlated estimates, but inspection of the fitted parameters shows no clear, psychophysically interpretable pattern. Furthermore, the fitted motion thresholds as a function of texture type are correlated with the slant thresholds as a function of texture type. One interpretation of such a finding is a strong coupling of cues.
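The reliability-weighted ("statistically optimal") combination rule tested above is simple to state in code. The sketch below uses the Ernst & Banks (2002) rule with purely illustrative numbers (the thresholds and slant estimates are assumptions, not the paper's data): each cue is weighted by its reliability, the inverse of its single-cue variance.

```python
import numpy as np

# Hypothetical single-cue slant thresholds (standard deviations, degrees):
sigma_texture, sigma_motion = 4.0, 2.0

# Reliabilities (inverse variances) and normalized weights:
r_texture = 1.0 / sigma_texture**2
r_motion = 1.0 / sigma_motion**2
w_texture = r_texture / (r_texture + r_motion)
w_motion = r_motion / (r_texture + r_motion)

# Hypothetical single-cue slant estimates (degrees) and their combination:
slant_texture, slant_motion = 32.0, 38.0
slant_combined = w_texture * slant_texture + w_motion * slant_motion
sigma_combined = np.sqrt(1.0 / (r_texture + r_motion))

# The combined estimate lies nearer the more reliable motion cue, and the
# predicted combined variance is smaller than either single-cue variance.
print(round(float(slant_combined), 1), round(float(sigma_combined), 2))
```

The paper's finding is precisely that the measured texture weights often did *not* track the reliabilities this rule predicts, which is what motivates the strong-fusion interpretation.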
Affiliation(s)
- Pedro Rosas
- Katholieke Universiteit Leuven, Leuven, Belgium.
48
Abstract
The pedestal or dipper effect is the large improvement in the detectability of a sinusoidal grating observed when it is added to a masking or pedestal grating of the same spatial frequency, orientation, and phase. We measured the pedestal effect in both broadband and notched noise, that is, noise from which a 1.5-octave band centered on the signal frequency had been removed. Although the pedestal effect persists in broadband noise, it almost disappears in the notched noise. Furthermore, the pedestal effect is substantial when either high- or low-pass masking noise is used. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies different from that of the signal and the pedestal. We speculate that the spatial-frequency components of the notched noise above and below the spatial frequency of the signal and the pedestal prevent "off-frequency looking," that is, prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and the pedestal. Thus, the pedestal or dipper effect measured without notched noise appears not to be a characteristic of individual spatial-frequency-tuned channels.
Affiliation(s)
- G Bruce Henning
- The Sensory Research Unit, Department of Experimental Psychology, Oxford University, Oxford, United Kingdom.
49
Jäkel F, Wichmann FA. Spatial four-alternative forced-choice method is the preferred psychophysical method for naïve observers. J Vis 2006; 6:1307-22. [PMID: 17209737 DOI: 10.1167/6.11.13] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2005] [Accepted: 09/29/2006] [Indexed: 11/24/2022] Open
Abstract
H. R. Blackwell (1952) investigated the influence of different psychophysical methods and procedures on detection thresholds. He found that the temporal two-interval forced-choice method (2-IFC) combined with feedback, blocked constant stimulus presentation with few different stimulus intensities, and highly trained observers resulted in the "best" threshold estimates. This recommendation is in current practice in many psychophysical laboratories and has entered the psychophysicists' "folk wisdom" of how to run proper psychophysical experiments. However, Blackwell's recommendations explicitly require experienced observers, whereas many psychophysical studies, particularly with children or within a clinical setting, are performed with naïve observers. In a series of psychophysical experiments, we find a striking and consistent discrepancy between naïve observers' behavior and that reported for experienced observers by Blackwell: Naïve observers show the "best" threshold estimates for the spatial four-alternative forced-choice method (4-AFC) and the worst for the commonly employed temporal 2-IFC. We repeated our study with a highly experienced psychophysical observer, and he replicated Blackwell's findings exactly, thus suggesting that it is indeed the difference in psychophysical experience that causes the discrepancy between our findings and those of Blackwell. In addition, we explore the efficiency of different methods and show 4-AFC to be more than 3.5 times more efficient than 2-IFC under realistic conditions. While we have found that 4-AFC consistently gives lower thresholds than 2-IFC in detection tasks, we have found the opposite for discrimination tasks. This discrepancy suggests that there are large extrasensory influences on thresholds (sensory memory for IFC methods and spatial attention for spatial forced-choice methods) that are critical but, alas, not part of theoretical approaches to psychophysics such as signal detection theory.
Affiliation(s)
- Frank Jäkel
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany.
50
Abstract
We attempt to shed light on the algorithms humans use to classify images of human faces according to their gender. For this, a novel methodology combining human psychophysics and machine learning is introduced. We proceed as follows. First, we apply principal component analysis (PCA) on the pixel information of the face stimuli. We then obtain a data set composed of these PCA eigenvectors combined with the subjects' gender estimates of the corresponding stimuli. Second, we model the gender classification process on this data set using a separating hyperplane (SH) between both classes. This SH is computed using algorithms from machine learning: the support vector machine (SVM), the relevance vector machine, the prototype classifier, and the K-means classifier. The classification behavior of humans and machines is then analyzed in three steps. First, the classification errors of humans and machines are compared for the various classifiers, and we also assess how well machines can recreate the subjects' internal decision boundary by studying the training errors of the machines. Second, we study the correlations between the rank-order of the subjects' responses to each stimulus (the gender estimate with its reaction time and confidence rating) and the rank-order of the distance of these stimuli to the SH. Finally, we attempt to compare the metric of the representations used by humans and machines for classification by relating the subjects' gender estimate of each stimulus and the distance of this stimulus to the SH. While we show that the classification error alone is not a sufficient selection criterion between the different algorithms humans might use to classify face stimuli, the distance of these stimuli to the SH is shown to capture essentials of the internal decision space of humans. Furthermore, algorithms such as the prototype classifier using stimuli in the center of the classes are shown to be less adapted to model human classification behavior than algorithms such as the SVM based on stimuli close to the boundary between the classes.
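The PCA-then-hyperplane pipeline described above can be sketched compactly. The code below is a toy stand-in, not the paper's analysis: it uses synthetic high-dimensional patterns instead of face images, and a least-squares hyperplane as a simple linear stand-in for the SVM. The signed distance to the hyperplane is the quantity the paper relates to reaction times and confidence ratings.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the face stimuli: two classes of 100-dimensional
# patterns separated along one hidden direction.
n_per_class, dim = 200, 100
axis = rng.normal(size=dim)
axis /= np.linalg.norm(axis)
X = np.vstack([rng.normal(size=(n_per_class, dim)) + 1.5 * axis,
               rng.normal(size=(n_per_class, dim)) - 1.5 * axis])
y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])

# Step 1: PCA on the raw "pixel" data, via SVD of the centered matrix.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:20].T                              # keep 20 components

# Step 2: a separating hyperplane in PCA space, fitted by least squares.
A = np.column_stack([Z, np.ones(len(Z))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
signed_dist = A @ w                             # signed distance to the SH

accuracy = float(np.mean(np.sign(signed_dist) == y))
print(accuracy)
```

Stimuli with small `signed_dist` sit near the boundary; the paper's result is that such boundary stimuli, not class prototypes, best predict human responses, confidence, and reaction times.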
Affiliation(s)
- Arnulf B A Graf
- Max Planck Institute for Biological Cybernetics, D 72076 Tübingen, Germany.