1
Nadler EO, Darragh-Ford E, Desikan BS, Conaway C, Chu M, Hull T, Guilbeault D. Divergences in color perception between deep neural networks and humans. Cognition 2023; 241:105621. [PMID: 37716312 DOI: 10.1016/j.cognition.2023.105621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 06/23/2023] [Accepted: 09/09/2023] [Indexed: 09/18/2023]
Abstract
Deep neural networks (DNNs) are increasingly proposed as models of human vision, bolstered by their impressive performance on image classification and object recognition tasks. Yet, the extent to which DNNs capture fundamental aspects of human vision such as color perception remains unclear. Here, we develop novel experiments for evaluating the perceptual coherence of color embeddings in DNNs, and we assess how well these algorithms predict human color similarity judgments collected via an online survey. We find that state-of-the-art DNN architectures - including convolutional neural networks and vision transformers - provide color similarity judgments that strikingly diverge from human color judgments of (i) images with controlled color properties, (ii) images generated from online searches, and (iii) real-world images from the canonical CIFAR-10 dataset. We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition, inspired by foundational theories in computational neuroscience. While one deep learning model - a convolutional DNN trained on a style transfer task - captures some aspects of human color perception, our wavelet algorithm provides more coherent color embeddings that better predict human color judgments compared to all DNNs we examine. These results hold when altering the high-level visual task used to train similar DNN architectures (e.g., image classification versus image segmentation), as well as when examining the color embeddings of different layers in a given DNN architecture. These findings break new ground in the effort to analyze the perceptual representations of machine learning algorithms and to improve their ability to serve as cognitively plausible models of human vision. Implications for machine learning, human perception, and embodied cognition are discussed.
Affiliation(s)
- Ethan O Nadler
  - Carnegie Observatories, USA; Department of Physics, University of Southern California, USA
- Elise Darragh-Ford
  - Kavli Institute for Particle Astrophysics and Cosmology and Department of Physics, Stanford University, USA
- Bhargav Srinivasa Desikan
  - School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland; Knowledge Lab, University of Chicago, USA
- Mark Chu
  - School of the Arts, Columbia University, USA
2
Bowers JS, Malhotra G, Dujmović M, Llera Montero M, Tsvetkov C, Biscione V, Puebla G, Adolfi F, Hummel JE, Heaton RF, Evans BD, Mitchell J, Blything R. Deep problems with neural network models of human vision. Behav Brain Sci 2022; 46:e385. [PMID: 36453586 DOI: 10.1017/s0140525x22002813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Affiliation(s)
- Jeffrey S Bowers
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Gaurav Malhotra
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Marin Dujmović
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Milton Llera Montero
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Christian Tsvetkov
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Valerio Biscione
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Guillermo Puebla
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
- Federico Adolfi
  - School of Psychological Science, University of Bristol, Bristol, UK; https://jeffbowers.blogs.bristol.ac.uk/
  - Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt am Main, Germany
- John E Hummel
  - Department of Psychology, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Rachel F Heaton
  - Department of Psychology, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Benjamin D Evans
  - Department of Informatics, School of Engineering and Informatics, University of Sussex, Brighton, UK
- Jeffrey Mitchell
  - Department of Informatics, School of Engineering and Informatics, University of Sussex, Brighton, UK
- Ryan Blything
  - School of Psychology, Aston University, Birmingham, UK
3
Brick C, Hood B, Ekroll V, de-Wit L. Illusory Essences: A Bias Holding Back Theorizing in Psychological Science. Perspectives on Psychological Science 2021; 17:491-506. [PMID: 34283676 PMCID: PMC8902028 DOI: 10.1177/1745691621991838] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022]
Abstract
The reliance in psychology on verbal definitions means that psychological research is unusually moored to how humans think and communicate about categories. Psychological concepts (e.g., intelligence, attention) are easily assumed to represent objective, definable categories with an underlying essence. Like the “vital forces” previously thought to animate life, these assumed essences can create an illusion of understanding. By synthesizing a wide range of research lines from cognitive, clinical, and biological psychology and neuroscience, we describe a pervasive tendency across psychological science to assume that essences explain phenomena. Labeling a complex phenomenon can appear as theoretical progress before there is sufficient evidence that the described category has a definable essence or known boundary conditions. Category labels can further undermine progress by masking contingent and contextual relationships and obscuring the need to specify mechanisms. Finally, we highlight examples of promising methods that circumvent the lure of essences and suggest four concrete strategies for identifying and avoiding essentialist intuitions in theory development.
Affiliation(s)
- C Brick
  - Department of Psychology, University of Amsterdam; Department of Psychology, University of Cambridge
- B Hood
  - School of Psychological Science, University of Bristol
- V Ekroll
  - Department of Psychosocial Science, University of Bergen
- L de-Wit
  - Department of Psychology, University of Cambridge
4
Keller AJ, Dipoppa M, Roth MM, Caudill MS, Ingrosso A, Miller KD, Scanziani M. A Disinhibitory Circuit for Contextual Modulation in Primary Visual Cortex. Neuron 2020; 108:1181-1193.e8. [PMID: 33301712 PMCID: PMC7850578 DOI: 10.1016/j.neuron.2020.11.013] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 04/30/2020] [Revised: 10/17/2020] [Accepted: 11/13/2020] [Indexed: 12/24/2022]
Abstract
Context guides perception by influencing stimulus saliency. Accordingly, in visual cortex, responses to a stimulus are modulated by context, the visual scene surrounding the stimulus. Responses are suppressed when stimulus and surround are similar but not when they differ. The underlying mechanisms remain unclear. Here, we use optical recordings, manipulations, and computational modeling to show that disinhibitory circuits consisting of vasoactive intestinal peptide (VIP)-expressing and somatostatin (SOM)-expressing inhibitory neurons modulate responses in mouse visual cortex depending on similarity between stimulus and surround, primarily by modulating recurrent excitation. When stimulus and surround are similar, VIP neurons are inactive, and activity of SOM neurons leads to suppression of excitatory neurons. However, when stimulus and surround differ, VIP neurons are active, inhibiting SOM neurons, which leads to relief of excitatory neurons from suppression. We have identified a canonical cortical disinhibitory circuit that contributes to contextual modulation and may regulate perceptual saliency.
Affiliation(s)
- Andreas J Keller
  - Department of Physiology, University of California, San Francisco, San Francisco, CA 94158-0444, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA
- Mario Dipoppa
  - Center for Theoretical Neuroscience, College of Physicians and Surgeons and Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA
- Morgane M Roth
  - Department of Physiology, University of California, San Francisco, San Francisco, CA 94158-0444, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA
- Matthew S Caudill
  - Center for Neural Circuits and Behavior, Neurobiology Section and Department of Neuroscience, University of California, San Diego, La Jolla, CA 92093-0634, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA
- Alessandro Ingrosso
  - Center for Theoretical Neuroscience, College of Physicians and Surgeons and Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA
- Kenneth D Miller
  - Center for Theoretical Neuroscience, College of Physicians and Surgeons and Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY 10027, USA; Department of Neuroscience, Swartz Program in Theoretical Neuroscience, Kavli Institute for Brain Science, College of Physicians and Surgeons and Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, NY, USA
- Massimo Scanziani
  - Department of Physiology, University of California, San Francisco, San Francisco, CA 94158-0444, USA; Center for Neural Circuits and Behavior, Neurobiology Section and Department of Neuroscience, University of California, San Diego, La Jolla, CA 92093-0634, USA; Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA
5
Merkel C, Hopf JM, Schoenfeld MA. Modulating the global orientation bias of the visual system changes population receptive field elongations. Hum Brain Mapp 2019; 41:1765-1774. [PMID: 31872941 PMCID: PMC7267956 DOI: 10.1002/hbm.24909] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/12/2019] [Revised: 12/10/2019] [Accepted: 12/15/2019] [Indexed: 11/06/2022]
Abstract
The topographical structure of the visual system in individual subjects can be visualized using fMRI. Recently, a radial bias for the long axis of population receptive fields (pRF) has been shown using fMRI. It has been theorized that the elongation of receptive fields pointing toward the fovea results from horizontal local connections bundling orientation selective units mostly parallel to their polar position within the visual field. In order to investigate whether there is a causal relationship between orientation selectivity and pRF elongation, the current study employed a global orientation adapter to modulate the orientation bias of the visual system while measuring spatial pRF characteristics. The hypothesis was that the orientation tuning change of neural populations would alter pRF elongations toward the fovea, particularly at axial positions parallel and orthogonal to the affected orientation. The results indeed show a different amount of elongation of pRF units and their orientation at parallel and orthogonal axial positions relative to the adapter orientation. Within the lower left hemifield, pRF radial bias and elongation showed an increase during adaptation to a 135° grating while both parameters decreased during the presentation of a 45° adapter stimulus. The lower right visual field showed the reverse pattern. No modulation of the pRF topographies was observed in the upper visual field, probably due to a vertical visual field asymmetry of sensitivity toward the low contrast spatial frequency pattern of the adapter stimulus. These data suggest a direct relationship between orientation selectivity and elongation of population units within the visual cortex.
Affiliation(s)
- Christian Merkel
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany
- Jens-Max Hopf
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Mircea Ariel Schoenfeld
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany; Kliniken Schmieder, Heidelberg, Germany
6
Merkel C, Hopf JM, Schoenfeld MA. Spatial elongation of population receptive field profiles revealed by model-free fMRI back-projection. Hum Brain Mapp 2018; 39:2472-2481. [PMID: 29464880 DOI: 10.1002/hbm.24015] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/17/2017] [Revised: 01/22/2018] [Accepted: 02/13/2018] [Indexed: 11/07/2022]
Abstract
Estimates of visual field topographies in human visual cortex obtained through fMRI traveling wave techniques usually provide the parameters of population receptive field (pRF) location (polar angle, eccentricity) and receptive field size. These parameters are obtained by fitting the recorded data to a standard model population receptive field. In this work, pRF profiles are measured directly by back-projecting preprocessed fMRI time-series to sweeps of a bar across the visual field in different angles. The current data suggest that the model-free pRF profiles contain information not only about receptive field location and size but also about the pRF shape characteristics. The elongation (ellipticity) of pRFs decreases along the early visual hierarchy to a different degree for the ventral and the dorsal stream. Furthermore, ellipticity changes as a function of eccentricity. pRF orientation shows a high degree of collinearity with its angular position within the visual field. Using model-free pRF measurements, the traveling wave technique provides additional characteristics of pRF topographies that are not restricted to size and provide robust measures within the single subject.
Affiliation(s)
- Christian Merkel
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany
- Jens-Max Hopf
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Mircea Ariel Schoenfeld
  - Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany; Kliniken Schmieder, Heidelberg, Germany
7
Abstract
Psychology moved beyond the stimulus response mapping of behaviorism by adopting an information processing framework. This shift from behavioral to cognitive science was partly inspired by work demonstrating that the concept of information could be defined and quantified (Shannon, 1948). This transition developed further from cognitive science into cognitive neuroscience, in an attempt to measure information in the brain. In the cognitive neurosciences, however, the term information is often used without a clear definition. This paper will argue that, if the formulation proposed by Shannon is applied to modern neuroimaging, then numerous results would be interpreted differently. More specifically, we argue that much modern cognitive neuroscience implicitly focuses on the question of how we can interpret the activations we record in the brain (experimenter-as-receiver), rather than on the core question of how the rest of the brain can interpret those activations (cortex-as-receiver). A clearer focus on whether activations recorded via neuroimaging can actually act as information in the brain would not only change how findings are interpreted but should also change the direction of empirical research in cognitive neuroscience.
8
Hietanen MA, Cloherty SL, Ibbotson MR. Contrast and response gain control depend on cortical map architecture. Eur J Neurosci 2015; 42:2963-73. [PMID: 26432621 DOI: 10.1111/ejn.13091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2015] [Revised: 09/23/2015] [Accepted: 09/28/2015] [Indexed: 11/29/2022]
Abstract
Visual cortical neurons are sensitive to visual stimulus contrast and most cells adapt their sensitivity to the prevailing visual environment. Specifically, they match the steepest region of their contrast response function to the prevailing contrast (contrast gain control), and reduce spike rates to limit saturation (response gain control). Most neurons are also tuned for stimulus orientation, and neurons with similar orientation preference are clustered together into iso-orientation zones arranged around pinwheels, i.e. points where all orientations are represented. Here we investigated the relationship between the contrast adaptation properties of neurons and their location relative to pinwheels in the orientation preference map. We measured orientation preference maps in cat cortex using optical intrinsic signal imaging. We then characterized the contrast adaptation properties of single neurons located close to pinwheels, in iso-orientation zones, and at regions in between. We found little evidence of differential contrast sensitivity of neurons adapted to zero contrast. However, after adaptation to their preferred orientation at high contrast, changes in both contrast and response gain were greater for neurons near pinwheels compared with other map regions. Therefore, in the adapted state, which is probably typical during natural viewing, there is a spatial map of contrast sensitivity that is associated with the orientation preference map. This differential adaptation revealed a new dimension of cortical functional organization, linking the contrast adaptation of cells with the orientation preference of their nearest neighbours.
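The distinction the abstract draws between contrast gain control and response gain control can be illustrated with a toy contrast response function. The sketch below assumes a Naka-Rushton (hyperbolic ratio) form, a common descriptive model that the abstract itself does not name; the function name `naka_rushton` and all parameter values are hypothetical and chosen only for illustration. In this form, a change in the semi-saturation contrast `c50` shifts the curve along the contrast axis (contrast gain), while a change in `r_max` scales the whole curve (response gain).

```python
import numpy as np


def naka_rushton(c, r_max, c50, n=2.0):
    """Toy contrast response: r_max * c^n / (c^n + c50^n).

    Assumed descriptive model, not taken from the cited paper.
    On a log-contrast axis the curve is steepest at c = c50.
    """
    c = np.asarray(c, dtype=float)
    return r_max * c**n / (c**n + c50**n)


# Illustrative stimulus contrasts (fractions of full contrast).
contrasts = np.array([0.05, 0.1, 0.2, 0.4, 0.8])

# Hypothetical unadapted neuron.
baseline = naka_rushton(contrasts, r_max=50.0, c50=0.1)

# Contrast gain control: c50 moves toward a prevailing contrast of 0.4,
# so the steepest part of the curve sits at the adapting contrast.
contrast_gain = naka_rushton(contrasts, r_max=50.0, c50=0.4)

# Response gain control: the maximum rate is scaled down,
# lowering spike rates at every contrast by the same factor.
response_gain = naka_rushton(contrasts, r_max=30.0, c50=0.1)

for name, curve in [("baseline", baseline),
                    ("contrast gain", contrast_gain),
                    ("response gain", response_gain)]:
    print(f"{name:>14}: {np.round(curve, 1)}")
```

In this toy parameterization the two mechanisms are separable: the response-gain curve is a constant multiple of the baseline, whereas the contrast-gain curve keeps the same ceiling and reaches half of it at a higher contrast.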
Affiliation(s)
- Markus A Hietanen
  - National Vision Research Institute, Australian College of Optometry, Cnr Cardigan and Keppel Street, Carlton, Vic., 3053, Australia; ARC Centre of Excellence for Integrative Brain Function and Department of Optometry and Vision Sciences, University of Melbourne, Parkville, Vic., Australia
- Shaun L Cloherty
  - National Vision Research Institute, Australian College of Optometry, Cnr Cardigan and Keppel Street, Carlton, Vic., 3053, Australia; ARC Centre of Excellence for Integrative Brain Function and Department of Optometry and Vision Sciences, University of Melbourne, Parkville, Vic., Australia; Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, Vic., Australia
- Michael R Ibbotson
  - National Vision Research Institute, Australian College of Optometry, Cnr Cardigan and Keppel Street, Carlton, Vic., 3053, Australia; ARC Centre of Excellence for Integrative Brain Function and Department of Optometry and Vision Sciences, University of Melbourne, Parkville, Vic., Australia
9
Sheridan P. Long-range cortical connections give rise to a robust velocity map of V1. Neural Netw 2015; 71:124-41. [PMID: 26343820 DOI: 10.1016/j.neunet.2015.08.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/01/2014] [Revised: 08/03/2015] [Accepted: 08/13/2015] [Indexed: 10/23/2022]
Abstract
This paper proposes a two-dimensional velocity model (2DVM) of the primary visual cortex (V1). The model's novel aspect is that it specifies a particular pattern of long-range cortical temporal connections, via the Connection Algorithm, and shows how the addition of these connections to well-known spatial properties of V1 transforms V1 into a velocity map. The map implies a number of organizational properties of V1: (1) the singularity of each orientation pinwheel contributes to the detection of slow-moving spots across the visual field; (2) the speed component of neuronal velocity selectivity decreases monotonically across each joint orientation contour line for parallel motion and increases monotonically for orthogonal motion; (3) the cells that are direction selective to slow-moving objects are situated in the periphery of V1; and (4) neurons in distinct pinwheels tend to be connected to neurons with similar tuning preferences in other pinwheels. The model accounts for various types of known illusionary perceptions of human vision: perceptual filling-in, illusionary orientation and visual crowding. The three distinguishing features of 2DVM are: (1) it unifies the functional properties of the conventional energy model of V1; (2) it directly relates the functional properties to the known structure of the upper layers of V1; and (3) it implies that the spatial selectivity features of V1 are side effects of its more important role as a velocity map of the visual field.
Affiliation(s)
- Phillip Sheridan
  - School of Information and Communication Technology, Griffith University, University Drive, Meadowbrook, Qld, Australia
10
Rentzeperis I, Nikolaev AR, Kiper DC, van Leeuwen C. Distributed processing of color and form in the visual cortex. Front Psychol 2014; 5:932. [PMID: 25386146 PMCID: PMC4209824 DOI: 10.3389/fpsyg.2014.00932] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 06/14/2013] [Accepted: 08/05/2014] [Indexed: 11/23/2022]
Abstract
To what extent does the visual system process color and form separately? Proponents of the segregation view claim that distinct regions of the cortex are dedicated to each of these two dimensions separately. However, evidence is accumulating that color and form processing may, at least to some extent, be intertwined in the brain. In this perspective, we review psychophysical and neurophysiological studies on color and form perception and evaluate their results in light of recent developments in population coding.
Affiliation(s)
- Ilias Rentzeperis
  - Institute of Neuroinformatics, University of Zürich and Swiss Federal Institute of Technology Zürich, Switzerland; Laboratory for Human Systems Neuroscience, RIKEN Brain Science Institute, Wako, Japan
- Andrey R Nikolaev
  - Laboratory for Perceptual Dynamics, University of Leuven, Leuven, Belgium
- Daniel C Kiper
  - Institute of Neuroinformatics, University of Zürich and Swiss Federal Institute of Technology Zürich, Switzerland
- Cees van Leeuwen
  - Laboratory for Perceptual Dynamics, University of Leuven, Leuven, Belgium
11
Wei H, Ren Y, Wang ZY. A computational neural model of orientation detection based on multiple guesses: comparison of geometrical and algebraic models. Cogn Neurodyn 2013; 7:361-79. [PMID: 24427212 PMCID: PMC3773326 DOI: 10.1007/s11571-012-9235-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/29/2012] [Revised: 11/09/2012] [Accepted: 12/14/2012] [Indexed: 02/03/2023]
Abstract
The implementation of the Hubel-Wiesel hypothesis that the orientation selectivity of a simple cell is based on an ordered arrangement of its afferent cells has some difficulties. It requires the receptive fields (RFs) of those ganglion cells (GCs) and LGN cells to be similar in size and sub-structure and arranged in a perfect order. It also requires an adequate number of regularly distributed simple cells to match ubiquitous edges. However, the anatomical and electrophysiological evidence is not strong enough to support this geometry-based model. These strict regularities also make the model very uneconomical in both evolution and neural computation. We propose a new neural model based on an algebraic method to estimate orientations. This approach synthesizes the guesses made by multiple GCs or LGN cells and calculates local orientation information subject to a group of constraints. This algebraic model need not obey the constraints of the Hubel-Wiesel hypothesis, and is easily implemented with a neural network. By using the idea of a satisfiability problem with constraints, we also prove that the precision and efficiency of this model are mathematically practicable. The proposed model clarifies several major questions that the Hubel-Wiesel model does not account for. Image-rebuilding experiments are conducted to check whether this model misses any important boundary in the visual field because of the estimation strategy. This study is significant in terms of explaining the neural mechanism of orientation detection, and finding the circuit structure and computational route in neural networks. For engineering applications, our model can be used in orientation detection and as a simulation platform for cell-to-cell communications to develop bio-inspired eye chips.
Affiliation(s)
- Hui Wei
  - Laboratory of Cognitive Model and Algorithm, School of Computer Science, Fudan University, Shanghai 200433, China
- Yuan Ren
  - Laboratory of Cognitive Model and Algorithm, School of Computer Science, Fudan University, Shanghai 200433, China
- Zi Yan Wang
  - Laboratory of Cognitive Model and Algorithm, School of Computer Science, Fudan University, Shanghai 200433, China
12
Mather G, Pavan A, Bellacosa Marotti R, Campana G, Casco C. Interactions between motion and form processing in the human visual system. Front Comput Neurosci 2013; 7:65. [PMID: 23730286 PMCID: PMC3657629 DOI: 10.3389/fncom.2013.00065] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 12/19/2012] [Accepted: 05/02/2013] [Indexed: 11/13/2022]
Abstract
The predominant view of motion and form processing in the human visual system assumes that these two attributes are handled by separate and independent modules. Motion processing involves filtering by direction-selective sensors, followed by integration to solve the aperture problem. Form processing involves filtering by orientation-selective and size-selective receptive fields, followed by integration to encode object shape. It has long been known that motion signals can influence form processing in the well-known Gestalt principle of common fate; texture elements which share a common motion property are grouped into a single contour or texture region. However, recent research in psychophysics and neuroscience indicates that the influence of form signals on motion processing is more extensive than previously thought. First, the salience and apparent direction of moving lines depends on how the local orientation and direction of motion combine to match the receptive field properties of motion-selective neurons. Second, orientation signals generated by "motion-streaks" influence motion processing; motion sensitivity, apparent direction and adaptation are affected by simultaneously present orientation signals. Third, form signals generated by human body shape influence biological motion processing, as revealed by studies using point-light motion stimuli. Thus, form-motion integration seems to occur at several different levels of cortical processing, from V1 to STS.
Affiliation(s)
- George Mather
  - School of Psychology, University of Lincoln, Lincoln, UK
13
Wagemans J, Feldman J, Gepshtein S, Kimchi R, Pomerantz JR, van der Helm PA, van Leeuwen C. A century of Gestalt psychology in visual perception: II. Conceptual and theoretical foundations. Psychol Bull 2012; 138:1218-52. [PMID: 22845750 PMCID: PMC3728284 DOI: 10.1037/a0029334] [Citation(s) in RCA: 178] [Impact Index Per Article: 14.8] [Indexed: 11/08/2022]
Abstract
Our first review article (Wagemans et al., 2012) on the occasion of the centennial anniversary of Gestalt psychology focused on perceptual grouping and figure-ground organization. It concluded that further progress requires a reconsideration of the conceptual and theoretical foundations of the Gestalt approach, which is provided here. In particular, we review contemporary formulations of holism within an information-processing framework, allowing for operational definitions (e.g., integral dimensions, emergent features, configural superiority, global precedence, primacy of holistic/configural properties) and a refined understanding of its psychological implications (e.g., at the level of attention, perception, and decision). We also review 4 lines of theoretical progress regarding the law of Prägnanz, the brain's tendency of being attracted towards states corresponding to the simplest possible organization, given the available stimulation. The first considers the brain as a complex adaptive system and explains how self-organization solves the conundrum of trading between robustness and flexibility of perceptual states. The second specifies the economy principle in terms of optimization of neural resources, showing that elementary sensors working independently to minimize uncertainty can respond optimally at the system level. The third considers how Gestalt percepts (e.g., groups, objects) are optimal given the available stimulation, with optimality specified in Bayesian terms. Fourth, structural information theory explains how a Gestaltist visual system that focuses on internal coding efficiency yields external veridicality as a side effect. To answer the fundamental question of why things look as they do, a further synthesis of these complementary perspectives is required.
Affiliation(s)
- Johan Wagemans
- University of Leuven (KU Leuven), Laboratory of Experimental Psychology, Tiensestraat 102, box 3711, BE-3000 Leuven, Belgium.
|
14
|
Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, von der Heydt R. A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization. Psychol Bull 2012; 138:1172-217. [PMID: 22845751 DOI: 10.1037/a0029333] [Citation(s) in RCA: 505] [Impact Index Per Article: 42.1] [Indexed: 11/08/2022]
Abstract
In 1912, Max Wertheimer published his paper on phi motion, widely recognized as the start of Gestalt psychology. Because of its continued relevance in modern psychology, this centennial anniversary is an excellent opportunity to take stock of what Gestalt psychology has offered and how it has changed since its inception. We first introduce the key findings and ideas in the Berlin school of Gestalt psychology, and then briefly sketch its development, rise, and fall. Next, we discuss its empirical and conceptual problems, and indicate how they are addressed in contemporary research on perceptual grouping and figure-ground organization. In particular, we review the principles of grouping, both classical (e.g., proximity, similarity, common fate, good continuation, closure, symmetry, parallelism) and new (e.g., synchrony, common region, element and uniform connectedness), and their role in contour integration and completion. We then review classic and new image-based principles of figure-ground organization, how it is influenced by past experience and attention, and how it relates to shape and depth perception. After an integrated review of the neural mechanisms involved in contour grouping, border ownership, and figure-ground perception, we conclude by evaluating what modern vision science has offered compared to traditional Gestalt psychology, whether we can speak of a Gestalt revival, and where the remaining limitations and challenges lie. A better integration of this research tradition with the rest of vision science requires further progress regarding the conceptual and theoretical foundations of the Gestalt approach, which is the focus of a second review article.
Affiliation(s)
- Johan Wagemans
- University of Leuven (KU Leuven), Laboratory of Experimental Psychology, Tiensestraat 102, Box 3711, BE-3000 Leuven, Belgium.
|
15
|
Okamoto T, Ikezoe K, Tamura H, Watanabe M, Aihara K, Fujita I. Predicted contextual modulation varies with distance from pinwheel centers in the orientation preference map. Sci Rep 2011; 1:114. [PMID: 22355631 PMCID: PMC3216596 DOI: 10.1038/srep00114] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/09/2011] [Accepted: 09/28/2011] [Indexed: 11/23/2022]
Abstract
In the primary visual cortex (V1) of some mammals, columns of neurons with the full range of orientation preferences converge at the center of a pinwheel-like arrangement, the ‘pinwheel center’ (PWC). Because a neuron receives abundant inputs from nearby neurons, the neuron's position on the cortical map likely has a significant impact on its responses to the layout of orientations inside and outside its classical receptive field (CRF). To understand the positional specificity of responses, we constructed a computational model based on orientation preference maps in monkey V1 and hypothetical neuronal connections. The model simulations showed that neurons near PWCs displayed weaker but detectable orientation selectivity within their CRFs, and strongly reduced contextual modulation from extra-CRF stimuli, than neurons distant from PWCs. We suggest that neurons near PWCs robustly extract local orientation within their CRF embedded in visual scenes, and that contextual information is processed in regions distant from PWCs.
Affiliation(s)
- Tsuyoshi Okamoto
- Faculty of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan.
|
16
|
Alexander DM, Trengove C, Sheridan PE, van Leeuwen C. Generalization of learning by synchronous waves: from perceptual organization to invariant organization. Cogn Neurodyn 2011; 5:113-32. [PMID: 22654985 PMCID: PMC3100473 DOI: 10.1007/s11571-010-9142-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 09/10/2010] [Revised: 11/09/2010] [Accepted: 11/09/2010] [Indexed: 10/18/2022]
Abstract
From a few presentations of an object, perceptual systems are able to extract invariant properties such that novel presentations are immediately recognized. This may be enabled by inferring the set of all representations equivalent under certain transformations. We implemented this principle in a neurodynamic model that stores activity patterns representing transformed versions of the same object in a distributed fashion within maps, such that translation across the map corresponds to the relevant transformation. When a pattern on the map is activated, this causes activity to spread out as a wave across the map, activating all the transformed versions represented. Computational studies illustrate the efficacy of the proposed mechanism. The model rapidly learns and successfully recognizes rotated and scaled versions of a visual representation from a few prior presentations. For topographical maps such as primary visual cortex, the mechanism simultaneously represents identity and variation of visual percepts whose features change through time.
Affiliation(s)
- David M. Alexander
- Laboratory for Perceptual Dynamics, RIKEN Brain Science Institute, Wako-shi, Saitama, Japan
- Chris Trengove
- Brain and Neural Systems Team, RIKEN Computational Science Research Program, Saitama, Japan
- Laboratory for Computational Neurophysics, RIKEN Brain Science Institute, Wako-shi, Saitama, Japan
- Phillip E. Sheridan
- School of Information and Communication Technology, Griffith University, Meadowbrook, QLD Australia
- Cees van Leeuwen
- Laboratory for Perceptual Dynamics, RIKEN Brain Science Institute, Wako-shi, Saitama, Japan
|