1. Yamane Y. Adaptation of the inferior temporal neurons and efficient visual processing. Front Behav Neurosci 2024; 18:1398874. PMID: 39132448; PMCID: PMC11310006; DOI: 10.3389/fnbeh.2024.1398874.
Abstract
Numerous studies examining the responses of individual neurons in the inferior temporal (IT) cortex have revealed characteristics such as two- or three-dimensional shape tuning and object or category selectivity. While these basic selectivities have been studied under the assumption that responses to stimuli are relatively stable, physiological experiments have revealed that the responsiveness of IT neurons also depends on visual experience. The activity changes of IT neurons occur over various time ranges; among these, repetition suppression (RS) in particular is robustly observed in IT neurons without any behavioral or task constraints. I observed a similar phenomenon in ventral visual neurons of macaque monkeys during free viewing, when they actively fixated the same object multiple times. This observation indicates that the phenomenon also occurs in natural situations in which the subject actively views stimuli without forced fixation. It suggests that RS is an everyday occurrence, widespread across regions of the visual system, and thus a default process for visual neurons. Such short-term activity modulation may be a key to understanding the visual system; however, the circuit mechanism and the biological significance of RS remain unclear. In this review, I therefore summarize the observed modulation types in IT neurons and the known properties of RS. I then discuss adaptation in vision, including concepts such as efficient and predictive coding, and the relationship between adaptation and psychophysical aftereffects. Finally, I discuss conceptual implications of this phenomenon, as well as circuit mechanisms and models that may explain adaptation as a fundamental aspect of visual processing.
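The fatigue-style account of repetition suppression can be sketched in a few lines; the depletion and recovery constants below are hypothetical, chosen only to illustrate the characteristic decaying response to repeated presentations of the same stimulus:

```python
import numpy as np

def repeated_responses(n_reps, depletion=0.6, recovery=0.3):
    """Fatigue-style adaptation: each response depletes a response gain,
    which then partially recovers before the next presentation."""
    gain, out = 1.0, []
    for _ in range(n_reps):
        out.append(gain)
        gain = gain * depletion                # use-dependent depletion
        gain = gain + recovery * (1.0 - gain)  # partial recovery toward baseline
    return np.array(out)

r = repeated_responses(6)
print(np.round(r, 3))  # monotonically decreasing responses across repetitions
```

The response declines steeply at first and then settles toward an adapted plateau, the qualitative profile reported for RS in IT neurons.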
Affiliation(s)
- Yukako Yamane
- Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
2. Fang Z, Bloem IM, Olsson C, Ma WJ, Winawer J. Normalization by orientation-tuned surround in human V1-V3. PLoS Comput Biol 2023; 19:e1011704. PMID: 38150484; PMCID: PMC10793941; DOI: 10.1371/journal.pcbi.1011704.
Abstract
An influential account of neuronal responses in primary visual cortex is the normalized energy model. This model is often implemented as a multi-stage computation. The first stage is linear filtering. The second stage is the extraction of contrast energy, whereby a complex cell computes the squared and summed outputs of a pair of linear filters in quadrature phase. The third stage is normalization, in which a local population of complex cells mutually inhibit one another. Because the population includes cells tuned to a range of orientations and spatial frequencies, the responses are effectively normalized by the local stimulus contrast. Here, using evidence from human functional MRI, we show that the classical model fails to account for the relative responses to two classes of stimuli: straight, parallel, band-passed contours (gratings) and curved, band-passed contours (snakes). The snakes elicit fMRI responses about twice as large as the gratings, yet a traditional divisive normalization model predicts responses that are about the same. Motivated by these observations and others from the literature, we implement a divisive normalization model in which cells matched in orientation tuning ("tuned normalization") preferentially inhibit each other. We first show that this model accounts for the differential responses to these two classes of stimuli. We then show that the model successfully generalizes to other band-pass textures, both in V1 and in extrastriate cortex (V2 and V3). We conclude that even in primary visual cortex, complex image features, such as the degree of heterogeneity, can have large effects on neural responses.
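The contrast between classical (untuned) and tuned normalization can be illustrated with a toy orientation-energy population; the weights, semi-saturation constant, and stimulus energies below are hypothetical, not the fitted fMRI model:

```python
import numpy as np

def normalized_response(energy, weights, sigma=0.1):
    """Divisive normalization: each unit's contrast energy is divided by a
    weighted sum of the energies in the local population."""
    pool = weights @ energy            # per-unit normalization pool
    return energy / (sigma**2 + pool)

n_orient = 8
thetas = np.linspace(0, np.pi, n_orient, endpoint=False)

# Tuned normalization: units with similar orientation preference inhibit
# each other more strongly (orientation-similarity weights).
dtheta = thetas[:, None] - thetas[None, :]
w_tuned = np.cos(2 * dtheta) ** 2
w_tuned /= w_tuned.sum(axis=1, keepdims=True)

# Untuned (classical) normalization: uniform pooling over orientations.
w_flat = np.full((n_orient, n_orient), 1.0 / n_orient)

# "Grating": energy concentrated at one orientation.
grating = np.zeros(n_orient); grating[0] = 1.0
# "Snake": the same total energy spread across orientations.
snake = np.full(n_orient, 1.0 / n_orient)

r_flat_g = normalized_response(grating, w_flat).sum()
r_flat_s = normalized_response(snake, w_flat).sum()
r_tuned_g = normalized_response(grating, w_tuned).sum()
r_tuned_s = normalized_response(snake, w_tuned).sum()
print(f"untuned snake/grating ratio = {r_flat_s / r_flat_g:.2f}")
print(f"tuned   snake/grating ratio = {r_tuned_s / r_tuned_g:.2f}")
```

With uniform pooling the two stimuli produce identical total responses, whereas tuned normalization suppresses the homogeneous grating more than the orientation-heterogeneous stimulus, qualitatively reproducing the roughly twofold snake advantage described in the abstract.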
Affiliation(s)
- Zeming Fang
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, New York, United States of America
- Ilona M. Bloem
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Catherine Olsson
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Wei Ji Ma
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Jonathan Winawer
- Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
3. Pan X, DeForge A, Schwartz O. Generalizing biological surround suppression based on center surround similarity via deep neural network models. PLoS Comput Biol 2023; 19:e1011486. PMID: 37738258; PMCID: PMC10550176; DOI: 10.1371/journal.pcbi.1011486.
Abstract
Sensory perception is dramatically influenced by context. Models of contextual neural surround effects in vision have mostly accounted for primary visual cortex (V1) data via nonlinear computations such as divisive normalization. However, surround effects are not well understood within a hierarchy, for neurons with more complex stimulus selectivity beyond V1. We used feedforward deep convolutional neural networks (CNNs) and developed a gradient-based technique to visualize the most suppressive and most excitatory surrounds. We found that deep neural networks exhibited a key signature of surround effects in V1: they highlighted center stimuli that visually stand out from the surround and suppressed responses when the surround stimulus was similar to the center. We also found that in some neurons, especially in late layers, when the center stimulus was altered, the most suppressive surround could, surprisingly, follow the change. Through this visualization approach, we generalized previous understanding of surround effects to more complex stimuli, in ways that have not yet been revealed in visual cortices. In contrast, suppression based on center-surround similarity was not observed in an untrained network. We also identified further matches and mismatches between the feedforward CNNs and the biology. Our results provide a testable hypothesis about surround effects in higher visual cortices, and the visualization approach could be adopted in future biological experimental designs.
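The gradient-based visualization idea can be sketched on a toy unit whose surround suppression grows with center-surround similarity; the response function, finite-difference gradients, and optimizer settings are illustrative assumptions, not the paper's CNN procedure:

```python
import numpy as np

def response(center, surround):
    """Toy center-surround unit: the surround suppresses divisively, and
    the suppression grows with center-surround similarity."""
    drive = center @ center
    similarity = center @ surround
    return drive / (1.0 + similarity ** 2)

def most_suppressive_surround(center, lr=0.1, steps=200):
    """Gradient-based search (finite differences) for the unit-norm
    surround pattern that minimizes the unit's response."""
    s = np.ones(center.size)
    s /= np.linalg.norm(s)
    eps = 1e-5
    for _ in range(steps):
        g = np.zeros_like(s)
        for i in range(s.size):
            d = np.zeros_like(s)
            d[i] = eps
            g[i] = (response(center, s + d) - response(center, s - d)) / (2 * eps)
        s = s - lr * g              # descend on the response
        s /= np.linalg.norm(s)      # keep the surround "contrast" fixed
    return s

center = np.array([1.0, 0.0, 0.0, 0.0])
s_star = most_suppressive_surround(center)
cos_sim = abs(center @ s_star) / np.linalg.norm(center)
print(f"|cos(center, most suppressive surround)| = {cos_sim:.2f}")
```

The search converges to a surround aligned with the center pattern, i.e., for this unit the most suppressive surround is the one most similar to the center, the signature the paper probes in trained CNNs.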
Affiliation(s)
- Xu Pan
- Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
- Annie DeForge
- School of Information, University of California, Berkeley, CA, United States of America
- Bentley University, Waltham, MA, United States of America
- Odelia Schwartz
- Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
4. Launay C, Vacher J, Coen-Cagli R. Unsupervised video segmentation algorithms based on flexibly regularized mixture models. Proc Int Conf Image Process 2022; 2022:4073-4077. PMID: 36404988; PMCID: PMC9670685; DOI: 10.1109/icip46576.2022.9897691.
Abstract
We propose a family of probabilistic segmentation algorithms for videos that rely on a generative model capturing static and dynamic natural image statistics. Our framework adopts flexibly regularized mixture models (FlexMM) [1], an efficient method to combine mixture distributions across different data sources. FlexMMs of Student-t distributions successfully segment static natural images through uncertainty-based information sharing between hidden layers of CNNs. We extend this approach to videos and exploit FlexMM to propagate segment labels across space and time. We show that temporal propagation improves the temporal consistency of segmentation, qualitatively reproducing a key aspect of human perceptual grouping. In addition, Student-t distributions can capture the statistics of optical flow in natural movies, which represents apparent motion in the video. Integrating these motion cues into our temporal FlexMM further enhances the segmentation of each frame of natural movies. Our probabilistic dynamic segmentation algorithms thus provide a new framework to study uncertainty in human dynamic perceptual segmentation.
Affiliation(s)
- Claire Launay
- Dept. of Systems & Comp. Biology, AECOM, Bronx, NY, USA
- Jonathan Vacher
- Laboratoire des Systèmes Perceptifs, DEC, ENS, PSL University, CNRS, Paris, France
- Ruben Coen-Cagli
- Dept. of Systems & Comp. Biology, AECOM, Bronx, NY, USA
- Dominick P. Purpura Dept. of Neuroscience, AECOM, Bronx, NY, USA
- Dept. of Ophthalmology & Visual Sciences, AECOM, Bronx, NY, USA
5. Li Y, Wang T, Yang Y, Dai W, Wu Y, Li L, Han C, Zhong L, Li L, Wang G, Dou F, Xing D. Cascaded normalizations for spatial integration in the primary visual cortex of primates. Cell Rep 2022; 40:111221. PMID: 35977486; DOI: 10.1016/j.celrep.2022.111221.
Abstract
Spatial integration of visual information is an important function in the brain. However, the neural computation underlying spatial integration in the visual cortex remains unclear. In this study, we recorded laminar responses in V1 of awake monkeys driven by grating patches and annuli of different sizes. We find three response properties related to spatial integration that differ significantly between input and output layers: neurons in output layers have stronger surround suppression, smaller receptive fields (RFs), and higher sensitivity to grating annuli partially covering their RFs. These interlaminar differences can be explained by a descriptive model composed of two global divisions (normalizations) and a local subtraction. Our results suggest that suppression through cascaded normalizations (CNs) is essential for spatial integration and laminar processing in the visual cortex. Interestingly, the features of spatial integration in convolutional neural networks, especially in their lower layers, differ from our findings in V1.
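A minimal sketch of how two cascaded divisive stages can produce size tuning with surround suppression; the local subtraction is omitted for brevity, and all pool sizes and weights below are hypothetical rather than the paper's fitted parameters:

```python
import numpy as np

sizes = np.linspace(0.1, 8.0, 80)   # stimulus diameter (arbitrary units)

def pooled(sizes, sigma):
    """Drive pooled from a Gaussian field out to the stimulus size."""
    return 1.0 - np.exp(-(sizes / sigma) ** 2)

# A small excitatory center and two progressively larger normalization
# pools: the "cascade" of divisive stages.
exc = pooled(sizes, 0.8)
norm1 = pooled(sizes, 2.0)   # first, mid-range divisive pool
norm2 = pooled(sizes, 5.0)   # second, far-surround divisive pool

stage1 = exc / (1.0 + 1.5 * norm1)      # first normalization
stage2 = stage1 / (1.0 + 1.0 * norm2)   # cascaded second normalization

peak = sizes[np.argmax(stage2)]
suppression = 1.0 - stage2[-1] / stage2.max()
print(f"preferred size = {peak:.1f}, surround suppression = {suppression:.0%}")
```

The cascaded output peaks at an intermediate stimulus size and is strongly suppressed for large stimuli; making the divisive weights stronger mimics the deeper suppression the paper reports in output layers.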
Affiliation(s)
- Yang Li
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Tian Wang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China; College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Yi Yang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Weifeng Dai
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Yujie Wu
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Lianfeng Li
- China Academy of Launch Vehicle Technology, Beijing 100076, China
- Chuanliang Han
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Lvyan Zhong
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Liang Li
- Beijing Institute of Basic Medical Sciences, Beijing 100005, China
- Gang Wang
- Beijing Institute of Basic Medical Sciences, Beijing 100005, China
- Fei Dou
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China; College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Dajun Xing
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
6. Price BH, Gavornik JP. Efficient temporal coding in the early visual system: existing evidence and future directions. Front Comput Neurosci 2022; 16:929348. PMID: 35874317; PMCID: PMC9298461; DOI: 10.3389/fncom.2022.929348.
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
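The link between temporal prediction and efficient coding can be illustrated with a first-order autoregressive signal: transmitting prediction errors instead of raw values strips out the temporally redundant, predictable component, and for Gaussian signals the variance reduction bounds the saved channel capacity. The process parameters below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# A temporally redundant "natural" input: first-order autoregressive signal.
T, rho = 20000, 0.95
noise = rng.standard_normal(T)
x = np.empty(T)
x[0] = noise[0]
for t in range(1, T):
    x[t] = rho * x[t - 1] + noise[t]

# Predictive (efficient) code: transmit only the prediction error.
error = x[1:] - rho * x[:-1]

# For Gaussian signals, differential entropy grows with log-variance,
# so this ratio quantifies the removed temporal redundancy.
var_ratio = error.var() / x.var()
print(f"residual variance / raw variance = {var_ratio:.3f}")
```

The prediction errors are also nearly decorrelated in time, the whitening signature that efficient-coding accounts of retinal and cortical temporal processing emphasize.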
Affiliation(s)
- Jeffrey P. Gavornik
- Center for Systems Neuroscience, Graduate Program in Neuroscience, Department of Biology, Boston University, Boston, MA, United States
7. Vacher J, Launay C, Coen-Cagli R. Flexibly regularized mixture models and application to image segmentation. Neural Netw 2022; 149:107-123. PMID: 35228148; PMCID: PMC8944213; DOI: 10.1016/j.neunet.2022.02.010.
Abstract
Probabilistic finite mixture models are widely used for unsupervised clustering. These models can often be improved by adapting them to the topology of the data. For instance, in order to classify spatially adjacent data points similarly, it is common to introduce a Laplacian constraint on the posterior probability that each data point belongs to a class. Alternatively, the mixing probabilities can be treated as free parameters, while assuming Gauss-Markov or more complex priors to regularize them. However, these approaches are constrained by the shape of the prior and often lead to complicated or intractable inference. Here, we propose a new parametrization of the Dirichlet distribution to flexibly regularize the mixing probabilities of over-parametrized mixture distributions. Using the Expectation-Maximization algorithm, we show that our approach allows us to define any linear update rule for the mixing probabilities, including spatial smoothing regularization as a special case. We then show that this flexible design can be extended to share class information between multiple mixture models. We apply our algorithm to artificial and natural image segmentation tasks, and we provide a quantitative and qualitative comparison of the performance of Gaussian and Student-t mixtures on the Berkeley Segmentation Dataset. We also demonstrate how to propagate class information across the layers of deep convolutional neural networks in a probabilistically optimal way, suggesting a new interpretation of feedback signals in biological visual systems. Our flexible approach can easily be generalized to adapt probabilistic mixture models to arbitrary data topologies.
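A minimal sketch of the idea on a 1-D two-class segmentation problem, using a simple neighborhood average as the linear update rule for the mixing probabilities. All parameters are illustrative, and this is not the paper's FlexMM parametrization, only the special case of spatial smoothing inside EM:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 1-D "image": two spatial segments with different class means.
n = 200
labels_true = (np.arange(n) >= n // 2).astype(int)
x = np.where(labels_true == 0, -1.0, 1.0) + rng.standard_normal(n)

mu = np.array([-0.5, 0.5])     # class means, roughly initialized
pi = np.full((n, 2), 0.5)      # per-pixel mixing probabilities

def smooth(p):
    """Linear spatial update: average each pixel's mixing probabilities
    with those of its neighbors (one choice of linear update rule)."""
    q = (np.roll(p, 1, axis=0) + p + np.roll(p, -1, axis=0)) / 3.0
    return q / q.sum(axis=1, keepdims=True)

for _ in range(30):            # EM iterations
    # E-step: posterior responsibilities under Gaussian likelihoods.
    lik = np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2)
    post = pi * lik
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update the means; spatially regularize the mixing probabilities.
    mu = (post * x[:, None]).sum(axis=0) / post.sum(axis=0)
    pi = smooth(post)

labels_hat = pi.argmax(axis=1)
acc = max((labels_hat == labels_true).mean(), (labels_hat != labels_true).mean())
print(f"segmentation accuracy = {acc:.2f}")
```

The spatial smoothing of the mixing probabilities suppresses isolated pixel-wise label errors that a plain mixture model would make, which is the qualitative benefit the regularized parametrization provides.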
Affiliation(s)
- Jonathan Vacher
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, PSL University, 24 rue Lhomond, Bâtiment Jaurès, 2ème étage, Paris, 75005, France
- Claire Launay
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA
- Ruben Coen-Cagli
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Department of Ophthalmology & Visual Sciences, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA
8. Shen Y, Wang J, Navlakha S. A correspondence between normalization strategies in artificial and biological neural networks. Neural Comput 2021; 33:3179-3203. PMID: 34474484; PMCID: PMC8662716; DOI: 10.1162/neco_a_01439.
Abstract
A fundamental challenge at the interface of machine learning and neuroscience is to uncover computational principles that are shared between artificial and biological neural networks. In deep learning, normalization methods such as batch normalization, weight normalization, and their many variants help to stabilize hidden-unit activity and accelerate network training, and they have been called one of the most important recent innovations for optimizing deep networks. In the brain, homeostatic plasticity represents a set of mechanisms that also stabilize and normalize network activity to lie within certain ranges, and these mechanisms are critical for maintaining normal brain function. In this article, we discuss parallels between artificial and biological normalization methods at four spatial scales: normalization of a single neuron's activity, normalization of a neuron's synaptic weights, normalization of a layer of neurons, and normalization of a network of neurons. We argue that the two types of methods are functionally equivalent: both push activation patterns of hidden units toward a homeostatic state in which all neurons are used equally, and such representations can improve coding capacity, discrimination, and regularization. As a proof of concept, we develop an algorithm inspired by a neural normalization technique called synaptic scaling and show that it performs competitively against existing normalization methods on several data sets. Overall, we hope this bidirectional connection will inspire neuroscientists and machine learning researchers in three ways: to uncover new normalization algorithms based on established neurobiological principles; to help quantify the trade-offs of different homeostatic plasticity mechanisms used in the brain; and to offer insights about how stability may not hinder, but may actually promote, plasticity.
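A toy version of the synaptic-scaling idea: each unit's incoming weights are rescaled multiplicatively until its mean activity reaches a homeostatic set point. The layer sizes, scaling rate, and target below are arbitrary assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

# A layer of rectified linear units with heterogeneous weight scales.
n_in, n_out, batch = 50, 20, 256
W = rng.standard_normal((n_out, n_in)) * rng.uniform(0.1, 3.0, (n_out, 1))
X = np.abs(rng.standard_normal((batch, n_in)))   # nonnegative inputs

target = 1.0   # homeostatic set point for mean activity (assumed)

def mean_activity(W, X):
    return np.maximum(X @ W.T, 0.0).mean(axis=0)

spread_before = mean_activity(W, X).std()

# Synaptic scaling: slowly rescale each unit's incoming weights,
# multiplicatively, toward the target rate.
for _ in range(50):
    a = mean_activity(W, X)
    W *= ((target / (a + 1e-8)) ** 0.1)[:, None]

after = mean_activity(W, X)
print(f"activity spread across units: {spread_before:.3f} -> {after.std():.3f}")
```

Because the rectified response is positively homogeneous in the weight scale, the slow multiplicative correction converges geometrically, equalizing how strongly the units are used, which is the homeostatic state the article argues both biological and artificial normalization push toward.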
Affiliation(s)
- Yang Shen
- Cold Spring Harbor Laboratory, Simons Center for Quantitative Biology, Cold Spring Harbor, NY 11724, U.S.A.
- Julia Wang
- Cold Spring Harbor Laboratory, Simons Center for Quantitative Biology, Cold Spring Harbor, NY 11724, U.S.A.
- Saket Navlakha
- Cold Spring Harbor Laboratory, Simons Center for Quantitative Biology, Cold Spring Harbor, NY 11724, U.S.A.
9. Paiton DM, Frye CG, Lundquist SY, Bowen JD, Zarcone R, Olshausen BA. Selectivity and robustness of sparse coding networks. J Vis 2020; 20:10. PMID: 33237290; PMCID: PMC7691792; DOI: 10.1167/jov.20.12.10.
Abstract
We investigate how the population nonlinearities resulting from lateral inhibition and thresholding in sparse coding networks influence neural response selectivity and robustness. We show that when compared to pointwise nonlinear models, such population nonlinearities improve the selectivity to a preferred stimulus and protect against adversarial perturbations of the input. These findings are predicted from the geometry of the single-neuron iso-response surface, which provides new insight into the relationship between selectivity and adversarial robustness. Inhibitory lateral connections curve the iso-response surface outward in the direction of selectivity. Since adversarial perturbations are orthogonal to the iso-response surface, adversarial attacks tend to be aligned with directions of selectivity. Consequently, the network is less easily fooled by perceptually irrelevant perturbations to the input. Together, these findings point to benefits of integrating computational principles found in biological vision systems into artificial neural networks.
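The effect of the lateral-inhibition population nonlinearity on selectivity can be illustrated by comparing sparse coding (here solved with ISTA, one standard solver, not necessarily the authors' network dynamics) against a matched pointwise-thresholding model on two correlated dictionary elements; the dictionary and sparsity penalty are illustrative assumptions:

```python
import numpy as np

def ista(D, x, lam=0.2, steps=200):
    """Sparse coding via ISTA: the D.T @ D coupling implements lateral
    inhibition between coefficients (a population nonlinearity)."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        a = a + (D.T @ (x - D @ a)) / L     # gradient step on reconstruction
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)   # soft threshold
    return a

def pointwise(D, x, lam=0.2):
    """Matched feedforward model: thresholded projections, no inhibition."""
    a = D.T @ x
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Two correlated dictionary elements (e.g., similar orientations).
d1 = np.array([1.0, 0.2]); d1 /= np.linalg.norm(d1)
d2 = np.array([0.2, 1.0]); d2 /= np.linalg.norm(d2)
D = np.column_stack([d1, d2])

pref = d1                                   # neuron 0's preferred stimulus
mix = (d1 + d2) / np.linalg.norm(d1 + d2)   # a non-preferred mixture

# Selectivity of neuron 0: response to preferred vs. mixed stimulus.
sel_sparse = ista(D, pref)[0] / ista(D, mix)[0]
sel_point = pointwise(D, pref)[0] / pointwise(D, mix)[0]
print(f"selectivity: sparse = {sel_sparse:.2f}, pointwise = {sel_point:.2f}")
```

The inhibitory coupling suppresses neuron 0's response to the mixture (its rival coefficient explains part of the input), so the sparse-coding unit is more selective for its preferred stimulus than the pointwise model, consistent with the population-nonlinearity account above.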
Affiliation(s)
- Dylan M Paiton
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA; Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA
- Charles G Frye
- Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
- Sheng Y Lundquist
- Department of Computer Science, Portland State University, Portland, OR, USA
- Joel D Bowen
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA
- Ryan Zarcone
- Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Biophysics, University of California Berkeley, Berkeley, CA, USA
- Bruno A Olshausen
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA; Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
10. Bertalmío M, Gomez-Villa A, Martín A, Vazquez-Corral J, Kane D, Malo J. Evidence for the intrinsically nonlinear nature of receptive fields in vision. Sci Rep 2020; 10:16277. PMID: 33004868; PMCID: PMC7530701; DOI: 10.1038/s41598-020-73113-0.
Abstract
The responses of visual neurons, and visual perception phenomena in general, are highly nonlinear functions of the visual input, yet most vision models are grounded in the notion of a linear receptive field (RF). The linear RF has a number of inherent problems: it changes with the input, it presupposes a set of basis functions for the visual system, and it conflicts with recent studies on dendritic computations. Here we propose to model the RF in a nonlinear manner, introducing the intrinsically nonlinear receptive field (INRF). Apart from being more physiologically plausible and embodying the efficient-representation principle, the INRF has a key property with wide-ranging implications: for several vision science phenomena in which a linear RF must vary with the input in order to predict responses, the INRF can remain constant across different stimuli. We also prove that artificial neural networks with INRF modules in place of linear filters achieve remarkably improved performance and better emulate basic human perception. Our results suggest a change of paradigm for vision science as well as for artificial intelligence.
Affiliation(s)
- David Kane
- Universitat Pompeu Fabra, Barcelona, Spain
- Jesús Malo
- Universitat de Valencia, Valencia, Spain