1
Bertalmío M, Durán Vizcaíno A, Malo J, Wichmann FA. Plaid masking explained with input-dependent dendritic nonlinearities. Sci Rep 2024; 14:24856. [PMID: 39438555 PMCID: PMC11496684 DOI: 10.1038/s41598-024-75471-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 10/07/2024] [Indexed: 10/25/2024]
Abstract
A serious obstacle for understanding early spatial vision comes from the failure of the so-called standard model (SM) to predict the perception of plaid masking. But the SM originated from a major oversimplification of single neuron computations, ignoring fundamental properties of dendrites. Here we show that a spatial vision model including computations mimicking the input-dependent nature of dendritic nonlinearities, i.e. including nonlinear neural summation, has the potential to explain plaid masking data.
Affiliation(s)
- Jesús Malo
  - Universitat de València, València, Spain
2
Ni L, Burge J. Feature-specific divisive normalization improves natural image encoding for depth perception. bioRxiv 2024:2024.09.05.611536. [PMID: 39345647 PMCID: PMC11429615 DOI: 10.1101/2024.09.05.611536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Vision science and visual neuroscience seek to understand how stimulus and sensor properties limit the precision with which behaviorally-relevant latent variables are encoded and decoded. In the primate visual system, binocular disparity, the canonical cue for stereo-depth perception, is initially encoded by a set of binocular receptive fields with a range of spatial frequency preferences. Here, with a stereo-image database having ground-truth disparity information at each pixel, we examine how response normalization and receptive field properties determine the fidelity with which binocular disparity is encoded in natural scenes. We quantify encoding fidelity by computing the Fisher information carried by the normalized receptive field responses. Several findings emerge from an analysis of the response statistics. First, broadband (or feature-unspecific) normalization yields Laplace-distributed receptive field responses, and narrowband (or feature-specific) normalization yields Gaussian-distributed receptive field responses. Second, the Fisher information in narrowband-normalized responses is larger than in broadband-normalized responses by a scale factor that grows with population size. Third, the most useful spatial frequency decreases with stimulus size and the range of spatial frequencies that is useful for encoding a given disparity decreases with disparity magnitude, consistent with neurophysiological findings. Fourth, the predicted patterns of psychophysical performance, and absolute detection threshold, match human performance with natural and artificial stimuli. The current computational efforts establish a new functional role for response normalization, and bring us closer to understanding the principles that should govern the design of neural systems that support perception in natural scenes.
Affiliation(s)
- Long Ni
  - Department of Psychology, University of Pennsylvania, Pennsylvania PA
- Johannes Burge
  - Department of Psychology, University of Pennsylvania, Pennsylvania PA
  - Neuroscience Graduate Group, University of Pennsylvania, Pennsylvania PA
  - Bioengineering Graduate Group, University of Pennsylvania, Pennsylvania PA
3
Salisbury JM, Palmer SE. A dynamic scale-mixture model of motion in natural scenes. bioRxiv 2024:2023.10.19.563101. [PMID: 37961311 PMCID: PMC10634686 DOI: 10.1101/2023.10.19.563101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Some of the most important tasks of visual and motor systems involve estimating the motion of objects and tracking them over time. Such systems evolved to meet the behavioral needs of the organism in its natural environment, and may therefore be adapted to the statistics of motion it is likely to encounter. By tracking the movement of individual points in movies of natural scenes, we begin to identify common properties of natural motion across scenes. As expected, objects in natural scenes move in a persistent fashion, with velocity correlations lasting hundreds of milliseconds. More subtly, but crucially, we find that the observed velocity distributions are heavy-tailed and can be modeled as a Gaussian scale-mixture. Extending this model to the time domain leads to a dynamic scale-mixture model, consisting of a Gaussian process multiplied by a positive scalar quantity with its own independent dynamics. Dynamic scaling of velocity arises naturally as a consequence of changes in object distance from the observer, and may approximate the effects of changes in other parameters governing the motion in a given scene. This modeling and estimation framework has implications for the neurobiology of sensory and motor systems, which need to cope with these fluctuations in scale in order to represent motion efficiently and drive fast and accurate tracking behavior.
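As a rough illustration of the dynamic scale-mixture idea described in this abstract, the sketch below multiplies a temporally correlated Gaussian process by a positive scale that has its own, slower dynamics. It is not the authors' model or fitting procedure: the AR(1) processes, the log-normal scale, and every parameter value (`tau_v`, `tau_s`, `sigma_log_s`) are assumptions chosen only to show how a fluctuating scale produces heavy-tailed velocities.

```python
import numpy as np


def ar1(n, tau, rng):
    """Unit-variance AR(1) process with correlation time `tau` (in samples)."""
    a = np.exp(-1.0 / tau)
    noise = rng.standard_normal(n) * np.sqrt(1.0 - a**2)
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x


def dynamic_scale_mixture(n, tau_v=20.0, tau_s=200.0, sigma_log_s=0.8, seed=0):
    """Return (velocity, gaussian_part, scale) for a toy dynamic scale mixture.

    All parameters are hypothetical: tau_v and tau_s are the correlation
    times of the Gaussian part and of the log-scale, and sigma_log_s is the
    standard deviation of the log-scale.
    """
    rng = np.random.default_rng(seed)
    g = ar1(n, tau_v, rng)                         # persistent Gaussian process
    s = np.exp(sigma_log_s * ar1(n, tau_s, rng))   # positive scale, independent dynamics
    return s * g, g, s


def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for Gaussian data, positive for heavy tails."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0


v, g, s = dynamic_scale_mixture(200_000)
print("excess kurtosis, Gaussian part:", excess_kurtosis(g))
print("excess kurtosis, scale mixture:", excess_kurtosis(v))
```

The Gaussian part alone has excess kurtosis near zero, while the product with the fluctuating scale is clearly heavy-tailed, which is the qualitative signature the abstract attributes to natural motion.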
4
Peiso JR, Palmer SE, Shevell SK. Perceptual Resolution of Ambiguity: Can Tuned, Divisive Normalization Account for both Interocular Similarity Grouping and Difference Enhancement. bioRxiv 2024:2024.04.01.587646. [PMID: 38617235 PMCID: PMC11014560 DOI: 10.1101/2024.04.01.587646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Our visual system usually provides a unique and functional representation of the external world. At times, however, the visual system has more than one compelling interpretation of the same retinal stimulus; in this case, neural populations compete for perceptual dominance to resolve ambiguity. Spatial and temporal context can guide perceptual experience. Recent evidence shows that ambiguous retinal stimuli are sometimes resolved by enhancing either similarity or differences among multiple percepts. Divisive normalization is a canonical neural computation that enables context-dependent sensory processing by attenuating a neuron's response by other neurons. Experiments here show that divisive normalization can account for perceptual representations of either similarity enhancement (so-called grouping) or difference enhancement, offering a unified framework for opposite perceptual outcomes.
Affiliation(s)
- Jaelyn R Peiso
  - University of Chicago, Department of Psychology, Physics Frontier Center for Living Systems, Chicago, IL
- Stephanie E Palmer
  - University of Chicago, Department of Organismal Biology & Anatomy, Department of Physics, Physics Frontier Center for Living Systems Chicago, IL
5
Kim T, Pasupathy A. Neural Correlates of Crowding in Macaque Area V4. J Neurosci 2024; 44:e2260232024. [PMID: 38670806 PMCID: PMC11170949 DOI: 10.1523/jneurosci.2260-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 03/29/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024]
Abstract
Visual crowding refers to the phenomenon where a target object that is easily identifiable in isolation becomes difficult to recognize when surrounded by other stimuli (distractors). Many psychophysical studies have investigated this phenomenon and proposed alternative models for the underlying mechanisms. One prominent hypothesis, albeit with mixed psychophysical support, posits that crowding arises from the loss of information due to pooled encoding of features from target and distractor stimuli in the early stages of cortical visual processing. However, neurophysiological studies have not rigorously tested this hypothesis. We studied the responses of single neurons in macaque (one male, one female) area V4, an intermediate stage of the object-processing pathway, to parametrically designed crowded displays and texture statistics-matched metameric counterparts. Our investigations reveal striking parallels between how crowding parameters (number, distance, and position of distractors) influence human psychophysical performance and V4 shape selectivity. Importantly, we also found that enhancing the salience of a target stimulus could alleviate crowding effects in highly cluttered scenes, and this could be temporally protracted reflecting a dynamical process. Thus, a pooled encoding of nearby stimuli cannot explain the observed responses, and we propose an alternative model where V4 neurons preferentially encode salient stimuli in crowded displays. Overall, we conclude that the magnitude of crowding effects is determined not just by the number of distractors and target-distractor separation but also by the relative salience of targets versus distractors based on their feature attributes: the similarity of distractors and the contrast between target and distractor stimuli.
Affiliation(s)
- Taekjun Kim
  - Department of Biological Structure, University of Washington, Seattle, Washington 98195
  - Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
- Anitha Pasupathy
  - Department of Biological Structure, University of Washington, Seattle, Washington 98195
  - Washington National Primate Research Center, University of Washington, Seattle, Washington 98195
6
Goris RLT, Coen-Cagli R, Miller KD, Priebe NJ, Lengyel M. Response sub-additivity and variability quenching in visual cortex. Nat Rev Neurosci 2024; 25:237-252. [PMID: 38374462 PMCID: PMC11444047 DOI: 10.1038/s41583-024-00795-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/24/2024] [Indexed: 02/21/2024]
Abstract
Sub-additivity and variability are ubiquitous response motifs in the primary visual cortex (V1). Response sub-additivity enables the construction of useful interpretations of the visual environment, whereas response variability indicates the factors that limit the precision with which the brain can do this. There is increasing evidence that experimental manipulations that elicit response sub-additivity often also quench response variability. Here, we provide an overview of these phenomena and suggest that they may have common origins. We discuss empirical findings and recent model-based insights into the functional operations, computational objectives and circuit mechanisms underlying V1 activity. These different modelling approaches all predict that response sub-additivity and variability quenching often co-occur. The phenomenology of these two response motifs, as well as many of the insights obtained about them in V1, generalize to other cortical areas. Thus, the connection between response sub-additivity and variability quenching may be a canonical motif across the cortex.
Affiliation(s)
- Robbe L T Goris
  - Center for Perceptual Systems, University of Texas at Austin, Austin, TX, USA
- Ruben Coen-Cagli
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
  - Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
  - Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
- Kenneth D Miller
  - Center for Theoretical Neuroscience, Columbia University, New York, NY, USA
  - Kavli Institute for Brain Science, Columbia University, New York, NY, USA
  - Dept. of Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA
  - Morton B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
  - Swartz Program in Theoretical Neuroscience, Columbia University, New York, NY, USA
- Nicholas J Priebe
  - Center for Learning and Memory, University of Texas at Austin, Austin, TX, USA
- Máté Lengyel
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK
  - Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
7
Fang Z, Bloem IM, Olsson C, Ma WJ, Winawer J. Normalization by orientation-tuned surround in human V1-V3. PLoS Comput Biol 2023; 19:e1011704. [PMID: 38150484 PMCID: PMC10793941 DOI: 10.1371/journal.pcbi.1011704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 01/17/2024] [Accepted: 11/20/2023] [Indexed: 12/29/2023]
Abstract
An influential account of neuronal responses in primary visual cortex is the normalized energy model. This model is often implemented as a multi-stage computation. The first stage is linear filtering. The second stage is the extraction of contrast energy, whereby a complex cell computes the squared and summed outputs of a pair of the linear filters in quadrature phase. The third stage is normalization, in which a local population of complex cells mutually inhibit one another. Because the population includes cells tuned to a range of orientations and spatial frequencies, the result is that the responses are effectively normalized by the local stimulus contrast. Here, using evidence from human functional MRI, we show that the classical model fails to account for the relative responses to two classes of stimuli: straight, parallel, band-passed contours (gratings), and curved, band-passed contours (snakes). The snakes elicit fMRI responses that are about twice as large as the gratings, yet a traditional divisive normalization model predicts responses that are about the same. Motivated by these observations and others from the literature, we implement a divisive normalization model in which cells matched in orientation tuning ("tuned normalization") preferentially inhibit each other. We first show that this model accounts for differential responses to these two classes of stimuli. We then show that the model successfully generalizes to other band-pass textures, both in V1 and in extrastriate cortex (V2 and V3). We conclude that even in primary visual cortex, complex features of images such as the degree of heterogeneity, can have large effects on neural responses.
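The three stages named in this abstract (linear filtering, contrast energy from a quadrature pair, divisive normalization) can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the Gabor parameters, the single-location readout, the von Mises-style weight function in `tuned_weights`, and the use of a plaid as a stand-in for an orientation-heterogeneous stimulus are all assumptions made here for brevity.

```python
import numpy as np


def coords(size):
    """Pixel coordinates centred on zero; `size` is assumed to be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    return x, y


def grating(size, theta, freq, amplitude=1.0):
    """Cosine grating with orientation `theta` (radians) and spatial frequency `freq`."""
    x, y = coords(size)
    return amplitude * np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))


def contrast_energy(image, thetas, freq, env_sd=6.0):
    """Stages 1 and 2: quadrature Gabor filtering, then squared and summed outputs."""
    x, y = coords(image.shape[0])
    envelope = np.exp(-(x**2 + y**2) / (2 * env_sd**2))
    energy = []
    for theta in thetas:
        u = x * np.cos(theta) + y * np.sin(theta)
        even = np.sum(image * envelope * np.cos(2 * np.pi * freq * u))
        odd = np.sum(image * envelope * np.sin(2 * np.pi * freq * u))
        energy.append(even**2 + odd**2)
    return np.array(energy)


def tuned_weights(thetas, kappa=2.0):
    """Hypothetical pooling weights that favour channels with matched orientation."""
    diff = thetas[:, None] - thetas[None, :]
    w = np.exp(kappa * np.cos(2 * diff))  # orientation is periodic over pi
    return w / w.sum(axis=1, keepdims=True)


def normalize(energy, weights, sigma2=1.0):
    """Stage 3: divide each channel by a weighted pool of all channels."""
    return energy / (sigma2 + weights @ energy)


size, freq = 33, 0.1
thetas = np.arange(8) * np.pi / 8
uniform = np.full((8, 8), 1 / 8)   # untuned pool: every channel weighted equally
tuned = tuned_weights(thetas)      # tuned pool: matched orientations dominate

single = grating(size, 0.0, freq)
plaid = (grating(size, 0.0, freq) + grating(size, np.pi / 2, freq)) / np.sqrt(2)

for name, image in [("grating", single), ("plaid", plaid)]:
    e = contrast_energy(image, thetas, freq)
    print(name,
          "untuned:", normalize(e, uniform).sum(),
          "tuned:", normalize(e, tuned).sum())
```

With equal total contrast energy, the untuned pool gives the two stimuli nearly the same summed response, whereas the tuned pool suppresses the single-orientation grating more strongly. That is the qualitative pattern the abstract describes for gratings versus stimuli with more heterogeneous orientation content.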
Affiliation(s)
- Zeming Fang
  - Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
  - Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, New York, United States of America
- Ilona M. Bloem
  - Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Catherine Olsson
  - Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Wei Ji Ma
  - Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
- Jonathan Winawer
  - Department of Psychology and Center for Neural Science, New York University, New York City, New York, United States of America
8
Weiss O, Bounds HA, Adesnik H, Coen-Cagli R. Modeling the diverse effects of divisive normalization on noise correlations. PLoS Comput Biol 2023; 19:e1011667. [PMID: 38033166 PMCID: PMC10715670 DOI: 10.1371/journal.pcbi.1011667] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/12/2023] [Revised: 12/12/2023] [Accepted: 11/07/2023] [Indexed: 12/02/2023]
Abstract
Divisive normalization, a prominent descriptive model of neural activity, is employed by theories of neural coding across many different brain areas. Yet, the relationship between normalization and the statistics of neural responses beyond single neurons remains largely unexplored. Here we focus on noise correlations, a widely studied pairwise statistic, because its stimulus and state dependence plays a central role in neural coding. Existing models of covariability typically ignore normalization despite empirical evidence suggesting it affects correlation structure in neural populations. We therefore propose a pairwise stochastic divisive normalization model that accounts for the effects of normalization and other factors on covariability. We first show that normalization modulates noise correlations in qualitatively different ways depending on whether normalization is shared between neurons, and we discuss how to infer when normalization signals are shared. We then apply our model to calcium imaging data from mouse primary visual cortex (V1), and find that it accurately fits the data, often outperforming a popular alternative model of correlations. Our analysis indicates that normalization signals are often shared between V1 neurons in this dataset. Our model will enable quantifying the relation between normalization and covariability in a broad range of neural systems, which could provide new constraints on circuit mechanisms of normalization and their role in information transmission and representation.
Affiliation(s)
- Oren Weiss
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Hayley A. Bounds
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Hillel Adesnik
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, United States of America
- Ruben Coen-Cagli
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
  - Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America
  - Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York, United States of America
9
Pan X, DeForge A, Schwartz O. Generalizing biological surround suppression based on center surround similarity via deep neural network models. PLoS Comput Biol 2023; 19:e1011486. [PMID: 37738258 PMCID: PMC10550176 DOI: 10.1371/journal.pcbi.1011486] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/08/2023] [Revised: 10/04/2023] [Accepted: 09/04/2023] [Indexed: 09/24/2023]
Abstract
Sensory perception is dramatically influenced by the context. Models of contextual neural surround effects in vision have mostly accounted for Primary Visual Cortex (V1) data, via nonlinear computations such as divisive normalization. However, surround effects are not well understood within a hierarchy, for neurons with more complex stimulus selectivity beyond V1. We utilized feedforward deep convolutional neural networks and developed a gradient-based technique to visualize the most suppressive and excitatory surround. We found that deep neural networks exhibited a key signature of surround effects in V1, highlighting center stimuli that visually stand out from the surround and suppressing responses when the surround stimulus is similar to the center. We found that in some neurons, especially in late layers, when the center stimulus was altered, the most suppressive surround surprisingly can follow the change. Through the visualization approach, we generalized previous understanding of surround effects to more complex stimuli, in ways that have not been revealed in visual cortices. In contrast, the suppression based on center surround similarity was not observed in an untrained network. We identified further successes and mismatches of the feedforward CNNs to the biology. Our results provide a testable hypothesis of surround effects in higher visual cortices, and the visualization approach could be adopted in future biological experimental designs.
Affiliation(s)
- Xu Pan
  - Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
- Annie DeForge
  - School of Information, University of California, Berkeley, CA, United States of America
  - Bentley University, Waltham, MA, United States of America
- Odelia Schwartz
  - Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
10
Luna R, Zabaleta I, Bertalmío M. State-of-the-art image and video quality assessment with a metric based on an intrinsically non-linear neural summation model. Front Neurosci 2023; 17:1222815. [PMID: 37559700 PMCID: PMC10408451 DOI: 10.3389/fnins.2023.1222815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 06/30/2023] [Indexed: 08/11/2023]
Abstract
The development of automatic methods for image and video quality assessment that correlate well with the perception of human observers is a very challenging open problem in vision science, with numerous practical applications in disciplines such as image processing and computer vision, as well as in the media industry. In the past two decades, the goal of image quality research has been to improve upon classical metrics by developing models that emulate some aspects of the visual system, and while the progress has been considerable, state-of-the-art quality assessment methods still share a number of shortcomings, like their performance dropping considerably when they are tested on a database that is quite different from the one used to train them, or their significant limitations in predicting observer scores for high framerate videos. In this work we propose a novel objective method for image and video quality assessment that is based on the recently introduced Intrinsically Non-linear Receptive Field (INRF) formulation, a neural summation model that has been shown to be better at predicting neural activity and visual perception phenomena than the classical linear receptive field. Here we start by optimizing, on a classic image quality database, the four parameters of a very simple INRF-based metric, and proceed to test this metric on three other databases, showing that its performance equals or surpasses that of the state-of-the-art methods, some of them having millions of parameters. Next, we extend to the temporal domain this INRF image quality metric, and test it on several popular video quality datasets; again, the results of our proposed INRF-based video quality metric are shown to be very competitive.
Affiliation(s)
- Raúl Luna
  - Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
- Itziar Zabaleta
  - Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Marcelo Bertalmío
  - Institute of Optics, Spanish National Research Council (CSIC), Madrid, Spain
11
Divisive normalization is an efficient code for multivariate Pareto-distributed environments. Proc Natl Acad Sci U S A 2022; 119:e2120581119. [PMID: 36161961 PMCID: PMC9546555 DOI: 10.1073/pnas.2120581119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Divisive normalization is a canonical computation in the brain, observed across neural systems, that is often considered to be an implementation of the efficient coding principle. We provide a theoretical result that makes the conditions under which divisive normalization is an efficient code analytically precise: We show that, in a low-noise regime, encoding an n-dimensional stimulus via divisive normalization is efficient if and only if its prevalence in the environment is described by a multivariate Pareto distribution. We generalize this multivariate analog of histogram equalization to allow for arbitrary metabolic costs of the representation, and show how different assumptions on costs are associated with different shapes of the distributions that divisive normalization efficiently encodes. Our result suggests that divisive normalization may have evolved to efficiently represent stimuli with Pareto distributions. We demonstrate that this efficiently encoded distribution is consistent with stylized features of naturalistic stimulus distributions such as their characteristic conditional variance dependence, and we provide empirical evidence suggesting that it may capture the statistics of filter responses to naturalistic images. Our theoretical finding also yields empirically testable predictions across sensory domains on how the divisive normalization parameters should be tuned to features of the input distribution.
12
Feng S, Cui Z, Han Z, Li H, Yu H. V1-Origin Bidirectional Plasticity in Visual Thalamo-Ventral Pathway and Its Contribution to Saliency Detection of Dynamic Visual Inputs. J Neurosci 2022; 42:6359-6379. [PMID: 35851327 PMCID: PMC9398546 DOI: 10.1523/jneurosci.0539-22.2022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/17/2022] [Revised: 06/12/2022] [Accepted: 07/08/2022] [Indexed: 11/21/2022]
Abstract
Visual neural plasticity and V1 saliency detection are vital for efficient coding of dynamically changing visual inputs. However, how does neural plasticity contribute to saliency detection of temporal statistically distributed visual stream remains unclear. Therefore, we adopted randomly presented but unevenly distributed stimuli with multiple orientations and examined the single-unit responses evoked by this biased orientation-adaptation protocol by single-unit recordings in the visual thalamo-ventral pathway of cats (of either sex). We found neuronal responses potentiated when the probability of biased orientation was slightly higher than other nonbiased ones and suppressed when the probability became much higher. This single neuronal short-term bidirectional plasticity is selectively induced by optimal stimuli but is interocularly transferable. It is inducible in LGN, Area 17, and Area 21a with distinct and hierarchically progressive patterns. With the results of latency analysis, receptive field structural test, cortical lesion, and simulations, we suggest this bidirectional plasticity may principally originate from the adaptation competition between excitatory and inhibitory components of V1 neuronal receptive field. In our simulation, above bidirectional plasticity could achieve saliency detection of dynamic visual inputs. These findings demonstrate a rapid probability dependent plasticity on the neural coding of visual stream and suggest its functional role in the efficient coding and saliency detection of dynamic environment.
Significance Statement
Novel elements within a dynamic visual stream can pop up from the context, which is vital for rapid response to a dynamically changing world. Saliency detection is a promising bottom-up mechanism contributing to efficient selection of visual inputs, wherein visual adaptation also plays a significant role. However, the saliency detection of dynamic visual stream is poorly understood. Here, we found a novel form of visual short-term bidirectional plasticity in multistages of the visual system that contributes to saliency detection of dynamic visual inputs. This bidirectional plasticity may principally originate from the local balance of excitation inhibition in primary visual cortex and propagates to lower and higher visual areas with progressive pattern change. Our findings suggest the excitation-inhibition balance within the visual system contributes to visual efficient coding.
Affiliation(s)
- Shang Feng
  - School of Life Sciences, State Key Laboratory of Medical Neurobiology, Collaborative Innovation Centre for Brain Science, Fudan University, Shanghai 200433, China
- Zhichang Cui
  - School of Life Sciences, State Key Laboratory of Medical Neurobiology, Collaborative Innovation Centre for Brain Science, Fudan University, Shanghai 200433, China
- Zhengqi Han
  - School of Life Sciences, State Key Laboratory of Medical Neurobiology, Collaborative Innovation Centre for Brain Science, Fudan University, Shanghai 200433, China
- Hongjian Li
  - School of Life Sciences, State Key Laboratory of Medical Neurobiology, Collaborative Innovation Centre for Brain Science, Fudan University, Shanghai 200433, China
- Hongbo Yu
  - School of Life Sciences, State Key Laboratory of Medical Neurobiology, Collaborative Innovation Centre for Brain Science, Fudan University, Shanghai 200433, China
13
Li Y, Wang T, Yang Y, Dai W, Wu Y, Li L, Han C, Zhong L, Li L, Wang G, Dou F, Xing D. Cascaded normalizations for spatial integration in the primary visual cortex of primates. Cell Rep 2022; 40:111221. [PMID: 35977486 DOI: 10.1016/j.celrep.2022.111221] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/23/2021] [Revised: 04/19/2022] [Accepted: 07/25/2022] [Indexed: 11/03/2022]
Abstract
Spatial integration of visual information is an important function in the brain. However, neural computation for spatial integration in the visual cortex remains unclear. In this study, we recorded laminar responses in V1 of awake monkeys driven by visual stimuli with grating patches and annuli of different sizes. We find three important response properties related to spatial integration that are significantly different between input and output layers: neurons in output layers have stronger surround suppression, smaller receptive field (RF), and higher sensitivity to grating annuli partially covering their RFs. These interlaminar differences can be explained by a descriptive model composed of two global divisions (normalization) and a local subtraction. Our results suggest suppressions with cascaded normalizations (CNs) are essential for spatial integration and laminar processing in the visual cortex. Interestingly, the features of spatial integration in convolutional neural networks, especially in lower layers, are different from our findings in V1.
Collapse
Affiliation(s)
- Yang Li
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Tian Wang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China; College of Life Sciences, Beijing Normal University, Beijing 100875, China
| | - Yi Yang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Weifeng Dai
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Yujie Wu
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Lianfeng Li
- China Academy of Launch Vehicle Technology, Beijing 100076, China
| | - Chuanliang Han
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Lvyan Zhong
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
| | - Liang Li
- Beijing Institute of Basic Medical Sciences, Beijing 100005, China
| | - Gang Wang
- Beijing Institute of Basic Medical Sciences, Beijing 100005, China
| | - Fei Dou
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China; College of Life Sciences, Beijing Normal University, Beijing 100875, China
| | - Dajun Xing
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China.
| |
|
14
|
Shirhatti V, Ravishankar P, Ray S. Gamma oscillations in primate primary visual cortex are severely attenuated by small stimulus discontinuities. PLoS Biol 2022; 20:e3001666. [PMID: 35700175 PMCID: PMC9197048 DOI: 10.1371/journal.pbio.3001666] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/12/2022] [Accepted: 05/10/2022] [Indexed: 11/22/2022] Open
Abstract
Gamma oscillations (30 to 80 Hz) have been hypothesized to play an important role in feature binding, based on the observation that continuous long bars induce stronger gamma in the visual cortex than bars with a small gap. Recently, many studies have shown that natural images, which have discontinuities in several low-level features, do not induce strong gamma oscillations, questioning their role in feature binding. However, the effect of different discontinuities on gamma has not been well studied. To address this, we recorded spikes and local field potential from 2 monkeys while they were shown gratings with discontinuities in 4 attributes: space, orientation, phase, or contrast. We found that while these discontinuities only had a modest effect on spiking activity, gamma power drastically reduced in all cases, suggesting that gamma could be a resonant phenomenon. An excitatory–inhibitory population model with stimulus-tuned recurrent inputs showed such resonant properties. Therefore, gamma could be a signature of excitation–inhibition balance, which gets disrupted due to discontinuities.
Gamma oscillations (30-80 Hz) in visual cortex have been hypothesized to play an important role in feature binding, but this role has recently been questioned. This study shows that visual stimulus-induced gamma oscillations are highly attenuated with even small discontinuities in the stimulus. This "resonant" behaviour can be explained by a simple excitatory-inhibitory model in which discontinuities lead to a small reduction in lateral inputs.
Affiliation(s)
- Vinay Shirhatti
- Centre for Neuroscience, Indian Institute of Science, Bengaluru, India
- IISc Mathematics Initiative, Indian Institute of Science, Bengaluru, India
| | | | - Supratim Ray
- Centre for Neuroscience, Indian Institute of Science, Bengaluru, India
- IISc Mathematics Initiative, Indian Institute of Science, Bengaluru, India
| |
|
15
|
Vacher J, Launay C, Coen-Cagli R. Flexibly regularized mixture models and application to image segmentation. Neural Netw 2022; 149:107-123. [PMID: 35228148 PMCID: PMC8944213 DOI: 10.1016/j.neunet.2022.02.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/17/2021] [Revised: 01/08/2022] [Accepted: 02/07/2022] [Indexed: 11/23/2022]
Abstract
Probabilistic finite mixture models are widely used for unsupervised clustering. These models can often be improved by adapting them to the topology of the data. For instance, in order to classify spatially adjacent data points similarly, it is common to introduce a Laplacian constraint on the posterior probability that each data point belongs to a class. Alternatively, the mixing probabilities can be treated as free parameters, while assuming Gauss-Markov or more complex priors to regularize those mixing probabilities. However, these approaches are constrained by the shape of the prior and often lead to complicated or intractable inference. Here, we propose a new parametrization of the Dirichlet distribution to flexibly regularize the mixing probabilities of over-parametrized mixture distributions. Using the Expectation-Maximization algorithm, we show that our approach allows us to define any linear update rule for the mixing probabilities, including spatial smoothing regularization as a special case. We then show that this flexible design can be extended to share class information between multiple mixture models. We apply our algorithm to artificial and natural image segmentation tasks, and we provide quantitative and qualitative comparison of the performance of Gaussian and Student-t mixtures on the Berkeley Segmentation Dataset. We also demonstrate how to propagate class information across the layers of deep convolutional neural networks in a probabilistically optimal way, suggesting a new interpretation for feedback signals in biological visual systems. Our flexible approach can be easily generalized to adapt probabilistic mixture models to arbitrary data topologies.
Affiliation(s)
- Jonathan Vacher
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, PSL University, 24 rue Lhomond, Bâtiment Jaurès, 2ème étage, Paris, 75005, France.
| | - Claire Launay
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA.
| | - Ruben Coen-Cagli
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Department of Ophthalmology & Visual Sciences, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA.
| |
|
16
|
Uran C, Peter A, Lazar A, Barnes W, Klon-Lipok J, Shapcott KA, Roese R, Fries P, Singer W, Vinck M. Predictive coding of natural images by V1 firing rates and rhythmic synchronization. Neuron 2022; 110:1240-1257.e8. [PMID: 35120628 PMCID: PMC8992798 DOI: 10.1016/j.neuron.2022.01.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 05/14/2021] [Revised: 11/22/2021] [Accepted: 01/04/2022] [Indexed: 01/12/2023]
Abstract
Predictive coding is an important candidate theory of self-supervised learning in the brain. Its central idea is that sensory responses result from comparisons between bottom-up inputs and contextual predictions, a process in which rates and synchronization may play distinct roles. We recorded from awake macaque V1 and developed a technique to quantify stimulus predictability for natural images based on self-supervised, generative neural networks. We find that neuronal firing rates were mainly modulated by the contextual predictability of higher-order image features, which correlated strongly with human perceptual similarity judgments. By contrast, V1 gamma (γ)-synchronization increased monotonically with the contextual predictability of low-level image features and emerged exclusively for larger stimuli. Consequently, γ-synchronization was induced by natural images that are highly compressible and low-dimensional. Natural stimuli with low predictability induced prominent, late-onset beta (β)-synchronization, likely reflecting cortical feedback. Our findings reveal distinct roles of synchronization and firing rates in the predictive coding of natural images.
Affiliation(s)
- Cem Uran
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Donders Centre for Neuroscience, Department of Neuroinformatics, Radboud University Nijmegen, 6525 AJ Nijmegen, the Netherlands.
| | - Alina Peter
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
| | - Andreea Lazar
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
| | - William Barnes
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Max Planck Institute for Brain Research, 60438 Frankfurt, Germany
| | - Johanna Klon-Lipok
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Max Planck Institute for Brain Research, 60438 Frankfurt, Germany
| | - Katharine A Shapcott
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Frankfurt Institute for Advanced Studies, 60438 Frankfurt, Germany
| | - Rasmus Roese
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany
| | - Pascal Fries
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Donders Institute for Brain, Cognition and Behaviour, Department of Biophysics, Radboud University Nijmegen, 6525 AJ Nijmegen, the Netherlands
| | - Wolf Singer
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Max Planck Institute for Brain Research, 60438 Frankfurt, Germany; Frankfurt Institute for Advanced Studies, 60438 Frankfurt, Germany
| | - Martin Vinck
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, 60528 Frankfurt, Germany; Donders Centre for Neuroscience, Department of Neuroinformatics, Radboud University Nijmegen, 6525 AJ Nijmegen, the Netherlands.
| |
|
17
|
Gao S, Liu X. Explaining Orientation Adaptation in V1 by Updating the State of a Spatial Model. Front Comput Neurosci 2022; 15:759254. [PMID: 35250523 PMCID: PMC8895385 DOI: 10.3389/fncom.2021.759254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2021] [Accepted: 12/06/2021] [Indexed: 11/17/2022] Open
Abstract
In this work, we extend an influential statistical model based on the spatial classical receptive field (CRF) and non-classical receptive field (nCRF) interactions (Coen-Cagli et al., 2012) to explain the typical orientation adaptation effects observed in V1. If we assume that the temporal adaptation modifies the “state” of the model, the spatial statistical model can explain all of the orientation adaptation effects in the context of neuronal output using small and large grating observed in neurophysiological experiments in V1. The “state” of the model represents the internal parameters such as the prior and the covariance trained on a mixed dataset that totally determine the response of the model. These two parameters, respectively, reflect the probability of the orientation component and the connectivity among neurons between CRF and nCRF. Specifically, we have two key findings: First, neural adapted results using a small grating that just covers the CRF can be predicted by the change of the prior of our model. Second, the change of the prior can also predict most of the observed results using a large grating that covers both CRF and nCRF of a neuron. However, the prediction of the novel attractive adaptation using large grating covering both CRF and nCRF also necessitates the involvement of a connectivity change of the center-surround RFs. In addition, our paper contributes a new prior-based winner-take-all (WTA) working mechanism derived from the statistical-based model to explain why and how all of these orientation adaptation effects can be predicted by relying on this spatial model without modifying its structure, a novel application of the spatial model. The research results show that adaptation may link time and space by changing the “state” of the neural system according to a specific adaptor. 
Furthermore, different forms of stimulus used for adaptation can cause various adaptation effects, such as an a priori shift or a connectivity change, depending on the specific stimulus size.
Affiliation(s)
- Shaobing Gao
- College of Computer Science, Sichuan University, Chengdu, China
- *Correspondence: Shaobing Gao
| | - Xiao Liu
- Tomorrow Advancing Life Education Group (TAL), Beijing, China
| |
|
18
|
Klímová M, Bloem IM, Ling S. The specificity of orientation-tuned normalization within human early visual cortex. J Neurophysiol 2021; 126:1536-1546. [PMID: 34550028 PMCID: PMC8794056 DOI: 10.1152/jn.00203.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/03/2021] [Revised: 09/20/2021] [Accepted: 09/20/2021] [Indexed: 11/22/2022] Open
Abstract
Normalization within visual cortex is modulated by contextual influences; stimuli sharing similar features suppress each other more than dissimilar stimuli. This feature-tuned component of suppression depends on multiple factors, including the orientation content of stimuli. Indeed, pairs of stimuli arranged in a center-surround configuration attenuate each other's response to a greater degree when oriented collinearly than when oriented orthogonally. Although numerous studies have examined the nature of surround suppression at these two extremes, far less is known about how the strength of tuned normalization varies as a function of continuous changes in orientation similarity, particularly in humans. In this study, we used functional magnetic resonance imaging (fMRI) to examine the bandwidth of orientation-tuned suppression within human visual cortex. Blood-oxygen-level-dependent (BOLD) responses were acquired as participants viewed a full-field circular stimulus composed of wedges of orientation-bandpass filtered noise. This stimulus configuration allowed us to parametrically vary orientation differences between neighboring wedges in gradual steps between collinear and orthogonal. We found the greatest suppression for collinearly arranged stimuli with a gradual increase in BOLD response as the orientation content became more dissimilar. We quantified the tuning width of orientation-tuned suppression, finding that the voxel-wise bandwidth of orientation tuned normalization was between 20° and 30°, and did not differ substantially between early visual areas. Voxel-wise analyses revealed that suppression width covaried with retinotopic preference, with the tightest bandwidths at outer eccentricities. 
Having an estimate of orientation-tuned suppression bandwidth can serve to constrain models of tuned normalization, establishing the precise degree to which suppression strength depends on similarity between visual stimulus components.
NEW & NOTEWORTHY: Neurons in the early visual cortex are subject to divisive normalization, but the feature-tuning aspect of this computation remains understudied, particularly in humans. We investigated orientation tuning of normalization in human early visual cortex using fMRI and estimated the bandwidth of the tuned normalization function across observers. Our findings provide a characterization of tuned normalization in early visual cortex that could help constrain models of divisive normalization in vision.
Affiliation(s)
- Michaela Klímová
- Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts
- Center for Systems Neuroscience, Boston University, Boston, Massachusetts
| | - Ilona M Bloem
- Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts
- Center for Systems Neuroscience, Boston University, Boston, Massachusetts
- Department of Psychology, New York University, New York City, New York
| | - Sam Ling
- Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts
- Center for Systems Neuroscience, Boston University, Boston, Massachusetts
| |
|
19
|
Canham T, Vazquez-Corral J, Mathieu E, Bertalmío M. Matching visual induction effects on screens of different size. J Vis 2021; 21:10. [PMID: 34144607 PMCID: PMC8237091 DOI: 10.1167/jov.21.6.10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
In the film industry, the same movie is expected to be watched on displays of vastly different sizes, from cinema screens to mobile phones. But visual induction, the perceptual phenomenon by which the appearance of a scene region is affected by its surroundings, will be different for the same image shown on two displays of different dimensions. This phenomenon presents a practical challenge for the preservation of the artistic intentions of filmmakers, because it can lead to shifts in image appearance between viewing destinations. In this work, we show that a neural field model based on the efficient representation principle is able to predict induction effects and how, by regularizing its associated energy functional, the model is still able to represent induction but is now invertible. From this finding, we propose a method to preprocess an image in a screen-size dependent way so that its perception, in terms of visual induction, may remain constant across displays of different size. The potential of the method is demonstrated through psychophysical experiments on synthetic images and qualitative examples on natural images.
Affiliation(s)
- Trevor Canham
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain.
| | - Javier Vazquez-Corral
- Computer Vision Center and the Computer Sciences Department at Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain; http://www.jvazquez-corral.net
| | | | - Marcelo Bertalmío
- Instituto de Óptica, Spanish National Research Council (CSIC), Spain.
| |
|
20
|
Festa D, Aschner A, Davila A, Kohn A, Coen-Cagli R. Neuronal variability reflects probabilistic inference tuned to natural image statistics. Nat Commun 2021; 12:3635. [PMID: 34131142 PMCID: PMC8206154 DOI: 10.1038/s41467-021-23838-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 06/08/2020] [Accepted: 05/19/2021] [Indexed: 11/23/2022] Open
Abstract
Neuronal activity in sensory cortex fluctuates over time and across repetitions of the same input. This variability is often considered detrimental to neural coding. The theory of neural sampling proposes instead that variability encodes the uncertainty of perceptual inferences. In primary visual cortex (V1), modulation of variability by sensory and non-sensory factors supports this view. However, it is unknown whether V1 variability reflects the statistical structure of visual inputs, as would be required for inferences correctly tuned to the statistics of the natural environment. Here we combine analysis of image statistics and recordings in macaque V1 to show that probabilistic inference tuned to natural image statistics explains the widely observed dependence between spike count variance and mean, and the modulation of V1 activity and variability by spatial context in images. Our results show that the properties of a basic aspect of cortical responses-their variability-can be explained by a probabilistic representation tuned to naturalistic inputs.
Affiliation(s)
- Dylan Festa
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Amir Aschner
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Aida Davila
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Adam Kohn
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Ruben Coen-Cagli
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA.
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA.
| |
|
21
|
Berga D, Otazu X. Modeling bottom-up and top-down attention with a neurodynamic model of V1. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.07.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022]
|
22
|
Bertalmío M, Gomez-Villa A, Martín A, Vazquez-Corral J, Kane D, Malo J. Evidence for the intrinsically nonlinear nature of receptive fields in vision. Sci Rep 2020; 10:16277. [PMID: 33004868 PMCID: PMC7530701 DOI: 10.1038/s41598-020-73113-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/19/2019] [Accepted: 09/11/2020] [Indexed: 11/10/2022] Open
Abstract
The responses of visual neurons, as well as visual perception phenomena in general, are highly nonlinear functions of the visual input, while most vision models are grounded on the notion of a linear receptive field (RF). The linear RF has a number of inherent problems: it changes with the input, it presupposes a set of basis functions for the visual system, and it conflicts with recent studies on dendritic computations. Here we propose to model the RF in a nonlinear manner, introducing the intrinsically nonlinear receptive field (INRF). Apart from being more physiologically plausible and embodying the efficient representation principle, the INRF has a key property of wide-ranging implications: for several vision science phenomena where a linear RF must vary with the input in order to predict responses, the INRF can remain constant under different stimuli. We also prove that Artificial Neural Networks with INRF modules instead of linear filters have a remarkably improved performance and better emulate basic human perception. Our results suggest a change of paradigm for vision science as well as for artificial intelligence.
Affiliation(s)
| | | | | | | | - David Kane
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Jesús Malo
- Universitat de Valencia, Valencia, Spain
| |
|
23
|
Gomez-Villa A, Martín A, Vazquez-Corral J, Bertalmío M, Malo J. Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications. Vision Res 2020; 176:156-174. [PMID: 32896717 DOI: 10.1016/j.visres.2020.07.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/02/2019] [Revised: 07/10/2020] [Accepted: 07/22/2020] [Indexed: 11/18/2022]
Abstract
The study of visual illusions has proven to be a very useful approach in vision science. In this work we start by showing that, while convolutional neural networks (CNNs) trained for low-level visual tasks in natural images may be deceived by brightness and color illusions, some network illusions can be inconsistent with the perception of humans. Next, we analyze where these similarities and differences may come from. On one hand, the proposed linear eigenanalysis explains the overall similarities: in simple CNNs trained for tasks like denoising or deblurring, the linear version of the network has center-surround receptive fields, and global transfer functions are very similar to the human achromatic and chromatic contrast sensitivity functions in human-like opponent color spaces. These similarities are consistent with the long-standing hypothesis that considers low-level visual illusions as a by-product of the optimization to natural environments. Specifically, here human-like features emerge from error minimization. On the other hand, the observed differences must be due to the behavior of the human visual system not explained by the linear approximation. However, our study also shows that more 'flexible' network architectures, with more layers and a higher degree of nonlinearity, may actually have a worse capability of reproducing visual illusions. This implies, in line with other works in the vision science literature, a word of caution on using CNNs to study human vision: on top of the intrinsic limitations of the L + NL formulation of artificial networks to model vision, the nonlinear behavior of flexible architectures may easily be markedly different from that of the visual system.
Affiliation(s)
- A Gomez-Villa
- Dept. Inf. Comm. Tech., Universitat Pompeu Fabra, Barcelona, Spain.
| | - A Martín
- Dept. Inf. Comm. Tech., Universitat Pompeu Fabra, Barcelona, Spain.
| | - J Vazquez-Corral
- Dept. Inf. Comm. Tech., Universitat Pompeu Fabra, Barcelona, Spain.
| | - M Bertalmío
- Dept. Inf. Comm. Tech., Universitat Pompeu Fabra, Barcelona, Spain.
| | - J Malo
- Image Proc. Lab, Universitat de València, València, Spain.
| |
|
24
|
Fruend I. Constrained sampling from deep generative image models reveals mechanisms of human target detection. J Vis 2020; 20:32. [PMID: 32729908 PMCID: PMC7424951 DOI: 10.1167/jov.20.7.32] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
The first steps of visual processing are often described as a bank of oriented filters followed by divisive normalization. This approach has been tremendously successful at predicting contrast thresholds in simple visual displays. However, it is unclear to what extent this kind of architecture also supports processing in more complex visual tasks performed in naturally looking images. We used a deep generative image model to embed arc segments with different curvatures in naturalistic images. These images contain the target as part of the image scene, resulting in considerable appearance variation of target as well as background. Three observers localized arc targets in these images, with an average accuracy of 74.7%. Data were fit by several biologically inspired models, four standard deep convolutional neural networks (CNNs), and a five-layer CNN specifically trained for this task. Four models predicted observer responses particularly well: (1) a bank of oriented filters, similar to complex cells in primate area V1; (2) a bank of oriented filters followed by tuned gain control, incorporating knowledge about cortical surround interactions; (3) a bank of oriented filters followed by local normalization; and (4) the five-layer CNN. A control experiment with optimized stimuli based on these four models showed that the observers' data were best explained by model (2) with tuned gain control. These data suggest that standard models of early vision provide good descriptions of performance in much more complex tasks than what they were designed for, while general-purpose nonlinear models such as convolutional neural networks do not.
Affiliation(s)
- Ingo Fruend
- Department of Psychology, Centre for Vision Research & Vision: Science to Application, York University, Toronto, ON, Canada
| |
|
25
|
Henry CA, Jazayeri M, Shapley RM, Hawken MJ. Distinct spatiotemporal mechanisms underlie extra-classical receptive field modulation in macaque V1 microcircuits. eLife 2020; 9:54264. [PMID: 32458798 PMCID: PMC7253173 DOI: 10.7554/elife.54264] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/08/2019] [Accepted: 05/11/2020] [Indexed: 01/23/2023] Open
Abstract
Complex scene perception depends upon the interaction between signals from the classical receptive field (CRF) and the extra-classical receptive field (eCRF) in primary visual cortex (V1) neurons. Although much is known about V1 eCRF properties, we do not yet know how the underlying mechanisms map onto the cortical microcircuit. We probed the spatio-temporal dynamics of eCRF modulation using a reverse correlation paradigm, and found three principal eCRF mechanisms: tuned-facilitation, untuned-suppression, and tuned-suppression. Each mechanism had a distinct timing and spatial profile. Laminar analysis showed that the timing, orientation-tuning, and strength of eCRF mechanisms had distinct signatures within magnocellular and parvocellular processing streams in the V1 microcircuit. The existence of multiple eCRF mechanisms provides new insights into how V1 responds to spatial context. Modeling revealed that the differences in timing and scale of these mechanisms predicted distinct patterns of net modulation, reconciling many previous disparate physiological and psychophysical findings.
Affiliation(s)
- Christopher A Henry
- Center for Neural Science, New York University, New York, United States; Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, United States
| | - Mehrdad Jazayeri
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, United States
| | - Robert M Shapley
- Center for Neural Science, New York University, New York, United States
| | - Michael J Hawken
- Center for Neural Science, New York University, New York, United States
| |
|
26
|
Iyer R, Hu B, Mihalas S. Contextual Integration in Cortical and Convolutional Neural Networks. Front Comput Neurosci 2020; 14:31. [PMID: 32390818 PMCID: PMC7192314 DOI: 10.3389/fncom.2020.00031] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/01/2019] [Accepted: 03/24/2020] [Indexed: 11/28/2022] Open
Abstract
It has been suggested that neurons can represent sensory input using probability distributions and neural circuits can perform probabilistic inference. Lateral connections between neurons have been shown to have non-random connectivity and modulate responses to stimuli within the classical receptive field. Large-scale efforts mapping local cortical connectivity describe cell type specific connections from inhibitory neurons and like-to-like connectivity between excitatory neurons. To relate the observed connectivity to computations, we propose a neuronal network model that approximates Bayesian inference of the probability of different features being present at different image locations. We show that the lateral connections between excitatory neurons in a circuit implementing contextual integration in this model should depend on correlations between unit activities, minus a global inhibitory drive. The model naturally suggests the need for two types of inhibitory gates (normalization, surround inhibition). First, using natural scene statistics and classical receptive fields corresponding to simple cells parameterized with data from mouse primary visual cortex, we show that the predicted connectivity qualitatively matches that measured in mouse cortex: neurons with similar orientation tuning have stronger connectivity, and both excitatory and inhibitory connectivity have a modest spatial extent, comparable to that observed in mouse visual cortex. We incorporate lateral connections learned using this model into convolutional neural networks. Features are defined by supervised learning on the task, and the lateral connections provide an unsupervised learning of feature context in multiple layers.
Since the lateral connections provide contextual information when the feedforward input is locally corrupted, we show that incorporating such lateral connections into convolutional neural networks makes them more robust to noise and leads to better performance on noisy versions of the MNIST dataset. Decomposing the predicted lateral connectivity matrices into low-rank and sparse components introduces additional cell types into these networks. We explore effects of cell-type specific perturbations on network computation. Our framework can potentially be applied to networks trained on other tasks, with the learned lateral connections aiding computations implemented by feedforward connections when the input is unreliable, and demonstrates the potential usefulness of combining supervised and unsupervised learning techniques in real-world vision tasks.
Affiliation(s)
- Ramakrishnan Iyer
- Modeling and Theory, Allen Institute for Brain Science, Seattle, WA, United States
| | - Brian Hu
- Modeling and Theory, Allen Institute for Brain Science, Seattle, WA, United States
| | - Stefan Mihalas
- Modeling and Theory, Allen Institute for Brain Science, Seattle, WA, United States
|
27
|
Unique Spatial Integration in Mouse Primary Visual Cortex and Higher Visual Areas. J Neurosci 2020; 40:1862-1873. [PMID: 31949109 DOI: 10.1523/jneurosci.1997-19.2020] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/16/2019] [Revised: 01/07/2020] [Accepted: 01/08/2020] [Indexed: 01/28/2023] Open
Abstract
Neurons in the visual system integrate over a wide range of spatial scales. This diversity is thought to enable both local and global computations. To understand how spatial information is encoded across the mouse visual system, we use two-photon imaging to measure receptive fields (RFs) and size-tuning in primary visual cortex (V1) and three downstream higher visual areas (HVAs: LM (lateromedial), AL (anterolateral), and PM (posteromedial)) in mice of both sexes. Neurons in PM, compared with V1 or the other HVAs, have significantly larger RF sizes and less surround suppression, independent of stimulus eccentricity or contrast. To understand how this specialization of RFs arises in the HVAs, we measured the spatial properties of V1 inputs to each area. Spatial integration of V1 axons was remarkably similar across areas and significantly different from the tuning of neurons in their target HVAs. Thus, unlike other visual features studied in this system, specialization of spatial integration in PM cannot be explained by specific projections from V1 to the HVAs. Further, the differences in RF properties could not be explained by differences in convergence of V1 inputs to the HVAs. Instead, our data suggest that distinct inputs from other areas or connectivity within PM may support the area's unique ability to encode global features of the visual scene, whereas V1, LM, and AL may be more specialized for processing local features.
SIGNIFICANCE STATEMENT
Surround suppression is a common feature of visual processing whereby large stimuli are less effective at driving neuronal responses than smaller stimuli. This is thought to enhance efficiency in the population code and enable higher-order processing of visual information, such as figure-ground segregation. However, this comes at the expense of global computations.
Here we find that surround suppression is not equally represented across mouse visual areas: primary visual cortex has substantially more surround suppression than higher visual areas, and one higher area has significantly less suppression than two others examined, suggesting that these areas have distinct functional roles. Thus, we have identified a novel dimension of specialization in the mouse visual cortex that may enable both local and global computations.
|
28
|
Capparelli F, Pawelzik K, Ernst U. Constrained inference in sparse coding reproduces contextual effects and predicts laminar neural dynamics. PLoS Comput Biol 2019; 15:e1007370. [PMID: 31581240 PMCID: PMC6793885 DOI: 10.1371/journal.pcbi.1007370] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/19/2019] [Revised: 10/15/2019] [Accepted: 09/02/2019] [Indexed: 01/16/2023] Open
Abstract
When probed with complex stimuli that extend beyond their classical receptive field, neurons in primary visual cortex display complex and non-linear response characteristics. Sparse coding models reproduce some of the observed contextual effects, but still fail to provide a satisfactory explanation in terms of realistic neural structures and cortical mechanisms, since the connection scheme they propose consists only of interactions among neurons with overlapping input fields. Here we propose an extended generative model for visual scenes that includes spatial dependencies among different features. We derive a neurophysiologically realistic inference scheme under the constraint that neurons have direct access only to local image information. The scheme can be interpreted as a network in primary visual cortex where two neural populations are organized in different layers within orientation hypercolumns that are connected by local, short-range and long-range recurrent interactions. When trained with natural images, the model predicts a connectivity structure linking neurons with similar orientation preferences matching the typical patterns found for long-ranging horizontal axons and feedback projections in visual cortex. Subjected to contextual stimuli typically used in empirical studies, our model replicates several hallmark effects of contextual processing and predicts characteristic differences for surround modulation between the two model populations. In summary, our model provides a novel framework for contextual processing in the visual system proposing a well-defined functional role for horizontal axons and feedback projections.
Affiliation(s)
- Federica Capparelli
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
| | - Klaus Pawelzik
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
| | - Udo Ernst
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
|
29
|
Giraldo LGS, Schwartz O. Integrating Flexible Normalization into Midlevel Representations of Deep Convolutional Neural Networks. Neural Comput 2019; 31:2138-2176. [PMID: 31525314 DOI: 10.1162/neco_a_01226] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
Deep convolutional neural networks (CNNs) are becoming increasingly popular models to predict neural responses in visual cortex. However, contextual effects, which are prevalent in neural processing and in perception, are not explicitly handled by current CNNs, including those used for neural prediction. In primary visual cortex, neural responses are modulated by stimuli spatially surrounding the classical receptive field in rich ways. These effects have been modeled with divisive normalization approaches, including flexible models, where spatial normalization is recruited only to the degree that responses from center and surround locations are deemed statistically dependent. We propose a flexible normalization model applied to midlevel representations of deep CNNs as a tractable way to study contextual normalization mechanisms in midlevel cortical areas. This approach captures nontrivial spatial dependencies among midlevel features in CNNs, such as those present in textures and other visual stimuli, that arise from tiling high-order features geometrically. We expect that the proposed approach can make predictions about when spatial normalization might be recruited in midlevel cortical areas. We also expect this approach to be useful as part of the CNN tool kit, therefore going beyond more restrictive fixed forms of normalization.
Affiliation(s)
| | - Odelia Schwartz
- Computer Science Department, University of Miami, Coral Gables, FL 33146, U.S.A.
|
30
|
Coen-Cagli R, Solomon SS. Relating Divisive Normalization to Neuronal Response Variability. J Neurosci 2019; 39:7344-7356. [PMID: 31387914 PMCID: PMC6759019 DOI: 10.1523/jneurosci.0126-19.2019] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/15/2019] [Revised: 06/13/2019] [Accepted: 06/18/2019] [Indexed: 01/13/2023] Open
Abstract
Cortical responses to repeated presentations of a sensory stimulus are variable. This variability is sensitive to several stimulus dimensions, suggesting that it may carry useful information beyond the average firing rate. Many experimental manipulations that affect response variability are also known to engage divisive normalization, a widespread operation that describes neuronal activity as the ratio of a numerator (representing the excitatory stimulus drive) and denominator (the normalization signal). Although it has been suggested that normalization affects response variability, we lack a quantitative framework to determine the relation between the two. Here we extend the standard normalization model, by treating the numerator and the normalization signal as variable quantities. The resulting model predicts a general stabilizing effect of normalization on neuronal responses, and allows us to infer the single-trial normalization strength, a quantity that cannot be measured directly. We test the model on neuronal responses to stimuli of varying contrast, recorded in primary visual cortex of male macaques. We find that neurons that are more strongly normalized fire more reliably, and response variability and pairwise noise correlations are reduced during trials in which normalization is inferred to be strong. Our results thus suggest a novel functional role for normalization, namely, modulating response variability. Our framework could enable a direct quantification of the impact of single-trial normalization strength on the accuracy of perceptual judgments, and can be readily applied to other sensory and nonsensory factors.
SIGNIFICANCE STATEMENT
Divisive normalization is a widespread neural operation across sensory and nonsensory brain areas, which describes neuronal responses as the ratio between the excitatory drive to the neuron and a normalization signal.
Normalization plays a key role in several important computations, including adjusting the neuron's dynamic range, reducing redundancy, and facilitating probabilistic inference. However, the relation between normalization and neuronal response variability (a fundamental aspect of neural coding) remains unclear. Here we develop a new model and test it on primary visual cortex responses. We show that normalization has a stabilizing effect on neuronal activity, beyond the known suppression of firing rate. This modulation of variability suggests a new functional role for normalization in neural coding and perception.
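The ratio form of normalization described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' model: the function names, the semisaturation constant `sigma`, the exponent, and the contrast-based drive are all choices made for the example, and the trial-to-trial variability of the numerator and denominator that the paper studies is not modeled here.

```python
def normalized_response(drive: float, norm_signal: float, sigma: float = 1.0) -> float:
    """Divisive normalization: excitatory drive divided by (sigma + normalization signal)."""
    denominator = sigma + norm_signal
    if denominator <= 0:
        raise ValueError("sigma + norm_signal must be positive")
    return drive / denominator


def contrast_response(contrast: float, mask_contrast: float = 0.0,
                      sigma: float = 0.1, exponent: float = 2.0,
                      r_max: float = 1.0) -> float:
    """Hypothetical contrast response with a mask that feeds only the denominator."""
    drive = contrast ** exponent                             # numerator: stimulus drive
    pool = contrast ** exponent + mask_contrast ** exponent  # denominator: pooled activity
    return r_max * normalized_response(drive, pool, sigma ** exponent)


# Adding a mask raises the normalization signal and lowers the response.
alone = contrast_response(0.5)
masked = contrast_response(0.5, mask_contrast=0.5)
```

In this sketch the mask contributes nothing to the numerator, so any drop from `alone` to `masked` comes entirely from the larger denominator.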
Affiliation(s)
- Ruben Coen-Cagli
- Department of Systems and Computational Biology, and
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
| | - Selina S Solomon
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
|
31
|
Sanchez-Giraldo LG, Laskar MNU, Schwartz O. Normalization and pooling in hierarchical models of natural images. Curr Opin Neurobiol 2019; 55:65-72. [PMID: 30785005 DOI: 10.1016/j.conb.2019.01.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/24/2018] [Revised: 12/29/2018] [Accepted: 01/13/2019] [Indexed: 11/17/2022]
Abstract
Divisive normalization and subunit pooling are two canonical classes of computation that have become widely used in descriptive (what) models of visual cortical processing. Normative (why) models from natural image statistics can help constrain the form and parameters of such classes of models. We focus on recent advances in two particular directions, namely deriving richer forms of divisive normalization, and advances in learning pooling from image statistics. We discuss the incorporation of such components into hierarchical models. We consider both hierarchical unsupervised learning from image statistics, and discriminative supervised learning in deep convolutional neural networks (CNNs). We further discuss studies on the utility and extensions of the convolutional architecture, which has also been adopted by recent descriptive models. We review the recent literature and discuss the current promises and gaps of using such approaches to gain a better understanding of how cortical neurons represent and process complex visual stimuli.
Affiliation(s)
- Luis G Sanchez-Giraldo
- Computational Neuroscience Lab, Dept. of Computer Science, University of Miami, FL 33146, United States.
| | - Md Nasir Uddin Laskar
- Computational Neuroscience Lab, Dept. of Computer Science, University of Miami, FL 33146, United States
| | - Odelia Schwartz
- Computational Neuroscience Lab, Dept. of Computer Science, University of Miami, FL 33146, United States
|
32
|
Peter A, Uran C, Klon-Lipok J, Roese R, van Stijn S, Barnes W, Dowdall JR, Singer W, Fries P, Vinck M. Surface color and predictability determine contextual modulation of V1 firing and gamma oscillations. eLife 2019; 8:42101. [PMID: 30714900 PMCID: PMC6391066 DOI: 10.7554/elife.42101] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 09/18/2018] [Accepted: 01/30/2019] [Indexed: 12/03/2022] Open
Abstract
The integration of direct bottom-up inputs with contextual information is a core feature of neocortical circuits. In area V1, neurons may reduce their firing rates when their receptive field input can be predicted by spatial context. Gamma-synchronized (30–80 Hz) firing may provide a complementary signal to rates, reflecting stronger synchronization between neuronal populations receiving mutually predictable inputs. We show that large uniform surfaces, which have high spatial predictability, strongly suppressed firing yet induced prominent gamma synchronization in macaque V1, particularly when they were colored. Yet, chromatic mismatches between center and surround, breaking predictability, strongly reduced gamma synchronization while increasing firing rates. Differences between responses to different colors, including strong gamma-responses to red, arose from stimulus adaptation to a full-screen background, suggesting prominent differences in adaptation between M- and L-cone signaling pathways. Thus, synchrony signaled whether RF inputs were predicted from spatial context, while firing rates increased when stimuli were unpredicted from context.
Affiliation(s)
- Alina Peter
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; International Max Planck Research School for Neural Circuits, Frankfurt, Germany
| | - Cem Uran
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany
| | - Johanna Klon-Lipok
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Max Planck Institute for Brain Research, Frankfurt, Germany
| | - Rasmus Roese
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany
| | - Sylvia van Stijn
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Max Planck Institute for Brain Research, Frankfurt, Germany
| | - William Barnes
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany
| | - Jarrod R Dowdall
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany
| | - Wolf Singer
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Frankfurt Institute for Advanced Studies, Frankfurt, Germany
| | - Pascal Fries
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
| | - Martin Vinck
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt, Germany
|
33
|
Layer 3 Dynamically Coordinates Columnar Activity According to Spatial Context. J Neurosci 2019; 39:281-294. [PMID: 30459226 DOI: 10.1523/jneurosci.1568-18.2018] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/21/2018] [Revised: 10/16/2018] [Accepted: 10/16/2018] [Indexed: 01/03/2023] Open
Abstract
To reduce statistical redundancy of natural inputs and increase the sparseness of coding, neurons in primary visual cortex (V1) show tuning for stimulus size and surround suppression. This integration of spatial information is a fundamental, context-dependent neural operation involving extensive neural circuits that span across all cortical layers of a V1 column, and reflects both feedforward and feedback processing. However, how spatial integration is dynamically coordinated across cortical layers remains poorly understood. We recorded single- and multiunit activity and local field potentials across V1 layers of awake mice (both sexes) while they viewed stimuli of varying size and used dynamic Bayesian model comparisons to identify when laminar activity and interlaminar functional interactions showed surround suppression, the hallmark of spatial integration. We found that surround suppression is strongest in layer 3 (L3) and L4 activity, where suppression is established within ∼10 ms after response onset, and receptive fields dynamically sharpen while suppression strength increases. Importantly, we also found that specific directed functional connections were strongest for intermediate stimulus sizes and suppressed for larger ones, particularly for connections from L3 targeting L5 and L1. Together, the results shed light on the different functional roles of cortical layers in spatial integration and on how L3 dynamically coordinates activity across a cortical column depending on spatial context.
SIGNIFICANCE STATEMENT
Neurons in primary visual cortex (V1) show tuning for stimulus size, where responses to stimuli exceeding the receptive field can be suppressed (surround suppression). We demonstrate that functional connectivity between V1 layers can also have a surround-suppressed profile.
Layer 3 appears to play a particularly prominent role: its functional connections to layers 5 and 1 are strongest for stimuli of optimal size and weaker for large stimuli. Our results therefore point toward a key role of layer 3 in coordinating activity across the cortical column according to spatial context.
|
34
|
Turner MH, Sanchez Giraldo LG, Schwartz O, Rieke F. Stimulus- and goal-oriented frameworks for understanding natural vision. Nat Neurosci 2019; 22:15-24. [PMID: 30531846 PMCID: PMC8378293 DOI: 10.1038/s41593-018-0284-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/09/2018] [Accepted: 10/22/2018] [Indexed: 12/21/2022]
Abstract
Our knowledge of sensory processing has advanced dramatically in the last few decades, but this understanding remains far from complete, especially for stimuli with the large dynamic range and strong temporal and spatial correlations characteristic of natural visual inputs. Here we describe some of the issues that make understanding the encoding of natural images a challenge. We highlight two broad strategies for approaching this problem: a stimulus-oriented framework and a goal-oriented one. Different contexts can call for one framework or the other. Looking forward, recent advances, particularly those based in machine learning, show promise in borrowing key strengths of both frameworks and by doing so illuminating a path to a more comprehensive understanding of the encoding of natural stimuli.
Affiliation(s)
- Maxwell H Turner
- Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA
- Graduate Program in Neuroscience, University of Washington, Seattle, WA, USA
| | | | - Odelia Schwartz
- Department of Computer Science, University of Miami, Coral Gables, FL, USA
| | - Fred Rieke
- Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA.
|
35
|
Aschner A, Solomon SG, Landy MS, Heeger DJ, Kohn A. Temporal Contingencies Determine Whether Adaptation Strengthens or Weakens Normalization. J Neurosci 2018; 38:10129-10142. [PMID: 30291205 PMCID: PMC6246879 DOI: 10.1523/jneurosci.1131-18.2018] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/04/2018] [Revised: 08/30/2018] [Accepted: 09/19/2018] [Indexed: 11/21/2022] Open
Abstract
A fundamental and nearly ubiquitous feature of sensory encoding is that neuronal responses are strongly influenced by recent experience, or adaptation. Theoretical and computational studies have proposed that many adaptation effects may result in part from changes in the strength of normalization signals. Normalization is a "canonical" computation in which a neuron's response is modulated (normalized) by the pooled activity of other neurons. Here, we test whether adaptation can alter the strength of cross-orientation suppression, or masking, a paradigmatic form of normalization evident in primary visual cortex (V1). We made extracellular recordings of V1 neurons in anesthetized male macaques and measured responses to plaid stimuli composed of two overlapping, orthogonal gratings before and after prolonged exposure to two distinct adapters. The first adapter was a plaid consisting of orthogonal gratings and led to stronger masking. The second adapter presented the same orthogonal gratings in an interleaved manner and led to weaker masking. The strength of adaptation's effects on masking depended on the orientation of the test stimuli relative to the orientation of the adapters, but was independent of neuronal orientation preference. Changes in masking could not be explained by altered neuronal responsivity. Our results suggest that normalization signals can be strengthened or weakened by adaptation depending on the temporal contingencies of the adapting stimuli. Our findings reveal an interplay between two widespread computations in cortical circuits, adaptation and normalization, that enables flexible adjustments to the structure of the environment, including the temporal relationships among sensory stimuli.
SIGNIFICANCE STATEMENT
Two fundamental features of sensory responses are that they are influenced by adaptation and that they are modulated by the activity of other nearby neurons via normalization.
Our findings reveal a strong interaction between these two aspects of cortical computation. Specifically, we show that cross-orientation masking, a form of normalization, can be strengthened or weakened by adaptation depending on the temporal contingencies between sensory inputs. Our findings support theoretical proposals that some adaptation effects may involve altered normalization and offer a network-based explanation for how cortex adjusts to current sensory demands.
Affiliation(s)
- Amir Aschner
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
| | - Samuel G Solomon
- Department of Experimental Psychology, University College London, London, United Kingdom WC1H 0AP
| | - Michael S Landy
- Department of Psychology and Center for Neural Science, New York University, New York, New York 10003
| | - David J Heeger
- Department of Psychology and Center for Neural Science, New York University, New York, New York 10003
| | - Adam Kohn
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York 10461
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York 10461
|
36
|
Keemink SW, Boucsein C, van Rossum MCW. Effects of V1 surround modulation tuning on visual saliency and the tilt illusion. J Neurophysiol 2018; 120:942-952. [PMID: 29847234 DOI: 10.1152/jn.00864.2017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022] Open
Abstract
Neurons in the primary visual cortex respond to oriented stimuli placed in the center of their receptive field, yet their response is modulated by stimuli outside the receptive field (the surround). Classically, this surround modulation is assumed to be strongest if the orientation of the surround stimulus aligns with the neuron's preferred orientation, irrespective of the actual center stimulus. This neuron-dependent surround modulation has been used to explain a wide range of psychophysical phenomena, such as biased tilt perception and saliency of stimuli with contrasting orientation. However, several neurophysiological studies have shown that for most neurons surround modulation is instead center dependent: it is strongest if the surround orientation aligns with the center stimulus. As the impact of such center-dependent modulation on the population level is unknown, we examine this using computational models. We find that with neuron-dependent modulation the biases in orientation coding, commonly used to explain the tilt illusion, are larger than psychophysically reported, but disappear with center-dependent modulation. Therefore we suggest that a mixture of the two modulation types is necessary to quantitatively explain the psychophysically observed biases. Next, we find that under center-dependent modulation average population responses are more sensitive to orientation differences between stimuli, which in theory could improve saliency detection. However, this effect depends on the specific saliency model. Overall, our results thus show that center-dependent modulation reduces coding bias, while possibly increasing the sensitivity to salient features.
NEW & NOTEWORTHY
Neural responses in the primary visual cortex are modulated by stimuli surrounding the receptive field. Most earlier studies assume this modulation depends on the neuron's tuning properties, but experiments have shown that instead it depends mostly on the stimulus characteristics.
We show that this simple change leads to neural coding that is less biased and under some conditions more sensitive to salient features.
Affiliation(s)
- Sander W Keemink
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom; Bernstein Center Freiburg, Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Clemens Boucsein
- Bernstein Center Freiburg, Faculty of Biology, University of Freiburg, Freiburg, Germany
| | - Mark C W van Rossum
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
|
37
|
Jaini P, Burge J. Linking normative models of natural tasks to descriptive models of neural response. J Vis 2017; 17:16. [PMID: 29071353 PMCID: PMC6097587 DOI: 10.1167/17.12.16] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022] Open
Abstract
Understanding how nervous systems exploit task-relevant properties of sensory stimuli to perform natural tasks is fundamental to the study of perceptual systems. However, there are few formal methods for determining which stimulus properties are most useful for a given natural task. As a consequence, it is difficult to develop principled models for how to compute task-relevant latent variables from natural signals, and it is difficult to evaluate descriptive models fit to neural response. Accuracy maximization analysis (AMA) is a recently developed Bayesian method for finding the optimal task-specific filters (receptive fields). Here, we introduce AMA–Gauss, a new faster form of AMA that incorporates the assumption that the class-conditional filter responses are Gaussian distributed. Then, we use AMA–Gauss to show that its assumptions are justified for two fundamental visual tasks: retinal speed estimation and binocular disparity estimation. Next, we show that AMA–Gauss has striking formal similarities to popular quadratic models of neural response: the energy model and the generalized quadratic model (GQM). Together, these developments deepen our understanding of why the energy model of neural response has proven useful, improve our ability to evaluate results from subunit model fits to neural data, and should help accelerate psychophysics and neuroscience research with natural stimuli.
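The energy model mentioned in this abstract is commonly written as the sum of the squared outputs of two linear filters in quadrature. The sketch below illustrates that idea only. It is not AMA–Gauss or the generalized quadratic model, and the one-dimensional Gabor filters, their parameters (`freq`, `width`, `half_size`), and the helper names are assumptions made for the example.

```python
import math


def gabor_pair(freq=0.0625, width=6.0, half_size=32):
    """Return an even (cosine) and odd (sine) Gabor filter sharing one Gaussian envelope."""
    xs = range(-half_size, half_size + 1)
    envelope = [math.exp(-x * x / (2.0 * width * width)) for x in xs]
    even = [e * math.cos(2 * math.pi * freq * x) for e, x in zip(envelope, xs)]
    odd = [e * math.sin(2 * math.pi * freq * x) for e, x in zip(envelope, xs)]
    return even, odd


def grating(phase, freq=0.0625, half_size=32):
    """Sinusoidal stimulus sampled at the same positions as the filters."""
    return [math.cos(2 * math.pi * freq * x + phase)
            for x in range(-half_size, half_size + 1)]


def dot(a, b):
    """Linear filter output: inner product of filter and stimulus."""
    return sum(p * q for p, q in zip(a, b))


def energy_response(stimulus, even, odd):
    """Quadratic (energy) response: sum of squared quadrature filter outputs."""
    return dot(even, stimulus) ** 2 + dot(odd, stimulus) ** 2


even, odd = gabor_pair()
phases = [0.0, math.pi / 4, math.pi / 2, math.pi]
linear = [dot(even, grating(p)) for p in phases]                 # changes sign with phase
energies = [energy_response(grating(p), even, odd) for p in phases]  # nearly constant
```

With these settings the single linear filter output flips sign as the grating phase changes, while the energy response stays nearly constant. That phase tolerance is the property usually cited for the energy model.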
Affiliation(s)
- Priyank Jaini
- Cheriton School of Computer Science, Waterloo, Ontario, Canada; Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
| | - Johannes Burge
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA; Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA, USA; Bioengineering Graduate Group, University of Pennsylvania, Philadelphia, PA, USA
|
38
|
Młynarski W, McDermott JH. Learning Midlevel Auditory Codes from Natural Sound Statistics. Neural Comput 2017; 30:631-669. [PMID: 29220308 DOI: 10.1162/neco_a_01048] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/04/2022]
Abstract
Interaction with the world requires an organism to transform sensory signals into representations in which behaviorally meaningful properties of the environment are made explicit. These representations are derived through cascades of neuronal processing stages in which neurons at each stage recode the output of preceding stages. Explanations of sensory coding may thus involve understanding how low-level patterns are combined into more complex structures. To gain insight into such midlevel representations for sound, we designed a hierarchical generative model of natural sounds that learns combinations of spectrotemporal features from natural stimulus statistics. In the first layer, the model forms a sparse convolutional code of spectrograms using a dictionary of learned spectrotemporal kernels. To generalize from specific kernel activation patterns, the second layer encodes patterns of time-varying magnitude of multiple first-layer coefficients. When trained on corpora of speech and environmental sounds, some second-layer units learned to group similar spectrotemporal features. Others instantiate opponency between distinct sets of features. Such groupings might be instantiated by neurons in the auditory cortex, providing a hypothesis for midlevel neuronal computation.
|
39
|
Sasaki H, Gutmann MU, Shouno H, Hyvärinen A. Simultaneous Estimation of Nongaussian Components and Their Correlation Structure. Neural Comput 2017; 29:2887-2924. [PMID: 28777730 DOI: 10.1162/neco_a_01006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
The statistical dependencies that independent component analysis (ICA) cannot remove often provide rich information beyond the linear independent components. It would thus be very useful to estimate the dependency structure from data. While such models have been proposed, they have usually concentrated on higher-order correlations such as energy (square) correlations. Yet linear correlations are a fundamental and informative form of dependency in many real data sets. Linear correlations are usually completely removed by ICA and related methods, so they can only be analyzed by developing new methods that explicitly allow for linearly correlated components. In this article, we propose a probabilistic model of linear nongaussian components that are allowed to have both linear and energy correlations. The precision matrix of the linear components is assumed to be randomly generated by a higher-order process and explicitly parameterized by a parameter matrix. The estimation of the parameter matrix is shown to be particularly simple because, using score matching (Hyvärinen, 2005), the objective function is a quadratic form. Using simulations with artificial data, we demonstrate that the proposed method improves the identifiability of nongaussian components by simultaneously learning their correlation structure. Applications on simulated complex cells with natural image input, as well as spectrograms of natural audio data, show that the method finds new kinds of dependencies between the components.
Affiliation(s)
- Hiroaki Sasaki
- Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0192, Japan
| | - Michael U Gutmann
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.
| | - Hayaru Shouno
- Graduate School of Informatics and Engineering, University of Electro-Communications, Tokyo 182-8585, Japan
| | - Aapo Hyvärinen
- Helsinki Institute for Information Technology, University of Helsinki, Helsinki 00560, Finland, and Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, U.K.
|
40
|
Snow M, Coen-Cagli R, Schwartz O. Adaptation in the visual cortex: a case for probing neuronal populations with natural stimuli. F1000Res 2017; 6:1246. [PMID: 29034079 PMCID: PMC5532795 DOI: 10.12688/f1000research.11154.1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 07/24/2017] [Indexed: 12/19/2022] Open
Abstract
The perception of, and neural responses to, sensory stimuli in the present are influenced by what has been observed in the past—a phenomenon known as adaptation. We focus on adaptation in visual cortical neurons as a paradigmatic example. We review recent work that represents two shifts in the way we study adaptation, namely (i) going beyond single neurons to study adaptation in populations of neurons and (ii) going beyond simple stimuli to study adaptation to natural stimuli. We suggest that efforts in these two directions, through a closer integration of experimental and modeling approaches, will enable a more complete understanding of cortical processing in natural environments.
Affiliation(s)
- Michoel Snow
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Ruben Coen-Cagli
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Odelia Schwartz
- Department of Computer Science, University of Miami, Coral Gables, FL, 33146, USA
|
41
|
Poltoratski S, Ling S, McCormack D, Tong F. Characterizing the effects of feature salience and top-down attention in the early visual system. J Neurophysiol 2017; 118:564-573. [PMID: 28381491 PMCID: PMC5511869 DOI: 10.1152/jn.00924.2016] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 12/06/2016] [Revised: 03/31/2017] [Accepted: 04/01/2017] [Indexed: 11/22/2022] Open
Abstract
The visual system employs a sophisticated balance of attentional mechanisms: salient stimuli are prioritized for visual processing, yet observers can also ignore such stimuli when their goals require directing attention elsewhere. A powerful determinant of visual salience is local feature contrast: if a local region differs from its immediate surround along one or more feature dimensions, it will appear more salient. We used high-resolution functional MRI (fMRI) at 7T to characterize the modulatory effects of bottom-up salience and top-down voluntary attention within multiple sites along the early visual pathway, including visual areas V1-V4 and the lateral geniculate nucleus (LGN). Observers viewed arrays of spatially distributed gratings, where one of the gratings immediately to the left or right of fixation differed from all other items in orientation or motion direction, making it salient. To investigate the effects of directed attention, observers were cued to attend to the grating to the left or right of fixation, which was either salient or nonsalient. Results revealed reliable additive effects of top-down attention and stimulus-driven salience throughout visual areas V1-hV4. In comparison, the LGN exhibited significant attentional enhancement but was not reliably modulated by orientation- or motion-defined salience. Our findings indicate that top-down effects of spatial attention can influence visual processing at the earliest possible site along the visual pathway, including the LGN, whereas the processing of orientation- and motion-driven salience primarily involves feature-selective interactions that take place in early cortical visual areas.
NEW & NOTEWORTHY: While spatial attention allows for specific, goal-driven enhancement of stimuli, salient items outside of the current focus of attention must also be prioritized. We used 7T fMRI to compare salience and spatial attentional enhancement along the early visual hierarchy. We report additive effects of attention and bottom-up salience in early visual areas, suggesting that salience enhancement is not contingent on the observer's attentional state.
Affiliation(s)
- Sonia Poltoratski
- Vanderbilt Vision Research Center, Psychology Department, Vanderbilt University, Nashville, Tennessee
| | - Sam Ling
- Department of Psychological & Brain Sciences, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, Massachusetts
| | - Devin McCormack
- Vanderbilt Vision Research Center, Psychology Department, Vanderbilt University, Nashville, Tennessee
| | - Frank Tong
- Vanderbilt Vision Research Center, Psychology Department, Vanderbilt University, Nashville, Tennessee
|
42
|
Schallmo MP, Grant AN, Burton PC, Olman CA. The effects of orientation and attention during surround suppression of small image features: A 7 Tesla fMRI study. J Vis 2017; 16:19. [PMID: 27565016 PMCID: PMC5015919 DOI: 10.1167/16.10.19] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/30/2023] Open
Abstract
Although V1 responses are driven primarily by elements within a neuron's receptive field, which subtends about 1° visual angle in parafoveal regions, previous work has shown that localized fMRI responses to visual elements reflect not only local feature encoding but also long-range pattern attributes. However, separating the response to an image feature from the response to the surrounding stimulus and studying the interactions between these two responses demands both spatial precision and signal independence, which may be challenging to attain with fMRI. The present study used 7 Tesla fMRI with 1.2-mm resolution to measure the interactions between small sinusoidal grating patches (targets) at 3° eccentricity and surrounds of various sizes and orientations to test the conditions under which localized, context-dependent fMRI responses could be predicted from either psychophysical or electrophysiological data. Targets were presented at 8%, 16%, and 32% contrast while manipulating (a) spatial extent of parallel (strongly suppressive) or orthogonal (weakly suppressive) surrounds, (b) locus of attention, (c) stimulus onset asynchrony between target and surround, and (d) blocked versus event-related design. In all experiments, the V1 fMRI signal was lower when target stimuli were flanked by parallel versus orthogonal context. Attention amplified fMRI responses to all stimuli but did not show a selective effect on central target responses or a measurable effect on orientation-dependent surround suppression. Suppression of the V1 fMRI response by parallel surrounds was stronger than predicted from psychophysics but showed a better match to previous electrophysiological reports.
|
43
|
Angelucci A, Bijanzadeh M, Nurminen L, Federer F, Merlin S, Bressloff PC. Circuits and Mechanisms for Surround Modulation in Visual Cortex. Annu Rev Neurosci 2017; 40:425-451. [PMID: 28471714 DOI: 10.1146/annurev-neuro-072116-031418] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Indexed: 01/16/2023]
Abstract
Surround modulation (SM) is a fundamental property of sensory neurons in many species and sensory modalities. SM is the ability of stimuli in the surround of a neuron's receptive field (RF) to modulate (typically suppress) the neuron's response to stimuli simultaneously presented inside the RF, a property thought to underlie optimal coding of sensory information and important perceptual functions. Understanding the circuit and mechanisms for SM can reveal fundamental principles of computations in sensory cortices, from mouse to human. Current debate is centered over whether feedforward or intracortical circuits generate SM, and whether this results from increased inhibition or reduced excitation. Here we present a working hypothesis, based on theoretical and experimental evidence, that SM results from feedforward, horizontal, and feedback interactions with local recurrent connections, via synaptic mechanisms involving both increased inhibition and reduced recurrent excitation. In particular, strong and balanced recurrent excitatory and inhibitory circuits play a crucial role in the computation of SM.
Affiliation(s)
- Alessandra Angelucci
- Department of Ophthalmology and Visual Science, Moran Eye Institute, University of Utah, Salt Lake City, Utah 84132
| | - Maryam Bijanzadeh
- Department of Ophthalmology and Visual Science, Moran Eye Institute, University of Utah, Salt Lake City, Utah 84132
| | - Lauri Nurminen
- Department of Ophthalmology and Visual Science, Moran Eye Institute, University of Utah, Salt Lake City, Utah 84132
| | - Frederick Federer
- Department of Ophthalmology and Visual Science, Moran Eye Institute, University of Utah, Salt Lake City, Utah 84132
| | - Sam Merlin
- Department of Ophthalmology and Visual Science, Moran Eye Institute, University of Utah, Salt Lake City, Utah 84132
| | - Paul C Bressloff
- Department of Mathematics, University of Utah, Salt Lake City, Utah 84132
|
44
|
Aitchison L, Lengyel M. The Hamiltonian Brain: Efficient Probabilistic Inference with Excitatory-Inhibitory Neural Circuit Dynamics. PLoS Comput Biol 2016; 12:e1005186. [PMID: 28027294 PMCID: PMC5189947 DOI: 10.1371/journal.pcbi.1005186] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/06/2015] [Accepted: 10/06/2016] [Indexed: 12/19/2022] Open
Abstract
Probabilistic inference offers a principled framework for understanding both behaviour and cortical computation. However, two basic and ubiquitous properties of cortical responses seem difficult to reconcile with probabilistic inference: neural activity displays prominent oscillations in response to constant input, and large transient changes in response to stimulus onset. Indeed, cortical models of probabilistic inference have typically either concentrated on tuning curve or receptive field properties and remained agnostic as to the underlying circuit dynamics, or had simplistic dynamics that gave neither oscillations nor transients. Here we show that these dynamical behaviours may in fact be understood as hallmarks of the specific representation and algorithm that the cortex employs to perform probabilistic inference. We demonstrate that a particular family of probabilistic inference algorithms, Hamiltonian Monte Carlo (HMC), naturally maps onto the dynamics of excitatory-inhibitory neural networks. Specifically, we constructed a model of an excitatory-inhibitory circuit in primary visual cortex that performed HMC inference, and thus inherently gave rise to oscillations and transients. These oscillations were not mere epiphenomena but served an important functional role: speeding up inference by rapidly spanning a large volume of state space. Inference thus became an order of magnitude more efficient than in a non-oscillatory variant of the model. In addition, the network matched two specific properties of observed neural dynamics that would otherwise be difficult to account for using probabilistic inference. First, the frequency of oscillations as well as the magnitude of transients increased with the contrast of the image stimulus. Second, excitation and inhibition were balanced, and inhibition lagged excitation. 
These results suggest a new functional role for the separation of cortical populations into excitatory and inhibitory neurons, and for the neural oscillations that emerge in such excitatory-inhibitory networks: enhancing the efficiency of cortical computations.
Our brain operates in the face of substantial uncertainty due to ambiguity in the inputs, and inherent unpredictability in the environment. Behavioural and neural evidence indicates that the brain often uses a close approximation of the optimal strategy, probabilistic inference, to interpret sensory inputs and make decisions under uncertainty. However, the circuit dynamics underlying such probabilistic computations are unknown. In particular, two fundamental properties of cortical responses, the presence of oscillations and transients, are difficult to reconcile with probabilistic inference. We show that excitatory-inhibitory neural networks are naturally suited to implement a particular inference algorithm, Hamiltonian Monte Carlo. Our network showed oscillations and transients like those found in the cortex and took advantage of these dynamical motifs to speed up inference by an order of magnitude. These results suggest a new functional role for the separation of cortical populations into excitatory and inhibitory neurons, and for the neural oscillations that emerge in such excitatory-inhibitory networks: enhancing the efficiency of cortical computations.
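For readers unfamiliar with Hamiltonian Monte Carlo, the generic algorithm named in this abstract can be sketched as follows. This is a textbook HMC sampler on a one-dimensional standard normal target, not the excitatory-inhibitory circuit model of the paper; the function names (`leapfrog`, `hmc_sample`), the target density, and the step size and trajectory length are all illustrative assumptions.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme.

    q is the position (the sampled variable), p the auxiliary momentum,
    and grad_u the gradient of the potential energy U(q) = -log density.
    """
    p = p - 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q = q + step * p                    # full step for position
        if i < n_steps - 1:
            p = p - step * grad_u(q)        # full step for momentum
    p = p - 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def hmc_sample(n_samples, step=0.2, n_steps=10, seed=0):
    """Draw samples from N(0, 1) with HMC (illustrative target, assumed here)."""
    rng = random.Random(seed)
    u = lambda q: 0.5 * q * q               # potential energy of N(0, 1)
    grad_u = lambda q: q
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)            # resample momentum each iteration
        q_new, p_new = leapfrog(q, p0, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p0 * p0        # total energy before the trajectory
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's discretization error
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples
```

The position and momentum variables trade energy back and forth along each trajectory, which is the oscillatory behaviour that lets the sampler cross the state space quickly; the abstract's claim is that excitatory and inhibitory populations can play analogous roles.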
Affiliation(s)
- Laurence Aitchison
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
| | - Máté Lengyel
- Computational & Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Department of Cognitive Science, Central European University, Budapest, Hungary
|
45
|
Snow M, Coen-Cagli R, Schwartz O. Specificity and timescales of cortical adaptation as inferences about natural movie statistics. J Vis 2016; 16:2565618. [PMID: 27699416 PMCID: PMC5054764 DOI: 10.1167/16.13.1] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 09/25/2015] [Indexed: 11/30/2022] Open
Abstract
Adaptation is a phenomenological umbrella term under which a variety of temporal contextual effects are grouped. Previous models have shown that some aspects of visual adaptation reflect optimal processing of dynamic visual inputs, suggesting that adaptation should be tuned to the properties of natural visual inputs. However, the link between natural dynamic inputs and adaptation is poorly understood. Here, we extend a previously developed Bayesian modeling framework for spatial contextual effects to the temporal domain. The model learns temporal statistical regularities of natural movies and links these statistics to adaptation in primary visual cortex via divisive normalization, a ubiquitous neural computation. In particular, the model divisively normalizes the present visual input by the past visual inputs only to the degree that these are inferred to be statistically dependent. We show that this flexible form of normalization reproduces classical findings on how brief adaptation affects neuronal selectivity. Furthermore, prior knowledge acquired by the Bayesian model from natural movies can be modified by prolonged exposure to novel visual stimuli. We show that this updating can explain classical results on contrast adaptation. We also simulate the recent finding that adaptation maintains population homeostasis, namely, a balanced level of activity across a population of neurons with different orientation preferences. Consistent with previous disparate observations, our work further clarifies the influence of stimulus-specific and neuronal-specific normalization signals in adaptation.
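The idea of normalizing "only to the degree that" past and present inputs are inferred to be dependent can be illustrated with a minimal sketch. It assumes a simple mixture form in which the response is a weighted average of a context-normalized and a self-normalized term; the function name, the constant `sigma`, and the use of a given probability `p_dependent` (rather than one inferred from movie statistics) are hypothetical simplifications, not the authors' model.

```python
import math


def flexible_normalization(present, past, p_dependent, sigma=0.1):
    """Divisively normalize a present input by past inputs, gated by p_dependent.

    present      -- current filter output (float)
    past         -- list of past filter outputs forming the normalization pool
    p_dependent  -- assumed probability in [0, 1] that past and present
                    inputs are statistically dependent (given, not inferred)
    sigma        -- small constant that keeps the denominator positive
    """
    own_energy = present * present
    past_energy = sum(x * x for x in past)
    # Response if the past is dependent: the past joins the normalization pool.
    with_context = present / math.sqrt(sigma ** 2 + own_energy + past_energy)
    # Response if the past is independent: only the present input normalizes.
    without_context = present / math.sqrt(sigma ** 2 + own_energy)
    return p_dependent * with_context + (1.0 - p_dependent) * without_context
```

With `p_dependent = 0` the past has no effect, and with `p_dependent = 1` the sketch reduces to standard divisive normalization by the full pool, so intermediate values give a graded amount of adaptation.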
Affiliation(s)
- Michoel Snow
- Department of Systems and Computational Biology, and Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA.
| | - Ruben Coen-Cagli
- Department of Basic Neuroscience, University of Geneva, Switzerland; Department of Systems and Computational Biology, and Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA. https://sites.google.com/site/rubencoencagli/
| | - Odelia Schwartz
- Department of Computer Science, University of Miami, Miami, FL, USA; Dominick Purpura Department of Neuroscience, and Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA. http://www.cs.miami.edu/home/odelia/
|
46
|
Abstract
Surround suppression is a well-known phenomenon in which the response to a visual stimulus is diminished by the presence of neighboring stimuli. This effect is observed in neural responses in areas such as primary visual cortex, and also manifests in visual contrast perception. Studies in animal models have identified at least two separate mechanisms that may contribute to surround suppression: one that is monocular and resistant to contrast adaptation, and another that is binocular and strongly diminished by adaptation. The current study was designed to investigate whether these two mechanisms exist in humans and if they can be identified psychophysically using eye-of-origin and contrast adaptation manipulations. In addition, we examined the prediction that the monocular suppression component is broadly tuned for orientation, while suppression between eyes is narrowly tuned. Our results confirmed that when center and surrounding stimuli were presented dichoptically (in opposite eyes), suppression was orientation-tuned. Following adaptation in the surrounding region, no dichoptic suppression was observed, and monoptic suppression no longer showed orientation selectivity. These results are consistent with a model of surround suppression that depends on both low-level and higher level components. This work provides a method to assess the separate contributions of these components during spatial context processing in human vision.
|
47
|
Bruce ND, Wloka C, Frosst N, Rahman S, Tsotsos JK. On computational modeling of visual saliency: Examining what’s right, and what’s left. Vision Res 2015; 116:95-112. [DOI: 10.1016/j.visres.2015.01.010] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 07/26/2014] [Revised: 12/16/2014] [Accepted: 01/19/2015] [Indexed: 11/26/2022]
|
48
|
Mannion DJ, Kersten DJ, Olman CA. Scene coherence can affect the local response to natural images in human V1. Eur J Neurosci 2015; 42:2895-903. [PMID: 26390850 DOI: 10.1111/ejn.13082] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/18/2015] [Revised: 09/14/2015] [Accepted: 09/16/2015] [Indexed: 11/30/2022]
Abstract
Neurons in primary visual cortex (V1) can be indirectly affected by visual stimulation positioned outside their receptive fields. Although this contextual modulation has been intensely studied, we have little notion of how it manifests with naturalistic stimulation. Here, we investigated how the V1 response to a natural image fragment is affected by spatial context that is consistent or inconsistent with the scene from which it was extracted. Using functional magnetic resonance imaging at 7 T, we measured the blood oxygen level-dependent signal in human V1 (n = 8) while participants viewed an array of apertures. Most apertures showed fragments from a single scene, yielding a dominant perceptual interpretation which participants were asked to categorize, and the remaining apertures each showed fragments drawn from a set of 20 scenes. We find that the V1 response was significantly increased for apertures showing image structure that was coherent with the dominant scene relative to the response to the same image structure when it was non-coherent. Additional analyses suggest that this effect was mostly evident for apertures in the periphery of the visual field, that it peaked towards the centre of the aperture, and that it peaked in the middle to superficial regions of the cortical grey matter. These findings suggest that knowledge of typical spatial relationships is embedded in the circuitry of contextual modulation. Such mechanisms, possibly augmented by contributions from attentional factors, serve to increase the local V1 activity under conditions of contextual consistency.
Affiliation(s)
- Damien J Mannion
- School of Psychology, UNSW Australia, Sydney, NSW, 2052, Australia; Department of Psychology, University of Minnesota, Minneapolis, MN, USA
| | - Daniel J Kersten
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
| | - Cheryl A Olman
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
|
49
|
Coen-Cagli R, Kohn A, Schwartz O. Flexible gating of contextual influences in natural vision. Nat Neurosci 2015; 18:1648-55. [PMID: 26436902 PMCID: PMC4624479 DOI: 10.1038/nn.4128] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Received: 07/03/2015] [Accepted: 09/07/2015] [Indexed: 11/30/2022]
Abstract
Identical sensory inputs can be perceived as strikingly different when embedded in distinct contexts. Neural responses to simple stimuli are also modulated by context, but the contribution of this modulation to the processing of natural sensory input is unclear. We measured surround suppression, a quintessential contextual influence, in macaque primary visual cortex with natural images. We found suppression strength varied substantially for different images. This variability was not well explained by existing descriptions of surround suppression, but it was predicted by Bayesian inference about statistical dependencies in images. In this framework, surround suppression was flexible: it was recruited when the image was inferred to contain redundancies, and substantially reduced in strength otherwise. Our results thus reveal a surprising gating of a basic, widespread cortical computation, by inference about the statistics of natural input.
Affiliation(s)
- Ruben Coen-Cagli
- D.P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Adam Kohn
- D.P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Odelia Schwartz
- D.P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, USA; Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
|
50
|
Abstract
In primary visual cortex (V1), neuronal responses are sensitive to context. For example, responses to stimuli presented within the receptive field (RF) center are often suppressed by stimuli within the RF surround, and this suppression tends to be strongest when the center and surround stimuli match. We sought to identify the mechanism that gives rise to these properties of surround modulation. To do so, we exploited the stability of implanted multielectrode arrays to record from neurons in V1 of alert monkeys with multiple stimulus sets that more exhaustively probed center-surround interactions. We first replicated previous results concerning center-surround similarity using gratings representing all combinations of center and surround orientation. With this stimulus set, the surround simply scaled population responses to the center, such that the overall population tuning curve had the same shape and peak response. However, when the center contained two superimposed gratings (i.e., a visual "plaid"), one component of which always matched the surround orientation, suppression selectively affected the portion of the response driven by the matching center component, thereby producing shifts in the peak of the population orientation tuning curve. In effect, the surround caused neurons to respond predominantly to the component grating of the center plaid that was unmatched to the surround grating, as if by reducing the effective strength of whichever stimulus attributes were matched to the surround. These results provide key physiological support for theoretical models that propose feature-specific, input-gain control as the mechanism underlying surround suppression.
|