1
Castellotti S, Del Viva MM. Neural Substrates for Early Data Reduction in Fast Vision: A Psychophysical Investigation. Brain Sci 2024; 14:753. [PMID: 39199448 PMCID: PMC11352587 DOI: 10.3390/brainsci14080753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 07/23/2024] [Accepted: 07/25/2024] [Indexed: 09/01/2024] Open
Abstract
To ensure survival, the visual system must rapidly extract the most important elements from a large stream of information. This necessity clashes with the computational limitations of the human brain, so a strong early data reduction is required to efficiently process information in fast vision. A theoretical early vision model, recently developed to preserve maximum information using minimal computational resources, allows efficient image data reduction by extracting simplified sketches containing only optimally informative, salient features. Here, we investigate the neural substrates of this mechanism for optimal encoding of information, possibly located in early visual structures. We adopted a flicker adaptation paradigm, which has been demonstrated to specifically impair the contrast sensitivity of the magnocellular pathway. We compared flicker-induced contrast threshold changes in three different tasks. The results indicate that, after adapting to a uniform flickering field, thresholds for image discrimination using briefly presented sketches increase. Similar threshold elevations occur for motion discrimination, a task typically targeting the magnocellular system. Instead, contrast thresholds for orientation discrimination, a task typically targeting the parvocellular system, do not change with flicker adaptation. The computation performed by this early data reduction mechanism seems thus consistent with magnocellular processing.
Affiliation(s)
- Serena Castellotti
- Department of Translational Research on New Technologies in Medicine and Surgery, University of Pisa, 56126 Pisa, Italy
- Department of Neurosciences, Psychology, Drug Research and Child Health (NEUROFARBA), University of Florence, 50135 Florence, Italy
- Maria Michela Del Viva
- Department of Neurosciences, Psychology, Drug Research and Child Health (NEUROFARBA), University of Florence, 50135 Florence, Italy
2
Lakshminarasimhan KJ, Xie M, Cohen JD, Sauerbrei BA, Hantman AW, Litwin-Kumar A, Escola S. Specific connectivity optimizes learning in thalamocortical loops. Cell Rep 2024; 43:114059. [PMID: 38602873 PMCID: PMC11104520 DOI: 10.1016/j.celrep.2024.114059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 01/04/2024] [Accepted: 03/20/2024] [Indexed: 04/13/2024] Open
Abstract
Thalamocortical loops have a central role in cognition and motor control, but precisely how they contribute to these processes is unclear. Recent studies showing evidence of plasticity in thalamocortical synapses indicate a role for the thalamus in shaping cortical dynamics through learning. Since signals undergo a compression from the cortex to the thalamus, we hypothesized that the computational role of the thalamus depends critically on the structure of corticothalamic connectivity. To test this, we identified the optimal corticothalamic structure that promotes biologically plausible learning in thalamocortical synapses. We found that corticothalamic projections specialized to communicate an efference copy of the cortical output benefit motor control, while communicating the modes of highest variance is optimal for working memory tasks. We analyzed neural recordings from mice performing grasping and delayed discrimination tasks and found corticothalamic communication consistent with these predictions. These results suggest that the thalamus orchestrates cortical dynamics in a functionally precise manner through structured connectivity.
Affiliation(s)
- Marjorie Xie
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Jeremy D Cohen
- Neuroscience Center, University of North Carolina, Chapel Hill, NC 27559, USA
- Britton A Sauerbrei
- Department of Neurosciences, Case Western Reserve University, Cleveland, OH 44106, USA
- Adam W Hantman
- Neuroscience Center, University of North Carolina, Chapel Hill, NC 27559, USA
- Ashok Litwin-Kumar
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Sean Escola
- Department of Psychiatry, Columbia University, New York, NY 10032, USA
3
Bhaskaran AA, Gauvrit T, Vyas Y, Bony G, Ginger M, Frick A. Endogenous noise of neocortical neurons correlates with atypical sensory response variability in the Fmr1 -/y mouse model of autism. Nat Commun 2023; 14:7905. [PMID: 38036566 PMCID: PMC10689491 DOI: 10.1038/s41467-023-43777-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/10/2023] [Accepted: 11/20/2023] [Indexed: 12/02/2023] Open
Abstract
Excessive neural variability of sensory responses is a hallmark of atypical sensory processing in autistic individuals with cascading effects on other core autism symptoms but unknown neurobiological substrate. Here, by recording neocortical single neuron activity in a well-established mouse model of Fragile X syndrome and autism, we characterized atypical sensory processing and probed the role of endogenous noise sources in exaggerated response variability in males. The analysis of sensory stimulus evoked activity and spontaneous dynamics, as well as neuronal features, reveals a complex cellular and network phenotype. Neocortical sensory information processing is more variable and temporally imprecise. Increased trial-by-trial and inter-neuronal response variability is strongly related to key endogenous noise features, and may give rise to behavioural sensory responsiveness variability in autism. We provide a novel preclinical framework for understanding the sources of endogenous noise and its contribution to core autism symptoms, and for testing the functional consequences for mechanism-based manipulation of noise.
Affiliation(s)
- Arjun A Bhaskaran
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
- Department of Psychiatry, Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, Canada
- Théo Gauvrit
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
- Yukti Vyas
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
- Guillaume Bony
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
- Melanie Ginger
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
- Andreas Frick
- INSERM, U1215 Neurocentre Magendie, 33077, Bordeaux, France
- University of Bordeaux, 33000, Bordeaux, France
4
Dayan P. Metacognitive Information Theory. Open Mind (Camb) 2023; 7:392-411. [PMID: 37637303 PMCID: PMC10449404 DOI: 10.1162/opmi_a_00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 06/25/2023] [Indexed: 08/29/2023] Open
Abstract
The capacity that subjects have to rate confidence in their choices is a form of metacognition, and can be assessed according to bias, sensitivity and efficiency. Rich networks of domain-specific and domain-general regions of the brain are involved in the rating, and are associated with its quality and its use for regulating the processes of thinking and acting. Sensitivity and efficiency are often measured by quantities called meta-d' and the M-ratio that are based on reverse engineering the potential accuracy of the original, primary, choice that is implied by the quality of the confidence judgements. Here, we advocate a straightforward measure of sensitivity, called meta-𝓘, which assesses the mutual information between the accuracy of the subject's choices and the confidence reports, and two normalized versions of this measure that quantify efficiency in different regimes. Unlike most other measures, meta-𝓘-based quantities increase with the number of correctly assessed bins with which confidence is reported. We illustrate meta-𝓘 on data from a perceptual decision-making task, and via a simple form of simulated second-order metacognitive observer.
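As a rough illustration of the sensitivity measure described in the abstract above, the sketch below computes a plug-in estimate of the mutual information between per-trial accuracy and confidence ratings. The function name `mutual_information_bits`, the plug-in estimator, and the toy trial data are assumptions made for illustration; the abstract does not say which estimator is used, and the two normalized efficiency measures are not reproduced here.

```python
import math
from collections import Counter


def mutual_information_bits(accuracy, confidence):
    """Plug-in estimate of I(accuracy; confidence) in bits.

    accuracy:   per-trial labels (e.g. 1 = correct, 0 = error)
    confidence: per-trial confidence bins (any hashable labels)
    """
    if len(accuracy) != len(confidence):
        raise ValueError("accuracy and confidence must have the same length")
    n = len(accuracy)
    if n == 0:
        raise ValueError("at least one trial is required")

    joint = Counter(zip(accuracy, confidence))  # joint counts
    acc_counts = Counter(accuracy)              # marginal counts for accuracy
    conf_counts = Counter(confidence)           # marginal counts for confidence

    mi = 0.0
    for (a, c), count in joint.items():
        p_joint = count / n
        # p(a, c) / (p(a) * p(c)), written in terms of raw counts
        ratio = count * n / (acc_counts[a] * conf_counts[c])
        mi += p_joint * math.log2(ratio)
    return mi


# Hypothetical toy data: confidence that tracks accuracy perfectly,
# and confidence that is unrelated to accuracy.
informative = mutual_information_bits([1, 1, 0, 0], [2, 2, 1, 1])
uninformative = mutual_information_bits([1, 1, 0, 0], [1, 2, 1, 2])
print(informative, uninformative)
```

In this toy case, confidence that perfectly separates correct from incorrect trials carries one bit about accuracy, while confidence that is independent of accuracy carries none. Plug-in estimates of this kind are known to be biased upward for small samples, so real data would likely need a correction.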
Affiliation(s)
- Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- University of Tübingen, Tübingen, Germany
5
Castellotti S, D’Agostino O, Del Viva MM. Fast discrimination of fragmentary images: the role of local optimal information. Front Hum Neurosci 2023; 17:1049615. [PMID: 36845876 PMCID: PMC9945129 DOI: 10.3389/fnhum.2023.1049615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Accepted: 01/18/2023] [Indexed: 02/11/2023] Open
Abstract
In naturalistic conditions, objects in the scene may be partly occluded and the visual system has to recognize the whole image based on the little information contained in some visible fragments. Previous studies demonstrated that humans can successfully recognize severely occluded images, but the underlying mechanisms occurring in the early stages of visual processing are still poorly understood. The main objective of this work is to investigate the contribution of local information contained in a few visible fragments to image discrimination in fast vision. It has been already shown that a specific set of features, predicted by a constrained maximum-entropy model to be optimal carriers of information (optimal features), are used to build simplified early visual representations (primal sketch) that are sufficient for fast image discrimination. These features are also considered salient by the visual system and can guide visual attention when presented isolated in artificial stimuli. Here, we explore whether these local features also play a significant role in more natural settings, where all existing features are kept, but the overall available information is drastically reduced. Indeed, the task requires discrimination of naturalistic images based on a very brief presentation (25 ms) of a few small visible image fragments. In the main experiment, we reduced the possibility to perform the task based on global-luminance positional cues by presenting randomly inverted-contrast images, and we measured how much observers' performance relies on the local features contained in the fragments or on global information. The size and the number of fragments were determined in two preliminary experiments. Results show that observers are very skilled in fast image discrimination, even when a drastic occlusion is applied. When observers cannot rely on the position of global-luminance information, the probability of correct discrimination increases when the visible fragments contain a high number of optimal features. These results suggest that such optimal local information contributes to the successful reconstruction of naturalistic images even in challenging conditions.
6
Abstract
When grasping objects, we rely on our sense of touch to adjust our grip and react against external perturbations. Less than 200 ms after an unexpected event, the sensorimotor system is able to process tactile information to deduce the frictional strength of the contact and to react accordingly. Given that roughly 1,300 afferents innervate the fingertips, it is unclear how the nervous system can process such a large influx of data in a sufficiently short time span. In this study, we measured the deformation of the skin during the initial stages of incipient sliding for a wide range of frictional conditions. We show that the dominant patterns of deformation are sufficient to estimate the distance between the frictional force and the frictional strength of the contact. From these stereotypical patterns, a classifier can predict if an object is about to slide during the initial stages of incipient slip. The prediction is robust to the actual value of the interfacial friction, showing sensory invariance. These results suggest the existence of a possible compact set of bases that we call Eigenstrains. These Eigenstrains are a potential mechanism to rapidly decode the margin from full slip from the tactile information contained in the deformation of the skin. Our findings suggest that only 6 of these Eigenstrains are necessary to classify whether the object is firmly stuck to the fingers or is close to slipping away. These findings give clues about the tactile regulation of grasp and the insights are directly applicable to the design of robotic grippers and prosthetics that rapidly react to external perturbations.
7
Zhou L, Zhou T, Khan S, Sun H, Shen J, Shao L. Weakly Supervised Visual Saliency Prediction. IEEE Trans Image Process 2022; 31:3111-3124. [PMID: 35380961 DOI: 10.1109/tip.2022.3158064] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
The success of current deep saliency models heavily depends on large amounts of annotated human fixation data to fit the highly non-linear mapping between the stimuli and visual saliency. Such fully supervised data-driven approaches are annotation-intensive and often fail to consider the underlying mechanisms of visual attention. In contrast, in this paper, we introduce a model based on various cognitive theories of visual saliency, which learns visual attention patterns in a weakly supervised manner. Our approach incorporates insights from cognitive science as differentiable submodules, resulting in a unified, end-to-end trainable framework. Specifically, our model encapsulates the following important components motivated from biological vision. (a) As scene semantics are closely related to visually attentive regions, our model encodes discriminative spatial information for scene understanding through spatial visual semantics embedding. (b) To model the objectness factors in visual attention deployment, we incorporate object-level semantics embedding and object relation information. (c) Considering the "winner-take-all" mechanism in visual stimuli processing, we model the competition mechanism among objects with softmax based neural attention. (d) Lastly, a conditional center prior is learned to mimic the spatial distribution bias of visual attention. Furthermore, we propose novel loss functions to utilize supervision cues from image-level semantics, saliency prior knowledge, and self-information compression. Experiments show that our method achieves promising results, and even outperforms many of its fully supervised counterparts. Overall, our weakly supervised saliency method makes an essential step towards reducing the annotation budget of current approaches, as well as providing a more comprehensive understanding of the visual attention mechanism. Our code is available at: https://github.com/ashleylqx/WeakFixation.git.
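Component (c) in the abstract above describes competition among objects through softmax-based attention. A minimal sketch of that idea follows; the function name `softmax`, the `temperature` parameter, and the object scores are hypothetical and are not taken from the authors' released code.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw object scores into attention weights that sum to 1.

    A lower temperature sharpens the competition, so the highest-scoring
    object takes a larger share of the attention (winner-take-all-like).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical saliency scores for three candidate objects.
object_scores = [2.0, 1.0, 0.5]
soft = softmax(object_scores)                    # graded competition
sharp = softmax(object_scores, temperature=0.1)  # near winner-take-all
print(soft)
print(sharp)
```

Because the softmax is differentiable, a competition step of this kind can sit inside an end-to-end trainable model, which is consistent with the framing in the abstract.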
8
Cessac B. Retinal Processing: Insights from Mathematical Modelling. J Imaging 2022; 8:14. [PMID: 35049855 PMCID: PMC8780400 DOI: 10.3390/jimaging8010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 01/11/2022] [Accepted: 01/12/2022] [Indexed: 02/04/2023] Open
Abstract
The retina is the entrance of the visual system. Although based on common biophysical principles, the dynamics of retinal neurons are quite different from their cortical counterparts, raising interesting problems for modellers. In this paper, I address some mathematically stated questions in this spirit, discussing, in particular: (1) How could lateral amacrine cell connectivity shape the spatio-temporal spike response of retinal ganglion cells? (2) How could spatio-temporal stimuli correlations and retinal network dynamics shape the spike train correlations at the output of the retina? These questions are addressed, first, introducing a mathematically tractable model of the layered retina, integrating amacrine cells' lateral connectivity and piecewise linear rectification, allowing for computing the retinal ganglion cells receptive field together with the voltage and spike correlations of retinal ganglion cells resulting from the amacrine cells networks. Then, I review some recent results showing how the concept of spatio-temporal Gibbs distributions and linear response theory can be used to characterize the collective spike response to a spatio-temporal stimulus of a set of retinal ganglion cells, coupled via effective interactions corresponding to the amacrine cells network. On these bases, I briefly discuss several potential consequences of these results at the cortical level.
Affiliation(s)
- Bruno Cessac
- France INRIA Biovision Team and Neuromod Institute, Université Côte d'Azur, 2004 Route des Lucioles, BP 93, 06902 Valbonne, France
9
Rose O, Johnson J, Wang B, Ponce CR. Visual prototypes in the ventral stream are attuned to complexity and gaze behavior. Nat Commun 2021; 12:6723. [PMID: 34795262 PMCID: PMC8602238 DOI: 10.1038/s41467-021-27027-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2021] [Accepted: 11/01/2021] [Indexed: 01/02/2023] Open
Abstract
Early theories of efficient coding suggested the visual system could compress the world by learning to represent features where information was concentrated, such as contours. This view was validated by the discovery that neurons in posterior visual cortex respond to edges and curvature. Still, it remains unclear what other information-rich features are encoded by neurons in more anterior cortical regions (e.g., inferotemporal cortex). Here, we use a generative deep neural network to synthesize images guided by neuronal responses from across the visuocortical hierarchy, using floating microelectrode arrays in areas V1, V4 and inferotemporal cortex of two macaque monkeys. We hypothesize these images ("prototypes") represent such predicted information-rich features. Prototypes vary across areas, show moderate complexity, and resemble salient visual attributes and semantic content of natural images, as indicated by the animals' gaze behavior. This suggests the code for object recognition represents compressed features of behavioral relevance, an underexplored aspect of efficient coding.
Affiliation(s)
- Olivia Rose
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- James Johnson
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA
- Binxu Wang
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Carlos R Ponce
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
10
Chauhan T, Masquelier T, Cottereau BR. Sub-Optimality of the Early Visual System Explained Through Biologically Plausible Plasticity. Front Neurosci 2021; 15:727448. [PMID: 34602970 PMCID: PMC8480265 DOI: 10.3389/fnins.2021.727448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Accepted: 08/25/2021] [Indexed: 11/13/2022] Open
Abstract
The early visual cortex is the site of crucial pre-processing for more complex, biologically relevant computations that drive perception and, ultimately, behaviour. This pre-processing is often studied under the assumption that neural populations are optimised for the most efficient (in terms of energy, information, spikes, etc.) representation of natural statistics. Normative models such as Independent Component Analysis (ICA) and Sparse Coding (SC) consider the phenomenon as a generative, minimisation problem which they assume the early cortical populations have evolved to solve. However, measurements in monkey and cat suggest that receptive fields (RFs) in the primary visual cortex are often noisy, blobby, and symmetrical, making them sub-optimal for operations such as edge-detection. We propose that this suboptimality occurs because the RFs do not emerge through a global minimisation of generative error, but through locally operating biological mechanisms such as spike-timing dependent plasticity (STDP). Using a network endowed with an abstract, rank-based STDP rule, we show that the shape and orientation tuning of the converged units are remarkably close to single-cell measurements in the macaque primary visual cortex. We quantify this similarity using physiological parameters (frequency-normalised spread vectors), information theoretic measures [Kullback–Leibler (KL) divergence and Gini index], as well as simulations of a typical electrophysiology experiment designed to estimate orientation tuning curves. Taken together, our results suggest that compared to purely generative schemes, process-based biophysical models may offer a better description of the suboptimality observed in the early visual cortex.
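The abstract above names two information-theoretic measures, the Kullback-Leibler (KL) divergence and the Gini index. The sketch below gives their standard textbook definitions for discrete distributions and non-negative samples. The function names and the toy inputs are assumptions, and the abstract does not say which quantities the authors applied these measures to.

```python
import math


def kl_divergence_bits(p, q):
    """KL divergence D(p || q) in bits between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total


def gini_index(values):
    """Gini index of non-negative values: 0 = uniform, near 1 = sparse."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or xs[0] < 0 or total <= 0:
        raise ValueError("need non-negative values with a positive sum")
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical toy inputs.
print(kl_divergence_bits([1.0, 0.0], [0.5, 0.5]))
print(gini_index([1, 1, 1, 1]), gini_index([0, 0, 0, 1]))
```

The KL divergence is zero only when the two distributions match, and the Gini index grows as the values become more concentrated in a few entries, which is why it is often used as a sparseness measure.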
Affiliation(s)
- Tushar Chauhan
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, Toulouse, France; Centre National de la Recherche Scientifique, Toulouse, France
- Timothée Masquelier
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, Toulouse, France; Centre National de la Recherche Scientifique, Toulouse, France
- Benoit R Cottereau
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, Toulouse, France; Centre National de la Recherche Scientifique, Toulouse, France
11
Farrell M, Recanatesi S, Reid RC, Mihalas S, Shea-Brown E. Autoencoder networks extract latent variables and encode these variables in their connectomes. Neural Netw 2021; 141:330-343. [PMID: 33957382 DOI: 10.1016/j.neunet.2021.03.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/01/2020] [Revised: 03/02/2021] [Accepted: 03/08/2021] [Indexed: 11/30/2022]
Abstract
Advances in electron microscopy and data processing techniques are leading to increasingly large and complete microscale connectomes. At the same time, advances in artificial neural networks have produced model systems that perform comparably rich computations with perfectly specified connectivity. This raises an exciting scientific opportunity for the study of both biological and artificial neural networks: to infer the underlying circuit function from the structure of its connectivity. A potential roadblock, however, is that - even with well constrained neural dynamics - there are in principle many different connectomes that could support a given computation. Here, we define a tractable setting in which the problem of inferring circuit function from circuit connectivity can be analyzed in detail: the function of input compression and reconstruction, in an autoencoder network with a single hidden layer. Here, in general there is substantial ambiguity in the weights that can produce the same circuit function, because largely arbitrary changes to input weights can be undone by applying the inverse modifications to the output weights. However, we use mathematical arguments and simulations to show that adding simple, biologically motivated regularization of connectivity resolves this ambiguity in an interesting way: weights are constrained such that the latent variable structure underlying the inputs can be extracted from the weights by using nonlinear dimensionality reduction methods.
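The weight ambiguity mentioned in the abstract above can be made concrete in the simplest setting. The sketch below assumes a purely linear autoencoder with one hidden layer (the abstract does not state the activation function), and every matrix in it is a hypothetical toy value. Multiplying the input weights by an invertible matrix `A` and the output weights by its inverse leaves the end-to-end map unchanged.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical linear autoencoder: 3 inputs -> 2 hidden units -> 3 outputs.
W_in = [[1, 0, 2],
        [0, 1, -1]]       # hidden x input
W_out = [[1, 0],
         [0, 1],
         [1, 1]]          # output x hidden

# Any invertible change of hidden-layer coordinates (determinant 1 here).
A = [[2, 1],
     [1, 1]]
A_inv = [[1, -1],
         [-1, 2]]

W_in_alt = matmul(A, W_in)          # modified input weights
W_out_alt = matmul(W_out, A_inv)    # inverse modification on output weights

# The two weight sets differ, yet the overall input-to-output map is identical.
print(W_in_alt != W_in)
print(matmul(W_out, W_in) == matmul(W_out_alt, W_in_alt))
```

This is the degeneracy that, according to the abstract, a biologically motivated regularization of connectivity resolves; the regularizer itself is not sketched here because the abstract does not specify it.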
Affiliation(s)
- Matthew Farrell
- Applied Mathematics Department, University of Washington, Seattle, WA, United States of America; Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
- Stefano Recanatesi
- Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
- R Clay Reid
- Allen Institute for Brain Science, Seattle, WA, United States of America
- Stefan Mihalas
- Allen Institute for Brain Science, Seattle, WA, United States of America
- Eric Shea-Brown
- Applied Mathematics Department, University of Washington, Seattle, WA, United States of America; Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America; Allen Institute for Brain Science, Seattle, WA, United States of America
12
Hsu WMM, Kastner DB, Baccus SA, Sharpee TO. How inhibitory neurons increase information transmission under threshold modulation. Cell Rep 2021; 35:109158. [PMID: 34038717 PMCID: PMC8846953 DOI: 10.1016/j.celrep.2021.109158] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/01/2020] [Revised: 02/14/2021] [Accepted: 04/29/2021] [Indexed: 11/28/2022] Open
Abstract
Modulation of neuronal thresholds is ubiquitous in the brain. Phenomena such as figure-ground segmentation, motion detection, stimulus anticipation, and shifts in attention all involve changes in a neuron’s threshold based on signals from larger scales than its primary inputs. However, this modulation reduces the accuracy with which neurons can represent their primary inputs, creating a mystery as to why threshold modulation is so widespread in the brain. We find that modulation is less detrimental than other forms of neuronal variability and that its negative effects can be nearly completely eliminated if modulation is applied selectively to sparsely responding neurons in a circuit by inhibitory neurons. We verify these predictions in the retina where we find that inhibitory amacrine cells selectively deliver modulation signals to sparsely responding ganglion cell types. Our findings elucidate the central role that inhibitory neurons play in maximizing information transmission under modulation. Modulation of neuronal thresholds is ubiquitous in the brain but reduces the accuracy of neural signaling. Hsu et al. show that the negative impact of threshold modulation can be almost completely eliminated when modulation is not delivered uniformly to all neurons but only to a subset and via inhibitory neurons.
Affiliation(s)
- Wei-Mien M Hsu
- Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA; Department of Physics, University of California, San Diego, La Jolla, CA, USA
- David B Kastner
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, School of Medicine, San Francisco, CA, USA; Department of Neurobiology, Stanford University School of Medicine, Stanford, CA, USA
- Stephen A Baccus
- Department of Neurobiology, Stanford University School of Medicine, Stanford, CA, USA
- Tatyana O Sharpee
- Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA; Department of Physics, University of California, San Diego, La Jolla, CA, USA
13
Castellotti S, Montagnini A, Del Viva MM. Early Visual Saliency Based on Isolated Optimal Features. Front Neurosci 2021; 15:645743. [PMID: 33994923 PMCID: PMC8120310 DOI: 10.3389/fnins.2021.645743] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 04/06/2021] [Indexed: 12/02/2022] Open
Abstract
Under fast viewing conditions, the visual system extracts salient and simplified representations of complex visual scenes. Saccadic eye movements optimize such visual analysis through the dynamic sampling of the most informative and salient regions in the scene. However, a general definition of saliency, as well as its role for natural active vision, is still a matter for discussion. Following the general idea that visual saliency may be based on the amount of local information, a recent constrained maximum-entropy model of early vision, applied to natural images, extracts a set of local optimal information-carriers, as candidate salient features. These optimal features proved to be more informative than others in fast vision, when embedded in simplified sketches of natural images. In the present study, for the first time, these features were presented in isolation, to investigate whether they can be visually more salient than other non-optimal features, even in the absence of any meaningful global arrangement (contour, line, etc.). In four psychophysics experiments, fast discriminability of a compound of optimal features (target) in comparison with a similar compound of non-optimal features (distractor) was measured as a function of their number and contrast. Results showed that the saliency predictions from the constrained maximum-entropy model are well verified in the data, even when the optimal features are presented in smaller numbers or at lower contrast. In the eye movements experiment, the target and the distractor compounds were presented in the periphery at different angles. Participants were asked to perform a simple choice-saccade task. Results showed that saccades can select informative optimal features spatially interleaved with non-optimal features even at the shortest latencies. Saccades’ choice accuracy and landing position precision improved with SNR. In conclusion, the optimal features predicted by the reference model, turn out to be more salient than others, despite the lack of any clues coming from a global meaningful structure, suggesting that they get preferential treatment during fast image analysis. Also, peripheral fast visual processing of these informative local features is able to guide gaze orientation. We speculate that active vision is efficiently adapted to maximize information in natural visual scenes.
Affiliation(s)
- Anna Montagnini
- Institut de Neurosciences de la Timone (UMR 7289), CNRS and Aix-Marseille Université, Marseille, France
14
Nieves JL, Ojeda J, Gómez-Robledo L, Romero J. Psychophysical Determination of the Relevant Colours That Describe the Colour Palette of Paintings. J Imaging 2021; 7:72. [PMID: 34460522 PMCID: PMC8321366 DOI: 10.3390/jimaging7040072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/05/2021] [Revised: 04/10/2021] [Accepted: 04/13/2021] [Indexed: 11/16/2022] Open
Abstract
In an early study, the so-called “relevant colour” in a painting was heuristically introduced as a term to describe the number of colours that would stand out for an observer when just glancing at a painting. The purpose of this study is to analyse how observers determine the relevant colours by describing observers’ subjective impressions of the most representative colours in paintings and to provide a psychophysical backing for a related computational model we proposed in a previous work. This subjective impression is elicited by an efficient and optimal processing of the most representative colour instances in painting images. Our results suggest an average number of 21 subjective colours. This number is in close agreement with the computational number of relevant colours previously obtained and allows a reliable segmentation of colour images using a small number of colours without introducing any colour categorization. In addition, our results are in good agreement with the directions of colour preferences derived from an independent component analysis. We show that independent component analysis of the painting images yields directions of colour preference aligned with the relevant colours of these images. Following on from this analysis, the results suggest that hue colour components are efficiently distributed throughout a discrete number of directions and could be relevant instances to a priori describe the most representative colours that make up the colour palette of paintings.
|
15
|
Gutierrez GJ, Rieke F, Shea-Brown ET. Nonlinear convergence boosts information coding in circuits with parallel outputs. Proc Natl Acad Sci U S A 2021; 118:e1921882118. [PMID: 33593894 PMCID: PMC7923546 DOI: 10.1073/pnas.1921882118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] Open
Abstract
Neural circuits are structured with layers of converging and diverging connectivity and selectivity-inducing nonlinearities at neurons and synapses. These components have the potential to hamper an accurate encoding of the circuit inputs. Past computational studies have optimized the nonlinearities of single neurons, or connection weights in networks, to maximize encoded information, but have not grappled with the simultaneous impact of convergent circuit structure and nonlinear response functions for efficient coding. Our approach is to compare model circuits with different combinations of convergence, divergence, and nonlinear neurons to discover how interactions between these components affect coding efficiency. We find that a convergent circuit with divergent parallel pathways can encode more information with nonlinear subunits than with linear subunits, despite the compressive loss induced by the convergence and the nonlinearities when considered separately.
Affiliation(s)
- Gabrielle J Gutierrez
  - Department of Applied Mathematics, University of Washington, Seattle, WA 98195
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195
- Fred Rieke
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195
- Eric T Shea-Brown
  - Department of Applied Mathematics, University of Washington, Seattle, WA 98195
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195
|
16
|
Chauhan T, Héjja-Brichard Y, Cottereau BR. Modelling binocular disparity processing from statistics in natural scenes. Vision Res 2020; 176:27-39. [DOI: 10.1016/j.visres.2020.07.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/30/2019] [Revised: 07/19/2020] [Accepted: 07/20/2020] [Indexed: 11/25/2022]
|
17
|
Yildizoglu T, Riegler C, Fitzgerald JE, Portugues R. A Neural Representation of Naturalistic Motion-Guided Behavior in the Zebrafish Brain. Curr Biol 2020; 30:2321-2333.e6. [PMID: 32386533 DOI: 10.1016/j.cub.2020.04.043] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/10/2018] [Revised: 03/13/2020] [Accepted: 04/20/2020] [Indexed: 11/20/2022]
Abstract
All animals must transform ambiguous sensory data into successful behavior. This requires sensory representations that accurately reflect the statistics of natural stimuli and behavior. Multiple studies show that visual motion processing is tuned for accuracy under naturalistic conditions, but the sensorimotor circuits extracting these cues and implementing motion-guided behavior remain unclear. Here we show that the larval zebrafish retina extracts a diversity of naturalistic motion cues, and the retinorecipient pretectum organizes these cues around the elements of behavior. We find that higher-order motion stimuli, gliders, induce optomotor behavior matching expectations from natural scene analyses. We then image activity of retinal ganglion cell terminals and pretectal neurons. The retina exhibits direction-selective responses across glider stimuli, and anatomically clustered pretectal neurons respond with magnitudes matching behavior. Peripheral computations thus reflect natural input statistics, whereas central brain activity precisely codes information needed for behavior. This general principle could organize sensorimotor transformations across animal species.
Affiliation(s)
- Tugce Yildizoglu
  - Max Planck Institute of Neurobiology, Research Group of Sensorimotor Control, Martinsried 82152, Germany
- Clemens Riegler
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
  - Department of Neurobiology, Faculty of Life Sciences, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- James E Fitzgerald
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
- Ruben Portugues
  - Max Planck Institute of Neurobiology, Research Group of Sensorimotor Control, Martinsried 82152, Germany
  - Institute of Neuroscience, Technical University of Munich, Munich 80802, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich 80802, Germany
|
18
|
Giraldo LGS, Schwartz O. Integrating Flexible Normalization into Midlevel Representations of Deep Convolutional Neural Networks. Neural Comput 2019; 31:2138-2176. [PMID: 31525314 DOI: 10.1162/neco_a_01226] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
Deep convolutional neural networks (CNNs) are becoming increasingly popular models to predict neural responses in visual cortex. However, contextual effects, which are prevalent in neural processing and in perception, are not explicitly handled by current CNNs, including those used for neural prediction. In primary visual cortex, neural responses are modulated by stimuli spatially surrounding the classical receptive field in rich ways. These effects have been modeled with divisive normalization approaches, including flexible models, where spatial normalization is recruited only to the degree that responses from center and surround locations are deemed statistically dependent. We propose a flexible normalization model applied to midlevel representations of deep CNNs as a tractable way to study contextual normalization mechanisms in midlevel cortical areas. This approach captures nontrivial spatial dependencies among midlevel features in CNNs, such as those present in textures and other visual stimuli, that arise from tiling high-order features geometrically. We expect that the proposed approach can make predictions about when spatial normalization might be recruited in midlevel cortical areas. We also expect this approach to be useful as part of the CNN tool kit, therefore going beyond more restrictive fixed forms of normalization.
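The idea of recruiting spatial normalization "only to the degree" that center and surround are statistically dependent can be illustrated with a toy calculation. The sketch below is not the authors' model: the function name, the `p_dependent` weight, the `sigma` constant, and the squared-energy pool are all assumptions chosen for illustration.

```python
import math


def flexible_normalization(center, surround, p_dependent, sigma=0.1):
    """Divisively normalize a center response (illustrative sketch only).

    The surround energy enters the normalization pool weighted by
    p_dependent, a hypothetical probability in [0, 1] that center and
    surround are statistically dependent. sigma is an assumed constant
    that keeps the denominator away from zero.
    """
    if not 0.0 <= p_dependent <= 1.0:
        raise ValueError("p_dependent must lie in [0, 1]")
    surround_energy = sum(s * s for s in surround)
    pool = sigma ** 2 + center ** 2 + p_dependent * surround_energy
    return center / math.sqrt(pool)


if __name__ == "__main__":
    surround = [1.0, 1.0, 1.0]
    for p in (0.0, 0.5, 1.0):
        print(p, flexible_normalization(1.0, surround, p))
```

With `p_dependent = 0` the surround is ignored and only the center normalizes itself; as `p_dependent` approaches 1 the surround suppresses the center response more strongly.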
Affiliation(s)
- Odelia Schwartz
  - Computer Science Department, University of Miami, Coral Gables, FL 33146, U.S.A.
|
19
|
Zhaoping L. A new framework for understanding vision from the perspective of the primary visual cortex. Curr Opin Neurobiol 2019; 58:1-10. [PMID: 31271931 DOI: 10.1016/j.conb.2019.06.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 01/07/2019] [Revised: 06/02/2019] [Accepted: 06/10/2019] [Indexed: 11/25/2022]
Abstract
Visual attention selects only a tiny fraction of visual input information for further processing. Selection starts in the primary visual cortex (V1), which creates a bottom-up saliency map to guide the fovea to selected visual locations via gaze shifts. This motivates a new framework that views vision as consisting of encoding, selection, and decoding stages, placing selection on center stage. It suggests a massive loss of non-selected information from V1 downstream along the visual pathway. Hence, feedback from downstream visual cortical areas to V1 for better decoding (recognition), through analysis-by-synthesis, should query for additional information and be mainly directed at the foveal region. Accordingly, non-foveal vision is not only poorer in spatial resolution, but also more susceptible to many illusions.
Affiliation(s)
- Li Zhaoping
  - University of Tübingen, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
|
20
|
Stem cell-based retina models. Adv Drug Deliv Rev 2019; 140:33-50. [PMID: 29777757 DOI: 10.1016/j.addr.2018.05.005] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 12/19/2017] [Revised: 03/16/2018] [Accepted: 05/12/2018] [Indexed: 12/23/2022]
Abstract
From the early days of cell biological research, the eye, especially the retina, has evoked broad interest among scientists. The retina has since been thoroughly investigated and numerous models have been exploited to shed light on its development, morphology, and function. Apart from various animal models and human clinical and anatomical research, stem cell-based models of animal and human cells of origin have entered the field, especially during the last decade. Despite the observation that the retina of different species comprises endogenous stem cells, most stem cell-related research in the human retina is now based on pluripotent stem cell models. Herein, systems of two-dimensional (2D) cultures and co-cultures of distinctly differentiated retinal subtypes revealed a variety of cellular aspects but have in many aspects been replaced by three-dimensional (3D) structures, the so-called retinal organoids. These organoids not only contain all major retinal cell subtypes compared to the physiological situation, but also show a distinct layering in close proximity to the in vivo morphology. Nevertheless, all these models have inherent advantages and disadvantages, which are expounded and summarized in this review. Finally, we discuss current application aspects of stem cell-based retina models and the specific promises they hold for the future.
|
21
|
Turner MH, Sanchez Giraldo LG, Schwartz O, Rieke F. Stimulus- and goal-oriented frameworks for understanding natural vision. Nat Neurosci 2019; 22:15-24. [PMID: 30531846 PMCID: PMC8378293 DOI: 10.1038/s41593-018-0284-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/09/2018] [Accepted: 10/22/2018] [Indexed: 12/21/2022]
Abstract
Our knowledge of sensory processing has advanced dramatically in the last few decades, but this understanding remains far from complete, especially for stimuli with the large dynamic range and strong temporal and spatial correlations characteristic of natural visual inputs. Here we describe some of the issues that make understanding the encoding of natural images a challenge. We highlight two broad strategies for approaching this problem: a stimulus-oriented framework and a goal-oriented one. Different contexts can call for one framework or the other. Looking forward, recent advances, particularly those based in machine learning, show promise in borrowing key strengths of both frameworks and by doing so illuminating a path to a more comprehensive understanding of the encoding of natural stimuli.
Affiliation(s)
- Maxwell H Turner
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA
  - Graduate Program in Neuroscience, University of Washington, Seattle, WA, USA
- Odelia Schwartz
  - Department of Computer Science, University of Miami, Coral Gables, FL, USA
- Fred Rieke
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA
|
22
|
Ward J. Individual differences in sensory sensitivity: A synthesizing framework and evidence from normal variation and developmental conditions. Cogn Neurosci 2018; 10:139-157. [PMID: 30526338 DOI: 10.1080/17588928.2018.1557131] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Indexed: 12/29/2022]
Abstract
For some people, simple sensory stimuli (e.g., noises, patterns) may reliably evoke intense and aversive reactions. This is common in certain clinical groups (e.g., autism) and varies greatly in the neurotypical population. This paper critically evaluates the concept of individual differences in sensory sensitivity, explores its possible underlying neurobiological basis, and presents a roadmap for future research in this area. A distinction is made between subjective sensory sensitivity (self-reported symptoms); neural sensory sensitivity (the degree of neural activity induced by sensory stimuli); and behavioral sensory sensitivity (detection and discrimination of sensory stimuli). Whereas increased subjective and neural sensory sensitivity are assumed to increase together, the status of behavioral sensory sensitivity depends on the extent to which the increased neural activity is linked to signal or noise. A signal detection framework is presented that offers a unifying framework for exploring sensory sensitivity across different conditions. The framework is discussed, in more concrete terms, by linking it to four existing theoretical accounts of atypical sensory sensitivity (not necessarily mutually exclusive): increased excitation-to-inhibition ratio; predictive coding; increased neural noise; and atypical brain connectivity.
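The distinction between neural activity linked to signal and neural activity linked to noise maps onto the standard equal-variance Gaussian signal detection measures. The sketch below is a generic textbook calculation, not the specific framework of the paper; the function names and example rates are illustrative assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(H) - z(FA) under equal-variance Gaussians."""
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(false_alarm_rate)


def criterion(hit_rate, false_alarm_rate):
    """Response bias c = -(z(H) + z(FA)) / 2; negative values mean a liberal bias."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return -0.5 * (z_hit + z_fa)


if __name__ == "__main__":
    # Hypothetical observers: more hits with unchanged false alarms raises d',
    # whereas more hits and more false alarms together mostly shifts the criterion.
    print(d_prime(0.80, 0.20), criterion(0.80, 0.20))
    print(d_prime(0.90, 0.20), criterion(0.90, 0.20))
    print(d_prime(0.90, 0.40), criterion(0.90, 0.40))
```

Rates of exactly 0 or 1 have no finite z-score, so real data usually need a correction before these functions are applied.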
Affiliation(s)
- Jamie Ward
  - School of Psychology, University of Sussex, Brighton, UK
|
23
|
Harmonics added to a flickering light can upset the balance between ON and OFF pathways to produce illusory colors. Proc Natl Acad Sci U S A 2018; 115:E4081-E4090. [PMID: 29632212 PMCID: PMC5924891 DOI: 10.1073/pnas.1717356115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022] Open
Abstract
By varying the temporal waveforms of complex flickering stimuli, we can produce alterations in their mean color that can be predicted by a physiologically based model of visual processing. The model highlights the perceptual effects of a well-known feature of most visual pathways, namely the early separation of visual signals into increments and decrements. The role of this separation in improving the efficiency and sensitivity of the visual system has been discussed before, but its effect on perception has been neglected. The application of a model incorporating half-wave rectification offers an exciting psychophysical method for investigating the inner workings of the human visual system. The neural signals generated by the light-sensitive photoreceptors in the human eye are substantially processed and recoded in the retina before being transmitted to the brain via the optic nerve. A key aspect of this recoding is the splitting of the signals within the two major cone-driven visual pathways into distinct ON and OFF branches that transmit information about increases and decreases in the neural signal around its mean level. While this separation is clearly important physiologically, its effect on perception is unclear. We have developed a model of the ON and OFF pathways in early color processing. Using this model as a guide, we can produce imbalances in the ON and OFF pathways by changing the shapes of time-varying stimulus waveforms and thus make reliable and predictable alterations to the perceived average color of the stimulus—although the physical mean of the waveforms does not change. The key components in the model are the early half-wave rectifying synapses that split retinal photoreceptor outputs into the ON and OFF pathways and later sigmoidal nonlinearities in each pathway. 
The ability to systematically vary the waveforms to change a perceptual quality by changing the balance of signals between the ON and OFF visual pathways provides a powerful psychophysical tool for disentangling and investigating the neural workings of human vision.
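The mechanism described above (half-wave rectification into ON and OFF branches, followed by a saturating nonlinearity in each branch) can be sketched numerically. This is only a toy illustration under stated assumptions: `tanh` stands in for the sigmoidal stage, the waveform is a fundamental plus a second harmonic, and the function names and `gain` parameter are hypothetical rather than taken from the paper.

```python
import math


def waveform(n_samples, second_harmonic=0.0):
    """One period of cos(t) + second_harmonic * cos(2t); its mean is zero."""
    return [
        math.cos(2 * math.pi * k / n_samples)
        + second_harmonic * math.cos(4 * math.pi * k / n_samples)
        for k in range(n_samples)
    ]


def on_off_balance(signal, gain=1.0):
    """Mean ON output minus mean OFF output.

    Each branch half-wave rectifies the signal (ON keeps increments,
    OFF keeps decrements) and then applies a saturating tanh, an
    assumed stand-in for a later sigmoidal nonlinearity.
    """
    on = [math.tanh(gain * max(s, 0.0)) for s in signal]
    off = [math.tanh(gain * max(-s, 0.0)) for s in signal]
    return (sum(on) - sum(off)) / len(signal)


if __name__ == "__main__":
    pure = waveform(1000)
    skewed = waveform(1000, second_harmonic=0.5)
    print("mean of skewed waveform:", sum(skewed) / len(skewed))
    print("balance, pure sinusoid:", on_off_balance(pure))
    print("balance, with harmonic:", on_off_balance(skewed))
```

Both waveforms have the same physical mean, but adding the harmonic makes the increments tall and brief and the decrements shallow and long, so the compressive stage attenuates the ON branch more and the balance shifts toward OFF.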
|
24
|
Kruijne W, Meeter M. You prime what you code: The fAIM model of priming of pop-out. PLoS One 2017; 12:e0187556. [PMID: 29166386 PMCID: PMC5699828 DOI: 10.1371/journal.pone.0187556] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/29/2017] [Accepted: 10/21/2017] [Indexed: 11/18/2022] Open
Abstract
Our visual brain makes use of recent experience to interact with the visual world, and efficiently select relevant information. This is exemplified by speeded search when target and distractor features repeat across trials versus when they switch, a phenomenon referred to as intertrial priming. Here, we present fAIM, a computational model that demonstrates how priming can be explained by a simple feature-weighting mechanism integrated into an established model of bottom-up vision. In fAIM, such modulations in feature gains are widespread and not just restricted to one or a few features. Consequently, priming effects result from the overall tuning of visual features to the task at hand. Such tuning allows the model to reproduce priming for different types of stimuli, including for typical stimulus dimensions such as 'color' and for less obvious dimensions such as 'spikiness' of shapes. Moreover, the model explains some puzzling findings from the literature: it shows how priming can be found for target-distractor stimulus relations rather than for their absolute stimulus values per se, without an explicit representation of relations. Similarly, it simulates effects that have been taken to reflect a modulation of priming by an observer's goals, without any representation of goals in the model. We conclude that priming is best considered as a consequence of a general adaptation of the brain to visual input, and not as a peculiarity of visual search.
Affiliation(s)
- Wouter Kruijne
  - Department of Experimental and Applied Psychology, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Amsterdam, The Netherlands
- Martijn Meeter
  - Department of Experimental and Applied Psychology, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Amsterdam, The Netherlands
|
25
|
Snow M, Coen-Cagli R, Schwartz O. Adaptation in the visual cortex: a case for probing neuronal populations with natural stimuli. F1000Res 2017; 6:1246. [PMID: 29034079 PMCID: PMC5532795 DOI: 10.12688/f1000research.11154.1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 07/24/2017] [Indexed: 12/19/2022] Open
Abstract
The perception of, and neural responses to, sensory stimuli in the present are influenced by what has been observed in the past—a phenomenon known as adaptation. We focus on adaptation in visual cortical neurons as a paradigmatic example. We review recent work that represents two shifts in the way we study adaptation, namely (i) going beyond single neurons to study adaptation in populations of neurons and (ii) going beyond simple stimuli to study adaptation to natural stimuli. We suggest that efforts in these two directions, through a closer integration of experimental and modeling approaches, will enable a more complete understanding of cortical processing in natural environments.
Affiliation(s)
- Michoel Snow
  - Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Ruben Coen-Cagli
  - Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Odelia Schwartz
  - Department of Computer Science, University of Miami, Coral Gables, FL, 33146, USA
|
26
|
Abstract
Background: Migraine is a common neurological condition that often involves differences in visual processing. These sensory processing differences provide important information about the underlying causes of the condition, and for the development of treatments.
Review of psychophysical literature: Psychophysical experiments have shown consistent impairments in contrast sensitivity, orientation acuity, and the perception of global form and motion. They have also established that the addition of task-irrelevant visual noise has a greater effect, and that surround suppression, masking and adaptation are all stronger in migraine.
Theoretical signal processing model: We propose utilising an established model of visual processing, based on signal processing theory, to account for the behavioural differences seen in migraine. This has the advantage of precision and clarity, and generating clear, falsifiable predictions.
Conclusion: Increased effects of noise and differences in excitation and inhibition can account for the differences in migraine visual perception. Consolidating existing research and creating a unified, defined theoretical account is needed to better understand the disorder.
Affiliation(s)
- Louise O'Hare
  - School of Psychology, College of Social Science, University of Lincoln, UK
- Paul B Hibbard
  - Department of Psychology, University of Essex, UK
  - School of Psychology and Neuroscience, University of St Andrews, UK
|
27
|
|
28
|
Visual Saliency Using Binary Spectrum of Walsh–Hadamard Transform and Its Applications to Ship Detection in Multispectral Imagery. Neural Process Lett 2016. [DOI: 10.1007/s11063-016-9507-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
|
29
|
Primary Visual Cortex as a Saliency Map: A Parameter-Free Prediction and Its Test by Behavioral Data. PLoS Comput Biol 2015; 11:e1004375. [PMID: 26441341 PMCID: PMC4595278 DOI: 10.1371/journal.pcbi.1004375] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/14/2014] [Accepted: 06/01/2015] [Indexed: 11/28/2022] Open
Abstract
It has been hypothesized that neural activities in the primary visual cortex (V1) represent a saliency map of the visual field to exogenously guide attention. This hypothesis has so far provided only qualitative predictions and their confirmations. We report this hypothesis’ first quantitative prediction, derived without free parameters, and its confirmation by human behavioral data. The hypothesis provides a direct link between V1 neural responses to a visual location and the saliency of that location to guide attention exogenously. In a visual input containing many bars, one of them saliently different from all the other bars which are identical to each other, saliency at the singleton’s location can be measured by the shortness of the reaction time in a visual search for singletons. The hypothesis predicts quantitatively the whole distribution of the reaction times to find a singleton unique in color, orientation, and motion direction from the reaction times to find other types of singletons. The prediction matches human reaction time data. A requirement for this successful prediction is a data-motivated assumption that V1 lacks neurons tuned simultaneously to color, orientation, and motion direction of visual inputs. Since evidence suggests that extrastriate cortices do have such neurons, we discuss the possibility that the extrastriate cortices play no role in guiding exogenous attention so that they can be devoted to other functions like visual decoding and endogenous attention.
Using the shortness of reaction times in visual search tasks to measure saliency of the search target’s location, the hypothesis predicts the quantitative distribution of the reaction times to find a salient bar unique in color, orientation, and motion direction in a background of bars that are identical to each other. The prediction matches experimental observations in human observers. Since the prediction would be invalid without a particular neural property of the primary visual cortex, the extrastriate cortices may give little contribution to exogenous attentional guidance since they lack this neural property. Implications of this prospect on the framework of attentional network and the computational role of the higher brain areas are also discussed.
|
30
|
|
31
|
Critical and maximally informative encoding between neural populations in the retina. Proc Natl Acad Sci U S A 2015; 112:2533-8. [PMID: 25675497 DOI: 10.1073/pnas.1418092112] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Indexed: 01/14/2023] Open
Abstract
Computation in the brain involves multiple types of neurons, yet the organizing principles for how these neurons work together remain unclear. Information theory has offered explanations for how different types of neurons can maximize the transmitted information by encoding different stimulus features. However, recent experiments indicate that separate neuronal types exist that encode the same filtered version of the stimulus, but then the different cell types signal the presence of that stimulus feature with different thresholds. Here we show that the emergence of these neuronal types can be quantitatively described by the theory of transitions between different phases of matter. The two key parameters that control the separation of neurons into subclasses are the mean and standard deviation (SD) of noise affecting neural responses. The average noise across the neural population plays the role of temperature in the classic theory of phase transitions, whereas the SD is equivalent to pressure or magnetic field, in the case of liquid-gas and magnetic transitions, respectively. Our results account for properties of two recently discovered types of salamander Off retinal ganglion cells, as well as the absence of multiple types of On cells. We further show that, across visual stimulus contrasts, retinal circuits continued to operate near the critical point whose quantitative characteristics matched those expected near a liquid-gas critical point and described by the nearest-neighbor Ising model in three dimensions. By operating near a critical point, neural circuits can maximize information transmission in a given environment while retaining the ability to quickly adapt to a new environment.
|
32
|
Statistics of the vestibular input experienced during natural self-motion: implications for neural processing. J Neurosci 2014; 34:8347-57. [PMID: 24920638 DOI: 10.1523/jneurosci.0692-14.2014] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Indexed: 11/21/2022] Open
Abstract
It is widely believed that sensory systems are optimized for processing stimuli occurring in the natural environment. However, it remains unknown whether this principle applies to the vestibular system, which contributes to essential brain functions ranging from the most automatic reflexes to spatial perception and motor coordination. Here we quantified, for the first time, the statistics of natural vestibular inputs experienced by freely moving human subjects during typical everyday activities. Although previous studies have found that the power spectra of natural signals across sensory modalities decay as a power law (i.e., as 1/f^α), we found that this did not apply to natural vestibular stimuli. Instead, power decreased slowly at lower and more rapidly at higher frequencies for all motion dimensions. We further establish that this unique stimulus structure is the result of active motion as well as passive biomechanical filtering occurring before any neural processing. Notably, the transition frequency (i.e., frequency at which power starts to decrease rapidly) was lower when subjects passively experienced sensory stimulation than when they actively controlled stimulation through their own movement. In contrast to signals measured at the head, the spectral content of externally generated (i.e., passive) environmental motion did follow a power law. Specifically, transformations caused by both motor control and biomechanics shape the statistics of natural vestibular stimuli before neural processing. We suggest that the unique structure of natural vestibular stimuli will have important consequences on the neural coding strategies used by this essential sensory system to represent self-motion in everyday life.
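One simple way to check whether a spectrum follows a single power law is to fit the exponent separately in a low and a high frequency band and compare the two slopes. The sketch below assumes the power spectrum has already been estimated; the function name, the band limits, and the synthetic spectrum with a transition near 5 Hz are hypothetical and are not the analysis or the values used in the study.

```python
import math


def fit_power_law_exponent(freqs, power):
    """Return alpha from a least-squares line fit of log(power) on log(freq).

    For a spectrum proportional to 1 / f**alpha, the log-log slope is -alpha.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def synthetic_spectrum(f, transition_hz=5.0):
    """Hypothetical spectrum: flat at low f, steep decay above transition_hz."""
    return 1.0 / (1.0 + (f / transition_hz) ** 4)


if __name__ == "__main__":
    low_band = [0.5, 1.0, 2.0]
    high_band = [20.0, 40.0, 80.0]
    alpha_low = fit_power_law_exponent(low_band, [synthetic_spectrum(f) for f in low_band])
    alpha_high = fit_power_law_exponent(high_band, [synthetic_spectrum(f) for f in high_band])
    print("alpha (low band):", alpha_low)
    print("alpha (high band):", alpha_high)
```

A true power law would give similar exponents in both bands; a large gap between them indicates the kind of slow-then-rapid decay described in the abstract.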
|
33
|
Samorodov AV. Building intelligent systems for the analysis of microscopic images in medicine and biology. Pattern Recognition and Image Analysis 2013. [DOI: 10.1134/s1054661813040159] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|
34
|
Del Viva MM, Punzi G, Benedetti D. Information and perception of meaningful patterns. PLoS One 2013; 8:e69154. [PMID: 23894422 PMCID: PMC3716808 DOI: 10.1371/journal.pone.0069154] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/13/2012] [Accepted: 06/12/2013] [Indexed: 11/22/2022] Open
Abstract
The visual system needs to extract the most important elements of the external world from a large flux of information in a short time for survival purposes. It is widely believed that in performing this task, it operates a strong data reduction at an early stage, by creating a compact summary of relevant information that can be handled by further levels of processing. In this work we formulate a model of early vision based on a pattern-filtering architecture, partly inspired by high-speed digital data reduction in experimental high-energy physics (HEP). This allows a much stronger data reduction than models based just on redundancy reduction. We show that optimizing this model for best information preservation under tight constraints on computational resources yields surprisingly specific a-priori predictions for the shape of biologically plausible features, and for experimental observations on fast extraction of salient visual features by human observers. Interestingly, applying the same optimized model to HEP data acquisition systems based on pattern-filtering architectures leads to specific a-priori predictions for the relevant data patterns that these devices extract from their inputs. These results suggest that the limitedness of computing resources can play an important role in shaping the nature of perception, by determining what is perceived as “meaningful features” in the input data.
Affiliation(s)
- Maria M Del Viva
- NEUROFARBA Dipartimento di Neuroscienze, Psicologia, Area del Farmaco e Salute del Bambino Sezione di Psicologia, Università di Firenze, Firenze, Italy.
|
35
|
Coen-Cagli R, Schwartz O. The impact on midlevel vision of statistically optimal divisive normalization in V1. J Vis 2013; 13:13.8.13. [PMID: 23857950 DOI: 10.1167/13.8.13] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022] Open
Abstract
The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality.
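The divisive normalization mentioned in this abstract can be illustrated with a minimal Python sketch of a common textbook form, in which each unit's squared drive is divided by a constant plus a weighted sum of squared drives in a pool. This is not the statistically optimal variant the authors study. The function name, `sigma`, and the uniform pooling weights are assumptions made for illustration.

```python
def divisive_normalization(drives, sigma=1.0, weights=None):
    """Canonical-style normalization: r_i = d_i^2 / (sigma^2 + sum_j w_j * d_j^2).

    drives  : linear filter outputs (e.g., center and surround units)
    sigma   : semi-saturation constant (assumed value)
    weights : pooling weights; uniform by default (assumption)
    """
    if weights is None:
        weights = [1.0] * len(drives)
    pool = sum(w * d * d for w, d in zip(weights, drives))
    return [d * d / (sigma ** 2 + pool) for d in drives]

# First unit alone versus the same unit with strongly driven neighbors.
alone = divisive_normalization([1.0, 0.0, 0.0])
surrounded = divisive_normalization([1.0, 2.0, 2.0])
print(alone[0], surrounded[0])
```

The first unit's response drops when its neighbors are active, which is the surround suppression that this kind of normalization produces.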
Affiliation(s)
- Ruben Coen-Cagli
- Department of Basic Neuroscience, University of Geneva, Geneva, Switzerland
|
36
|
Framework for reliable, real-time facial expression recognition for low resolution images. Pattern Recognit Lett 2013. [DOI: 10.1016/j.patrec.2013.03.022] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.7] [Indexed: 11/17/2022]
|
37
|
Hunt JJ, Dayan P, Goodhill GJ. Sparse coding can predict primary visual cortex receptive field changes induced by abnormal visual input. PLoS Comput Biol 2013; 9:e1003005. [PMID: 23675290 PMCID: PMC3649976 DOI: 10.1371/journal.pcbi.1003005] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 05/12/2012] [Accepted: 02/10/2013] [Indexed: 11/24/2022] Open
Abstract
Receptive fields acquired through unsupervised learning of sparse representations of natural scenes have similar properties to primary visual cortex (V1) simple cell receptive fields. However, what drives in vivo development of receptive fields remains controversial. The strongest evidence for the importance of sensory experience in visual development comes from receptive field changes in animals reared with abnormal visual input. However, most sparse coding accounts have considered only normal visual input and the development of monocular receptive fields. Here, we applied three sparse coding models to binocular receptive field development across six abnormal rearing conditions. In every condition, the changes in receptive field properties previously observed experimentally were matched to a similar and highly faithful degree by all the models, suggesting that early sensory development can indeed be understood in terms of an impetus towards sparsity. As previously predicted in the literature, we found that asymmetries in inter-ocular correlation across orientations lead to orientation-specific binocular receptive fields. Finally, we used our models to design a novel stimulus that, if present during rearing, is predicted by the sparsity principle to lead robustly to radically abnormal receptive fields.
The responses of neurons in the primary visual cortex (V1), a region of the brain involved in encoding visual input, are modified by the visual experience of the animal during development. For example, most neurons in animals reared viewing stripes of a particular orientation only respond to the orientation that the animal experienced. The responses of V1 cells in normal animals are similar to responses that simple optimisation algorithms can learn when trained on images. However, whether the similarity between these algorithms and V1 responses is merely coincidental has been unclear. Here, we used the results of a number of experiments where animals were reared with modified visual experience to test the explanatory power of three related optimisation algorithms. We did this by filtering the images for the algorithms in ways that mimicked the visual experience of the animals. This allowed us to show that the changes in V1 responses in experiments were consistent with the algorithms. This is evidence that the precepts of the algorithms, notably sparsity, can be used to understand the development of V1 responses. Further, we used our model to propose a novel rearing condition which we expect to have a dramatic effect on development.
Affiliation(s)
- Jonathan J. Hunt
- Queensland Brain Institute, University of Queensland, St Lucia, Australia
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Geoffrey J. Goodhill
- Queensland Brain Institute, University of Queensland, St Lucia, Australia
- School of Mathematics and Physics, University of Queensland, St Lucia, Australia
|
38
|
Ramirez-Moreno DF, Schwartz O, Ramirez-Villegas JF. A saliency-based bottom-up visual attention model for dynamic scenes analysis. Biological Cybernetics 2013; 107:141-160. [PMID: 23314730 DOI: 10.1007/s00422-012-0542-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/03/2011] [Accepted: 12/13/2012] [Indexed: 06/01/2023]
Abstract
This work proposes a model of visual bottom-up attention for dynamic scene analysis. Our work adds motion saliency calculations to a neural network model with realistic temporal dynamics (e.g., building motion salience on top of De Brecht and Saiki, Neural Networks 19:1467-1474, 2006). The resulting network elicits strong transient responses to moving objects and reaches stability within a biologically plausible time interval. The responses differ statistically between earlier and later motion neural activity, and between moving and non-moving objects. We demonstrate the network on a number of synthetic and real dynamical movie examples. We show that the model captures the motion saliency asymmetry phenomenon. In addition, the motion salience computation enables sudden-onset moving objects that are less salient in the static scene to rise above others. Finally, we include strong consideration for the neural latencies, the Lyapunov stability, and the neural properties being reproduced by the model.
Affiliation(s)
- David F Ramirez-Moreno
- Computational Neuroscience, Department of Physics, Universidad Autonoma de Occidente, Cali, Colombia.
|
39
|
Abstract
Sensory neurons have been hypothesized to efficiently encode signals from the natural environment subject to resource constraints. The predictions of this efficient coding hypothesis regarding the spatial filtering properties of the visual system have been found consistent with human perception, but they have not been compared directly with neural responses. Here, we analyze the information that retinal ganglion cells transmit to the brain about the spatial information in natural images subject to three resource constraints: the number of retinal ganglion cells, their total response variances, and their total synaptic strengths. We derive a model that optimizes the transmitted information and compare it directly with measurements of complete functional connectivity between cone photoreceptors and the four major types of ganglion cells in the primate retina, obtained at single-cell resolution. We find that the ganglion cell population exhibited 80% efficiency in transmitting spatial information relative to the model. Both the retina and the model exhibited high redundancy (~30%) among ganglion cells of the same cell type. A novel and unique prediction of efficient coding, the relationships between projection patterns of individual cones to all ganglion cells, was consistent with the observed projection patterns in the retina. These results indicate a high level of efficiency with near-optimal redundancy in visual signaling by the retina.
|
40
|
Dayan P. How to set the switches on this thing. Curr Opin Neurobiol 2012; 22:1068-74. [DOI: 10.1016/j.conb.2012.05.011] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Received: 02/21/2012] [Revised: 05/10/2012] [Accepted: 05/28/2012] [Indexed: 11/26/2022]
|
41
|
Abstract
BACKGROUND An important problem in selective attention is determining the ways the primary visual cortex contributes to the encoding of bottom-up saliency and the types of neural computation that are effective to model this process. To address this problem, we constructed a two-layered network that satisfies the neurobiological constraints of the primary visual cortex to detect salient objects. We carried out experiments on both synthetic images and natural images to explore the influences of different factors, such as network structure, the size of each layer, the type of suppression and the combination strategy, on saliency detection performance. RESULTS The experimental results statistically demonstrated that the type and scale of filters contribute greatly to the encoding of bottom-up saliency. These two factors correspond to the mechanisms of invariant encoding and overcomplete representation in the primary visual cortex. CONCLUSIONS (1) Instead of constructing Gabor functions or Gaussian pyramid filters for feature extraction as traditional attention models do, we learn overcomplete basis sets from natural images to extract features for saliency detection. Experiments show that given the proper layer size and a robust combination strategy, the learned overcomplete basis set outperforms a complete set and Gabor pyramids in visual saliency detection. This finding can potentially be applied in task-dependent and supervised object detection. (2) A hierarchical coding model that can represent invariant features is designed for the pre-attentive stage of bottom-up attention. This coding model improves robustness to noise and distractions and improves the ability to detect salient structures, such as collinear and co-circular structures, and several composite stimuli. This result indicates that invariant representation contributes to saliency detection (popping out) in bottom-up attention.
The aforementioned perspectives will significantly contribute to the in-depth understanding of the information processing mechanism in the primary visual system.
|
42
|
A spiking neural network based cortex-like mechanism and application to facial expression recognition. Computational Intelligence and Neuroscience 2012; 2012:946589. [PMID: 23193391 PMCID: PMC3501821 DOI: 10.1155/2012/946589] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 04/27/2012] [Accepted: 07/03/2012] [Indexed: 11/28/2022]
Abstract
In this paper, we present a quantitative, highly structured cortex-simulated model, which can be simply described as a feedforward, hierarchical simulation of the ventral stream of the visual cortex using a biologically plausible, computationally convenient spiking neural network system. The motivation comes directly from recent pioneering works on detailed functional decomposition analysis of the feedforward pathway of the ventral stream of the visual cortex and developments in artificial spiking neural networks (SNNs). By combining the logical structure of the cortical hierarchy and the computing power of the spiking neuron model, a practical framework has been presented. As a proof of principle, we demonstrate our system on several facial expression recognition tasks. The proposed cortical-like feedforward hierarchy framework is capable of dealing with complicated pattern recognition problems. This suggests that, by combining cognitive models with modern neurocomputational approaches, the neurosystematic approach to the study of cortex-like mechanisms has the potential to extend our knowledge of the brain mechanisms underlying cognitive analysis and to advance theoretical models of how we recognize faces or, more specifically, perceive other people's facial expressions in a rich, dynamic, and complex environment, providing a new starting point for improved models of visual cortex-like mechanisms.
|
43
|
Zou Q, Wang Z, Luo S, Huang Y, Tian M. A computational coding model for saliency detection in primary visual cortex. Chinese Science Bulletin-Chinese 2012. [DOI: 10.1007/s11434-012-5402-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
44
|
Hunt JJ, Mattingley JB, Goodhill GJ. Randomly oriented edge arrangements dominate naturalistic arrangements in binocular rivalry. Vision Res 2012; 64:49-55. [DOI: 10.1016/j.visres.2012.05.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/13/2012] [Revised: 05/04/2012] [Accepted: 05/09/2012] [Indexed: 11/15/2022]
|
45
|
Zhaoping L, Zhe L. Properties of V1 neurons tuned to conjunctions of visual features: application of the V1 saliency hypothesis to visual search behavior. PLoS One 2012; 7:e36223. [PMID: 22719829 PMCID: PMC3373599 DOI: 10.1371/journal.pone.0036223] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 11/10/2011] [Accepted: 03/27/2012] [Indexed: 11/23/2022] Open
Abstract
From a computational theory of V1, we formulate an optimization problem to investigate neural properties in the primary visual cortex (V1) from human reaction times (RTs) in visual search. The theory is the V1 saliency hypothesis that the bottom-up saliency of any visual location is represented by the highest V1 response to it relative to the background responses. The neural properties probed are those associated with the less known V1 neurons tuned simultaneously or conjunctively in two feature dimensions. The visual search is to find a target bar unique in color (C), orientation (O), motion direction (M), or redundantly in combinations of these features (e.g., CO, MO, or CM) among uniform background bars. A feature singleton target is salient because its evoked V1 response largely escapes the iso-feature suppression on responses to the background bars. The responses of the conjunctively tuned cells are manifested in the shortening of the RT for a redundant feature target (e.g., a CO target) from that predicted by a race between the RTs for the two corresponding single feature targets (e.g., C and O targets). Our investigation enables the following testable predictions. Contextual suppression on the response of a CO-tuned or MO-tuned conjunctive cell is weaker when the contextual inputs differ from the direct inputs in both feature dimensions, rather than just one. Additionally, CO-tuned cells and MO-tuned cells are often more active than the single feature tuned cells in response to the redundant feature targets, and this occurs more frequently for the MO-tuned cells such that the MO-tuned cells are no less likely than either the M-tuned or O-tuned neurons to be the most responsive neuron to dictate saliency for an MO target.
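The race-model baseline described in this abstract can be sketched in a few lines of Python. Under a race between two independent single-feature processes, the predicted reaction time for a redundant target on each trial is the faster of the two single-feature RTs. An observed redundant-target RT shorter than this prediction is what the abstract attributes to conjunctively tuned cells. The Gaussian RT distributions, their parameters, and the function names below are illustrative assumptions, not the authors' data or fitting procedure.

```python
import random

def race_prediction(rts_a, rts_b):
    """Trial-wise race between two independent processes: the faster one wins."""
    return [min(a, b) for a, b in zip(rts_a, rts_b)]

def mean(values):
    return sum(values) / len(values)

def redundancy_gain(rt_race, rt_observed_redundant):
    """Positive when observed redundant-target RTs beat the race prediction."""
    return mean(rt_race) - mean(rt_observed_redundant)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical single-feature RTs in milliseconds (color and orientation targets).
rt_color = [rng.gauss(600.0, 80.0) for _ in range(5000)]
rt_orientation = [rng.gauss(650.0, 80.0) for _ in range(5000)]

rt_race = race_prediction(rt_color, rt_orientation)
print(mean(rt_color), mean(rt_orientation), mean(rt_race))
```

The race prediction is already faster on average than either single-feature condition, so only a speed-up beyond it counts as evidence for conjunctive tuning.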
Affiliation(s)
- Li Zhaoping
- Department of Computer Science, University College London, London, United Kingdom.
|
46
|
Coen-Cagli R, Dayan P, Schwartz O. Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics. PLoS Comput Biol 2012; 8:e1002405. [PMID: 22396635 PMCID: PMC3291533 DOI: 10.1371/journal.pcbi.1002405] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Received: 08/31/2011] [Accepted: 01/13/2012] [Indexed: 11/19/2022] Open
Abstract
Spatial context in images induces perceptual phenomena associated with salience and modulates the responses of neurons in primary visual cortex (V1). However, the computational and ecological principles underlying contextual effects are incompletely understood. We introduce a model of natural images that includes grouping and segmentation of neighboring features based on their joint statistics, and we interpret the firing rates of V1 neurons as performing optimal recognition in this model. We show that this leads to a substantial generalization of divisive normalization, a computation that is ubiquitous in many neural areas and systems. A main novelty in our model is that the influence of the context on a target stimulus is determined by their degree of statistical dependence. We optimized the parameters of the model on natural image patches, and then simulated neural and perceptual responses on stimuli used in classical experiments. The model reproduces some rich and complex response patterns observed in V1, such as the contrast dependence, orientation tuning and spatial asymmetry of surround suppression, while also allowing for surround facilitation under conditions of weak stimulation. It also mimics the perceptual salience produced by simple displays, and leads to readily testable predictions. Our results provide a principled account of orientation-based contextual modulation in early vision and its sensitivity to the homogeneity and spatial arrangement of inputs, and lend statistical support to the theory that V1 computes visual salience.
Affiliation(s)
- Ruben Coen-Cagli
- Dominick Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America.
|
47
|
An approach for visual attention based on biquaternion and its application for ship detection in multispectral imagery. Neurocomputing 2012. [DOI: 10.1016/j.neucom.2011.05.027] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 11/17/2022]
|
48
|
Yu Y, Wang B, Zhang L. Bottom-up attention: pulsed PCA transform and pulsed cosine transform. Cogn Neurodyn 2011; 5:321-32. [PMID: 23115590 PMCID: PMC3193976 DOI: 10.1007/s11571-011-9155-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 12/12/2010] [Revised: 04/05/2011] [Accepted: 04/08/2011] [Indexed: 11/25/2022] Open
Abstract
In this paper we propose a computational model of bottom-up visual attention based on a pulsed principal component analysis (PCA) transform, which simply exploits the signs of the PCA coefficients to generate spatial and motional saliency. We further extend the pulsed PCA transform to a pulsed cosine transform that is not only data-independent but also very fast in computation. The proposed model is biologically plausible in three respects. First, the PCA projection vectors in the model can be obtained by using the Hebbian rule in neural networks. Second, the outputs of the pulsed PCA transform, which are inherently binary, simulate the neuronal pulses in the human brain. Third, like many Fourier transform-based approaches, our model also accomplishes the cortical center-surround suppression in frequency domain. Experimental results on psychophysical patterns and natural images show that the proposed model is more effective in saliency detection and predicts human eye fixations better than the state-of-the-art attention models.
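The sign-only idea in this abstract can be sketched in plain Python for the cosine-transform variant: take a 2-D DCT of the image, keep only the signs of the coefficients, invert the transform, and square the result. This is a minimal illustration under stated assumptions, not the authors' implementation. Squaring the reconstruction, omitting any final smoothing, and treating coefficients smaller than `eps` as zero are choices made here, and all function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sign(x, eps=1e-9):
    """Sign with a small dead zone so rounding noise is treated as zero."""
    if x > eps:
        return 1.0
    if x < -eps:
        return -1.0
    return 0.0

def pulsed_cosine_saliency(image):
    """Saliency from the signs of 2-D DCT coefficients (sketch, no smoothing)."""
    h, w = len(image), len(image[0])
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = matmul(matmul(ch, image), transpose(cw))   # forward 2-D DCT
    pulses = [[sign(c) for c in row] for row in coeffs]  # keep signs only
    recon = matmul(matmul(transpose(ch), pulses), cw)    # inverse 2-D DCT
    return [[v * v for v in row] for row in recon]

# A uniform image has no odd-one-out, so the map comes out flat.
flat = [[1.0] * 4 for _ in range(4)]
saliency_flat = pulsed_cosine_saliency(flat)

# An image with a single deviating pixel (hypothetical test pattern).
odd = [[0.0] * 8 for _ in range(8)]
odd[2][5] = 1.0
saliency_odd = pulsed_cosine_saliency(odd)
```

Because the DCT basis is fixed, no training data is needed, which matches the abstract's point that the cosine variant is data-independent.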
Affiliation(s)
- Ying Yu
- School of Information Science and Engineering, Yunnan University, Kunming, 650091 China
- Department of Electronic Engineering, Fudan University, Shanghai, 200433 China
- Bin Wang
- Department of Electronic Engineering, Fudan University, Shanghai, 200433 China
- Liming Zhang
- Department of Electronic Engineering, Fudan University, Shanghai, 200433 China
|
49
|
Zhao L, Zhaoping L. Understanding auditory spectro-temporal receptive fields and their changes with input statistics by efficient coding principles. PLoS Comput Biol 2011; 7:e1002123. [PMID: 21887121 PMCID: PMC3158037 DOI: 10.1371/journal.pcbi.1002123] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/27/2010] [Accepted: 05/31/2011] [Indexed: 11/18/2022] Open
Abstract
Spectro-temporal receptive fields (STRFs) have been widely used as linear approximations to the signal transform from sound spectrograms to neural responses along the auditory pathway. Their dependence on statistical attributes of the stimuli, such as sound intensity, is usually explained by nonlinear mechanisms and models. Here, we apply an efficient coding principle which has been successfully used to understand receptive fields in early stages of visual processing, in order to provide a computational understanding of the STRFs. According to this principle, STRFs result from an optimal tradeoff between maximizing the sensory information the brain receives, and minimizing the cost of the neural activities required to represent and transmit this information. Both terms depend on the statistical properties of the sensory inputs and the noise that corrupts them. The STRFs should therefore depend on the input power spectrum and the signal-to-noise ratio, which is assumed to increase with input intensity. We analytically derive the optimal STRFs when signal and noise are approximated as Gaussians. Under the constraint that they should be spectro-temporally local, the STRFs are predicted to adapt from being band-pass to low-pass filters as the input intensity reduces, or the input correlation becomes longer range in sound frequency or time. These predictions qualitatively match physiological observations. Our prediction as to how the STRFs should be determined by the input power spectrum could readily be tested, since this spectrum depends on the stimulus ensemble. The potentials and limitations of the efficient coding principle are discussed.
Affiliation(s)
- Lingyun Zhao
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, P.R. China
- Li Zhaoping
- Department of Computer Science, University College London, London, United Kingdom
|
50
|
Qian N, Lipkin RM. A learning-style theory for understanding autistic behaviors. Front Hum Neurosci 2011; 5:77. [PMID: 21886617 PMCID: PMC3155869 DOI: 10.3389/fnhum.2011.00077] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 04/18/2011] [Accepted: 07/21/2011] [Indexed: 12/20/2022] Open
Abstract
Understanding autism's ever-expanding array of behaviors, from sensation to cognition, is a major challenge. We posit that autistic and typically developing brains implement different algorithms that are better suited to learn, represent, and process different tasks; consequently, they develop different interests and behaviors. Computationally, a continuum of algorithms exists, from lookup table (LUT) learning, which aims to store experiences precisely, to interpolation (INT) learning, which focuses on extracting underlying statistical structure (regularities) from experiences. We hypothesize that autistic and typical brains, respectively, are biased toward LUT and INT learning, in low- and high-dimensional feature spaces, possibly because of their narrow and broad tuning functions. The LUT style is good at learning relationships that are local, precise, rigid, and contain little regularity for generalization (e.g., the name–number association in a phonebook). However, it is poor at learning relationships that are context dependent, noisy, flexible, and do contain regularities for generalization (e.g., associations between gaze direction and intention, language and meaning, sensory input and interpretation, motor-control signal and movement, and social situation and proper response). The LUT style poorly compresses information, resulting in inefficiency, sensory overload (overwhelm), restricted interests, and resistance to change. It also leads to poor prediction and anticipation, frequent surprises and over-reaction (hyper-sensitivity), impaired attentional selection and switching, concreteness, strong local focus, weak adaptation, and superior and inferior performances on simple and complex tasks. The spectrum nature of autism can be explained by different degrees of LUT learning among different individuals, and in different systems of the same individual. Our theory suggests that therapy should focus on training the autistic LUT algorithm to learn regularities.
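The contrast between the two learning styles in this abstract can be made concrete with a toy Python sketch. A lookup table stores each training pair exactly and has nothing to say about unseen inputs, while an interpolating learner extracts a regularity and generalizes from it. Here a least-squares line stands in for INT learning. The data, the linear model, and the function names are illustrative assumptions and not part of the authors' theory.

```python
def train_lut(pairs):
    """LUT-style learning: store every experience exactly."""
    return dict(pairs)

def train_int(pairs):
    """INT-style learning (toy stand-in): fit y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Training experiences that follow a simple regularity (y = 2x + 1).
pairs = [(x, 2.0 * x + 1.0) for x in range(5)]
lut = train_lut(pairs)
slope, intercept = train_int(pairs)

seen_answer = lut.get(3)             # exact recall of a stored experience
unseen_answer = lut.get(10)          # the table has no entry for a novel input
int_answer = slope * 10 + intercept  # the fitted regularity generalizes
print(seen_answer, unseen_answer, int_answer)
```

The table is exact on what it has stored but offers no prediction for the novel input, while the fitted line trades exact storage for the ability to generalize.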
Affiliation(s)
- Ning Qian
- Department of Neuroscience, Columbia University, New York, NY, USA
|