1
Meneghetti N, Vannini E, Mazzoni A. Rodents' visual gamma as a biomarker of pathological neural conditions. J Physiol 2024; 602:1017-1048. [PMID: 38372352] [DOI: 10.1113/jp283858]
Abstract
Neural gamma oscillations (indicatively 30-100 Hz) are ubiquitous: they are associated with a broad range of functions in multiple cortical areas and across many animal species. Experimental and computational works established gamma rhythms as a global emergent property of neuronal networks generated by the balanced and coordinated interaction of excitation and inhibition. Coherently, gamma activity is strongly influenced by the alterations of synaptic dynamics which are often associated with pathological neural dysfunctions. We argue therefore that these oscillations are an optimal biomarker for probing the mechanism of cortical dysfunctions. Gamma oscillations are also highly sensitive to external stimuli in sensory cortices, especially the primary visual cortex (V1), where the stimulus dependence of gamma oscillations has been thoroughly investigated. Gamma manipulation by visual stimuli tuning is particularly easy in rodents, which have become a standard animal model for investigating the effects of network alterations on gamma oscillations. Overall, gamma in the rodents' visual cortex offers an accessible probe on dysfunctional information processing in pathological conditions. Beyond vision-related dysfunctions, alterations of gamma oscillations in rodents were indeed also reported in neural deficits such as migraine, epilepsy and neurodegenerative or neuropsychiatric conditions such as Alzheimer's, schizophrenia and autism spectrum disorders. Altogether, the connections between visual cortical gamma activity and physio-pathological conditions in rodent models underscore the potential of gamma oscillations as markers of neuronal (dys)functioning.
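The biomarker discussed here is typically quantified as spectral power of a local field potential in the gamma band (30-100 Hz). A minimal sketch of that quantity, using a naive DFT over hypothetical signals (real analyses would use Welch or multitaper estimators; the window and normalization here are assumptions):

```python
import math

def gamma_band_power(signal, fs, f_lo=30.0, f_hi=100.0):
    """Power of `signal` (sampled at `fs` Hz) in the gamma band,
    summed over positive-frequency DFT bins. Illustrative only."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):  # positive-frequency bins
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            power += (re * re + im * im) / (n * n)
    return power
```

A 40 Hz oscillation yields high gamma power, while a 10 Hz oscillation of the same amplitude yields essentially none, which is what makes the band-limited measure usable as a marker.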
Affiliation(s)
- Nicolò Meneghetti
- The Biorobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy
- Department of Excellence for Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Eleonora Vannini
- Neuroscience Institute, National Research Council (CNR), Pisa, Italy
- Alberto Mazzoni
- The Biorobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy
- Department of Excellence for Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
2
Tao L, Wechsler SP, Bhandawat V. Sensorimotor transformation underlying odor-modulated locomotion in walking Drosophila. Nat Commun 2023; 14:6818. [PMID: 37884581] [PMCID: PMC10603174] [DOI: 10.1038/s41467-023-42613-8]
Abstract
Most real-world behaviors - such as odor-guided locomotion - are performed with incomplete information. Activity in olfactory receptor neuron (ORN) classes provides information about odor identity but not the location of its source. In this study, we investigate the sensorimotor transformation that relates ORN activation to locomotion changes in Drosophila by optogenetically activating different combinations of ORN classes and measuring the resulting changes in locomotion. Three features describe this sensorimotor transformation: First, locomotion depends on both the instantaneous firing frequency (f) and its change (df); the two together serve as a short-term memory that allows the fly to adapt its motor program to sensory context automatically. Second, the mapping between (f, df) and locomotor parameters such as speed or curvature is distinct for each pattern of activated ORNs. Finally, the sensorimotor mapping changes with time after odor exposure, allowing information integration over a longer timescale.
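The (f, df) description above can be sketched as a simple function of a spike train; the window length and the linear weights below are hypothetical, chosen only to illustrate how the change in firing rate acts as a short-term memory alongside the rate itself:

```python
def locomotor_output(spike_times, t, tau=0.5, w_f=0.02, w_df=0.05, base_speed=1.0):
    """Toy sensorimotor transformation: map ORN firing frequency f
    (spikes/s over the last `tau` seconds) and its change df (relative
    to the preceding window) to a walking speed. The weights are
    illustrative, not fitted to fly data; each pattern of activated
    ORNs would carry its own (w_f, w_df) mapping."""
    f = sum(1 for s in spike_times if t - tau <= s <= t) / tau
    f_prev = sum(1 for s in spike_times if t - 2 * tau <= s < t - tau) / tau
    df = f - f_prev
    speed = max(base_speed - w_f * f - w_df * df, 0.0)  # activation slows walking here
    return f, df, speed
```

Because df compares the current window with the preceding one, the same firing rate produces a different motor output depending on recent sensory history, which is the "automatic adaptation to sensory context" the abstract describes.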
Affiliation(s)
- Liangyu Tao
- School of Biomedical Engineering and Health Sciences, Drexel University, Philadelphia, PA, USA
- Samuel P Wechsler
- School of Biomedical Engineering and Health Sciences, Drexel University, Philadelphia, PA, USA
- Department of Neurobiology and Anatomy, Drexel University, Philadelphia, PA, USA
- Vikas Bhandawat
- School of Biomedical Engineering and Health Sciences, Drexel University, Philadelphia, PA, USA
3
Winding M, Pedigo BD, Barnes CL, Patsolic HG, Park Y, Kazimiers T, Fushiki A, Andrade IV, Khandelwal A, Valdes-Aleman J, Li F, Randel N, Barsotti E, Correia A, Fetter RD, Hartenstein V, Priebe CE, Vogelstein JT, Cardona A, Zlatic M. The connectome of an insect brain. Science 2023; 379:eadd9330. [PMID: 36893230] [PMCID: PMC7614541] [DOI: 10.1126/science.add9330]
Abstract
Brains contain networks of interconnected neurons and so knowing the network architecture is essential for understanding brain function. We therefore mapped the synaptic-resolution connectome of an entire insect brain (Drosophila larva) with rich behavior, including learning, value computation, and action selection, comprising 3016 neurons and 548,000 synapses. We characterized neuron types, hubs, feedforward and feedback pathways, as well as cross-hemisphere and brain-nerve cord interactions. We found pervasive multisensory and interhemispheric integration, highly recurrent architecture, abundant feedback from descending neurons, and multiple novel circuit motifs. The brain's most recurrent circuits comprised the input and output neurons of the learning center. Some structural features, including multilayer shortcuts and nested recurrent loops, resembled state-of-the-art deep learning architectures. The identified brain architecture provides a basis for future experimental and theoretical studies of neural circuits.
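On a connectome represented as a directed edge list, one crude index of the recurrent architecture highlighted above is edge reciprocity: the fraction of connections whose reverse connection also exists. A sketch over toy edges (not the actual dataset):

```python
def reciprocity(edges):
    """Fraction of directed connections (a, b) whose reverse (b, a) is
    also present: a simple, coarse measure of recurrence in a wiring
    diagram given as a list of (pre, post) pairs."""
    edge_set = set(edges)
    return sum(1 for a, b in edges if (b, a) in edge_set) / len(edges)
```

Richer analyses of the kind reported in the paper (hubs, nested loops, multilayer shortcuts) build on the same adjacency representation, but reciprocity is the simplest quantity that separates feedforward from recurrent wiring.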
Affiliation(s)
- Michael Winding
- University of Cambridge, Department of Zoology, Cambridge, UK
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Benjamin D. Pedigo
- Johns Hopkins University, Department of Biomedical Engineering, Baltimore, MD, USA
- Christopher L. Barnes
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- University of Cambridge, Department of Physiology, Development, and Neuroscience, Cambridge, UK
- Heather G. Patsolic
- Johns Hopkins University, Department of Applied Mathematics and Statistics, Baltimore, MD, USA
- Accenture, Arlington, VA, USA
- Youngser Park
- Johns Hopkins University, Center for Imaging Science, Baltimore, MD, USA
- Tom Kazimiers
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- kazmos GmbH, Dresden, Germany
- Akira Fushiki
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Ingrid V. Andrade
- University of California Los Angeles, Department of Molecular, Cell and Developmental Biology, Los Angeles, CA, USA
- Avinash Khandelwal
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Javier Valdes-Aleman
- University of Cambridge, Department of Zoology, Cambridge, UK
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Feng Li
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Nadine Randel
- University of Cambridge, Department of Zoology, Cambridge, UK
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- Elizabeth Barsotti
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- University of Cambridge, Department of Physiology, Development, and Neuroscience, Cambridge, UK
- Ana Correia
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- University of Cambridge, Department of Physiology, Development, and Neuroscience, Cambridge, UK
- Richard D. Fetter
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- Stanford University, Stanford, CA, USA
- Volker Hartenstein
- University of California Los Angeles, Department of Molecular, Cell and Developmental Biology, Los Angeles, CA, USA
- Carey E. Priebe
- Johns Hopkins University, Department of Applied Mathematics and Statistics, Baltimore, MD, USA
- Johns Hopkins University, Center for Imaging Science, Baltimore, MD, USA
- Joshua T. Vogelstein
- Johns Hopkins University, Department of Biomedical Engineering, Baltimore, MD, USA
- Johns Hopkins University, Center for Imaging Science, Baltimore, MD, USA
- Albert Cardona
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
- University of Cambridge, Department of Physiology, Development, and Neuroscience, Cambridge, UK
- Marta Zlatic
- University of Cambridge, Department of Zoology, Cambridge, UK
- MRC Laboratory of Molecular Biology, Neurobiology Division, Cambridge, UK
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
4
Li AY, Fukuda K, Barense MD. Independent features form integrated objects: Using a novel shape-color “conjunction task” to reconstruct memory resolution for multiple object features simultaneously. Cognition 2022; 223:105024. [DOI: 10.1016/j.cognition.2022.105024]
|
5
|
Abstract
The brain’s ability to create a unified conscious representation of an object by integrating information from multiple perception pathways is called perceptual binding. Binding is crucial for normal cognitive function. Some perceptual binding errors and disorders have been linked to certain neurological conditions, brain lesions, and conditions that give rise to illusory conjunctions. However, the mechanism of perceptual binding remains elusive. Here, I present a computational model of binding using two sets of coupled oscillatory processes that are assumed to occur in response to two different percepts. I use the model to study the dynamic behavior of coupled processes to characterize how these processes can modulate each other and reach a temporal synchrony. I identify different oscillatory dynamic regimes that depend on coupling mechanisms and parameter values. The model can also discriminate different combinations of initial inputs that are set by initial states of coupled processes. Decoding brain signals that are formed through perceptual binding is a challenging task, but my modeling results demonstrate how crosstalk between two systems of processes can possibly modulate their outputs. Therefore, my mechanistic model can help one gain a better understanding of how crosstalk between perception pathways can affect the dynamic behavior of the systems that involve perceptual binding.
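The coupled-process idea above can be illustrated with two mutually coupled Kuramoto-style phase oscillators, which reach temporal synchrony (phase locking) when the coupling strength outweighs their frequency mismatch. This is a generic sketch of that dynamic regime, not the author's model; all parameter values are arbitrary:

```python
import math

def phase_difference(w1, w2, k, dt=0.001, steps=20000, th1=0.0, th2=1.5):
    """Euler-integrate two coupled phase oscillators with natural
    frequencies w1, w2 (rad/s) and coupling strength k; return the final
    phase difference mapped to [-pi, pi]. The pair phase-locks when
    |w1 - w2| < 2 * k, settling at sin(diff) = (w2 - w1) / (2 * k)."""
    for _ in range(steps):
        d1 = w1 + k * math.sin(th2 - th1)
        d2 = w2 + k * math.sin(th1 - th2)
        th1 += d1 * dt
        th2 += d2 * dt
    return (th2 - th1 + math.pi) % (2 * math.pi) - math.pi
```

With w1 = 10, w2 = 10.5 and k = 1 the oscillators lock at a small constant phase lag; lowering k below the locking threshold instead yields the drifting regime, so a single coupling parameter moves the pair between the dynamic regimes the model distinguishes.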
6
Raj R, Dahlen D, Duyck K, Yu CR. Maximal Dependence Capturing as a Principle of Sensory Processing. Front Comput Neurosci 2022; 16:857653. [PMID: 35399919] [PMCID: PMC8989953] [DOI: 10.3389/fncom.2022.857653]
Abstract
Sensory inputs conveying information about the environment are often noisy and incomplete, yet the brain can achieve remarkable consistency in recognizing objects. Presumably, transforming the varying input patterns into invariant object representations is pivotal for this cognitive robustness. In the classic hierarchical representation framework, early stages of sensory processing utilize independent components of environmental stimuli to ensure efficient information transmission. Representations in subsequent stages are based on increasingly complex receptive fields along a hierarchical network. This framework accurately captures the input structures; however, it is challenging to achieve invariance in representing different appearances of objects. Here we assess theoretical and experimental inconsistencies of the current framework. In its place, we propose that individual neurons encode objects by following the principle of maximal dependence capturing (MDC), which compels each neuron to capture the structural components that contain maximal information about specific objects. We implement the proposition in a computational framework incorporating dimension expansion and sparse coding, which achieves consistent representations of object identities under occlusion, corruption, or high noise conditions. The framework neither requires learning the corrupted forms nor comprises deep network layers. Moreover, it explains various receptive field properties of neurons. Thus, MDC provides a unifying principle for sensory processing.
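A toy version of a clutter-robust readout in the spirit of the sparse coding described above (not the authors' implementation): each unit responds according to its overlap with the structural component it captures, and a winner-take-all readout recovers the object identity even when part of the input is occluded.

```python
import math

def winner(stimulus, templates):
    """Index of the template with the largest normalized overlap with
    the stimulus; a sparse, winner-take-all readout over units that
    each capture one structural component."""
    def response(t):
        norm = math.sqrt(sum(x * x for x in t))
        return sum(a * b for a, b in zip(t, stimulus)) / norm
    return max(range(len(templates)), key=lambda i: response(templates[i]))

# Two binary "objects"; hiding half of object 0 leaves it recognizable
obj0 = [1, 1, 1, 1, 0, 0, 0, 0]
obj1 = [0, 0, 0, 0, 1, 1, 1, 1]
occluded = [1, 1, 0, 0, 0, 0, 0, 0]  # half of obj0 occluded
```

The occluded input still overlaps only with the object-0 template, so no corrupted training views are needed — the robustness comes from the readout, which mirrors the claim made for MDC.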
Affiliation(s)
- Rishabh Raj
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Dar Dahlen
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Kyle Duyck
- Stowers Institute for Medical Research, Kansas City, MO, United States
- C. Ron Yu
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Department of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, KS, United States
7
The Global Configuration of Visual Stimuli Alters Co-Fluctuations of Cross-Hemispheric Human Brain Activity. J Neurosci 2021; 41:9756-9766. [PMID: 34663628] [DOI: 10.1523/jneurosci.3214-20.2021]
Abstract
We tested how a stimulus gestalt, defined by the neuronal interaction between local and global features of a stimulus, is represented within human primary visual cortex (V1). We used high-resolution fMRI, which serves as a surrogate of neuronal activation, to measure co-fluctuations within subregions of V1 as (male and female) subjects were presented with peripheral stimuli, each with different global configurations. We found stronger cross-hemisphere correlations when fine-scale V1 cortical subregions represented parts of the same object compared with different objects. This result was consistent with the vertical bias in global processing and, critically, was independent of the task and local discontinuities within objects. Thus, despite the relatively small receptive fields of neurons within V1, global stimulus configuration affects neuronal processing via correlated fluctuations between regions that represent different sectors of the visual field.
SIGNIFICANCE STATEMENT: We provide the first evidence for the impact of global stimulus configuration on cross-hemispheric fMRI fluctuations, measured in human primary visual cortex. Our results are consistent with changes in the level of γ-band synchrony, which has been shown to be affected by global stimulus configuration, being reflected in the level of fMRI co-fluctuations. These data help narrow the gap between knowledge of global stimulus configuration encoding at the single-neuron level versus at the behavioral level.
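At its core, the co-fluctuation measure used here is a correlation between response time series from subregions in the two hemispheres; a minimal sketch with made-up series:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long response time
    series, the standard co-fluctuation measure between regions."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

The study's comparison is then between such correlations computed when the two subregions represent parts of the same object versus parts of different objects.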
8
Kreiman G, Serre T. Beyond the feedforward sweep: feedback computations in the visual cortex. Ann N Y Acad Sci 2020; 1464:222-241. [PMID: 32112444] [PMCID: PMC7456511] [DOI: 10.1111/nyas.14320]
Abstract
Visual perception involves the rapid formation of a coarse image representation at the onset of visual processing, which is iteratively refined by late computational processes. These early versus late time windows approximately map onto feedforward and feedback processes, respectively. State-of-the-art convolutional neural networks, the main engine behind recent machine vision successes, are feedforward architectures. Their successes and limitations provide critical information regarding which visual tasks can be solved by purely feedforward processes and which require feedback mechanisms. We provide an overview of recent work in cognitive neuroscience and machine vision that highlights the possible role of feedback processes for both visual recognition and beyond. We conclude by discussing important open questions for future research.
Affiliation(s)
- Gabriel Kreiman
- Children’s Hospital, Harvard Medical School and Center for Brains, Minds, and Machines
- Thomas Serre
- Cognitive Linguistic & Psychological Sciences, Carney Institute for Brain Science, Brown University
9
Sikkens T, Bosman CA, Olcese U. The Role of Top-Down Modulation in Shaping Sensory Processing Across Brain States: Implications for Consciousness. Front Syst Neurosci 2019; 13:31. [PMID: 31680883] [PMCID: PMC6802962] [DOI: 10.3389/fnsys.2019.00031]
Abstract
Top-down, feedback projections account for a large portion of all connections between neurons in the thalamocortical system, yet their precise role remains the subject of much discussion. A large number of studies has focused on investigating how sensory information is transformed across hierarchically-distributed processing stages in a feedforward fashion, and computational models have shown that purely feedforward artificial neural networks can even outperform humans in pattern classification tasks. What is then the functional role of feedback connections? Several key roles have been identified, ranging from attentional modulation to, crucially, conscious perception. Specifically, most of the major theories on consciousness postulate that feedback connections would play an essential role in enabling sensory information to be consciously perceived. Consequently, it follows that their efficacy in modulating target regions should drastically decrease in nonconscious brain states [non-rapid eye movement (REM) sleep, anesthesia] compared to conscious ones (wakefulness), and also in instances when a given sensory stimulus is not perceived compared to when it is. Until recently, however, this prediction could only be tested with correlative experiments, due to the lack of techniques to selectively manipulate and measure the activity of feedback pathways. In this article, we will review the most recent literature on the functions of feedback connections across brain states and based on the presence or absence of perception. We will focus on experiments studying mismatch negativity, a phenomenon which has been hypothesized to rely on top-down modulation but which persists during nonconscious states. While feedback modulation is generally dampened in nonconscious states and enhanced when perception occurs, there are clear deviations from this rule. 
As we will discuss, this may pose a challenge to most theories of consciousness, and possibly require a change in how the level of consciousness in supposedly nonconscious states is assessed.
Affiliation(s)
- Tom Sikkens
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Research Priority Area Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands
- Conrado A Bosman
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Research Priority Area Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands
- Umberto Olcese
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Research Priority Area Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands
10
Kotchoubey B. Human Consciousness: Where Is It From and What Is It for. Front Psychol 2018; 9:567. [PMID: 29740366] [PMCID: PMC5924785] [DOI: 10.3389/fpsyg.2018.00567]
Abstract
Consciousness is not a process in the brain but a kind of behavior that, of course, is controlled by the brain like any other behavior. Human consciousness emerges on the interface between three components of animal behavior: communication, play, and the use of tools. These three components interact on the basis of anticipatory behavioral control, which is common for all complex forms of animal life. None of the three is exclusive to our close relatives, i.e., primates; all are broadly present among various species of mammals, birds, and even cephalopods; however, their particular combination in humans is unique. The interaction between communication and play yields symbolic games, most importantly language; the interaction between symbols and tools results in human praxis. Taken together, this gives rise to a mechanism that allows a creature, instead of performing controlling actions overtly, to play forward the corresponding behavioral options in a “second reality” of objectively (by means of tools) grounded symbolic systems. The theory possesses the following properties: (1) It is anti-reductionist and anti-eliminativist, and yet human consciousness is considered a purely natural (biological) phenomenon. (2) It avoids epiphenomenalism and indicates in which conditions human consciousness has evolutionary advantages, and in which it may even be disadvantageous. (3) It allows one to easily explain the most typical features of consciousness, such as objectivity, seriality and limited resources, the relationship between consciousness and explicit memory, the feeling of conscious agency, etc.
Affiliation(s)
- Boris Kotchoubey
- Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen, Germany
11
Fademrecht L, Nieuwenhuis J. Action Recognition in a Crowded Environment. Iperception 2017; 8:2041669517743521. [PMID: 29308177] [PMCID: PMC5751920] [DOI: 10.1177/2041669517743521]
Abstract
So far, action recognition has been mainly examined with small point-light human stimuli presented alone within a narrow central area of the observer's visual field. Yet, we need to recognize the actions of life-size humans viewed alone or surrounded by bystanders, whether they are seen in central or peripheral vision. Here, we examined the mechanisms in central vision and far periphery (40° eccentricity) involved in the recognition of the actions of a life-size actor (target) and their sensitivity to the presence of a crowd surrounding the target. In Experiment 1, we used an action adaptation paradigm to probe whether static or idly moving crowds might interfere with the recognition of a target's action (hug or clap). We found that this type of crowd, whose movements were dissimilar to the target action, hardly affected action recognition in central and peripheral vision. In Experiment 2, we examined whether crowd actions that were more similar to the target actions affected action recognition. Indeed, the presence of that crowd diminished adaptation aftereffects in central vision as well as in the periphery. We replicated Experiment 2 using a recognition task instead of an adaptation paradigm. With this task, we found evidence of decreased action recognition accuracy, but this was significant in peripheral vision only. Our results suggest that the presence of a crowd carrying out actions similar to that of the target affects its recognition. We outline how these results can be understood in terms of high-level crowding effects that operate on action-sensitive perceptual channels.
Affiliation(s)
- Laura Fademrecht
- Department of Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Baden-Württemberg, Germany
- Judith Nieuwenhuis
- Department of Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Baden-Württemberg, Germany
12
Lomp O, Faubel C, Schöner G. A Neural-Dynamic Architecture for Concurrent Estimation of Object Pose and Identity. Front Neurorobot 2017; 11:23. [PMID: 28503145] [PMCID: PMC5408094] [DOI: 10.3389/fnbot.2017.00023]
Abstract
Handling objects or interacting with a human user about objects on a shared tabletop requires that objects be identified after learning from a small number of views and that object pose be estimated. We present a neurally inspired architecture that learns object instances by storing features extracted from a single view of each object. Input features are color and edge histograms from a localized area that is updated during processing. The system finds the best-matching view for the object in a novel input image while concurrently estimating the object’s pose, aligning the learned view with current input. The system is based on neural dynamics, computationally operating in real time, and can handle dynamic scenes directly off live video input. In a scenario with 30 everyday objects, the system achieves recognition rates of 87.2% from a single training view for each object, while also estimating pose quite precisely. We further demonstrate that the system can track moving objects, and that it can segment the visual array, selecting and recognizing one object while suppressing input from another known object in the immediate vicinity. Evaluation on the COIL-100 dataset, in which objects are depicted from different viewing angles, revealed recognition rates of 91.1% on the first 30 objects, each learned from four training views.
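The view-matching step can be sketched with histogram intersection over stored feature histograms. The real system uses neural dynamics over color and edge histograms; the labels and histograms below are made up for illustration:

```python
def intersection(h1, h2):
    """Histogram intersection: similarity between two normalized
    feature histograms (1.0 for identical histograms)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def best_matching_view(input_hist, stored):
    """Label of the stored view whose histogram best matches the input,
    i.e., the learned view the system would align with the scene."""
    return max(stored, key=lambda label: intersection(input_hist, stored[label]))
```

A stored single view per object suffices for this kind of matching, which is the few-shot property the architecture is built around; pose estimation then aligns the winning view with the input.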
Affiliation(s)
- Oliver Lomp
- Institut für Neuroinformatik, Ruhr-University Bochum, Bochum, Germany
- Christian Faubel
- Institut für Neuroinformatik, Ruhr-University Bochum, Bochum, Germany
- Gregor Schöner
- Institut für Neuroinformatik, Ruhr-University Bochum, Bochum, Germany
13
Finlayson NJ, Golomb JD. Feature-location binding in 3D: Feature judgments are biased by 2D location but not position-in-depth. Vision Res 2016; 127:49-56. [PMID: 27468654] [PMCID: PMC5035601] [DOI: 10.1016/j.visres.2016.07.003]
Abstract
A fundamental aspect of human visual perception is the ability to recognize and locate objects in the environment. Importantly, our environment is predominantly three-dimensional (3D), but while there is considerable research exploring the binding of object features and location, it is unknown how depth information interacts with features in the object binding process. A recent paradigm called the spatial congruency bias demonstrated that 2D location is fundamentally bound to object features, such that irrelevant location information biases judgments of object features, but irrelevant feature information does not bias judgments of location or other features. Here, using the spatial congruency bias paradigm, we asked whether depth is processed as another type of location, or more like other features. We initially found that depth cued by binocular disparity biased judgments of object color. However, this result seemed to be driven more by the disparity differences than the depth percept: Depth cued by occlusion and size did not bias color judgments, whereas vertical disparity information (with no depth percept) did bias color judgments. Our results suggest that despite the 3D nature of our visual environment, only 2D location information - not position-in-depth - seems to be automatically bound to object features, with depth information processed more similarly to other features than to 2D location.
Affiliation(s)
- Nonie J Finlayson
- Department of Psychology, Center for Cognitive & Brain Sciences, The Ohio State University, Columbus, OH 43210, USA
- Julie D Golomb
- Department of Psychology, Center for Cognitive & Brain Sciences, The Ohio State University, Columbus, OH 43210, USA
14
There Is a "U" in Clutter: Evidence for Robust Sparse Codes Underlying Clutter Tolerance in Human Vision. J Neurosci 2016; 35:14148-59. [PMID: 26490856] [DOI: 10.1523/jneurosci.1211-15.2015]
Abstract
The ability to recognize objects in clutter is crucial for human vision, yet the underlying neural computations remain poorly understood. Previous single-unit electrophysiology recordings in inferotemporal cortex in monkeys and fMRI studies of object-selective cortex in humans have shown that the responses to pairs of objects can sometimes be well described as a weighted average of the responses to the constituent objects. Yet, from a computational standpoint, it is not clear how the challenge of object recognition in clutter can be solved if downstream areas must disentangle the identity of an unknown number of individual objects from the confounded average neuronal responses. An alternative idea is that recognition is based on a subpopulation of neurons that are robust to clutter, i.e., that do not show response averaging, but rather robust object-selective responses in the presence of clutter. Here we show that simulations using the HMAX model of object recognition in cortex can fit the aforementioned single-unit and fMRI data, showing that the averaging-like responses can be understood as the result of responses of object-selective neurons to suboptimal stimuli. Moreover, the model shows how object recognition can be achieved by a sparse readout of neurons whose selectivity is robust to clutter. Finally, the model provides a novel prediction about human object recognition performance, namely, that target recognition ability should show a U-shaped dependency on the similarity of simultaneously presented clutter objects. This prediction is confirmed experimentally, supporting a simple, unifying model of how the brain performs object recognition in clutter.
SIGNIFICANCE STATEMENT: The neural mechanisms underlying object recognition in cluttered scenes (i.e., containing more than one object) remain poorly understood. Studies have suggested that neural responses to multiple objects correspond to an average of the responses to the constituent objects. Yet, it is unclear how the identities of an unknown number of objects could be disentangled from a confounded average response. Here, we use a popular computational biological vision model to show that averaging-like responses can result from responses of clutter-tolerant neurons to suboptimal stimuli. The model also provides a novel prediction, that human detection ability should show a U-shaped dependency on target-clutter similarity, which is confirmed experimentally, supporting a simple, unifying account of how the brain performs object recognition in clutter.
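The clutter tolerance invoked in this abstract rests on max-like pooling. As a minimal illustrative sketch (our construction, not the authors' model or code; the toy numbers are arbitrary), a unit that max-pools over position-specific afferents keeps responding to its preferred object when a second object is added elsewhere, whereas a sum or average pool is diluted by the clutter:

```python
import numpy as np

# Toy HMAX-style unit: an S-layer of position-specific template matches,
# pooled by a C-layer max. Max pooling makes the response to the
# preferred object robust to an added clutter object at another position.
templates = np.eye(4)                # 4 position-specific afferents

def c_response(scene):               # scene: drive at each of 4 positions
    s = templates @ scene            # S-layer: per-position match
    return s.max()                   # C-layer: max pooling

target_alone = np.array([0.9, 0.0, 0.0, 0.0])
with_clutter = np.array([0.9, 0.0, 0.5, 0.0])   # clutter at position 2

assert c_response(with_clutter) == c_response(target_alone) == 0.9
# An average pool would instead be diluted by the clutter object:
assert (templates @ with_clutter).mean() < 0.9
```

The asserts mirror the abstract's point that a sparse readout of such clutter-robust, max-pooling units can support recognition despite additional objects.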
15. Rodríguez-Sánchez AJ, Fallah M, Leonardis A. Editorial: Hierarchical Object Representations in the Visual Cortex and Computer Vision. Front Comput Neurosci 2015; 9:142. [PMID: 26635595] [PMCID: PMC4653288] [DOI: 10.3389/fncom.2015.00142]
Affiliation(s)
- Antonio J Rodríguez-Sánchez
- Intelligent and Interactive Systems, Department of Computer Science, University of Innsbruck, Innsbruck, Austria
- Mazyar Fallah
- Visual Perception and Attention Laboratory, Centre for Vision Research, School of Kinesiology and Health Science, York University, Toronto, ON, Canada
- Aleš Leonardis
- School of Computer Science, University of Birmingham, Birmingham, UK
16. Strait CE, Sleezer BJ, Blanchard TC, Azab H, Castagno MD, Hayden BY. Neuronal selectivity for spatial positions of offers and choices in five reward regions. J Neurophysiol 2015; 115:1098-111. [PMID: 26631146] [DOI: 10.1152/jn.00325.2015]
Abstract
When we evaluate an option, how is the neural representation of its value linked to information that identifies it, such as its position in space? We hypothesized that value information and identity cues are not bound together at a particular point but are represented together at the single unit level throughout the entirety of the choice process. We examined neuronal responses in two-option gambling tasks with lateralized and asynchronous presentation of offers in five reward regions: orbitofrontal cortex (OFC, area 13), ventromedial prefrontal cortex (vmPFC, area 14), ventral striatum (VS), dorsal anterior cingulate cortex (dACC), and subgenual anterior cingulate cortex (sgACC, area 25). Neuronal responses in all areas are sensitive to the positions of both offers and of choices. This selectivity is strongest in reward-sensitive neurons, indicating that it is not a property of a specialized subpopulation of cells. We did not find consistent contralateral or any other organization to these responses, indicating that they may be difficult to detect with aggregate measures like neuroimaging or studies of lesion effects. These results suggest that value coding is wed to factors that identify the object throughout the reward system and suggest a possible solution to the binding problem raised by abstract value encoding schemes.
Affiliation(s)
- Caleb E Strait
- Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Brianna J Sleezer
- Department of Brain and Cognitive Sciences and Center for Visual Science, and Neuroscience Graduate Program, University of Rochester, Rochester, New York
- Tommy C Blanchard
- Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Habiba Azab
- Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Meghan D Castagno
- Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Benjamin Y Hayden
- Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
17.
Affiliation(s)
- Mark E J Sheffield
- Department of Neurobiology, Northwestern University, Evanston, Illinois, USA
- Daniel A Dombeck
- Department of Neurobiology, Northwestern University, Evanston, Illinois, USA
18. Wang Z, Cui P, Li F, Chang E, Yang S. A data-driven study of image feature extraction and fusion. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2014.02.030]
19. Orchard G, Martin JG, Vogelstein RJ, Etienne-Cummings R. Fast neuromimetic object recognition using FPGA outperforms GPU implementations. IEEE Trans Neural Netw Learn Syst 2013; 24:1239-1252. [PMID: 24808564] [DOI: 10.1109/tnnls.2013.2253563]
Abstract
Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array (FPGA), specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
20.
Abstract
The visual recognition of actions is an important visual function that is critical for motor learning and social communication. Action-selective neurons have been found in different cortical regions, including the superior temporal sulcus, parietal and premotor cortex. Among those are mirror neurons, which link visual and motor representations of body movements. While numerous theoretical models for the mirror neuron system have been proposed, the computational basis of the visual processing of goal-directed actions remains largely unclear. While most existing models focus on the possible role of motor representations in action recognition, we propose a model showing that many critical properties of action-selective visual neurons can be accounted for by well-established visual mechanisms. Our model accomplishes the recognition of hand actions from real video stimuli, exploiting exclusively mechanisms that can be implemented in a biologically plausible way by cortical neurons. We show that the model provides a unifying quantitatively consistent account of a variety of electrophysiological results from action-selective visual neurons. In addition, it makes a number of predictions, some of which could be confirmed in recent electrophysiological experiments.
21. Cortical gamma oscillations: the functional key is activation, not cognition. Neurosci Biobehav Rev 2013; 37:401-17. [PMID: 23333264] [DOI: 10.1016/j.neubiorev.2013.01.013]
Abstract
Cortical oscillatory synchrony in the gamma range has been attracting increasing attention in cognitive neuroscience ever since being proposed as a solution to the so-called binding problem. This growing literature is critically reviewed in both its basic neuroscience and cognitive aspects. A physiological "default assumption" regarding these oscillations is introduced, according to which they signal a state of physiological activation of cortical tissue, and the associated need to balance excitation with inhibition in particular. As such these oscillations would belong among a variety of generic neural control operations that enable neural tissue to perform its systems level functions, without implementing those functions themselves. Regional control of cerebral blood flow provides an analogy in this regard, and gamma oscillations are tightly correlated with this even more elementary control operation. As correlates of neural activation they will also covary with cognitive activity, and this typically suffices to account for the covariation between gamma activity and cognitive task variables. A number of specific cases of gamma synchrony are examined in this light, including the original impetus for attributing cognitive significance to gamma activity, namely the experiments interpreted as evidence for "binding by synchrony". This examination finds no compelling reasons to assign functional roles to oscillatory synchrony in the gamma range beyond its generic functions at the level of infrastructural neural control.
22. Korjoukov I, Jeurissen D, Kloosterman NA, Verhoeven JE, Scholte HS, Roelfsema PR. The time course of perceptual grouping in natural scenes. Psychol Sci 2012; 23:1482-9. [PMID: 23137967] [DOI: 10.1177/0956797612443832]
Abstract
Visual perception starts with localized filters that subdivide the image into fragments that undergo separate analyses. The visual system has to reconstruct objects by grouping image fragments that belong to the same object. A widely held view is that perceptual grouping occurs in parallel across the visual scene and without attention. To test this idea, we measured the speed of grouping in pictures of animals and vehicles. In a classification task, these pictures were categorized efficiently. In an image-parsing task, participants reported whether two cues fell on the same or different objects, and we measured reaction times. Despite the participants' fast object classification, perceptual grouping required more time if the distance between cues was larger, and we observed an additional delay when the cues fell on different parts of a single object. Parsing was also slower for inverted than for upright objects. These results imply that perception starts with rapid object classification and that rapid classification is followed by a serial perceptual grouping phase, which is more efficient for objects in a familiar orientation than for objects in an unfamiliar orientation.
Affiliation(s)
- Ilia Korjoukov
- Department of Vision and Cognition, Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences
23. Velik R. From simple receptors to complex multimodal percepts: a first global picture on the mechanisms involved in perceptual binding. Front Psychol 2012; 3:259. [PMID: 22837751] [PMCID: PMC3402139] [DOI: 10.3389/fpsyg.2012.00259]
Abstract
The binding problem in perception is concerned with answering the question how information from millions of sensory receptors, processed by millions of neurons working in parallel, can be merged into a unified percept. Binding in perception reaches from the lowest levels of feature binding up to the levels of multimodal binding of information coming from the different sensor modalities and also from other functional systems. The last 40 years of research have shown that the binding problem cannot be solved easily. Today, it is considered as one of the key questions to brain understanding. To date, various solutions have been suggested to the binding problem including: (1) combination coding, (2) binding by synchrony, (3) population coding, (4) binding by attention, (5) binding by knowledge, expectation, and memory, (6) hardwired vs. on-demand binding, (7) bundling and binding of features, (8) the feature-integration theory of attention, and (9) synchronization through top-down processes. Each of those hypotheses addresses important aspects of binding. However, each of them also suffers from certain weak points and can never give a complete explanation. This article gives a brief overview of the so far suggested solutions of perceptual binding and then shows that those are actually not mutually exclusive but can complement each other. A computationally verified model is presented which shows that, most likely, the different described mechanisms of binding act (1) at different hierarchical levels and (2) in different stages of "perceptual knowledge acquisition." The model furthermore considers and explains a number of inhibitory "filter mechanisms" that suppress the activation of inappropriate or currently irrelevant information.
24. Rolls ET. Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet. Front Comput Neurosci 2012; 6:35. [PMID: 22723777] [PMCID: PMC3378046] [DOI: 10.3389/fncom.2012.00035]
Abstract
Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus.
Affiliation(s)
- Edmund T. Rolls
- Oxford Centre for Computational Neuroscience, Oxford, UK
- Department of Computer Science, University of Warwick, Coventry, UK
25. Wyatte D, Herd S, Mingus B, O'Reilly R. The Role of Competitive Inhibition and Top-Down Feedback in Binding during Object Recognition. Front Psychol 2012; 3:182. [PMID: 22719733] [PMCID: PMC3376426] [DOI: 10.3389/fpsyg.2012.00182]
Abstract
How does the brain bind together visual features that are processed concurrently by different neurons into a unified percept suitable for processes such as object recognition? Here, we describe how simple, commonly accepted principles of neural processing can interact over time to solve the brain’s binding problem. We focus on mechanisms of neural inhibition and top-down feedback. Specifically, we describe how inhibition creates competition among neural populations that code different features, effectively suppressing irrelevant information, and thus minimizing illusory conjunctions. Top-down feedback contributes to binding in a similar manner, but by reinforcing relevant features. Together, inhibition and top-down feedback contribute to a competitive environment that ensures only the most appropriate features are bound together. We demonstrate this overall proposal using a biologically realistic neural model of vision that processes features across a hierarchy of interconnected brain areas. Finally, we argue that temporal synchrony plays only a limited role in binding – it does not simultaneously bind multiple objects, but does aid in creating additional contrast between relevant and irrelevant features. Thus, our overall theory constitutes a solution to the binding problem that relies only on simple neural principles without any binding-specific processes.
Affiliation(s)
- Dean Wyatte
- Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, USA
26. Whiteley L, Sahani M. Attention in a Bayesian framework. Front Hum Neurosci 2012; 6:100. [PMID: 22712010] [PMCID: PMC3375068] [DOI: 10.3389/fnhum.2012.00100]
Abstract
The behavioral phenomena of sensory attention are thought to reflect the allocation of a limited processing resource, but there is little consensus on the nature of the resource or why it should be limited. Here we argue that a fundamental bottleneck emerges naturally within Bayesian models of perception, and use this observation to frame a new computational account of the need for, and action of, attention - unifying diverse attentional phenomena in a way that goes beyond previous inferential, probabilistic and Bayesian models. Attentional effects are most evident in cluttered environments, and include both selective phenomena, where attention is invoked by cues that point to particular stimuli, and integrative phenomena, where attention is invoked dynamically by endogenous processing. However, most previous Bayesian accounts of attention have focused on describing relatively simple experimental settings, where cues shape expectations about a small number of upcoming stimuli and thus convey "prior" information about clearly defined objects. While operationally consistent with the experiments it seeks to describe, this view of attention as prior seems to miss many essential elements of both its selective and integrative roles, and thus cannot be easily extended to complex environments. We suggest that the resource bottleneck stems from the computational intractability of exact perceptual inference in complex settings, and that attention reflects an evolved mechanism for approximate inference which can be shaped to refine the local accuracy of perception. We show that this approach extends the simple picture of attention as prior, so as to provide a unified and computationally driven account of both selective and integrative attentional phenomena.
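The "attention as prior" baseline that this account extends can be made concrete in a few lines. The toy below is our construction with arbitrary numbers, not the authors' model: a spatial cue reshapes the prior over which location contains the target, so the same noisy evidence yields a sharper posterior at the cued location.

```python
import numpy as np

# p(evidence | target at location), the same for both conditions
likelihood = np.array([0.6, 0.5, 0.45, 0.4])
flat_prior = np.full(4, 0.25)                 # no cue: uniform prior
cued_prior = np.array([0.7, 0.1, 0.1, 0.1])   # cue points to location 0

def posterior(prior):
    # Bayes' rule: posterior proportional to prior times likelihood
    p = prior * likelihood
    return p / p.sum()

# The cue sharpens the posterior at the cued location
assert posterior(cued_prior)[0] > posterior(flat_prior)[0]
```

The paper's argument is that this simple prior-reweighting picture, while operationally adequate for cueing experiments, misses the selective and integrative roles attention plays in complex scenes.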
Affiliation(s)
- Louise Whiteley
- Gatsby Computational Neuroscience Unit, University College London, London, UK
27. Herzog MH, Otto TU, Ogmen H. The fate of visible features of invisible elements. Front Psychol 2012; 3:119. [PMID: 22557985] [PMCID: PMC3338119] [DOI: 10.3389/fpsyg.2012.00119]
Abstract
To investigate the integration of features, we have developed a paradigm in which an element is rendered invisible by visual masking. Still, the features of the element are visible as part of other display elements presented at different locations and times (sequential metacontrast). In this sense, we can "transport" features non-retinotopically across space and time. The features of the invisible element integrate with features of other elements if and only if the elements belong to the same spatio-temporal group. The mechanisms of this kind of feature integration seem to be quite different from classical mechanisms proposed for feature binding. We propose that feature processing, binding, and integration occur concurrently during processes that group elements into wholes.
Affiliation(s)
- Michael H Herzog
- Laboratory of Psychophysics, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
28.
Abstract
One important task for the visual system is to group image elements that belong to an object and to segregate them from other objects and the background. We here present an incremental grouping theory (IGT) that addresses the role of object-based attention in perceptual grouping at a psychological level and, at the same time, outlines the mechanisms for grouping at the neurophysiological level. The IGT proposes that there are two processes for perceptual grouping. The first process is base grouping and relies on neurons that are tuned to feature conjunctions. Base grouping is fast and occurs in parallel across the visual scene, but not all possible feature conjunctions can be coded as base groupings. If there are no neurons tuned to the relevant feature conjunctions, a second process called incremental grouping comes into play. Incremental grouping is a time-consuming and capacity-limited process that requires the gradual spread of enhanced neuronal activity across the representation of an object in the visual cortex. The spread of enhanced neuronal activity corresponds to the labeling of image elements with object-based attention.
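The incremental-grouping process described here, enhanced activity spreading gradually across an object's representation, behaves like a breadth-first traversal of connected image elements, so grouping time grows with distance along the object. A minimal sketch of that idea (our toy, not the IGT implementation):

```python
from collections import deque

# Toy incremental grouping: an object is a set of connected pixels, and
# enhanced activity spreads step by step (BFS) from a cued element.
object_pixels = {(0, i) for i in range(10)}   # a 10-pixel horizontal curve

def spread_time(start, target):
    # Steps needed for activity spreading from `start` to reach `target`
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (r, c), d = frontier.popleft()
        if (r, c) == target:
            return d
        for q in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if q in object_pixels and q not in seen:
                seen.add(q)
                frontier.append((q, d + 1))
    return None  # target is not part of the same object

# Grouping time grows with distance along the object, matching the
# capacity-limited, time-consuming character of incremental grouping.
assert spread_time((0, 0), (0, 9)) == 9
assert spread_time((0, 0), (0, 3)) == 3
```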
29. Cichy RM, Sterzer P, Heinzle J, Elliott LT, Ramirez F, Haynes JD. Probing principles of large-scale object representation: category preference and location encoding. Hum Brain Mapp 2012; 34:1636-51. [PMID: 22371355] [DOI: 10.1002/hbm.22020]
Abstract
Knowledge about the principles that govern large-scale neural representations of objects is central to a systematic understanding of object recognition. We used functional magnetic resonance imaging (fMRI) and multivariate pattern classification to investigate two such candidate principles: category preference and location encoding. The former designates the preferential activation of distinct cortical regions by a specific category of objects. The latter refers to information about where in the visual field a particular object is located. Participants viewed exemplars of three object categories (faces, bodies, and scenes) that were presented left or right of fixation. The analysis of fMRI activation patterns revealed the following. Category-selective regions retained their preference to the same categories in a manner tolerant to changes in object location. However, category preference was not absolute: category-selective regions also contained location-tolerant information about nonpreferred categories. Furthermore, location information was present throughout high-level ventral visual cortex and was distributed systematically across the cortical surface. We found more location information in lateral-occipital cortex than in ventral-temporal cortex. Our results provide a systematic account of the extent to which the principles of category preference and location encoding determine the representation of objects in the high-level ventral visual cortex.
Affiliation(s)
- Radoslaw Martin Cichy
- Bernstein Center for Computational Neuroscience Berlin, Charité-Universitätsmedizin Berlin, Germany
30.
Abstract
Mounting evidence suggests that 'core object recognition,' the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains poorly understood. Here we review evidence ranging from individual neurons and neuronal populations to behavior and computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical subnetworks with a common functional goal.
Affiliation(s)
- James J DiCarlo
- Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
31. Samuelson LK, Smith LB, Perry LK, Spencer JP. Grounding word learning in space. PLoS One 2011; 6:e28095. [PMID: 22194807] [PMCID: PMC3237424] [DOI: 10.1371/journal.pone.0028095]
Abstract
Humans and objects, and thus social interactions about objects, exist within space. Words direct listeners' attention to specific regions of space. Thus, a strong correspondence exists between where one looks, one's bodily orientation, and what one sees. This leads to further correspondence with what one remembers. Here, we present data suggesting that children use associations between space and objects and space and words to link words and objects—space binds labels to their referents. We tested this claim in four experiments, showing that the spatial consistency of where objects are presented affects children's word learning. Next, we demonstrate that a process model that grounds word learning in the known neural dynamics of spatial attention, spatial memory, and associative learning can capture the suite of results reported here. This model also predicts that space is special, a prediction supported in a fifth experiment that shows children do not use color as a cue to bind words and objects. In a final experiment, we ask whether spatial consistency affects word learning in naturalistic word learning contexts. Children of parents who spontaneously keep objects in a consistent spatial location during naming interactions learn words more effectively. Together, the model and data show that space is a powerful tool that can effectively ground word learning in social contexts.
Affiliation(s)
- Larissa K Samuelson
- Department of Psychology and Delta Center, University of Iowa, Iowa City, Iowa, USA
32. Amarasingham A, Harrison MT, Hatsopoulos NG, Geman S. Conditional modeling and the jitter method of spike resampling. J Neurophysiol 2011; 107:517-31. [PMID: 22031767] [DOI: 10.1152/jn.00633.2011]
Abstract
The existence and role of fine-temporal structure in the spiking activity of central neurons is the subject of an enduring debate among physiologists. To a large extent, the problem is a statistical one: what inferences can be drawn from neurons monitored in the absence of full control over their presynaptic environments? In principle, properly crafted resampling methods can still produce statistically correct hypothesis tests. We focus on the approach to resampling known as jitter. We review a wide range of jitter techniques, illustrated by both simulation experiments and selected analyses of spike data from motor cortical neurons. We rely on an intuitive and rigorous statistical framework known as conditional modeling to reveal otherwise hidden assumptions and to support precise conclusions. Among other applications, we review statistical tests for exploring any proposed limit on the rate of change of spiking probabilities, exact tests for the significance of repeated fine-temporal patterns of spikes, and the construction of acceptance bands for testing any purported relationship between sensory or motor variables and synchrony or other fine-temporal events.
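The core of the jitter method reviewed here can be sketched compactly. The toy below is our construction with synthetic spike trains, not the authors' code: it jitters one train within fixed 50 ms bins, which preserves per-bin spike counts (the null hypothesis that firing rates carry no structure finer than the jitter width), and asks whether the observed count of 5 ms coincidences exceeds the jittered null.

```python
import numpy as np

rng = np.random.default_rng(1)

def interval_jitter(spikes, width):
    # Resample each spike uniformly within its fixed bin of `width`,
    # preserving per-bin counts: the jitter null keeps firing rates at
    # the timescale of `width` and destroys only finer timing.
    bins = np.floor(np.asarray(spikes) / width)
    return np.sort(bins * width + rng.uniform(0.0, width, size=len(spikes)))

def sync_count(a, b, window):
    # Number of spikes in `a` with a spike in `b` within +/- window
    return int(sum(np.any(np.abs(b - t) <= window) for t in a))

# Synthetic data: 10 s of spiking; train_b copies 30 of train_a's spikes
# with a 1 ms lag (injected fine-temporal synchrony) plus background.
train_a = np.sort(rng.uniform(0.0, 10.0, 100))
train_b = np.sort(np.concatenate([train_a[:30] + 0.001,
                                  rng.uniform(0.0, 10.0, 70)]))

observed = sync_count(train_a, train_b, window=0.005)
null = [sync_count(interval_jitter(train_a, width=0.05), train_b, 0.005)
        for _ in range(500)]
p = (1 + sum(n >= observed for n in null)) / (1 + len(null))
# Small p: the observed synchrony exceeds what rate covariation slower
# than the 50 ms jitter width can explain.
```

Because the resampling conserves coarse-timescale rates by construction, any synchrony attributable to slow rate covariation survives jittering, so a small p specifically implicates fine-temporal structure.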
Affiliation(s)
- Asohan Amarasingham
- Department of Mathematics, The City College of New York, and Program in Cognitive Neuroscience, The Graduate Center, City University of New York, New York, New York, USA
33. From Vision to Decision: The Role of Visual Attention in Elite Sports Performance. Eye Contact Lens 2011; 37:131-9. [DOI: 10.1097/icl.0b013e3182190b7f]
34. Cichy RM, Chen Y, Haynes JD. Encoding the identity and location of objects in human LOC. Neuroimage 2010; 54:2297-307. [PMID: 20869451] [DOI: 10.1016/j.neuroimage.2010.09.044]
Abstract
We are able to recognize objects independent of their location in the visual field. At the same time, we also keep track of the location of objects to orient ourselves and to interact with the environment. The lateral occipital complex (LOC) has been suggested as the prime cortical region for representation of object identity. However, the extent to which LOC also represents object location has remained debated. In this study we used high-resolution fMRI in combination with multivoxel pattern classification to investigate the cortical encoding of three object exemplars from four different categories presented in two different locations. This approach allowed us to study location-tolerant object information and object-tolerant location information in LOC, both at the level of categories and exemplars. We found evidence for both location-tolerant object information and object-tolerant location information in LOC at the level of categories and exemplars. Our results further highlight the mixing of identity and location information in the ventral visual pathway.
Affiliation(s)
- Radoslaw Martin Cichy
- Bernstein Center for Computational Neuroscience Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany.
35. Continuous transformation learning of translation invariant representations. Exp Brain Res 2010; 204:255-70. [PMID: 20544186 DOI: 10.1007/s00221-010-2309-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2009] [Accepted: 05/21/2010] [Indexed: 01/24/2023]
Abstract
We show that spatial continuity can enable a network to learn translation invariant representations of objects by self-organization in a hierarchical model of cortical processing in the ventral visual system. During 'continuous transformation learning', the active synapses from each overlapping transform are associatively modified onto the set of postsynaptic neurons. Because other transforms of the same object overlap with previously learned exemplars, a common set of postsynaptic neurons is activated by the new transforms, and learning of the new active inputs onto the same postsynaptic neurons is facilitated. We show that the transforms must be close for this to occur; that the temporal order of presentation of each transformed image during training is not crucial for learning to occur; that relatively large numbers of transforms can be learned; and that such continuous transformation learning can be usefully combined with temporal trace training.
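The associative mechanism this abstract describes can be illustrated with a deliberately minimal sketch (ours, not the authors' model): a winner-take-all Hebbian layer in which successive, overlapping transforms of an object recruit the same postsynaptic neuron. The layer sizes, learning rate, and weight normalization are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 20, 5
W = rng.random((n_out, n_in)) * 0.1   # small random initial weights

def present(x, lr=0.5):
    """Winner-take-all Hebbian step: the most active output neuron
    strengthens its weights onto the currently active inputs."""
    y = W @ x
    win = int(np.argmax(y))
    W[win] += lr * x
    W[win] /= np.linalg.norm(W[win])  # normalization keeps weights bounded
    return win

def transform(pos, width=6):
    """Binary input pattern for the 'object' at a given retinal position."""
    x = np.zeros(n_in)
    x[pos:pos + width] = 1.0
    return x

# Each transform shifts the object by one input, so consecutive
# transforms overlap in most of their active inputs and the winner
# recruited for the first view keeps winning for all later views.
winners = [present(transform(p)) for p in range(10)]
```

This is the core of continuous transformation learning as described above: no temporal trace is needed, because spatial overlap alone ties each new transform to the postsynaptic neurons already trained on its neighbours.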
36. Further evidence for the spread of attention during contour grouping: A reply to Crundall, Dewhurst, and Underwood (2008). Atten Percept Psychophys 2010; 72:849-62. [DOI: 10.3758/app.72.3.849] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
37.
Abstract
The neural bases of behavior are often discussed in terms of perceptual, cognitive, and motor stages, defined within an information processing framework that was originally inspired by models of human abstract problem solving. Here, we review a growing body of neurophysiological data that is difficult to reconcile with this influential theoretical perspective. As an alternative foundation for interpreting neural data, we consider frameworks borrowed from ethology, which emphasize the kinds of real-time interactive behaviors that animals have engaged in for millions of years. In particular, we discuss an ethologically-inspired view of interactive behavior as simultaneous processes that specify potential motor actions and select between them. We review how recent neurophysiological data from diverse cortical and subcortical regions appear more compatible with this parallel view than with the classical view of serial information processing stages.
Affiliation(s)
- Paul Cisek
- Groupe de Recherche sur le Système Nerveux Central (FRSQ), Département de Physiologie, Université de Montréal, Montréal, Québec H3C3J7, Canada.
38. Task effects, performance levels, features, configurations, and holistic face processing: a reply to Rossion. Acta Psychol (Amst) 2009; 132:286-92. [PMID: 19665104 DOI: 10.1016/j.actpsy.2009.07.004] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2009] [Revised: 07/08/2009] [Accepted: 07/10/2009] [Indexed: 11/23/2022] Open
Abstract
A recent article in Acta Psychologica ("Picture-plane inversion leads to qualitative changes of face perception" by Rossion [Rossion, B. (2008). Picture-plane inversion leads to qualitative changes of face perception. Acta Psychologica (Amst), 128(2), 274-289]) criticized several aspects of an earlier paper of ours [Riesenhuber, M., Jarudi, I., Gilad, S., & Sinha, P. (2004). Face processing in humans is compatible with a simple shape-based model of vision. Proceedings of the Royal Society of London B (Supplements), 271, S448-S450]. We here address Rossion's criticisms and correct some misunderstandings. To frame the discussion, we first review our previously presented computational model of face recognition in cortex [Jiang, X., Rosen, E., Zeffiro, T., Vanmeter, J., Blanz, V., & Riesenhuber, M. (2006). Evaluation of a shape-based model of human face discrimination using FMRI and behavioral techniques. Neuron, 50(1), 159-172] that provides a concrete biologically plausible computational substrate for holistic coding, namely a neural representation learned for upright faces, in the spirit of the original simple-to-complex hierarchical model of vision by Hubel and Wiesel. We show that Rossion's and others' data support the model, and that there is actually a convergence of views on the mechanisms underlying face recognition, in particular regarding holistic processing.
39.
Abstract
How the brain 'binds' information to create a coherent perceptual experience is an enduring question. Recent research in the psychophysics of perceptual binding and developments in fMRI analysis techniques are bringing us closer to an understanding of how the brain solves the binding problem.
Affiliation(s)
- David Whitney
- The Department of Psychology, and The Center for Mind and Brain, University of California, Davis, CA 95618, USA.
40. Li N, Cox DD, Zoccolan D, DiCarlo JJ. What response properties do individual neurons need to underlie position and clutter "invariant" object recognition? J Neurophysiol 2009; 102:360-76. [PMID: 19439676 DOI: 10.1152/jn.90745.2008] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Primates can easily identify visual objects over large changes in retinal position--a property commonly referred to as position "invariance." This ability is widely assumed to depend on neurons in inferior temporal cortex (IT) that can respond selectively to isolated visual objects over similarly large ranges of retinal position. However, in the real world, objects rarely appear in isolation, and the interplay between position invariance and the representation of multiple objects (i.e., clutter) remains unresolved. At the heart of this issue is the intuition that the representations of nearby objects can interfere with one another and that the large receptive fields needed for position invariance can exacerbate this problem by increasing the range over which interference acts. Indeed, most IT neurons' responses are strongly affected by the presence of clutter. While external mechanisms (such as attention) are often invoked as a way out of the problem, we show (using recorded neuronal data and simulations) that the intrinsic properties of IT population responses, by themselves, can support object recognition in the face of limited clutter. Furthermore, we carried out extensive simulations of hypothetical neuronal populations to identify the essential individual-neuron ingredients of a good population representation. These simulations show that the crucial neuronal property to support recognition in clutter is not preservation of response magnitude, but preservation of each neuron's rank-order object preference under identity-preserving image transformations (e.g., clutter). Because IT neuronal responses often exhibit that response property, while neurons in earlier visual areas (e.g., V1) do not, we suggest that preserving the rank-order object preference regardless of clutter, rather than the response magnitude, more precisely describes the goal of individual neurons at the top of the ventral visual stream.
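The central claim of this abstract, that what matters for recognition in clutter is preserving each neuron's rank-order object preference rather than its response magnitude, can be illustrated with a toy population decoder (our sketch, not the paper's analysis; the population size, gain range, and correlation-based readout are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_objects = 50, 8
# response of each neuron to each object presented in isolation
clean = rng.random((n_neurons, n_objects))

def cluttered(obj, preserve_rank=True):
    """Simulate clutter as a per-neuron response suppression.
    A multiplicative gain change preserves each neuron's object
    preferences; shuffling the population vector destroys them."""
    r = clean[:, obj] * rng.uniform(0.3, 0.9, n_neurons)
    if not preserve_rank:
        r = rng.permutation(r)
    return r

def decode(r):
    """Pick the object whose clean population template best matches r."""
    corr = [np.corrcoef(r, clean[:, o])[0, 1] for o in range(n_objects)]
    return int(np.argmax(corr))

acc_rank = np.mean([decode(cluttered(o)) == o
                    for o in range(n_objects) for _ in range(20)])
acc_shuffled = np.mean([decode(cluttered(o, preserve_rank=False)) == o
                        for o in range(n_objects) for _ in range(20)])
```

Even though clutter suppresses response magnitudes by up to 70% here, the population readout survives as long as the relative pattern is intact; once the pattern is scrambled, decoding collapses to chance.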
Affiliation(s)
- Nuo Li
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA
41. Seymour KJ, Scott McDonald J, Clifford CWG. Failure of colour and contrast polarity identification at threshold for detection of motion and global form. Vision Res 2009; 49:1592-8. [PMID: 19341760 DOI: 10.1016/j.visres.2009.03.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2009] [Revised: 03/17/2009] [Accepted: 03/23/2009] [Indexed: 11/24/2022]
Abstract
We used identification at threshold to systematically measure binding costs in two visual modalities. We presented a conjunction of two features as a signal stimulus and concurrently measured detection and identification performance as a function of three threshold variables: duration, contrast and coherence. Discrepancies between detection and identification sensitivity functions demonstrated a consistent processing cost to visual feature binding. Our findings suggest that feature binding is indeed a genuine problem for the brain to solve. This simple paradigm can transfer across arbitrary feature combinations and is therefore suitable to use in experiments addressing mechanisms of sensory integration.
Affiliation(s)
- Kiley J Seymour
- School of Psychology, Colour Form Motion Lab, University of Sydney, Sydney, NSW, Australia.
42.
43. Zhao S, Yao L, Jin Z, Xiong X, Wu X, Zou Q, Yao G, Cai X, Liu Y. Sparse representation of global features of visual images in human primary visual cortex: Evidence from fMRI. Sci Bull (Beijing) 2008. [DOI: 10.1007/s11434-008-0254-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
44. Spratling M. Predictive coding as a model of biased competition in visual attention. Vision Res 2008; 48:1391-408. [DOI: 10.1016/j.visres.2008.03.009] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2007] [Revised: 02/29/2008] [Accepted: 03/14/2008] [Indexed: 11/29/2022]
45. Learning to recognize objects on the fly: a neurally based dynamic field approach. Neural Netw 2008; 21:562-76. [PMID: 18501555 DOI: 10.1016/j.neunet.2008.03.007] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2006] [Revised: 03/07/2008] [Accepted: 03/07/2008] [Indexed: 11/21/2022]
Abstract
Autonomous robots interacting with human users need to build and continuously update scene representations. This entails the problem of rapidly learning to recognize new objects under user guidance. Based on analogies with human visual working memory, we propose a dynamical field architecture, in which localized peaks of activation represent objects over a small number of simple feature dimensions. Learning consists of laying down memory traces of such peaks. We implement the dynamical field model on a service robot and demonstrate how it learns 30 objects from a very small number of views (about 5 per object are sufficient). We also illustrate how properties of feature binding emerge from this framework.
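The memory-trace idea can be caricatured in a single feature dimension (a loose sketch under our own assumptions, not the paper's robot architecture): learning deposits a Gaussian trace where a peak of activation formed, and a later weak cue drives the field above threshold only where a trace supports it. The field size, trace rate, cue strength, and threshold below are invented for illustration.

```python
import numpy as np

x = np.linspace(0, 100, 101)          # one simple feature dimension
trace = np.zeros_like(x)              # accumulated memory trace

def gauss(center, width=5.0):
    return np.exp(-((x - center) ** 2) / (2 * width ** 2))

def learn(feature, rate=0.8):
    """A strong, peak-forming presentation lays down a localized trace."""
    global trace
    trace += rate * gauss(feature)

def recall(feature, cue_strength=0.4, threshold=0.6):
    """A weak cue crosses threshold only where the trace adds support."""
    u = cue_strength * gauss(feature) + 0.5 * trace
    return bool(u.max() > threshold)

learn(30)   # one learning presentation at feature value 30
```

After a single presentation, a weak cue at the learned feature value is recognized while the same cue at an untrained value is not, which is the essence of learning object representations "on the fly" from very few views.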
46. Bodovitz S. The neural correlate of consciousness. J Theor Biol 2008; 254:594-8. [PMID: 18514741 DOI: 10.1016/j.jtbi.2008.04.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2008] [Revised: 03/22/2008] [Accepted: 04/15/2008] [Indexed: 11/18/2022]
Abstract
I propose that we are only aware of changes in our underlying cognition. This hypothesis is based on four lines of evidence. (1) Without changes in visual input (including fixational eye movements), static images fade from awareness. (2) Consciousness appears to be continuous, but is actually broken up into discrete cycles of cognition. Without continuity, conscious awareness disintegrates into a series of isolated cycles. The simplest mechanism for creating continuity is to track the changes between the cycles. (3) While these conscious vectors are putative, they have a clear source: the dorsolateral prefrontal cortex (DLPFC). The DLPFC is active during awareness of changes, and this awareness is disrupted by repetitive transcranial magnetic stimulation. (4) When the DLPFC and the orbital and inferior parietal cortices are deactivated during dreaming, conscious awareness is absent even though the rest of the brain is active. Moreover, Lau and Passingham showed that activation of the DLPFC, but no other brain region, correlates with awareness. In summary, if the DLPFC and conscious vectors are the neural correlate of consciousness, then we are only aware of changes in our underlying cognition. The glue that holds conscious awareness together is conscious awareness.
Affiliation(s)
- Steven Bodovitz
- BioPerspectives, 2040 Hyde Street, San Francisco, CA 94109, USA.
47. Roudi Y, Treves A. Representing where along with what information in a model of a cortical patch. PLoS Comput Biol 2008; 4:e1000012. [PMID: 18369416 PMCID: PMC2268242 DOI: 10.1371/journal.pcbi.1000012] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2007] [Accepted: 01/29/2008] [Indexed: 11/18/2022] Open
Abstract
Behaving in the real world requires flexibly combining and maintaining information about both continuous and discrete variables. In the visual domain, several lines of evidence show that neurons in some cortical networks can simultaneously represent information about the position and identity of objects, and maintain this combined representation when the object is no longer present. The underlying network mechanism for this combined representation is, however, unknown. In this paper, we approach this issue through a theoretical analysis of recurrent networks. We present a model of a cortical network that can retrieve information about the identity of objects from incomplete transient cues, while simultaneously representing their spatial position. Our results show that two factors are important in making this possible: A) a metric organisation of the recurrent connections, and B) a spatially localised change in the linear gain of neurons. Metric connectivity enables a localised retrieval of information about object identity, while gain modulation ensures localisation in the correct position. Importantly, we find that the amount of information that the network can retrieve and retain about identity is strongly affected by the amount of information it maintains about position. This balance can be controlled by global signals that change the neuronal gain. These results show that anatomical and physiological properties, which have long been known to characterise cortical networks, naturally endow them with the ability to maintain a conjunctive representation of the identity and location of objects.
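Factors (A) and (B) from this abstract can be illustrated with a toy ring network (our drastic simplification, not the paper's attractor model; the kernel widths, gain profile, and divisive normalization are assumptions): metric connectivity provides distance-dependent excitation, and a spatially localized gain increase pins the resulting activity bump to the cued position.

```python
import numpy as np

n = 100
pos = np.arange(n)
dist = np.abs(pos[:, None] - pos[None, :])
dist = np.minimum(dist, n - dist)               # circular (metric) distance
W = np.exp(-(dist / 5.0) ** 2) - 0.05           # local excitation, uniform inhibition

def settle(cue_pos, steps=200):
    """Run the rate dynamics under a gain bump centred at cue_pos."""
    d = np.minimum(np.abs(pos - cue_pos), n - np.abs(pos - cue_pos))
    gain = 1.0 + 0.5 * np.exp(-(d / 10.0) ** 2)  # localized change in linear gain
    r = np.full(n, 0.1)                          # uniform initial activity
    for _ in range(steps):
        r = np.maximum(0.0, 0.05 * gain * (W @ r) + 0.1)
        r /= 1.0 + r.sum() / 10.0                # divisive normalization bounds rates
    return int(np.argmax(r))
```

Because the input is uniform, only the gain modulation breaks the ring's symmetry, so the bump forms exactly where the gain signal indicates: a cartoon of how gain control can localize retrieval "in the correct position".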
Affiliation(s)
- Yasser Roudi
- Gatsby Computational Neuroscience Unit, UCL, United Kingdom.
48. Pichevar R, Rouat J. Monophonic sound source separation with an unsupervised network of spiking neurones. Neurocomputing 2007. [DOI: 10.1016/j.neucom.2007.08.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
49. Plate J. An Analysis of the Binding Problem. Philosophical Psychology 2007. [DOI: 10.1080/09515080701694136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
50. Murphy TM, Finkel LH. Shape representation by a network of V4-like cells. Neural Netw 2007; 20:851-67. [PMID: 17884335 DOI: 10.1016/j.neunet.2007.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2005] [Revised: 06/27/2007] [Accepted: 06/27/2007] [Indexed: 10/23/2022]
Abstract
Cells in extrastriate visual cortex have been reported to be selective for various configurations of local contour shape [Pasupathy, A., & Connor, C. E. (2001). Shape representation in area V4: Position-specific tuning for boundary conformation. The Journal of Neurophysiology, 86 (5), 2505-2519; Hegdé, J., & Van Essen, D. C. (2003). Strategies of shape representation in macaque visual area V2. Visual Neuroscience, 20 (3), 313-328]. Specifically, Pasupathy and Connor found that in area V4 most cells are strongly responsive to a particular local contour conformation located at a specific position on the object's boundary. We used a population of "V4-like cells"-units sensitive to multiple shape features modeled after V4 cell behavior-to generate representations of different shapes. Standard classification algorithms (earth mover's distance, support vector machines) applied to this population representation demonstrate high recognition accuracies classifying handwritten digits in the MNIST database and objects in the MPEG-7 Shape Silhouette database. We compare the performance of the V4-like unit representation to the "shape context" representation of Belongie et al. [Belongie, S., Malik, J., & Puzicha, J. (2002). Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 (24), 509-522]. Results show roughly comparable recognition accuracies using the two representations when tested on portions of the MNIST database. We analyze the relative contributions of various V4-like feature sensitivities to recognition accuracy and robustness to noise - feature sensitivities include curvature magnitude, direction of curvature, global orientation of the contour segment, distance of the contour segment from object center, and modulatory effect of adjacent contour regions. Among these, local curvature appears to be the most informative variable for shape recognition. 
Our results support the hypothesis that V4 cells function as robust shape descriptors in the early stages of object recognition.
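The finding that local curvature is the most informative variable can be sketched with a crude boundary-curvature descriptor (our toy example, not the paper's V4-like units; the parametric shapes and the discrete curvature estimator are assumptions):

```python
import numpy as np

def boundary(shape, n=200):
    """Sample a closed contour: a circle or a three-lobed 'flower'."""
    t = np.linspace(0, 2 * np.pi, n, endpoint=False)
    r = np.ones_like(t) if shape == "circle" else 1.0 + 0.3 * np.cos(3 * t)
    return np.c_[r * np.cos(t), r * np.sin(t)]

def curvature(p):
    """Discrete curvature kappa = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2)
    from first/second differences along the contour."""
    d1 = np.gradient(p, axis=0)
    d2 = np.gradient(d1, axis=0)
    num = d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0]
    den = (d1[:, 0] ** 2 + d1[:, 1] ** 2) ** 1.5
    return num / den

k_circle = curvature(boundary("circle"))   # near-constant ~1 (unit radius)
k_flower = curvature(boundary("flower"))   # varies strongly along the lobes
```

The profile of curvature along the boundary already separates the two shapes: the circle's curvature is essentially constant, while the flower's swings between convex lobes and concave notches, which is the kind of position-specific boundary-conformation signal the V4-like units above are tuned to.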
Affiliation(s)
- Thomas M Murphy
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, 301 Hayden Hall, 3320 Smith Walk, Philadelphia, PA 19104-6321, USA.