1. Dora S, Bohte SM, Pennartz CMA. Deep Gated Hebbian Predictive Coding Accounts for Emergence of Complex Neural Response Properties Along the Visual Cortical Hierarchy. Front Comput Neurosci 2021; 15:666131. PMID: 34393744; PMCID: PMC8355371; DOI: 10.3389/fncom.2021.666131.
Abstract
Predictive coding provides a computational paradigm for modeling perceptual processing as the construction of representations accounting for causes of sensory inputs. Here, we developed a scalable, deep network architecture for predictive coding that is trained using a gated Hebbian learning rule and mimics the feedforward and feedback connectivity of the cortex. After training on image datasets, the models formed latent representations in higher areas that allowed reconstruction of the original images. We analyzed low- and high-level properties such as orientation selectivity, object selectivity and sparseness of neuronal populations in the model. As reported experimentally, image selectivity increased systematically across ascending areas in the model hierarchy. Depending on the strength of regularization factors, sparseness also increased from lower to higher areas. The results suggest a rationale as to why experimental results on sparseness across the cortical hierarchy have been inconsistent. Finally, representations for different object classes became more distinguishable from lower to higher areas. Thus, deep neural networks trained using a gated Hebbian formulation of predictive coding can reproduce several properties associated with neuronal responses along the visual cortical hierarchy.
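The prediction-and-error scheme summarized above can be illustrated with a minimal single-layer sketch. This is a toy construction of generic predictive coding, not the paper's gated deep architecture: a latent vector r generates a prediction W·r of the input, and the prediction error drives both inference (updating r) and a Hebbian-style weight update (the product of the error and the latent activity).

```python
# Minimal single-layer predictive-coding sketch (illustrative only; the
# learning rates, sizes, and data below are arbitrary assumptions).

def infer_and_learn(x, W, r, lr_r=0.05, lr_w=0.01, steps=200):
    n_in, n_lat = len(W), len(W[0])
    for _ in range(steps):
        pred = [sum(W[i][j] * r[j] for j in range(n_lat)) for i in range(n_in)]
        err = [x[i] - pred[i] for i in range(n_in)]        # prediction error
        # inference: move r along the fed-back error (W^T e)
        for j in range(n_lat):
            r[j] += lr_r * sum(W[i][j] * err[i] for i in range(n_in))
        # learning: Hebbian product of error and latent activity
        for i in range(n_in):
            for j in range(n_lat):
                W[i][j] += lr_w * err[i] * r[j]
    return sum(e * e for e in err)

x = [1.0, 0.5, -0.5]                        # one "sensory input"
W = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]    # small initial weights
r = [0.0, 0.0]
e0 = sum(v * v for v in x)                  # squared error with zero prediction
e1 = infer_and_learn(x, W, r)               # error after joint inference/learning
```

After training, the latent vector reconstructs the input through the weights, mirroring the paper's observation that higher-area representations allow reconstruction of the original images.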
Affiliation(s)
- Shirin Dora
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands; Intelligent Systems Research Centre, Ulster University, Londonderry, United Kingdom
- Sander M Bohte
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands; Machine Learning Group, Centre of Mathematics and Computer Science, Amsterdam, Netherlands
- Cyriel M A Pennartz
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
2. Lehky SR, Tanaka K, Sereno AB. Pseudosparse neural coding in the visual system of primates. Commun Biol 2021; 4:50. PMID: 33420410; PMCID: PMC7794537; DOI: 10.1038/s42003-020-01572-2.
Abstract
When measuring sparseness in neural populations as an indicator of efficient coding, an implicit assumption is that each stimulus activates a different random set of neurons. In other words, population responses to different stimuli are, on average, uncorrelated. Here we examine neurophysiological data from four lobes of macaque monkey cortex, including V1, V2, MT, anterior inferotemporal cortex, lateral intraparietal cortex, the frontal eye fields, and perirhinal cortex, to determine how correlated population responses are. We call the mean correlation the pseudosparseness index, because high pseudosparseness can mimic statistical properties of sparseness without being authentically sparse. In every data set we find high levels of pseudosparseness, ranging from 0.59 to 0.98, substantially greater than the value of 0.00 for authentic sparseness. This was true for synthetic and natural stimuli, as well as for single-electrode and multielectrode data. A model indicates that a key variable producing high pseudosparseness is the standard deviation of spontaneous activity across the population. Consistently high values of pseudosparseness in the data demand reconsideration of the sparse coding literature, as well as consideration of the degree to which authentic sparseness provides a useful framework for understanding neural coding in the cortex.
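The pseudosparseness index described here, the mean correlation between population response vectors to different stimuli, can be sketched directly. The response vectors below are made up for illustration (a shared baseline profile plus stimulus-specific bumps); this is not the authors' code or data.

```python
import math

def pearson(x, y):
    # standard Pearson correlation between two response vectors
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def pseudosparseness(responses):
    """responses[s][i] = response of neuron i to stimulus s;
    returns the mean pairwise correlation across stimuli."""
    pairs = [(a, b) for a in range(len(responses))
             for b in range(a + 1, len(responses))]
    return sum(pearson(responses[a], responses[b]) for a, b in pairs) / len(pairs)

# a shared spontaneous-activity profile with stimulus-specific bumps
# pushes the index toward 1; authentically sparse codes would sit near 0
base = [5.0, 1.0, 4.0, 0.5, 3.0]
resp = [[base[i] + (2.0 if i == s else 0.0) for i in range(5)] for s in range(3)]
high = pseudosparseness(resp)
```

The shared-baseline example illustrates the paper's model result that the spread of spontaneous activity across the population is a key driver of high pseudosparseness.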
Affiliation(s)
- Sidney R Lehky
- Cognitive Brain Mapping Laboratory, RIKEN Center for Brain Science, Wako-shi, Saitama, 351-0198, Japan; Computational Neurobiology Laboratory, The Salk Institute, La Jolla, CA, 92037, USA
- Keiji Tanaka
- Cognitive Brain Mapping Laboratory, RIKEN Center for Brain Science, Wako-shi, Saitama, 351-0198, Japan
- Anne B Sereno
- Department of Psychological Sciences, Purdue University, West Lafayette, IN, 47907, USA; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, 47907, USA
3. Liu X, Zhen Z, Liu J. Hierarchical Sparse Coding of Objects in Deep Convolutional Neural Networks. Front Comput Neurosci 2020; 14:578158. PMID: 33362499; PMCID: PMC7755594; DOI: 10.3389/fncom.2020.578158.
Abstract
Recently, deep convolutional neural networks (DCNNs) have attained human-level performance on challenging object recognition tasks owing to their complex internal representations. However, it remains unclear how objects are represented in DCNNs, with their overwhelming number of features and non-linear operations. In parallel, the same question has been extensively studied in the primate brain, and three types of coding schemes have been found: one object is coded by the entire neuronal population (distributed coding), by one single neuron (local coding), or by a subset of the neuronal population (sparse coding). Here we asked whether DCNNs adopt any of these coding schemes to represent objects. Specifically, we used the population sparseness index, which is widely used in neurophysiological studies of the primate brain, to characterize the degree of sparseness at each layer in representative DCNNs pretrained for object categorization. We found that the sparse coding scheme was adopted at all layers of the DCNNs, and the degree of sparseness increased along the hierarchy. That is, the coding scheme shifted from distributed-like coding at lower layers to local-like coding at higher layers. Further, the degree of sparseness was positively correlated with DCNNs' performance in object categorization, suggesting that the coding scheme was related to behavioral performance. Finally, with a lesion approach, we demonstrated that both external learning experiences and built-in gating operations were necessary to construct such a hierarchical coding scheme. In sum, our study provides direct evidence that DCNNs adopt a hierarchically evolved sparse coding scheme as the biological brain does, suggesting the possibility of an implementation-independent principle underlying object recognition.
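One common form of the population sparseness index mentioned above is the Treves–Rolls activity ratio with the Vinje–Gallant normalization. The sketch below uses made-up layer activations (the paper does not publish this exact code) to show how the index separates distributed-like from local-like activity patterns.

```python
def sparseness_index(r):
    """Vinje-Gallant population sparseness: 1 for a one-hot response
    pattern (local-like coding), 0 for a uniform one (distributed coding)."""
    n = len(r)
    a = (sum(r) / n) ** 2 / (sum(v * v for v in r) / n)  # Treves-Rolls ratio
    return (1 - a) / (1 - 1 / n)

# invented activation vectors standing in for a lower and a higher layer
layer_low  = [0.9, 1.1, 1.0, 0.8, 1.2, 1.0]   # broadly distributed activity
layer_high = [3.0, 0.0, 0.1, 0.0, 0.0, 0.2]   # few strongly active units
```

Computed over real unit activations layer by layer, an increase in this index along the hierarchy is exactly the distributed-to-local shift the paper reports.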
Affiliation(s)
- Xingyu Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Zonglei Zhen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Jia Liu
- Department of Psychology & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
4. Sorooshyari SK, Sheng H, Poor HV. Object Recognition at Higher Regions of the Ventral Visual Stream via Dynamic Inference. Front Comput Neurosci 2020; 14:46. PMID: 32655388; PMCID: PMC7325008; DOI: 10.3389/fncom.2020.00046.
Affiliation(s)
- Siamak K. Sorooshyari (corresponding author)
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, United States
- Huanjie Sheng
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, United States
- H. Vincent Poor
- Department of Electrical Engineering, Princeton University, Princeton, NJ, United States
5. Dong Q, Liu B, Hu Z. Non-uniqueness Phenomenon of Object Representation in Modeling IT Cortex by Deep Convolutional Neural Network (DCNN). Front Comput Neurosci 2020; 14:35. PMID: 32477087; PMCID: PMC7235366; DOI: 10.3389/fncom.2020.00035.
Abstract
Recently, the DCNN (deep convolutional neural network) has been advocated as a general and promising approach for modeling neural object representation in primate inferotemporal cortex. In this work, we show that an inherent non-uniqueness problem exists in DCNN-based modeling of image object representations. This non-uniqueness phenomenon reveals, to some extent, a theoretical limitation of this general modeling approach, and calls for due caution in practice.
Affiliation(s)
- Qiulei Dong
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
- Bo Liu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Zhanyi Hu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
6. Bowers JS, Martin ND, Gale EM. Researchers Keep Rejecting Grandmother Cells after Running the Wrong Experiments: The Issue Is How Familiar Stimuli Are Identified. Bioessays 2020; 41:e1800248. PMID: 31322760; DOI: 10.1002/bies.201800248.
Abstract
There is widespread agreement in neuroscience and psychology that the visual system identifies objects and faces based on a pattern of activation over many neurons, each neuron being involved in representing many different categories. The hypothesis that the visual system includes finely tuned neurons for specific objects or faces for the sake of identification, so-called "grandmother cells", is widely rejected. Here it is argued that the rejection of grandmother cells is premature. Grandmother cells constitute a hypothesis of how familiar visual categories are identified, but the primary evidence against this hypothesis comes from studies that have failed to observe neurons that selectively respond to unfamiliar stimuli. These findings are reviewed and it is shown that they are irrelevant. Neuroscientists need to better understand existing models of face and object identification that include grandmother cells and then compare the selectivity of these units with single neurons responding to stimuli that can be identified.
Affiliation(s)
- Jeffrey S Bowers
- School of Psychological Science, University of Bristol, Bristol, BS8 1TU, UK
- Nicolas D Martin
- School of Psychological Science, University of Bristol, Bristol, BS8 1TU, UK
- Ella M Gale
- School of Psychological Science, University of Bristol, Bristol, BS8 1TU, UK
7. Tsushima Y, Sawahata Y, Komine K. Task-dependent fMRI decoder with the power to extend Gabor patch results to natural images. Sci Rep 2020; 10:1382. PMID: 31992812; PMCID: PMC6987206; DOI: 10.1038/s41598-020-58241-x.
Abstract
Scientists are often asked to what extent a simple finding in a laboratory can be generalized to complicated phenomena in our daily lives. The same is equally true of vision science; numerous critical discoveries about our visual system have been made using very simple visual images, such as Gabor patches, but to what extent can these findings be applied to more natural images? Here, we used the fMRI decoding technique and directly tested whether findings obtained with primitive visual stimuli (Gabor patches) were applicable to natural images. In the fMRI experiments, participants performed depth and resolution tasks with both Gabor patches and natural images. We created an fMRI decoder from the results of the Gabor patch experiments that classified a brain activity pattern into the depth or resolution task, and then examined how successfully this task-dependent decoder could sort brain activity patterns from the natural image experiment into the depth or resolution task. We found that the task-dependent decoder constructed from the Gabor patch experiments could predict which task (depth or resolution) a participant was engaged in during the natural image experiments, especially in areas V3 and middle temporal (MT+). This is consistent with previous research on cortical activation relating to depth perception rather than to perceptual processing of display resolution. These results provide firm evidence that the fMRI decoding technique possesses the power to evaluate the application of Gabor patch results (laboratory findings) to natural images (everyday affairs), representing a new approach for studying the mechanisms of visual perception.
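The train-on-Gabor, test-on-natural logic can be sketched with a toy nearest-centroid decoder. The study used real voxel patterns and presumably a more sophisticated classifier; the patterns and labels below are invented purely to show the cross-stimulus structure of the analysis.

```python
# Toy cross-stimulus decoding sketch: fit task centroids on patterns from
# one stimulus class ("Gabor"), classify patterns from another ("natural").

def centroid(patterns):
    n = len(patterns)
    return [sum(p[i] for p in patterns) / n for i in range(len(patterns[0]))]

def classify(x, centroids):
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# training data: invented activity patterns from Gabor-patch trials
gabor = {
    "depth":      [[1.0, 0.2, 0.1], [0.9, 0.3, 0.2]],
    "resolution": [[0.1, 0.9, 1.0], [0.2, 1.0, 0.8]],
}
cents = {task: centroid(trials) for task, trials in gabor.items()}

# test data: an invented pattern from a natural-image depth trial
natural_depth = [0.8, 0.4, 0.3]
pred = classify(natural_depth, cents)
```

Successful classification of natural-image trials by centroids fit on Gabor trials is the generalization the paper tests.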
Affiliation(s)
- Yoshiaki Tsushima
- Center for Information and Neural Networks, National Institute of Information and Communication Technology, 3-5, Hikaridai, Soraku-gun, Seika-cho, 619-0289, Kyoto, Japan; Science & Technology Research Laboratories, Japan Broadcasting Corporation (NHK), 1-10-11 Kinuta, Setagaya-ku, 157-8510, Tokyo, Japan
- Yasuhito Sawahata
- Science & Technology Research Laboratories, Japan Broadcasting Corporation (NHK), 1-10-11 Kinuta, Setagaya-ku, 157-8510, Tokyo, Japan
- Kazuteru Komine
- Science & Technology Research Laboratories, Japan Broadcasting Corporation (NHK), 1-10-11 Kinuta, Setagaya-ku, 157-8510, Tokyo, Japan
8. Lehky SR, Phan AH, Cichocki A, Tanaka K. Face Representations via Tensorfaces of Various Complexities. Neural Comput 2019; 32:281-329. PMID: 31835006; DOI: 10.1162/neco_a_01258.
Abstract
Neurons selective for faces exist in humans and monkeys. However, characteristics of face cell receptive fields are poorly understood. In this theoretical study, we explore the effects of complexity, defined as algorithmic information (Kolmogorov complexity) and logical depth, on possible ways that face cells may be organized. We use tensor decompositions to decompose faces into a set of components, called tensorfaces, and their associated weights, which can be interpreted as model face cells and their firing rates. These tensorfaces form a high-dimensional representation space in which each tensorface forms an axis of the space. A distinctive feature of the decomposition algorithm is the ability to specify tensorface complexity. We found that low-complexity tensorfaces have blob-like appearances crudely approximating faces, while high-complexity tensorfaces appear clearly face-like. Low-complexity tensorfaces require a larger population to reach a criterion face reconstruction error than medium- or high-complexity tensorfaces, and thus are inefficient by that criterion. Low-complexity tensorfaces, however, generalize better when representing statistically novel faces, which are faces falling beyond the distribution of face description parameters found in the tensorface training set. The degree to which face representations are parts based or global forms a continuum as a function of tensorface complexity, with low and medium tensorfaces being more parts based. Given the computational load imposed in creating high-complexity face cells (in the form of algorithmic information and logical depth) and in the absence of a compelling advantage to using high-complexity cells, we suggest face representations consist of a mixture of low- and medium-complexity face cells.
Affiliation(s)
- Sidney R Lehky
- Cognitive Brain Mapping Laboratory, RIKEN Center for Brain Science, Wako-shi, Saitama 351-0198, Japan; Computational Neurobiology Laboratory, Salk Institute, La Jolla, CA 92037, U.S.A.
- Anh Huy Phan
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia; Institute of Global Innovation Research, Tokyo University of Agriculture and Technology, Tokyo 183-8538, Japan
- Andrzej Cichocki
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia; Systems Research Institute, Polish Academy of Sciences, 01447 Warsaw, Poland; College of Computer Science, Hangzhou Dianzu University, Hangzhou 310018, China; Institute of Global Innovation Research, Tokyo University of Agriculture and Technology, Tokyo 183-8538, Japan
- Keiji Tanaka
- Cognitive Brain Mapping Laboratory, RIKEN Center for Brain Science, Wako-shi, Saitama 351-0198, Japan
9. Biologically-Inspired Computational Neural Mechanism for Human Action/activity Recognition: A Review. Electronics 2019. DOI: 10.3390/electronics8101169.
Abstract
Theoretical neuroscience has yielded valuable insight into the mechanisms by which the mammalian visual system recognizes biological movement. This work spans many different fields of research, including psychology, neurophysiology, neuropsychology, computer vision, and artificial intelligence (AI), and research in these areas has produced a wealth of data and plausible computational models. Here, a review of this subject is presented. The paper describes different perspectives on the task, including action perception, computational and knowledge-based modeling, and psychological and neuroscience approaches.
10. Rezai O, Stoffl L, Tripp B. How are response properties in the middle temporal area related to inference on visual motion patterns? Neural Netw 2019; 121:122-131. PMID: 31541880; DOI: 10.1016/j.neunet.2019.08.027.
Abstract
Neurons in the primate middle temporal area (MT) respond to moving stimuli, with strong tuning for motion speed and direction. These responses have been characterized in detail, but the functional significance of these details (e.g. shapes and widths of speed tuning curves) is unclear, because they cannot be selectively manipulated. To estimate their functional significance, we used a detailed model of MT population responses as input to convolutional networks that performed sophisticated motion processing tasks (visual odometry and gesture recognition). We manipulated the distributions of speed and direction tuning widths, and studied the effects on task performance. We also studied performance with random linear mixtures of the responses, and with responses that had the same representational dissimilarity as the model populations, but were otherwise randomized. The width of speed and direction tuning both affected task performance, despite the networks having been optimized individually for each tuning variation, but the specific effects were different in each task. Random linear mixing improved performance of the odometry task, but not the gesture recognition task. Randomizing the responses while maintaining representational dissimilarity resulted in poor odometry performance. In summary, despite full optimization of the deep networks in each case, each manipulation of the representation affected performance of sophisticated visual tasks. Representation properties such as tuning width and representational similarity have been studied extensively from other perspectives, but this work provides new insight into their possible roles in sophisticated visual inference.
11. Pereira U, Brunel N. Attractor Dynamics in Networks with Learning Rules Inferred from In Vivo Data. Neuron 2018; 99:227-238.e4. PMID: 29909997; PMCID: PMC6091895; DOI: 10.1016/j.neuron.2018.05.038.
Abstract
The attractor neural network scenario is a popular scenario for memory storage in the association cortex, but there is still a large gap between models based on this scenario and experimental data. We study a recurrent network model in which both learning rules and distribution of stored patterns are inferred from distributions of visual responses for novel and familiar images in the inferior temporal cortex (ITC). Unlike classical attractor neural network models, our model exhibits graded activity in retrieval states, with distributions of firing rates that are close to lognormal. Inferred learning rules are close to maximizing the number of stored patterns within a family of unsupervised Hebbian learning rules, suggesting that learning rules in ITC are optimized to store a large number of attractor states. Finally, we show that there exist two types of retrieval states: one in which firing rates are constant in time and another in which firing rates fluctuate chaotically.
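For orientation, the classical attractor scenario that this paper extends can be sketched with a standard binary Hopfield network and the textbook Hebbian outer-product rule. This is NOT the graded-activity model or the data-inferred learning rules of the paper, just the baseline scenario it improves on: stored patterns become fixed points, and a corrupted cue relaxes back to the nearest stored pattern.

```python
# Classical Hopfield sketch of Hebbian attractor memory (binary +/-1 units).

def train(patterns):
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:                       # Hebbian outer-product rule
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j] / n
    return W

def recall(W, state, steps=5):
    n = len(state)
    s = list(state)
    for _ in range(steps):                   # synchronous sign updates
        s = [1 if sum(W[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = train([p1, p2])
cue = [1, 1, 1, -1, -1, -1, -1, -1]          # p1 with one flipped bit
out = recall(W, cue)                         # relaxes back to p1
```

In contrast to these binary fixed points, the paper's retrieval states have graded, approximately lognormal firing-rate distributions.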
Affiliation(s)
- Ulises Pereira
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
- Nicolas Brunel
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA; Department of Neurobiology, The University of Chicago, Chicago, IL 60637, USA; Department of Neurobiology, Duke University, Durham, NC 27710, USA; Department of Physics, Duke University, Durham, NC 27708, USA
12. Dong Q, Wang H, Hu Z. Statistics of Visual Responses to Image Object Stimuli from Primate AIT Neurons to DNN Neurons. Neural Comput 2017; 30:447-476. PMID: 29162010; DOI: 10.1162/neco_a_01039.
Abstract
Under the goal-driven paradigm, Yamins et al. (2014; Yamins & DiCarlo, 2016) have shown that by optimizing only the final eight-way categorization performance of a four-layer hierarchical network, not only can its top output layer quantitatively predict IT neuron responses but its penultimate layer can also automatically predict V4 neuron responses. Currently, deep neural networks (DNNs) in the field of computer vision have reached image object categorization performance comparable to that of human beings on ImageNet, a data set that contains 1.3 million training images of 1000 categories. We explore whether DNN neurons (units in DNNs) possess image object representational statistics similar to those of monkey IT neurons, particularly when the network becomes deeper and the number of image categories becomes larger, using VGG19, a typical and widely used deep network of 19 layers in the computer vision field. Following Lehky, Kiani, Esteky, and Tanaka (2011, 2014), where the response statistics of 674 IT neurons to 806 image stimuli are analyzed using three measures (kurtosis, Pareto tail index, and intrinsic dimensionality), we investigate three issues in this letter using the same measures: (1) the similarities and differences in neural response statistics between VGG19 and primate IT cortex, (2) the variation trends of the response statistics of VGG19 neurons across layers from low to high, and (3) the variation trends of the response statistics of VGG19 neurons as the numbers of stimuli and neurons increase.
We find that the response statistics on both single-neuron selectivity and population sparseness of VGG19 neurons are fundamentally different from those of IT neurons in most cases; that, as the number of neurons in different layers and the number of stimuli increase, the response statistics of neurons across layers do not substantially change; and that the estimated intrinsic dimensionality values at the low convolutional layers of VGG19 are considerably larger than the value of approximately 100 reported for IT neurons in Lehky et al. (2014), whereas those at the high fully connected layers are close to or lower than 100. To the best of our knowledge, this work is the first attempt to analyze the response statistics of DNN neurons with respect to primate IT neurons in image object representation.
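Two of the three measures used in this line of work, kurtosis and the Pareto tail index, can be sketched as follows. The Hill estimator shown here is one common way to estimate a tail index over the k largest responses; the cited studies may estimate it differently, and the response vectors are invented for illustration.

```python
import math

def kurtosis(x):
    # standard fourth-moment kurtosis: 3 for a Gaussian, larger values
    # indicate heavier tails (a few large responses among many small ones)
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2

def hill_tail_index(x, k):
    # Hill estimator of the Pareto tail index over the k largest values
    s = sorted(x, reverse=True)
    return k / sum(math.log(s[i] / s[k]) for i in range(k))

sparse_like = [0.0] * 20 + [5.0, 8.0]   # a few large responses -> high kurtosis
dense_like = [1.0, -1.0] * 11           # bimodal, no heavy tail -> low kurtosis
```

Applied along a single neuron's responses these measures quantify selectivity; applied across a population's responses to one stimulus they quantify population sparseness.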
Affiliation(s)
- Qiulei Dong
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing 100190, China
- Hong Wang
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhanyi Hu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing 100190, China
13. Dong Q, Liu B, Hu Z. Comparison of IT Neural Response Statistics with Simulations. Front Comput Neurosci 2017; 11:60. PMID: 28747882; PMCID: PMC5506183; DOI: 10.3389/fncom.2017.00060.
Abstract
Lehky et al. (2011) provided a statistical analysis of the responses of 674 recorded neurons to 806 image stimuli in anterior inferotemporal (AIT) cortex of two monkeys. In terms of kurtosis and Pareto tail index, they observed that the population sparseness of both unnormalized and normalized responses is always larger than the single-neuron selectivity, and hence concluded that the critical features for individual neurons in primate AIT cortex are not very complex, but that there is an indefinitely large number of them. In this work, we explore an "inverse problem" by simulation: by simulating a scenario in which each neuron responds to only a very limited number of stimuli among a very large number of neurons and stimuli, we assess whether the population sparseness is always larger than the single-neuron selectivity. Our simulation results show that the population sparseness exceeds the single-neuron selectivity in most cases even when the numbers of neurons and stimuli are much larger than several hundred, which confirms the observations in Lehky et al. (2011). In addition, we found that the variances of the computed kurtosis and Pareto tail index are quite large in some cases, which reveals some limitations of these two criteria when used for evaluating neuron responses.
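The core of such a simulation, sparseness computed along a row of a neuron-by-stimulus response matrix (single-neuron selectivity) versus down a column (population sparseness), can be sketched with a toy random response matrix. The sizes, response distributions, and the Treves–Rolls-style sparseness measure below are arbitrary choices of this sketch, not the paper's actual setup or its kurtosis/Pareto analysis.

```python
import random

def sparseness(r):
    # Treves-Rolls ratio with the Vinje-Gallant normalization (0..1)
    n = len(r)
    a = (sum(r) / n) ** 2 / (sum(v * v for v in r) / n)
    return (1 - a) / (1 - 1 / n)

random.seed(0)
n_neurons, n_stimuli = 200, 100
# each neuron: low background response plus rare strong effective stimuli
resp = [[random.expovariate(5.0) +
         (random.uniform(2.0, 4.0) if random.random() < 0.05 else 0.0)
         for _ in range(n_stimuli)] for _ in range(n_neurons)]

selectivity = [sparseness(row) for row in resp]                    # per neuron
pop_sparse = [sparseness([resp[i][s] for i in range(n_neurons)])   # per stimulus
              for s in range(n_stimuli)]
mean_sel = sum(selectivity) / n_neurons
mean_pop = sum(pop_sparse) / n_stimuli
```

Comparing the two means across many random draws is the shape of the "inverse problem" experiment described above.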
Affiliation(s)
- Qiulei Dong
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Bo Liu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Zhanyi Hu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
14. Khan S, Tripp B. An empirical model of activity in macaque inferior temporal cortex. Neural Netw 2017; 87:8-21. PMID: 28039780; DOI: 10.1016/j.neunet.2016.12.001.
Abstract
There are compelling computational models of many properties of the primate ventral visual stream, but a gap remains between the models and the physiology. To facilitate ongoing refinement of these models, we have compiled diverse information from the electrophysiology literature into a statistical model of inferotemporal (IT) cortex responses. This is a purely descriptive model, so it has little explanatory power. However it is able to directly incorporate a rich and extensible set of tuning properties. So far, we have approximated tuning curves and statistics of tuning diversity for occlusion, clutter, size, orientation, position, and object selectivity in early versus late response phases. We integrated the model with the V-REP simulator, which provides stimulus properties in a simulated physical environment. In contrast with the empirical model presented here, mechanistic models are ultimately more useful for understanding neural systems. However, a detailed empirical model may be useful as a source of labeled data for optimizing and validating mechanistic models, or as a source of input to models of other brain areas.
Affiliation(s)
- Salman Khan
- Department of Systems Design Engineering, University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada N2L 3G1; Center for Theoretical Neuroscience, University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada N2L 3G1
- Bryan Tripp
- Department of Systems Design Engineering, University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada N2L 3G1; Center for Theoretical Neuroscience, University of Waterloo, 200 University Ave. W., Waterloo, Ontario, Canada N2L 3G1
15
|
Hatori Y, Mashita T, Sakai K. Sparse coding generates curvature selectivity in V4 neurons. J Opt Soc Am A Opt Image Sci Vis 2016; 33:527-537. [PMID: 27140760 DOI: 10.1364/josaa.33.000527] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
The cortical area V4 produces a representation of curvature as the intermediate-level representation of an object's shape. We investigated whether sparse coding is the principle driving the generation of the spatial properties of the receptive field in V4 that exhibit curvature selectivity. To investigate the role of sparseness in the construction of curvature representations, we applied component analysis with a sparseness constraint to the activity of model V2 neurons that were responding to shapes derived from natural images. Our simulation results showed that single basis functions with medium degrees of sparseness (0.7-0.8) produced curvature selectivity, and their population activity produced acute curvature bias. The results support the hypothesis that sparseness plays an essential role in the construction of curvature selectivity in V4.
Collapse
|
16
|
Neural representation for object recognition in inferotemporal cortex. Curr Opin Neurobiol 2016; 37:23-35. [PMID: 26771242 DOI: 10.1016/j.conb.2015.12.001] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2015] [Accepted: 12/01/2015] [Indexed: 11/22/2022]
Abstract
We suggest that population representations of objects in inferotemporal cortex lie on a continuum between a purely structural, parts-based description and a purely holistic description. The intrinsic dimensionality of object representation is estimated to be around 100, perhaps with lower dimensionalities for object representations more toward the holistic end of the spectrum. Cognitive knowledge in the form of semantic information and task information feeds back to inferotemporal cortex from perirhinal and prefrontal cortex, respectively, providing high-level multimodal-based expectations that assist in the interpretation of object stimuli. Integration of object information across eye movements may also contribute to object recognition through a process of active vision.
Collapse
|
17
|
Rezai O, Kleinhans A, Matallanas E, Selby B, Tripp BP. Modeling the shape hierarchy for visually guided grasping. Front Comput Neurosci 2014; 8:132. [PMID: 25386134 PMCID: PMC4209868 DOI: 10.3389/fncom.2014.00132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2014] [Accepted: 09/26/2014] [Indexed: 11/25/2022] Open
Abstract
The monkey anterior intraparietal area (AIP) encodes visual information about three-dimensional object shape that is used to shape the hand for grasping. We modeled shape tuning in visual AIP neurons and its relationship with curvature and gradient information from the caudal intraparietal area (CIP). The main goal was to gain insight into the kinds of shape parameterizations that can account for AIP tuning and that are consistent with both the inputs to AIP and the role of AIP in grasping. We first experimented with superquadric shape parameters. We considered superquadrics because they occupy a role in robotics that is similar to that of AIP, in that superquadric fits are derived from visual input and used for grasp planning. We also experimented with an alternative shape parameterization that was based on an Isomap dimension reduction of spatial derivatives of depth (i.e., distance from the observer to the object surface). We considered an Isomap-based model because its parameters lacked discontinuities between similar shapes. When we matched the dimension of the Isomap to the number of superquadric parameters, the superquadric model fit the AIP data somewhat more closely. However, higher-dimensional Isomaps provided excellent fits. Also, we found that the Isomap parameters could be approximated much more accurately than superquadric parameters by feedforward neural networks with CIP-like inputs. We conclude that Isomaps, or perhaps alternative dimension reductions of visual inputs to AIP, provide a promising model of AIP electrophysiology data. Further work is needed to test whether such shape parameterizations actually provide an effective basis for grasp control.
Collapse
Affiliation(s)
- Omid Rezai
- Department of Systems Design Engineering, Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, ON, Canada
| | - Ashley Kleinhans
- Mobile Intelligent Autonomous Systems, Council for Scientific and Industrial Research, Pretoria, South Africa; School of Mechanical and Industrial Engineering, University of Johannesburg, Johannesburg, South Africa
| | | | - Ben Selby
- Department of Systems Design Engineering, Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, ON, Canada
| | - Bryan P Tripp
- Department of Systems Design Engineering, Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, ON, Canada
| |
Collapse
|
18
|
Meyer T, Walker C, Cho RY, Olson CR. Image familiarization sharpens response dynamics of neurons in inferotemporal cortex. Nat Neurosci 2014; 17:1388-94. [PMID: 25151263 PMCID: PMC4613775 DOI: 10.1038/nn.3794] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2014] [Accepted: 07/22/2014] [Indexed: 11/09/2022]
Abstract
Repeated viewing of an image over days and weeks induces a marked reduction in the strength with which neurons in monkey inferotemporal cortex respond to it. The processing advantage that attaches to this reduction is unknown. One possibility is that truncation of the response to a familiar image leaves neurons in a state of readiness to respond to ensuing images and thereby enhances their ability to track rapidly changing displays. We explored this possibility by assessing neuronal responses to familiar and novel images in rapid serial visual displays. Inferotemporal neurons responded more strongly to familiar than to novel images in such displays. The effect was stronger among putative inhibitory neurons than among putative excitatory neurons. A comparable effect occurred at the level of the scalp potential in humans. We conclude that long-term familiarization sharpens the response dynamics of neurons in both monkey and human extrastriate visual cortex.
Collapse
Affiliation(s)
- Travis Meyer
- Center for the Neural Basis of Cognition, Carnegie Mellon University, 115 Mellon Institute, 4400 Fifth Avenue, Pittsburgh, PA 15213
| | - Christopher Walker
- Department of Psychiatry, Thomas Detre Hall of the Western Psychiatric Institute and Clinic, University of Pittsburgh, 3811 O'Hara Street, Pittsburgh, PA 15213
| | - Raymond Y. Cho
- Department of Psychiatry, Thomas Detre Hall of the Western Psychiatric Institute and Clinic, University of Pittsburgh, 3811 O'Hara Street, Pittsburgh, PA 15213
| | - Carl R. Olson
- Center for the Neural Basis of Cognition, Carnegie Mellon University, 115 Mellon Institute, 4400 Fifth Avenue, Pittsburgh, PA 15213
- Department of Neuroscience, University of Pittsburgh, 446 Crawford Hall, Pittsburgh, PA 15260
| |
Collapse
|
19
|
Lehky SR, Kiani R, Esteky H, Tanaka K. Dimensionality of object representations in monkey inferotemporal cortex. Neural Comput 2014; 26:2135-62. [PMID: 25058707 DOI: 10.1162/neco_a_00648] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
We have calculated the intrinsic dimensionality of visual object representations in anterior inferotemporal (AIT) cortex, based on responses of a large sample of cells stimulated with photographs of diverse objects. Because dimensionality was dependent on data set size, we determined asymptotic dimensionality as both the number of neurons and the number of stimulus images approached infinity. Our final dimensionality estimate was 93 (SD: ± 11), indicating that there is a basis set of approximately 100 independent features that characterize the dimensions of neural object space. We believe this is the first estimate of the dimensionality of neural visual representations based on single-cell neurophysiological data. The dimensionality of AIT object representations was much lower than the dimensionality of the stimuli. We suggest that there may be a gradual reduction in the dimensionality of object representations in neural populations going from retina to inferotemporal cortex as receptive fields become increasingly complex.
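The paper uses an asymptotic extrapolation procedure; a simpler, related back-of-the-envelope estimator of linear dimensionality is the participation ratio of the covariance eigenvalues. The sketch below (synthetic data and variable names are illustrative, not the authors' method) shows how a population of 200 model neurons driven by 5 latent features yields an estimate near 5, far below the neuron count:

```python
import numpy as np

def participation_ratio(responses):
    """Linear dimensionality estimate: (sum lambda)^2 / sum lambda^2 over
    the eigenvalues lambda of the neuron-by-neuron covariance matrix."""
    X = np.asarray(responses, dtype=float)
    X = X - X.mean(axis=0)                 # center across stimuli
    cov = X.T @ X / (X.shape[0] - 1)       # neurons x neurons covariance
    lam = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(0)
# 500 "stimuli" x 200 "neurons" whose responses share 5 latent features
latents = rng.normal(size=(500, 5))
weights = rng.normal(size=(5, 200))
responses = latents @ weights + 0.01 * rng.normal(size=(500, 200))

pr = participation_ratio(responses)
print(pr)  # close to the 5 latent dimensions, far below the 200 neurons
```

As in the abstract, the estimate tracks the number of independent features driving the population, not the number of recorded cells.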
Collapse
Affiliation(s)
- Sidney R Lehky
- Cognitive Brain Mapping Laboratory, RIKEN Brain Science Institute, Wako, Saitama, Japan, and Computational Neurobiology Laboratory, Salk Institute, La Jolla, CA 92037, U.S.A.
| | | | | | | |
Collapse
|
20
|
Elliott T. Sparseness, antisparseness and anything in between: the operating point of a neuron determines its computational repertoire. Neural Comput 2014; 26:1924-72. [PMID: 24922502 DOI: 10.1162/neco_a_00630] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
A recent model of intrinsic plasticity coupled to Hebbian synaptic plasticity proposes that adaptation of a neuron's threshold and gain in a sigmoidal response function to achieve a sparse, exponential output firing rate distribution facilitates the discovery of heavy-tailed or supergaussian sources in the neuron's inputs. We show that the exponential output distribution is irrelevant to these dynamics and that, furthermore, while sparseness is sufficient, it is not necessary. The intrinsic plasticity mechanism drives the neuron's threshold to large, positive values, and we prove that in such a regime, the neuron will find supergaussian sources; equally, however, if the threshold is large and negative (an antisparse regime), it will also find supergaussian sources. Away from such extremes, the neuron can also discover subgaussian sources. By examining a neuron with a fixed sigmoidal nonlinearity and considering the synaptic strength fixed-point structure in the two-dimensional parameter space defined by the neuron's threshold and gain, we show that this space is carved up into sub- and supergaussian-input-finding regimes, possibly with regimes of simultaneous stability of sub- and supergaussian sources or regimes of instability of all sources; a single gaussian source may also be stabilized by the presence of a nongaussian source. A neuron's operating point (essentially its threshold and gain coupled with its input statistics) therefore critically determines its computational repertoire. Intrinsic plasticity mechanisms induce trajectories in this parameter space but do not fundamentally modify it. Unless the trajectories cross critical boundaries in this space, intrinsic plasticity is irrelevant and the neuron's nonlinearity may be frozen with identical receptive field refinement dynamics.
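A minimal sketch of the sigmoidal response function at the heart of this operating-point argument (the data and parameter values are made up for illustration): a large positive threshold puts the neuron in a sparse regime, where most inputs evoke near-zero output, while a large negative threshold puts it in an antisparse regime, where most inputs evoke near-maximal output.

```python
import numpy as np

def sigmoid_response(x, threshold, gain):
    # Sigmoidal rate function; threshold and gain define the operating point.
    return 1.0 / (1.0 + np.exp(-gain * (x - threshold)))

rng = np.random.default_rng(1)
drive = rng.normal(size=10_000)  # net synaptic input across many stimuli

sparse_out = sigmoid_response(drive, threshold=3.0, gain=1.0)
antisparse_out = sigmoid_response(drive, threshold=-3.0, gain=1.0)

print(sparse_out.mean())      # low mean: sparse regime
print(antisparse_out.mean())  # high mean: antisparse regime
```

Both regimes are extreme operating points in the threshold-gain plane, consistent with the abstract's claim that either extreme supports discovery of supergaussian sources.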
Collapse
Affiliation(s)
- Terry Elliott
- Department of Electronics and Computer Science, University of Southampton, Highfield, Southampton, SO17 1BJ, U.K.
| |
Collapse
|
21
|
Samonds JM, Potetz BR, Lee TS. Sample skewness as a statistical measurement of neuronal tuning sharpness. Neural Comput 2014; 26:860-906. [PMID: 24555451 DOI: 10.1162/neco_a_00582] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
We propose using the statistical measurement of the sample skewness of the distribution of mean firing rates of a tuning curve to quantify sharpness of tuning. For some features, like binocular disparity, tuning curves are best described by relatively complex and sometimes diverse functions, making it difficult to quantify sharpness with a single function and parameter. Skewness provides a robust nonparametric measure of tuning curve sharpness that is invariant with respect to the mean and variance of the tuning curve and is straightforward to apply to a wide range of tuning, including simple orientation tuning curves and complex object tuning curves that often cannot even be described parametrically. Because skewness does not depend on a specific model or function of tuning, it is especially appealing to cases of sharpening where recurrent interactions among neurons produce sharper tuning curves that deviate in a complex manner from the feedforward function of tuning. Since tuning curves for all neurons are not typically well described by a single parametric function, this model independence additionally allows skewness to be applied to all recorded neurons, maximizing the statistical power of a set of data. We also compare skewness with other nonparametric measures of tuning curve sharpness and selectivity. Compared to the other nonparametric measures tested, skewness is best used for capturing the sharpness of multimodal tuning curves defined by narrow peaks (maxima) and broad valleys (minima). Finally, we provide a more formal definition of sharpness using a shape-based information gain measure and show that skewness is correlated with this definition.
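As a worked example of the proposed measure, sample skewness follows directly from the central moments of a tuning curve's firing rates (the toy rate values below are made up for illustration):

```python
import numpy as np

def tuning_skewness(rates):
    """Sample skewness g1 = m3 / m2^(3/2) of a neuron's mean firing rates.

    Large positive skew: a few stimuli drive strong responses (sharp tuning).
    Skew near zero: broad, symmetric tuning."""
    r = np.asarray(rates, dtype=float)
    m = r.mean()
    m2 = ((r - m) ** 2).mean()   # second central moment
    m3 = ((r - m) ** 3).mean()   # third central moment
    return m3 / m2 ** 1.5

# Sharply tuned: one stimulus evokes a strong response, the rest are weak
sharp = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]
# Broadly tuned: similar responses to every stimulus
broad = [9, 10, 11, 10, 9, 11, 10, 9, 11, 10]

assert tuning_skewness(sharp) > tuning_skewness(broad)
```

Note that, as the abstract emphasizes, the measure needs no parametric fit: it applies unchanged to the multimodal object tuning curves that defeat single-function descriptions.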
Collapse
Affiliation(s)
- Jason M Samonds
- Center for the Neural Basis of Cognition and Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
| | | | | |
Collapse
|
22
|
Abstract
The encoding of sensory information by populations of cortical neurons forms the basis for perception but remains poorly understood. To understand the constraints of cortical population coding, we analyzed neural responses to natural sounds recorded in auditory cortex of primates (Macaca mulatta). We estimated stimulus information while varying the composition and size of the considered population. Consistent with previous reports, we found that when choosing subpopulations randomly from the recorded ensemble, the average population information increases steadily with population size. This scaling was explained by a model assuming that each neuron carried equal amounts of information, and that any overlap between the information carried by each neuron arises purely from random sampling within the stimulus space. However, when studying subpopulations selected to optimize information for each given population size, the scaling of information was strikingly different: a small fraction of temporally precise cells carried the vast majority of information. This scaling could be explained by an extended model, assuming that the amount of information carried by individual neurons was highly nonuniform, with few neurons carrying large amounts of information. Importantly, these optimal populations can be determined by a single biophysical marker, the neuron's encoding time scale, allowing their detection and readout within biologically realistic circuits. These results show that extrapolations of population information based on random ensembles may overestimate the population size required for stimulus encoding, and that sensory cortical circuits may process information using small but highly informative ensembles.
Collapse
|
23
|
Fitzgerald JK, Freedman DJ, Fanini A, Bennur S, Gold JI, Assad JA. Biased associative representations in parietal cortex. Neuron 2013; 77:180-91. [PMID: 23312525 DOI: 10.1016/j.neuron.2012.11.014] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/11/2012] [Indexed: 10/27/2022]
Abstract
Neurons in cortical sensory areas respond selectively to sensory stimuli, and the preferred stimulus typically varies among neurons so as to continuously span the sensory space. However, some neurons reflect sensory features that are learned or task dependent. For example, neurons in the lateral intraparietal area (LIP) reflect learned associations between visual stimuli. One might expect that roughly even numbers of LIP neurons would prefer each set of associated stimuli. However, in two associative learning experiments and a perceptual decision experiment, we found striking asymmetries: nearly all neurons recorded from an animal had a similar order of preference among associated stimuli. Behavioral factors could not account for these neuronal biases. A recent computational study proposed that population-firing patterns in parietal cortex have one-dimensional dynamics on long timescales, a possible consequence of recurrent connections that could drive persistent activity. One-dimensional dynamics would predict the biases in selectivity that we observed.
Collapse
Affiliation(s)
- Jamie K Fitzgerald
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
| | | | | | | | | | | |
Collapse
|
24
|
Early dynamics of the semantic priming shift. Adv Cogn Psychol 2013; 9:1-14. [PMID: 23717346 PMCID: PMC3664541 DOI: 10.2478/v10053-008-0126-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2012] [Accepted: 11/12/2012] [Indexed: 11/21/2022] Open
Abstract
Semantic processing of sequences of words requires the cognitive system to keep several word meanings simultaneously activated in working memory with limited capacity. The real-time updating of the sequence of word meanings relies on dynamic changes in the associates to the words that are activated. Protocols involving two sequential primes report a semantic priming shift from larger priming of associates to the first prime to larger priming of associates to the second prime, in a range of long SOAs (stimulus-onset asynchronies) between the second prime and the target. However, the possibility of an early semantic priming shift is still to be tested, and its dynamics as a function of association strength remain unknown. Three multiple priming experiments are proposed that cross-manipulate association strength between each of two successive primes and a target, for different values of short SOAs and prime durations. Results show an early priming shift ranging from priming of associates to the first prime only to priming of strong associates to the first prime and all of the associates to the second prime. We investigated the neural basis of the early priming shift by using a network model of spike-frequency-adaptive cortical neurons (e.g., Deco & Rolls, 2005), able to code different association strengths between the primes and the target. The cortical network model provides a description of the early dynamics of the priming shift in terms of pro-active and retro-active interferences within populations of excitatory neurons regulated by fast and unselective inhibitory feedback.
Collapse
|
25
|
Balanced increases in selectivity and tolerance produce constant sparseness along the ventral visual stream. J Neurosci 2012; 32:10170-82. [PMID: 22836252 DOI: 10.1523/jneurosci.6125-11.2012] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Although popular accounts suggest that neurons along the ventral visual processing stream become increasingly selective for particular objects, this appears at odds with the fact that inferior temporal cortical (IT) neurons are broadly tuned. To explore this apparent contradiction, we compared processing in two ventral stream stages (visual cortical areas V4 and IT) in the rhesus macaque monkey. We confirmed that IT neurons are indeed more selective for conjunctions of visual features than V4 neurons and that this increase in feature conjunction selectivity is accompanied by an increase in tolerance ("invariance") to identity-preserving transformations (e.g., shifting, scaling) of those features. We report here that V4 and IT neurons are, on average, tightly matched in their tuning breadth for natural images ("sparseness") and that the average V4 or IT neuron will produce a robust firing rate response (>50% of its peak observed firing rate) to ∼10% of all natural images. We also observed that sparseness was positively correlated with conjunction selectivity and negatively correlated with tolerance within both V4 and IT, consistent with selectivity-building and invariance-building computations that offset one another to produce sparseness. Our results imply that the conjunction-selectivity-building and invariance-building computations necessary to support object recognition are implemented in a balanced manner to maintain sparseness at each stage of processing.
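The abstract's finding that a typical V4 or IT neuron responds robustly to ~10% of natural images maps directly onto a standard lifetime sparseness metric (Treves-Rolls, as normalized by Vinje & Gallant). The sketch below is that common formulation with toy response vectors, not necessarily the paper's exact computation:

```python
import numpy as np

def lifetime_sparseness(rates):
    """Treves-Rolls sparseness, normalized to [0, 1]:
    0 = dense (equal responses to all images), 1 = maximally sparse."""
    r = np.asarray(rates, dtype=float)
    n = r.size
    a = r.mean() ** 2 / (r ** 2).mean()   # Treves-Rolls activity ratio
    return (1 - a) / (1 - 1 / n)

dense = np.ones(100)                       # equal response to every image
sparse = np.zeros(100)
sparse[:10] = 1.0                          # robust response to ~10% of images

print(lifetime_sparseness(dense))          # → 0.0
print(lifetime_sparseness(sparse))         # ≈ 0.91
```

Because the measure is computed per neuron over an image set, it can be compared directly across areas (here, V4 versus IT), which is how a constant average sparseness across stages can coexist with rising conjunction selectivity and tolerance.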
Collapse
|