1
Sadeghnejad N, Ezoji M, Ebrahimpour R, Qodosi M, Zabbah S. A fully spiking coupled model of a deep neural network and a recurrent attractor explains dynamics of decision making in an object recognition task. J Neural Eng 2024; 21:026011. PMID: 38506115. DOI: 10.1088/1741-2552/ad2d30.
Abstract
Objective. Object recognition, and making a choice about the recognized object, is pivotal for most animals. In the brain, this process comprises an information-representation step and a decision-making step, each of which takes a different amount of time for different objects. While the dynamics of object recognition and decision making are usually ignored in object recognition models, here we propose a fully spiking hierarchical model explaining the process of object recognition from information representation to decision making. Approach. By coupling a deep neural network with a recurrent attractor-based decision-making model, and using spike-timing-dependent plasticity learning rules in several convolutional and pooling layers, we propose a model that resembles brain behavior during an object recognition task. We also measured human choices and reaction times in a psychophysical object recognition task and used them as a reference to evaluate the model. Main results. The proposed model explains not only the probability of making a correct decision but also the time it takes to make a decision. Importantly, neural firing rates at both the feature-representation and decision-making levels mimic the patterns observed in animal studies: the number of spikes (p < 10^-173) and the time of the peak response (p < 10^-31) are significantly modulated by the strength of the stimulus. Moreover, the speed-accuracy trade-off, a well-known characteristic of the decision-making process in the brain, is also observed in the model: changing the decision bound significantly affects both reaction time (p < 10^-59) and accuracy (p < 10^-165). Significance. We propose a fully spiking deep neural network that explains the dynamics of deciding about an object at both the neural and behavioral levels. Results showed a strong and significant correlation (r = 0.57) between the reaction times of the model and of human participants in the psychophysical object recognition task.
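The spike-timing-dependent plasticity rule used in the model's convolutional layers can be illustrated with a minimal pair-based sketch; the amplitudes and time constant below are generic textbook values, not the parameters of this paper's model.

```python
import math

def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Pair-based STDP weight change for one pre/post spike pair.

    dt_ms = t_post - t_pre: positive (pre fires before post) potentiates,
    negative (post before pre) depresses, both with exponentially
    decaying magnitude in the spike-time difference.
    """
    if dt_ms >= 0:
        return a_plus * math.exp(-dt_ms / tau_ms)
    return -a_minus * math.exp(dt_ms / tau_ms)

def update_weight(w, dt_ms, w_min=0.0, w_max=1.0):
    """Apply one STDP increment and clip the weight to its allowed range."""
    return min(w_max, max(w_min, w + stdp_dw(dt_ms)))
```

Causal pairs (dt_ms > 0) strengthen the synapse and anti-causal pairs weaken it, which is what lets such layers learn visual features without labels.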
Affiliation(s)
- Naser Sadeghnejad
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Mehdi Ezoji
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Reza Ebrahimpour
- Center for Cognitive Science, Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran, Iran
- School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Mohamad Qodosi
- Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
- Sajjad Zabbah
- School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Max Planck UCL Centre for Computational Psychiatry and Aging Research, University College London, London, United Kingdom
2
Sadeghnejad N, Ezoji M, Ebrahimpour R, Zabbah S. Resolving the neural mechanism of core object recognition in space and time: A computational approach. Neurosci Res 2023; 190:36-50. PMID: 36502958. DOI: 10.1016/j.neures.2022.12.002.
Abstract
The underlying mechanism of object recognition, a fundamental brain ability, has been investigated in various studies. However, the balance between the speed and accuracy of recognition is less explored. Most computational models of object recognition cannot explain recognition time and thus focus only on recognition accuracy, for two reasons: the lack of a temporal representation mechanism for sensory processing, and the use of non-biological classifiers for decision making. Here, we propose a hierarchical temporal model of object recognition that couples a spiking deep neural network to a biologically plausible decision-making model, explaining both recognition time and accuracy. We show that the response dynamics of the proposed model resemble those of the brain. First, in an object recognition task, the model mimics humans' and monkeys' recognition times as well as their accuracy. Second, the model replicates the different speed-accuracy trade-off regimes observed in the literature. More importantly, we demonstrate that the temporal representation of different abstraction levels (superordinate, midlevel, and subordinate) in the proposed model matches the brain's representational dynamics observed in previous studies. We conclude that the accumulation of spikes generated by a hierarchical feedforward spiking structure, until a bound is reached, can explain not only the dynamics of making a decision but also the representational dynamics for different abstraction levels.
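The core mechanism described here, accumulation of spikes to a bound, can be sketched as a race between two Poisson spike counts; the firing rates and bounds below are arbitrary illustrative numbers, not fitted parameters of the model.

```python
import random

def decide(rate_pref, rate_null, bound, dt=0.001, max_t=5.0, rng=random):
    """Accumulate the spike-count difference between a preferred and a null
    Poisson population (rates in Hz, time step dt in seconds) until it
    hits +/-bound. Returns (correct_choice, reaction_time_in_seconds)."""
    diff, t = 0, 0.0
    while t < max_t:
        # One Bernoulli draw per population per time step approximates Poisson spiking.
        diff += (rng.random() < rate_pref * dt) - (rng.random() < rate_null * dt)
        t += dt
        if abs(diff) >= bound:
            return diff > 0, t
    return False, max_t  # no decision before the deadline
```

Raising the bound reproduces the speed-accuracy trade-off: decisions become slower but more reliable, as in the experiments the model is compared against.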
Affiliation(s)
- Naser Sadeghnejad
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Mehdi Ezoji
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Reza Ebrahimpour
- Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran, Iran
- Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
- School of Cognitive Sciences (SCS), Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Sajjad Zabbah
- School of Cognitive Sciences (SCS), Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Max Planck UCL Centre for Computational Psychiatry and Aging Research, University College London, London, UK
3
Lukanov H, König P, Pipa G. Biologically Inspired Deep Learning Model for Efficient Foveal-Peripheral Vision. Front Comput Neurosci 2021; 15:746204. PMID: 34880741. PMCID: PMC8645638. DOI: 10.3389/fncom.2021.746204.
Abstract
While abundant in biology, foveated vision is nearly absent from computational models, and especially from deep learning architectures. Despite considerable hardware improvements, training deep neural networks still presents a challenge and constrains the complexity of models. Here we propose an end-to-end neural model of foveal-peripheral vision, inspired by retino-cortical mapping in primates and humans. Our model uses an efficient sampling technique to compress the visual signal, so that a small portion of the scene is perceived at high resolution while a large field of view is maintained at low resolution. An attention mechanism that performs "eye movements" helps the agent collect detailed information incrementally from the observed scene. Our model achieves results comparable to a similar neural architecture trained on full-resolution data for image classification, and outperforms it on video classification tasks. At the same time, because of its smaller input, it reduces computational effort tenfold and uses several times less memory. Moreover, we present an easy-to-implement bottom-up and top-down attention mechanism that relies on task-relevant features and is therefore a convenient byproduct of the main architecture. Apart from its computational efficiency, the presented work provides a means of exploring active vision for agent training in simulated environments and anthropomorphic robotics.
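The sampling idea, full resolution at the fixation point plus a compressed periphery, can be sketched on a plain 2D grayscale array; the window sizes and the simple block-averaging below are illustrative stand-ins for the paper's retino-cortical mapping.

```python
def foveate(img, fx, fy, fovea=4, stride=4):
    """Split an image (list of rows) into a full-resolution foveal crop
    around fixation (fx, fy) plus a block-averaged low-resolution periphery."""
    h, w = len(img), len(img[0])
    # High-resolution crop of a (2*fovea)-pixel window around the fixation.
    crop = [row[max(0, fx - fovea):min(w, fx + fovea)]
            for row in img[max(0, fy - fovea):min(h, fy + fovea)]]
    # Low-resolution periphery: mean over non-overlapping stride x stride blocks.
    periph = [[sum(img[y + dy][x + dx] for dy in range(stride) for dx in range(stride)) / stride ** 2
               for x in range(0, w - stride + 1, stride)]
              for y in range(0, h - stride + 1, stride)]
    return crop, periph
```

For a 16x16 input this yields an 8x8 crop plus a 4x4 periphery, 80 values instead of 256; shrinking the input in this way is how such a model cuts its computational cost.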
Affiliation(s)
- Hristofor Lukanov
- Department of Neuroinformatics, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
- Peter König
- Department of Neurobiopsychology, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
- Department of Neurophysiology and Pathophysiology, Center of Experimental Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Gordon Pipa
- Department of Neuroinformatics, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
5
Krasovskaya S, MacInnes WJ. Salience Models: A Computational Cognitive Neuroscience Review. Vision (Basel) 2019; 3:E56. PMID: 31735857. PMCID: PMC6969943. DOI: 10.3390/vision3040056.
Abstract
The seminal model by Laurent Itti and Christof Koch demonstrated that we can compute the entire flow of visual processing, from input to resulting fixations. Despite many replications and follow-ups, few have matched the impact of the original model, so what made it so groundbreaking? We have selected five key contributions that distinguish the original salience model by Itti and Koch: its contributions to our theoretical, neural, and computational understanding of visual processing, as well as its spatial and temporal predictions for fixation distributions. Over the last 20 years, advances in the field have produced various techniques and approaches to salience modelling, many of which try to improve on or add to the initial Itti and Koch model. One of the most recent trends has been to adopt the computational power of deep learning neural networks; however, this has also shifted the primary focus to spatial classification. We present a review of recent approaches to modelling salience, from direct variations of the Itti and Koch salience model to sophisticated deep-learning architectures, and discuss the models from the point of view of their contribution to computational cognitive neuroscience.
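The core operation of the original Itti and Koch model, center-surround contrast, can be sketched for a single feature channel; the real model uses Gaussian pyramids across several scales and multiple channels, while this toy version just compares two local box means.

```python
def center_surround(img, c=1, s=3):
    """Single-scale center-surround map: |mean of a small (2c+1)-wide window
    minus mean of a larger (2s+1)-wide window| at each pixel."""
    h, w = len(img), len(img[0])

    def box_mean(y, x, r):
        # Mean over a square window of radius r, clipped at the image border.
        vals = [img[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))]
        return sum(vals) / len(vals)

    return [[abs(box_mean(y, x, c) - box_mean(y, x, s)) for x in range(w)]
            for y in range(h)]
```

A uniform image produces a flat zero map, while an isolated bright pixel pops out, which is the behavior the model's conspicuity maps are built on.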
Affiliation(s)
- Sofia Krasovskaya
- Vision Modelling Laboratory, Faculty of Social Science, National Research University Higher School of Economics, 101000 Moscow, Russia
- School of Psychology, National Research University Higher School of Economics, 101000 Moscow, Russia
- W. Joseph MacInnes
- Vision Modelling Laboratory, Faculty of Social Science, National Research University Higher School of Economics, 101000 Moscow, Russia
- School of Psychology, National Research University Higher School of Economics, 101000 Moscow, Russia
7
MacInnes WJ, Hunt AR, Clarke ADF, Dodd MD. A Generative Model of Cognitive State from Task and Eye Movements. Cognit Comput 2018; 10:703-717. PMID: 30740186. DOI: 10.1007/s12559-018-9558-9.
Abstract
The early eye-tracking studies of Yarbus provided descriptive evidence that an observer's task influences patterns of eye movements, raising the tantalizing prospect that an observer's intentions could be inferred from their saccade behavior. We investigate the predictive value of task and eye-movement properties by creating a computational cognitive model of saccade selection, based on instructed task and internal cognitive state, using a Dynamic Bayesian Network (DBN). Understanding how humans generate saccades under different conditions and cognitive sets links recent work on salience models of low-level vision with higher-level cognitive goals. The model provides a Bayesian, cognitive approach to top-down transitions in attentional set in prefrontal areas, along with vector-based saccade generation from the superior colliculus. Our approach begins with eye-movement data that have previously been shown to differ across tasks. We first present an analysis of the extent to which individual saccadic features are diagnostic of an observer's task. Second, we use those features to infer an underlying cognitive state that potentially differs from the instructed task. Finally, we demonstrate how changes of cognitive state over time can be incorporated into a generative model of eye-movement vectors without resorting to an external decision homunculus. Internal cognitive state frees the model from the assumption that instructed task is the only factor influencing observers' saccadic behavior. While the inclusion of hidden temporal state does not improve the classification accuracy of the model, it does allow accurate prediction of the saccadic-sequence results observed in search paradigms. Given its generative nature, the model is capable of saccadic simulation in real time. We demonstrate that the properties of its generated saccadic vectors closely match those of human observers given a particular task and cognitive state.
Many current models of vision focus entirely on bottom-up salience to produce estimates of spatial "areas of interest" within a visual scene. While a few recent models do add top-down knowledge and task information, we believe our contribution is important in three key ways. First, we incorporate task as learned attentional sets that are capable of self-transition given only information available to the visual system. This matches influential theories of bias signals (Miller and Cohen, Annu Rev Neurosci 24:167-202, 2001) and implements selection of state without simply shifting the decision to an external homunculus. Second, our model is generative and capable of predicting sequence artifacts in saccade generation, like those found in visual search. Third, our model generates relative saccadic-vector information as opposed to absolute spatial coordinates, which more closely matches the internal saccadic representations as they are generated in the superior colliculus.
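The hidden-state inference side of such a model can be sketched with the plain HMM forward recursion over discretised saccade features; the two states, the feature bins, and all probabilities below are invented for illustration and are not the paper's DBN structure or fitted parameters.

```python
def forward(obs, states, start, trans, emit):
    """Normalised HMM forward pass: returns the posterior over the hidden
    cognitive state after observing the whole sequence `obs`."""
    belief = {s: start[s] * emit[s][obs[0]] for s in states}
    z = sum(belief.values())
    belief = {s: v / z for s, v in belief.items()}
    for o in obs[1:]:
        # Predict (transition) then correct (emission), renormalising each step.
        belief = {s: emit[s][o] * sum(belief[r] * trans[r][s] for r in states)
                  for s in states}
        z = sum(belief.values())
        belief = {s: v / z for s, v in belief.items()}
    return belief

# Toy setup: a "search" state emits mostly long saccades, a "memorise"
# state mostly short ones; both states tend to persist (self-transition 0.9).
STATES = ("search", "memorise")
START = {"search": 0.5, "memorise": 0.5}
TRANS = {"search": {"search": 0.9, "memorise": 0.1},
         "memorise": {"search": 0.1, "memorise": 0.9}}
EMIT = {"search": {"long": 0.8, "short": 0.2},
        "memorise": {"long": 0.3, "short": 0.7}}
```

A run of long-amplitude saccades quickly pulls the posterior toward the search state, which is the sense in which observed saccade features can reveal a cognitive state that differs from the instructed task.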
Affiliation(s)
- W Joseph MacInnes
- School of Psychology, National Research University Higher School of Economics, Moscow, Russian Federation
- Amelia R Hunt
- School of Psychology, University of Aberdeen, Aberdeen, UK
8
Discriminative Deep Belief Network for Indoor Environment Classification Using Global Visual Features. Cognit Comput 2018. DOI: 10.1007/s12559-017-9534-9.