1. Grossberg S. How children learn to understand language meanings: a neural model of adult-child multimodal interactions in real-time. Front Psychol 2023; 14:1216479. PMID: 37599779; PMCID: PMC10435915; DOI: 10.3389/fpsyg.2023.1216479.
Abstract
This article describes a biological neural network model that can be used to explain how children learn to understand language meanings about the perceptual and affective events that they consciously experience. This kind of learning often occurs when a child interacts with an adult teacher to learn language meanings about events that they experience together. Multiple types of self-organizing brain processes are involved in learning language meanings, including processes that control conscious visual perception, joint attention, object learning and conscious recognition, cognitive working memory, cognitive planning, emotion, cognitive-emotional interactions, volition, and goal-oriented actions. The article shows how all of these brain processes interact to enable the learning of language meanings to occur. The article also contrasts these human capabilities with AI models such as ChatGPT. The current model is called the ChatSOME model, where SOME abbreviates Self-Organizing MEaning.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Boston University, Boston, MA, United States
2. Grossberg S. The resonant brain: How attentive conscious seeing regulates action sequences that interact with attentive cognitive learning, recognition, and prediction. Atten Percept Psychophys 2019; 81:2237-2264. PMID: 31218601; PMCID: PMC6848053; DOI: 10.3758/s13414-019-01789-2.
Abstract
This article describes mechanistic links that exist in advanced brains between processes that regulate conscious attention, seeing, and knowing, and those that regulate looking and reaching. These mechanistic links arise from basic properties of brain design principles such as complementary computing, hierarchical resolution of uncertainty, and adaptive resonance. These principles require conscious states to mark perceptual and cognitive representations that are complete, context sensitive, and stable enough to control effective actions. Surface-shroud resonances support conscious seeing and action, whereas feature-category resonances support learning, recognition, and prediction of invariant object categories. Feedback interactions between cortical areas such as peristriate visual cortical areas V2, V3A, and V4, and the lateral intraparietal area (LIP) and inferior parietal sulcus (IPS) of the posterior parietal cortex (PPC) control sequences of saccadic eye movements that foveate salient features of attended objects and thereby drive invariant object category learning. Learned categories can, in turn, prime the objects and features that are attended and searched. These interactions coordinate processes of spatial and object attention, figure-ground separation, predictive remapping, invariant object category learning, and visual search. They create a foundation for learning to control motor-equivalent arm movement sequences, and for storing these sequences in cognitive working memories that can trigger the learning of cognitive plans with which to read out skilled movement sequences. Cognitive-emotional interactions that are regulated by reinforcement learning can then help to select the plans that control actions most likely to acquire valued goal objects in different situations. Many interdisciplinary psychological and neurobiological data about conscious and unconscious behaviors in normal individuals and clinical patients have been explained in terms of these concepts and mechanisms.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Room 213, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, 677 Beacon Street, Boston, MA, 02215, USA.
3. Grossberg S. The Embodied Brain of SOVEREIGN2: From Space-Variant Conscious Percepts During Visual Search and Navigation to Learning Invariant Object Categories and Cognitive-Emotional Plans for Acquiring Valued Goals. Front Comput Neurosci 2019; 13:36. PMID: 31333437; PMCID: PMC6620614; DOI: 10.3389/fncom.2019.00036.
Abstract
This article develops a model of how reactive and planned behaviors interact in real time. Controllers for both animals and animats need reactive mechanisms for exploration, and learned plans to efficiently reach goal objects once an environment becomes familiar. The SOVEREIGN model embodied these capabilities and was tested in a 3D virtual reality environment. Neural models have since characterized important adaptive and intelligent processes that were not included in SOVEREIGN. This article summarizes a major research program that consistently incorporates them into an enhanced model called SOVEREIGN2. Key new perceptual, cognitive, cognitive-emotional, and navigational processes require feedback networks that regulate resonant brain states supporting conscious experiences of seeing, feeling, and knowing. Also included are computationally complementary processes of the mammalian neocortical What and Where processing streams, and homologous mechanisms for spatial navigation and arm movement control. These include the following: Unpredictably moving targets are tracked using coordinated smooth pursuit and saccadic movements. Estimates of target and present position are computed in the Where stream, and can activate approach movements. Motion cues can elicit orienting movements to bring new targets into view. Cumulative movement estimates are derived from visual and vestibular cues. Arbitrary navigational routes are incrementally learned as a labeled graph of angles turned and distances traveled between turns. Noisy and incomplete visual sensor data are transformed into representations of visual form and motion. Invariant recognition categories are learned in the What stream. Sequences of invariant object categories are stored in a cognitive working memory, whereas sequences of movement positions and directions are stored in a spatial working memory. Stored sequences trigger learning of cognitive and spatial/motor sequence categories, or plans, also called list chunks, which control planned decisions and movements toward valued goal objects. Predictively successful list chunk combinations are selectively enhanced or suppressed via reinforcement learning and incentive motivational learning. Expected vs. unexpected event disconfirmations regulate these enhancement and suppression processes. Adaptively timed learning enables attention and action to match task constraints. Social cognitive joint attention enables imitation learning of skills by learners who observe teachers from different spatial vantage points.
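The route-learning scheme described in this abstract, in which navigational routes are incrementally stored as a labeled graph of angles turned and distances traveled between turns, can be sketched in a few lines. This is an illustrative simplification, not SOVEREIGN2's actual circuitry; the class and method names are invented for the example.

```python
import math

class RouteGraph:
    """Toy route memory: a chain of edges, each labeled with the angle
    turned (radians) and the distance traveled before the next turn.
    (Names and representation are assumptions of this sketch.)"""

    def __init__(self):
        self.segments = []  # list of (angle_turned, distance) pairs

    def add_segment(self, angle_turned, distance):
        """Incrementally append one leg of the route."""
        self.segments.append((angle_turned, distance))

    def dead_reckon(self):
        """Integrate the stored turns and distances into a final (x, y)
        position estimate, starting at the origin facing along +x."""
        x = y = 0.0
        heading = 0.0
        for angle, dist in self.segments:
            heading += angle
            x += dist * math.cos(heading)
            y += dist * math.sin(heading)
        return x, y

route = RouteGraph()
route.add_segment(0.0, 3.0)          # go 3 units straight ahead
route.add_segment(math.pi / 2, 4.0)  # turn left 90 degrees, go 4 units
print(route.dead_reckon())           # approximately (3.0, 4.0)
```

Storing the route as turn/distance labels rather than absolute coordinates keeps the memory usable from any starting pose, which is the point of the labeled-graph representation.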
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, Boston, MA, United States
4. Grossberg S, Kishnan D. Neural Dynamics of Autistic Repetitive Behaviors and Fragile X Syndrome: Basal Ganglia Movement Gating and mGluR-Modulated Adaptively Timed Learning. Front Psychol 2018; 9:269. PMID: 29593596; PMCID: PMC5859312; DOI: 10.3389/fpsyg.2018.00269.
Abstract
This article develops the iSTART neural model that proposes how specific imbalances in cognitive, emotional, timing, and motor processes that involve brain regions like prefrontal cortex, temporal cortex, amygdala, hypothalamus, hippocampus, and cerebellum may interact together to cause behavioral symptoms of autism. These imbalances include underaroused emotional depression in the amygdala/hypothalamus, learning of hyperspecific recognition categories that help to cause narrowly focused attention in temporal and prefrontal cortices, and breakdowns of adaptively timed motivated attention and motor circuits in the hippocampus and cerebellum. The article expands the model's explanatory range by, first, explaining recent data about Fragile X syndrome (FXS), mGluR, and trace conditioning; and, second, by explaining distinct causes of stereotyped behaviors in individuals with autism. Some of these stereotyped behaviors, such as an insistence on sameness and circumscribed interests, may result from imbalances in the cognitive and emotional circuits that iSTART models. These behaviors may be ameliorated by operant conditioning methods. Other stereotyped behaviors, such as repetitive motor behaviors, may result from imbalances in how the direct and indirect pathways of the basal ganglia open or close movement gates, respectively. These repetitive behaviors may be ameliorated by drugs that augment D2 dopamine receptor responses or reduce D1 dopamine receptor responses. The article also notes the ubiquitous role of gating by basal ganglia loops in regulating all the functions that iSTART models.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, Boston, MA, United States
- Devika Kishnan
- Department of Biomedical Engineering, Boston University, Boston, MA, United States
5. Grossberg S. Desirability, availability, credit assignment, category learning, and attention: Cognitive-emotional and working memory dynamics of orbitofrontal, ventrolateral, and dorsolateral prefrontal cortices. Brain Neurosci Adv 2018; 2:2398212818772179. PMID: 32166139; PMCID: PMC7058233; DOI: 10.1177/2398212818772179.
Abstract
Background: The prefrontal cortices play an essential role in cognitive-emotional and working memory processes through interactions with multiple brain regions.
Methods: This article further develops a unified neural architecture that explains many recent and classical data about prefrontal function and makes testable predictions.
Results: Prefrontal properties of desirability, availability, credit assignment, category learning, and feature-based attention are explained. These properties arise through interactions of orbitofrontal, ventrolateral prefrontal, and dorsolateral prefrontal cortices with the inferotemporal cortex, perirhinal cortex, parahippocampal cortices, ventral bank of the principal sulcus, ventral prearcuate gyrus, frontal eye fields, hippocampus, amygdala, basal ganglia, hypothalamus, and visual cortical areas V1, V2, V3A, V4, middle temporal cortex, medial superior temporal area, lateral intraparietal cortex, and posterior parietal cortex. Model explanations also include how the value of visual objects and events is computed, which objects and events cause desired consequences and which may be ignored as predictively irrelevant, and how to plan and act to realise these consequences, including how to selectively filter expected versus unexpected events, leading to movements towards, and conscious perception of, expected events. Modelled processes include reinforcement learning and incentive motivational learning; object and spatial working memory dynamics; and category learning, including the learning of object categories, value categories, object-value categories, and sequence categories, or list chunks.
Conclusion: This article hereby proposes a unified neural theory of prefrontal cortex and its functions.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, Biomedical Engineering, Boston University, Boston, MA, USA
6. Grossberg S. Towards solving the hard problem of consciousness: The varieties of brain resonances and the conscious experiences that they support. Neural Netw 2016; 87:38-95. PMID: 28088645; DOI: 10.1016/j.neunet.2016.11.003.
Abstract
The hard problem of consciousness is the problem of explaining how we experience qualia or phenomenal experiences, such as seeing, hearing, and feeling, and knowing what they are. To solve this problem, a theory of consciousness needs to link brain to mind by modeling how emergent properties of several brain mechanisms interacting together embody detailed properties of individual conscious psychological experiences. This article summarizes evidence that Adaptive Resonance Theory, or ART, accomplishes this goal. ART is a cognitive and neural theory of how advanced brains autonomously learn to attend, recognize, and predict objects and events in a changing world. ART has predicted that "all conscious states are resonant states" as part of its specification of mechanistic links between processes of consciousness, learning, expectation, attention, resonance, and synchrony. It hereby provides functional and mechanistic explanations of data ranging from individual spikes and their synchronization to the dynamics of conscious perceptual, cognitive, and cognitive-emotional experiences. ART has reached sufficient maturity to begin classifying the brain resonances that support conscious experiences of seeing, hearing, feeling, and knowing. Psychological and neurobiological data in both normal individuals and clinical patients are clarified by this classification. This analysis also explains why not all resonances become conscious, and why not all brain dynamics are resonant. The global organization of the brain into computationally complementary cortical processing streams (complementary computing), and the organization of the cerebral cortex into characteristic layers of cells (laminar computing), figure prominently in these explanations of conscious and unconscious processes. Alternative models of consciousness are also discussed.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA; Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, 677 Beacon Street, Boston, MA 02215, USA.
7. Neural Dynamics of the Basal Ganglia During Perceptual, Cognitive, and Motor Learning and Gating. Innovations in Cognitive Neuroscience 2016. DOI: 10.1007/978-3-319-42743-0_19.
9. From brain synapses to systems for learning and memory: Object recognition, spatial navigation, timed conditioning, and movement control. Brain Res 2014; 1621:270-93. PMID: 25446436; DOI: 10.1016/j.brainres.2014.11.018.
Abstract
This article provides an overview of neural models of synaptic learning and memory whose expression in adaptive behavior depends critically on the circuits and systems in which the synapses are embedded. It reviews Adaptive Resonance Theory, or ART, models that use excitatory matching and match-based learning to achieve fast category learning and whose learned memories are dynamically stabilized by top-down expectations, attentional focusing, and memory search. ART clarifies mechanistic relationships between consciousness, learning, expectation, attention, resonance, and synchrony. ART models are embedded in ARTSCAN architectures that unify processes of invariant object category learning, recognition, spatial and object attention, predictive remapping, and eye movement search, and that clarify how conscious object vision and recognition may fail during perceptual crowding and parietal neglect. The generality of learned categories depends upon a vigilance process that is regulated by acetylcholine via the nucleus basalis. Vigilance can get stuck at too high or too low values, thereby causing learning problems in autism and medial temporal amnesia. Similar synaptic learning laws support qualitatively different behaviors: Invariant object category learning in the inferotemporal cortex; learning of grid cells and place cells in the entorhinal and hippocampal cortices during spatial navigation; and learning of time cells in the entorhinal-hippocampal system during adaptively timed conditioning, including trace conditioning. Spatial and temporal processes through the medial and lateral entorhinal-hippocampal system seem to be carried out with homologous circuit designs. Variations of a shared laminar neocortical circuit design have modeled 3D vision, speech perception, and cognitive working memory and learning. A complementary kind of inhibitory matching and mismatch learning controls movement. This article is part of a Special Issue entitled SI: Brain and Memory.
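The match-based, vigilance-regulated category learning summarized above can be illustrated with a minimal ART-1-style sketch. This is a toy simplification of the full theory, not Grossberg's model equations; the function name and the binary-vector coding are assumptions of the example. High vigilance yields many hyperspecific categories, low vigilance yields fewer general ones, which is the dependence the abstract describes.

```python
import numpy as np

def art_match(input_vec, prototypes, vigilance=0.7):
    """Toy ART-1-style search and match (an assumed simplification).
    A stored category is accepted only if the match ratio
    |input AND prototype| / |input| meets the vigilance criterion;
    a mismatch with every stored category creates a new one.
    Accepted prototypes learn by shrinking toward their intersection
    with the input (match-based learning), which keeps old memories
    stable rather than overwriting them."""
    for j, proto in enumerate(prototypes):
        overlap = np.logical_and(input_vec, proto).sum()
        if overlap / input_vec.sum() >= vigilance:
            prototypes[j] = np.logical_and(input_vec, proto).astype(int)
            return j
    prototypes.append(np.asarray(input_vec).copy())  # new category
    return len(prototypes) - 1

protos = []
art_match(np.array([1, 1, 1, 0]), protos)  # first input: creates category 0
art_match(np.array([1, 1, 0, 0]), protos)  # matches and refines category 0
```

Raising `vigilance` toward 1 would force the second input into its own category, mimicking the hyperspecific learning that the article links to autism and amnesia when vigilance gets stuck too high.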
10. Ghodrati M, Rajaei K, Ebrahimpour R. The importance of visual features in generic vs. specialized object recognition: a computational study. Front Comput Neurosci 2014; 8:78. PMID: 25202259; PMCID: PMC4141282; DOI: 10.3389/fncom.2014.00078.
Abstract
It is debated whether the representation of objects in inferior temporal (IT) cortex is distributed over the activities of many neurons or confined to restricted islands of neurons responsive to specific sets of objects. Several lines of evidence demonstrate that the fusiform face area (FFA, in humans) processes information related to specialized object recognition (here, within-category object recognition such as face identification). Physiological studies have also discovered several patches in the monkey ventral temporal lobe that are responsible for facial processing. Neuronal recordings from these patches show that neurons are highly selective for face images, whereas no such selectivity is seen in IT for other objects. However, it is also well supported that objects are encoded through distributed patterns of neural activity that are distinctive for each object category. It seems that visual cortex uses different mechanisms for between-category object recognition (e.g., face vs. non-face objects) and within-category object recognition (e.g., two different faces). In this study, we address this question with computational simulations. We use two biologically inspired object recognition models and define two experiments that address these issues. The models have a hierarchical structure of several processing layers that simply simulate visual processing from V1 to aIT. We show, through computational modeling, that the difference between these two mechanisms of recognition can lie in the visual feature extraction mechanism. It is argued that, in order to perform generic and specialized object recognition, visual cortex must separate the mechanisms involved in within-category from between-category object recognition. High performance in within-category object recognition can be guaranteed when class-specific features of intermediate size and complexity are extracted. However, generic object recognition requires a distributed universal dictionary of visual features in which feature size does not differ significantly.
Affiliation(s)
- Masoud Ghodrati
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Department of Physiology, Monash University, Melbourne, VIC, Australia
- Karim Rajaei
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Reza Ebrahimpour
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
11. Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Netw 2013; 37:1-47. PMID: 23149242; DOI: 10.1016/j.neunet.2012.09.017.
12. Fechler K, von der Emde G. Figure-ground separation during active electrolocation in the weakly electric fish, Gnathonemus petersii. J Physiol Paris 2012; 107:72-83. PMID: 22504389; DOI: 10.1016/j.jphysparis.2012.03.002.
Abstract
The weakly electric fish Gnathonemus petersii uses active electrolocation to detect and discriminate between objects in its environment. Objects are recognised by analysing the electric images that they project onto the fish's skin. In this study, we determined whether different types of large backgrounds interfere with the fishes' ability to discriminate between objects. Fish were trained in a food-rewarded two-alternative forced-choice procedure to discriminate between two objects. In subsequent tests, structured and non-structured as well as stationary and moving backgrounds were positioned behind the objects, and discrimination performance between objects was measured at different object distances. To define the electrosensory stimuli during the tests, the electric images of the objects and backgrounds used were measured. Without a background, G. petersii was able to discriminate between objects up to distances of about 3-4 cm. Even though the electric images of background and object superimposed in a complex way, the addition of stationary structured or plain backgrounds had only minor effects on the range of object discrimination. However, two types of moving backgrounds improved electrolocation by extending the range of object discrimination up to a distance of almost 5 cm. This suggests that movements in the environment play an important role in object identification and improve figure-ground separation during active electrolocation.
Affiliation(s)
- Katharina Fechler
- University of Bonn, Institute of Zoology, Department of Neuroethology/Sensory Ecology, Endenicher Allee 11-13, 53115 Bonn, Germany.
- Gerhard von der Emde
- University of Bonn, Institute of Zoology, Department of Neuroethology/Sensory Ecology, Endenicher Allee 11-13, 53115 Bonn, Germany.
13. Foley NC, Grossberg S, Mingolla E. Neural dynamics of object-based multifocal visual spatial attention and priming: object cueing, useful-field-of-view, and crowding. Cogn Psychol 2012; 65:77-117. PMID: 22425615; DOI: 10.1016/j.cogpsych.2012.02.001.
Abstract
How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how "attentional shrouds" are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia. A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects.
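The competition among attentional shrouds for dominance can be caricatured by a small recurrent winner-take-all network. A minimal sketch, assuming divisive (shunting) inhibition pooled from competitors; the update rule and parameters are invented for illustration and are not the paper's model equations.

```python
import numpy as np

def compete(activities, steps=60, inhibition=0.5):
    """Toy recurrent competition among 'shroud' activities (an assumed
    caricature). Each unit is excited by its own activity and divisively
    inhibited by the pooled activity of the others; iterating the
    dynamics lets the strongest unit suppress its competitors."""
    a = np.array(activities, dtype=float)
    for _ in range(steps):
        others = a.sum() - a                              # pooled competitor activity
        a = a * (1.0 + a) / (1.0 + a + inhibition * others)
        a = a / a.max()                                   # keep activities bounded
    return a

# The initially strongest shroud wins; the others are suppressed toward zero.
result = compete([1.0, 0.8, 0.3])
```

Scaling the `inhibition` parameter up or down plays the role the abstract assigns to basal-ganglia-mediated volition: stronger competition narrows attention to one winner more decisively.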
Affiliation(s)
- Nicholas C Foley
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA
14. Crouzet SM, Serre T. What are the Visual Features Underlying Rapid Object Recognition? Front Psychol 2011; 2:326. PMID: 22110461; PMCID: PMC3216029; DOI: 10.3389/fpsyg.2011.00326.
Abstract
Research progress in machine vision has been very significant in recent years. Robust face detection and identification algorithms are already readily available to consumers, and modern computer vision algorithms for generic object recognition are now coping with the richness and complexity of natural visual scenes. Unlike early vision models of object recognition that emphasized the role of figure-ground segmentation and spatial information between parts, recent successful approaches are based on the computation of loose collections of image features without prior segmentation or any explicit encoding of spatial relations. While these models remain simplistic models of visual processing, they suggest that, in principle, bottom-up activation of a loose collection of image features could support the rapid recognition of natural object categories and provide an initial coarse visual representation before more complex visual routines and attentional mechanisms take place. Focusing on biologically plausible computational models of (bottom-up) pre-attentive visual recognition, we review some of the key visual features that have been described in the literature. We discuss the consistency of these feature-based representations with classical theories from visual psychology and test their ability to account for human performance on a rapid object categorization task.
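The "loose collection of image features without prior segmentation or explicit spatial relations" can be illustrated with a minimal bag-of-features sketch. The codebook, patch vectors, and function name below are toy assumptions for the example, not the models the authors review.

```python
import numpy as np

def feature_histogram(patches, codebook):
    """Orderless bag-of-features sketch (illustrative). Each patch is
    assigned to its nearest codebook feature and the image is summarized
    by a normalized histogram of feature counts. All spatial relations
    between patches are discarded."""
    hist = np.zeros(len(codebook))
    for p in patches:
        idx = np.argmin([np.linalg.norm(p - c) for c in codebook])
        hist[idx] += 1
    return hist / hist.sum()

# Toy 2-feature codebook and a 2-patch "image" (values are made up).
codebook = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
patches = [np.array([0.1, 0.0]), np.array([0.9, 1.1])]
# Shuffling the patches leaves the histogram unchanged: layout is lost,
# yet the histogram can still support coarse, rapid categorization.
```

The point of the sketch is the representational claim, not the classifier: because the histogram is invariant to patch order and position, any recognition it supports is achieved without figure-ground segmentation or part relations.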
Affiliation(s)
- Sébastien M Crouzet
- Cognitive, Linguistic, and Psychological Sciences Department, Institute for Brain Sciences, Brown University, Providence, RI, USA
15. Grossberg S, Srinivasan K, Yazdanbakhsh A. On the road to invariant object recognition: how cortical area V2 transforms absolute to relative disparity during 3D vision. Neural Netw 2011; 24:686-92. PMID: 21507610; DOI: 10.1016/j.neunet.2011.03.021.
Abstract
Invariant recognition of objects depends on a hierarchy of cortical stages that build invariance gradually. Binocular disparity computations are a key part of this transformation. Cortical area V1 computes absolute disparity, which is the horizontal difference in retinal location of an image in the left and right foveas. Many cells in cortical area V2 compute relative disparity, which is the difference in absolute disparity of two visible features. Relative, but not absolute, disparity is invariant under both a disparity change across a scene and vergence eye movements. A neural network model is introduced which predicts that shunting lateral inhibition of disparity-sensitive layer 4 cells in V2 causes a peak shift in cell responses that transforms absolute disparity from V1 into relative disparity in V2. This inhibitory circuit has previously been implicated in contrast gain control, divisive normalization, selection of perceptual groupings, and attentional focusing. The model hereby links relative disparity to other visual functions and thereby suggests new ways to test its mechanistic basis. Other brain circuits are reviewed wherein lateral inhibition causes a peak shift that influences behavioral responses.
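The predicted mechanism, a peak shift produced by shunting (divisive) lateral inhibition, can be illustrated numerically. The tuning widths, inhibition strength, and disparity values below are arbitrary assumptions; the sketch only shows that divisive inhibition pooled around a second feature shifts a population's response peak away from its purely excitatory (absolute-disparity) peak.

```python
import numpy as np

# Candidate disparities and the cells' excitatory tuning, peaked at an
# absolute disparity of 0.5 (all numbers are arbitrary illustrations).
disparities = np.linspace(-2.0, 2.0, 81)
excite = np.exp(-(disparities - 0.5) ** 2 / 0.1)

# Inhibition pooled from a second visible feature at disparity -0.5:
# strongest for cells tuned near that feature, weaker with distance.
inhib = 2.0 * np.exp(-(disparities + 0.5) ** 2 / 1.0)

# Shunting (divisive) lateral inhibition: excitatory drive divided by a
# term that grows with the pooled inhibition.
response = excite / (1.0 + inhib)

peak = disparities[np.argmax(response)]
# The response peak shifts away from the inhibitory source (peak > 0.5),
# so the population now signals disparity relative to the second feature.
```

The asymmetry is what matters: inhibition falls off with distance from the reference feature, so the divisive denominator is larger on one flank of the excitatory peak than the other, dragging the maximum toward the less-inhibited side.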
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA.