1
Ahn S, Adeli H, Zelinsky GJ. The attentive reconstruction of objects facilitates robust object recognition. PLoS Comput Biol 2024; 20:e1012159. PMID: 38870125; PMCID: PMC11175536; DOI: 10.1371/journal.pcbi.1012159.
Abstract
Humans are extremely robust in our ability to perceive and recognize objects: we see faces in tea stains and can recognize friends on dark streets. Yet, neurocomputational models of primate object recognition have focused on the initial feed-forward pass of processing through the ventral stream, and less on the top-down feedback that likely underlies robust object perception and recognition. Aligned with the generative approach, we propose that the visual system actively facilitates recognition by reconstructing the object hypothesized to be in the image. Top-down attention then uses this reconstruction as a template to bias feedforward processing to align with the most plausible object hypothesis. Building on auto-encoder neural networks, our model makes detailed hypotheses about the appearance and location of the candidate objects in the image by reconstructing a complete object representation from potentially incomplete visual input due to noise and occlusion. The model then leverages the best object reconstruction, measured by reconstruction error, to direct the bottom-up process of selectively routing low-level features, a top-down biasing that captures a core function of attention. We evaluated our model using the MNIST-C (handwritten digits under corruptions) and ImageNet-C (real-world objects under corruptions) datasets. Not only did our model achieve superior performance on these challenging tasks designed to approximate real-world noise and occlusion viewing conditions, but it also better accounted for human behavioral reaction times and error patterns than a standard feedforward Convolutional Neural Network. Our model suggests that a complete understanding of object perception and recognition requires integrating top-down attentional feedback, which we propose takes the form of an object reconstruction.
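The two-step loop this abstract describes (reconstruct each candidate object, then use the best reconstruction as an attentional template) can be sketched in a few lines. This is a minimal illustration, not the authors' network: the hand-made templates below stand in for the class-conditional reconstructions their trained auto-encoder would produce, and all names are invented for the example.

```python
import numpy as np

# Toy class templates standing in for a trained auto-encoder's
# class-conditional reconstructions (illustrative, not the paper's model).
TEMPLATES = {
    "T": np.array([[1, 1, 1],
                   [0, 1, 0],
                   [0, 1, 0]], dtype=float),
    "L": np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 1]], dtype=float),
}

def recognize(image):
    # Step 1, hypothesis testing: reconstruct each candidate object and
    # score it by reconstruction error against the (corrupted) input.
    errors = {c: ((image - t) ** 2).sum() for c, t in TEMPLATES.items()}
    best = min(errors, key=errors.get)
    # Step 2, top-down attention: the winning reconstruction gates the
    # feedforward input, suppressing features inconsistent with it.
    attended = image * TEMPLATES[best]
    return best, attended

# A "T" with one stroke pixel occluded (top right) and one clutter
# pixel added (bottom right).
corrupted = np.array([[1, 1, 0],
                      [0, 1, 0],
                      [0, 1, 1]], dtype=float)
label, attended = recognize(corrupted)
```

Despite the occlusion and clutter, the reconstruction error still favors "T", and the attentional gate zeroes out the clutter pixel that the winning hypothesis does not predict.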
Affiliation(s)
- Seoyoung Ahn
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Hossein Adeli
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York City, New York, United States of America
- Gregory J. Zelinsky
- Department of Psychology, Stony Brook University, Stony Brook, New York, United States of America
- Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
2
Domijan D, Marić M. An interactive cortical architecture for perceptual organization by accentuation. Neural Netw 2023; 169:205-225. PMID: 39491385; DOI: 10.1016/j.neunet.2023.10.028.
Abstract
Accentuation has been proposed as a general principle of perceptual organization. Here, we have developed a neurodynamic architecture to explain how accentuation affects boundary segmentation and shape perception. The model consists of bottom-up and top-down pathways. Bottom-up processing involves a set of feature maps that compute bottom-up salience of surfaces, boundaries, boundary completions, and junctions. Then, a feature-based winner-take-all network selects the most salient locations. Top-down processing includes an object-based attention stage that allows enhanced neural activity to propagate from the most salient locations to all connected locations, and a visual segmentation stage that employs inhibitory connections to segregate boundaries into distinct maps. The model was tested on a series of computer simulations showing how the position of the accent affects boundary segregation in the square-diamond and the pointing illusion. The model was also tested on a variety of texture segregation tasks, showing that its performance was comparable to that of human observers. The model suggests that there is an intermediate stage of visual processing between perceptual grouping and object recognition that helps the visual system choose between competing percepts of the ambiguous stimulus.
3
Grossberg S. How children learn to understand language meanings: a neural model of adult-child multimodal interactions in real-time. Front Psychol 2023; 14:1216479. PMID: 37599779; PMCID: PMC10435915; DOI: 10.3389/fpsyg.2023.1216479.
Abstract
This article describes a biological neural network model that can be used to explain how children learn to understand language meanings about the perceptual and affective events that they consciously experience. This kind of learning often occurs when a child interacts with an adult teacher to learn language meanings about events that they experience together. Multiple types of self-organizing brain processes are involved in learning language meanings, including processes that control conscious visual perception, joint attention, object learning and conscious recognition, cognitive working memory, cognitive planning, emotion, cognitive-emotional interactions, volition, and goal-oriented actions. The article shows how all of these brain processes interact to enable the learning of language meanings to occur. The article also contrasts these human capabilities with AI models such as ChatGPT. The current model is called the ChatSOME model, where SOME abbreviates Self-Organizing MEaning.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Boston University, Boston, MA, United States
4
Reeves A, Qian J. The Short-Term Retention of Depth. Vision (Basel) 2021; 5:59. PMID: 34941654; PMCID: PMC8707874; DOI: 10.3390/vision5040059.
Abstract
We review research on visual working memory for information portrayed by items arranged in depth (i.e., distance to the observer) within peri-personal space. Most items lose their metric depths within half a second, even though their identities and spatial positions are retained. The paradoxical loss of depth information may arise because visual working memory retains the depth of a single object for the purpose of actions such as pointing or grasping, which usually apply to only one thing at a time.
Affiliation(s)
- Adam Reeves
- Department of Psychology, Northeastern University, Boston, MA 02115, USA
- Jiehui Qian
- Department of Psychology, Sun Yat-Sen University, Guangzhou 510006, China
5
Grossberg S. A Canonical Laminar Neocortical Circuit Whose Bottom-Up, Horizontal, and Top-Down Pathways Control Attention, Learning, and Prediction. Front Syst Neurosci 2021; 15:650263. PMID: 33967708; PMCID: PMC8102731; DOI: 10.3389/fnsys.2021.650263.
Abstract
All perceptual and cognitive circuits in the human cerebral cortex are organized into layers. Specializations of a canonical laminar network of bottom-up, horizontal, and top-down pathways carry out multiple kinds of biological intelligence across different neocortical areas. This article describes what this canonical network is and notes that it can support processes as different as 3D vision and figure-ground perception; attentive category learning and decision-making; speech perception; and cognitive working memory (WM), planning, and prediction. These processes take place within and between multiple parallel cortical streams that obey computationally complementary laws. The interstream interactions that are needed to overcome these complementary deficiencies mix cell properties so thoroughly that some authors have noted the difficulty of determining what exactly constitutes a cortical stream and the differences between streams. The models summarized herein explain how these complementary properties arise, and how their interstream interactions overcome their computational deficiencies to support effective goal-oriented behaviors.
Affiliation(s)
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Departments of Mathematics and Statistics, Psychological and Brain Sciences, and Biomedical Engineering, Center for Adaptive Systems, Boston University, Boston, MA, United States
6
Marić M, Domijan D. A neurodynamic model of the interaction between color perception and color memory. Neural Netw 2020; 129:222-248. PMID: 32615406; DOI: 10.1016/j.neunet.2020.06.008.
Abstract
The memory color effect and Spanish castle illusion have been taken as evidence of the cognitive penetrability of vision. In the same manner, the successful decoding of color-related brain signals in functional neuroimaging studies suggests the retrieval of memory colors associated with a perceived gray object. Here, we offer an alternative account of these findings based on the design principles of adaptive resonance theory (ART). In ART, conscious perception is a consequence of a resonant state. Resonance emerges in a recurrent cortical circuit when a bottom-up spatial pattern agrees with the top-down expectation. When they do not agree, a special control mechanism is activated that resets the network and clears off the erroneous expectation, thus allowing the bottom-up activity to always dominate in perception. We developed a color ART circuit and evaluated its behavior in computer simulations. The model helps to explain how traces of erroneous expectations about incoming color are eventually removed from the color perception, although their transient effect may be visible in behavioral responses or in brain imaging. Our results suggest that the color ART circuit, as a predictive computational system, is almost never penetrable, because it is equipped with computational mechanisms designed to constrain the impact of the top-down predictions on ongoing perceptual processing.
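The match/reset cycle this abstract appeals to can be sketched as a toy ART-style search. This is a minimal illustration of the design principle, not the paper's color circuit: the vigilance value, vectors, and names are all invented for the example.

```python
import numpy as np

def art_search(inp, prototypes, vigilance=0.8):
    # Category choice: try stored prototypes in order of bottom-up activation.
    order = sorted(range(len(prototypes)),
                   key=lambda j: -np.minimum(inp, prototypes[j]).sum())
    for j in order:
        # Match: how well does the top-down expectation cover the input?
        match = np.minimum(inp, prototypes[j]).sum() / inp.sum()
        if match >= vigilance:
            return j  # resonance: expectation agrees with the bottom-up pattern
        # Mismatch: reset clears the erroneous expectation; search continues.
    return None  # no expectation survives; bottom-up input dominates perception

gray = np.array([0.5, 0.5, 0.5])        # achromatic bottom-up input
memory_red = np.array([0.9, 0.1, 0.1])  # stored "memory color" expectation
result = art_search(gray, [memory_red])  # mismatch triggers reset
```

Here the stored memory color fails the vigilance test against the gray input, so the expectation is reset rather than being allowed to recolor the percept, which is the sense in which the circuit resists cognitive penetration.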
7
Grossberg S. A Path Toward Explainable AI and Autonomous Adaptive Intelligence: Deep Learning, Adaptive Resonance, and Models of Perception, Emotion, and Action. Front Neurorobot 2020; 14:36. PMID: 32670045; PMCID: PMC7330174; DOI: 10.3389/fnbot.2020.00036.
Abstract
Biological neural network models whereby brains make minds help to understand autonomous adaptive intelligence. This article summarizes why the dynamics and emergent properties of such models for perception, cognition, emotion, and action are explainable, and thus amenable to being confidently implemented in large-scale applications. Key to their explainability is how these models combine fast activations, or short-term memory (STM) traces, and learned weights, or long-term memory (LTM) traces. Visual and auditory perceptual models have explainable conscious STM representations of visual surfaces and auditory streams in surface-shroud resonances and stream-shroud resonances, respectively. Deep Learning is often used to classify data. However, Deep Learning can experience catastrophic forgetting: At any stage of learning, an unpredictable part of its memory can collapse. Even if it makes some accurate classifications, they are not explainable and thus cannot be used with confidence. Deep Learning shares these problems with the back propagation algorithm, whose computational problems due to non-local weight transport during mismatch learning were described in the 1980s. Deep Learning became popular after very fast computers and huge online databases became available that enabled new applications despite these problems. Adaptive Resonance Theory, or ART, algorithms overcome the computational problems of back propagation and Deep Learning. ART is a self-organizing production system that incrementally learns, using arbitrary combinations of unsupervised and supervised learning and only locally computable quantities, to rapidly classify large non-stationary databases without experiencing catastrophic forgetting. ART classifications and predictions are explainable using the attended critical feature patterns in STM on which they build. The LTM adaptive weights of the fuzzy ARTMAP algorithm induce fuzzy IF-THEN rules that explain what feature combinations predict successful outcomes. ART has been successfully used in multiple large-scale real world applications, including remote sensing, medical database prediction, and social media data clustering. Also explainable are the MOTIVATOR model of reinforcement learning and cognitive-emotional interactions, and the VITE, DIRECT, DIVA, and SOVEREIGN models for reaching, speech production, spatial navigation, and autonomous adaptive intelligence. These biological models exemplify complementary computing, and use local laws for match learning and mismatch learning that avoid the problems of Deep Learning.
Affiliation(s)
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Center for Adaptive Systems, Boston University, Boston, MA, United States
8
Tse PU. Abutting Objects Warp the Three-Dimensional Curvature of Modally Completing Surfaces. Iperception 2020; 11:2041669520903554. PMID: 32518614; PMCID: PMC7253068; DOI: 10.1177/2041669520903554.
Abstract
Binocular disparity can give rise to the perception of open surfaces or closed curved surfaces (volumes) that appear to vary smoothly across discrete depths. Here I build on my recent papers by providing examples where modally completing surfaces not only fill in from one depth layer's visible contours to another layer's visible contours within virtual contours in an analog manner, but where modally completing surface curvature is altered by the interpolation of an abutting object perceived to be connected to or embedded within that modally completing surface. Seemingly minor changes in such an abutting object can flip the interpretation of distal regions, for example, turning a distant edge (where a surface ends) into rim (where a surface bends to occlude itself) or turning an open surface into a closed one. In general, the interpolated modal surface appears to deform, warp, or bend in three dimensions to accommodate the abutting object. These demonstrations cannot be easily explained by existing models of visual processing or modal completion and drive home the implausibility of localistic accounts of modal or amodal completion that are based, for example, solely on extending contours in space until they meet behind an occluder or in front of "pacmen." These demonstrations place new constraints on the holistic surface and volume generation processes that construct our experience of a three-dimensional world of surfaces and objects under normal viewing conditions.
Affiliation(s)
- Peter U Tse
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire, United States
9
Abstract
This article proposes that biologically plausible theories of behavior can be constructed by following a method of "phylogenetic refinement," whereby they are progressively elaborated from simple to complex according to phylogenetic data on the sequence of changes that occurred over the course of evolution. It is argued that sufficient data exist to make this approach possible, and that the result can more effectively delineate the true biological categories of neurophysiological mechanisms than do approaches based on definitions of putative functions inherited from psychological traditions. As an example, the approach is used to sketch a theoretical framework of how basic feedback control of interaction with the world was elaborated during vertebrate evolution, to give rise to the functional architecture of the mammalian brain. The results provide a conceptual taxonomy of mechanisms that naturally map to neurophysiological and neuroanatomical data and that offer a context for defining putative functions that, it is argued, are better grounded in biology than are some of the traditional concepts of cognitive science.
Affiliation(s)
- Paul Cisek
- Department of Neuroscience, University of Montréal, Montréal, Québec, Canada.
10
Grossberg S. The resonant brain: How attentive conscious seeing regulates action sequences that interact with attentive cognitive learning, recognition, and prediction. Atten Percept Psychophys 2019; 81:2237-2264. PMID: 31218601; PMCID: PMC6848053; DOI: 10.3758/s13414-019-01789-2.
Abstract
This article describes mechanistic links that exist in advanced brains between processes that regulate conscious attention, seeing, and knowing, and those that regulate looking and reaching. These mechanistic links arise from basic properties of brain design principles such as complementary computing, hierarchical resolution of uncertainty, and adaptive resonance. These principles require conscious states to mark perceptual and cognitive representations that are complete, context sensitive, and stable enough to control effective actions. Surface-shroud resonances support conscious seeing and action, whereas feature-category resonances support learning, recognition, and prediction of invariant object categories. Feedback interactions between cortical areas such as peristriate visual cortical areas V2, V3A, and V4, and the lateral intraparietal area (LIP) and inferior parietal sulcus (IPS) of the posterior parietal cortex (PPC) control sequences of saccadic eye movements that foveate salient features of attended objects and thereby drive invariant object category learning. Learned categories can, in turn, prime the objects and features that are attended and searched. These interactions coordinate processes of spatial and object attention, figure-ground separation, predictive remapping, invariant object category learning, and visual search. They create a foundation for learning to control motor-equivalent arm movement sequences, and for storing these sequences in cognitive working memories that can trigger the learning of cognitive plans with which to read out skilled movement sequences. Cognitive-emotional interactions that are regulated by reinforcement learning can then help to select the plans that control actions most likely to acquire valued goal objects in different situations. Many interdisciplinary psychological and neurobiological data about conscious and unconscious behaviors in normal individuals and clinical patients have been explained in terms of these concepts and mechanisms.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Room 213, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, 677 Beacon Street, Boston, MA, 02215, USA.
11
Grossberg S. The Embodied Brain of SOVEREIGN2: From Space-Variant Conscious Percepts During Visual Search and Navigation to Learning Invariant Object Categories and Cognitive-Emotional Plans for Acquiring Valued Goals. Front Comput Neurosci 2019; 13:36. PMID: 31333437; PMCID: PMC6620614; DOI: 10.3389/fncom.2019.00036.
Abstract
This article develops a model of how reactive and planned behaviors interact in real time. Controllers for both animals and animats need reactive mechanisms for exploration, and learned plans to efficiently reach goal objects once an environment becomes familiar. The SOVEREIGN model embodied these capabilities, and was tested in a 3D virtual reality environment. Neural models have characterized important adaptive and intelligent processes that were not included in SOVEREIGN. A major research program is summarized herein by which to consistently incorporate them into an enhanced model called SOVEREIGN2. Key new perceptual, cognitive, cognitive-emotional, and navigational processes require feedback networks which regulate resonant brain states that support conscious experiences of seeing, feeling, and knowing. Also included are computationally complementary processes of the mammalian neocortical What and Where processing streams, and homologous mechanisms for spatial navigation and arm movement control. These include: Unpredictably moving targets are tracked using coordinated smooth pursuit and saccadic movements. Estimates of target and present position are computed in the Where stream, and can activate approach movements. Motion cues can elicit orienting movements to bring new targets into view. Cumulative movement estimates are derived from visual and vestibular cues. Arbitrary navigational routes are incrementally learned as a labeled graph of angles turned and distances traveled between turns. Noisy and incomplete visual sensor data are transformed into representations of visual form and motion. Invariant recognition categories are learned in the What stream. Sequences of invariant object categories are stored in a cognitive working memory, whereas sequences of movement positions and directions are stored in a spatial working memory. Stored sequences trigger learning of cognitive and spatial/motor sequence categories or plans, also called list chunks, which control planned decisions and movements toward valued goal objects. Predictively successful list chunk combinations are selectively enhanced or suppressed via reinforcement learning and incentive motivational learning. Expected vs. unexpected event disconfirmations regulate these enhancement and suppressive processes. Adaptively timed learning enables attention and action to match task constraints. Social cognitive joint attention enables imitation learning of skills by learners who observe teachers from different spatial vantage points.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, Boston, MA, United States
12
Surface diagnosticity predicts the high-level representation of regular and irregular object shape in human vision. Atten Percept Psychophys 2019; 81:1589-1608. PMID: 30864108; PMCID: PMC6647524; DOI: 10.3758/s13414-019-01698-4.
Abstract
The human visual system has an extraordinary capacity to compute three-dimensional (3D) shape structure for both geometrically regular and irregular objects. The goal of this study was to shed new light on the underlying representational structures that support this ability. Observers (N = 85) completed two complementary perceptual tasks. Experiment 1 involved whole–part matching of image parts to whole geometrically regular and irregular novel object shapes. Image parts comprised either regions of edge contour, volumetric parts, or surfaces. Performance was better for irregular than for regular objects and interacted with part type: volumes yielded better matching performance than surfaces for regular but not for irregular objects. The basis for this effect was further explored in Experiment 2, which used implicit part–whole repetition priming. Here, we orthogonally manipulated shape regularity and a new factor of surface diagnosticity (how predictive a single surface is of object identity). The results showed that surface diagnosticity, not object shape regularity, determined the differential processing of volumes and surfaces. Regardless of shape regularity, objects with low surface diagnosticity were better primed by volumes than by surfaces. In contrast, objects with high surface diagnosticity showed the opposite pattern. These findings are the first to show that surface diagnosticity plays a fundamental role in object recognition. We propose that surface-based shape primitives—rather than volumetric parts—underlie the derivation of 3D object shape in human vision.
13
Schendan HE. Memory influences visual cognition across multiple functional states of interactive cortical dynamics. Psychology of Learning and Motivation 2019. DOI: 10.1016/bs.plm.2019.07.007.
15
Abstract
Short-term visual memory was studied by displaying arrays of four or five numerals, each numeral in its own depth plane, followed after various delays by an arrow cue shown in one of the depth planes. Subjects reported the numeral at the depth cued by the arrow. Accuracy fell with increasing cue delay for the first 500 ms or so, and then recovered almost fully. This dipping pattern contrasts with the usual iconic decay observed for memory traces. The dip occurred with or without a verbal or color-shape retention load on working memory. In contrast, accuracy did not change with delay when a tonal cue replaced the arrow cue. We hypothesized that information concerning the depths of the numerals decays over time in sensory memory, but that cued recall is aided later on by transfer to a visual memory specialized for depth. This transfer is sufficiently rapid with a tonal cue to compensate for the sensory decay, but it is slowed by the need to tag the arrow cue's depth relative to the depths of the numerals, exposing a dip when sensation has decayed and transfer is not yet complete. A model with a fixed rate of sensory decay and varied transfer rates across individuals captures the dip as well as the cue modality effect.
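The dip-and-recovery account in this abstract (a fixed-rate sensory decay plus a slower transfer to a durable depth memory) can be illustrated with a toy two-component model. The functional form and all parameter values below are invented for illustration, not the authors' fitted model.

```python
import math

def recall(t, baseline=0.40, sensory=0.45, tau_s=0.25,
           transfer=0.45, tau_x=1.20):
    """Toy dip model: recall accuracy at cue delay t (seconds) is a
    baseline plus a fast-decaying sensory trace plus a slowly
    accumulating transfer into a durable depth memory. Accuracy dips
    once the sensory trace has faded but transfer is incomplete,
    then recovers. Parameters are illustrative, not fitted values."""
    return (baseline
            + sensory * math.exp(-t / tau_s)            # sensory decay
            + transfer * (1 - math.exp(-t / tau_x)))    # slow transfer
```

With these illustrative constants the minimum falls near t = 0.5 s, matching the reported timing of the dip; a faster transfer rate (smaller `tau_x`), as hypothesized for the tonal cue, fills the dip in.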
16
Marić M, Domijan D. A Neurodynamic Model of Feature-Based Spatial Selection. Front Psychol 2018; 9:417. PMID: 29643826; PMCID: PMC5883145; DOI: 10.3389/fpsyg.2018.00417.
Abstract
Huang and Pashler (2007) suggested that feature-based attention creates a special form of spatial representation, which is termed a Boolean map. It partitions the visual scene into two distinct and complementary regions: selected and not selected. Here, we developed a model of a recurrent competitive network that is capable of state-dependent computation. It selects multiple winning locations based on a joint top-down cue. We augmented a model of the winner-take-all (WTA) circuit that is based on linear-threshold units with two computational elements: dendritic non-linearity that acts on the excitatory units and activity-dependent modulation of synaptic transmission between excitatory and inhibitory units. Computer simulations showed that the proposed model could create a Boolean map in response to a featured cue and elaborate it using the logical operations of intersection and union. In addition, it was shown that in the absence of top-down guidance, the model is sensitive to bottom-up cues such as saliency and abrupt visual onset.
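The Boolean-map operations this abstract refers to reduce to elementwise logic over a location grid. The snippet below is a bare sketch of that idea, not the neurodynamic circuit itself: the tiny labeled "scene" and all names are invented for the example.

```python
import numpy as np

# Toy scene: each location carries one feature label (a color), standing
# in for the model's feature maps (illustrative, not the paper's code).
scene = np.array([["red", "green"],
                  ["green", "red"]])

def boolean_map(scene, feature):
    # A feature cue selects all matching locations at once, partitioning
    # the scene into selected (True) vs. not selected (False).
    return scene == feature

red = boolean_map(scene, "red")
green = boolean_map(scene, "green")
union = red | green           # logical union of two maps
intersection = red & green    # logical intersection (empty here:
                              # no location is both red and green)
```

In the model proper these set operations are carried out by the augmented WTA dynamics rather than by explicit Boolean algebra, but the input-output behavior is the same.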
Affiliation(s)
- Mateja Marić
- Department of Psychology, Faculty of Humanities and Social Sciences, University of Rijeka, Rijeka, Croatia
| | - Dražen Domijan
- Department of Psychology, Faculty of Humanities and Social Sciences, University of Rijeka, Rijeka, Croatia
17
Grossberg S. Desirability, availability, credit assignment, category learning, and attention: Cognitive-emotional and working memory dynamics of orbitofrontal, ventrolateral, and dorsolateral prefrontal cortices. Brain Neurosci Adv 2018; 2:2398212818772179. PMID: 32166139; PMCID: PMC7058233; DOI: 10.1177/2398212818772179.
Abstract
BACKGROUND: The prefrontal cortices play an essential role in cognitive-emotional and working memory processes through interactions with multiple brain regions.
METHODS: This article further develops a unified neural architecture that explains many recent and classical data about prefrontal function and makes testable predictions.
RESULTS: Prefrontal properties of desirability, availability, credit assignment, category learning, and feature-based attention are explained. These properties arise through interactions of orbitofrontal, ventrolateral prefrontal, and dorsolateral prefrontal cortices with the inferotemporal cortex, perirhinal cortex, parahippocampal cortices; ventral bank of the principal sulcus, ventral prearcuate gyrus, frontal eye fields, hippocampus, amygdala, basal ganglia, hypothalamus, and visual cortical areas V1, V2, V3A, V4, middle temporal cortex, medial superior temporal area, lateral intraparietal cortex, and posterior parietal cortex. Model explanations also include how the value of visual objects and events is computed, which objects and events cause desired consequences and which may be ignored as predictively irrelevant, and how to plan and act to realise these consequences, including how to selectively filter expected versus unexpected events, leading to movements towards, and conscious perception of, expected events. Modelled processes include reinforcement learning and incentive motivational learning; object and spatial working memory dynamics; and category learning, including the learning of object categories, value categories, object-value categories, and sequence categories, or list chunks.
CONCLUSION: This article hereby proposes a unified neural theory of prefrontal cortex and its functions.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, Biomedical Engineering, Boston University, Boston, MA, USA
18

19
Abstract
Building on the modal and amodal completion work of Kanizsa, Carman and Welch showed that binocular stereo viewing of two disparate images can give rise to a percept of 3D curved, nonclosed illusory contours and surfaces. Here, it is shown that binocular presentation can also give rise to the percept of closed curved surfaces or volumes that appear to vary smoothly across discrete depths in binocularly fused images, although in fact only two binocular disparities are discretely defined between corresponding contour elements of the inducing elements. Surfaces are filled in from one depth layer's visible contours to another layer's visible contours within virtual contours that are interpolated on the basis of good contour continuation between the visible portions of contour. These single depth contour segments are taken not to arise from surface edges, as in Kanizsa's or Carman and Welch's examples, but from segments of "rim" where the line of sight just grazes a surface that continues behind and beyond the rim smoothly. When there are two or more surface-propagating contour segments, the propagated surfaces can continue away from the inferred rim, merge, and then close behind the self-occluding visible surface into an everywhere differentiable closed surface or volume. Illusory surfaces can possess a depth and perceived surface curvature that is consistent with all visible contour segments, despite the absence of local disparity cues at interpolated 3D surface locations far from any visible contour. These demonstrations cannot be easily explained by existing models of visual processing. They place constraints on the surface and volume generation processes that construct our 3D world under normal viewing conditions.
Affiliation(s)
- Peter Ulric Tse
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, USA
20
Grossberg S. Towards solving the hard problem of consciousness: The varieties of brain resonances and the conscious experiences that they support. Neural Netw 2016; 87:38-95. [PMID: 28088645 DOI: 10.1016/j.neunet.2016.11.003]
Abstract
The hard problem of consciousness is the problem of explaining how we experience qualia or phenomenal experiences, such as seeing, hearing, and feeling, and knowing what they are. To solve this problem, a theory of consciousness needs to link brain to mind by modeling how emergent properties of several brain mechanisms interacting together embody detailed properties of individual conscious psychological experiences. This article summarizes evidence that Adaptive Resonance Theory, or ART, accomplishes this goal. ART is a cognitive and neural theory of how advanced brains autonomously learn to attend, recognize, and predict objects and events in a changing world. ART has predicted that "all conscious states are resonant states" as part of its specification of mechanistic links between processes of consciousness, learning, expectation, attention, resonance, and synchrony. It hereby provides functional and mechanistic explanations of data ranging from individual spikes and their synchronization to the dynamics of conscious perceptual, cognitive, and cognitive-emotional experiences. ART has reached sufficient maturity to begin classifying the brain resonances that support conscious experiences of seeing, hearing, feeling, and knowing. Psychological and neurobiological data in both normal individuals and clinical patients are clarified by this classification. This analysis also explains why not all resonances become conscious, and why not all brain dynamics are resonant. The global organization of the brain into computationally complementary cortical processing streams (complementary computing), and the organization of the cerebral cortex into characteristic layers of cells (laminar computing), figure prominently in these explanations of conscious and unconscious processes. Alternative models of consciousness are also discussed.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA; Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, 677 Beacon Street, Boston, MA 02215, USA.
21
Dresp-Langley B, Grossberg S. Neural Computation of Surface Border Ownership and Relative Surface Depth from Ambiguous Contrast Inputs. Front Psychol 2016; 7:1102. [PMID: 27516746 PMCID: PMC4963386 DOI: 10.3389/fpsyg.2016.01102]
Abstract
The segregation of image parts into foreground and background is an important aspect of the neural computation of 3D scene perception. To achieve such segregation, the brain needs information about border ownership; that is, the belongingness of a contour to a specific surface represented in the image. This article presents psychophysical data derived from 3D percepts of figure and ground that were generated by presenting 2D images composed of spatially disjoint shapes that pointed inward or outward relative to the continuous boundaries that they induced along their collinear edges. The shapes in some images had the same contrast (black or white) with respect to the background gray. Other images included opposite contrasts along each induced continuous boundary. Psychophysical results demonstrate conditions under which figure-ground judgment probabilities in response to these ambiguous displays are determined by the orientation of contrasts only, not by their relative contrasts, despite the fact that many border ownership cells in cortical area V2 respond to a preferred relative contrast. Studies are also reviewed in which both polarity-specific and polarity-invariant properties obtain. The FACADE and 3D LAMINART models are used to explain these data.
Affiliation(s)
- Birgitta Dresp-Langley
- Centre National de la Recherche Scientifique, ICube UMR 7357, University of Strasbourg, Strasbourg, France
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Boston University, Boston, MA, USA
22
Grossberg S. Cortical Dynamics of Figure-Ground Separation in Response to 2D Pictures and 3D Scenes: How V2 Combines Border Ownership, Stereoscopic Cues, and Gestalt Grouping Rules. Front Psychol 2016; 6:2054. [PMID: 26858665 PMCID: PMC4726768 DOI: 10.3389/fpsyg.2015.02054]
Abstract
The FACADE model, and its laminar cortical realization and extension in the 3D LAMINART model, have explained, simulated, and predicted many perceptual and neurobiological data about how the visual cortex carries out 3D vision and figure-ground perception, and how these cortical mechanisms enable 2D pictures to generate 3D percepts of occluding and occluded objects. In particular, these models have proposed how border ownership occurs, but have not yet explicitly explained the correlation between multiple properties of border ownership neurons in cortical area V2 that were reported in a remarkable series of neurophysiological experiments by von der Heydt and his colleagues; namely, border ownership, contrast preference, binocular stereoscopic information, selectivity for side-of-figure, Gestalt rules, and strength of attentional modulation, as well as the time course during which such properties arise. This article shows how, by combining 3D LAMINART properties that were discovered in two parallel streams of research, a unified explanation of these properties emerges. This explanation proposes, moreover, how these properties contribute to the generation of consciously seen 3D surfaces. The first research stream models how processes like 3D boundary grouping and surface filling-in interact in multiple stages within and between the V1 interblob—V2 interstripe—V4 cortical stream and the V1 blob—V2 thin stripe—V4 cortical stream, respectively. Of particular importance for understanding figure-ground separation is how these cortical interactions convert computationally complementary boundary and surface mechanisms into a consistent conscious percept, including the critical use of surface contour feedback signals from surface representations in V2 thin stripes to boundary representations in V2 interstripes. Remarkably, key figure-ground properties emerge from these feedback interactions. 
The second research stream shows how cells that compute absolute disparity in cortical area V1 are transformed into cells that compute relative disparity in cortical area V2. Relative disparity is a more invariant measure of an object's depth and 3D shape, and is sensitive to figure-ground properties.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA; Department of Mathematics, Boston University, Boston, MA, USA
23
Grossberg S, Palma J, Versace M. Resonant Cholinergic Dynamics in Cognitive and Motor Decision-Making: Attention, Category Learning, and Choice in Neocortex, Superior Colliculus, and Optic Tectum. Front Neurosci 2016; 9:501. [PMID: 26834535 PMCID: PMC4718999 DOI: 10.3389/fnins.2015.00501]
Abstract
Freely behaving organisms need to rapidly calibrate their perceptual, cognitive, and motor decisions based on continuously changing environmental conditions. These plastic changes include sharpening or broadening of cognitive and motor attention and learning to match the behavioral demands that are imposed by changing environmental statistics. This article proposes that a shared circuit design for such flexible decision-making is used in specific cognitive and motor circuits, and that both types of circuits use acetylcholine to modulate choice selectivity. Such task-sensitive control is proposed to govern thalamocortical choice of the critical features that are cognitively attended and that are incorporated through learning into prototypes of visual recognition categories. A cholinergically-modulated process of vigilance control determines if a recognition category and its attended features are abstract (low vigilance) or concrete (high vigilance). Homologous neural mechanisms of cholinergic modulation are proposed to focus attention and learn a multimodal map within the deeper layers of superior colliculus. This map enables visual, auditory, and planned movement commands to compete for attention, leading to selection of a winning position that controls where the next saccadic eye movement will go. Such map learning may be viewed as a kind of attentive motor category learning. The article hereby explicates a link between attention, learning, and cholinergic modulation during decision making within both cognitive and motor systems. Homologs between the mammalian superior colliculus and the avian optic tectum lead to predictions about how multimodal map learning may occur in the mammalian and avian brain and how such learning may be modulated by acetylcholine.
Affiliation(s)
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Boston University, Boston, MA, USA
- Center for Adaptive Systems, Boston University, Boston, MA, USA
- Departments of Mathematics, Psychology, and Biomedical Engineering, Boston University, Boston, MA, USA
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
- Jesse Palma
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
- Massimiliano Versace
- Graduate Program in Cognitive and Neural Systems, Boston University, Boston, MA, USA
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
24
Neural Dynamics of the Basal Ganglia During Perceptual, Cognitive, and Motor Learning and Gating. Innovations in Cognitive Neuroscience 2016. [DOI: 10.1007/978-3-319-42743-0_19]
25
Schendan HE, Ganis G. Top-down modulation of visual processing and knowledge after 250 ms supports object constancy of category decisions. Front Psychol 2015; 6:1289. [PMID: 26441701 PMCID: PMC4584963 DOI: 10.3389/fpsyg.2015.01289]
Abstract
People categorize objects more slowly when visual input is highly impoverished instead of optimal. While bottom-up models may explain a decision with optimal input, perceptual hypothesis testing (PHT) theories implicate top-down processes with impoverished input. Brain mechanisms and the time course of PHT are largely unknown. This event-related potential study used a neuroimaging paradigm that implicated prefrontal cortex in top-down modulation of occipitotemporal cortex. Subjects categorized more impoverished and less impoverished real and pseudo objects. PHT theories predict larger impoverishment effects for real than pseudo objects because top-down processes modulate knowledge only for real objects, but different PHT variants predict different timing. Consistent with parietal-prefrontal PHT variants, around 250 ms, the earliest impoverished real object interaction started on an N3 complex, which reflects interactive cortical activity for object cognition. N3 impoverishment effects localized to both prefrontal and occipitotemporal cortex for real objects only. The N3 also showed knowledge effects by 230 ms that localized to occipitotemporal cortex. Later effects reflected (a) word meaning in temporal cortex during the N400, (b) internal evaluation of prior decision and memory processes and secondary higher-order memory involving anterotemporal parts of a default mode network during posterior positivity (P600), and (c) response related activity in posterior cingulate during an anterior slow wave (SW) after 700 ms. Finally, response activity in supplementary motor area during a posterior SW after 900 ms showed impoverishment effects that correlated with RTs. Convergent evidence from studies of vision, memory, and mental imagery which reflects purely top-down inputs, indicates that the N3 reflects the critical top-down processes of PHT. A hybrid multiple-state interactive, PHT and decision theory best explains the visual constancy of object cognition.
Affiliation(s)
- Haline E. Schendan
- School of Psychology, Cognition Institute, University of Plymouth, Plymouth, UK
- Giorgio Ganis
- School of Psychology, Cognition Institute, University of Plymouth, Plymouth, UK
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Radiology, Harvard Medical School, Boston, MA, USA
26
Reppa I, Greville WJ, Leek EC. The role of surface-based representations of shape in visual object recognition. Q J Exp Psychol (Hove) 2015; 68:2351-69. [PMID: 25768675 DOI: 10.1080/17470218.2015.1014379]
Abstract
This study contrasted the role of surfaces and volumetric shape primitives in three-dimensional object recognition. Observers (N = 50) matched subsets of closed contour fragments, surfaces, or volumetric parts to whole novel objects during a whole-part matching task. Three factors were further manipulated: part viewpoint (either same or different between component parts and whole objects), surface occlusion (comparison parts contained either visible surfaces only, or a surface that was fully or partially occluded in the whole object), and target-distractor similarity. Similarity was varied in terms of systematic variation in nonaccidental (NAP) or metric (MP) properties of individual parts. Analysis of sensitivity (d') showed a whole-part matching advantage for surface-based parts and volumes over closed contour fragments, but no benefit for volumetric parts over surfaces. We also found a performance cost in matching volumetric parts to wholes when the volumes showed surfaces that were occluded in the whole object. The same pattern was found for both same and different viewpoints, and regardless of target-distractor similarity. These findings challenge models in which recognition is mediated by volumetric part-based shape representations. Instead, we argue that the results are consistent with a surface-based model of high-level shape representation for recognition.
Affiliation(s)
- Irene Reppa
- Department of Psychology, Wales Institute for Cognitive Neuroscience, Swansea University, Swansea, UK
- W James Greville
- Department of Psychology, Wales Institute for Cognitive Neuroscience, Swansea University, Swansea, UK
- E Charles Leek
- Wolfson Centre for Clinical and Cognitive Neuroscience, School of Psychology, Bangor University, Bangor, UK
27
Grossberg S, Srinivasan K, Yazdanbakhsh A. Binocular fusion and invariant category learning due to predictive remapping during scanning of a depthful scene with eye movements. Front Psychol 2015; 5:1457. [PMID: 25642198 PMCID: PMC4294135 DOI: 10.3389/fpsyg.2014.01457]
Abstract
How does the brain maintain stable fusion of 3D scenes when the eyes move? Every eye movement causes each retinal position to process a different set of scenic features, and thus the brain needs to binocularly fuse new combinations of features at each position after an eye movement. Despite these breaks in retinotopic fusion due to each movement, previously fused representations of a scene in depth often appear stable. The 3D ARTSCAN neural model proposes how the brain does this by unifying concepts about how multiple cortical areas in the What and Where cortical streams interact to coordinate processes of 3D boundary and surface perception, spatial attention, invariant object category learning, predictive remapping, eye movement control, and learned coordinate transformations. The model explains data from single neuron and psychophysical studies of covert visual attention shifts prior to eye movements. The model further clarifies how perceptual, attentional, and cognitive interactions among multiple brain regions (LGN, V1, V2, V3A, V4, MT, MST, PPC, LIP, ITp, ITa, SC) may accomplish predictive remapping as part of the process whereby view-invariant object categories are learned. These results build upon earlier neural models of 3D vision and figure-ground separation and the learning of invariant object categories as the eyes freely scan a scene. A key process concerns how an object's surface representation generates a form-fitting distribution of spatial attention, or attentional shroud, in parietal cortex that helps maintain the stability of multiple perceptual and cognitive processes. Predictive eye movement signals maintain the stability of the shroud, as well as of binocularly fused perceptual boundaries and surface representations.
Affiliation(s)
- Stephen Grossberg
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Center for Computational Neuroscience and Neural Technology, and Department of Mathematics, Boston University, Boston, MA, USA
- Karthik Srinivasan
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Center for Computational Neuroscience and Neural Technology, and Department of Mathematics, Boston University, Boston, MA, USA
- Arash Yazdanbakhsh
- Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Center for Computational Neuroscience and Neural Technology, and Department of Mathematics, Boston University, Boston, MA, USA
28

29
From brain synapses to systems for learning and memory: Object recognition, spatial navigation, timed conditioning, and movement control. Brain Res 2014; 1621:270-93. [PMID: 25446436 DOI: 10.1016/j.brainres.2014.11.018]
Abstract
This article provides an overview of neural models of synaptic learning and memory whose expression in adaptive behavior depends critically on the circuits and systems in which the synapses are embedded. It reviews Adaptive Resonance Theory, or ART, models that use excitatory matching and match-based learning to achieve fast category learning and whose learned memories are dynamically stabilized by top-down expectations, attentional focusing, and memory search. ART clarifies mechanistic relationships between consciousness, learning, expectation, attention, resonance, and synchrony. ART models are embedded in ARTSCAN architectures that unify processes of invariant object category learning, recognition, spatial and object attention, predictive remapping, and eye movement search, and that clarify how conscious object vision and recognition may fail during perceptual crowding and parietal neglect. The generality of learned categories depends upon a vigilance process that is regulated by acetylcholine via the nucleus basalis. Vigilance can get stuck at too high or too low values, thereby causing learning problems in autism and medial temporal amnesia. Similar synaptic learning laws support qualitatively different behaviors: Invariant object category learning in the inferotemporal cortex; learning of grid cells and place cells in the entorhinal and hippocampal cortices during spatial navigation; and learning of time cells in the entorhinal-hippocampal system during adaptively timed conditioning, including trace conditioning. Spatial and temporal processes through the medial and lateral entorhinal-hippocampal system seem to be carried out with homologous circuit designs. Variations of a shared laminar neocortical circuit design have modeled 3D vision, speech perception, and cognitive working memory and learning. A complementary kind of inhibitory matching and mismatch learning controls movement. This article is part of a Special Issue entitled SI: Brain and Memory.
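The match-based learning under vigilance control described in the abstract above can be illustrated with a minimal sketch, an ART1-style fragment for binary input patterns. This is not the article's full model; the function name `art1_learn` and the parameter defaults are assumptions chosen for illustration only.

```python
import numpy as np

def art1_learn(patterns, rho=0.7, alpha=0.5):
    """Minimal ART1-style category learning on binary inputs.

    rho is the vigilance parameter: high rho yields concrete, specific
    categories; low rho yields abstract, general categories.
    """
    categories = []   # each entry is a binary prototype vector
    assignments = []  # category index chosen for each input
    for I in patterns:
        I = np.asarray(I, dtype=bool)
        # Bottom-up choice: rank categories by the ART1 choice function
        # |I AND w_j| / (alpha + |w_j|), best first.
        order = sorted(
            range(len(categories)),
            key=lambda j: -np.sum(I & categories[j])
                          / (alpha + np.sum(categories[j])))
        chosen = None
        for j in order:
            # Top-down matching: does the prototype match the input
            # well enough to pass the vigilance criterion?
            if np.sum(I & categories[j]) / np.sum(I) >= rho:
                categories[j] = I & categories[j]  # match-based learning
                chosen = j
                break  # resonance: stop the memory search
            # otherwise: reset this category and continue searching
        if chosen is None:  # no category passed vigilance: recruit one
            categories.append(I.copy())
            chosen = len(categories) - 1
        assignments.append(chosen)
    return categories, assignments
```

With high vigilance (`rho=0.9`), two identical patterns share a category while a non-overlapping pattern recruits a new one, illustrating how vigilance regulates the generality of learned categories.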
30
Cao Y, Grossberg S. How the venetian blind percept emerges from the laminar cortical dynamics of 3D vision. Front Psychol 2014; 5:694. [PMID: 25309467 PMCID: PMC4160971 DOI: 10.3389/fpsyg.2014.00694]
Abstract
The 3D LAMINART model of 3D vision and figure-ground perception is used to explain and simulate a key example of the Venetian blind effect and to show how it is related to other well-known perceptual phenomena such as Panum's limiting case. The model proposes how lateral geniculate nucleus (LGN) and hierarchically organized laminar circuits in cortical areas V1, V2, and V4 interact to control processes of 3D boundary formation and surface filling-in that simulate many properties of 3D vision percepts, notably consciously seen surface percepts, which are predicted to arise when filled-in surface representations are integrated into surface-shroud resonances between visual and parietal cortex. Interactions between layers 4, 3B, and 2/3 in V1 and V2 carry out stereopsis and 3D boundary formation. Both binocular and monocular information combine to form 3D boundary and surface representations. Surface contour surface-to-boundary feedback from V2 thin stripes to V2 pale stripes combines computationally complementary boundary and surface formation properties, leading to a single consistent percept, while also eliminating redundant 3D boundaries, and triggering figure-ground perception. False binocular boundary matches are eliminated by Gestalt grouping properties during boundary formation. In particular, a disparity filter, which helps to solve the Correspondence Problem by eliminating false matches, is predicted to be realized as part of the boundary grouping process in layer 2/3 of cortical area V2. The model has been used to simulate the consciously seen 3D surface percepts in 18 psychophysical experiments. These percepts include the Venetian blind effect, Panum's limiting case, contrast variations of dichoptic masking and the correspondence problem, the effect of interocular contrast differences on stereoacuity, stereopsis with polarity-reversed stereograms, da Vinci stereopsis, and perceptual closure. 
These model mechanisms have also simulated properties of 3D neon color spreading, binocular rivalry, 3D Necker cube, and many examples of 3D figure-ground separation.
Affiliation(s)
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
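The abstract above invokes a disparity filter that helps solve the Correspondence Problem by eliminating false binocular matches. A toy caricature, nothing like the layer 2/3 grouping circuit of 3D LAMINART, is a uniqueness constraint over candidate matches; the `(position, feature_label)` encoding below is an assumption for illustration:

```python
from itertools import product

def disparity_filter(left, right, d_max=3):
    """Toy correspondence solver.

    Enumerate candidate binocular matches between same-type features,
    then enforce a uniqueness constraint by greedily keeping the
    smallest-disparity match available to each feature.

    left, right: lists of (position, feature_label) tuples.
    Returns accepted matches as (left_pos, right_pos, disparity).
    """
    candidates = [(lp, rp, rp - lp)
                  for (lp, lf), (rp, rf) in product(left, right)
                  if lf == rf and abs(rp - lp) <= d_max]
    # True and false matches compete; sorting by |disparity| and
    # enforcing one match per feature suppresses the false ones.
    candidates.sort(key=lambda m: abs(m[2]))
    used_left, used_right, accepted = set(), set(), []
    for lp, rp, d in candidates:
        if lp not in used_left and rp not in used_right:
            accepted.append((lp, rp, d))
            used_left.add(lp)
            used_right.add(rp)
    return accepted
```

For two edge features per eye, four candidate matches arise but only the two consistent small-disparity pairings survive the uniqueness constraint.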
31
Kazerounian S, Grossberg S. Real-time learning of predictive recognition categories that chunk sequences of items stored in working memory. Front Psychol 2014; 5:1053. [PMID: 25339918 PMCID: PMC4186345 DOI: 10.3389/fpsyg.2014.01053]
Abstract
How are sequences of events that are temporarily stored in a cognitive working memory unitized, or chunked, through learning? Such sequential learning is needed by the brain in order to enable language, spatial understanding, and motor skills to develop. In particular, how does the brain learn categories, or list chunks, that become selectively tuned to different temporal sequences of items in lists of variable length as they are stored in working memory, and how does this learning process occur in real time? The present article introduces a neural model that simulates learning of such list chunks. In this model, sequences of items are temporarily stored in an Item-and-Order, or competitive queuing, working memory before learning categorizes them using a categorization network, called a Masking Field, which is a self-similar, multiple-scale, recurrent on-center off-surround network that can weigh the evidence for variable-length sequences of items as they are stored in the working memory through time. A Masking Field hereby activates the learned list chunks that represent the most predictive item groupings at any time, while suppressing less predictive chunks. In a network with a given number of input items, all possible ordered sets of these item sequences, up to a fixed length, can be learned with unsupervised or supervised learning. The self-similar multiple-scale properties of Masking Fields interacting with an Item-and-Order working memory provide a natural explanation of George Miller's Magical Number Seven and Nelson Cowan's Magical Number Four. The article explains why linguistic, spatial, and action event sequences may all be stored by Item-and-Order working memories that obey similar design principles, and thus how the current results may apply across modalities. Item-and-Order properties may readily be extended to Item-Order-Rank working memories in which the same item can be stored in multiple list positions, or ranks, as in the list ABADBD. 
Comparisons with other models, including TRACE, MERGE, and TISK, are made.
Affiliation(s)
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
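The Item-and-Order (competitive queuing) working memory described in this entry can be sketched minimally: earlier items are stored with larger activations (a primacy gradient), and order is read out by repeatedly choosing the most active item and self-inhibiting it. This is a toy under an assumed decay factor, not the Masking Field categorization network itself:

```python
import numpy as np

def store_sequence(items, n_items, mu=0.8):
    """Item-and-Order working memory as a primacy gradient.

    The first item receives the highest activation; each later item
    is stored at a fraction mu of the previous one, so serial order
    is encoded by relative activation strength.
    """
    x = np.zeros(n_items)
    act = 1.0
    for item in items:
        x[item] = act
        act *= mu
    return x

def readout(x):
    """Competitive-queuing rehearsal: repeatedly select the most
    active item node and self-inhibit it, recovering stored order."""
    x = x.copy()
    sequence = []
    while x.max() > 0:
        i = int(np.argmax(x))
        sequence.append(i)
        x[i] = 0.0
    return sequence
```

Storing the item sequence 2, 0, 3 over four item nodes yields activations 1.0 > 0.8 > 0.64 in that serial order, and readout recovers the sequence exactly.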
32

33
Ghodrati M, Rajaei K, Ebrahimpour R. The importance of visual features in generic vs. specialized object recognition: a computational study. Front Comput Neurosci 2014; 8:78. [PMID: 25202259 PMCID: PMC4141282 DOI: 10.3389/fncom.2014.00078]
Abstract
It is debated whether the representation of objects in inferior temporal (IT) cortex is distributed over the activities of many neurons or whether there are restricted islands of neurons responsive to a specific set of objects. Several lines of evidence demonstrate that the fusiform face area (FFA, in humans) processes information related to specialized object recognition (here, within-category object recognition, such as face identification). Physiological studies have also discovered several patches in the monkey ventral temporal lobe that are responsible for facial processing. Neuronal recordings from these patches show that neurons are highly selective for face images, whereas for other objects we do not see such selectivity in IT. However, it is also well supported that objects are encoded through distributed patterns of neural activity that are distinctive for each object category. It seems that visual cortex utilizes different mechanisms for between-category object recognition (e.g., face vs. non-face objects) and within-category object recognition (e.g., two different faces). In this study, we address this question with computational simulations. We use two biologically inspired object recognition models and define two experiments that address these issues. The models have a hierarchical structure of several processing layers that simply simulate visual processing from V1 to aIT. We show, through computational modeling, that the difference between these two recognition mechanisms can be grounded in the visual feature extraction mechanism. It is argued that, in order to perform both generic and specialized object recognition, visual cortex must separate the mechanisms involved in within-category object recognition from those involved in between-category object recognition. High recognition performance in within-category object recognition can be guaranteed when class-specific features of intermediate size and complexity are extracted. However, generic object recognition requires a distributed universal dictionary of visual features in which feature size does not differ significantly.
Affiliation(s)
- Masoud Ghodrati
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Department of Physiology, Monash University, Melbourne, VIC, Australia
- Karim Rajaei
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Reza Ebrahimpour
- Brain and Intelligent Systems Research Laboratory (BISLab), Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
|
34
|
Díaz-Pernas FJ, Martínez-Zarzuela M, Antón-Rodríguez M, González-Ortega D. Double recurrent interaction V1–V2–V4 based neural architecture for color natural scene boundary detection and surface perception. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2014.03.040] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
|
35
|
Chang HC, Grossberg S, Cao Y. Where's Waldo? How perceptual, cognitive, and emotional brain processes cooperate during learning to categorize and find desired objects in a cluttered scene. Front Integr Neurosci 2014; 8:43. [PMID: 24987339 PMCID: PMC4060746 DOI: 10.3389/fnint.2014.00043] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2013] [Accepted: 05/02/2014] [Indexed: 11/13/2022] Open
Abstract
The Where's Waldo problem concerns how individuals can rapidly learn to search a scene to detect, attend, recognize, and look at a valued target object in it. This article develops the ARTSCAN Search neural model to clarify how brain mechanisms across the What and Where cortical streams are coordinated to solve the Where's Waldo problem. The What stream learns positionally-invariant object representations, whereas the Where stream controls positionally-selective spatial and action representations. The model overcomes deficiencies of these computationally complementary properties through What and Where stream interactions. Where stream processes of spatial attention and predictive eye movement control modulate What stream processes whereby multiple view- and positionally-specific object categories are learned and associatively linked to view- and positionally-invariant object categories through bottom-up and attentive top-down interactions. Gain fields control the coordinate transformations that enable spatial attention and predictive eye movements to carry out this role. What stream cognitive-emotional learning processes enable the focusing of motivated attention upon the invariant object categories of desired objects. What stream cognitive names or motivational drives can prime a view- and positionally-invariant object category of a desired target object. A volitional signal can convert these primes into top-down activations that can, in turn, prime What stream view- and positionally-specific categories. When it also receives bottom-up activation from a target, such a positionally-specific category can cause an attentional shift in the Where stream to the positional representation of the target, and an eye movement can then be elicited to foveate it. These processes describe interactions among brain regions that include visual cortex, parietal cortex, inferotemporal cortex, prefrontal cortex (PFC), amygdala, basal ganglia (BG), and superior colliculus (SC).
Affiliation(s)
- Hung-Cheng Chang
- Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
- Stephen Grossberg
- Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
- Yongqiang Cao
- Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA
|
36
|
The object advantage can be eliminated under equiluminant conditions. Psychon Bull Rev 2014; 21:1459-64. [PMID: 24700185 DOI: 10.3758/s13423-014-0630-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
A key phenomenon supporting the existence of object-based attention is the object advantage, in which responses are faster for within-object, relative to equidistant between-object, shifts of attention. The origins of this effect have been variously ascribed to low-level "bottom-up" sensory processing and to a cognitive "top-down" strategy of within-object attention prioritization. The degree to which the object advantage depends on lower-level sensory processing was examined by differentially stimulating the magnocellular (M) and parvocellular (P) retino-geniculo-cortical visual pathways by using equiluminant and nonequiluminant conditions. We found that the object advantage can be eliminated when M activity is reduced using psychophysically equiluminant stimuli. This novel result in normal observers suggests that the origin of the object advantage is found in lower-level sensory processing rather than a general cognitive process, which should not be so sensitive to differential activation of the bottom-up P and M pathways. Eliminating the object advantage while maintaining a spatial-cueing advantage with reduced M activity suggests that the notion of independent M-driven spatial attention and P-driven object attention requires revision.
|
37
|
Sandamirskaya Y, Zibner SK, Schneegans S, Schöner G. Using Dynamic Field Theory to extend the embodiment stance toward higher cognition. New Ideas Psychol 2013. [DOI: 10.1016/j.newideapsych.2013.01.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
38
|
Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Netw 2013; 37:1-47. [PMID: 23149242 DOI: 10.1016/j.neunet.2012.09.017] [Citation(s) in RCA: 190] [Impact Index Per Article: 15.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2012] [Revised: 08/24/2012] [Accepted: 09/24/2012] [Indexed: 11/17/2022]
|
39
|
Wi NTN, Loo CK, Chockalingam L. Biologically inspired face recognition: toward pose-invariance. Int J Neural Syst 2012. [PMID: 23186278 DOI: 10.1142/s0129065712500293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
A small change in an image can cause a dramatic change in the signals it produces. The visual system must be able to ignore such changes, yet remain specific enough to perform recognition. This work provides biologically grounded insights into 2D translation and scaling invariance and 3D pose invariance without imposing strain on memory. The model can be divided into lower and higher visual stages. The lower visual stage models the visual pathway from the retina to striate cortex (V1), whereas the higher visual stage is modeled mainly on current psychophysical evidence.
Affiliation(s)
- Noel Tay Nuo Wi
- Centre of Diploma Programmes, Multimedia University, Jalan Ayer Keroh Lama, Melaka, Malaysia.
|
40
|
Srinivasa N, Bhattacharyya R, Sundareswara R, Lee C, Grossberg S. A bio-inspired kinematic controller for obstacle avoidance during reaching tasks with real robots. Neural Netw 2012; 35:54-69. [DOI: 10.1016/j.neunet.2012.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2010] [Revised: 06/19/2012] [Accepted: 07/28/2012] [Indexed: 11/30/2022]
|
41
|
A computational model of fMRI activity in the intraparietal sulcus that supports visual working memory. Cogn Affect Behav Neurosci 2012; 11:573-99. [PMID: 21866425 DOI: 10.3758/s13415-011-0054-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
A computational model was developed to explain a pattern of results of fMRI activation in the intraparietal sulcus (IPS) supporting visual working memory for multiobject scenes. The model is based on the hypothesis that dendrites of excitatory neurons are major computational elements in the cortical circuit. Dendrites enable formation of a competitive queue that exhibits a gradient of activity values for nodes encoding different objects, and this pattern is stored in working memory. In the model, brain imaging data are interpreted as a consequence of blood flow arising from dendritic processing. Computer simulations showed that the model successfully simulates data showing the involvement of inferior IPS in object individuation and spatial grouping through representation of objects' locations in space, along with the involvement of superior IPS in object identification through representation of a set of objects' features. The model exhibits a capacity limit due to the limited dynamic range for nodes and the operation of lateral inhibition among them. The capacity limit is fixed in the inferior IPS regardless of the objects' complexity, due to the normalization of lateral inhibition, and variable in the superior IPS, due to the different encoding demands for simple and complex shapes. Systematic variation in the strength of self-excitation enables an understanding of the individual differences in working memory capacity. The model offers several testable predictions regarding the neural basis of visual working memory.
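The core mechanism this abstract describes, a competitive queue whose capacity limit emerges from limited dynamic range plus normalized lateral inhibition, can be illustrated with a minimal rate-dynamics sketch. The equations and all parameter values below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def competitive_queue(n_items, steps=200, dt=0.1, self_exc=0.5, inh=1.0):
    """Toy rate dynamics for a competitive queue.

    Each node receives a graded input (a primacy gradient),
    excites itself, and receives lateral inhibition from the
    other nodes, normalized by the number of stored items.
    """
    x = np.zeros(n_items)
    inputs = np.linspace(1.0, 0.5, n_items)   # primacy gradient
    f = lambda v: np.maximum(v, 0.0)          # rectified firing rate
    for _ in range(steps):
        total = f(x).sum()
        # leaky integration + self-excitation + normalized inhibition
        dx = -x + self_exc * f(x) + inputs - inh * (total - f(x)) / n_items
        x = x + dt * dx
    return f(x)

rates = competitive_queue(6)
# The activity gradient mirrors the input gradient, while the
# weakest item is suppressed toward zero -- a capacity limit
# arising from the normalized lateral inhibition.
assert all(rates[i] >= rates[i + 1] for i in range(5))
assert rates[0] > 1.0 and rates[-1] < 0.1
```

Raising `self_exc` plays the role of the model's variation in self-excitation strength: stronger positive feedback lets more items survive the competition, which is one way to read the paper's account of individual differences in capacity.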
|
42
|
Hannagan T, Grainger J. Protein Analysis Meets Visual Word Recognition: A Case for String Kernels in the Brain. Cogn Sci 2012; 36:575-606. [DOI: 10.1111/j.1551-6709.2012.01236.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
43
|
Foley NC, Grossberg S, Mingolla E. Neural dynamics of object-based multifocal visual spatial attention and priming: object cueing, useful-field-of-view, and crowding. Cogn Psychol 2012; 65:77-117. [PMID: 22425615 DOI: 10.1016/j.cogpsych.2012.02.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2011] [Revised: 01/07/2012] [Accepted: 02/02/2012] [Indexed: 11/18/2022]
Abstract
How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how "attentional shrouds" are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia. A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects.
Affiliation(s)
- Nicholas C Foley
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA
|
44
|
DiCarlo JJ, Zoccolan D, Rust NC. How does the brain solve visual object recognition? Neuron 2012; 73:415-434.
Abstract
Mounting evidence suggests that 'core object recognition,' the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains poorly understood. Here we review evidence ranging from individual neurons and neuronal populations to behavior and computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical subnetworks with a common functional goal.
Affiliation(s)
- James J DiCarlo
- Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
|
45
|
Schneegans S, Schöner G. A neural mechanism for coordinate transformation predicts pre-saccadic remapping. Biol Cybern 2012; 106:89-109. [PMID: 22481644 DOI: 10.1007/s00422-012-0484-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/16/2011] [Accepted: 03/13/2012] [Indexed: 05/06/2023]
Abstract
Whenever we shift our gaze, any location information encoded in the retinocentric reference frame that is predominant in the visual system is obliterated. How is spatial memory retained across gaze changes? Two different explanations have been proposed: Retinocentric information may be transformed into a gaze-invariant representation through a mechanism consistent with gain fields observed in parietal cortex, or retinocentric information may be updated in anticipation of the shift expected with every gaze change, a proposal consistent with neural observations in LIP. The explanations were considered incompatible with each other, because retinocentric update is observed before the gaze shift has terminated. Here, we show that a neural dynamic mechanism for coordinate transformation can also account for retinocentric updating. Our model postulates an extended mechanism of reference frame transformation that is based on bidirectional mapping between a retinocentric and a body-centered representation and that enables transforming multiple object locations in parallel. The dynamic coupling between the two reference frames generates a shift of the retinocentric representation for every gaze change. We account for the predictive nature of the observed remapping activity by using the same kind of neural mechanism to generate an internal representation of gaze direction that is predictively updated based on corollary discharge signals. We provide evidence for the model by accounting for a series of behavioral and neural experimental observations.
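The gain-field idea at the heart of this abstract can be caricatured in a few lines: a combination layer multiplies retinocentric and gaze activity, and reading out along its diagonals recovers the body-centered position (body = retinal + gaze). The sketch below is a static, feedforward toy, not the paper's bidirectional neural-field mechanism; all layer sizes and positions are hypothetical:

```python
import numpy as np

N = 21          # discrete positions per dimension
c = N // 2      # index of the central (foveal / straight-ahead) position

def onehot(i, n=N):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def gain_field_transform(retinal, gaze):
    """Gain-field-style reference-frame transform (illustrative).

    A 2D combination layer is the outer product of retinal and
    gaze activity (the multiplicative interaction attributed to
    parietal gain fields); summing along its diagonals reads out
    the body-centered position, since body = retinal + gaze.
    """
    field = np.outer(retinal, gaze)          # combination layer
    body = np.zeros(2 * N - 1)               # body-centered map
    for r in range(N):
        for g in range(N):
            body[r + g] += field[r, g]
    return body

# Object 5 units right of the fovea while gaze points 3 units right:
body = gain_field_transform(onehot(c + 5), onehot(c + 3))
body_offset = body.argmax() - 2 * c          # 8 units right of the body midline
assert body_offset == 8

# Predictive remapping: if a corollary-discharge signal updates the
# gaze representation to the saccade target *before* the eye moves,
# the retinocentric estimate shifts in advance (retinal = body - gaze).
planned_gaze = c + 8                         # planned saccade endpoint
predicted_retinal = body.argmax() - planned_gaze
assert predicted_retinal == c                # the object lands on the fovea
```

The last two lines show why the same transformation machinery can produce pre-saccadic remapping: nothing about the mapping changes, only the gaze signal that feeds it is updated predictively.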
|
46
|
Cao Y, Grossberg S. Stereopsis and 3D surface perception by spiking neurons in laminar cortical circuits: a method for converting neural rate models into spiking models. Neural Netw 2011; 26:75-98. [PMID: 22119530 DOI: 10.1016/j.neunet.2011.10.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2010] [Revised: 10/16/2011] [Accepted: 10/20/2011] [Indexed: 10/15/2022]
Abstract
A laminar cortical model of stereopsis and 3D surface perception is developed and simulated. The model shows how spiking neurons that interact in hierarchically organized laminar circuits of the visual cortex can generate analog properties of 3D visual percepts. The model describes how monocular and binocular oriented filtering interact with later stages of 3D boundary formation and surface filling-in in the LGN and cortical areas V1, V2, and V4. It proposes how interactions between layers 4, 3B, and 2/3 in V1 and V2 contribute to stereopsis, and how binocular and monocular information combine to form 3D boundary and surface representations. The model suggests how surface-to-boundary feedback from V2 thin stripes to pale stripes helps to explain how computationally complementary boundary and surface formation properties lead to a single consistent percept, eliminate redundant 3D boundaries, and trigger figure-ground perception. The model also shows how false binocular boundary matches may be eliminated by Gestalt grouping properties. In particular, the disparity filter, which helps to solve the correspondence problem by eliminating false matches, is realized using inhibitory interneurons as part of the perceptual grouping process by horizontal connections in layer 2/3 of cortical area V2. The 3D sLAMINART model simulates 3D surface percepts that are consciously seen in 18 psychophysical experiments. These percepts include contrast variations of dichoptic masking and the correspondence problem, the effect of interocular contrast differences on stereoacuity, Panum's limiting case, the Venetian blind illusion, stereopsis with polarity-reversed stereograms, da Vinci stereopsis, and perceptual closure. The model hereby illustrates a general method of unlumping rate-based models that use the membrane equations of neurophysiology into models that use spiking neurons, and which may be embodied in VLSI chips that use spiking neurons to minimize heat production.
Affiliation(s)
- Yongqiang Cao
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA
|
47
|
Silver MR, Grossberg S, Bullock D, Histed MH, Miller EK. A neural model of sequential movement planning and control of eye movements: Item-Order-Rank working memory and saccade selection by the supplementary eye fields. Neural Netw 2011; 26:29-58. [PMID: 22079270 DOI: 10.1016/j.neunet.2011.10.004] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2010] [Revised: 09/22/2011] [Accepted: 10/12/2011] [Indexed: 11/17/2022]
Abstract
How does working memory store multiple spatial positions to control sequences of eye movements, particularly when the same items repeat at multiple list positions, or ranks, during the sequence? An Item-Order-Rank model of working memory shows how rank-selective representations enable storage and recall of items that repeat at arbitrary list positions. Rank-related activity has been observed in many areas including the posterior parietal cortices (PPC), prefrontal cortices (PFC) and supplementary eye fields (SEF). The model shows how rank information, originating in PPC, may support rank-sensitive PFC working memory representations and how SEF may select saccades stored in working memory. It also proposes how SEF may interact with downstream regions such as the frontal eye fields (FEF) during memory-guided sequential saccade tasks, and how the basal ganglia (BG) may control the flow of information. Model simulations reproduce behavioral, anatomical and electrophysiological data under multiple experimental paradigms, including visually- and memory-guided single and sequential saccade tasks. Simulations reproduce behavioral data during two SEF microstimulation paradigms, showing that their seemingly inconsistent findings about saccade latency can be reconciled.
Affiliation(s)
- Matthew R Silver
- Center for Adaptive Systems, Boston University, Boston, MA 02215, USA
|
48
|
Grossberg S, Markowitz J, Cao Y. On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning. Neural Netw 2011; 24:1036-49. [PMID: 21665428 DOI: 10.1016/j.neunet.2011.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2010] [Revised: 03/30/2011] [Accepted: 04/05/2011] [Indexed: 11/30/2022]
Abstract
Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia.
Affiliation(s)
- Stephen Grossberg
- Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
|
49
|
Cao Y, Grossberg S, Markowitz J. How does the brain rapidly learn and reorganize view-invariant and position-invariant object representations in the inferotemporal cortex? Neural Netw 2011; 24:1050-61. [PMID: 21596523 DOI: 10.1016/j.neunet.2011.04.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2010] [Revised: 04/10/2011] [Accepted: 04/12/2011] [Indexed: 11/18/2022]
Abstract
All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data using functional imaging methods in humans.
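The key ingredient the abstract highlights, a view-to-object learning law in which unambiguous views outweigh ambiguous ones, can be sketched as a confidence-gated Hebbian update. This is a toy stand-in, not pARTSCAN's actual learning law; the confidence measure and all numbers are assumptions made for illustration:

```python
import numpy as np

def view_to_object_update(W, view_activity, object_index, base_rate=0.5):
    """Toy confidence-gated view-to-object learning step.

    The weight update linking view categories to an invariant
    object category is scaled by how unambiguous the view is,
    measured here (hypothetically) as the peakedness of the
    view-category activity pattern.
    """
    p = view_activity / view_activity.sum()
    # gate: 1 for a one-hot (unambiguous) view,
    # 0 for a uniform (maximally ambiguous) one
    confidence = (p.max() - 1.0 / len(p)) / (1.0 - 1.0 / len(p))
    W = W.copy()
    W[object_index] += base_rate * confidence * view_activity
    return W

W = np.zeros((3, 4))                   # 3 object categories x 4 view categories
unambiguous = np.array([0.0, 1.0, 0.0, 0.0])
ambiguous = np.array([0.3, 0.4, 0.3, 0.2])

W_u = view_to_object_update(W, unambiguous, object_index=0)
W_a = view_to_object_update(W, ambiguous, object_index=0)
# The unambiguous view changes the object representation more.
assert np.abs(W_u).sum() > np.abs(W_a).sum()
```

Under a gate like this, a briefly glimpsed or cluttered (ambiguous) view barely moves the invariant category, which is the property the abstract says keeps ambiguous views from dominating recognition.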
Affiliation(s)
- Yongqiang Cao
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science, and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
|
50
|
A neuromorphic model of spatial lookahead planning. Neural Netw 2011; 24:257-66. [DOI: 10.1016/j.neunet.2010.11.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2010] [Revised: 11/01/2010] [Accepted: 11/03/2010] [Indexed: 11/15/2022]
|