1. Lin C, Qiao Y, Pan Y. Bio-inspired interactive feedback neural networks for edge detection. Appl Intell 2022. DOI: 10.1007/s10489-022-04316-3.
2. Peters B, Kriegeskorte N. Capturing the objects of vision with neural networks. Nat Hum Behav 2021;5:1127-1144. PMID: 34545237. DOI: 10.1038/s41562-021-01194-6.
Abstract
Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked and predicted as we engage our surroundings. Object representations emancipate perception from the sensory input, enabling us to keep in mind that which is out of sight and to use perceptual content as a basis for action and symbolic cognition. Human behavioural studies have documented how object representations emerge through grouping, amodal completion, proto-objects and object files. By contrast, deep neural network models of visual object recognition remain largely tethered to sensory input, despite achieving human-level performance at labelling objects. Here, we review related work in both fields and examine how these fields can help each other. The cognitive literature provides a starting point for the development of new experimental tasks that reveal mechanisms of human object perception and serve as benchmarks driving the development of deep neural network models that will put the object into object recognition.
Affiliation(s)
- Benjamin Peters: Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA.
- Nikolaus Kriegeskorte: Mortimer B. Zuckerman Mind Brain Behavior Institute; Department of Psychology; Department of Neuroscience; and Department of Electrical Engineering, Columbia University, New York, NY, USA.
3. Grossberg S. The Embodied Brain of SOVEREIGN2: From Space-Variant Conscious Percepts During Visual Search and Navigation to Learning Invariant Object Categories and Cognitive-Emotional Plans for Acquiring Valued Goals. Front Comput Neurosci 2019;13:36. PMID: 31333437. PMCID: PMC6620614. DOI: 10.3389/fncom.2019.00036.
Abstract
This article develops a model of how reactive and planned behaviors interact in real time. Controllers for both animals and animats need reactive mechanisms for exploration, and learned plans to efficiently reach goal objects once an environment becomes familiar. The SOVEREIGN model embodied these capabilities, and was tested in a 3D virtual reality environment. Neural models have characterized important adaptive and intelligent processes that were not included in SOVEREIGN. A major research program is summarized herein by which to consistently incorporate them into an enhanced model called SOVEREIGN2. Key new perceptual, cognitive, cognitive-emotional, and navigational processes require feedback networks which regulate resonant brain states that support conscious experiences of seeing, feeling, and knowing. Also included are computationally complementary processes of the mammalian neocortical What and Where processing streams, and homologous mechanisms for spatial navigation and arm movement control. These include: Unpredictably moving targets are tracked using coordinated smooth pursuit and saccadic movements. Estimates of target and present position are computed in the Where stream, and can activate approach movements. Motion cues can elicit orienting movements to bring new targets into view. Cumulative movement estimates are derived from visual and vestibular cues. Arbitrary navigational routes are incrementally learned as a labeled graph of angles turned and distances traveled between turns. Noisy and incomplete visual sensor data are transformed into representations of visual form and motion. Invariant recognition categories are learned in the What stream. Sequences of invariant object categories are stored in a cognitive working memory, whereas sequences of movement positions and directions are stored in a spatial working memory. 
Stored sequences trigger learning of cognitive and spatial/motor sequence categories or plans, also called list chunks, which control planned decisions and movements toward valued goal objects. Predictively successful list chunk combinations are selectively enhanced or suppressed via reinforcement learning and incentive motivational learning. Expected vs. unexpected event disconfirmations regulate these enhancement and suppressive processes. Adaptively timed learning enables attention and action to match task constraints. Social cognitive joint attention enables imitation learning of skills by learners who observe teachers from different spatial vantage points.
Affiliation(s)
- Stephen Grossberg: Center for Adaptive Systems, Graduate Program in Cognitive and Neural Systems, Departments of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering, Boston University, Boston, MA, United States.
5. Cao Y, Grossberg S. How the venetian blind percept emerges from the laminar cortical dynamics of 3D vision. Front Psychol 2014;5:694. PMID: 25309467. PMCID: PMC4160971. DOI: 10.3389/fpsyg.2014.00694.
Abstract
The 3D LAMINART model of 3D vision and figure-ground perception is used to explain and simulate a key example of the Venetian blind effect and to show how it is related to other well-known perceptual phenomena such as Panum's limiting case. The model proposes how lateral geniculate nucleus (LGN) and hierarchically organized laminar circuits in cortical areas V1, V2, and V4 interact to control processes of 3D boundary formation and surface filling-in that simulate many properties of 3D vision percepts, notably consciously seen surface percepts, which are predicted to arise when filled-in surface representations are integrated into surface-shroud resonances between visual and parietal cortex. Interactions between layers 4, 3B, and 2/3 in V1 and V2 carry out stereopsis and 3D boundary formation. Both binocular and monocular information combine to form 3D boundary and surface representations. Surface contour surface-to-boundary feedback from V2 thin stripes to V2 pale stripes combines computationally complementary boundary and surface formation properties, leading to a single consistent percept, while also eliminating redundant 3D boundaries, and triggering figure-ground perception. False binocular boundary matches are eliminated by Gestalt grouping properties during boundary formation. In particular, a disparity filter, which helps to solve the Correspondence Problem by eliminating false matches, is predicted to be realized as part of the boundary grouping process in layer 2/3 of cortical area V2. The model has been used to simulate the consciously seen 3D surface percepts in 18 psychophysical experiments. These percepts include the Venetian blind effect, Panum's limiting case, contrast variations of dichoptic masking and the correspondence problem, the effect of interocular contrast differences on stereoacuity, stereopsis with polarity-reversed stereograms, da Vinci stereopsis, and perceptual closure. 
These model mechanisms have also simulated properties of 3D neon color spreading, binocular rivalry, 3D Necker cube, and many examples of 3D figure-ground separation.
Affiliation(s)
- Stephen Grossberg: Graduate Program in Cognitive and Neural Systems, Department of Mathematics, Center for Adaptive Systems, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA.

6. Díaz-Pernas FJ, Martínez-Zarzuela M, Antón-Rodríguez M, González-Ortega D. Double recurrent interaction V1-V2-V4 based neural architecture for color natural scene boundary detection and surface perception. Appl Soft Comput 2014. DOI: 10.1016/j.asoc.2014.03.040.
7. Antón-Rodríguez M, González-Ortega D, Díaz-Pernas FJ, Martínez-Zarzuela M, Díez-Higuera JF. Color-texture image segmentation and recognition through a biologically-inspired architecture. Pattern Recognition and Image Analysis 2012. DOI: 10.1134/s1054661812010038.
8. Antón-Rodríguez M, González-Ortega D, Díaz-Pernas FJ, Martínez-Zarzuela M, de la Torre-Díez I, Boto-Giralda D, Díez-Higuera JF. Bio-inspired computer vision based on neural networks. Pattern Recognition and Image Analysis 2011. DOI: 10.1134/s1054661811020064.
9. Grossberg S, Markowitz J, Cao Y. On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning. Neural Netw 2011;24:1036-1049. PMID: 21665428. DOI: 10.1016/j.neunet.2011.04.001.
Abstract
Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia.
Affiliation(s)
- Stephen Grossberg: Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA.

10. Antón-Rodríguez M, Díaz-Pernas F, Díez-Higuera J, Martínez-Zarzuela M, González-Ortega D, Boto-Giralda D. Recognition of coloured and textured images through a multi-scale neural architecture with orientational filtering and chromatic diffusion. Neurocomputing 2009. DOI: 10.1016/j.neucom.2009.06.007.
11. Serrano-Gotarredona R, Oster M, Lichtsteiner P, Linares-Barranco A, Paz-Vicente R, Gomez-Rodriguez F, Camunas-Mesa L, Berner R, Rivas-Perez M, Delbruck T, Liu SC, Douglas R, Hafliger P, Jimenez-Moreno G, Civit Ballcels A, Serrano-Gotarredona T, Acosta-Jimenez AJ, Linares-Barranco B. CAVIAR: A 45k Neuron, 5M Synapse, 12G Connects/s AER Hardware Sensory-Processing-Learning-Actuating System for High-Speed Visual Object Recognition and Tracking. IEEE Trans Neural Netw 2009;20:1417-1438. PMID: 19635693. DOI: 10.1109/tnn.2009.2023653.
12. View-invariant object category learning, recognition, and search: How spatial and object attention are coordinated using surface-based attentional shrouds. Cogn Psychol 2009;58:1-48. DOI: 10.1016/j.cogpsych.2008.05.001.
13. Grossberg S, Yazdanbakhsh A, Cao Y, Swaminathan G. How does binocular rivalry emerge from cortical mechanisms of 3-D vision? Vision Res 2008;48:2232-2250. PMID: 18640145. DOI: 10.1016/j.visres.2008.06.024.
Abstract
Under natural viewing conditions, a single depthful percept of the world is consciously seen. When dissimilar images are presented to corresponding regions of the two eyes, binocular rivalry may occur, during which the brain consciously perceives alternating percepts through time. How do the same brain mechanisms that generate a single depthful percept of the world also cause perceptual bistability, notably binocular rivalry? What properties of brain representations correspond to consciously seen percepts? A laminar cortical model of how cortical areas V1, V2, and V4 generate depthful percepts is developed to explain and quantitatively simulate binocular rivalry data. The model proposes how mechanisms of cortical development, perceptual grouping, and figure-ground perception lead to single and rivalrous percepts. Quantitative model simulations of perceptual grouping circuits demonstrate influences of contrast changes that are synchronized with switches in the dominant eye percept, gamma distribution of dominant phase durations, piecemeal percepts, and coexistence of eye-based and stimulus-based rivalry. The model as a whole also qualitatively explains data about the involvement of multiple brain regions in rivalry, the effects of object attention on switching between superimposed transparent surfaces, monocular rivalry, Marroquin patterns, the spread of suppression during binocular rivalry, binocular summation, fusion of dichoptically presented orthogonal gratings, general suppression during binocular rivalry, and pattern rivalry. These data explanations follow from model brain mechanisms that assure non-rivalrous conscious percepts.
Affiliation(s)
- Stephen Grossberg: Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA.

14. Carpenter GA, Gaddam CS, Mingolla E. CONFIGR: a vision-based model for long-range figure completion. Neural Netw 2007;20:1109-1131. PMID: 18024082. DOI: 10.1016/j.neunet.2007.10.002.
Abstract
CONFIGR (CONtour FIgure GRound) is a computational model based on principles of biological vision that completes sparse and noisy image figures. Within an integrated vision/recognition system, CONFIGR posits an initial recognition stage which identifies figure pixels from spatially local input information. The resulting, and typically incomplete, figure is fed back to the "early vision" stage for long-range completion via filling-in. The reconstructed image is then re-presented to the recognition system for global functions such as object recognition. In the CONFIGR algorithm, the smallest independent image unit is the visible pixel, whose size defines a computational spatial scale. Once the pixel size is fixed, the entire algorithm is fully determined, with no additional parameter choices. Multi-scale simulations illustrate the vision/recognition system. Open-source CONFIGR code is available online, but all examples can be derived analytically, and the design principles applied at each step are transparent. The model balances filling-in as figure against complementary filling-in as ground, which blocks spurious figure completions. Lobe computations occur on a subpixel spatial scale. Originally designed to fill-in missing contours in an incomplete image such as a dashed line, the same CONFIGR system connects and segments sparse dots, and unifies occluded objects from pieces locally identified as figure in the initial recognition stage. The model self-scales its completion distances, filling-in across gaps of any length, where unimpeded, while limiting connections among dense image-figure pixel groups that already have intrinsic form. Long-range image completion promises to play an important role in adaptive processors that reconstruct images from highly compressed video and still camera images.
Affiliation(s)
- Gail A Carpenter: Department of Cognitive and Neural Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA.

16. Grossberg S, Kuhlmann L, Mingolla E. A neural model of 3D shape-from-texture: Multiple-scale filtering, boundary grouping, and surface filling-in. Vision Res 2007;47:634-672. PMID: 17275061. DOI: 10.1016/j.visres.2006.10.024.
Abstract
A neural model is presented of how cortical areas V1, V2, and V4 interact to convert a textured 2D image into a representation of curved 3D shape. Two basic problems are solved to achieve this: (1) Patterns of spatially discrete 2D texture elements are transformed into a spatially smooth surface representation of 3D shape. (2) Changes in the statistical properties of texture elements across space induce the perceived 3D shape of this surface representation. This is achieved in the model through multiple-scale filtering of a 2D image, followed by a cooperative-competitive grouping network that coherently binds texture elements into boundary webs at the appropriate depths using a scale-to-depth map and a subsequent depth competition stage. These boundary webs then gate filling-in of surface lightness signals in order to form a smooth 3D surface percept. The model quantitatively simulates challenging psychophysical data about perception of prolate ellipsoids [Todd, J., & Akerstrom, R. (1987). Perception of three-dimensional form from patterns of optical texture. Journal of Experimental Psychology: Human Perception and Performance, 13(2), 242-255]. In particular, the model represents a high degree of 3D curvature for a certain class of images, all of whose texture elements have the same degree of optical compression, in accordance with percepts of human observers. Simulations of 3D percepts of an elliptical cylinder, a slanted plane, and a photo of a golf ball are also presented.
Affiliation(s)
- Stephen Grossberg: Department of Cognitive and Neural Systems, Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA.

18. Serrano-Gotarredona R, Serrano-Gotarredona T, Acosta-Jimenez A, Linares-Barranco B. A Neuromorphic Cortical-Layer Microchip for Spike-Based Event Processing Vision Systems. IEEE Trans Circuits Syst I 2006. DOI: 10.1109/tcsi.2006.883843.
19. Linares-Barranco A, Jimenez-Moreno G, Linares-Barranco B, Civit-Balcells A. On algorithmic rate-coded AER generation. IEEE Trans Neural Netw 2006;17:771-788. PMID: 16722179. DOI: 10.1109/tnn.2006.872253.
Abstract
This paper addresses the problem of converting a conventional frame-based video stream into the spike event-based representation known as address-event representation (AER); here we concentrate on rate-coded AER. The problem is treated algorithmically: different methods are proposed, implemented, and tested as software algorithms, and evaluated comparatively against different criteria. Emphasis is placed on the potential of such algorithms for (a) performing the frame-based to event-based conversion in real time, and (b) producing event streams that resemble as closely as possible those generated naturally by rate-coded address-event VLSI chips, such as silicon AER retinae. Simple and straightforward algorithms tend to have high potential for real time but produce event distributions that differ considerably from those obtained in AER VLSI chips; conversely, sophisticated algorithms that yield better event distributions are not efficient for real-time operation. The methods based on linear-feedback-shift-register (LFSR) pseudorandom number generation are a good compromise: they are feasible in real time and yield reasonably well-distributed events in time. Our software experiments, on a 1.6-GHz Pentium IV, show that at 50% AER bus load the proposed algorithms require between 0.011 and 1.14 ms per 8-bit pixel per frame. One of the proposed LFSR methods is implemented in real-time hardware using a prototyping board that includes a VirtexE 300 FPGA. The demonstration hardware transforms frames of 64 x 64 pixels at 8-bit depth at a rate of 25 frames per second, producing spike events at a peak rate of 10^7 events per second.
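As a toy illustration of the frame-to-AER conversion problem described above (the function name and scheme are ours, not the paper's algorithms), a rate-coded converter can emit one address event per intensity count and interleave the events pseudorandomly:

```python
import random

def frame_to_aer(frame):
    """Naive rate-coded AER generation: each pixel of an 8-bit frame
    emits one (x, y) address event per intensity count, and the events
    are shuffled into a pseudorandom order -- a crude stand-in for the
    LFSR-based orderings evaluated in the paper."""
    events = []
    for y, row in enumerate(frame):
        for x, intensity in enumerate(row):
            events.extend([(x, y)] * intensity)  # rate coding: brighter -> more events
    random.seed(0)        # deterministic shuffle for the example
    random.shuffle(events)
    return events

frame = [[0, 255], [128, 64]]   # tiny 2x2 "video frame"
events = frame_to_aer(frame)
```

A real AER pipeline would also timestamp and transmit the events; this sketch only shows why brighter pixels dominate the event stream.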
20. Keil MS, Cristóbal G, Neumann H. Gradient representation and perception in the early visual system--a novel account of Mach band formation. Vision Res 2006;46:2659-2674. PMID: 16603218. DOI: 10.1016/j.visres.2006.01.038.
Abstract
Recent evidence suggests that object surfaces and their properties are represented at early stages in the primate visual system. Most likely, invariant surface properties are extracted to endow primates with robust object-recognition capabilities. In real-world scenes, luminance gradients are often superimposed on surfaces. We argue that gradients should also be represented in the visual system, since they encode highly variable information such as shading, focal blur, and penumbral blur. We present a neuronal architecture designed and optimized for segregating and representing luminance gradients in real-world images. In addition, our architecture provides a novel theory of Mach bands, whereby the corresponding psychophysical data are predicted consistently.
Affiliation(s)
- Matthias S Keil: Computer Vision Center (Universitat Autònoma), E-08193 Bellaterra, Spain.

21. Keil MS, Cristóbal G, Hansen T, Neumann H. Recovering real-world images from single-scale boundaries with a novel filling-in architecture. Neural Netw 2005;18:1319-1331. PMID: 16039097. DOI: 10.1016/j.neunet.2005.04.003.
Abstract
Filling-in models were successful in predicting psychophysical data for brightness perception. Nevertheless, their suitability for real-world image processing has never been examined. A unified architecture for both predicting psychophysical data and real-world image processing would constitute a powerful theory for early visual information processing. As a first contribution of the present paper, we identified three principal problems with current filling-in architectures, which hamper the goal of having such a unified architecture. To overcome these problems we propose an advance to filling-in theory, called BEATS filling-in, which is based on a novel nonlinear diffusion operator. BEATS filling-in furthermore introduces novel boundary structures. We compare, by means of simulation studies with real-world images, the performance of BEATS filling-in with the recently proposed confidence-based filling-in. As a second contribution we propose a novel mechanism for encoding luminance information in contrast responses ('multiplex contrasts'), which is based on recent neurophysiological findings. Again, by simulations, we show that 'multiplex contrasts' at a single, high-resolution filter scale are sufficient for recovering absolute luminance levels. Hence, 'multiplex contrasts' represent a novel theory addressing how the brain encodes and decodes luminance information.
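The shared idea behind such filling-in architectures can be sketched as boundary-gated diffusion. The following is a plain diffusion sketch under our own assumptions (function name, permeability map, and update rule are ours); it is not the BEATS operator or confidence-based filling-in:

```python
import numpy as np

def fill_in(features, permeability, iters=300, rate=0.25):
    """Boundary-gated filling-in sketch: feature activity flows between
    4-neighbours, gated by a permeability map that is ~0 across
    boundaries and ~1 inside regions, so activity spreads within a
    surface but not across its boundary."""
    act = features.astype(float).copy()
    p = permeability.astype(float)
    h, w = act.shape
    for _ in range(iters):
        a = np.pad(act, 1, mode='edge')   # replicate borders: no flux off-image
        g = np.pad(p, 1, mode='edge')
        flux = np.zeros_like(act)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb_a = a[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            nb_p = g[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            # exchange is gated by the smaller permeability of the pair,
            # which makes the update conservative and boundary-blocking
            flux += np.minimum(p, nb_p) * (nb_a - act)
        act += rate * flux
    return act
```

With a zero-permeability wall down one column, activity seeded on one side equilibrates within that region and never leaks across, which is the qualitative behaviour the filling-in literature requires.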
Affiliation(s)
- Matthias S Keil: Centre de Visió per Computador, Edifici O, Campus UAB, E-08193 Bellaterra, Cerdanyola, Barcelona, Spain.

22. A Functional Simplification of the BCS/FCS Image Segmentation. Pattern Recognition and Image Analysis 2005. DOI: 10.1007/11492429_14.
23.
Abstract
A recurrent network is proposed with the ability to bind image features into a unified surface representation within a single layer and without capacity limitations or border effects. A group of cells belonging to the same object or surface is labeled with the same activity amplitude, while cells in different groups are kept segregated due to lateral inhibition. Labeling is achieved by activity spreading through local excitatory connections. In order to prevent uncontrolled spreading, a separate network computes the intensity difference between neighboring locations and signals the presence of the surface boundary, which constrains local excitation. The quality of surface representation is not compromised due to the self-excitation. The model is also applied on gray-level images. In order to remove small, noisy regions, a feedforward network is proposed that computes the size of surfaces. Size estimation is based on the difference of dendritic inhibition in lateral excitatory and inhibitory pathways, which allows the network to selectively integrate signals only from cells with the same activity amplitude. When the output of the size estimation network is combined with the recurrent network, good segmentation results are obtained. Both networks are based on biophysically realistic mechanisms such as dendritic inhibition and multiplicative integration among different dendritic branches.
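Functionally, the labeling stage described above computes a boundary-constrained connected-component grouping: cells reachable without crossing a boundary end up sharing one activity label. A conventional flood-fill sketch of that computation (the function and data layout are ours; the model implements it with recurrent neural dynamics and lateral inhibition, not a queue):

```python
from collections import deque

def label_surfaces(boundary):
    """Assign each non-boundary cell a label shared by every cell
    reachable from it without crossing a boundary -- the grouping the
    recurrent network computes via activity spreading. `boundary` is a
    2D list of 0/1; returns a grid of labels (0 marks boundary cells)."""
    h, w = len(boundary), len(boundary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for sy in range(h):
        for sx in range(w):
            if boundary[sy][sx] or labels[sy][sx]:
                continue
            current += 1                      # new surface, new label
            queue = deque([(sy, sx)])
            labels[sy][sx] = current
            while queue:                      # spread the label locally
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and not boundary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = current
                        queue.append((ny, nx))
    return labels
```

The model's contribution is achieving this grouping with biophysically plausible mechanisms and without capacity limits, not the grouping itself.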
Affiliation(s)
- Drazen Domijan: Department of Psychology, Faculty of Philosophy, University of Rijeka, Trg Ivana Klobucarica 1, HR-51000 Rijeka, Croatia.

24. Grossberg S, Swaminathan G. A laminar cortical model for 3D perception of slanted and curved surfaces and of 2D images: development, attention, and bistability. Vision Res 2004;44:1147-1187. PMID: 15050817. DOI: 10.1016/j.visres.2003.12.009.
Abstract
A model of laminar visual cortical dynamics proposes how 3D boundary and surface representations arise from viewing slanted and curved 3D objects and 2D images. The 3D boundary representations emerge from non-classical receptive field interactions within intracortical and intercortical feedback circuits. Such non-classical interactions within cortical areas V1 and V2 contextually disambiguate classical receptive field responses to ambiguous visual cues using cells that are sensitive to colinear contours, angles, and disparity gradients. Remarkably, these cell types can all be explained as variants of a unified perceptual grouping circuit whose most familiar example is a 2D colinear bipole cell. Model simulations show how this circuit can develop cell selectivity to colinear contours and angles, how slanted surfaces can activate 3D boundary representations that are sensitive to angles and disparity gradients, how 3D filling-in occurs across slanted surfaces, how a 2D Necker cube image can be represented in 3D, and how bistable 3D Necker cube percepts occur. The model also explains data about slant aftereffects and 3D neon color spreading. It shows how chemical transmitters that habituate, or depress, in an activity-dependent way can help to control development and also to trigger bistable 3D percepts and slant aftereffects. Attention can influence which of these percepts is perceived by propagating selectively along object boundaries.
Affiliation(s)
- Stephen Grossberg: Department of Cognitive and Neural Systems and Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA.

25. Hansen T, Neumann H. A simple cell model with dominating opponent inhibition for robust image processing. Neural Netw 2004;17:647-662. PMID: 15288890. DOI: 10.1016/j.neunet.2004.04.002.
Abstract
The extraction of oriented contrast information by cortical simple cells is a fundamental step in early visual processing. The orientation selectivity originates at least partly from the input of lateral geniculate nuclei neurons with properly aligned receptive fields. In the present article, we investigate the feedforward interactions between on- and off-pathways. Based on physiological evidence we propose a push-pull model with dominating opponent inhibition (DOI). We show that the model can account for empirical data on simple cells, such as contrast-invariant orientation tuning, sharpening of orientation tuning with increasing inhibition, and strong response decrements to stimuli with luminance gradient reversal. With identical parameter settings, we apply the model for the processing of synthetic and real world images. We show that the model with DOI can robustly extract oriented contrast information from noisy input. More important, noise is adaptively suppressed, i.e. the model simple cells do not respond to homogeneous regions of different noise levels, while remaining sensitive to small contrast changes. The image processing results reveal a possible functional role of the strong inhibition as observed empirically, namely to adaptively suppress responses to noisy input.
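The core computational idea, opponent inhibition that dominates excitation, can be caricatured in one dimension (our simplification under our own parameter choices, not the paper's 2D, physiologically tuned model):

```python
import numpy as np

def doi_simple_cell(luminance, g=2.0):
    """1D caricature of a push-pull simple cell with dominating opponent
    inhibition (inhibitory gain g > 1). On- and off-contrast signals are
    computed from local luminance differences; the off-pathway inhibits
    with gain g, pooled over a small neighbourhood, so incoherent (noisy)
    contrast activates both pathways and cancels, while a coherent edge
    drives only one pathway and survives."""
    lum = np.asarray(luminance, dtype=float)
    d = np.diff(lum)
    on = np.maximum(d, 0.0)          # on-pathway: luminance increments
    off = np.maximum(-d, 0.0)        # off-pathway: luminance decrements
    off_pool = np.convolve(off, np.ones(3), mode='same')
    return np.maximum(on - g * off_pool, 0.0)   # half-wave rectified output
```

A clean step edge passes through at full strength, while an alternating noise pattern, which excites both pathways at adjacent positions, is suppressed entirely, mirroring the adaptive noise suppression reported in the abstract.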
Affiliation(s)
- Thorsten Hansen
- Department of Psychology, Giessen University, D-35394 Giessen, Germany.
|
26
|
Hong S, Grossberg S. A neuromorphic model for achromatic and chromatic surface representation of natural images. Neural Netw 2004; 17:787-808. [PMID: 15288898 DOI: 10.1016/j.neunet.2004.02.007] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 02/13/2004] [Accepted: 02/13/2004]
Abstract
This study develops a neuromorphic model of human lightness perception that is inspired by how the mammalian visual system is designed for this function. It is known that biological visual representations can adapt to a billion-fold change in luminance. How such a system determines absolute lightness under varying illumination conditions to generate a consistent interpretation of surface lightness remains an unsolved problem. Such a process, called 'anchoring' of lightness, has properties including articulation, insulation, configuration, and area effects. The model quantitatively simulates such psychophysical lightness data, as well as other data such as discounting the illuminant, and lightness constancy and contrast effects. The model retina embodies gain control at retinal photoreceptors, and spatial contrast adaptation at the negative feedback circuit between mechanisms that model the inner segment of photoreceptors and interacting horizontal cells. The model can thereby adjust its sensitivity to input intensities ranging from dim moonlight to dazzling sunlight. A new anchoring mechanism, called the Blurred-Highest-Luminance-As-White rule, helps simulate how surface lightness becomes sensitive to the spatial scale of objects in a scene. The model is also able to process natural color images under variable lighting conditions, and is compared with the popular RETINEX model.
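A minimal sketch of the anchoring idea described above, under our own simplified assumptions (a box blur stands in for the model's biologically grounded filtering, and the function and parameter names are ours): blur the luminance image before taking the maximum, so that an isolated highlight cannot set the white anchor, then rescale lightness against that anchor.

```python
import numpy as np

def anchor_lightness(luminance, blur_size=3):
    """Blurred-Highest-Luminance-As-White style anchoring (sketch):
    a tiny specular spot is averaged away by the blur, so the white
    anchor tracks the highest *regional* luminance, and surface
    lightness becomes sensitive to the spatial scale of scene objects."""
    k = blur_size
    padded = np.pad(luminance, k // 2, mode='edge')
    blurred = np.zeros_like(luminance, dtype=float)
    h, w = luminance.shape
    for i in range(h):
        for j in range(w):
            blurred[i, j] = padded[i:i + k, j:j + k].mean()
    anchor = blurred.max()          # highest blurred luminance = white
    return np.clip(luminance / anchor, 0.0, 1.0)

# A single bright pixel (e.g. a specular highlight) barely shifts the
# anchor, so the rest of the scene keeps a stable lightness assignment.
scene = np.full((5, 5), 0.4)
scene[2, 2] = 1.0                   # isolated highlight
print(anchor_lightness(scene))
```

Anchoring on the raw maximum instead would assign the whole 0.4 background a lightness of 0.4; blurring first keeps it near 0.86, illustrating why the blur matters.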
Affiliation(s)
- Simon Hong
- Department of Cognitive and Neural Systems, Center for Adaptive Systems, Boston University, 677 Beacon Street, Boston, MA 02215, USA
|
27
|
Tong WS, Tang CK, Mordohai P, Medioni G. First order augmentation to tensor voting for boundary inference and multiscale analysis in 3D. IEEE Trans Pattern Anal Mach Intell 2004; 26:594-611. [PMID: 15460281 DOI: 10.1109/tpami.2004.1273934] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6]
Abstract
Most computer vision applications require the reliable detection of boundaries. In the presence of outliers, missing data, orientation discontinuities, and occlusion, this problem is particularly challenging. We propose to address it by complementing the tensor voting framework, which was limited to second order properties, with first order representation and voting. First order voting fields and a mechanism to vote for 3D surface and volume boundaries and curve endpoints in 3D are defined. Boundary inference is also useful for a second difficult problem in grouping, namely, automatic scale selection. We propose an algorithm that automatically infers the smallest scale that can preserve the finest details. Our algorithm then proceeds with progressively larger scales to ensure continuity where it has not been achieved. Therefore, the proposed approach does not oversmooth features or delay the handling of boundaries and discontinuities until model misfit occurs. The interaction of smooth features, boundaries, and outliers is accommodated by the unified representation, making possible the perceptual organization of data in curves, surfaces, volumes, and their boundaries simultaneously. We present results on a variety of data sets to show the efficacy of the improved formalism.
Affiliation(s)
- Wai-Shun Tong
- Department of Computer Science, Hong Kong University of Science & Technology, Clear Water Bay, Hong Kong.
|
28
|
Bhanu B, Fonder S. Learning-integrated Interactive Image Segmentation. Natural Computing Series 2003. [DOI: 10.1007/978-3-642-18965-4_35] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0]
|
29
|
Goldberg DH, Cauwenberghs G, Andreou AG. Probabilistic synaptic weighting in a reconfigurable network of VLSI integrate-and-fire neurons. Neural Netw 2001; 14:781-93. [PMID: 11665770 DOI: 10.1016/s0893-6080(01)00057-0] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.0]
Abstract
We present a scheme for implementing highly connected, reconfigurable networks of integrate-and-fire neurons in VLSI. Neural activity is encoded by spikes, where the address of an active neuron is communicated through an asynchronous request and acknowledgement cycle. We employ probabilistic transmission of spikes to implement continuous-valued synaptic weights, and memory-based look-up tables to implement arbitrary interconnection topologies. The scheme is modular and scalable, and lends itself to the implementation of multi-chip network architectures. Results from a prototype system with 1024 analog VLSI integrate-and-fire neurons, each with up to 128 probabilistic synapses, demonstrate these concepts in an image processing task.
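The core idea of probabilistic spike transmission, sketched here as a hypothetical software analogue of the chip's mechanism (names and rates are ours): each spike crosses the synapse with probability equal to the weight, so a one-bit, all-or-none channel realizes a continuous-valued weight statistically.

```python
import random

def probabilistic_synapse(n_spikes, weight, rng=None):
    """Deliver each of n_spikes presynaptic spikes independently with
    probability `weight` (in [0, 1]). Individual events stay all-or-none,
    but the delivered count approximates weight * n_spikes on average,
    realizing an analog synaptic weight over a binary spike channel."""
    rng = rng or random.Random()
    return sum(1 for _ in range(n_spikes) if rng.random() < weight)

rng = random.Random(0)
delivered = probabilistic_synapse(10_000, 0.35, rng)
print(delivered / 10_000)  # close to the nominal weight 0.35
```

Averaging over many spikes trades temporal precision for weight resolution, which is why the scheme suits rate-coded tasks such as the image-processing demonstration reported in the paper.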
Affiliation(s)
- D H Goldberg
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.
|
30
|
Computational Neural Models of Spatial Integration in Perceptual Grouping. From Fragments to Objects - Segmentation and Grouping in Vision 2001. [DOI: 10.1016/s0166-4115(01)80032-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8]
|