401
Potter MC. Recognition and memory for briefly presented scenes. Front Psychol 2012; 3:32. [PMID: 22371707] [PMCID: PMC3284209] [DOI: 10.3389/fpsyg.2012.00032]
Abstract
Three times per second, our eyes make a new fixation that generates a new bottom-up analysis in the visual system. How much is extracted from each glimpse? For how long and in what form is that information remembered? To answer these questions, investigators have mimicked the effect of continual shifts of fixation by using rapid serial visual presentation of sequences of unrelated pictures. Experiments in which viewers detect specified target pictures show that detection on the basis of meaning is possible at presentation durations as brief as 13 ms, suggesting that understanding may be based on feedforward processing, without feedback. In contrast, memory for what was just seen is poor unless the viewer has about 500 ms to think about the scene: the scene does not need to remain in view. Initial memory loss after brief presentations occurs over several seconds, suggesting that at least some of the information from the previous few fixations persists long enough to support a coherent representation of the current environment. In contrast to marked memory loss shortly after brief presentations, memory for pictures viewed for 1 s or more is excellent. Although some specific visual information persists, the form and content of the perceptual and memory representations of pictures over time indicate that conceptual information is extracted early and determines most of what remains in longer-term memory.
Affiliation(s)
- Mary C Potter
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology Cambridge, MA, USA
402
Abstract
Mounting evidence suggests that 'core object recognition,' the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains poorly understood. Here we review evidence ranging from individual neurons and neuronal populations to behavior and computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical subnetworks with a common functional goal.
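The "cascade of reflexive, largely feedforward computations" built from "small, canonical subnetworks" can be caricatured in a few lines of code. The sketch below is purely illustrative (the stage structure, sizes, and normalization form are our assumptions, not the authors' model): each stage filters, rectifies, max-pools, and divisively normalizes, and stacking stages yields a lower-dimensional, more invariant code.

```python
import numpy as np

def canonical_stage(x, W, pool=2):
    """One hypothetical canonical subnetwork: linear filtering,
    rectification, local max pooling, divisive normalization."""
    a = np.maximum(W @ x, 0.0)            # filter + rectify
    a = a.reshape(-1, pool).max(axis=1)   # local max pooling halves the units
    return a / (1.0 + a.sum())            # divisive normalization

rng = np.random.default_rng(0)
x = rng.random(16)                        # toy "retinal" input
W1 = rng.standard_normal((16, 16))
W2 = rng.standard_normal((4, 8))
h = canonical_stage(x, W1)                # 16 -> 8 units
out = canonical_stage(h, W2)              # 8 -> 2 units
print(out.shape)                          # (2,)
```

Repeating one canonical operation with different weights at each level is the sense in which a common functional goal can be reused across the hierarchy.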
Affiliation(s)
- James J DiCarlo
- Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
403
Oppermann F, Hassler U, Jescheniak JD, Gruber T. The rapid extraction of gist: early neural correlates of high-level visual processing. J Cogn Neurosci 2012; 24:521-9. [DOI: 10.1162/jocn_a_00100]
Abstract
The human cognitive system is highly efficient in extracting information from our visual environment. This efficiency is based on acquired knowledge that guides our attention toward relevant events and promotes the recognition of individual objects as they appear in visual scenes. The experience-based representation of such knowledge contains not only information about the individual objects but also about relations between them, such as the typical context in which individual objects co-occur. The present EEG study aimed at exploring the availability of such relational knowledge in the time course of visual scene processing, using oscillatory evoked gamma-band responses as a neural correlate for a currently activated cortical stimulus representation. Participants decided whether two simultaneously presented objects were conceptually coherent (e.g., mouse–cheese) or not (e.g., crown–mushroom). We obtained increased evoked gamma-band responses for coherent scenes compared with incoherent scenes beginning as early as 70 msec after stimulus onset within a distributed cortical network, including the right temporal, the right frontal, and the bilateral occipital cortex. This finding provides empirical evidence for the functional importance of evoked oscillatory activity in high-level vision beyond the visual cortex and, thus, gives new insights into the functional relevance of neuronal interactions. It also indicates the very early availability of experience-based knowledge that might be regarded as a fundamental mechanism for the rapid extraction of the gist of a scene.
404
Yau JM, Pasupathy A, Brincat SL, Connor CE. Curvature processing dynamics in macaque area V4. Cereb Cortex 2012; 23:198-209. [PMID: 22298729] [DOI: 10.1093/cercor/bhs004]
Abstract
We have previously analyzed shape processing dynamics in macaque monkey posterior inferotemporal cortex (PIT). We described how early PIT responses to individual contour fragments evolve into tuning for multifragment shape configurations. Here, we analyzed curvature processing dynamics in area V4, which provides feedforward inputs to PIT. We contrasted 2 hypotheses: 1) that V4 curvature tuning evolves from tuning for simpler elements, analogous to PIT shape synthesis and 2) that V4 curvature tuning emerges immediately, based on purely feedforward mechanisms. Our results clearly supported the first hypothesis. Early V4 responses carried information about individual contour orientations. Tuning for multiorientation (curved) contours developed gradually over ∼50 ms. Together, the current and previous results suggest a partial sequence for shape synthesis in ventral pathway cortex. We propose that early orientation signals are synthesized into curved contour fragment representations in V4 and that these signals are transmitted to PIT, where they are then synthesized into multifragment shape representations. The observed dynamics might additionally or alternatively reflect influences from earlier (V1, V2) and later (central and anterior IT) processing stages in the ventral pathway. In either case, the dynamics of contour information in V4 and PIT appear to reflect a sequential hierarchical process of shape synthesis.
Affiliation(s)
- Jeffrey M Yau
- Zanvyl Krieger Mind/Brain Institute and Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD 21218, USA.
405
Yue X, Biederman I, Mangini MC, Malsburg CVD, Amir O. Predicting the psychophysical similarity of faces and non-face complex shapes by image-based measures. Vision Res 2012; 55:41-6. [PMID: 22248730] [DOI: 10.1016/j.visres.2011.12.012]
Abstract
Shape representation is accomplished by a series of cortical stages in which cells in the first stage (V1) have local receptive fields tuned to contrast at a particular scale and orientation, each well modeled as a Gabor filter. In succeeding stages, the representation becomes largely invariant to Gabor coding (Kobatake & Tanaka, 1994). Because of the non-Gabor tuning in these later stages, which must be engaged for a behavioral response (Tong, 2003; Tong et al., 1998), a V1-based measure of shape similarity based on Gabor filtering would not be expected to be highly correlated with human performance when discriminating complex shapes (faces and teeth-like blobs) that differ metrically on a two-choice, match-to-sample task. Here we show that human performance is highly correlated with Gabor-based image measures (Gabor simple and complex cells), with values often in the mid 0.90s, even without discounting the variability in the speed and accuracy of performance not associated with the similarity of the distractors. This high correlation is generally maintained through the stages of HMAX, a model that builds upon the Gabor metric and develops units for complex features and larger receptive fields. This is the first report of the psychophysical similarity of complex shapes being predictable from a biologically motivated, physical measure of similarity. As accurate as these measures were for accounting for metric variation, a simple demonstration showed that all were insensitive to viewpoint invariant (nonaccidental) differences in shape.
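A Gabor-jet style similarity of the kind correlated with performance here can be sketched with plain NumPy. The filter parameters, the circular FFT convolution, and the jet-correlation similarity below are illustrative choices, not the exact measure used in the study:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real-valued Gabor: a cosine grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)         # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def gabor_jet(img, size=9, n_thetas=8):
    """Concatenated filter-response magnitudes over orientations (a 'jet')."""
    feats = []
    for k in range(n_thetas):
        g = gabor_kernel(size, size / 2, k * np.pi / n_thetas, size / 4)
        # circular 'same'-size convolution via the FFT
        resp = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(g, img.shape)))
        feats.append(np.abs(resp).ravel())
    return np.concatenate(feats)

def gabor_similarity(img_a, img_b):
    """Pearson correlation between the two images' Gabor jets."""
    return np.corrcoef(gabor_jet(img_a), gabor_jet(img_b))[0, 1]
```

An image is maximally similar to itself under this measure; metrically different distractors yield lower jet correlations, which is the quantity the study relates to discrimination performance.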
Affiliation(s)
- Xiaomin Yue
- Martinos Center for Biomedical Imaging, Massachusetts General Hospital, 149 13th Street, Suite 2301, Charlestown, MA 02129, USA.
406
Abstract
How does the brain compute? Answering this question necessitates neuronal connectomes, annotated graphs of all synaptic connections within defined brain areas. Further, understanding the energetics of the brain's computations requires vascular graphs. The assembly of a connectome requires sensitive hardware tools to measure neuronal and neurovascular features in all three dimensions, as well as software and machine learning for data analysis and visualization. We present the state of the art on the reconstruction of circuits and vasculature that link brain anatomy and function. Analysis at the scale of tens of nanometers yields connections between identified neurons, while analysis at the micrometer scale yields probabilistic rules of connection between neurons and exact vascular connectivity.
407
Riesenhuber M. Getting a handle on how the brain generates complexity. Network 2012; 23:123-127. [PMID: 22897445] [DOI: 10.3109/0954898x.2012.711918]
Abstract
Sensory processing in cortex across modalities appears to rely on a "simple-to-complex" hierarchical computational strategy in which neurons at later levels in the hierarchy combine inputs from earlier levels to create more complex neuronal selectivities. The specifics of this process are still poorly understood, however. In this issue of Network, Plebe shows how computational modeling of experimental data on neuronal tuning in secondary visual cortex can help us understand how the brain increases neuronal tuning complexity across the visual cortical hierarchy.
Affiliation(s)
- Maximilian Riesenhuber
- Laboratory for Computational Cognitive Neuroscience, Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007, USA.
408
Wu Y, Liu Y, Yuan Z, Zheng N. IAIR-CarPed: A psychophysically annotated dataset with fine-grained and layered semantic labels for object recognition. Pattern Recognit Lett 2012. [DOI: 10.1016/j.patrec.2011.10.003]
409
410
Haxby JV, Guntupalli JS, Connolly AC, Halchenko YO, Conroy BR, Gobbini MI, Hanke M, Ramadge PJ. A common, high-dimensional model of the representational space in human ventral temporal cortex. Neuron 2011; 72:404-16. [PMID: 22017997] [DOI: 10.1016/j.neuron.2011.08.026]
Abstract
We present a high-dimensional model of the representational space in human ventral temporal (VT) cortex in which dimensions are response-tuning functions that are common across individuals and patterns of response are modeled as weighted sums of basis patterns associated with these response tunings. We map response-pattern vectors, measured with fMRI, from individual subjects' voxel spaces into this common model space using a new method, "hyperalignment." Hyperalignment parameters based on responses during one experiment (movie viewing) identified 35 common response-tuning functions that captured fine-grained distinctions among a wide range of stimuli in the movie and in two category perception experiments. Between-subject classification (BSC, multivariate pattern classification based on other subjects' data) of response-pattern vectors in common model space greatly exceeded BSC of anatomically aligned responses and matched within-subject classification. Results indicate that population codes for complex visual stimuli in VT cortex are based on response-tuning functions that are common across individuals.
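The core computational step of hyperalignment is an orthogonal (Procrustes) mapping between subjects' time-by-voxel response matrices; the published procedure iterates this across many subjects and derives common model dimensions, which this minimal two-subject sketch omits:

```python
import numpy as np

def procrustes_align(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target||_F,
    mapping one subject's voxel space onto another's."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

# Toy demo: subject B sees the same "movie" as subject A, but the shared
# response-tuning functions are mixed differently across B's voxels.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))                # 100 timepoints x 20 voxels
q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
B = A @ q                                         # rotated copy of A's responses
R = procrustes_align(B, A)
print(np.allclose(B @ R, A, atol=1e-8))           # True: common space recovered
```

Because R is orthogonal, the transform re-expresses each subject's voxel pattern in the common space without distorting the geometry of the response-pattern vectors.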
Affiliation(s)
- James V Haxby
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA.
411
Rothenstein AL, Rodríguez-Sánchez AJ, Simine E, Tsotsos JK. Visual feature binding within the Selective Tuning attention framework. Int J Pattern Recogn 2011. [DOI: 10.1142/s0218001408006648]
Abstract
We present a biologically plausible computational model for solving the visual feature binding problem, based on recent results regarding the time course and processing sequence in the primate visual system. The feature binding problem appears due to the distributed nature of visual processing in the primate brain, and the gradual loss of spatial information along the processing hierarchy. This paper puts forward the proposal that by using multiple passes of the visual processing hierarchy, both bottom-up and top-down, and using task information to tune the processing prior to each pass, we can explain the different recognition behaviors that primate vision exhibits. To accomplish this, four different kinds of binding processes are introduced and are tied directly to specific recognition tasks and their time course. The model relies on the reentrant connections so ubiquitous in the primate brain to recover spatial information, and thus allow features represented in different parts of the brain to be integrated in a unitary conscious percept. We show how different tasks and stimuli have different binding requirements, and present a unified framework within the Selective Tuning model of visual attention.
Affiliation(s)
- Albert L. Rothenstein
- Department of Computer Science & Engineering and Centre for Vision Research, York University, Toronto, Ontario, Canada
- Antonio J. Rodríguez-Sánchez
- Department of Computer Science & Engineering and Centre for Vision Research, York University, Toronto, Ontario, Canada
- Evgueni Simine
- Department of Computer Science & Engineering and Centre for Vision Research, York University, Toronto, Ontario, Canada
- John K. Tsotsos
- Department of Computer Science & Engineering and Centre for Vision Research, York University, Toronto, Ontario, Canada
412
Gopych P. Biologically plausible BSDT recognition of complex images: the case of human faces. Int J Neural Syst 2011; 18:527-45. [DOI: 10.1142/s0129065708001762]
Abstract
On the basis of the recent binary signal detection theory (BSDT), optimal recognition algorithms for complex images are constructed and their optimal performance is calculated. A methodology for comparing BSDT predictions with measured human performance is developed and applied to explaining a particular face-recognition experiment. The BSDT makes possible computer code whose recognition performance exceeds that of humans, and its fundamental discreteness is consistent with the experiment. Related neurobiological and behavioral effects are briefly discussed.
Affiliation(s)
- PETRO GOPYCH
- Universal Power Systems USA-Ukraine LLC, 3 Kotsarskaya Street, Kharkiv, 61012, Ukraine
413
Crouzet SM, Serre T. What are the visual features underlying rapid object recognition? Front Psychol 2011; 2:326. [PMID: 22110461] [PMCID: PMC3216029] [DOI: 10.3389/fpsyg.2011.00326]
Abstract
Research progress in machine vision has been very significant in recent years. Robust face detection and identification algorithms are already readily available to consumers, and modern computer vision algorithms for generic object recognition are now coping with the richness and complexity of natural visual scenes. Unlike early vision models of object recognition that emphasized the role of figure-ground segmentation and spatial information between parts, recent successful approaches are based on the computation of loose collections of image features without prior segmentation or any explicit encoding of spatial relations. While these models remain simplistic models of visual processing, they suggest that, in principle, bottom-up activation of a loose collection of image features could support the rapid recognition of natural object categories and provide an initial coarse visual representation before more complex visual routines and attentional mechanisms take place. Focusing on biologically plausible computational models of (bottom-up) pre-attentive visual recognition, we review some of the key visual features that have been described in the literature. We discuss the consistency of these feature-based representations with classical theories from visual psychology and test their ability to account for human performance on a rapid object categorization task.
Affiliation(s)
- Sébastien M Crouzet
- Cognitive, Linguistic, and Psychological Sciences Department, Institute for Brain Sciences, Brown University Providence, RI, USA
414
Spratling MW. Predictive coding as a model of the V1 saliency map hypothesis. Neural Netw 2011; 26:7-28. [PMID: 22047778] [DOI: 10.1016/j.neunet.2011.10.002]
Abstract
The predictive coding/biased competition (PC/BC) model is a specific implementation of the predictive coding theory that has previously been shown to provide a detailed account of the response properties of orientation tuned cells in primary visual cortex (V1). Here it is shown that the same model can successfully simulate psychophysical data relating to the saliency of unique items in search arrays, of contours embedded in random texture, and of borders between textured regions. This model thus provides a possible implementation of the hypothesis that V1 generates a bottom-up saliency map. However, PC/BC is very different from previous models of visual salience, in that it proposes that saliency results from the failure of an internal model of simple elementary image components to accurately predict the visual input. Saliency can therefore be interpreted as a mechanism by which prediction errors attract attention in an attempt to improve the accuracy of the brain's internal representation of the world.
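The claim that saliency is a prediction error can be illustrated with a toy divisive-input-modulation loop. The dictionary, the constants, and the update rules below are a simplified caricature for illustration only, not Spratling's PC/BC model:

```python
import numpy as np

def residual_saliency(x, W, iters=50, eps1=1e-6, eps2=1e-3):
    """Explain input x with dictionary W (rows = elementary image
    components); elements the model cannot predict keep a large error."""
    y = np.zeros(W.shape[0])
    for _ in range(iters):
        e = x / (eps2 + W.T @ y)         # element-wise prediction error
        y = (eps1 + y) * (W @ e)         # units grow on unexplained input
    return x / (eps2 + W.T @ y)          # final error = saliency

W = np.array([[1., 1., 0., 0.],          # "feature" covering elements 0-1
              [0., 0., 1., 0.]])         # "feature" covering element 2
x = np.array([1., 1., 0., 1.])           # element 3 matches no feature
sal = residual_saliency(x, W)
print(int(np.argmax(sal)))               # 3: the unpredicted element pops out
```

The well-modeled elements end up with a small residual while the element no dictionary feature predicts retains a large one, which is the sense in which prediction errors attract attention.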
Affiliation(s)
- M W Spratling
- King’s College London, Department of Informatics and Division of Engineering, London, UK.
415
Gintautas V, Ham MI, Kunsberg B, Barr S, Brumby SP, Rasmussen C, George JS, Nemenman I, Bettencourt LMA, Kenyon GT. Model cortical association fields account for the time course and dependence on target complexity of human contour perception. PLoS Comput Biol 2011; 7:e1002162. [PMID: 21998562] [PMCID: PMC3188484] [DOI: 10.1371/journal.pcbi.1002162]
Abstract
Can lateral connectivity in the primary visual cortex account for the time dependence and intrinsic task difficulty of human contour detection? To answer this question, we created a synthetic image set that prevents sole reliance on either low-level visual features or high-level context for the detection of target objects. Rendered images consist of smoothly varying, globally aligned contour fragments (amoebas) distributed among groups of randomly rotated fragments (clutter). The time course and accuracy of amoeba detection by humans was measured using a two-alternative forced choice protocol with self-reported confidence and variable image presentation time (20-200 ms), followed by an image mask optimized so as to interrupt visual processing. Measured psychometric functions were well fit by sigmoidal functions with exponential time constants of 30-91 ms, depending on amoeba complexity. Key aspects of the psychophysical experiments were accounted for by a computational network model, in which simulated responses across retinotopic arrays of orientation-selective elements were modulated by cortical association fields, represented as multiplicative kernels computed from the differences in pairwise edge statistics between target and distractor images. Comparing the experimental and the computational results suggests that each iteration of the lateral interactions takes at least ms of cortical processing time. Our results provide evidence that cortical association fields between orientation selective elements in early visual areas can account for important temporal and task-dependent aspects of the psychometric curves characterizing human contour perception, with the remaining discrepancies postulated to arise from the influence of higher cortical areas.
Current computer vision algorithms reproducing the feed-forward features of the primate visual pathway still fall far behind the capabilities of human subjects in detecting objects in cluttered backgrounds. Here we investigate the possibility that recurrent lateral interactions, long hypothesized to form cortical association fields, can account for the dependence of object detection accuracy on shape complexity and image exposure time. Cortical association fields are thought to aid object detection by reinforcing global image features that cannot easily be detected by single neurons in feed-forward models. Our implementation uses the spatial arrangement, relative orientation, and continuity of putative contour elements to compute the lateral contextual support. We designed synthetic images that allowed us to control object shape and background clutter while eliminating unintentional cues to the presence of an otherwise hidden target. In contrast, real objects can vary uncontrollably in shape, are camouflaged to different degrees by background clutter, and are often associated with non-shape cues, making results using natural image sets difficult to interpret. Our computational model of cortical association fields matches many aspects of the time course and object detection accuracy of human subjects on statistically identical synthetic image sets. This implies that lateral interactions may selectively reinforce smooth object global boundaries.
Affiliation(s)
- Vadas Gintautas
- Center for Nonlinear Studies and T-5, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Physics Department, Chatham University, Pittsburgh, Pennsylvania, United States of America
- Michael I. Ham
- P-21 Applied Modern Physics (Biological and Quantum Physics), Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Benjamin Kunsberg
- New Mexico Consortium, Los Alamos, New Mexico, United States of America
- Shawn Barr
- New Mexico Consortium, Los Alamos, New Mexico, United States of America
- Steven P. Brumby
- Space and Remote Sensing Sciences, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Craig Rasmussen
- New Mexico Consortium, Los Alamos, New Mexico, United States of America
- John S. George
- P-21 Applied Modern Physics (Biological and Quantum Physics), Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Ilya Nemenman
- Departments of Physics and Biology and Computational and Life Sciences Initiative, Emory University, Atlanta, Georgia, United States of America
- Luís M. A. Bettencourt
- Center for Nonlinear Studies and T-5, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Garret T. Kenyon
- P-21 Applied Modern Physics (Biological and Quantum Physics), Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- New Mexico Consortium, Los Alamos, New Mexico, United States of America
416
Masquelier T. Relative spike time coding and STDP-based orientation selectivity in the early visual system in natural continuous and saccadic vision: a computational model. J Comput Neurosci 2011; 32:425-41. [PMID: 21938439] [DOI: 10.1007/s10827-011-0361-9]
Abstract
We have built a phenomenological spiking model of the cat early visual system comprising the retina, the Lateral Geniculate Nucleus (LGN) and V1's layer 4, and established four main results: (1) When exposed to videos that reproduce with high fidelity what a cat experiences under natural conditions, adjacent Retinal Ganglion Cells (RGCs) have spike-time correlations at a short timescale (~30 ms), despite neuronal noise and possible jitter accumulation. (2) In accordance with recent experimental findings, the LGN filters out some noise. It thus increases the spike reliability and temporal precision, the sparsity, and, importantly, further decreases adjacent cells' correlation timescale down to ~15 ms. (3) Downstream simple cells in V1's layer 4, if equipped with Spike Timing-Dependent Plasticity (STDP), may detect these fine-scale cross-correlations, and thus connect principally to ON- and OFF-centre cells with Receptive Fields (RF) aligned in the visual space, thereby becoming orientation selective, in accordance with Hubel and Wiesel's classic model (Journal of Physiology 160:106-154, 1962). Up to this point we dealt with continuous vision, and there was no absolute time reference such as a stimulus onset, yet information was encoded and decoded in the relative spike times. (4) We then simulated saccades to a static image and benchmarked relative spike time coding and time-to-first-spike coding with respect to saccade landing in the context of orientation representation. In both the retina and the LGN, relative spike times are more precise, less affected by pre-landing history and global contrast than absolute ones, and lead to robust contrast-invariant orientation representations in V1.
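Point (3) relies on standard pair-based STDP. A minimal additive implementation (the parameter values are illustrative, not those of the paper):

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.03, a_minus=0.035, tau=20.0):
    """Pair-based STDP: potentiate when the presynaptic spike precedes
    the postsynaptic spike, depress otherwise; weight clipped to [0, 1]."""
    dt = t_post - t_pre                     # ms
    if dt >= 0:
        w += a_plus * np.exp(-dt / tau)     # pre before post: LTP
    else:
        w -= a_minus * np.exp(dt / tau)     # post before pre: LTD
    return float(np.clip(w, 0.0, 1.0))

# An afferent that reliably fires 5 ms before each postsynaptic spike is
# progressively strengthened -- the mechanism by which simple cells could
# come to favour consistently co-active, spatially aligned ON/OFF inputs.
w = 0.5
for t_post in range(0, 1000, 100):
    w = stdp_update(w, t_pre=t_post - 5, t_post=t_post)
print(round(w, 3))                          # 0.734
```

Afferents whose spike times do not reliably lead the postsynaptic spike are, on average, depressed, so only the correlated inputs survive.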
Affiliation(s)
- Timothée Masquelier
- Unit for Brain and Cognition, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain.
417
418
Abstract
Is visual attention required for visual consciousness? In the past decade, many researchers have claimed that awareness can arise in the absence of attention. This claim is largely based on the notion that natural scene (or "gist") perception occurs without attention. This article presents evidence against this idea. We show that when observers perform a variety of demanding, sustained-attention tasks, inattentional blindness occurs for natural scenes. In addition, scene perception is impaired under dual-task conditions, but only when the primary task is sufficiently demanding. This finding suggests that previous studies that have been interpreted as demonstrating scene perception without attention failed to fully engage attention and that natural-scene perception does indeed require attention. Thus, natural-scene perception is not a preattentive process and cannot be used to support the idea of awareness without attention.
Affiliation(s)
- Michael A Cohen
- Department of Psychology, Harvard University, William James Hall, 33 Kirkland St., Cambridge, MA 02138, USA.
419
Clarke A, Taylor KI, Tyler LK. The evolution of meaning: spatio-temporal dynamics of visual object recognition. J Cogn Neurosci 2011; 23:1887-99. [PMID: 20617883] [DOI: 10.1162/jocn.2010.21544]
Abstract
Research on the spatio-temporal dynamics of visual object recognition suggests a recurrent, interactive model whereby an initial feedforward sweep through the ventral stream to prefrontal cortex is followed by recurrent interactions. However, critical questions remain regarding the factors that mediate the degree of recurrent interactions necessary for meaningful object recognition. The novel prediction we test here is that recurrent interactivity is driven by increasing semantic integration demands as defined by the complexity of semantic information required by the task and driven by the stimuli. To test this prediction, we recorded magnetoencephalography data while participants named living and nonliving objects during two naming tasks. We found that the spatio-temporal dynamics of neural activity were modulated by the level of semantic integration required. Specifically, source reconstructed time courses and phase synchronization measures showed increased recurrent interactions as a function of semantic integration demands. These findings demonstrate that the cortical dynamics of object processing are modulated by the complexity of semantic information required from the visual input.
Affiliation(s)
- Alex Clarke
- Centre for Speech, Language and the Brain, Department of Experimental Psychology, University of Cambridge, United Kingdom
420
Stollhoff R, Kennerknecht I, Elze T, Jost J. A computational model of dysfunctional facial encoding in congenital prosopagnosia. Neural Netw 2011; 24:652-64. [DOI: 10.1016/j.neunet.2011.03.006]
421
Mack ML, Palmeri TJ. The timing of visual object categorization. Front Psychol 2011; 2:165. [PMID: 21811480] [PMCID: PMC3139955] [DOI: 10.3389/fpsyg.2011.00165]
Abstract
An object can be categorized at different levels of abstraction: as natural or man-made, animal or plant, bird or dog, or as a Northern Cardinal or Pyrrhuloxia. There has been growing interest in understanding how quickly categorizations at different levels are made and how the timing of those perceptual decisions changes with experience. We specifically contrast two perspectives on the timing of object categorization at different levels of abstraction. By one account, the relative timing implies a relative timing of stages of visual processing that are tied to particular levels of object categorization: Fast categorizations are fast because they precede other categorizations within the visual processing hierarchy. By another account, the relative timing reflects when perceptual features are available over time and the quality of perceptual evidence used to drive a perceptual decision process: Fast simply means fast, it does not mean first. Understanding the short-term and long-term temporal dynamics of object categorizations is key to developing computational models of visual object recognition. We briefly review a number of models of object categorization and outline how they explain the timing of visual object categorization at different levels of abstraction.
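The second account ("fast simply means fast, it does not mean first") can be illustrated with a toy evidence-accumulation race; this is an illustrative sketch, not one of the reviewed models, and the drift rates, threshold, and noise level are invented. Both categorization decisions start at the same instant and differ only in the quality of their evidence (drift), so the basic-level decision tends to finish earlier without occupying an earlier stage in any processing hierarchy.

```python
import random

def first_passage_time(drift, threshold=1.0, noise=0.5, dt=0.001, rng=random):
    """Simulate one noisy evidence accumulator; return the time at which
    accumulated evidence first reaches the decision threshold."""
    x, t = 0.0, 0.0
    while x < threshold:
        # Euler step of a drift-diffusion process.
        x += drift * dt + noise * rng.gauss(0.0, dt ** 0.5)
        t += dt
    return t

rng = random.Random(42)
# Both decisions start at the same moment; only drift (evidence quality)
# differs, so "fast" reflects evidence, not processing order.
basic = [first_passage_time(3.0, rng=rng) for _ in range(200)]        # e.g., "bird"
subordinate = [first_passage_time(1.5, rng=rng) for _ in range(200)]  # e.g., "Northern Cardinal"
print(sum(basic) / len(basic) < sum(subordinate) / len(subordinate))  # True
```

The higher-drift accumulator wins the race on average even though neither decision "precedes" the other in time of onset.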
Affiliation(s)
- Michael L Mack
- Department of Psychology, The University of Texas at Austin Austin, TX, USA
422
Schmidt T, Haberkamp A, Veltkamp GM, Weber A, Seydell-Greenwald A, Schmidt F. Visual processing in rapid-chase systems: image processing, attention, and awareness. Front Psychol 2011; 2:169. [PMID: 21811484 PMCID: PMC3139957 DOI: 10.3389/fpsyg.2011.00169] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2011] [Accepted: 07/06/2011] [Indexed: 11/13/2022] Open
Abstract
Visual stimuli can be classified so rapidly that their analysis may be based on a single sweep of feedforward processing through the visuomotor system. Behavioral criteria for feedforward processing can be evaluated in response priming tasks where speeded pointing or keypress responses are performed toward target stimuli which are preceded by prime stimuli. We apply this method to several classes of complex stimuli. (1) When participants classify natural images into animals or non-animals, the time course of their pointing responses indicates that prime and target signals remain strictly sequential throughout all processing stages, meeting stringent behavioral criteria for feedforward processing (rapid-chase criteria). (2) Such priming effects are boosted by selective visual attention for positions, shapes, and colors, in a way consistent with bottom-up enhancement of visuomotor processing, even when primes cannot be consciously identified. (3) Speeded processing of phobic images is observed in participants specifically fearful of spiders or snakes, suggesting enhancement of feedforward processing by long-term perceptual learning. (4) When the perceived brightness of primes in complex displays is altered by means of illumination or transparency illusions, priming effects in speeded keypress responses can systematically contradict subjective brightness judgments, such that one prime appears brighter than the other but activates motor responses as if it was darker. We propose that response priming captures the output of the first feedforward pass of visual signals through the visuomotor system, and that this output lacks some characteristic features of more elaborate, recurrent processing. This way, visuomotor measures may become dissociated from several aspects of conscious vision. We argue that "fast" visuomotor measures predominantly driven by feedforward processing should supplement "slow" psychophysical measures predominantly based on visual awareness.
Affiliation(s)
- Thomas Schmidt
- Faculty of Social Sciences, Psychology I, University of Kaiserslautern Kaiserslautern, Germany
423
Sugase-Miyamoto Y, Matsumoto N, Kawano K. Role of temporal processing stages by inferior temporal neurons in facial recognition. Front Psychol 2011; 2:141. [PMID: 21734904 PMCID: PMC3124819 DOI: 10.3389/fpsyg.2011.00141] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2011] [Accepted: 06/12/2011] [Indexed: 11/24/2022] Open
Abstract
In this review, we focus on the role of temporal stages of encoded facial information in the visual system, which might enable the efficient determination of species, identity, and expression. Facial recognition is an important function of our brain and is known to be processed in the ventral visual pathway, where visual signals are processed through areas V1, V2, V4, and the inferior temporal (IT) cortex. In the IT cortex, neurons show selective responses to complex visual images such as faces, and at each stage along the pathway the stimulus selectivity of the neural responses becomes sharper, particularly in the later portion of the responses. In the IT cortex of the monkey, facial information is represented by different temporal stages of neural responses, as shown in our previous study: the initial transient response of face-responsive neurons represents information about global categories, i.e., human vs. monkey vs. simple shapes, whilst the later portion of these responses represents information about detailed facial categories, i.e., expression and/or identity. This suggests that the temporal stages of the neuronal firing pattern play an important role in the coding of visual stimuli, including faces. This type of coding may be a plausible mechanism underlying the temporal dynamics of recognition, including the process of detection/categorization followed by the identification of objects. Recent single-unit studies in monkeys have also provided evidence consistent with the important role of the temporal stages of encoded facial information. For example, view-invariant facial identity information is represented in the response at a later period within a region of face-selective neurons. Consistent with these findings, temporally modulated neural activity has also been observed in human studies. These results suggest a close correlation between the temporal processing stages of facial information by IT neurons and the temporal dynamics of face recognition.
Affiliation(s)
- Yasuko Sugase-Miyamoto
- Human Technology Research Institute, The National Institute of Advanced Industrial Science and Technology Tsukuba, Japan
424
Wilder J, Feldman J, Singh M. Superordinate shape classification using natural shape statistics. Cognition 2011; 119:325-40. [PMID: 21440250 PMCID: PMC3094567 DOI: 10.1016/j.cognition.2011.01.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2010] [Revised: 01/14/2011] [Accepted: 01/22/2011] [Indexed: 11/29/2022]
Abstract
This paper investigates the classification of shapes into broad natural categories such as animal or leaf. We asked whether such coarse classifications can be achieved by a simple statistical classification of the shape skeleton. We surveyed databases of natural shapes, extracting shape skeletons and tabulating their parameters within each class, seeking shape statistics that effectively discriminated the classes. We conducted two experiments in which human subjects were asked to classify novel shapes into the same natural classes. We compared subjects' classifications to those of a naive Bayesian classifier based on the natural shape statistics, and found good agreement. We conclude that human superordinate shape classifications can be well understood as involving a simple statistical classification of the shape skeleton that has been "tuned" to the natural statistics of shape.
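The kind of naive Bayesian classification over skeleton statistics that the paper describes can be sketched in a few lines; the single statistic used here (skeletal branch count) and its per-class parameters are invented for illustration, not taken from the paper's shape databases.

```python
import math

# Hypothetical per-class statistics of one skeleton parameter
# (number of skeletal branches): illustrative numbers only.
CLASS_STATS = {
    "animal": {"mean": 8.0, "sd": 2.0, "prior": 0.5},
    "leaf":   {"mean": 4.0, "sd": 1.5, "prior": 0.5},
}

def log_gaussian(x, mean, sd):
    """Log density of a Gaussian likelihood for one shape statistic."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def classify(n_branches):
    """Naive Bayes: pick the class maximizing log prior + log likelihood."""
    scores = {c: math.log(s["prior"]) + log_gaussian(n_branches, s["mean"], s["sd"])
              for c, s in CLASS_STATS.items()}
    return max(scores, key=scores.get)

print(classify(9))  # animal
print(classify(3))  # leaf
```

A fuller version would tabulate several skeleton parameters per class and sum their log likelihoods, which is the "simple statistical classification of the shape skeleton" the paper tests against human judgments.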
Affiliation(s)
- John Wilder
- Department of Psychology, Center for Cognitive Science, Rutgers University, New Brunswick, United States.
425
Evans KK, Horowitz TS, Wolfe JM. When categories collide: accumulation of information about multiple categories in rapid scene perception. Psychol Sci 2011; 22:739-46. [PMID: 21555522 PMCID: PMC3140830 DOI: 10.1177/0956797611407930] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Experiments have shown that people can rapidly determine if categories such as "animal" or "beach" are present in scenes that are presented for only a few milliseconds. Typically, observers in these experiments report on one prespecified category. For the first time, we show that observers can rapidly extract information about multiple categories. Moreover, we demonstrate task-dependent interactions between accumulating information about different categories in a scene. This interaction can be constructive or destructive, depending on whether the presence of one category can be taken as evidence for or against the presence of the other.
426
He X, Yang Z, Tsien JZ. A hierarchical probabilistic model for rapid object categorization in natural scenes. PLoS One 2011; 6:e20002. [PMID: 21647443 PMCID: PMC3102072 DOI: 10.1371/journal.pone.0020002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2010] [Accepted: 04/19/2011] [Indexed: 11/19/2022] Open
Abstract
Humans can categorize objects in complex natural scenes within 100–150 ms. This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. It is thus unclear how humans achieve rapid scene categorization. To address this issue, we developed a hierarchical probabilistic model for rapid object categorization in natural scenes. In this model, a natural object category is represented by a coarse hierarchical probability distribution (PD), which includes PDs of object geometry and spatial configuration of object parts. Object parts are encoded by PDs of a set of natural object structures, each of which is a concatenation of local object features. Rapid categorization is performed as statistical inference. Since the model uses a very small number (∼100) of structures for even complex object categories such as animals and cars, it requires little training and is robust in the presence of large variations within object categories and in their occurrences in natural scenes. Remarkably, we found that the model categorized animals in natural scenes and cars in street scenes with a near human-level performance. We also found that the model located animals and cars in natural scenes, thus overcoming a flaw in many other models which is to categorize objects in natural context by categorizing contextual features. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.
Collapse
Affiliation(s)
- Xiaofu He
- Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
- Department of Computer Science and Technology, East China Normal University, Shanghai, China
- Zhiyong Yang
- Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
- Department of Ophthalmology, Georgia Health Sciences University, Augusta, Georgia, United States of America
- * E-mail: (ZY); jtsien@georgiahealth.edu (JT)
- Joe Z. Tsien
- Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
- Department of Neurology, Georgia Health Sciences University, Augusta, Georgia, United States of America
- * E-mail: (ZY); jtsien@georgiahealth.edu (JT)
427
Kriegeskorte N. Pattern-information analysis: from stimulus decoding to computational-model testing. Neuroimage 2011; 56:411-21. [PMID: 21281719 DOI: 10.1016/j.neuroimage.2011.01.061] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2010] [Revised: 11/29/2010] [Accepted: 01/21/2011] [Indexed: 11/28/2022] Open
Abstract
Pattern-information analysis has become an important new paradigm in functional imaging. Here I review and compare existing approaches with a focus on the question of what we can learn from them in terms of brain theory. The most popular and widespread method is stimulus decoding by response-pattern classification. This approach addresses the question whether activity patterns in a given region carry information about the stimulus category. Pattern classification uses generic models of the stimulus-response relationship that do not mimic brain information processing and treats the stimulus space as categorical, a simplification that is often helpful, but also limiting in terms of the questions that can be addressed. We can address the question whether representations are consistent across different stimulus sets or tasks by cross-decoding, where the classifier is trained with one set of stimuli (or task) and tested with another. Beyond pattern classification, a major new direction is the integration of computational models of brain information processing into pattern-information analysis. This approach enables us to address the question to what extent competing computational models are consistent with the stimulus representations in a brain region. Two methods that test computational models are voxel receptive-field modeling and representational similarity analysis. These methods sample the stimulus (or mental-state) space more richly, estimate a separate response pattern for each stimulus, and can generalize from the stimulus sample to a stimulus population. Computational models that mimic brain information processing predict responses from stimuli. The reverse transform can be modeled to reconstruct stimuli from responses. Stimulus reconstruction is a challenging feat of engineering, but the implications of the results for brain theory are not always clear. Exploratory pattern analyses complement the confirmatory approaches mentioned so far and can reveal strong, unexpected effects that might be missed when testing only a restricted set of predefined hypotheses.
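Representational similarity analysis, one of the model-testing methods reviewed, can be sketched as follows: build a representational dissimilarity matrix (RDM) from measured response patterns and from model-predicted patterns, then correlate the two RDMs. The response patterns and channel counts below are invented for illustration.

```python
import itertools
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - correlation between the
    response patterns evoked by each stimulus pair (upper triangle)."""
    return [1 - pearson(patterns[i], patterns[j])
            for i, j in itertools.combinations(range(len(patterns)), 2)]

# Hypothetical response patterns (rows: stimuli, columns: channels/voxels).
brain = [[1.0, 0.2, 0.1], [0.9, 0.3, 0.2], [0.1, 1.0, 0.9]]
model = [[2.0, 0.5, 0.0], [1.8, 0.6, 0.1], [0.2, 2.1, 1.7]]

# A model fits a region to the extent that their RDMs agree, even though
# the raw patterns live in different measurement spaces.
print(pearson(rdm(brain), rdm(model)))  # high agreement
```

Because RDMs abstract away from the measurement space, the same comparison works between a brain region, a computational model, or another subject's data.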
428
Grossberg S, Markowitz J, Cao Y. On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning. Neural Netw 2011; 24:1036-49. [PMID: 21665428 DOI: 10.1016/j.neunet.2011.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2010] [Revised: 03/30/2011] [Accepted: 04/05/2011] [Indexed: 11/30/2022]
Abstract
Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia.
Affiliation(s)
- Stephen Grossberg
- Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
429
Cao Y, Grossberg S, Markowitz J. How does the brain rapidly learn and reorganize view-invariant and position-invariant object representations in the inferotemporal cortex? Neural Netw 2011; 24:1050-61. [PMID: 21596523 DOI: 10.1016/j.neunet.2011.04.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2010] [Revised: 04/10/2011] [Accepted: 04/12/2011] [Indexed: 11/18/2022]
Abstract
All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data using functional imaging methods in humans.
Affiliation(s)
- Yongqiang Cao
- Center for Adaptive Systems, Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science, and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
430
Sigala R, Logothetis NK, Rainer G. Own-species bias in the representations of monkey and human face categories in the primate temporal lobe. J Neurophysiol 2011; 105:2740-52. [PMID: 21430277 DOI: 10.1152/jn.00882.2010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Face categorization is fundamental for social interactions of primates and is crucial for determining conspecific groups and mate choice. Current evidence suggests that faces are processed by a set of well-defined brain areas. What is the fine structure of this representation, and how is it affected by visual experience? Here, we investigated the neural representations of human and monkey face categories using realistic three-dimensional morphed faces that spanned the continuum between the two species. We found an "own-species" bias in the categorical representation of human and monkey faces in the monkey inferior temporal cortex at the level of single neurons as well as in the population response analyzed using a pattern classifier. For monkey and human subjects, we also found consistent psychophysical evidence indicative of an own-species bias in face perception. For both behavioural and neural data, the species boundary was shifted away from the center of the morph continuum, for each species toward their own face category. This shift may reflect visual expertise for members of one's own species and be a signature of greater brain resources assigned to the processing of privileged categories. Such boundary shifts may thus serve as sensitive and robust indicators of encoding strength for categories of interest.
Affiliation(s)
- R Sigala
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
431
Dura-Bernal S, Wennekers T, Denham SL. The Role of Feedback in a Hierarchical Model of Object Perception. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2011; 718:165-79. [DOI: 10.1007/978-1-4614-0164-3_14] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
432
Perceptual learning in Vision Research. Vision Res 2010; 51:1552-66. [PMID: 20974167 DOI: 10.1016/j.visres.2010.10.019] [Citation(s) in RCA: 315] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2010] [Revised: 10/15/2010] [Accepted: 10/15/2010] [Indexed: 12/31/2022]
Abstract
Reports published in Vision Research during the late years of the 20th century described surprising effects of long-term sensitivity improvement with some basic visual tasks as a result of training. These improvements, found in adult human observers, were highly specific to simple visual features, such as location in the visual field, spatial-frequency, local and global orientation, and in some cases even the eye of origin. The results were interpreted as arising from the plasticity of sensory brain regions that display those features of specificity within their constituting neuronal subpopulations. A new view of the visual cortex has emerged, according to which a degree of plasticity is retained at adult age, allowing flexibility in acquiring new visual skills when the need arises. Although this "sensory plasticity" interpretation is often questioned, it is commonly believed that learning has access to detailed low-level visual representations residing within the visual cortex. More recent studies during the last decade revealed the conditions needed for learning and the conditions under which learning can be generalized across stimuli and tasks. The results are consistent with an account of perceptual learning according to which visual processing is remodeled by the brain, utilizing sensory information acquired during task performance. The stability of the visual system is viewed as an adaptation to a stable environment and instances of perceptual learning as a reaction of the brain to abrupt changes in the environment. Training on a restricted stimulus set may lead to perceptual overfitting and over-specificity. The systemic methodology developed for perceptual learning, and the accumulated knowledge, allow us to explore issues related to learning and memory in general, such as learning rules, reinforcement, memory consolidation, and neural rehabilitation. A persistent open question is the neuro-anatomical substrate underlying these learning effects.
433
Soto FA, Wasserman EA. Missing the forest for the trees: object-discrimination learning blocks categorization learning. Psychol Sci 2010; 21:1510-7. [PMID: 20817911 PMCID: PMC2953592 DOI: 10.1177/0956797610382125] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Growing evidence indicates that error-driven associative learning underlies the ability of nonhuman animals to categorize natural images. This study explored whether this form of learning might also be at play when people categorize natural objects in photographs. Two groups of college students (a blocking group and a control group) were trained on a categorization task and then tested with novel photographs from each category; however, only the blocking group received pretraining on a task that required the discrimination of objects from the same category. Because of this earlier noncategorical discrimination learning, the blocking group performed well in the categorization task from the outset, and this strong initial performance reduced the likelihood of category learning driven by error. There was far less transfer of categorical responding during testing in the blocking group than in the control group; this finding suggests that learning the specific properties of each photographic image in pretraining blocked later learning of an open-ended category.
Affiliation(s)
- Fabian A Soto
- Department of Psychology, University of Iowa, Iowa City, IA 52242, USA.
434
Blumberg J, Kreiman G. How cortical neurons help us see: visual recognition in the human brain. J Clin Invest 2010; 120:3054-63. [PMID: 20811161 PMCID: PMC2929717 DOI: 10.1172/jci42161] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Through a series of complex transformations, the pixel-like input to the retina is converted into rich visual perceptions that constitute an integral part of visual recognition. Multiple visual problems arise due to damage or developmental abnormalities in the cortex of the brain. Here, we provide an overview of how visual information is processed along the ventral visual cortex in the human brain. We discuss how neurophysiological recordings in macaque monkeys and in humans can help us understand the computations performed by visual cortex.
Affiliation(s)
- Julie Blumberg
- Department of Ophthalmology, Children’s Hospital, Harvard Medical School, Boston, Massachusetts, USA.
Epilepsy Center, University Hospital Freiburg, Freiburg, Germany.
Center for Brain Science, Harvard University, Boston, Massachusetts, USA
- Gabriel Kreiman
- Department of Ophthalmology, Children’s Hospital, Harvard Medical School, Boston, Massachusetts, USA.
Epilepsy Center, University Hospital Freiburg, Freiburg, Germany.
Center for Brain Science, Harvard University, Boston, Massachusetts, USA
435
Feldman JA. Cognitive Science should be unified: comment on Griffiths et al. and McClelland et al. Trends Cogn Sci 2010; 14:341. [DOI: 10.1016/j.tics.2010.05.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2010] [Accepted: 05/25/2010] [Indexed: 11/24/2022]
436
Hu X, Zhang B. A Gaussian attractor network for memory and recognition with experience-dependent learning. Neural Comput 2010; 22:1333-57. [PMID: 20100070 DOI: 10.1162/neco.2010.02-09-957] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Attractor networks are widely believed to underlie the memory systems of animals across different species. Existing models have succeeded in qualitatively modeling properties of attractor dynamics, but their computational abilities often suffer from poor representations for realistic complex patterns, spurious attractors, low storage capacity, and difficulty in identifying attractive fields of attractors. We propose a simple two-layer architecture, gaussian attractor network, which has no spurious attractors if patterns to be stored are uncorrelated and can store as many patterns as the number of neurons in the output layer. Meanwhile the attractive fields can be precisely quantified and manipulated. Equipped with experience-dependent unsupervised learning strategies, the network can exhibit both discrete and continuous attractor dynamics. A testable prediction based on numerical simulations is that there exist neurons in the brain that can discriminate two similar stimuli at first but cannot after extensive exposure to physically intermediate stimuli. Inspired by this network, we found that adding some local feedbacks to a well-known hierarchical visual recognition model, HMAX, can enable the model to reproduce some recent experimental results related to high-level visual perception.
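The flavor of a two-layer Gaussian attractor can be sketched with a soft nearest-pattern iteration; this is an illustrative reconstruction under stated assumptions, not the authors' published equations, and the stored patterns, cue, and width parameter are made up. Output-layer activations are Gaussian similarities to stored patterns, and the input estimate is re-synthesized as their activation-weighted mean, so a noisy cue is cleaned up toward the stored pattern whose attractive field it falls in.

```python
import math

def recall(x, memories, sigma=0.3, steps=20):
    """Iterate toward a stored pattern: output units respond with Gaussian
    similarity to their stored pattern, then the input layer is re-estimated
    as the activation-weighted mean of the stored patterns."""
    for _ in range(steps):
        acts = [math.exp(-sum((xi - mi) ** 2 for xi, mi in zip(x, m))
                         / (2 * sigma ** 2)) for m in memories]
        z = sum(acts)
        x = [sum(a * m[d] for a, m in zip(acts, memories)) / z
             for d in range(len(x))]
    return x

# Two stored (uncorrelated) patterns; the noisy cue lies in the attractive
# field of the first and converges to it.
memories = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
noisy = [0.8, 0.1, -0.1]
print([round(v, 2) for v in recall(noisy, memories)])  # [1.0, 0.0, 0.0]
```

With uncorrelated stored patterns each one sits at its own attractor, matching the paper's claim of no spurious attractors in that regime; shrinking `sigma` narrows each attractive field.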
Affiliation(s)
- Xiaolin Hu
- State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China.
437
Nagel KI, McLendon HM, Doupe AJ. Differential influence of frequency, timing, and intensity cues in a complex acoustic categorization task. J Neurophysiol 2010; 104:1426-37. [PMID: 20610781 DOI: 10.1152/jn.00028.2010] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Songbirds, which, like humans, learn complex vocalizations, provide an excellent model for the study of acoustic pattern recognition. Here we examined the role of three basic acoustic parameters in an ethologically relevant categorization task. Female zebra finches were first trained to classify songs as belonging to one of two males and then asked whether they could generalize this knowledge to songs systematically altered with respect to frequency, timing, or intensity. Birds' performance on song categorization fell off rapidly when songs were altered in frequency or intensity, but they generalized well to songs that were changed in duration by >25%. Birds were not deaf to timing changes, however; they detected these tempo alterations when asked to discriminate between the same song played back at two different speeds. In addition, when birds were retrained with songs at many intensities, they could correctly categorize songs over a wide range of volumes. Thus although they can detect all these cues, birds attend less to tempo than to frequency or intensity cues during song categorization. These results are unexpected for several reasons: zebra finches normally encounter a wide range of song volumes but most failed to generalize across volumes in this task; males produce only slight variations in tempo, but females generalized widely over changes in song duration; and all three acoustic parameters are critical for auditory neurons. Thus behavioral data place surprising constraints on the relationship between previous experience, behavioral task, neural responses, and perception. We discuss implications for models of auditory pattern recognition.
Collapse
Affiliation(s)
- Katherine I Nagel
- Keck Center for Integrative Neuroscience, Department of Physiology, University of California, San Francisco, California, USA
Collapse
|
438
|
Delorme A, Richard G, Fabre-Thorpe M. Key visual features for rapid categorization of animals in natural scenes. Front Psychol 2010; 1:21. [PMID: 21607075 PMCID: PMC3095379 DOI: 10.3389/fpsyg.2010.00021] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2010] [Accepted: 05/26/2010] [Indexed: 11/13/2022] Open
Abstract
In speeded categorization tasks, decisions may be based on diagnostic target features, or they may require the activation of a complete representation of the object. Depending on task requirements, the priming of feature detectors through top-down expectation might lower the threshold of selective units or speed up the rate of information accumulation. In the present paper, 40 subjects performed a rapid go/no-go animal/non-animal categorization task with 400 briefly flashed natural scenes to study how performance depends on physical scene characteristics, target configuration, and the presence or absence of diagnostic animal features. Performance was evaluated in terms of both accuracy and speed, and d' curves were plotted as a function of reaction time (RT). Such d' curves give an estimate of the processing dynamics for the studied features and characteristics over the entire subject population. Global image characteristics such as color and brightness do not critically influence categorization speed, although they slightly influence accuracy. Critical global factors include the presence of a canonical animal posture and the animal/background size ratio, suggesting a role for coarse global form. Performance was best, in both accuracy and speed, when the animal was in a typical posture and when it occupied about 20-30% of the image. The presence of diagnostic animal features was another critical factor. Performance was significantly impaired in both accuracy (a drop of 3.3-7.5%) and speed (a median RT increase of 7-16 ms) when diagnostic animal parts (eyes, mouth, and limbs) were missing. Such animal features were shown to influence performance very early, when only 15-25% of the responses had been produced. In agreement with other experimental and modeling studies, our results support fast diagnostic recognition of animals based on key intermediate features, and priming based on the subject's expertise.
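The d'-versus-RT analysis described in this abstract can be illustrated with a short sketch. This is a hedged toy reconstruction, not the authors' actual pipeline; the function names (`d_prime`, `d_prime_curve`) and the sample data are invented for illustration:

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def d_prime(hit_rate, fa_rate, floor=0.01):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).
    Rates are clipped away from 0 and 1 so the inverse normal CDF
    stays finite (a standard correction for extreme proportions)."""
    def clip(p):
        return min(max(p, floor), 1.0 - floor)
    return _z(clip(hit_rate)) - _z(clip(fa_rate))

def d_prime_curve(hit_rts, fa_rts, n_go, n_nogo, rt_bins):
    """Cumulative d' as a function of RT cutoff: at each cutoff, count
    the hits and false alarms already produced, convert the two
    proportions to rates over all go / no-go trials, and take d'."""
    curve = []
    for t in rt_bins:
        hr = sum(rt <= t for rt in hit_rts) / n_go
        fa = sum(rt <= t for rt in fa_rts) / n_nogo
        curve.append(d_prime(hr, fa))
    return curve

# Toy data: 4 go and 4 no-go trials, three hits and one false alarm.
curve = d_prime_curve([300, 350, 400], [450], 4, 4, [325, 375, 500])
```

Plotting `curve` against the RT cutoffs yields the kind of d'-versus-RT curve the study uses to estimate processing dynamics across the subject population.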
Collapse
Affiliation(s)
- Arnaud Delorme
- Université de Toulouse, Université Paul Sabatier, Centre de Recherche Cerveau et Cognition, Toulouse, France
- Centre National de la Recherche Scientifique, Centre de Recherche Cerveau et Cognition, Toulouse, France
- Ghislaine Richard
- Université de Toulouse, Université Paul Sabatier, Centre de Recherche Cerveau et Cognition, Toulouse, France
- Centre National de la Recherche Scientifique, Centre de Recherche Cerveau et Cognition, Toulouse, France
- Michele Fabre-Thorpe
- Université de Toulouse, Université Paul Sabatier, Centre de Recherche Cerveau et Cognition, Toulouse, France
- Centre National de la Recherche Scientifique, Centre de Recherche Cerveau et Cognition, Toulouse, France
Collapse
|
439
|
Continuous transformation learning of translation invariant representations. Exp Brain Res 2010; 204:255-70. [PMID: 20544186 DOI: 10.1007/s00221-010-2309-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2009] [Accepted: 05/21/2010] [Indexed: 01/24/2023]
Abstract
We show that spatial continuity can enable a network to learn translation invariant representations of objects by self-organization in a hierarchical model of cortical processing in the ventral visual system. During 'continuous transformation learning', the active synapses from each overlapping transform are associatively modified onto the set of postsynaptic neurons. Because other transforms of the same object overlap with previously learned exemplars, a common set of postsynaptic neurons is activated by the new transforms, and learning of the new active inputs onto the same postsynaptic neurons is facilitated. We show that the transforms must be close for this to occur; that the temporal order of presentation of each transformed image during training is not crucial for learning to occur; that relatively large numbers of transforms can be learned; and that such continuous transformation learning can be usefully combined with temporal trace training.
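A minimal sketch of the continuous-transformation idea described above, under stated assumptions: binary bar stimuli shifted one unit at a time, a winner-take-all competitive layer, and a plain Hebbian update on the winner's active synapses. The paper's actual hierarchical model of the ventral visual system is far richer; this only illustrates the chaining mechanism:

```python
import random

def bar(pos, width=3, size=12):
    """Binary input: a bar of `width` active units starting at `pos`."""
    return [1.0 if pos <= i < pos + width else 0.0 for i in range(size)]

class CTLearner:
    """Competitive layer; Hebbian updates on the winner's active synapses."""
    def __init__(self, n_in=12, n_out=4, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(0.0, 0.05) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.lr = lr

    def winner(self, x):
        acts = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        return max(range(len(acts)), key=acts.__getitem__)

    def train(self, x):
        k = self.winner(x)
        # Hebbian: strengthen the winner's synapses from active inputs.
        self.w[k] = [wi + self.lr * xi for wi, xi in zip(self.w[k], x)]

positions = range(6)  # six overlapping transforms (shift = 1, width = 3)
net = CTLearner()
for _ in range(5):
    for p in positions:
        net.train(bar(p))

winners = {p: net.winner(bar(p)) for p in positions}
```

Because each transform shares most of its active inputs with its neighbor, the neuron that wins for one position inherits strengthened synapses covering the next, so a single output neuron comes to respond to the bar at every location, a translation-invariant representation learned purely from spatial continuity.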
Collapse
|
440
|
Neuromorphic sensory systems. Curr Opin Neurobiol 2010; 20:288-95. [DOI: 10.1016/j.conb.2010.03.007] [Citation(s) in RCA: 210] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2010] [Revised: 03/22/2010] [Accepted: 03/24/2010] [Indexed: 11/17/2022]
|
441
|
Eriksson D, Valentiniene S, Papaioannou S. Relating information, encoding and adaptation: decoding the population firing rate in visual areas 17/18 in response to a stimulus transition. PLoS One 2010; 5:e10327. [PMID: 20436907 PMCID: PMC2860500 DOI: 10.1371/journal.pone.0010327] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2009] [Accepted: 03/24/2010] [Indexed: 11/18/2022] Open
Abstract
Neurons in the primary visual cortex typically reach their highest firing rate after an abrupt image transition. Since the mutual information between the firing rate and the currently presented image is largest during this early firing period, it is tempting to conclude that this early firing encodes the current image. This view is, however, complicated by the fact that the response to the current image depends on the preceding image. We therefore hypothesize that neurons encode a combination of the current and previous images, and that the weight of the current image relative to the previous image changes over time. This temporal encoding is interesting, first, because neurons are, at different time points, sensitive to different features such as luminance, edges, and textures; and second, because the temporal evolution provides temporal constraints for deciphering the instantaneous population activity. To study the temporal evolution of the encoding, we presented a sequence of 250 ms stimulus patterns during multiunit recordings in areas 17 and 18 of the anaesthetized ferret. Using a novel method, we decoded the pattern given the instantaneous population firing rate. Following a transition from stimulus A to stimulus B, the decoded stimulus during the first 90 ms was more correlated with the difference between A and B (B-A) than with B alone. After 90 ms, the decoded stimulus was more correlated with stimulus B than with B-A. Finally, we related our results to information measures for the previous (A) and current (B) stimulus. Although the initial transient conveys the majority of the stimulus-related information, we show that it actually encodes a difference image, which can be independent of the current stimulus. Only later do spikes gradually encode the stimulus more exclusively.
Collapse
Affiliation(s)
- David Eriksson
- Cortical Function and Dynamics, Max Planck Institute for Brain Research, Frankfurt, Germany.
Collapse
|
442
|
Goris RLT, de Beeck HPO. Invariance in visual object recognition requires training: a computational argument. Front Neurosci 2010; 4:71. [PMID: 20589239 PMCID: PMC2920526 DOI: 10.3389/neuro.01.012.2010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2009] [Accepted: 12/17/2009] [Indexed: 11/13/2022] Open
Abstract
Visual object recognition is remarkably accurate and robust, yet its neurophysiological underpinnings are poorly understood. Single cells in brain regions thought to underlie object recognition code for many stimulus aspects, which places a limit on their invariance. Combining the responses of multiple non-invariant neurons via weighted linear summation offers an optimal decoding strategy, which may be able to achieve invariant object recognition. However, because object identification in this model is essentially parameter optimization, the characteristics of the identification task the model is trained to perform are critically important. If this task does not require invariance, a neural population code is inherently more selective but less tolerant than the single neurons constituting the population. Nevertheless, tolerance can be learned, provided that it is trained for, at the cost of selectivity. We argue that this model is an interesting null hypothesis against which to compare behavioral results, and conclude that it may explain several experimental findings.
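The weighted-linear-summation argument can be sketched with a toy perceptron readout over units tuned to (object, position) conjunctions. The unit responses and training regimes below are invented for illustration; they are not the paper's model, only the logic of it:

```python
def perceptron(train_set, n_features, lr=0.5, epochs=20):
    """Train a linear readout (weighted summation + threshold) with the
    perceptron rule on (response_vector, label) pairs."""
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, y in train_set:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def classify(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Four non-invariant units, each tuned to one (object, position)
# conjunction: index = object (A=0, B=1) * 2 + position (0 or 1).
def response(obj, pos):
    r = [0.0] * 4
    r[obj * 2 + pos] = 1.0
    return r

# Readout trained at a single position...
w1, b1 = perceptron([(response(0, 0), 1), (response(1, 0), -1)], 4)
# ...versus a readout trained across both positions.
w2, b2 = perceptron([(response(0, p), 1) for p in (0, 1)] +
                    [(response(1, p), -1) for p in (0, 1)], 4)
```

The single-position readout assigns zero weight to the units active at the untrained position, so it cannot separate the two objects there; the readout trained across positions discriminates everywhere. That is the paper's point in miniature: tolerance emerges only if the training task demands it.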
Collapse
Affiliation(s)
- Robbe L T Goris
- Laboratory of Experimental Psychology, University of Leuven, Leuven, Belgium
Collapse
|
443
|
Soto FA, Wasserman EA. Error-driven learning in visual categorization and object recognition: a common-elements model. Psychol Rev 2010; 117:349-81. [PMID: 20438230 PMCID: PMC2930356 DOI: 10.1037/a0018695] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
A wealth of empirical evidence has now accumulated concerning animals' categorization of photographs of real-world objects. Although these complex stimuli have the advantage of fostering rapid category learning, they are difficult to manipulate experimentally and to represent in formal models of behavior. We present a solution to the representation problem in modeling natural categorization by adopting a common-elements approach. A common-elements stimulus representation, in conjunction with an error-driven learning rule, can explain a wide range of experimental outcomes in animals' categorization of naturalistic images. The model also generates novel predictions that can be empirically tested. We report two experiments that show how entirely hypothetical representational elements can nevertheless be subject to experimental manipulation. The results represent the first evidence of error-driven learning in natural image categorization, and they support the idea that basic associative processes underlie this important form of animal cognition.
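A hedged sketch of the common-elements idea with a Rescorla-Wagner style error-driven update. The element names ("wings", "beak", ...) and the trial structure are hypothetical; the model's actual elements are abstract and inferred, which is precisely the representation problem the paper addresses:

```python
def train_common_elements(trials, alpha=0.2, epochs=30):
    """Error-driven (Rescorla-Wagner) learning over stimulus elements.
    Each stimulus is a set of elements; the prediction is the summed
    associative strength of the elements present, and the shared
    prediction error is distributed over those elements."""
    v = {}
    for _ in range(epochs):
        for elements, outcome in trials:
            pred = sum(v.get(e, 0.0) for e in elements)
            delta = alpha * (outcome - pred)
            for e in elements:
                v[e] = v.get(e, 0.0) + delta
    return v

def predict(v, elements):
    return sum(v.get(e, 0.0) for e in elements)

# Category exemplars share common elements plus image-specific ones;
# the outcome (1.0) follows the "bird" images, not the "mammal" image.
trials = [({"wings", "beak", "bg1"}, 1.0),
          ({"wings", "beak", "bg2"}, 1.0),
          ({"fur", "paws", "bg3"}, 0.0)]
v = train_common_elements(trials)

# A novel image containing the shared elements inherits the learned
# response: generalization via common elements.
novel = predict(v, {"wings", "beak", "bg_new"})
```

Because the shared elements are updated on every category exemplar, they accumulate most of the associative strength, so a never-seen image containing them is classified correctly, while cue-competition effects fall out of the same error term.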
Collapse
Affiliation(s)
- Fabian A Soto
- Department of Psychology, University of Iowa, Iowa City, IA 52242, USA.
Collapse
|
444
|
Abstract
Recordings from single cells in human medial temporal cortex confirm that sensory processing forms explicit neural representations of the objects and concepts needed for a causal model of the world.
Collapse
Affiliation(s)
- Peter Földiák
- School of Psychology, University of St Andrews, St Andrews, KY16 9JP, UK.
Collapse
|
445
|
Fiser J, Berkes P, Orbán G, Lengyel M. Statistically optimal perception and learning: from behavior to neural representations. Trends Cogn Sci 2010; 14:119-30. [PMID: 20153683 PMCID: PMC2939867 DOI: 10.1016/j.tics.2010.01.003] [Citation(s) in RCA: 392] [Impact Index Per Article: 26.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2009] [Revised: 01/06/2010] [Accepted: 01/08/2010] [Indexed: 10/19/2022]
Abstract
Human perception has recently been characterized as statistical inference based on noisy and ambiguous sensory inputs. Moreover, suitable neural representations of uncertainty have been identified that could underlie such probabilistic computations. In this review, we argue that learning an internal model of the sensory environment is another key aspect of the same statistical inference procedure, and thus that perception and learning need to be treated jointly. We review evidence for statistically optimal learning in humans and animals, and re-evaluate possible neural representations of uncertainty based on their potential to support statistically optimal learning. We propose that spontaneous activity can have a functional role in such representations, leading to a new, sampling-based framework of how the cortex represents information and uncertainty.
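The sampling-based framework can be caricatured in a few lines: stochastic activity draws samples from the posterior, and the spread of the samples themselves carries the uncertainty. The random-walk Metropolis chain and the Gaussian toy problem below are assumptions for illustration only, not the review's proposal about cortical dynamics:

```python
import math
import random

def log_posterior(z, x, prior_var=1.0, noise_var=1.0):
    """Unnormalized log posterior for a Gaussian prior N(0, prior_var)
    and a Gaussian likelihood N(x | z, noise_var)."""
    return -z * z / (2 * prior_var) - (x - z) ** 2 / (2 * noise_var)

def metropolis_samples(x, n=20000, step=1.0, seed=0):
    """Random-walk Metropolis chain: a stand-in for the stochastic
    activity posited by sampling-based accounts of representation."""
    rng = random.Random(seed)
    z, lp, out = 0.0, log_posterior(0.0, x), []
    for _ in range(n):
        cand = z + rng.gauss(0.0, step)
        lp_cand = log_posterior(cand, x)
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            z, lp = cand, lp_cand
        out.append(z)
    return out

samples = metropolis_samples(x=2.0)
# For this conjugate toy problem the posterior is N(1.0, 0.5), so the
# sample mean should sit near 1.0 and the sample spread near 0.707.
posterior_mean = sum(samples) / len(samples)
spread = (sum((s - posterior_mean) ** 2 for s in samples)
          / len(samples)) ** 0.5
```

The point of the sketch is that no separate "uncertainty channel" is needed: the same stream of samples encodes both the estimate (its mean) and the uncertainty (its variability), which is the core of the sampling hypothesis.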
Collapse
Affiliation(s)
- József Fiser
- National Volen Center for Complex Systems, Brandeis University, Volen 208/MS 013, Waltham, MA 02454, USA.
Collapse
|
446
|
Willmore BDB, Prenger RJ, Gallant JL. Neural representation of natural images in visual area V2. J Neurosci 2010; 30:2102-14. [PMID: 20147538 PMCID: PMC2994536 DOI: 10.1523/jneurosci.4099-09.2010] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2009] [Revised: 12/04/2009] [Accepted: 12/15/2009] [Indexed: 11/21/2022] Open
Abstract
Area V2 is a major visual processing stage in mammalian visual cortex, but little is currently known about how V2 encodes information during natural vision. To determine how V2 represents natural images, we used a novel nonlinear system identification approach to obtain quantitative estimates of spatial tuning across a large sample of V2 neurons. We compared these tuning estimates with those obtained in area V1, in which the neural code is relatively well understood. We find two subpopulations of neurons in V2. Approximately half of the V2 neurons have tuning similar to that of V1 neurons. The other half are selective for complex features such as those that occur in natural scenes. These neurons are distinguished from V1 neurons mainly by the presence of stronger suppressive tuning. Selectivity in these neurons therefore reflects a balance between excitatory and suppressive tuning for specific features. These results provide a new perspective on how complex shape selectivity arises, emphasizing the role of suppressive tuning in determining stimulus selectivity in higher visual cortex.
Collapse
Affiliation(s)
- Ryan J. Prenger
- Physics Department, University of California, Berkeley, Berkeley, California 94720-1650
- Jack L. Gallant
- Psychology Department
- Helen Wills Neuroscience Institute
Collapse
|
447
|
Abstract
Reentrant processing has been proposed as a critical mechanism in feature binding. To test this claim, participants were shown arrays of six pairs of crossed vertical and horizontal bars. In each pair, one bar was white; one was red, green, or blue. Identifying the orientation, but not the color, of the nonwhite bar in the target item required correct binding. Four dots appeared around one of the items (the target) and either disappeared with it or persisted for 300 ms after the array disappeared. This type of trailing mask is thought to interfere with target processing by disrupting reentry. Consistent with the hypothesis that binding requires reentrant processing, the trailing mask significantly reduced the accuracy of orientation but not color judgments. In a control condition, when the white bar was omitted, binding was no longer required, and both color and orientation were accurately reported.
Collapse
Affiliation(s)
- Seth Bouvier
- Department of Psychology, Princeton University, Princeton, NJ 08540, USA.
Collapse
|
448
|
Honey CJ, Thivierge JP, Sporns O. Can structure predict function in the human brain? Neuroimage 2010; 52:766-76. [PMID: 20116438 DOI: 10.1016/j.neuroimage.2010.01.071] [Citation(s) in RCA: 444] [Impact Index Per Article: 29.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2009] [Revised: 01/17/2010] [Accepted: 01/21/2010] [Indexed: 01/07/2023] Open
Abstract
Over the past decade, scientific interest in the properties of large-scale spontaneous neural dynamics has intensified. Concurrently, novel technologies have been developed for characterizing the connective anatomy of intra-regional circuits and inter-regional fiber pathways. It will soon be possible to build computational models that incorporate these newly detailed structural network measurements to make predictions of neural dynamics at multiple scales. Here, we review the practicality and the value of these efforts, while at the same time considering in which cases and to what extent structure does determine neural function. Studies of the healthy brain, of neural development, and of pathology all yield examples of direct correspondences between structural linkage and dynamical correlation. Theoretical arguments further support the notion that brain network topology and spatial embedding should strongly influence network dynamics. Although future models will need to be tested more quantitatively and against a wider range of empirical neurodynamic features, our present large-scale models can already predict the macroscopic pattern of dynamic correlation across the brain. We conclude that as neuroscience grapples with datasets of increasing completeness and complexity, and attempts to relate the structural and functional architectures discovered at different neural scales, the value of computational modeling will continue to grow.
Collapse
|
449
|
Dehaene S, Nakamura K, Jobert A, Kuroki C, Ogawa S, Cohen L. Why do children make mirror errors in reading? Neural correlates of mirror invariance in the visual word form area. Neuroimage 2010; 49:1837-48. [PMID: 19770045 DOI: 10.1016/j.neuroimage.2009.09.024] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2009] [Revised: 09/04/2009] [Accepted: 09/15/2009] [Indexed: 01/18/2023] Open
|
450
|
Tyler CW, Likova LT. An algebra for the analysis of object encoding. Neuroimage 2009; 50:1243-50. [PMID: 20025978 DOI: 10.1016/j.neuroimage.2009.10.091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2009] [Revised: 09/29/2009] [Accepted: 10/08/2009] [Indexed: 10/20/2022] Open
Abstract
The encoding of objects from the world around us is one of the major topics of cognitive psychology, yet the principles of object coding in the human brain remain unresolved. Beyond referring to the particular features commonly associated with objects, our ability to categorize and discuss objects in detailed linguistic propositions implies that we have access to generic concepts of each object category with well-specified boundaries between them. Consideration of the nature of generic object concepts reveals that they must have the structure of a probabilistic list array specifying the Bayesian prior on all possible features that the object can possess, together with mutual covariance matrices among the features. Generic object concepts must also be largely context independent for propositions to have communicable meaning. Although there is good evidence for local feature processing in the occipital lobe and specific responses for a few basic object categories in the posterior temporal lobe, the encoding of generic object concepts remains obscure. We analyze the conceptual underpinnings of the study of object encoding, draw some necessary clarifications in relation to its modality-specific and amodal aspects, and propose an analytic algebra with specific reference to functional Magnetic Resonance Imaging approaches to the issue of how generic (amodal) object concepts are encoded in the human brain.
Collapse
|