1
Lee GM, Rodríguez Deliz CL, Bushnell BN, Majaj NJ, Movshon JA, Kiorpes L. Developmentally stable representations of naturalistic image structure in macaque visual cortex. Cell Rep 2024; 43:114534. PMID: 39067025. DOI: 10.1016/j.celrep.2024.114534.
Abstract
To determine whether post-natal improvements in form vision result from changes in mid-level visual cortex, we studied neuronal and behavioral responses to texture stimuli that were matched in local spectral content but varied in "naturalistic" structure. We made longitudinal measurements of visual behavior from 16 to 95 weeks of age, and of neural responses from 20 to 56 weeks. We also measured behavioral and neural responses in near-adult animals more than 3 years old. Behavioral sensitivity reached half-maximum around 25 weeks of age, but neural sensitivities remained stable through all ages tested. Neural sensitivity to naturalistic structure was highest in V4, lower in V2 and inferotemporal cortex (IT), and barely discernible in V1. Our results show a dissociation between stable neural performance and improving behavioral performance, which may reflect improved processing capacity in circuits downstream of visual cortex.
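Texture stimuli "matched in local spectral content" are conventionally contrasted with phase-scrambled counterparts: images that keep the Fourier amplitude spectrum but have randomized phases. As an illustration only (the study itself used Portilla-Simoncelli texture synthesis, which this sketch does not reproduce), a minimal numpy phase scramble:

```python
import numpy as np

def phase_scramble(img, seed=None):
    """Keep the Fourier amplitude spectrum of `img` (its local spectral
    content) but replace every phase with that of a white-noise image.
    Using a real noise image keeps the spectrum Hermitian-symmetric,
    so the output is real."""
    rng = np.random.default_rng(seed)
    amplitude = np.abs(np.fft.fft2(img))
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(img.shape)))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * noise_phase)))

img = np.random.default_rng(0).standard_normal((64, 64))
scr = phase_scramble(img, seed=1)
```

The scrambled image has the same power at every spatial frequency as the original, so any neuronal or behavioral preference for the original over the scramble must reflect sensitivity to higher-order ("naturalistic") structure.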
Affiliation(s)
- Gerick M Lee
- Center for Neural Science, New York University, New York, NY 10003, USA
- Najib J Majaj
- Center for Neural Science, New York University, New York, NY 10003, USA
- J Anthony Movshon
- Center for Neural Science, New York University, New York, NY 10003, USA
- Lynne Kiorpes
- Center for Neural Science, New York University, New York, NY 10003, USA
2
Quaia C, Krauzlis RJ. Object recognition in primates: what can early visual areas contribute? Front Behav Neurosci 2024; 18:1425496. PMID: 39070778. PMCID: PMC11272660. DOI: 10.3389/fnbeh.2024.1425496.
Abstract
Introduction: If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical. Methods: To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background. Results: We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse. Discussion: Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
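The "V1 simple or complex cells" in such models are standardly built from Gabor filters; a complex cell is the classic energy model, summing the squared outputs of a quadrature (90-degree-phase-offset) pair so the response is selective for orientation and scale but invariant to phase. A minimal sketch with made-up parameters (not the authors' model):

```python
import numpy as np

def gabor(size, sf, theta, phase, sigma):
    """2-D Gabor filter: an oriented sinusoid (sf in cycles/pixel,
    theta in radians) under an isotropic Gaussian envelope."""
    y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return env * np.cos(2 * np.pi * sf * xr + phase)

def complex_cell(img, sf=0.1, theta=0.0, sigma=4.0, size=32):
    """Energy-model response: square and sum a quadrature Gabor pair,
    giving orientation/scale selectivity without phase selectivity."""
    even = gabor(size, sf, theta, 0.0, sigma)
    odd = gabor(size, sf, theta, np.pi / 2, sigma)
    return np.sum(img * even) ** 2 + np.sum(img * odd) ** 2

# Phase invariance: two gratings that differ only in phase drive the
# model almost identically (sigma=1e6 approximates a plain grating).
g0 = gabor(32, 0.1, 0.0, 0.0, 1e6)
g1 = gabor(32, 0.1, 0.0, np.pi / 3, 1e6)
r0, r1 = complex_cell(g0), complex_cell(g1)
```

A simple-cell model would keep the (rectified) output of a single filter instead, and so would remain phase-selective.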
Affiliation(s)
- Christian Quaia
- Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, United States
3
Walper D, Bendixen A, Grimm S, Schubö A, Einhäuser W. Attention deployment in natural scenes: Higher-order scene statistics rather than semantics modulate the N2pc component. J Vis 2024; 24:7. PMID: 38848099. PMCID: PMC11166226. DOI: 10.1167/jov.24.6.7.
Abstract
Which properties of a natural scene affect visual search? We consider the alternative hypotheses that low-level statistics, higher-level statistics, semantics, or layout affect search difficulty in natural scenes. Across three experiments (n = 20 each), we used four different backgrounds that preserve distinct scene properties: (a) natural scenes (all experiments); (b) 1/f noise (pink noise, which preserves only low-level statistics and was used in Experiments 1 and 2); (c) textures that preserve low-level and higher-level statistics but not semantics or layout (Experiments 2 and 3); and (d) inverted (upside-down) scenes that preserve statistics and semantics but not layout (Experiment 2). We included "split scenes" that contained different backgrounds left and right of the midline (Experiment 1, natural/noise; Experiment 3, natural/texture). Participants searched for a Gabor patch that occurred at one of six locations (all experiments). Reaction times were faster for targets on noise and slower on inverted images, compared to natural scenes and textures. The N2pc component of the event-related potential, a marker of attentional selection, had a shorter latency and a higher amplitude for targets in noise than for all other backgrounds. The background contralateral to the target had an effect similar to that on the target side: noise led to faster reactions and shorter N2pc latencies than natural scenes, although we observed no difference in N2pc amplitude. There were no interactions between the target side and the non-target side. Together, this shows that, at least when searching for simple targets without semantic content of their own, natural scenes are more effective distractors than noise, and that this results from higher-order statistics rather than from semantics or layout.
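The 1/f ("pink") noise backgrounds used in Experiments 1 and 2 can be synthesized by shaping the Fourier amplitude of white noise to fall as 1/f, which reproduces the average second-order statistics of natural scenes without their higher-order structure. A sketch (not the authors' stimulus code):

```python
import numpy as np

def pink_noise(n, seed=None):
    """n x n image whose Fourier amplitude falls as 1/f, matching the
    average amplitude spectrum of natural scenes while carrying none
    of their higher-order structure. Returned zero-mean, unit-variance."""
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fy, fx)
    f[0, 0] = 1.0                      # avoid division by zero at DC
    spectrum = np.fft.fft2(rng.standard_normal((n, n))) / f
    spectrum[0, 0] = 0.0               # force a zero-mean image
    img = np.real(np.fft.ifft2(spectrum))
    return img / img.std()

noise = pink_noise(128, seed=0)
```

Because the white-noise input is real, the shaped spectrum stays Hermitian-symmetric and the inverse transform is real.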
Affiliation(s)
- Daniel Walper
- Physics of Cognition Group, Chemnitz University of Technology, Chemnitz, Germany
- Alexandra Bendixen
- Cognitive Systems Lab, Chemnitz University of Technology, Chemnitz, Germany
- https://www.tu-chemnitz.de/physik/SFKS/index.html.en
- Sabine Grimm
- Physics of Cognition Group, Chemnitz University of Technology, Chemnitz, Germany
- Cognitive Systems Lab, Chemnitz University of Technology, Chemnitz, Germany
- Anna Schubö
- Cognitive Neuroscience of Perception & Action, Philipps University Marburg, Marburg, Germany
- https://www.uni-marburg.de/en/fb04/team-schuboe
- Wolfgang Einhäuser
- Physics of Cognition Group, Chemnitz University of Technology, Chemnitz, Germany
- https://www.tu-chemnitz.de/physik/PHKP/index.html.en
4
Lee GM, Rodríguez-Deliz CL, Bushnell BN, Majaj NJ, Movshon JA, Kiorpes L. Developmentally stable representations of naturalistic image structure in macaque visual cortex. bioRxiv 2024:2024.02.24.581889. PMID: 38463955. PMCID: PMC10925106. DOI: 10.1101/2024.02.24.581889.
Abstract
We studied visual development in macaque monkeys using texture stimuli, matched in local spectral content but varying in "naturalistic" structure. In adult monkeys, naturalistic textures preferentially drive neurons in areas V2 and V4, but not V1. We paired behavioral measurements of naturalness sensitivity with separately-obtained neuronal population recordings from neurons in areas V1, V2, V4, and inferotemporal cortex (IT). We made behavioral measurements from 16 weeks of age and physiological measurements as early as 20 weeks, and continued through 56 weeks. Behavioral sensitivity reached half of maximum at roughly 25 weeks of age. Neural sensitivities remained stable from the earliest ages tested. As in adults, neural sensitivity to naturalistic structure increased from V1 to V2 to V4. While sensitivities in V2 and IT were similar, the dimensionality of the IT representation was more similar to V4's than to V2's.
Affiliation(s)
- Gerick M. Lee
- Center for Neural Science, New York University, New York, NY 10003, USA
- Najib J. Majaj
- Center for Neural Science, New York University, New York, NY 10003, USA
- Lynne Kiorpes
- Center for Neural Science, New York University, New York, NY 10003, USA
5
Loke KS. A novel approach to texture recognition combining deep learning orthogonal convolution with regional input features. PeerJ Comput Sci 2024; 10:e1927. PMID: 38660180. PMCID: PMC11041941. DOI: 10.7717/peerj-cs.1927.
Abstract
Textures provide a powerful segmentation and object detection cue. Recent research has shown that deep convolutional nets like Visual Geometry Group (VGG) and ResNet perform well on non-stationary texture datasets. Non-stationary textures have local structures that change from one region of the image to another. This is consistent with the view that deep convolutional networks are good at detecting local microstructures disguised as textures. However, stationary textures, whose statistical properties are constant or slowly varying over the entire region, are not well detected by deep convolutional networks. This research demonstrates that a simple seven-layer convolutional network can obtain better results than deep networks using a novel convolutional technique called orthogonal convolution with pre-calculated regional features based on the grey level co-occurrence matrix. We obtained an average 8.5% improvement in texture recognition accuracy on the Outex dataset over GoogleNet, ResNet, VGG, and AlexNet.
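The grey level co-occurrence matrix (GLCM) that supplies the pre-calculated regional features tabulates how often each pair of quantized grey levels occurs at a fixed pixel offset; classic texture statistics are then read off the normalized matrix. A generic sketch (the paper's exact feature set and the orthogonal convolution itself are not reproduced here):

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Grey level co-occurrence matrix: normalized counts of grey-level
    pairs (i, j) found at pixel offset (dx, dy). `img` is assumed to
    be scaled to [0, 1]."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)
    h, w = q.shape
    a = q[max(-dy, 0):h - max(dy, 0), max(-dx, 0):w - max(dx, 0)]
    b = q[max(dy, 0):h - max(-dy, 0), max(dx, 0):w - max(-dx, 0)]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)   # 2-D histogram of pairs
    return m / m.sum()

def glcm_contrast(m):
    """One classic GLCM statistic: the expected squared grey-level
    difference between pixels at the chosen offset."""
    i, j = np.indices(m.shape)
    return np.sum(m * (i - j) ** 2)

# Vertical stripes have high horizontal contrast; a flat image has none.
stripes = np.tile([0.0, 1.0], (8, 8))
c_stripes = glcm_contrast(glcm(stripes, dx=1, dy=0))
c_flat = glcm_contrast(glcm(np.zeros((16, 16)), dx=1, dy=0))
```

Such statistics summarize a whole region at once, which is what makes them useful companions to convolutional features that respond to local microstructure.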
Affiliation(s)
- Kar-Seng Loke
- Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
6
Ziemba CM, Goris RLT, Stine GM, Perez RK, Simoncelli EP, Movshon JA. Neuronal and behavioral responses to naturalistic texture images in macaque monkeys. bioRxiv 2024:2024.02.22.581645. PMID: 38464304. PMCID: PMC10925125. DOI: 10.1101/2024.02.22.581645.
Abstract
The visual world is richly adorned with texture, which can serve to delineate important elements of natural scenes. In anesthetized macaque monkeys, selectivity for the statistical features of natural texture is weak in V1, but substantial in V2, suggesting that neuronal activity in V2 might directly support texture perception. To test this, we investigated the relation between single cell activity in macaque V1 and V2 and simultaneously measured behavioral judgments of texture. We generated stimuli along a continuum between naturalistic texture and phase-randomized noise and trained two macaque monkeys to judge whether a sample texture more closely resembled one or the other extreme. Analysis of responses revealed that individual V1 and V2 neurons carried much less information about texture naturalness than behavioral reports. However, the sensitivity of V2 neurons, especially those preferring naturalistic textures, was significantly closer to that of behavior compared with V1. The firing of both V1 and V2 neurons predicted perceptual choices in response to repeated presentations of the same ambiguous stimulus in one monkey, despite low individual neural sensitivity. However, neither population predicted choice in the second monkey. We conclude that neural responses supporting texture perception likely continue to develop downstream of V2. Further, combined with neural data recorded while the same two monkeys performed an orientation discrimination task, our results demonstrate that choice-correlated neural activity in early sensory cortex is unstable across observers and tasks, untethered from neuronal sensitivity, and thus unlikely to reflect a critical aspect of the formation of perceptual decisions.
Significance statement: As visual signals propagate along the cortical hierarchy, they encode increasingly complex aspects of the sensory environment and likely have a more direct relationship with perceptual experience. We replicate and extend previous results from anesthetized monkeys differentiating the selectivity of neurons along the first step in cortical vision from area V1 to V2. However, our results further complicate efforts to establish neural signatures that reveal the relationship between perception and the neuronal activity of sensory populations. We find that choice-correlated activity in V1 and V2 is unstable across different observers and tasks, and also untethered from neuronal sensitivity and other features of nonsensory response modulation.
7
Segraves MA. Using Natural Scenes to Enhance our Understanding of the Cerebral Cortex's Role in Visual Search. Annu Rev Vis Sci 2023; 9:435-454. PMID: 37164028. DOI: 10.1146/annurev-vision-100720-124033.
Abstract
Using natural scenes is an approach to studying the visual and eye movement systems that approximates how these systems function in everyday life. This review examines results from behavioral and neurophysiological studies using natural scene viewing in humans and monkeys. The use of natural scenes for the study of cerebral cortical activity is relatively new and presents challenges for data analysis. Methods and results from the use of natural scenes for the study of the visual and eye movement cortex are presented, with emphasis on the new insights this method provides beyond what is known about these cortical regions from conventional methods.
Affiliation(s)
- Mark A Segraves
- Department of Neurobiology, Northwestern University, Evanston, Illinois, USA
8
Lieber JD, Lee GM, Majaj NJ, Movshon JA. Sensitivity to naturalistic texture relies primarily on high spatial frequencies. J Vis 2023; 23:4. PMID: 36745452. PMCID: PMC9910384. DOI: 10.1167/jov.23.2.4.
Abstract
Natural images contain information at multiple spatial scales. Though we understand how early visual mechanisms split multiscale images into distinct spatial frequency channels, we do not know how the outputs of these channels are processed further by mid-level visual mechanisms. We have recently developed a texture discrimination task that uses synthetic, multi-scale, "naturalistic" textures to isolate these mid-level mechanisms. Here, we use three experimental manipulations (image blur, image rescaling, and eccentric viewing) to show that perceptual sensitivity to naturalistic structure is strongly dependent on features at high object spatial frequencies (measured in cycles/image). As a result, sensitivity depends on a texture acuity limit, a property of the visual system that sets the highest retinal spatial frequency (measured in cycles/degree) at which observers can detect naturalistic features. Analysis of the texture images using a model observer analysis shows that naturalistic image features at high object spatial frequencies carry more task-relevant information than those at low object spatial frequencies. That is, the dependence of sensitivity on high object spatial frequencies is a property of the texture images, rather than a property of the visual system. Accordingly, we find human observers' ability to extract naturalistic information (their efficiency) is similar for all object spatial frequencies. We conclude that the mid-level mechanisms that underlie perceptual sensitivity effectively extract information from all image features below the texture acuity limit, regardless of their retinal and object spatial frequency.
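The object vs. retinal spatial frequency distinction used throughout is a single rescaling by the image's angular size, which is why rescaling the image trades object frequency off against the texture acuity limit. As a sketch:

```python
def retinal_sf(object_sf_cpi, image_size_deg):
    """Convert object spatial frequency (cycles/image) to retinal
    spatial frequency (cycles/degree) for an image subtending
    `image_size_deg` degrees of visual angle."""
    return object_sf_cpi / image_size_deg

# The same feature at 32 cycles/image lands at different retinal
# frequencies depending on viewing size (values are illustrative):
sf_small = retinal_sf(32, 4.0)    # 4-deg image  -> 8 c/deg
sf_large = retinal_sf(32, 16.0)   # 16-deg image -> 2 c/deg
```

Magnifying the image (or viewing it foveally rather than eccentrically) pushes a fixed object frequency below the acuity limit, which is the logic behind the blur, rescaling, and eccentricity manipulations described above.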
Affiliation(s)
- Justin D Lieber
- Center for Neural Science, New York University, New York, NY, USA
- Gerick M Lee
- Center for Neural Science, New York University, New York, NY, USA
- Najib J Majaj
- Center for Neural Science, New York University, New York, NY, USA
9
Transformation of acoustic information to sensory decision variables in the parietal cortex. Proc Natl Acad Sci U S A 2023; 120:e2212120120. PMID: 36598952. PMCID: PMC9926273. DOI: 10.1073/pnas.2212120120.
Abstract
The process by which sensory evidence contributes to perceptual choices requires an understanding of its transformation into decision variables. Here, we address this issue by evaluating the neural representation of acoustic information in the auditory cortex-recipient parietal cortex, while gerbils either performed a two-alternative forced-choice auditory discrimination task or while they passively listened to identical acoustic stimuli. During task engagement, stimulus identity decoding performance from simultaneously recorded parietal neurons significantly correlated with psychometric sensitivity. In contrast, decoding performance during passive listening was significantly reduced. Principal component and geometric analyses revealed the emergence of low-dimensional encoding of linearly separable manifolds with respect to stimulus identity and decision, but only during task engagement. These findings confirm that the parietal cortex mediates a transition of acoustic representations into decision-related variables. Finally, using a clustering analysis, we identified three functionally distinct subpopulations of neurons that each encoded task-relevant information during separate temporal segments of a trial. Taken together, our findings demonstrate how parietal cortex neurons integrate and transform encoded auditory information to guide sound-driven perceptual decisions.
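The stimulus-identity decoding described here can be approximated by cross-validating a simple linear read-out of population activity; below is a nearest-class-mean decoder on synthetic data (numpy only; the study's actual decoder and parameters are not reproduced):

```python
import numpy as np

def decode_accuracy(X, y, folds=5, seed=0):
    """Cross-validated accuracy of a nearest-class-mean (linear)
    decoder of stimulus identity from population activity.
    X: trials x neurons; y: stimulus label per trial."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    correct = 0
    for k in range(folds):
        test = order[k::folds]                    # held-out trials
        train = np.setdiff1d(order, test)
        means = {c: X[train][y[train] == c].mean(axis=0)
                 for c in np.unique(y)}
        for t in test:
            pred = min(means, key=lambda c: np.linalg.norm(X[t] - means[c]))
            correct += pred == y[t]
    return correct / len(y)

# Synthetic "task-engaged" population: 2 stimuli, 40 neurons,
# well-separated class means (a hypothetical example, not real data).
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.standard_normal((200, 40)) + 1.5 * y[:, None] * np.ones(40)
acc = decode_accuracy(X, y)
```

In the study's framing, the interesting quantity is how such held-out decoding accuracy tracks psychometric sensitivity during task engagement versus passive listening.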
10
Lin T, Zhang X, Fields EC, Sekuler R, Gutchess A. Spatial frequency impacts perceptual and attentional ERP components across cultures. Brain Cogn 2022; 157:105834. PMID: 34999289. PMCID: PMC8792318. DOI: 10.1016/j.bandc.2021.105834.
Abstract
Culture impacts visual perception in several ways. To identify stages of perceptual processing that differ between cultures, we used electroencephalography measures of perceptual and attentional responses to simple visual stimuli. Gabor patches of higher or lower spatial frequency were presented at high contrast to 25 American and 31 East Asian participants while they were watching for the onset of an infrequent, oddball stimulus. Region of interest and mass univariate analyses assessed how cultural background and stimuli spatial frequency affected the visual evoked response potentials. Across both groups, the Gabor of lower spatial frequency produced stronger evoked response potentials in the anterior N1 and P3 than did the higher frequency Gabor. The mass univariate analyses also revealed effects of spatial frequency, including a frontal negativity around 150 ms and a widespread posterior positivity around 300 ms. The effects of spatial frequency generally differed little across cultures; although there was some evidence for cultural differences in the P3 response to different frequencies at the Pz electrode, this effect did not emerge in the mass univariate analyses. We discuss these results in relation to those from previous studies, and explore the potential advantages of mass univariate analyses for cultural neuroscience.
Affiliation(s)
- Tong Lin
- Brandeis University, United States
- Eric C Fields
- Brandeis University, United States; Boston College, United States; Westminster College, United States
11
Barzegaran E, Norcia AM. Neural sources of letter and Vernier acuity. Sci Rep 2020; 10:15449. PMID: 32963270. PMCID: PMC7509830. DOI: 10.1038/s41598-020-72370-3.
Abstract
Visual acuity can be measured in many different ways, including with letters and Vernier offsets. Prior psychophysical work has suggested that the two acuities are strongly linked given that they both depend strongly on retinal eccentricity and both are similarly affected in amblyopia. Here we used high-density EEG recordings to ask whether the underlying neural sources are common as suggested by the psychophysics or distinct. To measure visual acuity for letters, we recorded evoked potentials to 3 Hz alternations between intact and scrambled text comprised of letters of varying size. To measure visual acuity for Vernier offsets, we recorded evoked potentials to 3 Hz alternations between bar gratings with and without a set of Vernier offsets. Both alternation types elicited robust activity at the 3 Hz stimulus frequency that scaled in amplitude with both letter and offset size, starting near threshold. Letter and Vernier offset responses differed in both their scalp topography and temporal dynamics. The earliest evoked responses to letters occurred on lateral occipital visual areas, predominantly over the left hemisphere. Later responses were measured at electrodes over early visual cortex, suggesting that letter structure is first extracted in second-tier extra-striate areas and that responses over early visual areas are due to feedback. Responses to Vernier offsets, by contrast, occurred first at medial occipital electrodes, with responses at later time-points being more broadly distributed—consistent with feedforward pathway mediation. The previously observed commonalities between letter and Vernier acuity may be due to common bottlenecks in early visual cortex but not because the two tasks are subserved by a common network of visual areas.
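Steady-state responses of the kind measured here are conventionally quantified as the Fourier amplitude of the EEG at exactly the stimulation frequency (3 Hz). A sketch on synthetic data (the sampling rate and signal are assumptions, not the authors' recordings):

```python
import numpy as np

def amplitude_at(signal, fs, freq):
    """One-sided Fourier amplitude of `signal` at `freq` Hz. The epoch
    should span an integer number of cycles so the frequency falls on
    an exact DFT bin."""
    n = len(signal)
    k = int(round(freq * n / fs))          # DFT bin index for `freq`
    return 2.0 * np.abs(np.fft.rfft(signal)[k]) / n

fs = 500.0                                  # assumed sampling rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)             # 2-second epoch
# Synthetic "EEG": a 4-unit response locked to the 3 Hz alternation,
# buried in broadband noise.
eeg = (4.0 * np.sin(2 * np.pi * 3.0 * t)
       + np.random.default_rng(0).standard_normal(t.size))
amp = amplitude_at(eeg, fs, 3.0)
```

Because the noise spreads across all bins while the stimulus-locked response concentrates in one, the recovered amplitude is close to the true 4 units; sweeping letter or offset size and reading this amplitude at each step is how response-vs-size functions like those described above are built.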
Affiliation(s)
- Elham Barzegaran
- Wu Tsai Neurosciences Institute, 290 Jane Stanford Way, Stanford, CA, 94305, USA.
- Anthony M Norcia
- Wu Tsai Neurosciences Institute, 290 Jane Stanford Way, Stanford, CA, 94305, USA.
12
Abstract
Area V4, the focus of this review, is a mid-level processing stage along the ventral visual pathway of the macaque monkey. V4 is extensively interconnected with other visual cortical areas along the ventral and dorsal visual streams, with frontal cortical areas, and with several subcortical structures. Thus, it is well poised to play a broad and integrative role in visual perception and recognition, the functional domain of the ventral pathway. Neurophysiological studies in monkeys engaged in passive fixation and behavioral tasks suggest that V4 responses are dictated by tuning in a high-dimensional stimulus space defined by form, texture, color, depth, and other attributes of visual stimuli. This high-dimensional tuning may underlie the development of object-based representations in the visual cortex that are critical for tracking, recognizing, and interacting with objects. Neurophysiological and lesion studies also suggest that V4 responses are important for guiding perceptual decisions and higher-order behavior.
Affiliation(s)
- Anitha Pasupathy
- Department of Biological Structure, University of Washington, Seattle, Washington 98195, USA
- Washington National Primate Research Center, University of Washington, Seattle, Washington 98121, USA
- Dina V Popovkina
- Department of Psychology, University of Washington, Seattle, Washington 98105, USA
- Taekjun Kim
- Department of Biological Structure, University of Washington, Seattle, Washington 98195, USA
- Washington National Primate Research Center, University of Washington, Seattle, Washington 98121, USA
13
Leopold DA, Park SH. Studying the visual brain in its natural rhythm. Neuroimage 2020; 216:116790. PMID: 32278093. DOI: 10.1016/j.neuroimage.2020.116790.
Abstract
How the brain fluidly orchestrates visual behavior is a central question in cognitive neuroscience. Researchers studying neural responses in humans and nonhuman primates have mapped out visual response profiles and cognitive modulation in a large number of brain areas, most often using pared down stimuli and highly controlled behavioral paradigms. The historical emphasis on reductionism has placed most studies at one pole of an inherent trade-off between strictly controlled experimental variables and open designs that monitor the brain during its natural modes of operation. This bias toward simplified experiments has strongly shaped the field of visual neuroscience, with little guarantee that the principles and concepts established within that framework will apply more generally. In recent years, a growing number of studies have begun to relax strict experimental control with the aim of understanding how the brain responds under more naturalistic conditions. In this article, we survey research that has explicitly embraced the complexity and rhythm of natural vision. We focus on those studies most pertinent to understanding high-level visual specializations in brains of humans and nonhuman primates. We conclude that representationalist concepts borne from conventional visual experiments fall short in their ability to capture the real-life visual operations undertaken by the brain. More naturalistic approaches, though fraught with experimental and analytic challenges, provide fertile ground for neuroscientists seeking new inroads to investigate how the brain supports core aspects of our daily visual experience.
Affiliation(s)
- David A Leopold
- Section on Cognitive Neurophysiology and Imaging, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA; Neurophysiology Imaging Facility, National Institute of Mental Health, National Institute of Neurological Disorders and Stroke, National Eye Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
- Soo Hyun Park
- Section on Cognitive Neurophysiology and Imaging, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
14
Dekel R, Sagi D. Interaction of contexts in context-dependent orientation estimation. Vision Res 2020; 169:58-72. PMID: 32179340. DOI: 10.1016/j.visres.2020.02.006.
Abstract
The processing of a visual stimulus is known to be influenced by the statistics in recent visual history and by the stimulus' visual surround. Such contextual influences lead to perceptually salient phenomena, such as the tilt aftereffect and the tilt illusion. Despite much research on the influence of an isolated context, it is not clear how multiple, possibly competing sources of contextual influence interact. Here, using psychophysical methods, we compared the combined influence of multiple contexts to the sum of the isolated context influences. The results showed large deviations from linear additivity for adjacent or overlapping contexts, and remarkably, clear additivity when the contexts were sufficiently separated. Specifically, for adjacent or overlapping contexts, the combined effect was often lower than the sum of the isolated component effects (sub-additivity), or was more influenced by one component than another (selection). For contexts that were separated in time (600 ms), the combined effect measured the exact sum of the isolated component effects (in degrees of bias). Overall, the results imply an initial compressive transformation during visual processing, followed by selection between the processed parts.
Affiliation(s)
- Ron Dekel
- Department of Neurobiology, The Weizmann Institute of Science, Rehovot 7610001, Israel
- Dov Sagi
- Department of Neurobiology, The Weizmann Institute of Science, Rehovot 7610001, Israel.
15
Object shape and surface properties are jointly encoded in mid-level ventral visual cortex. Curr Opin Neurobiol 2019; 58:199-208. PMID: 31586749. DOI: 10.1016/j.conb.2019.09.009.
Abstract
Recognizing myriad visual objects rapidly is a hallmark of the primate visual system. Traditional theories of object recognition have focused on how crucial form features, for example, the orientation of edges, may be extracted in early visual cortex and utilized to recognize objects. An alternative view argues that much of early and mid-level visual processing focuses on encoding surface characteristics, for example, texture. Neurophysiological evidence from primate area V4 supports a third alternative, the joint but independent encoding of form and texture, which would be advantageous for segmenting objects from the background in natural scenes and for object recognition that is independent of surface texture. Future studies that leverage deep convolutional network models, especially focusing on network failures to match biology and behavior, can advance our insights into how such a joint representation of form and surface properties might emerge in visual cortex.
16
Wallis TS, Funke CM, Ecker AS, Gatys LA, Wichmann FA, Bethge M. Image content is more important than Bouma's Law for scene metamers. eLife 2019; 8:42512. PMID: 31038458. PMCID: PMC6491040. DOI: 10.7554/elife.42512.
Abstract
We subjectively perceive our visual field with high fidelity, yet peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). Prior work showed that humans could not discriminate images synthesised to match the responses of a mid-level ventral visual stream model when information was averaged in receptive fields with a scaling of about half their retinal eccentricity. This result implicated ventral visual area V2, approximated ‘Bouma’s Law’ of crowding, and has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our perceptual experience. However, this experiment never assessed natural images. We find that humans can easily discriminate real and model-generated images at V2 scaling, requiring scales at least as small as V1 receptive fields to generate metamers. We speculate that explaining why scenes look as they do may require incorporating segmentation and global organisational constraints in addition to local pooling.

eLife digest
As you read this digest, your eyes move to follow the lines of text. But now try to hold your eyes in one position, while reading the text on either side and below: it soon becomes clear that peripheral vision is not as good as we tend to assume. It is not possible to read text far away from the center of your line of vision, but you can see ‘something’ out of the corner of your eye. You can see that there is text there, even if you cannot read it, and you can see where your screen or page ends. So how does the brain generate peripheral vision, and why does it differ from what you see when you look straight ahead? One idea is that the visual system averages information over areas of the peripheral visual field. This gives rise to texture-like patterns, as opposed to images made up of fine details. Imagine looking at an expanse of foliage, gravel or fur, for example. Your eyes cannot make out the individual leaves, pebbles or hairs. Instead, you perceive an overall pattern in the form of a texture. Our peripheral vision may also consist of such textures, created when the brain averages information over areas of space.

Wallis, Funke et al. have now tested this idea using an existing computer model that averages visual input in this way. By giving the model a series of photographs to process, Wallis, Funke et al. obtained images that should in theory simulate peripheral vision. If the model mimics the mechanisms that generate peripheral vision, then healthy volunteers should be unable to distinguish the processed images from the original photographs. But in fact, the participants could easily discriminate the two sets of images. This suggests that the visual system does not solely use textures to represent information in the peripheral visual field. Wallis, Funke et al. propose that other factors, such as how the visual system separates and groups objects, may instead determine what we see in our peripheral vision. This knowledge could ultimately benefit patients with eye diseases such as macular degeneration, a condition that causes loss of vision in the center of the visual field and forces patients to rely on their peripheral vision.
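The pooling scheme being tested can be caricatured in a few lines: replace a signal with local averages inside windows whose width grows linearly with eccentricity, so that a larger scaling factor discards more detail. This is a toy one-dimensional sketch under stated assumptions (mean pooling, made-up scaling values), not the texture-statistic model the authors actually evaluated:

```python
import numpy as np

def pool_by_eccentricity(signal, scaling, min_width=1):
    """Average `signal` within windows whose width grows linearly with
    distance from the center (eccentricity). `scaling` is the
    width-to-eccentricity ratio of the pooling regions."""
    n = len(signal)
    center = n // 2
    out = signal.astype(float).copy()
    i = 0
    while i < n:
        ecc = abs(i - center)
        width = max(min_width, int(round(scaling * ecc)))
        j = min(n, i + width)
        out[i:j] = signal[i:j].mean()  # replace local detail with its average
        i = j
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal(512)                   # stand-in for one image row
fine = pool_by_eccentricity(img, scaling=0.05)   # small pooling regions
coarse = pool_by_eccentricity(img, scaling=0.5)  # regions ~half the eccentricity

err_fine = np.mean((img - fine) ** 2)
err_coarse = np.mean((img - coarse) ** 2)
```

Coarser scaling produces a larger departure from the original, which is why observers could discriminate the model-generated images once the pooling regions were large.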
Affiliation(s)
- Thomas S. A. Wallis: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Christina M Funke: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Alexander S Ecker: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany; Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Leon A Gatys: Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Felix A Wichmann: Neural Information Processing Group, Faculty of Science, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Matthias Bethge: Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Germany
17.
Neural Coding for Shape and Texture in Macaque Area V4. J Neurosci 2019; 39:4760-4774. [PMID: 30948478] [DOI: 10.1523/jneurosci.3073-18.2019]
Abstract
The distinct visual sensations of shape and texture have been studied separately in cortex; therefore, it remains unknown whether separate neuronal populations encode each of these properties or one population carries a joint encoding. We directly compared shape and texture selectivity of individual V4 neurons in awake macaques (1 male, 1 female) and found that V4 neurons lie along a continuum from strong tuning for boundary curvature of shapes to strong tuning for perceptual dimensions of texture. Among neurons tuned to both attributes, tuning for shape and texture were largely separable, with the latter delayed by ∼30 ms. We also found that shape stimuli typically evoked stronger, more selective responses than did texture patches, regardless of whether the latter were contained within or extended beyond the receptive field. These results suggest that there are separate specializations in mid-level cortical processing for visual attributes of shape and texture.

SIGNIFICANCE STATEMENT
Object recognition depends on our ability to see both the shape of the boundaries of objects and properties of their surfaces. However, neuroscientists have never before examined how shape and texture are linked together in mid-level visual cortex. In this study, we used systematically designed sets of simple shapes and texture patches to probe the responses of individual neurons in the primate visual cortex. Our results provide the first evidence that some cortical neurons specialize in processing shape whereas others specialize in processing textures. Most neurons lie between the ends of this continuum, and in these neurons we find that shape and texture encoding are largely independent.
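A shape-to-texture continuum of the kind described here is commonly summarized with a per-neuron contrast index. A minimal sketch with hypothetical mean firing rates (a standard selectivity convention, not necessarily the exact metric the authors used):

```python
import numpy as np

def shape_texture_index(shape_resp, texture_resp):
    """Contrast index in [-1, 1]: +1 means the unit responds only to
    shape stimuli, -1 only to texture stimuli, 0 equal mean response."""
    rs = np.mean(shape_resp)
    rt = np.mean(texture_resp)
    return (rs - rt) / (rs + rt)

# Hypothetical firing rates (spikes/s) for three units along the continuum
shape_cell   = shape_texture_index([40, 55, 60], [10, 12, 8])   # shape end
texture_cell = shape_texture_index([9, 11, 10], [50, 45, 48])   # texture end
mixed_cell   = shape_texture_index([30, 32], [31, 29])          # middle
```

Plotting such an index across a population makes the continuum, and any clustering at its ends, directly visible.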
18.
Pospisil DA, Pasupathy A, Bair W. 'Artiphysiology' reveals V4-like shape tuning in a deep network trained for image classification. eLife 2018; 7:38242. [PMID: 30570484] [PMCID: PMC6335056] [DOI: 10.7554/elife.38242]
Abstract
Deep networks provide a potentially rich interconnection between neuroscientific and artificial approaches to understanding visual intelligence, but the relationship between artificial and neural representations of complex visual form has not been elucidated at the level of single-unit selectivity. Taking the approach of an electrophysiologist to characterizing single CNN units, we found many units exhibit translation-invariant boundary curvature selectivity approaching that of exemplar neurons in the primate mid-level visual area V4. For some V4-like units, particularly in middle layers, the natural images that drove them best were qualitatively consistent with selectivity for object boundaries. Our results identify a novel image-computable model for V4 boundary curvature selectivity and suggest that such a representation may begin to emerge within an artificial network trained for image categorization, even though boundary information was not provided during training. This raises the possibility that single-unit selectivity in CNNs will become a guide for understanding sensory cortex.
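The "artiphysiology" approach amounts to probing single network units with parameterized stimuli, exactly as an electrophysiologist probes a neuron. A toy sketch with a hand-built unit (a max-pooled template matcher, not an actual unit from the trained network studied here) shows the kind of translation-tolerant selectivity being measured:

```python
import numpy as np

def unit_response(image, filt):
    """Toy 'CNN unit': cross-correlate the filter over the image and
    max-pool, so the response is tolerant to stimulus position."""
    fh, fw = filt.shape
    h, w = image.shape
    return max(
        float(np.sum(image[y:y + fh, x:x + fw] * filt))
        for y in range(h - fh + 1)
        for x in range(w - fw + 1)
    )

def bar(vertical, top, left, size=12, length=4):
    """Place a short bar (vertical or horizontal) at (top, left)."""
    img = np.zeros((size, size))
    if vertical:
        img[top:top + length, left] = 1.0
    else:
        img[top, left:left + length] = 1.0
    return img

filt = np.zeros((4, 3))
filt[:, 1] = 1.0  # vertical-bar template

r_v1 = unit_response(bar(True, 1, 2), filt)   # preferred stimulus, position 1
r_v2 = unit_response(bar(True, 6, 8), filt)   # preferred stimulus, position 2
r_h  = unit_response(bar(False, 1, 2), filt)  # non-preferred orientation
```

The unit gives the same response to its preferred stimulus at both positions and a weaker response to the non-preferred one; replacing the bars with parameterized curved boundaries yields the translation-invariant curvature tuning curves the study compares to V4.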
Affiliation(s)
- Dean A Pospisil: Department of Biological Structure, Washington National Primate Research Center, University of Washington, Seattle, United States
- Anitha Pasupathy: Department of Biological Structure, Washington National Primate Research Center, University of Washington, Seattle, United States; University of Washington Institute for Neuroengineering, Seattle, United States
- Wyeth Bair: Department of Biological Structure, Washington National Primate Research Center, University of Washington, Seattle, United States; University of Washington Institute for Neuroengineering, Seattle, United States; Computational Neuroscience Center, University of Washington, Seattle, United States
19.
Aitchison L, Lengyel M. With or without you: predictive coding and Bayesian inference in the brain. Curr Opin Neurobiol 2017; 46:219-227. [PMID: 28942084] [DOI: 10.1016/j.conb.2017.08.010]
Abstract
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic/representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly.
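In the linear-Gaussian case the relationship between the two theories can be made concrete: a predictive-coding circuit that iteratively cancels a bottom-up (sensory) prediction error and a top-down (prior) prediction error settles exactly on the Bayesian posterior mean. A minimal sketch with illustrative values (not drawn from the paper):

```python
def predictive_coding_estimate(y, mu0, var0, var_y, lr=0.1, steps=500):
    """Minimal linear-Gaussian predictive coding: the estimate x is
    nudged by two precision-weighted prediction errors until both are
    cancelled. The fixed point of this dynamics is the Bayesian
    posterior mean, so here predictive coding implements inference."""
    x = mu0  # start at the prior's prediction
    for _ in range(steps):
        sensory_err = (y - x) / var_y    # bottom-up error: data vs estimate
        prior_err = (mu0 - x) / var0     # top-down error: prior vs estimate
        x += lr * (sensory_err + prior_err)
    return x

# Observation y = 2.0 with prior N(0, 1) and observation noise variance 0.5
y, mu0, var0, var_y = 2.0, 0.0, 1.0, 0.5
x_pc = predictive_coding_estimate(y, mu0, var0, var_y)
x_bayes = (mu0 / var0 + y / var_y) / (1 / var0 + 1 / var_y)  # posterior mean
```

The same error-correcting motif can serve non-Bayesian goals (e.g. plain redundancy reduction), which is the paper's point that the two theories are separable despite this overlap.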
Affiliation(s)
- Laurence Aitchison: Computational & Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Máté Lengyel: Computational & Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom; Department of Cognitive Science, Central European University, Budapest, Hungary
20.
Abstract
Sensitivity to temporal change places fundamental limits on object processing in the visual system. An emerging consensus from the behavioral and neuroimaging literature suggests that temporal resolution differs substantially for stimuli of different complexity and for brain areas at different levels of the cortical hierarchy. Here, we used steady-state visually evoked potentials to directly measure three fundamental parameters that characterize the underlying neural response to text and face images: temporal resolution, peak temporal frequency, and response latency. We presented full-screen images of text or a human face, alternated with a scrambled image, at temporal frequencies between 1 and 12 Hz. These images elicited a robust response at the first harmonic that showed differential tuning, scalp topography, and delay for the text and face images. Face-selective responses were maximal at 4 Hz, but text-selective responses, by contrast, were maximal at 1 Hz. The topography of the text image response was strongly left-lateralized at higher stimulation rates, whereas the response to the face image was slightly right-lateralized but nearly bilateral at all frequencies. Both text and face images elicited steady-state activity at more than one apparent latency; we observed early (141-160 msec) and late (>250 msec) text- and face-selective responses. These differences in temporal tuning profiles are likely to reflect differences in the nature of the computations performed by word- and face-selective cortex. Despite the close proximity of word- and face-selective regions on the cortical surface, our measurements demonstrate substantial differences in the temporal dynamics of word- versus face-selective responses.
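The first-harmonic amplitude tracked in steady-state designs is simply the Fourier amplitude of the recorded signal at the stimulation frequency. A sketch with a synthetic 4 Hz response (the amplitude and noise level are made-up values, not the study's data; it assumes an integer number of stimulus cycles fits the recording so the frequency falls on an FFT bin):

```python
import numpy as np

def harmonic_amplitude(signal, sample_rate, freq):
    """Amplitude of `signal` at `freq` (Hz) from the discrete Fourier
    transform -- the quantity measured at the first harmonic."""
    n = len(signal)
    spectrum = np.fft.rfft(signal) / n
    bin_idx = int(round(freq * n / sample_rate))
    return 2 * abs(spectrum[bin_idx])  # one-sided amplitude

# One second of a noisy 4 Hz steady-state response sampled at 1000 Hz
fs, f_stim = 1000, 4
t = np.arange(fs) / fs
rng = np.random.default_rng(1)
resp = 3.0 * np.sin(2 * np.pi * f_stim * t) + 0.5 * rng.standard_normal(fs)

amp = harmonic_amplitude(resp, fs, f_stim)  # recovers ~3.0 despite the noise
```

Sweeping the stimulation rate and reading out this amplitude at each rate yields the temporal tuning curves (e.g. a 4 Hz peak for faces, 1 Hz for text) described above.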