1
Motlagh SC, Joanisse M, Wang B, Mohsenzadeh Y. Unveiling the neural dynamics of conscious perception in rapid object recognition. Neuroimage 2024; 296:120668. [PMID: 38848982 DOI: 10.1016/j.neuroimage.2024.120668] [Received: 12/01/2023] [Revised: 05/23/2024] [Accepted: 06/05/2024] [Indexed: 06/09/2024]
Abstract
Our brain excels at recognizing objects, even when they flash by in a rapid sequence. However, the neural processes determining whether a target image in a rapid sequence can be recognized or not remain elusive. We used electroencephalography (EEG) to investigate the temporal dynamics of brain processes that shape perceptual outcomes in these challenging viewing conditions. Using naturalistic images and advanced multivariate pattern analysis (MVPA) techniques, we probed the brain dynamics governing conscious object recognition. Our results show that although initially similar, the processes for when an object can or cannot be recognized diverge around 180 ms post-appearance, coinciding with feedback neural processes. Decoding analyses indicate that gist perception (partial conscious perception) can occur at ∼120 ms through feedforward mechanisms. In contrast, object identification (full conscious perception of the image) is resolved at ∼190 ms after target onset, suggesting involvement of recurrent processing. These findings underscore the importance of recurrent neural connections in object recognition and awareness in rapid visual presentations.
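The time-resolved decoding logic described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier, the fold count, the synthetic data, and the name `timecourse_decoding` are all assumptions chosen for illustration. A classifier is trained and tested separately at every time point, and the time at which accuracy rises above chance marks when the two conditions diverge.

```python
import numpy as np

def timecourse_decoding(X, y, n_folds=5, seed=0):
    """Cross-validated nearest-centroid decoding at each time point.

    X: array (trials, channels, times); y: binary labels (0 or 1).
    Returns one accuracy value per time point.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    folds = np.array_split(order, n_folds)
    n_times = X.shape[2]
    accuracy = np.zeros(n_times)
    for t in range(n_times):
        correct = 0
        for test_idx in folds:
            train_idx = np.setdiff1d(order, test_idx)
            X_train, y_train = X[train_idx, :, t], y[train_idx]
            X_test, y_test = X[test_idx, :, t], y[test_idx]
            # Class centroids estimated from training trials only.
            centroid0 = X_train[y_train == 0].mean(axis=0)
            centroid1 = X_train[y_train == 1].mean(axis=0)
            dist0 = np.linalg.norm(X_test - centroid0, axis=1)
            dist1 = np.linalg.norm(X_test - centroid1, axis=1)
            predicted = (dist1 < dist0).astype(int)
            correct += np.sum(predicted == y_test)
        accuracy[t] = correct / len(y)
    return accuracy

# Synthetic stand-in for EEG epochs (hypothetical sizes, not real data).
rng = np.random.default_rng(1)
n_trials, n_channels, n_times = 200, 16, 50
y = np.repeat([0, 1], n_trials // 2)
X = rng.normal(size=(n_trials, n_channels, n_times))
divergence = 30  # hypothetical sample index at which the conditions start to differ
X[y == 1, :, divergence:] += 1.0

accuracy = timecourse_decoding(X, y)
print(accuracy[:divergence].mean(), accuracy[divergence:].mean())
```

In this toy setting accuracy stays near chance before the injected divergence and rises afterwards; with real recordings the onset would additionally need a statistical test against chance.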
Affiliation(s)
- Saba Charmi Motlagh
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Marc Joanisse
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada
- Boyu Wang
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada
- Yalda Mohsenzadeh
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada.
2
Lee K, Dora S, Mejias JF, Bohte SM, Pennartz CMA. Predictive coding with spiking neurons and feedforward gist signaling. Front Comput Neurosci 2024; 18:1338280. [PMID: 38680678 PMCID: PMC11045951 DOI: 10.3389/fncom.2024.1338280] [Received: 11/14/2023] [Accepted: 03/14/2024] [Indexed: 05/01/2024]
Abstract
Predictive coding (PC) is an influential theory in neuroscience, which suggests the existence of a cortical architecture that is constantly generating and updating predictive representations of sensory inputs. Owing to its hierarchical and generative nature, PC has inspired many computational models of perception in the literature. However, the biological plausibility of existing models has not been sufficiently explored due to their use of artificial neurons that approximate neural activity with firing rates in the continuous time domain and propagate signals synchronously. Therefore, we developed a spiking neural network for predictive coding (SNN-PC), in which neurons communicate using event-driven and asynchronous spikes. Adopting the hierarchical structure and Hebbian learning algorithms from previous PC neural network models, SNN-PC introduces two novel features: (1) a fast feedforward sweep from the input to higher areas, which generates a spatially reduced and abstract representation of input (i.e., a neural code for the gist of a scene) and provides a neurobiological alternative to an arbitrary choice of priors; and (2) a separation of positive and negative error-computing neurons, which counters the biological implausibility of a bi-directional error neuron with a very high baseline firing rate. After training with the MNIST handwritten digit dataset, SNN-PC developed hierarchical internal representations and was able to reconstruct samples it had not seen during training. SNN-PC suggests biologically plausible mechanisms by which the brain may perform perceptual inference and learning in an unsupervised manner. In addition, it may be used in neuromorphic applications that can utilize its energy-efficient, event-driven, local learning, and parallel information processing nature.
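The second feature named in this abstract, separate positive and negative error-computing units, can be sketched in a few lines. This is a rate-based simplification and not the spiking SNN-PC model: the linear generative weights `W`, the step size, and the function `infer` are assumptions for illustration only. Each error population is rectified, so neither needs a high baseline rate to signal both signs of error, and their difference recovers the signed prediction error that drives inference.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def infer(x, W, n_steps=300):
    """Infer a latent representation r such that W @ r predicts the input x.

    Rate-based sketch (assumption): the actual model uses spiking neurons.
    """
    # Step size chosen from the largest singular value so the updates are stable.
    step = 1.0 / np.linalg.norm(W, 2) ** 2
    r = np.zeros(W.shape[1])
    errors = []
    for _ in range(n_steps):
        prediction = W @ r
        e_pos = relu(x - prediction)   # active when the input exceeds the prediction
        e_neg = relu(prediction - x)   # active when the prediction exceeds the input
        errors.append(np.linalg.norm(e_pos + e_neg))
        # The signed error is recovered as the difference of the two populations.
        r = r + step * (W.T @ (e_pos - e_neg))
    return r, errors

rng = np.random.default_rng(0)
W = rng.normal(size=(20, 5)) / np.sqrt(20)  # hypothetical generative weights
r_true = rng.normal(size=5)
x = W @ r_true

r_hat, errors = infer(x, W)
print(errors[0], errors[-1])
```

The reconstruction error shrinks over iterations because the update is a gradient step on the squared prediction error; learning of `W` (Hebbian in the paper) is deliberately left out of this sketch.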
Affiliation(s)
- Kwangjun Lee
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Shirin Dora
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Department of Computer Science, School of Science, Loughborough University, Loughborough, United Kingdom
- Jorge F. Mejias
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Sander M. Bohte
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Machine Learning Group, Centre of Mathematics and Computer Science, Amsterdam, Netherlands
- Cyriel M. A. Pennartz
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
3
Campbell A, Tanaka JW. Fast saccades to faces during the feedforward sweep. J Vis 2024; 24:16. [PMID: 38630459 PMCID: PMC11037494 DOI: 10.1167/jov.24.4.16] [Received: 08/10/2022] [Accepted: 09/19/2023] [Indexed: 04/19/2024]
Abstract
Saccadic choice tasks use eye movements as a response method, typically in a task where observers are asked to saccade as quickly as possible to an image of a prespecified target category. Using this approach, face-selective saccades have been observed within 100 ms poststimulus. When taking into account oculomotor processing, this suggests that faces can be detected in as little as 70 to 80 ms. It has therefore been suggested that face detection must occur during the initial feedforward sweep, since this latency leaves little time for feedback processing. In the current experiment, we tested this hypothesis using backward masking, a technique shown to primarily disrupt feedback processing while leaving feedforward activation relatively intact. Based on minimum saccadic reaction time (SRT), we found that face detection benefited from ultra-fast, accurate saccades within 110 to 160 ms and that these eye movements are obtainable even under extreme masking conditions that limit perceptual awareness. However, masking did significantly increase the median SRT for faces. In the manual responses, we found remarkable detection accuracy for faces and houses, even when participants indicated having no visual experience of the test images. These results provide evidence for the view that the saccadic bias to faces is initiated by coarse information used to categorize faces in the feedforward sweep but that, in most cases, additional processing is required to quickly reach the threshold for saccade initiation.
Affiliation(s)
- Alison Campbell
- Department of Psychology, University of Victoria, Victoria, BC, Canada
- https://orcid.org/0000-0001-6891-8609
- James W Tanaka
- Department of Psychology, University of Victoria, Victoria, BC, Canada
- https://orcid.org/0000-0001-6559-0388
4
Srivastava S, Wang WY, Eckstein MP. Emergent human-like covert attention in feedforward convolutional neural networks. Curr Biol 2024; 34:579-593.e12. [PMID: 38244541 DOI: 10.1016/j.cub.2023.12.058] [Received: 07/16/2023] [Revised: 10/09/2023] [Accepted: 12/19/2023] [Indexed: 01/22/2024]
Abstract
Covert attention allows the selection of locations or features of the visual scene without moving the eyes. Cues and contexts predictive of a target's location orient covert attention and improve perceptual performance. The performance benefits are widely attributed to theories of covert attention as a limited resource, zoom, spotlight, or weighting of visual information. However, such concepts are difficult to map to neuronal populations. We show that a feedforward convolutional neural network (CNN) trained on images to optimize target detection accuracy and with no explicit incorporation of an attention mechanism, a limited resource, or feedback connections learns to utilize cues and contexts in the three most prominent covert attention tasks (Posner cueing, set size effects in search, and contextual cueing) and predicts the cue/context influences on human accuracy. The CNN's cueing/context effects generalize across network training schemes, to peripheral and central pre-cues, discrimination tasks, and reaction time measures, and critically do not vary with reductions in network resources (size). The CNN shows comparable cueing/context effects to a model that optimally uses image information to make decisions (Bayesian ideal observer) but generalizes these effects to cue instances unseen during training. Together, the findings suggest that human-like behavioral signatures of covert attention in the three landmark paradigms might be an emergent property of task accuracy optimization in neuronal populations without positing limited attentional resources. The findings might explain recent behavioral results showing cueing and context effects across a variety of simple organisms with no neocortex, from archerfish to fruit flies.
Affiliation(s)
- Sudhanshu Srivastava
- Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA.
- William Yang Wang
- Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA.
- Miguel P Eckstein
- Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA.
5
von Seth J, Nicholls VI, Tyler LK, Clarke A. Recurrent connectivity supports higher-level visual and semantic object representations in the brain. Commun Biol 2023; 6:1207. [PMID: 38012301 PMCID: PMC10682037 DOI: 10.1038/s42003-023-05565-9] [Received: 03/21/2023] [Accepted: 11/09/2023] [Indexed: 11/29/2023]
Abstract
Visual object recognition has been traditionally conceptualised as a predominantly feedforward process through the ventral visual pathway. While feedforward artificial neural networks (ANNs) can achieve human-level classification on some image-labelling tasks, it's unclear whether computational models of vision alone can accurately capture the evolving spatiotemporal neural dynamics. Here, we probe these dynamics using a combination of representational similarity and connectivity analyses of fMRI and MEG data recorded during the recognition of familiar, unambiguous objects. Modelling the visual and semantic properties of our stimuli using an artificial neural network as well as a semantic feature model, we find that unique aspects of the neural architecture and connectivity dynamics relate to visual and semantic object properties. Critically, we show that recurrent processing between the anterior and posterior ventral temporal cortex relates to higher-level visual properties prior to semantic object properties, in addition to semantic-related feedback from the frontal lobe to the ventral temporal lobe between 250 and 500 ms after stimulus onset. These results demonstrate the distinct contributions made by semantic object properties in explaining neural activity and connectivity, highlighting it as a core part of object recognition not fully accounted for by current biologically inspired neural networks.
Affiliation(s)
- Jacqueline von Seth
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Lorraine K Tyler
- Department of Psychology, University of Cambridge, Cambridge, UK
- Cambridge Centre for Ageing and Neuroscience (Cam-CAN), University of Cambridge and MRC Cognition and Brain Sciences Unit, Cambridge, UK
- Alex Clarke
- Department of Psychology, University of Cambridge, Cambridge, UK.
6
Chen L, Cichy RM, Kaiser D. Alpha-frequency feedback to early visual cortex orchestrates coherent naturalistic vision. Sci Adv 2023; 9:eadi2321. [PMID: 37948520 PMCID: PMC10637741 DOI: 10.1126/sciadv.adi2321] [Received: 04/12/2023] [Accepted: 10/12/2023] [Indexed: 11/12/2023]
Abstract
During naturalistic vision, the brain generates coherent percepts by integrating sensory inputs scattered across the visual field. Here, we asked whether this integration process is mediated by rhythmic cortical feedback. In electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) experiments, we experimentally manipulated integrative processing by changing the spatiotemporal coherence of naturalistic videos presented across visual hemifields. Our EEG data revealed that information about incoherent videos is coded in feedforward-related gamma activity while information about coherent videos is coded in feedback-related alpha activity, indicating that integration is indeed mediated by rhythmic activity. Our fMRI data identified scene-selective cortex and human middle temporal complex (hMT) as likely sources of this feedback. Analytically combining our EEG and fMRI data further revealed that feedback-related representations in the alpha band shape the earliest stages of visual processing in cortex. Together, our findings indicate that the construction of coherent visual experiences relies on cortical feedback rhythms that fully traverse the visual hierarchy.
Affiliation(s)
- Lixiang Chen
- Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Germany
- Radoslaw M. Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Germany
- Daniel Kaiser
- Mathematical Institute, Department of Mathematics and Computer Science, Physics, Geography, Justus-Liebig-Universität Gießen, Gießen 35392, Germany
- Center for Mind, Brain and Behavior (CMBB), Philipps-Universität Marburg and Justus-Liebig-Universität Gießen, Marburg 35032, Germany
7
Toosi T, Issa EB. Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment. arXiv 2023:arXiv:2310.20599v1. [PMID: 37961740 PMCID: PMC10635293] [Indexed: 11/15/2023]
Abstract
In natural vision, feedback connections support versatile visual inference capabilities such as making sense of the occluded or noisy bottom-up sensory information or mediating pure top-down processes such as imagination. However, the mechanisms by which the feedback pathway learns to give rise to these capabilities flexibly are not clear. We propose that top-down effects emerge through alignment between feedforward and feedback pathways, each optimizing its own objectives. To achieve this co-optimization, we introduce Feedback-Feedforward Alignment (FFA), a learning algorithm that leverages feedback and feedforward pathways as mutual credit assignment computational graphs, enabling alignment. In our study, we demonstrate the effectiveness of FFA in co-optimizing classification and reconstruction tasks on widely used MNIST and CIFAR10 datasets. Notably, the alignment mechanism in FFA endows feedback connections with emergent visual inference functions, including denoising, resolving occlusions, hallucination, and imagination. Moreover, FFA offers bio-plausibility compared to traditional back-propagation (BP) methods in implementation. By repurposing the computational graph of credit assignment into a goal-driven feedback pathway, FFA alleviates weight transport problems encountered in BP, enhancing the bio-plausibility of the learning algorithm. Our study presents FFA as a promising proof-of-concept for the mechanisms underlying how feedback connections in the visual cortex support flexible visual functions. This work also contributes to the broader field of visual inference underlying perceptual phenomena and has implications for developing more biologically inspired learning algorithms.
Affiliation(s)
- Tahereh Toosi
- Center for Theoretical Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
- Elias B. Issa
- Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
8
Peterson MA, Campbell ES. Backward masking implicates cortico-cortical recurrent processes in convex figure context effects and cortico-thalamic recurrent processes in resolving figure-ground ambiguity. Front Psychol 2023; 14:1243405. [PMID: 37809293 PMCID: PMC10552270 DOI: 10.3389/fpsyg.2023.1243405] [Received: 06/28/2023] [Accepted: 08/17/2023] [Indexed: 10/10/2023]
Abstract
Introduction: Previous experiments purportedly showed that image-based factors like convexity were sufficient for figure assignment. Recently, however, we found that the probability of perceiving a figure on the convex side of a central border was only slightly higher than chance for two-region displays and increased with the number of display regions; this increase was observed only when the concave regions were homogeneously colored. These convex figure context effects (CEs) revealed that figure assignment in these classic displays entails more than a response to local convexity. A Bayesian observer replicated the convex figure CEs using both a convexity object prior and a new, homogeneous background prior and made the novel prediction that the classic displays in which both the convex and concave regions were homogeneous were ambiguous during perceptual organization.
Methods: Here, we report three experiments investigating the proposed ambiguity and examining how the convex figure CEs unfold over time with an emphasis on whether they entail recurrent processing. Displays were shown for 100 ms followed by pattern masks after ISIs of 0, 50, or 100 ms. The masking conditions were designed to add noise to recurrent processing and therefore to delay the outcome of processes in which they play a role. In Exp. 1, participants viewed two- and eight-region displays with homogeneous convex regions (homo-convex displays; the putatively ambiguous displays). In Exp. 2, participants viewed putatively unambiguous hetero-convex displays. In Exp. 3, displays and masks were presented to different eyes, thereby delaying mask interference in the thalamus for up to 100 ms.
Results and discussion: The results of Exps. 1 and 2 are consistent with the interpretation that recurrent processing is involved in generating the convex figure CEs and resolving the ambiguity of homo-convex displays. The results of Exp. 3 suggested that corticofugal recurrent processing is involved in resolving the ambiguity of homo-convex displays, that cortico-cortical recurrent processes play a role in generating convex figure CEs, and that these two types of recurrent processes operate in parallel. Our results add to evidence that perceptual organization evolves dynamically and reveal that stimuli that seem unambiguous can be ambiguous during perceptual organization.
Affiliation(s)
- Mary A. Peterson
- Department of Psychology, University of Arizona, Tucson, AZ, United States
- Cognitive Science Program, University of Arizona, Tucson, AZ, United States
- Elizabeth Salvagio Campbell
- Department of Psychology, University of Arizona, Tucson, AZ, United States
- Cognitive Science Program, University of Arizona, Tucson, AZ, United States
- College of Medicine Tucson, University of Arizona, Tucson, AZ, United States
9
Abstract
Deep neural networks (DNNs) are machine learning algorithms that have revolutionized computer vision due to their remarkable successes in tasks like object classification and segmentation. The success of DNNs as computer vision algorithms has led to the suggestion that DNNs may also be good models of human visual perception. In this article, we review evidence regarding current DNNs as adequate behavioral models of human core object recognition. To this end, we argue that it is important to distinguish between statistical tools and computational models and to understand model quality as a multidimensional concept in which clarity about modeling goals is key. Reviewing a large number of psychophysical and computational explorations of core object recognition performance in humans and DNNs, we argue that DNNs are highly valuable scientific tools but that, as of today, DNNs should only be regarded as promising, but not yet adequate, computational models of human core object recognition behavior. On the way, we dispel several myths surrounding DNNs in vision science.
Affiliation(s)
- Felix A Wichmann
- Neural Information Processing Group, University of Tübingen, Tübingen, Germany;
10
Allen CH, Maurer JM, Gullapalli AR, Edwards BG, Aharoni E, Harenski CL, Anderson NE, Harenski KA, Calhoun VD, Kiehl KA. Psychopathic traits and altered resting-state functional connectivity in incarcerated adolescent girls. Front Neuroimaging 2023; 2:1216494. [PMID: 37554634 PMCID: PMC10406221 DOI: 10.3389/fnimg.2023.1216494] [Received: 05/03/2023] [Accepted: 07/19/2023] [Indexed: 08/10/2023]
Abstract
Previous work in incarcerated boys and adult men and women suggests that individuals scoring high on psychopathic traits show altered resting-state limbic/paralimbic and default mode functional network properties. However, it is unclear whether similar results extend to high-risk adolescent girls with elevated psychopathic traits. This study examined whether psychopathic traits [assessed via the Hare Psychopathy Checklist: Youth Version (PCL:YV)] were associated with altered inter-network connectivity, intra-network connectivity (i.e., functional coherence within a network), and amplitude of low-frequency fluctuations (ALFFs) across resting-state networks among high-risk incarcerated adolescent girls (n = 40). Resting-state networks were identified by applying group independent component analysis (ICA) to resting-state fMRI scans, and a priori regions of interest included limbic, paralimbic, and default mode network components. We tested the association of psychopathic traits (PCL:YV Factor 1 measuring affective/interpersonal traits and PCL:YV Factor 2 assessing antisocial/lifestyle traits) to these three resting-state measures. PCL:YV Factor 1 scores were associated with increased low-frequency and decreased high-frequency fluctuations in components corresponding to the default mode network, as well as increased intra-network FNC in components corresponding to cognitive control networks. PCL:YV Factor 2 scores were associated with increased low-frequency fluctuations in sensorimotor networks and decreased high-frequency fluctuations in default mode, sensorimotor, and visual networks. Consistent with previous analyses in incarcerated adult women, our results suggest that psychopathic traits among incarcerated adolescent girls are associated with altered intra-network ALFFs (primarily increased low-frequency and decreased high-frequency fluctuations) and connectivity across multiple networks including paralimbic regions. These results suggest stable neurobiological correlates of psychopathic traits among women across development.
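As a rough illustration of the low- versus high-frequency fluctuation measure mentioned in this abstract, the sketch below computes the mean spectral amplitude of a time course within a frequency band. It is not the study's pipeline: the band limits (0.01 to 0.08 Hz is a commonly used ALFF band, assumed here), the repetition time, the synthetic signal, and the function `band_amplitude` are all illustrative assumptions.

```python
import numpy as np

def band_amplitude(signal, tr_seconds, low_hz, high_hz):
    """Mean amplitude of the spectrum between low_hz and high_hz (inclusive)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()               # remove the DC component
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=tr_seconds)      # frequency of each bin in Hz
    amplitude = 2.0 * np.abs(np.fft.rfft(signal)) / n
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return amplitude[band].mean()

# Synthetic component time course (hypothetical): a strong slow oscillation
# plus a weaker fast one, sampled every 2 s for 300 volumes.
tr = 2.0
t = np.arange(300) * tr
timecourse = 2.0 * np.sin(2 * np.pi * 0.05 * t) + 1.0 * np.sin(2 * np.pi * 0.2 * t)

alff_low = band_amplitude(timecourse, tr, 0.01, 0.08)   # assumed low-frequency band
alff_high = band_amplitude(timecourse, tr, 0.15, 0.25)  # assumed high-frequency band
print(alff_low, alff_high)
```

In a group analysis, such per-component band amplitudes would then be related to trait scores; that regression step is omitted here.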
Affiliation(s)
- Corey H. Allen
- The Mind Research Network, Albuquerque, NM, United States
- Eyal Aharoni
- Department of Psychology, Georgia State University, Atlanta, GA, United States
- Vince D. Calhoun
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA, United States
- Department of Computer Science, Georgia State University, Atlanta, GA, United States
- Kent A. Kiehl
- The Mind Research Network, Albuquerque, NM, United States
- Department of Psychology, University of New Mexico, Albuquerque, NM, United States
11
Schuurmans JP, Bennett MA, Petras K, Goffaux V. Backward masking reveals coarse-to-fine dynamics in human V1. Neuroimage 2023; 274:120139. [PMID: 37137434 DOI: 10.1016/j.neuroimage.2023.120139] [Received: 12/23/2022] [Revised: 04/20/2023] [Accepted: 04/26/2023] [Indexed: 05/05/2023]
Abstract
Natural images exhibit luminance variations aligned across a broad spectrum of spatial frequencies (SFs). It has been proposed that, at early stages of processing, the coarse signals carried by the low SFs (LSFs) of the visual input are sent rapidly from primary visual cortex (V1) to ventral, dorsal and frontal regions to form a coarse representation of the input, which is later sent back to V1 to guide the processing of fine-grained high SFs (HSFs). We used functional magnetic resonance imaging (fMRI) to investigate the role of human V1 in the coarse-to-fine integration of visual input. We disrupted the processing of the coarse and fine content of full-spectrum human face stimuli via backward masking of selective SF ranges (LSFs: <1.75 cpd and HSFs: >1.75 cpd) at specific times (50, 83, 100 or 150 ms). In line with coarse-to-fine proposals, we found that (1) the selective masking of stimulus LSFs disrupted V1 activity in the earliest time window, and progressively decreased in influence, while (2) an opposite trend was observed for the masking of stimulus HSFs. This pattern of activity was found in V1, as well as in ventral (i.e., the fusiform face area, FFA), dorsal and orbitofrontal regions. We additionally presented subjects with contrast-negated stimuli. While contrast negation significantly reduced response amplitudes in the FFA, as well as coupling between FFA and V1, coarse-to-fine dynamics were not affected by this manipulation. The fact that V1 response dynamics to strictly identical stimulus sets differed depending on the masked scale adds to growing evidence that the role of V1 goes beyond the early and quasi-passive transmission of visual information to the rest of the brain. It instead indicates that V1 may yield a 'spatially registered common forum' or 'blackboard' that integrates top-down inferences with incoming visual signals through its recurrent interaction with high-level regions located in the inferotemporal, dorsal and frontal regions.
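The LSF/HSF split at 1.75 cpd described in this abstract can be sketched with a Fourier-domain filter. The abstract does not state how the stimuli or masks were filtered, so the ideal (hard-edged) filter, the display resolution `pixels_per_degree`, and the function `split_spatial_frequencies` are assumptions for illustration. Converting cycles per pixel to cycles per degree requires the display resolution in pixels per degree of visual angle.

```python
import numpy as np

def split_spatial_frequencies(image, pixels_per_degree, cutoff_cpd=1.75):
    """Split a grayscale image into low- and high-spatial-frequency parts.

    Uses an ideal radial filter (assumption); the two parts sum to the input.
    """
    height, width = image.shape
    # cycles/pixel * pixels/degree = cycles/degree
    fy = np.fft.fftfreq(height) * pixels_per_degree
    fx = np.fft.fftfreq(width) * pixels_per_degree
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    spectrum = np.fft.fft2(image)
    low_mask = radius < cutoff_cpd
    lsf = np.real(np.fft.ifft2(spectrum * low_mask))
    hsf = np.real(np.fft.ifft2(spectrum * ~low_mask))
    return lsf, hsf

# Toy stimulus (hypothetical): a 1 cpd grating plus a 4 cpd grating,
# shown at 32 pixels per degree on a 128 x 128 pixel patch.
ppd = 32
x_deg = np.arange(128) / ppd
coarse = np.tile(np.sin(2 * np.pi * 1.0 * x_deg), (128, 1))
fine = np.tile(np.sin(2 * np.pi * 4.0 * x_deg), (128, 1))

lsf, hsf = split_spatial_frequencies(coarse + fine, ppd)
print(np.abs(lsf - coarse).max(), np.abs(hsf - fine).max())
```

A hard cutoff produces ringing on natural images, so smooth filters such as Butterworth or Gaussian profiles are often preferred in practice; the hard mask is used here only to keep the example short.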
Affiliation(s)
- Jolien P Schuurmans
- Psychological Sciences Research Institute (IPSY), UC Louvain, Louvain-la-Neuve, Belgium.
- Matthew A Bennett
- Psychological Sciences Research Institute (IPSY), UC Louvain, Louvain-la-Neuve, Belgium; Institute of Neuroscience (IONS), UC Louvain, Louvain-la-Neuve, Belgium
- Kirsten Petras
- Integrative Neuroscience and Cognition Center, CNRS, Université Paris Cité, Paris, France
- Valérie Goffaux
- Psychological Sciences Research Institute (IPSY), UC Louvain, Louvain-la-Neuve, Belgium; Institute of Neuroscience (IONS), UC Louvain, Louvain-la-Neuve, Belgium; Maastricht University, Maastricht, the Netherlands
12
Vannuscorps G, Galaburda A, Caramazza A. From intermediate shape-centered representations to the perception of oriented shapes: response to commentaries. Cogn Neuropsychol 2023; 40:71-94. [PMID: 37642330 DOI: 10.1080/02643294.2023.2250511] [Received: 04/19/2023] [Revised: 07/14/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023]
Abstract
In this response paper, we start by addressing the main points made by the commentators on the target article's main theoretical conclusions: the existence and characteristics of the intermediate shape-centered representations (ISCRs) in the visual system, their emergence from edge detection mechanisms operating on different types of visual properties, and how they are eventually reunited in higher order frames of reference underlying conscious visual perception. We also address the much-commented issue of the possible neural mechanisms of the ISCRs. In the final section, we address more specific and general comments, questions, and suggestions which, albeit very interesting, were less directly focused on the main conclusions of the target paper.
Affiliation(s)
- Gilles Vannuscorps
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Institute of Psychological Sciences, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Institute of Neuroscience, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Louvain Bionics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Albert Galaburda
- Department of Neurology, Harvard Medical School and Beth Israel Deaconess Medical Center, Boston, MA, USA
- Alfonso Caramazza
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Center for Mind/Brain Sciences (CIMeC), Università degli Studi di Trento, Rovereto, Italy
13
Zhang Y, Aghajan ZM, Ison M, Lu Q, Tang H, Kalender G, Monsoor T, Zheng J, Kreiman G, Roychowdhury V, Fried I. Decoding of human identity by computer vision and neuronal vision. Sci Rep 2023; 13:651. [PMID: 36635322 PMCID: PMC9837190 DOI: 10.1038/s41598-022-26946-w] [Received: 02/08/2022] [Accepted: 12/22/2022] [Indexed: 01/14/2023]
Abstract
Extracting meaning from a dynamic and variable flow of incoming information is a major goal of both natural and artificial intelligence. Computer vision (CV) guided by deep learning (DL) has made significant strides in recognizing a specific identity despite highly variable attributes. This is the same challenge faced by the nervous system and partially addressed by the concept cells (neurons exhibiting selective firing in response to specific persons/places) described in the human medial temporal lobe (MTL). Yet, access to neurons representing a particular concept is limited due to these neurons' sparse coding. It is conceivable, however, that the information required for such decoding is present in relatively small neuronal populations. To evaluate how well neuronal populations encode identity information in natural settings, we recorded neuronal activity from multiple brain regions of nine neurosurgical epilepsy patients implanted with depth electrodes, while the subjects watched an episode of the TV series "24". First, we devised a minimally supervised CV algorithm (with comparable performance against manually-labeled data) to detect the most prevalent characters (above 1% overall appearance) in each frame. Next, we implemented DL models that used the time-varying population neural data as inputs and decoded the visual presence of the four main characters throughout the episode. This methodology allowed us to compare "computer vision" with "neuronal vision" (footprints associated with each character present in the activity of a subset of neurons) and identify the brain regions that contributed to this decoding process. We then tested the DL models during a recognition memory task following movie viewing where subjects were asked to recognize clip segments from the presented episode.
DL model activations were not only modulated by the presence of the corresponding characters but also by participants' subjective memory of whether they had seen the clip segment, and by the associative strengths of the characters in the narrative plot. The described approach can offer novel ways to probe the representation of concepts in time-evolving dynamic behavioral tasks. Further, the results suggest that the information required to robustly decode concepts is present in the population activity of only tens of neurons even in brain regions beyond MTL.
Affiliation(s)
- Yipeng Zhang
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
| | - Zahra M. Aghajan
- Department of Neurosurgery, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Matias Ison
- School of Psychology, University of Nottingham, Nottingham, UK
| | - Qiujing Lu
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
| | - Hanlin Tang
- Children’s Hospital, Harvard Medical School, Boston, MA, USA
| | - Guldamla Kalender
- Department of Neurosurgery, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Tonmoy Monsoor
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
| | - Jie Zheng
- Children’s Hospital, Harvard Medical School, Boston, MA, USA
| | - Gabriel Kreiman
- Children’s Hospital, Harvard Medical School, Boston, MA, USA; Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Vwani Roychowdhury
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, CA, USA
| | - Itzhak Fried
- Department of Neurosurgery, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Department of Psychiatry and Biobehavioral Sciences, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA; Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| |
|
14
|
A Developmental Approach for Training Deep Belief Networks. Cognit Comput 2022. [DOI: 10.1007/s12559-022-10085-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
Abstract
Deep belief networks (DBNs) are stochastic neural networks that can extract rich internal representations of the environment from the sensory data. DBNs had a catalytic effect in triggering the deep learning revolution, demonstrating for the very first time the feasibility of unsupervised learning in networks with many layers of hidden neurons. These hierarchical architectures incorporate plausible biological and cognitive properties, making them particularly appealing as computational models of human perception and cognition. However, learning in DBNs is usually carried out in a greedy, layer-wise fashion, which does not allow simulating the holistic maturation of cortical circuits and prevents modeling of cognitive development. Here we present iDBN, an iterative learning algorithm for DBNs that jointly updates the connection weights across all layers of the model. We evaluate the proposed iterative algorithm on two different sets of visual stimuli, measuring the generative capabilities of the learned model and its potential to support supervised downstream tasks. We also track network development in terms of graph theoretical properties and investigate the potential extension of iDBN to continual learning scenarios. DBNs trained using our iterative approach achieve a final performance comparable to that of the greedy counterparts, while also allowing accurate analysis of the gradual development of internal representations in the deep network and the progressive improvement in task performance. Our work paves the way to the use of iDBN for modeling neurocognitive development.
|
15
|
Tesileanu T, Piasini E, Balasubramanian V. Efficient processing of natural scenes in visual cortex. Front Cell Neurosci 2022; 16:1006703. [PMID: 36545653 PMCID: PMC9760692 DOI: 10.3389/fncel.2022.1006703] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Accepted: 11/17/2022] [Indexed: 12/12/2022] Open
Abstract
Neural circuits in the periphery of the visual, auditory, and olfactory systems are believed to use limited resources efficiently to represent sensory information by adapting to the statistical structure of the natural environment. This "efficient coding" principle has been used to explain many aspects of early visual circuits including the distribution of photoreceptors, the mosaic geometry and center-surround structure of retinal receptive fields, the excess OFF pathways relative to ON pathways, saccade statistics, and the structure of simple cell receptive fields in V1. We know less about the extent to which such adaptations may occur in deeper areas of cortex beyond V1. We thus review recent developments showing that the perception of visual textures, which depends on processing in V2 and beyond in mammals, is adapted in rats and humans to the multi-point statistics of luminance in natural scenes. These results suggest that central circuits in the visual brain are adapted for seeing key aspects of natural scenes. We conclude by discussing how adaptation to natural temporal statistics may aid in learning and representing visual objects, and propose two challenges for the future: (1) explaining the distribution of shape sensitivity in the ventral visual stream from the statistics of object shape in natural images, and (2) explaining cell types of the vertebrate retina in terms of feature detectors that are adapted to the spatio-temporal structures of natural stimuli. We also discuss how new methods based on machine learning may complement the normative, principles-based approach to theoretical neuroscience.
Affiliation(s)
- Tiberiu Tesileanu
- Center for Computational Neuroscience, Flatiron Institute, New York, NY, United States
| | - Eugenio Piasini
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
| | - Vijay Balasubramanian
- Department of Physics and Astronomy, David Rittenhouse Laboratory, University of Pennsylvania, Philadelphia, PA, United States; Santa Fe Institute, Santa Fe, NM, United States
| |
|
16
|
Zhang M, Armendariz M, Xiao W, Rose O, Bendtz K, Livingstone M, Ponce C, Kreiman G. Look twice: A generalist computational model predicts return fixations across tasks and species. PLoS Comput Biol 2022; 18:e1010654. [PMID: 36413523 PMCID: PMC9681066 DOI: 10.1371/journal.pcbi.1010654] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Accepted: 10/13/2022] [Indexed: 11/23/2022] Open
Abstract
Primates constantly explore their surroundings via saccadic eye movements that bring different parts of an image into high resolution. In addition to exploring new regions in the visual field, primates also make frequent return fixations, revisiting previously foveated locations. We systematically studied a total of 44,328 return fixations out of 217,440 fixations. Return fixations were ubiquitous across different behavioral tasks, in monkeys and humans, both when subjects viewed static images and when subjects performed natural behaviors. Return fixation locations were consistent across subjects, tended to occur within short temporal offsets, and typically followed a 180-degree turn in saccadic direction. To understand the origin of return fixations, we propose a proof-of-principle, biologically-inspired and image-computable neural network model. The model combines five key modules: an image feature extractor, bottom-up saliency cues, task-relevant visual features, finite inhibition-of-return, and saccade size constraints. Even though there are no free parameters that are fine-tuned for each specific task, species, or condition, the model produces fixation sequences resembling the universal properties of return fixations. These results provide initial steps towards a mechanistic understanding of the trade-off between rapid foveal recognition and the need to scrutinize previous fixation locations.
Affiliation(s)
- Mengmi Zhang
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
| | - Marcelo Armendariz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- Laboratory for Neuro- and Psychophysiology, KU Leuven, Leuven, Belgium
| | - Will Xiao
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Olivia Rose
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Katarina Bendtz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
| | - Margaret Livingstone
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Carlos Ponce
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Gabriel Kreiman
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
| |
|
17
|
Wang XJ. Theory of the Multiregional Neocortex: Large-Scale Neural Dynamics and Distributed Cognition. Annu Rev Neurosci 2022; 45:533-560. [PMID: 35803587 DOI: 10.1146/annurev-neuro-110920-035434] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Abstract
The neocortex is a complex neurobiological system with many interacting regions. How these regions work together to subserve flexible behavior and cognition has become increasingly amenable to rigorous research. Here, I review recent experimental and theoretical work on the modus operandi of a multiregional cortex. These studies revealed several general principles for the neocortical interareal connectivity, low-dimensional macroscopic gradients of biological properties across cortical areas, and a hierarchy of timescales for information processing. Theoretical work suggests testable predictions regarding differential excitation and inhibition along feedforward and feedback pathways in the cortical hierarchy. Furthermore, modeling of distributed working memory and simple decision-making has given rise to a novel mathematical concept, dubbed bifurcation in space, that potentially explains how different cortical areas, with a canonical circuit organization but gradients of biological heterogeneities, are able to subserve their respective (e.g., sensory coding versus executive control) functions in a modularly organized brain.
Affiliation(s)
- Xiao-Jing Wang
- Center for Neural Science, New York University, New York, NY, USA
| |
|
18
|
Larsen BW, Druckmann S. Towards a more general understanding of the algorithmic utility of recurrent connections. PLoS Comput Biol 2022; 18:e1010227. [PMID: 35727818 PMCID: PMC9258846 DOI: 10.1371/journal.pcbi.1010227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 07/06/2022] [Accepted: 05/17/2022] [Indexed: 11/18/2022] Open
Abstract
Lateral and recurrent connections are ubiquitous in biological neural circuits. Yet while the strong computational abilities of feedforward networks have been extensively studied, our understanding of the role and advantages of recurrent computations that might explain their prevalence remains an important open challenge. Foundational studies by Minsky and Roelfsema argued that computations that require propagation of global information for local computation to take place would particularly benefit from the sequential, parallel nature of processing in recurrent networks. Such “tag propagation” algorithms perform repeated, local propagation of information and were originally introduced in the context of detecting connectedness, a task that is challenging for feedforward networks. Here, we advance the understanding of the utility of lateral and recurrent computation by first performing a large-scale empirical study of neural architectures for the computation of connectedness to explore feedforward solutions more fully and establish robustly the importance of recurrent architectures. In addition, we highlight a tradeoff between computation time and performance and construct hybrid feedforward/recurrent models that perform well even in the presence of varying computational time limitations. We then generalize tag propagation architectures to propagating multiple interacting tags and demonstrate that these are efficient computational substrates for more general computations of connectedness by introducing and solving an abstracted biologically inspired decision-making task. Our work thus clarifies and expands the set of computational tasks that can be solved efficiently by recurrent computation, yielding hypotheses for structure in population activity that may be present in such tasks. 
In striking contrast to the majority of current-day artificial neural network research which primarily focuses on feedforward architectures, biological brains make extensive use of lateral and recurrent connections. This raises the possibility that this difference makes a fundamental contribution to the gap in computational power between real neural circuits and artificial neural networks. Thus, despite the difficulty of making effective comparisons between different network architectures, developing a more detailed understanding of the computational role played by such connections is a pressing challenge. Here, we leverage the computational capabilities of large-scale machine learning to robustly explore how differences in architectures affect a network’s ability to learn tasks that require propagation of global information. We first focus on the task of determining whether two pixels are connected in an image which has an elegant and efficient recurrent solution: propagate a connected label or tag along paths. Inspired by this solution, we show that it can be generalized in many ways, including propagating multiple tags at once and changing the computation performed on the result of the propagation. Strikingly, this simple expansion of the tag propagation network is sufficient to solve a crucial abstraction to temporal connectedness at the core of many decision-making problems, which we illustrate for an abstracted competitive foraging task. Our results shed light on the set of computational tasks that can be solved efficiently by recurrent computation and how these solutions may relate to the structure of neural activity.
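The recurrent solution described in this summary (propagating a tag from one pixel along paths until it either reaches the other pixel or stops spreading) can be illustrated with a minimal sketch. This is not the authors' network: the function name `connected`, the 4-neighbour rule, and the `max_steps` cap (used here only to mimic a limit on computation time) are assumptions made for illustration.

```python
def connected(grid, start, goal, max_steps=None):
    """Decide whether two pixels are joined by a 4-connected path of 1s.

    Illustrative tag propagation: at each step every tagged pixel passes
    the tag to its active neighbours. `max_steps` is a hypothetical cap
    on the number of propagation steps.
    """
    rows, cols = len(grid), len(grid[0])
    if not (grid[start[0]][start[1]] and grid[goal[0]][goal[1]]):
        return False  # a tag can only live on active pixels
    tagged = {start}
    steps = rows * cols if max_steps is None else max_steps
    for _ in range(steps):
        newly_tagged = set()
        for r, c in tagged:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] and (nr, nc) not in tagged:
                    newly_tagged.add((nr, nc))
        if not newly_tagged:
            break  # the tag has stopped spreading
        tagged |= newly_tagged
        if goal in tagged:
            return True
    return goal in tagged


grid = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]
print(connected(grid, (0, 0), (1, 1)))               # path exists
print(connected(grid, (0, 0), (2, 2)))               # isolated pixel
print(connected(grid, (0, 0), (1, 1), max_steps=1))  # too few steps
```

Capping the number of steps makes the answer depend on path length, which is one simple way to picture the tradeoff between computation time and performance that the abstract mentions.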
Affiliation(s)
- Brett W. Larsen
- Department of Physics, Stanford University, Stanford, California, United States of America
- Department of Neurobiology, Stanford University School of Medicine, Stanford, California, United States of America
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America
| | - Shaul Druckmann
- Department of Neurobiology, Stanford University School of Medicine, Stanford, California, United States of America
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America
| |
|
19
|
Vacher J, Launay C, Coen-Cagli R. Flexibly regularized mixture models and application to image segmentation. Neural Netw 2022; 149:107-123. [PMID: 35228148 PMCID: PMC8944213 DOI: 10.1016/j.neunet.2022.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 01/08/2022] [Accepted: 02/07/2022] [Indexed: 11/23/2022]
Abstract
Probabilistic finite mixture models are widely used for unsupervised clustering. These models can often be improved by adapting them to the topology of the data. For instance, in order to classify spatially adjacent data points similarly, it is common to introduce a Laplacian constraint on the posterior probability that each data point belongs to a class. Alternatively, the mixing probabilities can be treated as free parameters, while assuming Gauss-Markov or more complex priors to regularize those mixing probabilities. However, these approaches are constrained by the shape of the prior and often lead to complicated or intractable inference. Here, we propose a new parametrization of the Dirichlet distribution to flexibly regularize the mixing probabilities of over-parametrized mixture distributions. Using the Expectation-Maximization algorithm, we show that our approach allows us to define any linear update rule for the mixing probabilities, including spatial smoothing regularization as a special case. We then show that this flexible design can be extended to share class information between multiple mixture models. We apply our algorithm to artificial and natural image segmentation tasks, and we provide quantitative and qualitative comparison of the performance of Gaussian and Student-t mixtures on the Berkeley Segmentation Dataset. We also demonstrate how to propagate class information across the layers of deep convolutional neural networks in a probabilistically optimal way, suggesting a new interpretation for feedback signals in biological visual systems. Our flexible approach can be easily generalized to adapt probabilistic mixture models to arbitrary data topologies.
Affiliation(s)
- Jonathan Vacher
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, PSL University, 24 rue Lhomond, Bâtiment Jaurès, 2ème étage, Paris, 75005, France.
| | - Claire Launay
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA.
| | - Ruben Coen-Cagli
- Department of Systems & Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA; Department of Ophthalmology & Visual Sciences, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, 10461, NY, USA.
| |
|
20
|
Ghio M, Conca F, Bellebaum C, Perani D, Tettamanti M. Effective connectivity within the neural system for object-directed action representation during aware and unaware tool processing. Cortex 2022; 153:55-65. [DOI: 10.1016/j.cortex.2022.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2021] [Revised: 02/15/2022] [Accepted: 04/06/2022] [Indexed: 11/25/2022]
|
21
|
Vaishnav M, Cadene R, Alamia A, Linsley D, VanRullen R, Serre T. Understanding the Computational Demands Underlying Visual Reasoning. Neural Comput 2022; 34:1075-1099. [PMID: 35231926 DOI: 10.1162/neco_a_01485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2021] [Accepted: 12/07/2021] [Indexed: 11/04/2022]
Abstract
Visual understanding requires comprehending complex visual relations between objects within a scene. Here, we seek to characterize the computational demands for abstract visual reasoning. We do this by systematically assessing the ability of modern deep convolutional neural networks (CNNs) to learn to solve the synthetic visual reasoning test (SVRT) challenge, a collection of 23 visual reasoning problems. Our analysis reveals a novel taxonomy of visual reasoning tasks, which can be primarily explained by both the type of relations (same-different versus spatial-relation judgments) and the number of relations used to compose the underlying rules. Prior cognitive neuroscience work suggests that attention plays a key role in humans' visual reasoning ability. To test this hypothesis, we extended the CNNs with spatial and feature-based attention mechanisms. In a second series of experiments, we evaluated the ability of these attention networks to learn to solve the SVRT challenge and found the resulting architectures to be much more efficient at solving the hardest of these visual reasoning tasks. Most important, the corresponding improvements on individual tasks partially explained our novel taxonomy. Overall, this work provides a granular computational account of visual reasoning and yields testable neuroscience predictions regarding the differential need for feature-based versus spatial attention depending on the type of visual reasoning problem.
Affiliation(s)
- Mohit Vaishnav
- Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, 31052 Toulouse, France; Carney Institute for Brain Science, Department of Cognitive Linguistic and Psychological Sciences, Brown University, Providence, RI 02912, U.S.A.
| | - Remi Cadene
- Carney Institute for Brain Science, Department of Cognitive Linguistic and Psychological Sciences, Brown University, Providence, RI 02912, U.S.A.
| | - Andrea Alamia
- Centre de Recherche Cerveau et Cognition, CNRS, Université de Toulouse, 31052 Toulouse, France
| | - Drew Linsley
- Carney Institute for Brain Science, Department of Cognitive Linguistic and Psychological Sciences, Brown University, Providence, RI 02912, U.S.A.
| | - Rufin VanRullen
- Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, and Centre de Recherche Cerveau et Cognition, CNRS, Université de Toulouse, 31052 Toulouse, France
| | - Thomas Serre
- Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, 31052 Toulouse, France; Carney Institute for Brain Science, Department of Cognitive Linguistic and Psychological Sciences, Brown University, Providence, RI 02912, U.S.A.
| |
|
22
|
The role of ventral stream areas for viewpoint-invariant object recognition. Neuroimage 2022; 251:119021. [DOI: 10.1016/j.neuroimage.2022.119021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Revised: 01/16/2022] [Accepted: 02/17/2022] [Indexed: 11/21/2022] Open
|
23
|
Zhuo C, Tian H, Zhou C, Sun Y, Chen X, Li R, Chen J, Yang L, Li Q, Zhang Q, Xu Y, Song X. Transcranial direct current stimulation of the occipital lobes with adjunct lithium attenuates the progression of cognitive impairment in patients with first episode schizophrenia. Front Psychiatry 2022; 13:962918. [PMID: 36177219 PMCID: PMC9513041 DOI: 10.3389/fpsyt.2022.962918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Accepted: 08/02/2022] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND There is no standard effective treatment for schizophrenia-associated cognitive impairment. Efforts to use non-invasive brain stimulation for this purpose have been focused mostly on the frontal cortex, with little attention being given to the occipital lobe. MATERIALS AND METHODS We compared the effects of nine intervention strategies on cognitive performance in psychometric measures and on brain connectivity measures obtained from functional magnetic resonance imaging analyses. The strategies consisted of transcranial direct current stimulation (t-DCS) or repetitive transcranial magnetic stimulation (r-TMS) of the frontal lobe or of the occipital lobe, alone or with adjunct lithium, or lithium monotherapy. We measured global functional connectivity density (gFCD) voxel-wise. RESULTS Although all nine patient groups showed significant improvements in global disability scores (GDSs) following the intervention period (vs. before), the greatest improvement in GDS was observed for the group that received occipital lobe-targeted t-DCS with adjunct lithium therapy. t-DCS of the occipital lobe improved gFCD throughout the brain, including in the frontal lobes, whereas stimulation of the frontal lobes had less far-reaching benefits on gFCD in the brain. Adverse secondary effects (ASEs) such as headache, dizziness, and nausea were commonly experienced by patients treated with t-DCS and r-TMS, with or without lithium, whereas ASEs were rare with lithium alone. CONCLUSION The most effective treatment strategy for impacting cognitive impairment and brain communication was t-DCS of the occipital lobe with adjunct lithium therapy, though patients often experienced headache with dizziness and nausea after treatment sessions.
Affiliation(s)
- Chuanjun Zhuo
- Key Laboratory of Real Time Brain Circuit Tracing in Neurology and Psychiatry (RTBNP_Lab), Tianjin Fourth Center Hospital, Tianjin Fourth Central Hospital of Tianjin Medical University, Tianjin, China; Key Laboratory of Multiple Organ Damages of Major Psychoses (MODMP_Lab), Tianjin Fourth Center Hospital, Tianjin Medical Affiliated Tianjin Fourth Central Hospital, Nankai University Affiliated Tianjin Fourth Center Hospital, Tianjin, China; Henan Psychiatric Transformation Research Key Laboratory, Zhengzhou University, Zhengzhou, Henan, China; Biological Psychiatry International Joint Laboratory of Henan, Zhengzhou University, Zhengzhou, Henan, China; t-DCS and r-TMS Center of Tianjin Anding Hospital, Tianjin Mental Health Center of Tianjin Medical University, Tianjin, China; Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
| | - Hongjun Tian
- Key Laboratory of Multiple Organ Damages of Major Psychoses (MODMP_Lab), Tianjin Fourth Center Hospital, Tianjin Medical Affiliated Tianjin Fourth Central Hospital, Nankai University Affiliated Tianjin Fourth Center Hospital, Tianjin, China
| | - Chunhua Zhou
- Department of Pharmacology, The First Hospital of Hebei Medical University, Shijiazhuang, Hebei, China
| | - Yun Sun
- t-DCS and r-TMS Center of Tianjin Anding Hospital, Tianjin Mental Health Center of Tianjin Medical University, Tianjin, China
| | - Xinying Chen
- t-DCS and r-TMS Center of Tianjin Anding Hospital, Tianjin Mental Health Center of Tianjin Medical University, Tianjin, China
| | - Ranli Li
- t-DCS and r-TMS Center of Tianjin Anding Hospital, Tianjin Mental Health Center of Tianjin Medical University, Tianjin, China
| | - Jiayue Chen
- Key Laboratory of Real Time Brain Circuit Tracing in Neurology and Psychiatry (RTBNP_Lab), Tianjin Fourth Center Hospital, Tianjin Fourth Central Hospital of Tianjin Medical University, Tianjin, China
| | - Lei Yang
- Key Laboratory of Real Time Brain Circuit Tracing in Neurology and Psychiatry (RTBNP_Lab), Tianjin Fourth Center Hospital, Tianjin Fourth Central Hospital of Tianjin Medical University, Tianjin, China
| | - Qianchen Li
- Key Laboratory of Real Time Brain Circuit Tracing in Neurology and Psychiatry (RTBNP_Lab), Tianjin Fourth Center Hospital, Tianjin Fourth Central Hospital of Tianjin Medical University, Tianjin, China
| | - Qiuyu Zhang
- Key Laboratory of Real Time Brain Circuit Tracing in Neurology and Psychiatry (RTBNP_Lab), Tianjin Fourth Center Hospital, Tianjin Fourth Central Hospital of Tianjin Medical University, Tianjin, China
| | - Yong Xu
- Department of Psychiatry, The First Hospital Affiliated to Shanxi Medical University, Taiyuan, China
| | - Xueqin Song
- Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
| |
|
24
|
Biological convolutions improve DNN robustness to noise and generalisation. Neural Netw 2021; 148:96-110. [PMID: 35114495 DOI: 10.1016/j.neunet.2021.12.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2021] [Revised: 11/11/2021] [Accepted: 12/07/2021] [Indexed: 11/19/2022]
Abstract
Deep Convolutional Neural Networks (DNNs) have achieved superhuman accuracy on standard image classification benchmarks. Their success has reignited significant interest in their use as models of the primate visual system, bolstered by claims of their architectural and representational similarities. However, closer scrutiny of these models suggests that they rely on various forms of shortcut learning to achieve their impressive performance, such as using texture rather than shape information. Such superficial solutions to image recognition have been shown to make DNNs brittle in the face of more challenging tests such as noise-perturbed or out-of-distribution images, casting doubt on their similarity to their biological counterparts. In the present work, we demonstrate that adding fixed biological filter banks, in particular banks of Gabor filters, helps to constrain the networks to avoid reliance on shortcuts, making them develop more structured internal representations and more tolerance to noise. Importantly, they also gained around 20-35% improved accuracy when generalising to our novel out-of-distribution test image sets over standard end-to-end trained architectures. We take these findings to suggest that these properties of the primate visual system should be incorporated into DNNs to make them more able to cope with real-world vision and better capture some of the more impressive aspects of human visual perception such as generalisation.
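For readers unfamiliar with the "banks of Gabor filters" mentioned in this abstract, the sketch below builds a small bank of oriented kernels from the standard Gabor formula (a Gaussian envelope multiplied by a cosine carrier). It is an illustration only: the function names, kernel size, wavelength, and number of orientations are assumptions, not the settings used in the paper, and the sketch does not show how the filters are wired into a network.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size Gabor kernel as a list of rows (size must be odd).

    theta is the orientation in radians, sigma the envelope width,
    gamma the spatial aspect ratio, psi the phase offset.
    """
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along orientation theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size=7, wavelength=4.0, sigma=2.0, n_orientations=4):
    """Fixed bank of kernels at evenly spaced orientations (hypothetical defaults)."""
    return [
        gabor_kernel(size, wavelength, k * math.pi / n_orientations, sigma)
        for k in range(n_orientations)
    ]


bank = gabor_bank()
print(len(bank), len(bank[0]), len(bank[0][0]))
```

In a setup like the one the abstract describes, kernels of this kind would be kept fixed rather than learned, so that the first stage of the network is constrained to oriented, band-pass features.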
|
25
|
Gupta SK, Zhang M, Wu CC, Wolfe JM, Kreiman G. Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases. Advances in Neural Information Processing Systems 2021; 34:6946-6959. [PMID: 36062138 PMCID: PMC9436507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found. The model integrates eccentricity-dependent visual recognition with target-dependent top-down cues. We compared the model against human behavior in six paradigmatic search tasks that show asymmetry in humans. Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment. We tested this hypothesis by training the model on augmented versions of ImageNet where the biases of natural images were either removed or reversed. The polarity of search asymmetry disappeared or was altered depending on the training protocol. This study highlights how classical perceptual properties can emerge in neural network models, without the need for task-specific training, but rather as a consequence of the statistical properties of the developmental diet fed to the model. All source code and data are publicly available at https://github.com/kreimanlab/VisualSearchAsymmetry.
Affiliation(s)
| | - Mengmi Zhang
- Children's Hospital, Harvard Medical School
- Center for Brains, Minds and Machines
| | - Chia-Chien Wu
- Brigham and Women's Hospital, Harvard Medical School
| | | | - Gabriel Kreiman
- Children's Hospital, Harvard Medical School
- Center for Brains, Minds and Machines
| |
|
26
|
Seijdel N, Loke J, van de Klundert R, van der Meer M, Quispel E, van Gaal S, de Haan EHF, Scholte HS. On the Necessity of Recurrent Processing during Object Recognition: It Depends on the Need for Scene Segmentation. J Neurosci 2021; 41:6281-6289. [PMID: 34088797 PMCID: PMC8287993 DOI: 10.1523/jneurosci.2851-20.2021] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/09/2020] [Revised: 04/11/2021] [Accepted: 05/13/2021] [Indexed: 11/21/2022] Open
Abstract
Although feedforward activity may suffice for recognizing objects in isolation, additional visual operations that aid object recognition might be needed for real-world scenes. One such additional operation is figure-ground segmentation, extracting the relevant features and locations of the target object while ignoring irrelevant features. In this study of 60 human participants (female and male), we show objects on backgrounds of increasing complexity to investigate whether recurrent computations are increasingly important for segmenting objects from more complex backgrounds. Three lines of evidence show that recurrent processing is critical for recognition of objects embedded in complex scenes. First, behavioral results indicated a greater reduction in performance after masking objects presented on more complex backgrounds, with the degree of impairment increasing with increasing background complexity. Second, electroencephalography (EEG) measurements showed clear differences in the evoked response potentials between conditions around time points beyond feedforward activity, and exploratory object decoding analyses based on the EEG signal indicated later decoding onsets for objects embedded in more complex backgrounds. Third, deep convolutional neural network performance confirmed this interpretation. Feedforward and less deep networks showed a higher degree of impairment in recognition for objects in complex backgrounds compared with recurrent and deeper networks. Together, these results support the notion that recurrent computations drive figure-ground segmentation of objects in complex scenes.
SIGNIFICANCE STATEMENT: The incredible speed of object recognition suggests that it relies purely on a fast feedforward buildup of perceptual activity. However, this view is contradicted by studies showing that disruption of recurrent processing leads to decreased object recognition performance. Here, we resolve this issue by showing that how object recognition is resolved and whether recurrent processing is crucial depends on the context in which it is presented. For objects presented in isolation or in simple environments, feedforward activity could be sufficient for successful object recognition. However, when the environment is more complex, additional processing seems necessary to select the elements that belong to the object and by that segregate them from the background.
Affiliation(s)
- Noor Seijdel
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Jessica Loke
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Ron van de Klundert
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Matthew van der Meer
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Eva Quispel
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Simon van Gaal
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - Edward H F de Haan
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| | - H Steven Scholte
- Department of Psychology, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
- Amsterdam Brain and Cognition Center, University of Amsterdam, 1018 WS Amsterdam, The Netherlands
| |
|
27
|
Chavlis S, Poirazi P. Drawing inspiration from biological dendrites to empower artificial neural networks. Curr Opin Neurobiol 2021; 70:1-10. [PMID: 34087540 DOI: 10.1016/j.conb.2021.04.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/02/2021] [Revised: 04/21/2021] [Accepted: 04/28/2021] [Indexed: 12/24/2022]
Abstract
This article highlights specific features of biological neurons and their dendritic trees, whose adoption may help advance artificial neural networks used in various machine learning applications. Advancements could take the form of increased computational capabilities and/or reduced power consumption. Proposed features include dendritic anatomy, dendritic nonlinearities, and compartmentalized plasticity rules, all of which shape learning and information processing in biological networks. We discuss the computational benefits provided by these features in biological neurons and suggest ways to adopt them in artificial neurons in order to exploit the respective benefits in machine learning.
Affiliation(s)
- Spyridon Chavlis
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, 70013, Greece
| | - Panayiota Poirazi
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, 70013, Greece.
| |
|
28
|
Doron G, Shin JN, Takahashi N, Drüke M, Bocklisch C, Skenderi S, de Mont L, Toumazou M, Ledderose J, Brecht M, Naud R, Larkum ME. Perirhinal input to neocortical layer 1 controls learning. Science 2021; 370:eaaz3136. [PMID: 33335033 DOI: 10.1126/science.aaz3136] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 08/30/2019] [Revised: 08/27/2020] [Accepted: 10/23/2020] [Indexed: 12/28/2022]
Abstract
Hippocampal output influences memory formation in the neocortex, but this process is poorly understood because the precise anatomical location and the underlying cellular mechanisms remain elusive. Here, we show that perirhinal input, predominantly to sensory cortical layer 1 (L1), controls hippocampal-dependent associative learning in rodents. This process was marked by the emergence of distinct firing responses in defined subpopulations of layer 5 (L5) pyramidal neurons whose tuft dendrites receive perirhinal inputs in L1. Learning correlated with burst firing and the enhancement of dendritic excitability, and it was suppressed by disruption of dendritic activity. Furthermore, bursts, but not regular spike trains, were sufficient to retrieve learned behavior. We conclude that hippocampal information arriving at L5 tuft dendrites in neocortical L1 mediates memory formation in the neocortex.
Affiliation(s)
- Guy Doron
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany.
| | - Jiyun N Shin
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Naoya Takahashi
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Moritz Drüke
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Christina Bocklisch
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Salina Skenderi
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Lisa de Mont
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Maria Toumazou
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Julia Ledderose
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany
| | - Michael Brecht
- Bernstein Center for Computational Neuroscience, Humboldt-Universität zu Berlin, D-10115 Berlin, Germany; NeuroCure Cluster, Charité - Universitätsmedizin Berlin, D-10117 Berlin, Germany
| | - Richard Naud
- University of Ottawa Brain and Mind Institute, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada; Department of Physics, University of Ottawa, Ottawa, ON K1N 6N5, Canada
| | - Matthew E Larkum
- Institute for Biology, Humboldt-Universität zu Berlin, D-10117 Berlin, Germany; NeuroCure Cluster, Charité - Universitätsmedizin Berlin, D-10117 Berlin, Germany
| |
|
29
|
Differential Involvement of EEG Oscillatory Components in Sameness versus Spatial-Relation Visual Reasoning Tasks. eNeuro 2021; 8:ENEURO.0267-20.2020. [PMID: 33239271 PMCID: PMC7877474 DOI: 10.1523/eneuro.0267-20.2020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2020] [Revised: 10/20/2020] [Accepted: 10/21/2020] [Indexed: 11/21/2022] Open
Abstract
The development of deep convolutional neural networks (CNNs) has recently led to great successes in computer vision, and CNNs have become de facto computational models of vision. However, a growing body of work suggests that they exhibit critical limitations on tasks beyond image categorization. Here, we study one such fundamental limitation, concerning the judgment of whether two simultaneously presented items are the same or different (SD) compared with a baseline assessment of their spatial relationship (SR). In both human subjects and artificial neural networks, we test the prediction that SD tasks recruit additional cortical mechanisms which underlie critical aspects of visual cognition that are not explained by current computational models. We thus recorded electroencephalography (EEG) signals from human participants engaged in the same tasks as the computational models. Importantly, in humans the two tasks were matched in terms of difficulty by an adaptive psychometric procedure; yet, on top of a modulation of evoked potentials (EPs), our results revealed higher activity in the low β (16–24 Hz) band in the SD compared with the SR conditions. We surmise that these oscillations reflect the crucial involvement of additional mechanisms, such as working memory and attention, which are missing in current feed-forward CNNs.
|
30
|
van Bergen RS, Kriegeskorte N. Going in circles is the way forward: the role of recurrence in visual inference. Curr Opin Neurobiol 2020; 65:176-193. [PMID: 33279795 DOI: 10.1016/j.conb.2020.11.009] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 03/26/2020] [Revised: 11/16/2020] [Accepted: 11/16/2020] [Indexed: 11/30/2022]
Abstract
Biological visual systems exhibit abundant recurrent connectivity. State-of-the-art neural network models for visual recognition, by contrast, rely heavily or exclusively on feedforward computation. Any finite-time recurrent neural network (RNN) can be unrolled along time to yield an equivalent feedforward neural network (FNN). This important insight suggests that computational neuroscientists may not need to engage recurrent computation, and that computer-vision engineers may be limiting themselves to a special case of FNN if they build recurrent models. Here we argue, to the contrary, that FNNs are a special case of RNNs and that computational neuroscientists and engineers should engage recurrence to understand how brains and machines can (1) achieve greater and more flexible computational depth (2) compress complex computations into limited hardware (3) integrate priors and priorities into visual inference through expectation and attention (4) exploit sequential dependencies in their data for better inference and prediction and (5) leverage the power of iterative computation.
Affiliation(s)
- Ruben S van Bergen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States
| | - Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States; Department of Psychology, Columbia University, New York, NY, United States; Department of Neuroscience, Columbia University, New York, NY, United States; Affiliated member, Electrical Engineering, Columbia University, New York, NY, United States
| |
|