1. Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, Yamins DLK. A unifying framework for functional organization in early and higher ventral visual cortex. Neuron 2024; 112:2435-2451.e7. PMID: 38733985; PMCID: PMC11257790; DOI: 10.1016/j.neuron.2024.04.018.
Abstract
A key feature of cortical systems is functional organization: the arrangement of functionally distinct neurons in characteristic spatial patterns. However, the principles underlying the emergence of functional organization in the cortex are poorly understood. Here, we develop the topographic deep artificial neural network (TDANN), the first model to predict several aspects of the functional organization of multiple cortical areas in the primate visual system. We analyze the factors driving the TDANN's success and find that it balances two objectives: learning a task-general sensory representation and maximizing the spatial smoothness of responses according to a metric that scales with cortical surface area. In turn, the representations learned by the TDANN are more brain-like than in spatially unconstrained models. Finally, we provide evidence that the TDANN's functional organization balances performance with between-area connection length. Our results offer a unified principle for understanding the functional organization of the primate ventral visual system.
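The spatial smoothness objective can be illustrated with a toy computation. The sketch below is hypothetical code, not the TDANN's actual loss; the function name and the inverse-distance weighting are our own choices. It scores a layout of model units by correlating pairwise response similarity with inverse cortical distance:

```python
import numpy as np

def spatial_smoothness_score(positions, responses):
    """Correlate pairwise response similarity with inverse cortical
    distance: high scores mean functionally similar units sit close
    together (an illustrative stand-in for a spatial smoothness loss)."""
    n = positions.shape[0]
    # pairwise Euclidean distances between simulated cortical positions
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    # pairwise response correlations across stimuli
    sim = np.corrcoef(responses)
    iu = np.triu_indices(n, k=1)
    return np.corrcoef(sim[iu], 1.0 / (1.0 + dist[iu]))[0, 1]

rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(50, 2))    # unit positions on a cortical sheet
resp = rng.normal(size=(50, 200))            # responses to 200 stimuli
score = spatial_smoothness_score(pos, resp)  # near zero for a random layout
```

A model that maximizes such a score alongside task performance would, as the paper argues, trade task accuracy against spatial clustering of similar tuning.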
Affiliation(s)
- Eshed Margalit
- Neurosciences Graduate Program, Stanford University, Stanford, CA 94305, USA
- Hyodong Lee
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Dawn Finzi
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- James J DiCarlo
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Kalanit Grill-Spector
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
- Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305, USA
2. Alcalá-López D, Mei N, Margolles P, Soto D. Brain-wide representation of social knowledge. Soc Cogn Affect Neurosci 2024; 19:nsae032. PMID: 38804694; PMCID: PMC11173195; DOI: 10.1093/scan/nsae032.
Abstract
Understanding how the human brain maps different dimensions of social conceptualizations remains a key unresolved issue. We performed a functional magnetic resonance imaging (fMRI) study in which participants were exposed to audio definitions of personality traits and asked to simulate experiences associated with the concepts. Half of the concepts were affective (e.g. empathetic), and the other half were non-affective (e.g. intelligent). Orthogonally, half of the concepts were highly likable (e.g. sincere) and half were socially undesirable (e.g. liar). Behaviourally, we observed that the dimension of social desirability reflected the participants' subjective ratings better than affect. fMRI decoding results showed that both social desirability and affect could be decoded from local patterns of activity in distributed brain regions, including the superior temporal and inferior frontal cortices, the precuneus, and key nodes of the default mode network in posterior/anterior cingulate and ventromedial prefrontal cortex. Decoding accuracy was better for social desirability than for affect. A representational similarity analysis further demonstrated that a deep language model significantly predicted brain activity associated with the concepts in bilateral regions of the superior and anterior temporal lobes. The results demonstrate a brain-wide representation of social knowledge, involving default mode network systems that support the multimodal simulation of social experience, with a further reliance on language-related preprocessing.
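Representational similarity analysis of the kind used here compares the geometry of model and brain responses. A minimal sketch follows (illustrative numpy code, not the authors' pipeline; the toy "embeddings" and "voxel" patterns are simulated):

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between the
    response patterns evoked by each concept."""
    return 1.0 - np.corrcoef(patterns)

def rsa(rdm_a, rdm_b):
    """Spearman correlation between the upper triangles of two RDMs,
    the usual RSA summary statistic."""
    iu = np.triu_indices(rdm_a.shape[0], k=1)
    ranks_a = rdm_a[iu].argsort().argsort()
    ranks_b = rdm_b[iu].argsort().argsort()
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(24, 300))            # toy language-model embeddings
voxels = embeddings @ rng.normal(size=(300, 400))  # toy brain patterns derived from them
voxels += 0.5 * rng.normal(size=voxels.shape)
score = rsa(rdm(embeddings), rdm(voxels))          # high when geometry is shared
```

A significant RSA score, as reported here for temporal-lobe regions, indicates that concepts close together in the model space evoke similar brain patterns.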
Affiliation(s)
- Daniel Alcalá-López
- Consciousness group, Basque Center on Cognition, Brain and Language, San Sebastian 20009, Spain
- Ning Mei
- Psychology Department, Shenzhen University, Nanshan district, Guangdong province 3688, China
- Pedro Margolles
- Consciousness group, Basque Center on Cognition, Brain and Language, San Sebastian 20009, Spain
- David Soto
- Consciousness group, Basque Center on Cognition, Brain and Language, San Sebastian 20009, Spain
3. Cusack R, Ranzato M, Charvet CJ. Helpless infants are learning a foundation model. Trends Cogn Sci 2024:S1364-6613(24)00114-1. PMID: 38839537; DOI: 10.1016/j.tics.2024.05.001.
Abstract
Humans have a protracted postnatal helplessness period, typically attributed to human-specific maternal constraints causing an early birth when the brain is highly immature. By aligning neurodevelopmental events across species, however, it has been found that humans are not born with especially immature brains compared with animal species with a shorter helpless period. Consistent with this, the rapidly growing field of infant neuroimaging has found that brain connectivity and functional activation at birth share many similarities with the mature brain. Inspired by machine learning, where deep neural networks also benefit from a 'helpless period' of pre-training, we propose that human infants are learning a foundation model: a set of fundamental representations that underpin later cognition with high performance and rapid generalisation.
4. Lu Z, Wang Y, Golomb JD. Achieving more human brain-like vision via human EEG representational alignment. arXiv 2024:arXiv:2401.17231v2. PMID: 38351926; PMCID: PMC10862929.
Abstract
Despite advancements in artificial intelligence, object recognition models still lag behind in emulating visual information processing in human brains. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in understanding human visual perception. Addressing this gap, we present, for the first time, 'Re(presentational)Al(ignment)net', a vision model aligned with human brain activity based on non-invasive EEG, demonstrating significantly higher similarity to human brain representations. Our image-to-brain multi-layer encoding framework advances human neural alignment by optimizing multiple model layers, enabling the model to efficiently learn and mimic the human brain's visual representational patterns across object categories and modalities. Our findings suggest that ReAlnet represents a breakthrough in bridging the gap between artificial and human vision, paving the way for more brain-like artificial intelligence systems.
Affiliation(s)
- Zitong Lu
- Department of Psychology, The Ohio State University
- Yile Wang
- Department of Neuroscience, The University of Texas at Dallas
5. Xie Y, Mack ML. Reconciling category exceptions through representational shifts. Psychon Bull Rev 2024. PMID: 38639836; DOI: 10.3758/s13423-024-02501-8.
Abstract
Real-world categories often contain exceptions that disobey the perceptual regularities followed by other members. Prominent psychological and neurobiological theories indicate that exception learning relies on the flexible modulation of object representations, but the specific representational shifts key to learning remain poorly understood. Here, we leveraged behavioral and computational approaches to elucidate the representational dynamics during the acquisition of exceptions that violate established regularity knowledge. In our study, participants (n = 42) learned novel categories in which regular and exceptional items were introduced successively; we then fitted a computational model to individuals' categorization performance to infer latent stimulus representations before and after exception learning. We found that in the representational space, exception learning not only drove confusable exceptions to be differentiated from regular items, but also led exceptions within the same category to be integrated based on shared characteristics. These shifts resulted in distinct representational clusters of regular items and exceptions that constituted hierarchically structured category representations, and the distinct clustering of exceptions from regular items was associated with a greater ability to generalize and reconcile knowledge of regularities and exceptions. Moreover, by having a second group of participants (n = 42) judge the similarity of the stimuli before and after exception learning, we revealed misalignment between representational similarity and behavioral similarity judgments, which further highlights the hierarchical layouts of categories with regularities and exceptions. Altogether, our findings elucidate the representational dynamics giving rise to generalizable category structures that reconcile perceptually inconsistent category members, thereby advancing the understanding of knowledge formation.
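Models fitted to categorization behavior in this literature are typically exemplar models. The sketch below shows how a representational shift changes categorization under a Generalized Context Model; the coordinates, specificity parameter, and shift are hypothetical, and the paper's actual model and fitting procedure differ:

```python
import numpy as np

def gcm_prob(probe, exemplars, labels, target, c=2.0):
    """Generalized Context Model: probability of assigning `probe` to
    category `target`, from summed exponentially decaying similarity to
    all stored exemplars (the probe's own memory trace included)."""
    dists = np.linalg.norm(exemplars - probe, axis=1)
    sims = np.exp(-c * dists)
    return sims[labels == target].sum() / sims.sum()

# Two categories of regular items, plus one exception that is perceptually
# close to category B but belongs to category A (hypothetical coordinates).
exemplars = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],   # regular A
    [2.0, 2.0], [2.2, 2.1], [1.9, 2.2],   # regular B
    [1.8, 1.9],                           # exception, labelled A
])
labels = np.array(["A", "A", "A", "B", "B", "B", "A"])

p_before = gcm_prob(exemplars[-1], exemplars, labels, "A")

# A representational shift that differentiates the exception from the
# regular items makes correct categorization much more likely.
shifted = exemplars.copy()
shifted[-1] = [1.0, 3.5]
p_after = gcm_prob(shifted[-1], shifted, labels, "A")
```

Differentiation of the exception raises the model's probability of a correct response, mirroring the paper's inference that learned shifts pull exceptions away from confusable regular items.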
Affiliation(s)
- Yongzhen Xie
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
- Michael L Mack
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
6. Pauley C, Karlsson A, Sander MC. Early visual cortices reveal interrelated item and category representations in aging. eNeuro 2024; 11:ENEURO.0337-23.2023. PMID: 38413198; PMCID: PMC10960632; DOI: 10.1523/eneuro.0337-23.2023.
Abstract
Neural dedifferentiation, the finding that neural representations tend to be less distinct in older adults compared with younger adults, has been associated with age-related declines in memory performance. Most studies assessing the relation between memory and neural dedifferentiation have evaluated how age impacts the distinctiveness of neural representations for different visual categories (e.g., scenes and objects). However, how age impacts the quality of neural representations at the level of individual items is still an open question. Here, we present data from an age-comparative fMRI study that aimed to understand how the distinctiveness of neural representations for individual stimuli differs between younger and older adults and relates to memory outcomes. Pattern similarity searchlight analyses yielded indicators of neural dedifferentiation at the level of individual items as well as at the category level in posterior occipital cortices. We asked whether age differences in neural distinctiveness at each representational level were associated with inter- and/or intraindividual variability in memory performance. While age-related dedifferentiation at both the item and category level related to between-person differences in memory, neural distinctiveness at the category level also tracked within-person variability in memory performance. Concurrently, neural distinctiveness at the item level was strongly associated with neural distinctiveness at the category level both within and across participants, elucidating a potential representational mechanism linking item- and category-level distinctiveness. In sum, we provide evidence that age-related neural dedifferentiation co-exists across multiple representational levels and is related to memory performance.

Significance Statement: Age-related memory decline has been associated with neural dedifferentiation, the finding that older adults have less distinctive neural representations than younger adults. This has been mostly shown for category information, while evidence for age differences in the specificity of item representations is meager. We used pattern similarity searchlight analyses to find indicators of neural dedifferentiation at both levels of representation (category and item) and linked distinctiveness to memory performance. Both item- and category-level dedifferentiation in the calcarine cortex were related to interindividual differences in memory performance, while category-level distinctiveness further tracked intraindividual variability. Crucially, neural distinctiveness was strongly tied between the item and category levels, indicating that intersecting representational properties of posterior occipital cortices reflect both individual exemplars and categories.
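The item- and category-level distinctiveness measures can be sketched as contrasts of pattern correlations. This is an illustrative simplification of the searchlight logic; the simulated "voxel" patterns and effect sizes are our own, not the study's data:

```python
import numpy as np

def distinctiveness(patterns, item_ids, category_ids):
    """Item- and category-level neural distinctiveness as contrasts of
    Fisher-z pattern correlations: same-item minus same-category pairs,
    and same-category minus different-category pairs."""
    z = np.arctanh(np.clip(np.corrcoef(patterns), -0.999, 0.999))
    same_item, same_cat, diff_cat = [], [], []
    n = len(item_ids)
    for i in range(n):
        for j in range(i + 1, n):
            if item_ids[i] == item_ids[j]:
                same_item.append(z[i, j])
            elif category_ids[i] == category_ids[j]:
                same_cat.append(z[i, j])
            else:
                diff_cat.append(z[i, j])
    return (np.mean(same_item) - np.mean(same_cat),
            np.mean(same_cat) - np.mean(diff_cat))

# Simulated patterns: category signal + item signal + noise,
# two repetitions of each of 8 items (4 per category), 100 "voxels".
rng = np.random.default_rng(4)
cat_sig = rng.normal(size=(2, 100))
item_sig = rng.normal(size=(8, 100))
patterns, item_ids, category_ids = [], [], []
for item in range(8):
    cat = item // 4
    for _ in range(2):  # two repetitions per item
        patterns.append(cat_sig[cat] + item_sig[item] + rng.normal(size=100))
        item_ids.append(item)
        category_ids.append(cat)
item_level, category_level = distinctiveness(np.array(patterns), item_ids, category_ids)
```

In this framing, dedifferentiation corresponds to both contrasts shrinking toward zero, and the two levels can co-vary because they share the same underlying correlation structure.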
Affiliation(s)
- Claire Pauley
- Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin 14195, Germany
- Faculty of Life Sciences, Humboldt-Universität zu Berlin, Berlin 10115, Germany
- Anna Karlsson
- Faculty of Life Sciences, Humboldt-Universität zu Berlin, Berlin 10115, Germany
- Myriam C. Sander
- Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin 14195, Germany
7. Matteucci G, Piasini E, Zoccolan D. Unsupervised learning of mid-level visual representations. Curr Opin Neurobiol 2024; 84:102834. PMID: 38154417; DOI: 10.1016/j.conb.2023.102834.
Abstract
Recently, a confluence between trends in neuroscience and machine learning has brought a renewed focus on unsupervised learning, where sensory processing systems learn to exploit the statistical structure of their inputs in the absence of explicit training targets or rewards. Sophisticated experimental approaches have enabled the investigation of the influence of sensory experience on neural self-organization and its synaptic bases. Meanwhile, novel algorithms for unsupervised and self-supervised learning have become increasingly popular both as inspiration for theories of the brain, particularly for the function of intermediate visual cortical areas, and as building blocks of real-world learning machines. Here we review some of these recent developments, placing them in historical context and highlighting some research lines that promise exciting breakthroughs in the near future.
Affiliation(s)
- Giulio Matteucci
- Department of Basic Neurosciences, University of Geneva, Geneva, 1206, Switzerland
- Eugenio Piasini
- International School for Advanced Studies (SISSA), Trieste, 34136, Italy
- Davide Zoccolan
- International School for Advanced Studies (SISSA), Trieste, 34136, Italy
8. Yildirim I, Siegel MH, Soltani AA, Ray Chaudhuri S, Tenenbaum JB. Perception of 3D shape integrates intuitive physics and analysis-by-synthesis. Nat Hum Behav 2024; 8:320-335. PMID: 37996497; DOI: 10.1038/s41562-023-01759-7.
Abstract
Many surface cues support three-dimensional shape perception, but humans can sometimes still see shape when these features are missing, such as when an object is covered with a draped cloth. Here we propose a framework for three-dimensional shape perception that explains perception in both typical and atypical cases as analysis-by-synthesis, or inference in a generative model of image formation. The model integrates intuitive physics to explain how shape can be inferred from the deformations it causes to other objects, as in cloth draping. Behavioural and computational studies comparing this account with several alternatives show that it best matches human observers (total n = 174) in both accuracy and response times, and is the only model that correlates significantly with human performance on difficult discriminations. We suggest that bottom-up deep neural network models are not fully adequate accounts of human shape perception, and point to how machine vision systems might achieve more human-like robustness.
Affiliation(s)
- Ilker Yildirim
- Department of Psychology, Yale University, New Haven, CT, USA
- Department of Statistics & Data Science, Yale University, New Haven, CT, USA
- Wu-Tsai Institute, Yale University, New Haven, CT, USA
- Max H Siegel
- Department of Brain & Cognitive Sciences, MIT, Cambridge, MA, USA
- The Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
- Amir A Soltani
- Department of Brain & Cognitive Sciences, MIT, Cambridge, MA, USA
- The Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
- Joshua B Tenenbaum
- Department of Brain & Cognitive Sciences, MIT, Cambridge, MA, USA
- The Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA
9. Abassi E, Papeo L. Category-Selective Representation of Relationships in the Visual Cortex. J Neurosci 2024; 44:e0250232023. PMID: 38124013; PMCID: PMC10860595; DOI: 10.1523/jneurosci.0250-23.2023.
Abstract
Understanding social interaction requires processing social agents and their relationships. The latest results show that much of this process is visually solved: visual areas can represent multiple people encoding emergent information about their interaction that is not explained by the response to the individuals alone. A neural signature of this process is an increased response in visual areas, to face-to-face (seemingly interacting) people, relative to people presented as unrelated (back-to-back). This effect highlighted a network of visual areas for representing relational information. How is this network organized? Using functional MRI, we measured the brain activity of healthy female and male humans (N = 42), in response to images of two faces or two (head-blurred) bodies, facing toward or away from each other. Taking the facing > non-facing effect as a signature of relation perception, we found that relations between faces and between bodies were coded in distinct areas, mirroring the categorical representation of faces and bodies in the visual cortex. Additional analyses suggest the existence of a third network encoding relations between (nonsocial) objects. Finally, a separate occipitotemporal network showed the generalization of relational information across body, face, and nonsocial object dyads (multivariate pattern classification analysis), revealing shared properties of relations across categories. In sum, beyond single entities, the visual cortex encodes the relations that bind multiple entities into relationships; it does so in a category-selective fashion, thus respecting a general organizing principle of representation in high-level vision. Visual areas encoding visual relational information can reveal the processing of emergent properties of social (and nonsocial) interaction, which trigger inferential processes.
Affiliation(s)
- Etienne Abassi
- Institut des Sciences Cognitives-Marc Jeannerod, UMR5229, Centre National de la Recherche Scientifique (CNRS), Université Claude Bernard Lyon 1, Bron 69675, France
- Liuba Papeo
- Institut des Sciences Cognitives-Marc Jeannerod, UMR5229, Centre National de la Recherche Scientifique (CNRS), Université Claude Bernard Lyon 1, Bron 69675, France
10. Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? arXiv 2024:arXiv:2401.06005v1. PMID: 38259351; PMCID: PMC10802669.
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence, in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted, in roughly equal numbers, in each of the two conceptions, who are motivated to overcome what might be a false dichotomy between them and to engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
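A toy example helps show why the dichotomy can be false: for a simple generative model, the optimal discriminative (feedforward) mapping computes exactly the posterior obtained by inverting the generative model. The one-dimensional example below is hypothetical and not from the article:

```python
import numpy as np

MU = {0: -1.0, 1: 1.0}  # class-conditional means: unit-variance Gaussians, equal priors

def generative_posterior(x):
    """Helmholtzian route: evaluate the sensory evidence under a
    generative model of each cause and invert it with Bayes' rule."""
    lik0 = np.exp(-0.5 * (x - MU[0]) ** 2)
    lik1 = np.exp(-0.5 * (x - MU[1]) ** 2)
    return lik1 / (lik0 + lik1)

def discriminative_posterior(x):
    """Feedforward route: map the data directly to the latent variable.
    For this generative model the optimal mapping is one logistic unit."""
    w = MU[1] - MU[0]
    b = 0.5 * (MU[0] ** 2 - MU[1] ** 2)
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

xs = np.linspace(-3.0, 3.0, 61)
gap = np.max(np.abs(generative_posterior(xs) - discriminative_posterior(xs)))  # ~0
```

The two routes agree exactly here; the interesting empirical question the article raises is how the brain behaves in the richer regimes where a feedforward approximation and full generative inversion diverge.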
Affiliation(s)
- Benjamin Peters
- Zuckerman Mind Brain Behavior Institute, Columbia University
- School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo
- Department of Brain and Cognitive Sciences, MIT
- McGovern Institute for Brain Research, MIT
- NSF Center for Brains, Minds and Machines, MIT
- Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner
- Brain and Cognitive Sciences, University of Rochester
- Center for Visual Science, University of Rochester
- Leyla Isik
- Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum
- Department of Brain and Cognitive Sciences, MIT
- NSF Center for Brains, Minds and Machines, MIT
- Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle
- Department of Psychology, Harvard University
- Center for Brain Science, Harvard University
- Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares
- Zuckerman Mind Brain Behavior Institute, Columbia University
- Data Science Institute, Columbia University
- Doris Tsao
- Dept of Molecular & Cell Biology, University of California Berkeley
- Howard Hughes Medical Institute
- Ilker Yildirim
- Department of Psychology, Yale University
- Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University
- Department of Psychology, Columbia University
- Department of Neuroscience, Columbia University
- Department of Electrical Engineering, Columbia University
11. Elmoznino E, Bonner MF. High-performing neural network models of visual cortex benefit from high latent dimensionality. PLoS Comput Biol 2024; 20:e1011792. PMID: 38198504; PMCID: PMC10805290; DOI: 10.1371/journal.pcbi.1011792.
Abstract
Geometric descriptions of deep neural networks (DNNs) have the potential to uncover core representational principles of computational models in neuroscience. Here we examined the geometry of DNN models of visual cortex by quantifying the latent dimensionality of their natural image representations. A popular view holds that optimal DNNs compress their representations onto low-dimensional subspaces to achieve invariance and robustness, which suggests that better models of visual cortex should have lower dimensional geometries. Surprisingly, we found a strong trend in the opposite direction: neural networks with high-dimensional image subspaces tended to have better generalization performance when predicting cortical responses to held-out stimuli in both monkey electrophysiology and human fMRI data. Moreover, we found that high dimensionality was associated with better performance when learning new categories of stimuli, suggesting that higher dimensional representations are better suited to generalize beyond their training domains. These findings suggest a general principle whereby high-dimensional geometry confers computational benefits to DNN models of visual cortex.
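A common estimator of latent dimensionality in this literature is the participation ratio of the covariance eigenspectrum. The sketch below illustrates the general technique on simulated data; the paper's exact estimator and preprocessing may differ:

```python
import numpy as np

def participation_ratio(responses):
    """Effective latent dimensionality of a response matrix (stimuli x units):
    PR = (sum_i lam_i)^2 / (sum_i lam_i^2) over covariance eigenvalues.
    Equals the ambient dimension for isotropic data, and is small when a
    few eigenvalues dominate."""
    centered = responses - responses.mean(axis=0)
    lam = np.clip(np.linalg.eigvalsh(np.cov(centered, rowvar=False)), 0.0, None)
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(2)
low_d = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 100))  # rank-3 geometry
high_d = rng.normal(size=(500, 100))                           # near-isotropic geometry
```

Under this measure, `low_d` has an effective dimensionality of at most 3 despite living in 100 units, while `high_d` is close to 100, which is the kind of contrast the paper relates to encoding performance.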
Affiliation(s)
- Eric Elmoznino
- Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Michael F. Bonner
- Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
12. Shi Y, Bi D, Hesse JK, Lanfranchi FF, Chen S, Tsao DY. Rapid, concerted switching of the neural code in inferotemporal cortex. bioRxiv 2023:2023.12.06.570341. PMID: 38106108; PMCID: PMC10723419; DOI: 10.1101/2023.12.06.570341.
Abstract
A fundamental paradigm in neuroscience is the concept of neural coding through tuning functions. According to this idea, neurons encode stimuli through fixed mappings of stimulus features to firing rates. Here, we report that the tuning of visual neurons can rapidly and coherently change across a population to attend to a whole and its parts. We set out to investigate a longstanding debate concerning whether inferotemporal (IT) cortex uses a specialized code for representing specific types of objects or whether it uses a general code that applies to any object. We found that face cells in macaque IT cortex initially adopted a general code optimized for face detection. But following a rapid, concerted population event lasting < 20 ms, the neural code transformed into a face-specific one with two striking properties: (i) response gradients to principal detection-related dimensions reversed direction, and (ii) new tuning developed to multiple higher feature space dimensions supporting fine face discrimination. These dynamics were face specific and did not occur in response to objects. Overall, these results show that, for faces, face cells shift from detection to discrimination by switching from an object-general code to a face-specific code. More broadly, our results suggest a novel mechanism for neural representation: concerted, stimulus-dependent switching of the neural code used by a cortical area.
13. Hermann K, Nayebi A, van Steenkiste S, Jones M. For human-like models, train on human-like tasks. Behav Brain Sci 2023; 46:e394. PMID: 38054325; DOI: 10.1017/s0140525x23001516.
Abstract
Bowers et al. express skepticism about deep neural networks (DNNs) as models of human vision due to DNNs' failures to account for results from psychological research. We argue that to fairly assess DNNs, we must first train them on more human-like tasks, which we hypothesize will induce more human-like behaviors and representations.
Affiliation(s)
- Aran Nayebi
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
- Matt Jones
- Google Research, Mountain View, CA, USA
- Department of Psychology and Neuroscience, University of Colorado, Boulder, CO, USA
14. Feather J, Leclerc G, Mądry A, McDermott JH. Model metamers reveal divergent invariances between biological and artificial neural networks. Nat Neurosci 2023; 26:2017-2034. PMID: 37845543; PMCID: PMC10620097; DOI: 10.1038/s41593-023-01442-0.
Abstract
Deep neural network models of sensory systems are often proposed to learn representational transformations with invariances like those in the brain. To reveal these invariances, we generated 'model metamers', stimuli whose activations within a model stage are matched to those of a natural stimulus. Metamers for state-of-the-art supervised and unsupervised neural network models of vision and audition were often completely unrecognizable to humans when generated from late model stages, suggesting differences between model and human invariances. Targeted model changes improved human recognizability of model metamers but did not eliminate the overall human-model discrepancy. The human recognizability of a model's metamers was well predicted by their recognizability by other models, suggesting that models contain idiosyncratic invariances in addition to those required by the task. Metamer recognizability dissociated from both traditional brain-based benchmarks and adversarial vulnerability, revealing a distinct failure mode of existing sensory models and providing a complementary benchmark for model assessment.
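Metamer generation can be sketched in miniature: gradient descent on a fresh input until its activations at a chosen model stage match those of a natural stimulus. The toy below uses a single compressive linear-plus-leaky-ReLU stage of our own devising; the study optimizes images and sounds through deep vision and audition networks:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 32)) / np.sqrt(32)  # compressive stage: metamers must exist

def stage(x):
    """One toy model stage: linear filter bank + leaky ReLU (the leak
    keeps gradients alive for units that start out inactive)."""
    z = W @ x
    return np.where(z > 0, z, 0.01 * z)

def make_metamer(target_act, steps=3000, lr=0.1):
    """Gradient descent on a random initial input until its stage
    activations match `target_act` -- the core of metamer generation."""
    x = rng.normal(size=32)
    for _ in range(steps):
        z = W @ x
        act = np.where(z > 0, z, 0.01 * z)
        grad_mask = np.where(z > 0, 1.0, 0.01)
        x -= lr * (W.T @ ((act - target_act) * grad_mask))
    return x

natural = rng.normal(size=32)
target = stage(natural)
metamer = make_metamer(target)
act_gap = np.linalg.norm(stage(metamer) - target)  # activations match closely...
input_gap = np.linalg.norm(metamer - natural)      # ...but the inputs differ
```

Because the stage discards information, many inputs share the same activations; the paper's question is whether the inputs a model treats as equivalent are also equivalent to humans.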
Collapse
Affiliation(s)
- Jenelle Feather
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Computational Neuroscience, Flatiron Institute, Cambridge, MA, USA.
| | - Guillaume Leclerc
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Aleksander Mądry
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
15
|
Jiahui G, Feilong M, Visconti di Oleggio Castello M, Nastase SA, Haxby JV, Gobbini MI. Modeling naturalistic face processing in humans with deep convolutional neural networks. Proc Natl Acad Sci U S A 2023; 120:e2304085120. [PMID: 37847731 PMCID: PMC10614847 DOI: 10.1073/pnas.2304085120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2023] [Accepted: 09/11/2023] [Indexed: 10/19/2023] Open
Abstract
Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The ways in which the internal face representations in DCNNs relate to human cognitive representations and brain activity are not well understood. Nearly all previous studies focused on static face image processing with rapid display times and ignored the processing of naturalistic, dynamic information. To address this gap, we developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces). We used this naturalistic dataset to compare representational geometries estimated from DCNNs, behavioral responses, and brain responses. We found that DCNN representational geometries were consistent across architectures, cognitive representational geometries were consistent across raters in a behavioral arrangement task, and neural representational geometries in face areas were consistent across brains. Representational geometries in late, fully connected DCNN layers, which are optimized for individuation, were much more weakly correlated with cognitive and neural geometries than were geometries in late-intermediate layers. The late-intermediate face-DCNN layers successfully matched cognitive representational geometries, as measured with a behavioral arrangement task that primarily reflected categorical attributes, and correlated with neural representational geometries in known face-selective topographies. Our study suggests that current DCNNs successfully capture neural cognitive processes for categorical attributes of faces but less accurately capture individuation and dynamic features.
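The representational-geometry comparisons described above follow the standard representational similarity analysis recipe: build a representational dissimilarity matrix (RDM) for each system, then correlate the RDMs. A minimal numpy sketch, with simulated data standing in for DCNN layers and brain responses (all names and sizes are illustrative):

```python
import numpy as np

def rdm(acts):
    # condition x feature matrix -> vector of pairwise dissimilarities (1 - Pearson r)
    z = (acts - acts.mean(1, keepdims=True)) / acts.std(1, keepdims=True)
    r = (z @ z.T) / acts.shape[1]
    iu = np.triu_indices(len(acts), k=1)
    return 1.0 - r[iu]

def spearman(a, b):
    # rank both vectors, then Pearson-correlate the ranks
    ra, rb = a.argsort().argsort(), b.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(0)
latent = rng.standard_normal((12, 5))                  # shared stimulus structure
layer = latent @ rng.standard_normal((5, 50))          # simulated DNN-layer features
brain = latent @ rng.standard_normal((5, 30)) + 0.1 * rng.standard_normal((12, 30))

score = spearman(rdm(layer), rdm(brain))  # second-order (geometry-level) agreement
```

Comparing at the level of RDMs sidesteps the fact that model features and voxels live in different spaces: only the pairwise geometry over conditions is compared.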
Collapse
Affiliation(s)
- Guo Jiahui
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | - Ma Feilong
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | | | - Samuel A. Nastase
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
| | - James V. Haxby
- Center for Cognitive Neuroscience, Dartmouth College, Hanover, NH 03755
| | - M. Ida Gobbini
- Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138, Italy
- Istituti di Ricovero e Cura a Carattere Scientifico, Istituto delle Scienze Neurologiche di Bologna, Bologna 40139, Italy
| |
Collapse
|
16
|
Pham TQ, Matsui T, Chikazoe J. Evaluation of the Hierarchical Correspondence between the Human Brain and Artificial Neural Networks: A Review. BIOLOGY 2023; 12:1330. [PMID: 37887040 PMCID: PMC10604784 DOI: 10.3390/biology12101330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 09/22/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023]
Abstract
Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most of modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy. This review provides an overview of the currently available approaches and their limitations for evaluating brain-ANN correspondence.
Collapse
Affiliation(s)
| | - Teppei Matsui
- Graduate School of Brain Science, Doshisha University, Kyoto 610-0321, Japan
| | | |
Collapse
|
17
|
Lin C, Bulls LS, Tepfer LJ, Vyas AD, Thornton MA. Advancing Naturalistic Affective Science with Deep Learning. AFFECTIVE SCIENCE 2023; 4:550-562. [PMID: 37744976 PMCID: PMC10514024 DOI: 10.1007/s42761-023-00215-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/11/2023] [Accepted: 08/03/2023] [Indexed: 09/26/2023]
Abstract
People express their own emotions and perceive others' emotions via a variety of channels, including facial movements, body gestures, vocal prosody, and language. Studying these channels of affective behavior offers insight into both the experience and perception of emotion. Prior research has predominantly focused on studying individual channels of affective behavior in isolation using tightly controlled, non-naturalistic experiments. This approach limits our understanding of emotion in more naturalistic contexts where different channels of information tend to interact. Traditional methods struggle to address this limitation: manually annotating behavior is time-consuming, making it infeasible to do at large scale; manually selecting and manipulating stimuli based on hypotheses may neglect unanticipated features, potentially generating biased conclusions; and common linear modeling approaches cannot fully capture the complex, nonlinear, and interactive nature of real-life affective processes. In this methodology review, we describe how deep learning can be applied to address these challenges to advance a more naturalistic affective science. First, we describe current practices in affective research and explain why existing methods face challenges in revealing a more naturalistic understanding of emotion. Second, we introduce deep learning approaches and explain how they can be applied to tackle three main challenges: quantifying naturalistic behaviors, selecting and manipulating naturalistic stimuli, and modeling naturalistic affective processes. Finally, we describe the limitations of these deep learning methods, and how these limitations might be avoided or mitigated. By detailing the promise and the peril of deep learning, this review aims to pave the way for a more naturalistic affective science.
Collapse
Affiliation(s)
- Chujun Lin
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Landry S. Bulls
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Lindsey J. Tepfer
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Amisha D. Vyas
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| | - Mark A. Thornton
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
| |
Collapse
|
18
|
Vinken K, Prince JS, Konkle T, Livingstone MS. The neural code for "face cells" is not face-specific. SCIENCE ADVANCES 2023; 9:eadg1736. [PMID: 37647400 PMCID: PMC10468123 DOI: 10.1126/sciadv.adg1736] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Accepted: 07/27/2023] [Indexed: 09/01/2023]
Abstract
Face cells are neurons that respond more to faces than to non-face objects. They are found in clusters in the inferotemporal cortex, thought to process faces specifically, and, hence, studied using faces almost exclusively. Analyzing neural responses in and around macaque face patches to hundreds of objects, we found graded response profiles for non-face objects that predicted the degree of face selectivity and provided information on face-cell tuning beyond that from actual faces. This relationship between non-face and face responses was not predicted by color and simple shape properties but by information encoded in deep neural networks trained on general objects rather than face classification. These findings contradict the long-standing assumption that face versus non-face selectivity emerges from face-specific features and challenge the practice of focusing on only the most effective stimulus. They provide evidence instead that category-selective neurons are best understood by their tuning directions in a domain-general object space.
Collapse
Affiliation(s)
- Kasper Vinken
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
| | - Jacob S. Prince
- Department of Psychology, Harvard University, Cambridge, MA 02478, USA
| | - Talia Konkle
- Department of Psychology, Harvard University, Cambridge, MA 02478, USA
| | | |
Collapse
|
19
|
Schütt HH, Kipnis AD, Diedrichsen J, Kriegeskorte N. Statistical inference on representational geometries. eLife 2023; 12:e82566. [PMID: 37610302 PMCID: PMC10446828 DOI: 10.7554/elife.82566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 08/07/2023] [Indexed: 08/24/2023] Open
Abstract
Neuroscience has recently made much progress, expanding the complexity of both neural activity measurements and brain-computational models. However, we lack robust methods for connecting theory and experiment by evaluating our new big models with our new big data. Here, we introduce new inference methods enabling researchers to evaluate and compare models based on the accuracy of their predictions of representational geometries: A good model should accurately predict the distances among the neural population representations (e.g. of a set of stimuli). Our inference methods combine novel 2-factor extensions of crossvalidation (to prevent overfitting to either subjects or conditions from inflating our estimates of model accuracy) and bootstrapping (to enable inferential model comparison with simultaneous generalization to both new subjects and new conditions). We validate the inference methods on data where the ground-truth model is known, by simulating data with deep neural networks and by resampling of calcium-imaging and functional MRI data. Results demonstrate that the methods are valid and conclusions generalize correctly. These data analysis methods are available in an open-source Python toolbox (rsatoolbox.readthedocs.io).
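The 2-factor resampling idea described above can be sketched in a few lines of numpy: on every bootstrap draw, resample subjects and conditions simultaneously, so the resulting interval generalizes over both random factors at once. This is an illustrative simplification of the toolbox's methods, with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(1)
# data[subject, condition]: per-subject estimates of a model's predictive accuracy
data = rng.normal(loc=1.0, scale=0.5, size=(20, 15))

def two_factor_ci(data, n_boot=1000):
    # resample subjects AND conditions on each draw, then recompute the statistic
    n_sub, n_cond = data.shape
    means = np.empty(n_boot)
    for b in range(n_boot):
        subs = rng.integers(0, n_sub, n_sub)     # bootstrap over subjects
        conds = rng.integers(0, n_cond, n_cond)  # bootstrap over conditions
        means[b] = data[np.ix_(subs, conds)].mean()
    return np.percentile(means, [2.5, 97.5])

lo, hi = two_factor_ci(data)  # 95% interval generalizing over both factors
```

Resampling only one factor (e.g., subjects) would understate the uncertainty contributed by the particular conditions sampled; crossing both factors is what licenses claims about new subjects viewing new stimuli.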
Collapse
Affiliation(s)
- Heiko H Schütt
- Zuckerman Institute, Columbia University, New York, United States
| | | | | | | |
Collapse
|
20
|
Doshi FR, Konkle T. Cortical topographic motifs emerge in a self-organized map of object space. SCIENCE ADVANCES 2023; 9:eade8187. [PMID: 37343093 DOI: 10.1126/sciadv.ade8187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/09/2022] [Accepted: 05/17/2023] [Indexed: 06/23/2023]
Abstract
The human ventral visual stream has a highly systematic organization of object information, but the causal pressures driving these topographic motifs are highly debated. Here, we use self-organizing principles to learn a topographic representation of the data manifold of a deep neural network representational space. We find that a smooth mapping of this representational space showed many brain-like motifs, with a large-scale organization by animacy and real-world object size, supported by mid-level feature tuning, with naturally emerging face- and scene-selective regions. While some theories of the object-selective cortex posit that these differently tuned regions of the brain reflect a collection of distinctly specified functional modules, the present work provides computational support for an alternate hypothesis that the tuning and topography of the object-selective cortex reflect a smooth mapping of a unified representational space.
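The self-organizing principle described above can be sketched with a classic Kohonen-style self-organized map in numpy: map units on a 2D grid compete for each feature vector, and the winner's neighborhood is pulled toward it, yielding a smooth spatial layout of the feature space. The two-cluster "feature" data below is a hypothetical stand-in for a DNN representational space, not the paper's actual inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy "DNN features": two clusters standing in for, e.g., animate vs. inanimate
feats = np.vstack([rng.normal(-2, 0.5, (100, 8)), rng.normal(2, 0.5, (100, 8))])

grid = 10
units = rng.standard_normal((grid, grid, 8))  # weight vector per map unit
ii, jj = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")

for t in range(3000):
    x = feats[rng.integers(len(feats))]
    d = ((units - x) ** 2).sum(-1)
    bi, bj = np.unravel_index(d.argmin(), d.shape)  # best-matching unit
    sigma = 3.0 * np.exp(-t / 1500)                 # shrinking neighborhood
    h = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma**2))
    units += 0.1 * h[..., None] * (x - units)       # pull neighborhood toward x

# after training, each cluster occupies a contiguous territory on the grid
```

The shrinking neighborhood is what produces topography: early large-radius updates establish global order, later small-radius updates refine local tuning.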
Collapse
Affiliation(s)
- Fenil R Doshi
- Department of Psychology and Center for Brain Sciences, Harvard University, Cambridge, MA, USA
| | - Talia Konkle
- Department of Psychology and Center for Brain Sciences, Harvard University, Cambridge, MA, USA
| |
Collapse
|
21
|
Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023:10.1038/s41583-023-00705-w. [PMID: 37253949 DOI: 10.1038/s41583-023-00705-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/21/2023] [Indexed: 06/01/2023]
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Collapse
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
| | - Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
| | | | | | - Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Bioengineering, Neuroscience, University of Pennsylvania, Pennsylvania, PA, USA
| | | | | | | | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| |
Collapse
|
22
|
Margalit E, Lee H, Finzi D, DiCarlo JJ, Grill-Spector K, Yamins DLK. A Unifying Principle for the Functional Organization of Visual Cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.18.541361. [PMID: 37292946 PMCID: PMC10245753 DOI: 10.1101/2023.05.18.541361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
A key feature of many cortical systems is functional organization: the arrangement of neurons with specific functional properties in characteristic spatial patterns across the cortical surface. However, the principles underlying the emergence and utility of functional organization are poorly understood. Here we develop the Topographic Deep Artificial Neural Network (TDANN), the first unified model to accurately predict the functional organization of multiple cortical areas in the primate visual system. We analyze the key factors responsible for the TDANN's success and find that it strikes a balance between two specific objectives: achieving a task-general sensory representation that is self-supervised, and maximizing the smoothness of responses across the cortical sheet according to a metric that scales relative to cortical surface area. In turn, the representations learned by the TDANN are lower dimensional and more brain-like than those in models that lack a spatial smoothness constraint. Finally, we provide evidence that the TDANN's functional organization balances performance with inter-area connection length, and use the resulting models for a proof-of-principle optimization of cortical prosthetic design. Our results thus offer a unified principle for understanding functional organization and a novel view of the functional role of the visual system in particular.
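The spatial smoothness objective described above can be illustrated with a toy metric: score how well pairwise response correlations between units fall off with their simulated cortical distance. The falloff profile and all parameters here are assumptions for illustration; the TDANN's actual loss is defined in the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
pos = rng.uniform(0, 10, (n, 2))                  # simulated cortical positions
dist = np.linalg.norm(pos[:, None] - pos[None], axis=-1)
iu = np.triu_indices(n, k=1)

def smoothness_loss(resp):
    # mismatch between pairwise response correlations and an assumed
    # Gaussian falloff with cortical distance (illustrative target profile)
    z = (resp - resp.mean(1, keepdims=True)) / resp.std(1, keepdims=True)
    corr = (z @ z.T) / resp.shape[1]
    return np.abs(corr[iu] - np.exp(-dist[iu] ** 2 / 16.0)).mean()

base = rng.standard_normal((n, 100))              # white-noise unit responses
K = np.exp(-dist**2 / 8.0)                        # spatial mixing kernel
smooth = K @ base                                 # nearby units share tuning

loss_rough = smoothness_loss(base)                # untopographic: high loss
loss_smooth = smoothness_loss(smooth)             # topographic: lower loss
```

A model trained under such a penalty is pushed toward layouts in which functionally similar units sit near one another, which is the organizing pressure the abstract describes.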
Collapse
Affiliation(s)
- Eshed Margalit
- Neurosciences Graduate Program, Stanford University, Stanford, CA 94305
| | - Hyodong Lee
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Dawn Finzi
- Department of Psychology, Stanford University, Stanford, CA 94305
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | - James J DiCarlo
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Kalanit Grill-Spector
- Department of Psychology, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
| | - Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
| |
Collapse
|
23
|
Jozwik KM, Kietzmann TC, Cichy RM, Kriegeskorte N, Mur M. Deep Neural Networks and Visuo-Semantic Models Explain Complementary Components of Human Ventral-Stream Representational Dynamics. J Neurosci 2023; 43:1731-1741. [PMID: 36759190 PMCID: PMC10010451 DOI: 10.1523/jneurosci.1424-22.2022] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Revised: 11/08/2022] [Accepted: 12/20/2022] [Indexed: 02/11/2023] Open
Abstract
Deep neural networks (DNNs) are promising models of the cortical computations supporting human object recognition. However, despite their ability to explain a significant portion of variance in neural data, the agreement between models and brain representational dynamics is far from perfect. We address this issue by asking which representational features are currently unaccounted for in neural time series data, estimated for multiple areas of the ventral stream via source-reconstructed magnetoencephalography data acquired in human participants (nine females, six males) during object viewing. We focus on the ability of visuo-semantic models, consisting of human-generated labels of object features and categories, to explain variance beyond the explanatory power of DNNs alone. We report a gradual reversal in the relative importance of DNN versus visuo-semantic features as ventral-stream object representations unfold over space and time. Although lower-level visual areas are better explained by DNN features starting early in time (at 66 ms after stimulus onset), higher-level cortical dynamics are best accounted for by visuo-semantic features starting later in time (at 146 ms after stimulus onset). Among the visuo-semantic features, object parts and basic categories drive the advantage over DNNs. These results show that a significant component of the variance unexplained by DNNs in higher-level cortical dynamics is structured and can be explained by readily nameable aspects of the objects. We conclude that current DNNs fail to fully capture dynamic representations in higher-level human visual cortex and suggest a path toward more accurate models of ventral-stream computations. SIGNIFICANCE STATEMENT: When we view objects such as faces and cars in our visual environment, their neural representations dynamically unfold over time at a millisecond scale. These dynamics reflect the cortical computations that support fast and robust object recognition. DNNs have emerged as a promising framework for modeling these computations but cannot yet fully account for the neural dynamics. Using magnetoencephalography data acquired in human observers during object viewing, we show that readily nameable aspects of objects, such as 'eye', 'wheel', and 'face', can account for variance in the neural dynamics over and above DNNs. These findings suggest that DNNs and humans may in part rely on different object features for visual recognition and provide guidelines for model improvement.
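The variance-partitioning logic described above (does one feature set explain variance beyond another?) reduces to comparing nested regressions: fit the response with DNN features alone, then with DNN plus visuo-semantic features, and take the gain in R² as the unique semantic contribution. A minimal numpy sketch with simulated data (predictor names and sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
dnn = rng.standard_normal((n, 5))   # hypothetical DNN-feature predictors
sem = rng.standard_normal((n, 3))   # hypothetical visuo-semantic label predictors
# simulated neural response driven by both feature sets plus noise
y = dnn @ np.ones(5) + sem @ np.ones(3) + 0.5 * rng.standard_normal(n)

def r2(X, y):
    # ordinary least squares with an intercept column
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid**2).sum() / ((y - y.mean()) ** 2).sum()

both = r2(np.column_stack([dnn, sem]), y)
unique_sem = both - r2(dnn, y)      # variance explained beyond the DNN features
```

In practice such comparisons are done with cross-validated R² to avoid the inflation that comes from adding predictors, but the nested-model structure is the same.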
Collapse
Affiliation(s)
- Kamila M Jozwik
- Department of Psychology, University of Cambridge, Cambridge CB2 3EB, United Kingdom
| | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, 49069 Osnabrück, Germany
| | - Radoslaw M Cichy
- Department of Education and Psychology, Freie Universität Berlin, 14195 Berlin, Germany
| | - Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York 10027
| | - Marieke Mur
- Department of Psychology, Western University, London, Ontario N6A 3K7, Canada
- Department of Computer Science, Western University, London, Ontario N6A 3K7, Canada
| |
Collapse
|
24
|
Kanwisher N, Khosla M, Dobs K. Using artificial neural networks to ask 'why' questions of minds and brains. Trends Neurosci 2023; 46:240-254. [PMID: 36658072 DOI: 10.1016/j.tins.2022.12.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Revised: 11/29/2022] [Accepted: 12/22/2022] [Indexed: 01/19/2023]
Abstract
Neuroscientists have long characterized the properties and functions of the nervous system, and are increasingly succeeding in answering how brains perform the tasks they do. But the question 'why' brains work the way they do is asked less often. The new ability to optimize artificial neural networks (ANNs) for performance on human-like tasks now enables us to approach these 'why' questions by asking when the properties of networks optimized for a given task mirror the behavioral and neural characteristics of humans performing the same task. Here we highlight the recent success of this strategy in explaining why the visual and auditory systems work the way they do, at both behavioral and neural levels.
Collapse
Affiliation(s)
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Meenakshi Khosla
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Katharina Dobs
- Department of Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University, Giessen, Germany.
| |
Collapse
|
25
|
Han Z, Sereno A. Identifying and Localizing Multiple Objects Using Artificial Ventral and Dorsal Cortical Visual Pathways. Neural Comput 2023; 35:249-275. [PMID: 36543331 DOI: 10.1162/neco_a_01559] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Accepted: 10/09/2022] [Indexed: 12/24/2022]
Abstract
In our previous study (Han & Sereno, 2022a), we found that two artificial cortical visual pathways trained for either identity or space actively retain information about both identity and space independently and differently. We also found that this independently and differently retained information about identity and space in two separate pathways may be necessary to accurately and optimally recognize and localize objects. One limitation of our previous study was that there was only one object in each visual image, whereas in reality, there may be multiple objects in a scene. In this study, we generalize our findings to object recognition and localization tasks in which multiple objects are present in each visual image. We constrain the binding problem by training the identity network pathway to report the identities of objects in a given order according to the relative spatial relationships between the objects, given that most visual cortical areas including high-level ventral stream areas retain spatial information. Under these conditions, we find that the artificial neural networks with two pathways for identity and space have better performance in multiple-objects recognition and localization tasks (higher average testing accuracy, lower testing accuracy variance, less training time) than the artificial neural networks with a single pathway. We also find that the required number of training samples and the required training time increase quickly, and potentially exponentially, when the number of objects in each image increases, and we suggest that binding information from multiple objects simultaneously within any network (cortical area) induces conflict or competition and may be part of the reason why our brain has limited attentional and visual working memory capacities.
Collapse
Affiliation(s)
- Zhixian Han
- Department of Psychological Sciences, Purdue University, West Lafayette, IN 47907, U.S.A.
| | - Anne Sereno
- Department of Psychological Sciences and Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, U.S.A.
| |
Collapse
|
26
|
Bracci S, Op de Beeck HP. Understanding Human Object Vision: A Picture Is Worth a Thousand Representations. Annu Rev Psychol 2023; 74:113-135. [PMID: 36378917 DOI: 10.1146/annurev-psych-032720-041031] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Objects are the core meaningful elements in our visual environment. Classic theories of object vision focus upon object recognition and are elegant and simple. Some of their proposals still stand, yet the simplicity is gone. Recent evolutions in behavioral paradigms, neuroscientific methods, and computational modeling have allowed vision scientists to uncover the complexity of the multidimensional representational space that underlies object vision. We review these findings and propose that the key to understanding this complexity is to relate object vision to the full repertoire of behavioral goals that underlie human behavior, running far beyond object recognition. There might be no such thing as core object recognition, and if it exists, then its importance is more limited than traditionally thought.
Collapse
Affiliation(s)
- Stefania Bracci
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
| | - Hans P Op de Beeck
- Leuven Brain Institute, Research Unit Brain & Cognition, KU Leuven, Leuven, Belgium
| |
Collapse
|
27
|
Dissociation and hierarchy of human visual pathways for simultaneously coding facial identity and expression. Neuroimage 2022; 264:119769. [PMID: 36435341 DOI: 10.1016/j.neuroimage.2022.119769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 11/14/2022] [Accepted: 11/22/2022] [Indexed: 11/25/2022] Open
Abstract
Humans have an extraordinary ability to recognize facial expression and identity from a single face simultaneously and effortlessly; however, the underlying neural computation is not well understood. Here, we optimized a multi-task deep neural network to classify facial expression and identity simultaneously. Under various optimization training strategies, the best-performing model consistently showed 'share-separate' organization. The two separate branches of the best-performing model also exhibited distinct abilities to categorize facial expression and identity, and these abilities increased along the facial expression or identity branches toward high layers. By comparing the representational similarities between the best-performing model and functional magnetic resonance imaging (fMRI) responses in the human visual cortex to the same face stimuli, the face-selective posterior superior temporal sulcus (pSTS) in the dorsal visual cortex was significantly correlated with layers in the expression branch of the model, and the anterior inferotemporal cortex (aIT) and anterior fusiform face area (aFFA) in the ventral visual cortex were significantly correlated with layers in the identity branch of the model. In addition, the aFFA and aIT better matched the high layers of the model, while the posterior FFA (pFFA) and occipital facial area (OFA) better matched the middle and early layers of the model, respectively. Overall, our study provides a task-optimization computational model to better understand the neural mechanism underlying face recognition, which suggests that, like the best-performing model, the human visual system exhibits both dissociated and hierarchical neuroanatomical organization when simultaneously coding facial identity and expression.
Collapse
|
28
|
Harnett NG, Finegold KE, Lebois LAM, van Rooij SJH, Ely TD, Murty VP, Jovanovic T, Bruce SE, House SL, Beaudoin FL, An X, Zeng D, Neylan TC, Clifford GD, Linnstaedt SD, Germine LT, Bollen KA, Rauch SL, Haran JP, Storrow AB, Lewandowski C, Musey PI, Hendry PL, Sheikh S, Jones CW, Punches BE, Kurz MC, Swor RA, Hudak LA, Pascual JL, Seamon MJ, Harris E, Chang AM, Pearson C, Peak DA, Domeier RM, Rathlev NK, O'Neil BJ, Sergot P, Sanchez LD, Miller MW, Pietrzak RH, Joormann J, Barch DM, Pizzagalli DA, Sheridan JF, Harte SE, Elliott JM, Kessler RC, Koenen KC, McLean SA, Nickerson LD, Ressler KJ, Stevens JS. Structural covariance of the ventral visual stream predicts posttraumatic intrusion and nightmare symptoms: a multivariate data fusion analysis. Transl Psychiatry 2022; 12:321. [PMID: 35941117 PMCID: PMC9360028 DOI: 10.1038/s41398-022-02085-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 07/14/2022] [Accepted: 07/20/2022] [Indexed: 01/16/2023] Open
Abstract
Visual components of trauma memories are often vividly re-experienced by survivors, with deleterious consequences for normal function. Neuroimaging research on trauma has primarily focused on threat-processing circuitry as core to trauma-related dysfunction. Conversely, limited attention has been given to visual circuitry, which may be particularly relevant to posttraumatic stress disorder (PTSD). Prior work suggests that the ventral visual stream is directly related to the cognitive and affective disturbances observed in PTSD and may be predictive of later symptom expression. The present study used multimodal magnetic resonance imaging data (n = 278) collected 2 weeks after trauma exposure from the AURORA study, a longitudinal, multisite investigation of adverse posttraumatic neuropsychiatric sequelae. Indices of gray and white matter were combined using data fusion to identify a structural covariance network (SCN) of the ventral visual stream 2 weeks after trauma. Participants' loadings on the SCN were positively associated with both intrusion symptoms and intensity of nightmares. Further, SCN loadings moderated connectivity between a previously observed amygdala-hippocampal functional covariance network and the inferior temporal gyrus. Follow-up MRI data at 6 months showed an inverse relationship between SCN loadings and negative alterations in cognition and mood. Further, individuals who showed decreased strength of the SCN between 2 weeks and 6 months had generally higher PTSD symptom severity over time. The present findings highlight a role for structural integrity of the ventral visual stream in the development of PTSD. The ventral visual stream may be particularly important for the consolidation or retrieval of trauma memories and may contribute to efficient reactivation of visual components of the trauma memory, thereby exacerbating PTSD symptoms. Potentially, chronic engagement of the network may reduce structural integrity, which becomes a risk factor for lasting PTSD symptoms.
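The core quantity in this abstract is a per-subject "loading" on a structural covariance network, which is then correlated with symptom severity. The sketch below illustrates the simplest version of that idea, using plain PCA on a synthetic subject-by-region matrix; the actual study fused gray- and white-matter indices with a dedicated multivariate data-fusion method, and all numbers here (subject count aside) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 278 subjects x 40 regional structural measures
# (e.g., gray-matter indices across ventral-stream regions), plus an
# intrusion-symptom score per subject. A shared latent factor drives
# both the structural measures and (weakly) the symptoms.
n_subj, n_regions = 278, 40
latent = rng.standard_normal(n_subj)
structure = np.outer(latent, rng.standard_normal(n_regions))
structure += 0.5 * rng.standard_normal((n_subj, n_regions))
symptoms = 0.4 * latent + rng.standard_normal(n_subj)

# Treat the leading principal component of the centered subject-by-region
# matrix as the covariance network: each subject gets one network loading.
X = structure - structure.mean(axis=0)
_, _, vt = np.linalg.svd(X, full_matrices=False)
loadings = X @ vt[0]  # projection onto the first component

# Brain-behavior association: loadings vs. symptom severity.
# (The sign of a principal component is arbitrary, so only the
# magnitude of r is meaningful in this toy setup.)
r = np.corrcoef(loadings, symptoms)[0, 1]
print(f"loading-symptom correlation |r| = {abs(r):.2f}")
```

The study's moderation and longitudinal analyses then ask how these loadings interact with functional connectivity and change from 2 weeks to 6 months, which goes beyond this single cross-sectional correlation.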
Affiliation(s)
- Nathaniel G Harnett
- Division of Depression and Anxiety, McLean Hospital, Belmont, MA, USA.
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA.
| | | | - Lauren A M Lebois
- Division of Depression and Anxiety, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
| | - Sanne J H van Rooij
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
| | - Timothy D Ely
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
| | - Vishnu P Murty
- Department of Psychology, Temple University, Philadelphia, PA, USA
| | - Tanja Jovanovic
- Department of Psychiatry and Behavioral Neurosciences, Wayne State University, Detroit, MI, USA
| | - Steven E Bruce
- Department of Psychological Sciences, University of Missouri - St. Louis, St. Louis, MO, USA
| | - Stacey L House
- Department of Emergency Medicine, Washington University School of Medicine, St. Louis, MO, USA
| | - Francesca L Beaudoin
- Department of Emergency Medicine & Department of Health Services, Policy, and Practice, The Alpert Medical School of Brown University, Rhode Island Hospital and The Miriam Hospital, Providence, RI, USA
| | - Xinming An
- Institute for Trauma Recovery, Department of Anesthesiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Donglin Zeng
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Thomas C Neylan
- Departments of Psychiatry and Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Gari D Clifford
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
| | - Sarah D Linnstaedt
- Institute for Trauma Recovery, Department of Anesthesiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Laura T Germine
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Institute for Technology in Psychiatry, McLean Hospital, Belmont, MA, USA
- The Many Brains Project, Belmont, MA, USA
| | - Kenneth A Bollen
- Department of Psychology and Neuroscience & Department of Sociology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Scott L Rauch
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Institute for Technology in Psychiatry, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, McLean Hospital, Belmont, MA, USA
| | - John P Haran
- Department of Emergency Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Alan B Storrow
- Department of Emergency Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Paul I Musey
- Department of Emergency Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Phyllis L Hendry
- Department of Emergency Medicine, University of Florida College of Medicine-Jacksonville, Jacksonville, FL, USA
| | - Sophia Sheikh
- Department of Emergency Medicine, University of Florida College of Medicine-Jacksonville, Jacksonville, FL, USA
| | - Christopher W Jones
- Department of Emergency Medicine, Cooper Medical School of Rowan University, Camden, NJ, USA
| | - Brittany E Punches
- Department of Emergency Medicine, Ohio State University College of Medicine, Columbus, OH, USA
- Ohio State University College of Nursing, Columbus, OH, USA
| | - Michael C Kurz
- Department of Emergency Medicine, University of Alabama School of Medicine, Birmingham, AL, USA
- Department of Surgery, Division of Acute Care Surgery, University of Alabama School of Medicine, Birmingham, AL, USA
- Center for Injury Science, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Robert A Swor
- Department of Emergency Medicine, Oakland University William Beaumont School of Medicine, Rochester, MI, USA
| | - Lauren A Hudak
- Department of Emergency Medicine, Emory University School of Medicine, Atlanta, GA, USA
| | - Jose L Pascual
- Department of Surgery, Department of Neurosurgery, University of Pennsylvania, Philadelphia, PA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Mark J Seamon
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Surgery, Division of Traumatology, Surgical Critical Care and Emergency Surgery, University of Pennsylvania, Philadelphia, PA, USA
| | | | - Anna M Chang
- Department of Emergency Medicine, Jefferson University Hospitals, Philadelphia, PA, USA
| | - Claire Pearson
- Department of Emergency Medicine, Wayne State University, Ascension St. John Hospital, Detroit, MI, USA
| | - David A Peak
- Department of Emergency Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Robert M Domeier
- Department of Emergency Medicine, Saint Joseph Mercy Hospital, Ypsilanti, MI, USA
| | - Niels K Rathlev
- Department of Emergency Medicine, University of Massachusetts Medical School-Baystate, Springfield, MA, USA
| | - Brian J O'Neil
- Department of Emergency Medicine, Wayne State University, Detroit Receiving Hospital, Detroit, MI, USA
| | - Paulina Sergot
- Department of Emergency Medicine, McGovern Medical School, University of Texas Health, Houston, TX, USA
| | - Leon D Sanchez
- Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Emergency Medicine, Harvard Medical School, Boston, MA, USA
| | - Mark W Miller
- National Center for PTSD, Behavioral Science Division, VA Boston Healthcare System, Boston, MA, USA
- Department of Psychiatry, Boston University School of Medicine, Boston, MA, USA
| | - Robert H Pietrzak
- National Center for PTSD, Clinical Neurosciences Division, VA Connecticut Healthcare System, West Haven, CT, USA
- Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
| | - Jutta Joormann
- Department of Psychology, Yale University, New Haven, CT, USA
| | - Deanna M Barch
- Department of Psychological & Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA
| | - Diego A Pizzagalli
- Division of Depression and Anxiety, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
| | - John F Sheridan
- Division of Biosciences, Ohio State University College of Dentistry, Columbus, OH, USA
- Institute for Behavioral Medicine Research, OSU Wexner Medical Center, Columbus, OH, USA
| | - Steven E Harte
- Department of Anesthesiology, University of Michigan Medical School, Ann Arbor, MI, USA
- Department of Internal Medicine-Rheumatology, University of Michigan Medical School, Ann Arbor, MI, USA
| | - James M Elliott
- Kolling Institute, University of Sydney, St Leonards, New South Wales, Australia
- Faculty of Medicine and Health, University of Sydney, Northern Sydney Local Health District, New South Wales, Australia
- Physical Therapy & Human Movement Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Ronald C Kessler
- Department of Health Care Policy, Harvard Medical School, Boston, MA, USA
| | - Karestan C Koenen
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Samuel A McLean
- Department of Emergency Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Institute for Trauma Recovery, Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Lisa D Nickerson
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- McLean Imaging Center, McLean Hospital, Belmont, MA, USA
| | - Kerry J Ressler
- Division of Depression and Anxiety, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
| | - Jennifer S Stevens
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
| |
|
29
|
Sedigh-Sarvestani M, Fitzpatrick D. What and Where: Location-Dependent Feature Sensitivity as a Canonical Organizing Principle of the Visual System. Front Neural Circuits 2022; 16:834876. [PMID: 35498372 PMCID: PMC9039279 DOI: 10.3389/fncir.2022.834876] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2021] [Accepted: 03/01/2022] [Indexed: 11/13/2022] Open
Abstract
Traditionally, functional representations in early visual areas are conceived as retinotopic maps preserving egocentric spatial location information while ensuring that other stimulus features are uniformly represented for all locations in space. Recent results challenge this framework of relatively independent encoding of location and features in the early visual system, emphasizing location-dependent feature sensitivities that reflect specialization of cortical circuits for different locations in visual space. Here we review the evidence for such location-specific encoding, including: (1) systematic variation of functional properties within conventional retinotopic maps in the cortex; (2) novel periodic retinotopic transforms that dramatically illustrate the tight linkage of feature sensitivity, spatial location, and cortical circuitry; and (3) retinotopic biases in cortical areas, and groups of areas, that have been defined by their functional specializations. We propose that location-dependent feature sensitivity is a fundamental organizing principle of the visual system that achieves efficient representation of positional regularities in visual experience and reflects the evolutionary selection of sensory and motor circuits to optimally represent behaviorally relevant information. Future studies are necessary to discover the mechanisms underlying joint encoding of location and functional information, how this encoding relates to behavior, how it emerges during development, and how it varies across species.
|