1
Huang S, De Brigard F, Cabeza R, Davis SW. Connectivity analyses for task-based fMRI. Phys Life Rev 2024; 49:139-156. PMID: 38728902; PMCID: PMC11116041; DOI: 10.1016/j.plrev.2024.04.012
Abstract
Functional connectivity is conventionally defined by measuring the similarity between brain signals from two regions. The technique has become widely adopted in the analysis of functional magnetic resonance imaging (fMRI) data, where it has provided cognitive neuroscientists with abundant information on how brain regions interact to support complex cognition. However, in the past decade the notion of "connectivity" has expanded in both the complexity and heterogeneity of its application to cognitive neuroscience, resulting in greater difficulty of interpretation, replication, and cross-study comparisons. In this paper, we begin with the canonical notions of functional connectivity and then introduce recent methodological developments that either estimate some alternative form of connectivity or extend the analytical framework, with the hope of bringing better clarity for cognitive neuroscience researchers.
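The canonical notion referenced above — similarity between two regions' signals — reduces, in its simplest form, to a Pearson correlation between regional time series. A minimal illustrative sketch with made-up data (not code from the paper):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two regional time series:
    the canonical functional-connectivity measure."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))
    return num / den

# A linearly rescaled copy of a signal is perfectly "connected" to it.
signal = [0.1, 0.5, 0.3, 0.9, 0.2, 0.7]
print(round(pearson(signal, [2 * v + 1 for v in signal]), 3))  # 1.0
```

Task-based variants (e.g., psychophysiological interactions) build on this same correlational core by adding task regressors.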
Affiliation(s)
- Shenyang Huang
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States; Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, United States.
- Felipe De Brigard
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States; Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, United States; Department of Philosophy, Duke University, Durham, NC 27708, United States
- Roberto Cabeza
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States; Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, United States
- Simon W Davis
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States; Department of Philosophy, Duke University, Durham, NC 27708, United States; Department of Neurology, Duke University School of Medicine, Durham, NC 27708, United States
2
Morales-Torres R, Wing EA, Deng L, Davis SW, Cabeza R. Visual Recognition Memory of Scenes Is Driven by Categorical, Not Sensory, Visual Representations. J Neurosci 2024; 44:e1479232024. PMID: 38569925; PMCID: PMC11112637; DOI: 10.1523/jneurosci.1479-23.2024
Abstract
When we perceive a scene, our brain processes various types of visual information simultaneously, ranging from sensory features, such as line orientations and colors, to categorical features, such as objects and their arrangements. Whereas the role of sensory and categorical visual representations in predicting subsequent memory has been studied using isolated objects, their impact on memory for complex scenes remains largely unknown. To address this gap, we conducted an fMRI study in which female and male participants encoded pictures of familiar scenes (e.g., an airport picture) and later recalled them, while rating the vividness of their visual recall. Outside the scanner, participants had to distinguish each seen scene from three similar lures (e.g., three airport pictures). We modeled the sensory and categorical visual features of multiple scenes using both early and late layers of a deep convolutional neural network. Then, we applied representational similarity analysis to determine which brain regions represented stimuli in accordance with the sensory and categorical models. We found that categorical, but not sensory, representations predicted subsequent memory. Consistent with this result, only for the categorical model did the average recognition performance for each scene correlate positively with the average visual dissimilarity between that scene and its respective lures. These results strongly suggest that even in memory tests that ostensibly rely solely on visual cues (such as forced-choice visual recognition with similar distractors), memory decisions for scenes may be primarily influenced by categorical rather than sensory representations.
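Representational similarity analysis, used above to relate brain regions to model layers, correlates the pairwise-dissimilarity structure of two systems. A toy sketch with invented patterns and Euclidean distances (the study's actual pipeline operated on fMRI activity patterns and CNN-layer features):

```python
from math import sqrt
from itertools import combinations

def euclid(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rdm_upper(patterns):
    """Upper triangle of a representational dissimilarity matrix."""
    return [euclid(patterns[i], patterns[j])
            for i, j in combinations(range(len(patterns)), 2)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))
    return num / den

# Toy "brain" and "model" patterns for four stimuli: the RSA score is the
# correlation between their pairwise-dissimilarity structures.
brain = [[0, 0], [0, 1], [1, 0], [1, 1]]
model = [[0, 0], [0, 2], [2, 0], [2, 2]]  # same geometry, rescaled
print(round(pearson(rdm_upper(brain), rdm_upper(model)), 3))  # 1.0
```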
Affiliation(s)
- Erik A Wing
- Rotman Research Institute, Baycrest Health Sciences, Toronto, Ontario M6A 2E1, Canada
- Lifu Deng
- Department of Psychology & Neuroscience, Duke University, Durham, North Carolina 27708
- Simon W Davis
- Department of Psychology & Neuroscience, Duke University, Durham, North Carolina 27708
- Department of Neurology, Duke University School of Medicine, Durham, North Carolina 27708
- Roberto Cabeza
- Department of Psychology & Neuroscience, Duke University, Durham, North Carolina 27708
3
Xie Y, Mack ML. Reconciling category exceptions through representational shifts. Psychon Bull Rev 2024. PMID: 38639836; DOI: 10.3758/s13423-024-02501-8
Abstract
Real-world categories often contain exceptions that disobey the perceptual regularities followed by other members. Prominent psychological and neurobiological theories indicate that exception learning relies on the flexible modulation of object representations, but the specific representational shifts key to learning remain poorly understood. Here, we leveraged behavioral and computational approaches to elucidate the representational dynamics during the acquisition of exceptions that violate established regularity knowledge. In our study, participants (n = 42) learned novel categories in which regular and exceptional items were introduced successively; we then fitted a computational model to individuals' categorization performance to infer latent stimulus representations before and after exception learning. We found that in the representational space, exception learning not only drove confusable exceptions to be differentiated from regular items, but also led exceptions within the same category to be integrated based on shared characteristics. These shifts resulted in distinct representational clusters of regular items and exceptions that constituted hierarchically structured category representations, and the distinct clustering of exceptions from regular items was associated with a greater ability to generalize and reconcile knowledge of regularities and exceptions. Moreover, by having a second group of participants (n = 42) judge the stimuli's similarity before and after exception learning, we revealed a misalignment between representational similarity and behavioral similarity judgments, which further highlights the hierarchical layout of categories with regularities and exceptions. Altogether, our findings elucidate the representational dynamics giving rise to generalizable category structures that reconcile perceptually inconsistent category members, thereby advancing the understanding of knowledge formation.
Affiliation(s)
- Yongzhen Xie
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada.
- Michael L Mack
- Department of Psychology, University of Toronto, 100 St. George Street, Toronto, ON, M5S 3G3, Canada
4
Carrington M, Liu AG, Candy C, Martin A, Avery J. Naturalistic food categories are driven by subjective estimates rather than objective measures of food qualities. Food Qual Prefer 2024; 113:105073. PMID: 38222065; PMCID: PMC10783799; DOI: 10.1016/j.foodqual.2023.105073
Abstract
Food-related studies often categorize foods using criteria such as fat and sugar content (e.g., high-fat, high-sugar foods; low-fat, low-sugar foods), and use these categorizations for further analyses. While these criteria are relevant to nutritional health, it is unclear whether they agree with the ways in which we typically group foods. Do these objective categories correspond to our subjective sense? To address this question, we recruited a group of 487 online participants to perform a triplet comparison task involving implicit object similarity judgements on images of 36 foods, which varied in their levels of fat and sugar. We also acquired subjective ratings of other food properties from another set of 369 online participants. Data from the online triplet task was used to generate a similarity matrix of these 36 foods. Principal Components Analysis (PCA) of this matrix identified that the strongest determinant of food similarity (the first PC) was most highly related to participants' judgements of how processed the foods were, while the second component was most related to estimates of sugar and fat content. K-means clustering analysis revealed five emergent food groupings along these PC axes: sweets, fats, starches, fruits, and vegetables. Our results suggest that naturalistic categorizations of food are driven primarily by knowledge of the origin of foods (i.e., grown or manufactured), rather than by their sensory or macronutrient properties. These differences should be considered and explored when developing methods for scientific food studies.
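The triplet comparison task described above yields a similarity matrix by tallying, across trials, how often two items end up grouped together. One common scoring variant — counting the two non-odd items on each "odd-one-out" trial as similar — can be sketched as follows (hypothetical trial format, not the authors' code):

```python
from collections import Counter

def similarity_from_triplets(n_items, triplet_choices):
    """Estimate a pairwise similarity matrix from triplet trials:
    on each trial, the two items NOT chosen as the odd one out
    are counted as a similar pair."""
    counts = Counter()
    for trial_items, odd_one_out in triplet_choices:
        kept = [i for i in trial_items if i != odd_one_out]
        counts[tuple(sorted(kept))] += 1
    sim = [[0] * n_items for _ in range(n_items)]
    for (i, j), c in counts.items():
        sim[i][j] = sim[j][i] = c
    return sim

# Hypothetical trials over 4 foods: items 0 and 1 are repeatedly grouped.
trials = [((0, 1, 2), 2), ((0, 1, 3), 3), ((0, 2, 3), 2)]
sim = similarity_from_triplets(4, trials)
print(sim[0][1], sim[0][3])  # 2 1
```

In the study, a matrix like this over 36 foods was then submitted to PCA and k-means clustering.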
Affiliation(s)
- Madeline Carrington
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, United States 20892
- Alexander G. Liu
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, United States 20892
- Caroline Candy
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, United States 20892
- Alex Martin
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, United States 20892
- Jason Avery
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, United States 20892
5
Waraich SA, Victor JD. The Geometry of Low- and High-Level Perceptual Spaces. J Neurosci 2024; 44:e1460232023. PMID: 38267235; PMCID: PMC10860617; DOI: 10.1523/jneurosci.1460-23.2023
Abstract
Low-level features are typically continuous (e.g., the gamut between two colors), but semantic information is often categorical (there is no corresponding gradient between dog and turtle) and hierarchical (animals live on land, in water, or in the air). To determine the impact of these differences on cognitive representations, we characterized the geometry of perceptual spaces of five domains: a domain dominated by semantic information (animal names presented as words), a domain dominated by low-level features (colored textures), and three intermediate domains (animal images, lightly texturized animal images that were easy to recognize, and heavily texturized animal images that were difficult to recognize). Each domain had 37 stimuli derived from the same animal names. From 13 participants (9F), we gathered similarity judgments in each domain via an efficient psychophysical ranking paradigm. We then built geometric models of each domain for each participant, in which distances between stimuli accounted for participants' similarity judgments and intrinsic uncertainty. Remarkably, the five domains had similar global properties: each required 5-7 dimensions, and a modest amount of spherical curvature provided the best fit. However, the arrangement of the stimuli within these embeddings depended on the level of semantic information: dendrograms derived from semantic domains (word, image, and lightly texturized images) were more "tree-like" than those from feature-dominated domains (heavily texturized images and textures). Thus, the perceptual spaces of domains along this feature-dominated to semantic-dominated gradient shift to a tree-like organization when semantic information dominates, while retaining a similar global geometry.
Affiliation(s)
- Jonathan D Victor
- Division of Systems Neurology and Neuroscience, Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, New York, New York 10065
6
Tuckute G, Feather J, Boebinger D, McDermott JH. Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions. PLoS Biol 2023; 21:e3002366. PMID: 38091351; PMCID: PMC10718467; DOI: 10.1371/journal.pbio.3002366
Abstract
Models that predict brain responses to stimuli provide one measure of understanding of a sensory system and have many potential applications in science and engineering. Deep artificial neural networks have emerged as the leading such predictive models of the visual system but are less explored in audition. Prior work provided examples of audio-trained neural networks that produced good predictions of auditory cortical fMRI responses and exhibited correspondence between model stages and brain regions, but left it unclear whether these results generalize to other neural network models and, thus, how to further improve models in this domain. We evaluated model-brain correspondence for publicly available audio neural network models along with in-house models trained on 4 different tasks. Most tested models outpredicted standard spectrotemporal filter-bank models of auditory cortex and exhibited systematic model-brain correspondence: Middle stages best predicted primary auditory cortex, while deep stages best predicted non-primary cortex. However, some state-of-the-art models produced substantially worse brain predictions. Models trained to recognize speech in background noise produced better brain predictions than models trained to recognize speech in quiet, potentially because hearing in noise imposes constraints on biological auditory representations. The training task influenced the prediction quality for specific cortical tuning properties, with best overall predictions resulting from models trained on multiple tasks. The results generally support the promise of deep neural networks as models of audition, though they also indicate that current models do not explain auditory cortical responses in their entirety.
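The stage-to-region correspondence described here comes down to asking, for each cortical region, which model stage yields the best-matching predicted responses. A toy version of that selection step (invented numbers; the real analyses first fit regularized regressions from stage activations to voxel responses):

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))
    return num / den

def best_stage(stage_predictions, measured):
    """Return the model stage whose predicted responses correlate
    best with the measured responses for a region."""
    return max(stage_predictions,
               key=lambda s: pearson(stage_predictions[s], measured))

# Hypothetical predictions from three model stages for one region.
measured = [1.0, 2.0, 3.0, 4.0]
preds = {"early": [4, 1, 3, 2],
         "middle": [1.1, 2.2, 2.9, 4.3],
         "deep": [2, 1, 4, 3]}
print(best_stage(preds, measured))  # middle
```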
Affiliation(s)
- Greta Tuckute
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Jenelle Feather
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Dana Boebinger
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, Massachusetts, United States of America
- University of Rochester Medical Center, Rochester, New York, United States of America
- Josh H. McDermott
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT, Cambridge, Massachusetts, United States of America
- Center for Brains, Minds, and Machines, MIT, Cambridge, Massachusetts, United States of America
- Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, Massachusetts, United States of America
7
Tarigopula P, Fairhall SL, Bavaresco A, Truong N, Hasson U. Improved prediction of behavioral and neural similarity spaces using pruned DNNs. Neural Netw 2023; 168:89-104. PMID: 37748394; DOI: 10.1016/j.neunet.2023.08.049
Abstract
Deep Neural Networks (DNNs) have become an important tool for modeling brain and behavior. One key area of interest has been to apply these networks to model human similarity judgements. Several previous works have used the embeddings from the penultimate layer of vision DNNs and showed that a reweighting of these features improves the fit between human similarity judgments and DNNs. These studies underline the idea that these embeddings form a good basis set but lack the correct level of salience. Here we re-examined the grounds for this idea and, on the contrary, hypothesized that these embeddings, beyond forming a good basis set, also have the correct level of salience to account for similarity judgments. The high-dimensional embedding simply needs to be pruned to select the features relevant to the domain whose similarity space is being modeled. In Study 1 we supervised DNN pruning based on a subset of human similarity judgments. We found that pruning: i) improved out-of-sample prediction of human similarity judgments from DNN embeddings, ii) produced better alignment with the WordNet hierarchy, and iii) retained much higher classification accuracy than reweighting. Study 2 showed that pruning by neurobiological data is highly effective in improving out-of-sample prediction of brain-derived representational dissimilarity matrices from DNN embeddings, at times fleshing out isomorphisms not otherwise observable. Using pruned DNNs, image-level heatmaps can be produced to identify image sections whose features load on dimensions coded by a brain area. Pruning supervised by human brain/behavior therefore effectively identifies alignable dimensions of knowledge between DNNs and humans and constitutes an effective method for understanding the organization of knowledge in neural networks.
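The core idea — keep only the embedding dimensions whose dissimilarities track human judgments — can be sketched as a simple per-dimension scoring rule (a stand-in for the paper's supervised pruning, with made-up data):

```python
from math import sqrt
from itertools import combinations

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))
    return num / den

def prune_dimensions(emb, human_rdm, k):
    """Keep the k embedding dimensions whose single-dimension
    dissimilarities best correlate with human dissimilarities."""
    pairs = list(combinations(range(len(emb)), 2))
    scored = []
    for d in range(len(emb[0])):
        dim_rdm = [abs(emb[i][d] - emb[j][d]) for i, j in pairs]
        scored.append((pearson(dim_rdm, human_rdm), d))
    scored.sort(reverse=True)
    return sorted(d for _, d in scored[:k])

# Four stimuli, two embedding dimensions; dimension 0 matches the
# (hypothetical) human dissimilarity structure, dimension 1 does not.
emb = [[0, 0], [1, 3], [2, 1], [3, 2]]
human_rdm = [1, 2, 3, 1, 2, 1]  # pair order from itertools.combinations
print(prune_dimensions(emb, human_rdm, 1))  # [0]
```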
Affiliation(s)
- Priya Tarigopula
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Italy.
- Nhut Truong
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Italy.
- Uri Hasson
- Center for Mind/Brain Sciences - CIMeC, University of Trento, Italy.
8
Karapetian A, Boyanova A, Pandaram M, Obermayer K, Kietzmann TC, Cichy RM. Empirically Identifying and Computationally Modeling the Brain-Behavior Relationship for Human Scene Categorization. J Cogn Neurosci 2023; 35:1879-1897. PMID: 37590093; PMCID: PMC10586810; DOI: 10.1162/jocn_a_02043
Abstract
Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modeling. For the empirical part, we collected EEG data and RTs from human participants during a scene categorization task (natural vs. man-made). We then related EEG data to behavior using a multivariate extension of signal detection theory. We observed a correlation between neural data and behavior specifically between ∼100 msec and ∼200 msec after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network (RCNN) as a model of brain and behavior. Unifying our previous observations in an image-computable model, the RCNN accurately predicted the neural representations, the behavioral scene categorization data, and the relationship between them. Our results identify and computationally characterize the neural and behavioral correlates of scene categorization in humans.
Affiliation(s)
- Agnessa Karapetian
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Klaus Obermayer
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Technische Universität Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
- Radoslaw M Cichy
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
9
Magri C, Elmoznino E, Bonner MF. Scene context is predictive of unconstrained object similarity judgments. Cognition 2023; 239:105535. PMID: 37481806; DOI: 10.1016/j.cognition.2023.105535
Abstract
What makes objects alike in the human mind? Computational approaches for characterizing object similarity have largely focused on the visual forms of objects or their linguistic associations. However, intuitive notions of object similarity may depend heavily on contextual reasoning-that is, objects may be grouped together in the mind if they occur in the context of similar scenes or events. Using large-scale analyses of natural scene statistics and human behavior, we found that a computational model of the associations between objects and their scene contexts is strongly predictive of how humans spontaneously group objects by similarity. Specifically, we learned contextual prototypes for a diverse set of object categories by taking the average response of a convolutional neural network (CNN) to the scene contexts in which the objects typically occurred. In behavioral experiments, we found that contextual prototypes were strongly predictive of human similarity judgments for a large set of objects and rivaled the performance of models based on CNN representations of the objects themselves or word embeddings for their names. Together, our findings reveal the remarkable degree to which the natural statistics of context predict commonsense notions of object similarity.
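A contextual prototype, as described here, is just the average of the feature vectors of the scenes in which an object occurs; objects are then compared via the similarity of their prototypes. A toy sketch with invented two-dimensional "scene features" (the study used CNN responses to natural scene photographs):

```python
from math import sqrt

def prototype(context_vectors):
    """Mean feature vector over the scene contexts a category occurs in."""
    n = len(context_vectors)
    return [sum(v[d] for v in context_vectors) / n
            for d in range(len(context_vectors[0]))]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    return num / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Hypothetical context features: forks and plates occur in similar scenes,
# cars occur in different ones.
fork_ctx = [[1.0, 0.0], [0.8, 0.2]]
plate_ctx = [[0.9, 0.1], [1.0, 0.0]]
car_ctx = [[0.0, 1.0], [0.1, 0.9]]
fork, plate, car = map(prototype, (fork_ctx, plate_ctx, car_ctx))
print(cosine(fork, plate) > cosine(fork, car))  # True
```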
Affiliation(s)
- Caterina Magri
- Department of Cognitive Science, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, United States of America
- Eric Elmoznino
- Department of Cognitive Science, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, United States of America
- Michael F Bonner
- Department of Cognitive Science, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, United States of America.
10
Zhang H, Ding X, Liu N, Nolan R, Ungerleider LG, Japee S. Equivalent processing of facial expression and identity by macaque visual system and task-optimized neural network. Neuroimage 2023; 273:120067. PMID: 36997134; PMCID: PMC10165955; DOI: 10.1016/j.neuroimage.2023.120067
Abstract
Both the primate visual system and artificial deep neural network (DNN) models show an extraordinary ability to simultaneously classify facial expression and identity. However, the neural computations underlying the two systems are unclear. Here, we developed a multi-task DNN model that optimally classified both monkey facial expressions and identities. By comparing the fMRI neural representations of the macaque visual cortex with the best-performing DNN model, we found that both systems: (1) share initial stages for processing low-level face features which segregate into separate branches at later stages for processing facial expression and identity respectively, and (2) gain more specificity for the processing of either facial expression or identity as one progresses along each branch towards higher stages. Correspondence analysis between the DNN and monkey visual areas revealed that the amygdala and anterior fundus face patch (AF) matched well with later layers of the DNN's facial expression branch, while the anterior medial face patch (AM) matched well with later layers of the DNN's facial identity branch. Our results highlight the anatomical and functional similarities between macaque visual system and DNN model, suggesting a common mechanism between the two systems.
Affiliation(s)
- Hui Zhang
- School of Engineering Medicine, Beihang University; Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100191, China; Laboratory of Brain and Cognition, NIMH, NIH, Bethesda, Maryland 20892, USA.
- Xuetong Ding
- School of Engineering Medicine, Beihang University; Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Key Laboratory of Big Data-Based Precision Medicine, Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100191, China
- Ning Liu
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088, China; Laboratory of Brain and Cognition, NIMH, NIH, Bethesda, Maryland 20892, USA
- Rachel Nolan
- Laboratory of Brain and Cognition, NIMH, NIH, Bethesda, Maryland 20892, USA
- Shruti Japee
- Laboratory of Brain and Cognition, NIMH, NIH, Bethesda, Maryland 20892, USA
11
Josephs EL, Hebart MN, Konkle T. Dimensions underlying human understanding of the reachable world. Cognition 2023; 234:105368. PMID: 36641868; DOI: 10.1016/j.cognition.2023.105368
Abstract
Near-scale environments, like work desks, restaurant place settings or lab benches, are the interface of our hand-based interactions with the world. How are our conceptual representations of these environments organized? What properties distinguish among reachspaces, and why? We obtained 1.25 million similarity judgments on 990 reachspace images, and generated a 30-dimensional embedding which accurately predicts these judgments. Examination of the embedding dimensions revealed key properties underlying these judgments, such as reachspace layout, affordance, and visual appearance. Clustering performed over the embedding revealed four distinct interpretable classes of reachspaces, distinguishing among spaces related to food, electronics, analog activities, and storage or display. Finally, we found that reachspace similarity ratings were better predicted by the function of the spaces than their locations, suggesting that reachspaces are largely conceptualized in terms of the actions they support. Altogether, these results reveal the behaviorally-relevant principles that structure our internal representations of reach-relevant environments.
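The clustering step reported above (interpretable classes emerging from an embedding) is typically done with something like k-means. A bare-bones sketch on toy 2-D points (illustrative only; the study clustered a 30-dimensional embedding of 990 images):

```python
def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n
                 for d in range(len(points[0])))

def kmeans(points, centers, iters=10):
    """Plain k-means: assign each point to its nearest centre,
    then move each centre to the mean of its assigned points."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: dist2(p, centers[i]))
            groups[nearest].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

# Two well-separated toy clusters in a 2-D "embedding".
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(kmeans(pts, [(0, 0), (10, 10)])))
```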
Affiliation(s)
- Emilie L Josephs
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA; Psychology Department, Harvard University, Cambridge, USA.
- Martin N Hebart
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
- Talia Konkle
- Psychology Department, Harvard University, Cambridge, USA.
12
Hodgetts CJ, Close JOE, Hahn U. Similarity and structured representation in human and nonhuman apes. Cognition 2023; 236:105419. PMID: 37104894; DOI: 10.1016/j.cognition.2023.105419
Abstract
How we judge the similarity between objects in the world is connected ultimately to how we represent those objects. It has been argued extensively that object representations in humans are 'structured' in nature, meaning that both individual features and the relations between them can influence similarity. In contrast, popular models within comparative psychology assume that nonhuman species appreciate only surface-level, featural similarities. By applying psychological models of structural and featural similarity (from conjunctive feature models to Tversky's Contrast Model) to visual similarity judgements from adult humans, chimpanzees, and gorillas, we demonstrate a cross-species sensitivity to complex structural information, particularly for stimuli that combine colour and shape. These results shed new light on the representational complexity of nonhuman apes, and the fundamental limits of featural coding in explaining object representation and similarity, which emerge strikingly across both human and nonhuman species.
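Tversky's Contrast Model, one of the featural models applied above, scores similarity as a weighted count of shared features minus weighted counts of each object's distinctive features. A minimal sketch with invented colour/shape feature sets (the θ, α, β weights here are arbitrary):

```python
def tversky_contrast(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky's Contrast Model: similarity grows with shared features
    and shrinks with each object's distinctive features."""
    a, b = set(a), set(b)
    return (theta * len(a & b)
            - alpha * len(a - b)
            - beta * len(b - a))

# Hypothetical feature sets combining colour and shape.
red_circle = {"red", "circle", "curved"}
red_square = {"red", "square", "angular"}
blue_circle = {"blue", "circle", "curved"}
# Two shared features, one distinctive feature each:
print(tversky_contrast(red_circle, blue_circle))  # 1.0
```

Structured-similarity accounts go further by also scoring the relations between features, which purely featural tallies like this one cannot capture.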
Affiliation(s)
- Carl J Hodgetts
- Department of Psychology, Royal Holloway, University of London, Egham, Surrey TW20 0EX, UK; Cardiff University Brain Research Imaging Centre, School of Psychology, Cardiff University, Maindy Road, Cardiff CF24 4HQ, UK.
- James O E Close
- Department of Developmental and Comparative Psychology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; School of Psychology and Sport Science, Anglia Ruskin University, East Road, Cambridge CB1 1PT, UK
- Ulrike Hahn
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
13
|
Lee J, Jung M, Lustig N, Lee J. Neural representations of the perception of handwritten digits and visual objects from a convolutional neural network compared to humans. Hum Brain Mapp 2023; 44:2018-2038. [PMID: 36637109] [PMCID: PMC9980894] [DOI: 10.1002/hbm.26189]
Abstract
We investigated neural representations for visual perception of 10 handwritten digits and six visual objects from a convolutional neural network (CNN) and humans using functional magnetic resonance imaging (fMRI). Once our CNN model was fine-tuned using a pre-trained VGG16 model to recognize the visual stimuli from the digit and object categories, representational similarity analysis (RSA) was conducted using neural activations from fMRI and feature representations from the CNN model across all 16 classes. The encoded neural representation of the CNN model exhibited the hierarchical topography mapping of the human visual system. The feature representations in the lower convolutional (Conv) layers showed greater similarity with the neural representations in the early visual areas and parietal cortices, including the posterior cingulate cortex. The feature representations in the higher Conv layers were encoded in the higher-order visual areas, including the ventral/medial/dorsal stream and middle temporal complex. The neural representations in the classification layers were observed mainly in the ventral stream visual cortex (including the inferior temporal cortex), superior parietal cortex, and prefrontal cortex. There was a surprising similarity between the neural representations from the CNN model and the neural representations for human visual perception in the context of the perception of digits versus objects, particularly in the primary visual and associated areas. This study also illustrates the uniqueness of human visual perception. Unlike the CNN model, the neural representation of digits and objects for humans is more widely distributed across the whole brain, including the frontal and temporal areas.
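The representational similarity analysis (RSA) used above follows a standard recipe: build a representational dissimilarity matrix (RDM) for each system, then correlate the two RDMs. A self-contained sketch with invented toy activations standing in for CNN layer features and fMRI voxel responses:

```python
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def rdm(patterns):
    """Upper triangle of the representational dissimilarity matrix:
    1 - Pearson r between the response patterns of each stimulus pair."""
    return [1 - pearson(patterns[i], patterns[j])
            for i, j in combinations(range(len(patterns)), 2)]

# Toy responses of 4 "units" (CNN features or voxels) to 3 stimuli.
layer_acts = [[1.0, 0.2, 0.1, 0.9],   # stimulus A
              [0.9, 0.3, 0.2, 1.0],   # stimulus B (similar to A)
              [0.1, 1.0, 0.9, 0.2]]   # stimulus C (dissimilar)
voxel_acts = [[0.8, 0.1, 0.3],
              [0.7, 0.2, 0.4],
              [0.1, 0.9, 0.8]]

# Second-order similarity: correlate the two RDMs.
print(pearson(rdm(layer_acts), rdm(voxel_acts)))
```

In practice the second-order comparison is often done with Spearman rather than Pearson correlation, and the RDMs cover all layer-by-region pairs; the logic is unchanged.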
Affiliation(s)
- Juhyeon Lee
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
- Minyoung Jung
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
- Niv Lustig
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
- Jong‐Hwan Lee
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea
14
Disentangling five dimensions of animacy in human brain and behaviour. Commun Biol 2022; 5:1247. [PMCID: PMC9663603] [DOI: 10.1038/s42003-022-04194-y]
Abstract
Distinguishing animate from inanimate things is of great behavioural importance. Despite distinct brain and behavioural responses to animate and inanimate things, it remains unclear which object properties drive these responses. Here, we investigate the importance of five object dimensions related to animacy ("being alive", "looking like an animal", "having agency", "having mobility", and "being unpredictable") in brain (fMRI, EEG) and behaviour (property and similarity judgements) of 19 participants. We used a stimulus set of 128 images, optimized by a genetic algorithm to disentangle these five dimensions. The five dimensions explained much of the variance in the similarity judgements. Each dimension also explained significant variance in the brain representations (except, surprisingly, "being alive"), though to a lesser extent than in behaviour. Different brain regions sensitive to animacy may represent distinct dimensions, either as accessible perceptual stepping stones toward detecting whether something is alive or because they are of behavioural importance in their own right.
15
Janini D, Hamblin C, Deza A, Konkle T. General object-based features account for letter perception. PLoS Comput Biol 2022; 18:e1010522. [PMID: 36155642] [PMCID: PMC9536565] [DOI: 10.1371/journal.pcbi.1010522]
Abstract
After years of experience, humans become experts at perceiving letters. Is this visual capacity attained by learning specialized letter features, or by reusing general visual features previously learned in service of object categorization? To explore this question, we first measured the perceptual similarity of letters in two behavioral tasks, visual search and letter categorization. Then, we trained deep convolutional neural networks on either 26-way letter categorization or 1000-way object categorization, as a way to operationalize possible specialized letter features and general object-based features, respectively. We found that the general object-based features more robustly correlated with the perceptual similarity of letters. We then operationalized additional forms of experience-dependent letter specialization by altering object-trained networks with varied forms of letter training; however, none of these forms of letter specialization improved the match to human behavior. Thus, our findings reveal that it is not necessary to appeal to specialized letter representations to account for perceptual similarity of letters. Instead, we argue that it is more likely that the perception of letters depends on domain-general visual features.

For over a century, scientists have conducted behavioral experiments to investigate how the visual system recognizes letters, but it has proven difficult to propose a model of the feature space underlying this capacity. Here we leveraged recent advances in machine learning to model a wide variety of features ranging from specialized letter features to general object-based features. Across two large-scale behavioral experiments we find that general object-based features account well for letter perception, and that adding letter specialization did not improve the correspondence to human behavior. It is plausible that the ability to recognize letters largely relies on general visual features unaltered by letter learning.
Collapse
Affiliation(s)
- Daniel Janini
- Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
- Chris Hamblin
- Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
- Arturo Deza
- Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Talia Konkle
- Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
16
Tang K, Chin M, Chun M, Xu Y. The contribution of object identity and configuration to scene representation in convolutional neural networks. PLoS One 2022; 17:e0270667. [PMID: 35763531] [PMCID: PMC9239439] [DOI: 10.1371/journal.pone.0270667]
Abstract
Scene perception involves extracting the identities of the objects comprising a scene in conjunction with their configuration (the spatial layout of the objects in the scene). How object identity and configuration information is weighted during scene processing, however, and how this weighting evolves over the course of processing, is not fully understood. Recent developments in convolutional neural networks (CNNs) have demonstrated their aptitude at scene processing tasks and identified correlations between processing in CNNs and in the human brain. Here we examined four CNN architectures (Alexnet, Resnet18, Resnet50, Densenet161) and their sensitivity to changes in object and configuration information over the course of scene processing. Despite differences among the four CNN architectures, across all CNNs, we observed a common pattern in the CNN's response to object identity and configuration changes. Each CNN demonstrated greater sensitivity to configuration changes in early stages of processing and stronger sensitivity to object identity changes in later stages. This pattern persists regardless of the spatial structure present in the image background, the accuracy of the CNN in classifying the scene, and even the task used to train the CNN. Importantly, CNNs' sensitivity to a configuration change is not the same as their sensitivity to any type of position change, such as that induced by a uniform translation of the objects without a configuration change. These results provide one of the first documented accounts of how object identity and configuration information are weighted in CNNs during scene processing.
Collapse
Affiliation(s)
- Kevin Tang
- Department of Psychology, Yale University, New Haven, CT, United States of America
- Matthew Chin
- Department of Psychology, Yale University, New Haven, CT, United States of America
- Marvin Chun
- Department of Psychology, Yale University, New Haven, CT, United States of America
- Yaoda Xu
- Department of Psychology, Yale University, New Haven, CT, United States of America
17
Standardized database of 400 complex abstract fractals. Behav Res Methods 2021; 54:2302-2317. [PMID: 34918225] [DOI: 10.3758/s13428-021-01726-y]
Abstract
In experimental settings, characteristics of presented stimuli influence cognitive processes. Knowledge about stimulus features is important to manipulate or control the influence of stimuli. To date, there is a lack of standardized data incorporating such information for complex abstract stimuli. Thus, we provide norms for a database of 400 abstract and complex stimuli. Grey-scaled fractals were rated by 512 participants on the stimulus features of abstractness, animacy, verbalizability, complexity, familiarity, favorableness, and memorability. Moreover, 111 participants labeled the fractals, enabling us to calculate indices of naming agreement and modal names. Overall, the results confirmed high abstractness and low verbalizability of the provided stimuli. To establish external validation for selected stimulus features, we evaluated (a) classifier probability of a deep neural network labeling the fractals, which correlated negatively with ratings of abstractness and positively with verbalizability and naming agreement; (b) data compression rate of fractal image files, which correlated positively with the rating of complexity; and (c) performance of 212 participants in a recognition-memory task, which correlated positively with the rating of memorability. The present work fills the gap of a standardized database for abstract stimuli and provides valid norms for abstract and complex stimuli based on ratings and external validation measures. This database can be used to control and manipulate these stimulus features in experiments using abstract stimuli, for instance to control for verbal influences and strategies or for novelty and familiarity.
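One of the external validation measures above, data compression rate as a proxy for visual complexity, can be sketched with the standard-library zlib module. The byte strings below are synthetic stand-ins, not images from the fractal database:

```python
import random
import zlib

def compression_ratio(pixel_bytes: bytes) -> float:
    """Compressed size / raw size; higher values indicate less
    redundancy in the pixel data, i.e. a more complex image."""
    return len(zlib.compress(pixel_bytes, level=9)) / len(pixel_bytes)

# Synthetic grey-scale "images" as raw byte strings.
uniform = bytes(64 * 64)  # flat image, highly redundant
random.seed(0)
noisy = bytes(random.randrange(256) for _ in range(64 * 64))  # high complexity

print(compression_ratio(uniform))  # near 0: compresses almost completely
print(compression_ratio(noisy))    # near or above 1: incompressible
```

Real image files would first be decoded to raw pixel arrays before compression, since formats like PNG are already compressed.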
18
Abstract
Is Mr. Hyde more similar to his alter ego Dr. Jekyll, because of their physical identity, or to Jack the Ripper, because both evoke fear and loathing? The relative weight of emotional and visual dimensions in similarity judgements is still unclear. We expected an asymmetric effect of these dimensions on similarity perception, such that faces that express the same or a similar feeling are judged as more similar than different emotional expressions of the same person. We selected 10 male faces with different expressions. Each face posed one neutral expression and one emotional expression (five disgust, five fear). We paired these expressions, resulting in 190 pairs, varying either in emotional expression, physical identity, or both. Twenty healthy participants rated the similarity of paired faces on a 7-point scale. We report a symmetric effect of emotional expression and identity on similarity judgements, suggesting that people may perceive Mr. Hyde to be just as similar to Dr. Jekyll (identity) as to Jack the Ripper (emotion). We also observed that emotional mismatch decreased perceived similarity, suggesting that emotions play a prominent role in similarity judgements. From an evolutionary perspective, poor discrimination between emotional stimuli might endanger the individual.
19
Muttenthaler L, Hebart MN. THINGSvision: A Python Toolbox for Streamlining the Extraction of Activations From Deep Neural Networks. Front Neuroinform 2021; 15:679838. [PMID: 34630062] [PMCID: PMC8494008] [DOI: 10.3389/fninf.2021.679838]
Abstract
Over the past decade, deep neural network (DNN) models have received a lot of attention due to their near-human object classification performance and their excellent prediction of signals recorded from biological visual systems. To better understand the function of these networks and relate them to hypotheses about brain activity and behavior, researchers need to extract the activations to images across different DNN layers. The abundance of different DNN variants, however, can often be unwieldy, and the task of extracting DNN activations from different layers may be non-trivial and error-prone for someone without a strong computational background. Thus, researchers in the fields of cognitive science and computational neuroscience would benefit from a library or package that supports a user in the extraction task. THINGSvision is a new Python module that aims at closing this gap by providing a simple and unified tool for extracting layer activations for a wide range of pretrained and randomly-initialized neural network architectures, even for users with little to no programming experience. We demonstrate the general utility of THINGSvision by relating extracted DNN activations to a number of functional MRI and behavioral datasets using representational similarity analysis, which can be performed as an integral part of the toolbox. Together, THINGSvision enables researchers across diverse fields to extract features in a streamlined manner for their custom image dataset, thereby improving the ease of relating DNNs, brain activity, and behavior, and improving the reproducibility of findings in these research fields.
Collapse
Affiliation(s)
- Lukas Muttenthaler
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Machine Learning Group, Technical University of Berlin, Berlin, Germany
- Martin N. Hebart
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
20
Lindsay GW. Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future. J Cogn Neurosci 2021; 33:2017-2031. [DOI: 10.1162/jocn_a_01544]
Abstract
Convolutional neural networks (CNNs) were inspired by early findings in the study of biological vision. They have since become successful tools in computer vision and state-of-the-art models of both neural activity and behavior on visual tasks. This review highlights what, in the context of CNNs, it means to be a good model in computational neuroscience and the various ways models can provide insight. Specifically, it covers the origins of CNNs and the methods by which we validate them as models of biological vision. It then goes on to elaborate on what we can learn about biological vision by understanding and experimenting on CNNs and discusses emerging opportunities for the use of CNNs in vision research beyond basic object recognition.
21
Xu Y, Vaziri-Pashkam M. Limits to visual representational correspondence between convolutional neural networks and the human brain. Nat Commun 2021; 12:2065. [PMID: 33824315] [PMCID: PMC8024324] [DOI: 10.1038/s41467-021-22244-7]
Abstract
Convolutional neural networks (CNNs) are increasingly used to model human vision due to their high object categorization capabilities and general correspondence with human brain responses. Here we evaluate the performance of 14 different CNNs compared with human fMRI responses to natural and artificial images using representational similarity analysis. Despite the presence of some CNN-brain correspondence and CNNs' impressive ability to fully capture lower level visual representation of real-world objects, we show that CNNs do not fully capture higher level visual representations of real-world objects, nor those of artificial objects, either at lower or higher levels of visual representations. The latter is particularly critical, as the processing of both real-world and artificial visual stimuli engages the same neural circuits. We report similar results regardless of differences in CNN architecture, training, or the presence of recurrent processing. This indicates some fundamental differences exist in how the brain and CNNs represent visual information.
Collapse
Affiliation(s)
- Yaoda Xu
- Psychology Department, Yale University, New Haven, CT, USA.
- Maryam Vaziri-Pashkam
- Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, USA
22
Liu X, Zhen Z, Liu J. Hierarchical Sparse Coding of Objects in Deep Convolutional Neural Networks. Front Comput Neurosci 2020; 14:578158. [PMID: 33362499] [PMCID: PMC7755594] [DOI: 10.3389/fncom.2020.578158]
Abstract
Recently, deep convolutional neural networks (DCNNs) have attained human-level performances on challenging object recognition tasks owing to their complex internal representation. However, it remains unclear how objects are represented in DCNNs with an overwhelming number of features and non-linear operations. In parallel, the same question has been extensively studied in the primate brain, and three types of coding schemes have been found: one object is coded by the entire neuronal population (distributed coding), or by one single neuron (local coding), or by a subset of the neuronal population (sparse coding). Here we asked whether DCNNs adopted any of these coding schemes to represent objects. Specifically, we used the population sparseness index, which is widely used in neurophysiological studies on the primate brain, to characterize the degree of sparseness at each layer in representative DCNNs pretrained for object categorization. We found that the sparse coding scheme was adopted at all layers of the DCNNs, and the degree of sparseness increased along the hierarchy. That is, the coding scheme shifted from distributed-like coding at lower layers to local-like coding at higher layers. Further, the degree of sparseness was positively correlated with DCNNs' performance in object categorization, suggesting that the coding scheme was related to behavioral performance. Finally, with the lesion approach, we demonstrated that both external learning experiences and built-in gating operations were necessary to construct such a hierarchical coding scheme. In sum, our study provides direct evidence that DCNNs adopted a hierarchically-evolved sparse coding scheme as the biological brain does, suggesting the possibility of an implementation-independent principle underlying object recognition.
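A population sparseness index of the kind referenced above can be sketched with the Treves–Rolls formulation common in neurophysiology; the exact normalization used in the study may differ from this one:

```python
def population_sparseness(responses):
    """Treves-Rolls-style sparseness of a population response.
    With a = (mean r)^2 / mean(r^2), the normalized index
    (1 - a) / (1 - 1/n) ranges from 0 (fully distributed: all units
    equally active) to 1 (fully local: a single unit active)."""
    n = len(responses)
    mean_r = sum(responses) / n
    mean_r2 = sum(r * r for r in responses) / n
    a = mean_r ** 2 / mean_r2
    return (1 - a) / (1 - 1 / n)

distributed = [1.0, 1.0, 1.0, 1.0]   # every unit responds equally
local       = [1.0, 0.0, 0.0, 0.0]   # one unit carries the object
print(population_sparseness(distributed))  # 0.0
print(population_sparseness(local))        # 1.0
```

Applied layer by layer to a DCNN's activations for one stimulus, a rising index along the hierarchy corresponds to the distributed-to-local shift the abstract describes.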
Collapse
Affiliation(s)
- Xingyu Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Zonglei Zhen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Jia Liu
- Department of Psychology & Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
23
Chen X, Zhou M, Gong Z, Xu W, Liu X, Huang T, Zhen Z, Liu J. DNNBrain: A Unifying Toolbox for Mapping Deep Neural Networks and Brains. Front Comput Neurosci 2020; 14:580632. [PMID: 33328946] [PMCID: PMC7734148] [DOI: 10.3389/fncom.2020.580632]
Abstract
Deep neural networks (DNNs) have attained human-level performance on dozens of challenging tasks via an end-to-end deep learning strategy. Deep learning allows data representations that have multiple levels of abstraction; however, it does not explicitly provide any insights into the internal operations of DNNs. Deep learning's success is appealing to neuroscientists not only as a method for applying DNNs to model biological neural systems but also as a means of adopting concepts and methods from cognitive neuroscience to understand the internal representations of DNNs. Although general deep learning frameworks, such as PyTorch and TensorFlow, could be used to allow such cross-disciplinary investigations, the use of these frameworks typically requires high-level programming expertise and comprehensive mathematical knowledge. A toolbox specifically designed as a mechanism for cognitive neuroscientists to map both DNNs and brains is urgently needed. Here, we present DNNBrain, a Python-based toolbox designed for exploring the internal representations of DNNs as well as brains. Through the integration of DNN software packages and well-established brain imaging tools, DNNBrain provides application programming and command line interfaces for a variety of research scenarios. These include extracting DNN activation, probing and visualizing DNN representations, and mapping DNN representations onto the brain. We expect that our toolbox will accelerate scientific research by both applying DNNs to model biological neural systems and utilizing paradigms of cognitive neuroscience to unveil the black box of DNNs.
Collapse
Affiliation(s)
- Xiayu Chen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Ming Zhou
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Zhengxin Gong
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Wei Xu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Xingyu Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Taicheng Huang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Zonglei Zhen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
- Jia Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
24
Reaction times predict dynamic brain representations measured with MEG for only some object categorisation tasks. Neuropsychologia 2020; 151:107687. [PMID: 33212137] [DOI: 10.1016/j.neuropsychologia.2020.107687]
Abstract
Behavioural categorisation reaction times (RTs) provide a useful way to link behaviour to brain representations measured with neuroimaging. In this framework, objects are assumed to be represented in a multidimensional activation space, with the distances between object representations indicating their degree of neural similarity. Faster RTs have been reported to correlate with greater distances from a classification decision boundary for animacy. Objects inherently belong to more than one category, yet it is not known whether the RT-distance relationship, and its evolution over the time-course of the neural response, is similar across different categories. Here we used magnetoencephalography (MEG) to address this question. Our stimuli included typically animate and inanimate objects, as well as more ambiguous examples (i.e., robots and toys). We conducted four semantic categorisation tasks on the same stimulus set assessing animacy, living, moving, and human-similarity concepts, and linked the categorisation RTs to MEG time-series decoding data. Our results show a sustained RT-distance relationship throughout the time course of object processing for not only animacy, but also categorisation according to human-similarity. Interestingly, this sustained RT-distance relationship was not observed for the living and moving category organisations, despite comparable classification accuracy of the MEG data across all four category organisations. Our findings show that behavioural RTs predict representational distance for an organisational principle other than animacy, however further research is needed to determine why this relationship is observed only for some category organisations and not others.
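The RT-distance framework described above assumes each object occupies a point in activation space, with its distance from a classification decision boundary predicting categorisation speed. A toy illustration with a mean-difference (prototype) classifier; the patterns and RTs below are hypothetical stand-ins, not MEG data, and the study's actual decoder need not take this form:

```python
def signed_distance(x, proto_a, proto_b):
    """Signed distance of pattern x from the midplane between two class
    prototypes (positive = class-A side). This stands in for a decoder's
    decision boundary in the RT-distance framework."""
    w = [a - b for a, b in zip(proto_a, proto_b)]        # boundary normal
    mid = [(a + b) / 2 for a, b in zip(proto_a, proto_b)]  # boundary offset
    norm = sum(v * v for v in w) ** 0.5
    return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid)) / norm

proto_animate   = [1.0, 0.0]
proto_inanimate = [0.0, 1.0]

# Three "animate" exemplars at increasing distance from the boundary;
# the RTs are hypothetical, chosen to show the predicted negative
# RT-distance relationship (farther from the boundary = faster response).
exemplars = [[0.6, 0.4], [0.8, 0.2], [1.0, 0.0]]
rts_ms = [620, 560, 510]

dists = [signed_distance(x, proto_animate, proto_inanimate) for x in exemplars]
print(dists)  # strictly increasing, mirroring the decreasing RTs
```

In the actual analyses, such distances are computed from time-resolved MEG decoders and correlated with behavioural RTs at each time point.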
25
Wardle SG, Baker C. Recent advances in understanding object recognition in the human brain: deep neural networks, temporal dynamics, and context. F1000Res 2020; 9. [PMID: 32566136] [PMCID: PMC7291077] [DOI: 10.12688/f1000research.22296.1]
Abstract
Object recognition is the ability to identify an object or category based on the combination of visual features observed. It is a remarkable feat of the human brain, given that the patterns of light received by the eye associated with the properties of a given object vary widely with simple changes in viewing angle, ambient lighting, and distance. Furthermore, different exemplars of a specific object category can vary widely in visual appearance, such that successful categorization requires generalization across disparate visual features. In this review, we discuss recent advances in understanding the neural representations underlying object recognition in the human brain. We highlight three current trends in the approach towards this goal within the field of cognitive neuroscience. Firstly, we consider the influence of deep neural networks both as potential models of object vision and in how their representations relate to those in the human brain. Secondly, we review the contribution that time-series neuroimaging methods have made towards understanding the temporal dynamics of object representations beyond their spatial organization within different brain regions. Finally, we argue that an increasing emphasis on the context (both visual and task) within which object recognition occurs has led to a broader conceptualization of what constitutes an object representation for the brain. We conclude by identifying some current challenges facing the experimental pursuit of understanding object recognition and outline some emerging directions that are likely to yield new insight into this complex cognitive process.
Collapse
Affiliation(s)
- Susan G Wardle
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
- Chris Baker
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
26
Mattioni S, Rezk M, Battal C, Bottini R, Cuculiza Mendoza KE, Oosterhof NN, Collignon O. Categorical representation from sound and sight in the ventral occipito-temporal cortex of sighted and blind. eLife 2020; 9:50732. [PMID: 32108572] [PMCID: PMC7108866] [DOI: 10.7554/elife.50732]
Abstract
Is vision necessary for the development of the categorical organization of the Ventral Occipito-Temporal Cortex (VOTC)? We used fMRI to characterize VOTC responses to eight categories presented acoustically in sighted and early blind individuals, and visually in a separate sighted group. We observed that VOTC reliably encodes sound categories in sighted and blind people using a representational structure and connectivity partially similar to the one found in vision. Sound categories were, however, more reliably encoded in the blind than the sighted group, using a representational format closer to the one found in vision. Crucially, VOTC in blind represents the categorical membership of sounds rather than their acoustic features. Our results suggest that sounds trigger categorical responses in the VOTC of congenitally blind and sighted people that partially match the topography and functional profile of the visual response, despite qualitative nuances in the categorical organization of VOTC between modalities and groups. The world is full of rich and dynamic visual information. To avoid information overload, the human brain groups inputs into categories such as faces, houses, or tools. A part of the brain called the ventral occipito-temporal cortex (VOTC) helps categorize visual information. Specific parts of the VOTC prefer different types of visual input; for example, one part may tend to respond more to faces, whilst another may prefer houses. However, it is not clear how the VOTC characterizes information. One idea is that similarities between certain types of visual information may drive how information is organized in the VOTC. For example, looking at faces requires using central vision, while looking at houses requires using peripheral vision. Furthermore, all faces have a roundish shape while houses tend to have a more rectangular shape. 
Another possibility, however, is that the categorization of different inputs cannot be explained by vision alone, and may also be driven by higher-level aspects of each category. For instance, how humans use or interact with something may also influence how an input is categorized. If categories are established depending (at least partially) on these higher-level aspects, rather than purely on visual likeness, the VOTC would likely respond similarly to both sounds and images representing these categories. Now, Mattioni et al. have tested how individuals with and without sight respond to eight different categories of information to find out whether or not categorization is driven purely by visual likeness. Each category was presented to participants using sounds while their brain activity was measured. In addition, a group of participants who could see were also presented with the categories visually. Mattioni et al. then compared what happened in the VOTC of the three groups – sighted people presented with sounds, blind people presented with sounds, and sighted people presented with images – in response to each category. The experiment revealed that the VOTC organizes both auditory and visual information in a similar way. However, there were more similarities between the way blind people categorized auditory information and how sighted people categorized visual information than between how sighted people categorized each type of input. Mattioni et al. also found that the region of the VOTC that responds to inanimate objects overlapped extensively across the three groups, whereas the part of the VOTC that responds to living things was more variable. These findings suggest that the way the VOTC organizes information is, at least partly, independent of vision. The experiments also provide some information about how the brain reorganizes in people who are born blind.
Further studies may reveal how differences in the VOTC of people with and without sight affect regions typically associated with auditory categorization, and potentially explain how the brain reorganizes in people who become blind later in life.
Collapse
Affiliation(s)
- Stefania Mattioni
- Institute of research in Psychology (IPSY) & Institute of Neuroscience (IoNS) - University of Louvain (UCLouvain), Louvain-la-Neuve, Belgium
| | - Mohamed Rezk
- Institute of research in Psychology (IPSY) & Institute of Neuroscience (IoNS) - University of Louvain (UCLouvain), Louvain-la-Neuve, Belgium.,Centre for Mind/Brain Sciences, University of Trento, Trento, Italy
| | - Ceren Battal
- Institute of research in Psychology (IPSY) & Institute of Neuroscience (IoNS) - University of Louvain (UCLouvain), Louvain-la-Neuve, Belgium.,Centre for Mind/Brain Sciences, University of Trento, Trento, Italy
| | - Roberto Bottini
- Centre for Mind/Brain Sciences, University of Trento, Trento, Italy
| | - Olivier Collignon
- Institute of research in Psychology (IPSY) & Institute of Neuroscience (IoNS) - University of Louvain (UCLouvain), Louvain-la-Neuve, Belgium
| |
Collapse
|
27
|
|
28
|
The Emotional Facet of Subjective and Neural Indices of Similarity. Brain Topogr 2019; 32:956-964. [PMID: 31728708 PMCID: PMC6882781 DOI: 10.1007/s10548-019-00743-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2019] [Accepted: 11/02/2019] [Indexed: 12/27/2022]
Abstract
Emotional similarity refers to the tendency to group stimuli together because they evoke the same feelings in us. Most research on similarity perception conducted to date has focused on non-emotional stimuli. Different models have been proposed to explain how we represent semantic concepts and judge the similarity among them. These models are supported by behavioural and neural evidence, often combined using Multivariate Pattern Analyses. By contrast, less is known about the cognitive and neural mechanisms underlying the judgement of similarity between real-life emotional experiences. This review summarizes the major findings, debates and limitations in the semantic similarity literature, which serve as background to the emotional facet of similarity that is the focus of this review. A multi-modal and overarching approach, relating different levels of neuroscientific explanation (i.e., computational, algorithmic and implementational), may be key to further unveiling what makes emotional experiences similar to each other.
Collapse
|