1
Kumar S, Sumers TR, Yamakoshi T, Goldstein A, Hasson U, Norman KA, Griffiths TL, Hawkins RD, Nastase SA. Shared functional specialization in transformer-based language models and the human brain. Nat Commun 2024; 15:5523. [PMID: 38951520 PMCID: PMC11217339 DOI: 10.1038/s41467-024-49173-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 05/24/2024] [Indexed: 07/03/2024] Open
Abstract
When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations ("embeddings") generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct these computations into the functionally-specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that the transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
Affiliation(s)
- Sreejan Kumar
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
- Theodore R Sumers
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
- Takateru Yamakoshi
  - Faculty of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan
- Ariel Goldstein
  - Department of Cognitive and Brain Sciences and Business School, Hebrew University, Jerusalem, 9190401, Israel
- Uri Hasson
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
  - Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Kenneth A Norman
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
  - Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Thomas L Griffiths
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
  - Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Robert D Hawkins
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
  - Department of Psychology, Princeton University, Princeton, NJ, 08540, USA
- Samuel A Nastase
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
2
Prochnow A, Zhou X, Ghorbani F, Roessner V, Hommel B, Beste C. Event segmentation in ADHD: neglect of social information and deviant theta activity point to a mechanism underlying ADHD. Gen Psychiatr 2024; 37:e101486. [PMID: 38859926 PMCID: PMC11163598 DOI: 10.1136/gpsych-2023-101486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 05/08/2024] [Indexed: 06/12/2024] Open
Abstract
Background: Attention-deficit/hyperactivity disorder (ADHD) is one of the most frequently diagnosed psychiatric conditions in children and adolescents. Although the symptoms appear to be well described, no coherent conceptual mechanistic framework integrates their occurrence and variance and the associated problems that people with ADHD face.
Aims: The current study proposes that altered event segmentation processes provide a novel mechanistic framework for understanding deficits in ADHD.
Methods: Adolescents with ADHD and neurotypically developing (NT) peers watched a short movie and were then asked to indicate the boundaries between meaningful segments of the movie. Concomitantly recorded electroencephalography (EEG) data were analysed for differences in frequency band activity and effective connectivity between brain areas.
Results: Compared with their NT peers, the ADHD group showed less dependence of their segmentation behaviour on social information, indicating that they did not consider social information to the same extent as their unaffected peers. This divergence was accompanied by differences in EEG theta band activity and a different effective connectivity network architecture at the source level. Specifically, NT adolescents primarily showed error signalling in and between the left and right fusiform gyri related to social information processing, which was not the case in the ADHD group. For the ADHD group, the inferior frontal cortex associated with attentional sampling served as a hub instead, indicating problems in the deployment of attentional control.
Conclusions: This study shows that adolescents with ADHD perceive events differently from their NT peers, in association with a different brain network architecture that reflects less adaptation to the situation and problems in attentional sampling of environmental information. The results call for a novel conceptual view of ADHD, based on event segmentation theory.
Affiliation(s)
- Astrid Prochnow
  - Cognitive Neurophysiology, Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
- Xianzhen Zhou
  - Cognitive Neurophysiology, Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
- Foroogh Ghorbani
  - Cognitive Neurophysiology, Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
- Veit Roessner
  - Cognitive Neurophysiology, Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
- Bernhard Hommel
  - Faculty of Psychology, Shandong Normal University, Jinan, Shandong, China
- Christian Beste
  - Cognitive Neurophysiology, Department of Child and Adolescent Psychiatry, Faculty of Medicine, TU Dresden, Dresden, Germany
  - Faculty of Psychology, Shandong Normal University, Jinan, Shandong, China
3
Lee Masson H, Chang L, Isik L. Multidimensional neural representations of social features during movie viewing. Soc Cogn Affect Neurosci 2024; 19:nsae030. [PMID: 38722755 PMCID: PMC11130526 DOI: 10.1093/scan/nsae030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/05/2024] [Accepted: 05/03/2024] [Indexed: 05/29/2024] Open
Abstract
The social world is dynamic and contextually embedded. Yet, most studies utilize simple stimuli that do not capture the complexity of everyday social episodes. To address this, we implemented a movie viewing paradigm and investigated how everyday social episodes are processed in the brain. Participants watched one of two movies during an MRI scan. Neural patterns from brain regions involved in social perception, mentalization, action observation and sensory processing were extracted. Representational similarity analysis results revealed that several labeled social features (including social interaction, mentalization, the actions of others, characters talking about themselves, talking about others and talking about objects) were represented in the superior temporal gyrus (STG) and middle temporal gyrus (MTG). The mentalization feature was also represented throughout the theory of mind network, and characters talking about others engaged the temporoparietal junction (TPJ), suggesting that listeners may spontaneously infer the mental state of those being talked about. In contrast, we did not observe the action representations in the frontoparietal regions of the action observation network. The current findings indicate that STG and MTG serve as key regions for social processing, and that listening to characters talk about others elicits spontaneous mental state inference in TPJ during natural movie viewing.
Affiliation(s)
- Lucy Chang
  - Department of Cognitive Science, Johns Hopkins University, Baltimore 21218, USA
- Leyla Isik
  - Department of Cognitive Science, Johns Hopkins University, Baltimore 21218, USA
4
Bonnaire J, Dumas G, Cassell J. Bringing together multimodal and multilevel approaches to study the emergence of social bonds between children and improve social AI. Frontiers in Neuroergonomics 2024; 5:1290256. [PMID: 38827377 PMCID: PMC11140154 DOI: 10.3389/fnrgo.2024.1290256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 04/29/2024] [Indexed: 06/04/2024]
Abstract
This protocol paper outlines an innovative multimodal and multilevel approach to studying the emergence and evolution of how children build social bonds with their peers, and its potential application to improving social artificial intelligence (AI). We detail a unique hyperscanning experimental framework utilizing functional near-infrared spectroscopy (fNIRS) to observe inter-brain synchrony in child dyads during collaborative tasks and social interactions. Our proposed longitudinal study spans middle childhood, aiming to capture the dynamic development of social connections and cognitive engagement in naturalistic settings. To do so we bring together four kinds of data: the multimodal conversational behaviors that dyads of children engage in, evidence of their state of interpersonal rapport, collaborative performance on educational tasks, and inter-brain synchrony. Preliminary pilot data provide foundational support for our approach, indicating promising directions for identifying neural patterns associated with productive social interactions. The planned research will explore the neural correlates of social bond formation, informing the creation of a virtual peer learning partner in the field of Social Neuroergonomics. This protocol promises significant contributions to understanding the neural basis of social connectivity in children, while also offering a blueprint for designing empathetic and effective social AI tools, particularly for educational contexts.
Affiliation(s)
- Guillaume Dumas
  - Research Center of the CHU Sainte-Justine, Department of Psychiatry, University of Montréal, Montreal, QC, Canada
  - Mila–Quebec Artificial Intelligence Institute, Montreal, QC, Canada
- Justine Cassell
  - Inria Paris Centre, Paris, France
  - School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
5
Della Vedova G, Proverbio AM. Neural signatures of imaginary motivational states: desire for music, movement and social play. Brain Topogr 2024:10.1007/s10548-024-01047-1. [PMID: 38625520 DOI: 10.1007/s10548-024-01047-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 03/12/2024] [Indexed: 04/17/2024]
Abstract
The literature has demonstrated the potential for detecting accurate electrical signals that correspond to the will or intention to move, as well as decoding the thoughts of individuals who imagine houses, faces or objects. This investigation examines the presence of precise neural markers of imagined motivational states through the combining of electrophysiological and neuroimaging methods. 20 participants were instructed to vividly imagine the desire to move, listen to music or engage in social activities. Their EEG was recorded from 128 scalp sites and analysed using individual standardized Low-Resolution Brain Electromagnetic Tomographies (LORETAs) in the N400 time window (400-600 ms). The activation of 1056 voxels was examined in relation to the 3 motivational states. The most active dipoles were grouped in eight regions of interest (ROI), including Occipital, Temporal, Fusiform, Premotor, Frontal, OBF/IF, Parietal, and Limbic areas. The statistical analysis revealed that all motivational imaginary states engaged the right hemisphere more than the left hemisphere. Distinct markers were identified for the three motivational states. Specifically, the right temporal area was more relevant for "Social Play", the orbitofrontal/inferior frontal cortex for listening to music, and the left premotor cortex for the "Movement" desire. This outcome is encouraging in terms of the potential use of neural indicators in the realm of brain-computer interface, for interpreting the thoughts and desires of individuals with locked-in syndrome.
Affiliation(s)
- Giada Della Vedova
  - Cognitive Electrophysiology lab, Dept. of Psychology, University of Milano, Bicocca, Italy
- Alice Mado Proverbio
  - Cognitive Electrophysiology lab, Dept. of Psychology, University of Milano, Bicocca, Italy
  - NeuroMI, Milan Center for Neuroscience, Milan, Italy
  - Department of Psychology of University of Milano-Bicocca, Piazza dell'Ateneo nuovo 1, Milan, 20162, Italy
6
Lee Masson H, Chen J, Isik L. A shared neural code for perceiving and remembering social interactions in the human superior temporal sulcus. Neuropsychologia 2024; 196:108823. [PMID: 38346576 DOI: 10.1016/j.neuropsychologia.2024.108823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 01/15/2024] [Accepted: 02/09/2024] [Indexed: 02/20/2024]
Abstract
Recognizing and remembering social information is a crucial cognitive skill. Neural patterns in the superior temporal sulcus (STS) support our ability to perceive others' social interactions. However, despite the prominence of social interactions in memory, the neural basis of remembering social interactions is still unknown. To fill this gap, we investigated the brain mechanisms underlying memory of others' social interactions during free spoken recall of a naturalistic movie. By applying machine learning-based fMRI encoding analyses to densely labeled movie and recall data we found that a subset of the STS activity evoked by viewing social interactions predicted neural responses in not only held-out movie data, but also during memory recall. These results provide the first evidence that activity in the STS is reinstated in response to specific social content and that its reactivation underlies our ability to remember others' interactions. These findings further suggest that the STS contains representations of social interactions that are not only perceptually driven, but also more abstract or conceptual in nature.
Affiliation(s)
- Haemy Lee Masson
  - Department of Psychology, Durham University, Durham, DH1 3LE, United Kingdom; Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, 21218, United States
- Janice Chen
  - Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, 21218, United States
- Leyla Isik
  - Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, 21218, United States
7
Kabulska Z, Zhuang T, Lingnau A. Overlapping representations of observed actions and action-related features. Hum Brain Mapp 2024; 45:e26605. [PMID: 38379447 PMCID: PMC10879913 DOI: 10.1002/hbm.26605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 12/21/2023] [Accepted: 01/12/2024] [Indexed: 02/22/2024] Open
Abstract
The lateral occipitotemporal cortex (LOTC) has been shown to capture the representational structure of a smaller range of actions. In the current study, we carried out an fMRI experiment in which we presented human participants with images depicting 100 different actions and used representational similarity analysis (RSA) to determine which brain regions capture the semantic action space established using judgments of action similarity. Moreover, to determine the contribution of a wide range of action-related features to the neural representation of the semantic action space we constructed an action feature model on the basis of ratings of 44 different features. We found that the semantic action space model and the action feature model are best captured by overlapping activation patterns in bilateral LOTC and ventral occipitotemporal cortex (VOTC). An RSA on eight dimensions resulting from principal component analysis carried out on the action feature model revealed partly overlapping representations within bilateral LOTC, VOTC, and the parietal lobe. Our results suggest spatially overlapping representations of the semantic action space of a wide range of actions and the corresponding action-related features. Together, our results add to our understanding of the kind of representations along the LOTC that support action understanding.
Affiliation(s)
- Zuzanna Kabulska
  - Faculty of Human Sciences, Institute of Psychology, Chair of Cognitive Neuroscience, University of Regensburg, Regensburg, Germany
- Tonghe Zhuang
  - Faculty of Human Sciences, Institute of Psychology, Chair of Cognitive Neuroscience, University of Regensburg, Regensburg, Germany
- Angelika Lingnau
  - Faculty of Human Sciences, Institute of Psychology, Chair of Cognitive Neuroscience, University of Regensburg, Regensburg, Germany
8
Soulos P, Isik L. Disentangled deep generative models reveal coding principles of the human face processing network. PLoS Comput Biol 2024; 20:e1011887. [PMID: 38408105 PMCID: PMC10919870 DOI: 10.1371/journal.pcbi.1011887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 03/07/2024] [Accepted: 02/02/2024] [Indexed: 02/28/2024] Open
Abstract
Despite decades of research, much is still unknown about the computations carried out in the human face processing network. Recently, deep networks have been proposed as a computational account of human visual processing, but while they provide a good match to neural data throughout visual cortex, they lack interpretability. We introduce a method for interpreting brain activity using a new class of deep generative models, disentangled representation learning models, which learn a low-dimensional latent space that "disentangles" different semantically meaningful dimensions of faces, such as rotation, lighting, or hairstyle, in an unsupervised manner by enforcing statistical independence between dimensions. We find that the majority of our model's learned latent dimensions are interpretable by human raters. Further, these latent dimensions serve as a good encoding model for human fMRI data. We next investigate the representation of different latent dimensions across face-selective voxels. We find that low- and high-level face features are represented in posterior and anterior face-selective regions, respectively, corroborating prior models of human face recognition. Interestingly, though, we find identity-relevant and irrelevant face features across the face processing network. Finally, we provide new insight into the few "entangled" (uninterpretable) dimensions in our model by showing that they match responses in the ventral stream and carry information about facial identity. Disentangled face encoding models provide an exciting alternative to standard "black box" deep learning approaches for modeling and interpreting human brain data.
Affiliation(s)
- Paul Soulos
  - Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Leyla Isik
  - Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
9
Gandolfo M, Abassi E, Balgova E, Downing PE, Papeo L, Koldewyn K. Converging evidence that left extrastriate body area supports visual sensitivity to social interactions. Curr Biol 2024; 34:343-351.e5. [PMID: 38181794 DOI: 10.1016/j.cub.2023.12.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/16/2023] [Revised: 11/25/2023] [Accepted: 12/05/2023] [Indexed: 01/07/2024]
Abstract
Navigating our complex social world requires processing the interactions we observe. Recent psychophysical and neuroimaging studies provide parallel evidence that the human visual system may be attuned to efficiently perceive dyadic interactions. This work implies, but has not yet demonstrated, that activity in body-selective cortical regions causally supports efficient visual perception of interactions. We adopt a multi-method approach to close this important gap. First, using a large fMRI dataset (n = 92), we found that the left hemisphere extrastriate body area (EBA) responds more to face-to-face than non-facing dyads. Second, we replicated a behavioral marker of visual sensitivity to interactions: categorization of facing dyads is more impaired by inversion than non-facing dyads. Third, in a pre-registered experiment, we used fMRI-guided transcranial magnetic stimulation to show that online stimulation of the left EBA, but not a nearby control region, abolishes this selective inversion effect. Activity in left EBA, thus, causally supports the efficient perception of social interactions.
Affiliation(s)
- Marco Gandolfo
  - Donders Institute, Radboud University, Nijmegen 6525GD, the Netherlands; Department of Psychology, Bangor University, Bangor LL572AS, Gwynedd, UK
- Etienne Abassi
  - Institut des Sciences Cognitives, Marc Jeannerod, Lyon 69500, France
- Eva Balgova
  - Department of Psychology, Bangor University, Bangor LL572AS, Gwynedd, UK; Department of Psychology, Aberystwyth University, Aberystwyth SY23 3UX, Ceredigion, UK
- Paul E Downing
  - Department of Psychology, Bangor University, Bangor LL572AS, Gwynedd, UK
- Liuba Papeo
  - Institut des Sciences Cognitives, Marc Jeannerod, Lyon 69500, France
- Kami Koldewyn
  - Department of Psychology, Bangor University, Bangor LL572AS, Gwynedd, UK
10
Karakose-Akbiyik S, Sussman O, Wurm MF, Caramazza A. The Role of Agentive and Physical Forces in the Neural Representation of Motion Events. J Neurosci 2024; 44:e1363232023. [PMID: 38050107 PMCID: PMC10860628 DOI: 10.1523/jneurosci.1363-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 11/14/2023] [Accepted: 11/19/2023] [Indexed: 12/06/2023] Open
Abstract
How does the brain represent information about motion events in relation to agentive and physical forces? In this study, we investigated the neural activity patterns associated with observing animated actions of agents (e.g., an agent hitting a chair) in comparison to similar movements of inanimate objects that were either shaped solely by the physics of the scene (e.g., gravity causing an object to fall down a hill and hit a chair) or initiated by agents (e.g., a visible agent causing an object to hit a chair). Using an fMRI-based multivariate pattern analysis (MVPA), this design allowed testing where in the brain the neural activity patterns associated with motion events change as a function of, or are invariant to, agentive versus physical forces behind them. A total of 29 human participants (nine male) participated in the study. Cross-decoding revealed a shared neural representation of animate and inanimate motion events that is invariant to agentive or physical forces in regions spanning frontoparietal and posterior temporal cortices. In contrast, the right lateral occipitotemporal cortex showed a higher sensitivity to agentive events, while the left dorsal premotor cortex was more sensitive to information about inanimate object events that were solely shaped by the physics of the scene.
Affiliation(s)
- Oliver Sussman
  - Department of Psychology, Harvard University, Cambridge, Massachusetts 02138
- Moritz F Wurm
  - Center for Mind/Brain Sciences - CIMeC, University of Trento, 38068 Rovereto, Italy
- Alfonso Caramazza
  - Department of Psychology, Harvard University, Cambridge, Massachusetts 02138
  - Center for Mind/Brain Sciences - CIMeC, University of Trento, 38068 Rovereto, Italy
11
Olson HA, Chen EM, Lydic KO, Saxe RR. Left-Hemisphere Cortical Language Regions Respond Equally to Observed Dialogue and Monologue. Neurobiology of Language (Cambridge, Mass.) 2023; 4:575-610. [PMID: 38144236 PMCID: PMC10745132 DOI: 10.1162/nol_a_00123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 09/20/2023] [Indexed: 12/26/2023]
Abstract
Much of the language we encounter in our everyday lives comes in the form of conversation, yet the majority of research on the neural basis of language comprehension has used input from only one speaker at a time. Twenty adults were scanned while passively observing audiovisual conversations using functional magnetic resonance imaging. In a block-design task, participants watched 20 s videos of puppets speaking either to another puppet (the dialogue condition) or directly to the viewer (the monologue condition), while the audio was either comprehensible (played forward) or incomprehensible (played backward). Individually functionally localized left-hemisphere language regions responded more to comprehensible than incomprehensible speech but did not respond differently to dialogue than monologue. In a second task, participants watched videos (1-3 min each) of two puppets conversing with each other, in which one puppet was comprehensible while the other's speech was reversed. All participants saw the same visual input but were randomly assigned which character's speech was comprehensible. In left-hemisphere cortical language regions, the time course of activity was correlated only among participants who heard the same character speaking comprehensibly, despite identical visual input across all participants. For comparison, some individually localized theory of mind regions and right-hemisphere homologues of language regions responded more to dialogue than monologue in the first task, and in the second task, activity in some regions was correlated across all participants regardless of which character was speaking comprehensibly. Together, these results suggest that canonical left-hemisphere cortical language regions are not sensitive to differences between observed dialogue and monologue.
12
McMahon E, Bonner MF, Isik L. Hierarchical organization of social action features along the lateral visual pathway. Curr Biol 2023; 33:5035-5047.e8. [PMID: 37918399 PMCID: PMC10841461 DOI: 10.1016/j.cub.2023.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/26/2023] [Revised: 09/01/2023] [Accepted: 10/10/2023] [Indexed: 11/04/2023]
Abstract
Recent theoretical work has argued that in addition to the classical ventral (what) and dorsal (where/how) visual streams, there is a third visual stream on the lateral surface of the brain specialized for processing social information. Like visual representations in the ventral and dorsal streams, representations in the lateral stream are thought to be hierarchically organized. However, no prior studies have comprehensively investigated the organization of naturalistic, social visual content in the lateral stream. To address this question, we curated a naturalistic stimulus set of 250 3-s videos of two people engaged in everyday actions. Each clip was richly annotated for its low-level visual features, mid-level scene and object properties, visual social primitives (including the distance between people and the extent to which they were facing), and high-level information about social interactions and affective content. Using a condition-rich fMRI experiment and a within-subject encoding model approach, we found that low-level visual features are represented in early visual cortex (EVC) and middle temporal (MT) area, mid-level visual social features in extrastriate body area (EBA) and lateral occipital complex (LOC), and high-level social interaction information along the superior temporal sulcus (STS). Communicative interactions, in particular, explained unique variance in regions of the STS after accounting for variance explained by all other labeled features. Taken together, these results provide support for representation of increasingly abstract social visual content, consistent with hierarchical organization, along the lateral visual stream and suggest that recognizing communicative actions may be a key computational goal of the lateral visual pathway.
Affiliation(s)
- Emalie McMahon
  - Department of Cognitive Science, Zanvyl Krieger School of Arts & Sciences, Johns Hopkins University, 237 Krieger Hall, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Michael F Bonner
  - Department of Cognitive Science, Zanvyl Krieger School of Arts & Sciences, Johns Hopkins University, 237 Krieger Hall, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Leyla Isik
  - Department of Cognitive Science, Zanvyl Krieger School of Arts & Sciences, Johns Hopkins University, 237 Krieger Hall, 3400 N. Charles Street, Baltimore, MD 21218, USA; Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Suite 400 West, Wyman Park Building, 3400 N. Charles Street, Baltimore, MD 21218, USA
13
Malik M, Isik L. Relational visual representations underlie human social interaction recognition. Nat Commun 2023; 14:7317. [PMID: 37951960 PMCID: PMC10640586 DOI: 10.1038/s41467-023-43156-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/28/2022] [Accepted: 11/02/2023] [Indexed: 11/14/2023] Open
Abstract
Humans effortlessly recognize social interactions from visual input. Attempts to model this ability have typically relied on generative inverse planning models, which make predictions by inverting a generative model of agents' interactions based on their inferred goals, suggesting humans use a similar process of mental inference to recognize interactions. However, growing behavioral and neuroscience evidence suggests that recognizing social interactions is a visual process, separate from complex mental state inference. Yet despite their success in other domains, visual neural network models have been unable to reproduce human-like interaction recognition. We hypothesize that humans rely on relational visual information in particular, and develop a relational, graph neural network model, SocialGNN. Unlike prior models, SocialGNN accurately predicts human interaction judgments across both animated and natural videos. These results suggest that humans can make complex social interaction judgments without an explicit model of the social and physical world, and that structured, relational visual representations are key to this behavior.
Affiliation(s)
- Manasi Malik
- Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Leyla Isik
- Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
|
14
|
Landsiedel J, Koldewyn K. Auditory dyadic interactions through the "eye" of the social brain: How visual is the posterior STS interaction region? Imaging Neuroscience (Cambridge, Mass.) 2023; 1:1-20. [PMID: 37719835 PMCID: PMC10503480 DOI: 10.1162/imag_a_00003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 05/17/2023] [Indexed: 09/19/2023]
Abstract
Human interactions contain potent social cues that meet not only the eye but also the ear. Although research has identified a region in the posterior superior temporal sulcus as being particularly sensitive to visually presented social interactions (SI-pSTS), its response to auditory interactions has not been tested. Here, we used fMRI to explore brain response to auditory interactions, with a focus on temporal regions known to be important in auditory processing and social interaction perception. In Experiment 1, monolingual participants listened to two-speaker conversations (intact or sentence-scrambled) and one-speaker narrations in both a known and an unknown language. Speaker number and conversational coherence were explored in separately localised regions-of-interest (ROI). In Experiment 2, bilingual participants were scanned to explore the role of language comprehension. Combining univariate and multivariate analyses, we found initial evidence for a heteromodal response to social interactions in SI-pSTS. Specifically, right SI-pSTS preferred auditory interactions over control stimuli and represented information about both speaker number and interactive coherence. Bilateral temporal voice areas (TVA) showed a similar, but less specific, profile. Exploratory analyses identified another auditory-interaction sensitive area in anterior STS. Indeed, direct comparison suggests modality specific tuning, with SI-pSTS preferring visual information while aSTS prefers auditory information. Altogether, these results suggest that right SI-pSTS is a heteromodal region that represents information about social interactions in both visual and auditory domains. Future work is needed to clarify the roles of TVA and aSTS in auditory interaction perception and further probe right SI-pSTS interaction-selectivity using non-semantic prosodic cues.
Affiliation(s)
- Julia Landsiedel
- Department of Psychology, School of Human and Behavioural Sciences, Bangor University, Bangor, United Kingdom
- Kami Koldewyn
- Department of Psychology, School of Human and Behavioural Sciences, Bangor University, Bangor, United Kingdom
|
15
|
Deen B, Schwiedrzik CM, Sliwa J, Freiwald WA. Specialized Networks for Social Cognition in the Primate Brain. Annu Rev Neurosci 2023; 46:381-401. [PMID: 37428602 PMCID: PMC11115357 DOI: 10.1146/annurev-neuro-102522-121410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2023]
Abstract
Primates have evolved diverse cognitive capabilities to navigate their complex social world. To understand how the brain implements critical social cognitive abilities, we describe functional specialization in the domains of face processing, social interaction understanding, and mental state attribution. Systems for face processing are specialized from the level of single cells to populations of neurons within brain regions to hierarchically organized networks that extract and represent abstract social information. Such functional specialization is not confined to the sensorimotor periphery but appears to be a pervasive theme of primate brain organization all the way to the apex regions of cortical hierarchies. Circuits processing social information are juxtaposed with parallel systems involved in processing nonsocial information, suggesting common computations applied to different domains. The emerging picture of the neural basis of social cognition is a set of distinct but interacting subnetworks involved in component processes such as face perception and social reasoning, traversing large parts of the primate brain.
Affiliation(s)
- Ben Deen
- Psychology Department & Tulane Brain Institute, Tulane University, New Orleans, Louisiana, USA
- Caspar M Schwiedrzik
- Neural Circuits and Cognition Lab, European Neuroscience Institute Göttingen, A Joint Initiative of the University Medical Center Göttingen and the Max Planck Society; Perception and Plasticity Group, German Primate Center, Leibniz Institute for Primate Research; and Leibniz-Science Campus Primate Cognition, Göttingen, Germany
- Julia Sliwa
- Sorbonne Université, Institut du Cerveau, ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
- Winrich A Freiwald
- Laboratory of Neural Systems and The Price Family Center for the Social Brain, The Rockefeller University, New York, NY, USA
- The Center for Brains, Minds and Machines, Cambridge, Massachusetts, USA
|
16
|
Rolls ET, Wirth S, Deco G, Huang C, Feng J. The human posterior cingulate, retrosplenial, and medial parietal cortex effective connectome, and implications for memory and navigation. Hum Brain Mapp 2023; 44:629-655. [PMID: 36178249 PMCID: PMC9842927 DOI: 10.1002/hbm.26089] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 05/30/2022] [Revised: 09/05/2022] [Accepted: 09/07/2022] [Indexed: 01/25/2023] Open
Abstract
The human posterior cingulate, retrosplenial, and medial parietal cortex are involved in memory and navigation. The functional anatomy underlying these cognitive functions was investigated by measuring the effective connectivity of these Posterior Cingulate Division (PCD) regions in the Human Connectome Project-MMP1 atlas in 171 HCP participants, and complemented with functional connectivity and diffusion tractography. First, the postero-ventral parts of the PCD (31pd, 31pv, 7m, d23ab, and v23ab) have effective connectivity with the temporal pole, inferior temporal visual cortex, cortex in the superior temporal sulcus implicated in auditory and semantic processing, with the reward-related vmPFC and pregenual anterior cingulate cortex, with the inferior parietal cortex, and with the hippocampal system. This connectivity implicates it in hippocampal episodic memory, providing routes for "what," reward and semantic schema-related information to access the hippocampus. Second, the antero-dorsal parts of the PCD (especially 31a and 23d, PCV, and also RSC) have connectivity with early visual cortical areas including those that represent spatial scenes, with the superior parietal cortex, with the pregenual anterior cingulate cortex, and with the hippocampal system. This connectivity implicates it in the "where" component for hippocampal episodic memory and for spatial navigation. The dorsal-transitional-visual (DVT) and ProStriate regions where the retrosplenial scene area is located have connectivity from early visual cortical areas to the parahippocampal scene area, providing a ventromedial route for spatial scene information to reach the hippocampus. These connectivities provide important routes for "what," reward, and "where" scene-related information for human hippocampal episodic memory and navigation. The midcingulate cortex provides a route from the anterior dorsal parts of the PCD and the supracallosal part of the anterior cingulate cortex to premotor regions.
Affiliation(s)
- Edmund T. Rolls
- Oxford Centre for Computational Neuroscience, Oxford, UK
- Department of Computer Science, University of Warwick, Coventry, UK
- Institute of Science and Technology for Brain Inspired Intelligence, Fudan University, Shanghai, China
- Key Laboratory of Computational Neuroscience and Brain Inspired Intelligence, Fudan University, Ministry of Education, Shanghai, China
- Fudan ISTBI—ZJNU Algorithm Centre for Brain‐Inspired Intelligence, Zhejiang Normal University, Jinhua, China
- Sylvia Wirth
- Institut des Sciences Cognitives Marc Jeannerod, UMR 5229, CNRS and University of Lyon, Bron, France
- Gustavo Deco
- Center for Brain and Cognition, Computational Neuroscience Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Brain and Cognition, Pompeu Fabra University, Barcelona, Spain
- Institució Catalana de la Recerca i Estudis Avançats (ICREA), Universitat Pompeu Fabra, Barcelona, Spain
- Chu‐Chung Huang
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Jianfeng Feng
- Department of Computer Science, University of Warwick, Coventry, UK
- Institute of Science and Technology for Brain Inspired Intelligence, Fudan University, Shanghai, China
- Key Laboratory of Computational Neuroscience and Brain Inspired Intelligence, Fudan University, Ministry of Education, Shanghai, China
- Fudan ISTBI—ZJNU Algorithm Centre for Brain‐Inspired Intelligence, Zhejiang Normal University, Jinhua, China
|
17
|
Abassi E, Papeo L. Behavioral and neural markers of visual configural processing in social scene perception. Neuroimage 2022; 260:119506. [PMID: 35878724 DOI: 10.1016/j.neuroimage.2022.119506] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/08/2021] [Revised: 07/18/2022] [Accepted: 07/21/2022] [Indexed: 11/19/2022] Open
Abstract
Research on face perception has revealed highly specialized visual mechanisms such as configural processing, and provided markers of interindividual differences -including disease risks and alterations- in visuo-perceptual abilities that traffic in social cognition. Is face perception unique in degree or kind of mechanisms, and in its relevance for social cognition? Combining functional MRI and behavioral methods, we address the processing of an uncharted class of socially relevant stimuli: minimal social scenes involving configurations of two bodies spatially close and face-to-face as if interacting (hereafter, facing dyads). We report category-specific activity for facing (vs. non-facing) dyads in visual cortex. That activity shows face-like signatures of configural processing -i.e., stronger response to facing (vs. non-facing) dyads, and greater susceptibility to stimulus inversion for facing (vs. non-facing) dyads-, and is predicted by performance-based measures of configural processing in visual perception of body dyads. Moreover, we observe that the individual performance in body-dyad perception is reliable, stable-over-time and correlated with the individual social sensitivity, coarsely captured by the Autism-Spectrum Quotient. Further analyses clarify the relationship between single-body and body-dyad perception. We propose that facing dyads are processed through highly specialized mechanisms -and brain areas-, analogously to other biologically and socially relevant stimuli such as faces. Like face perception, facing-dyad perception can reveal basic (visual) processes that lay the foundations for understanding others, their relationships and interactions.
Affiliation(s)
- Etienne Abassi
- Institut des Sciences Cognitives-Marc Jeannerod, UMR5229, Centre National de la Recherche Scientifique (CNRS) and Université Claude Bernard Lyon 1, 67 Bd. Pinel, 69675 Bron, France
- Liuba Papeo
- Institut des Sciences Cognitives-Marc Jeannerod, UMR5229, Centre National de la Recherche Scientifique (CNRS) and Université Claude Bernard Lyon 1, 67 Bd. Pinel, 69675 Bron, France
|