1
Shain C, Kean H, Casto C, Lipkin B, Affourtit J, Siegelman M, Mollica F, Fedorenko E. Distributed Sensitivity to Syntax and Semantics throughout the Language Network. J Cogn Neurosci 2024; 36:1427-1471. PMID: 38683732. DOI: 10.1162/jocn_a_02164.
Abstract
Human language is expressive because it is compositional: The meaning of a sentence (semantics) can be inferred from its structure (syntax). It is commonly believed that language syntax and semantics are processed by distinct brain regions. Here, we revisit this claim using precision fMRI methods to capture separation or overlap of function in the brains of individual participants. Contrary to prior claims, we find distributed sensitivity to both syntax and semantics throughout a broad frontotemporal brain network. Our results join a growing body of evidence for an integrated network for language in the human brain within which internal specialization is primarily a matter of degree rather than kind, in contrast with influential proposals that advocate distinct specialization of different brain areas for different types of linguistic functions.
Affiliation(s)
- Hope Kean
- Massachusetts Institute of Technology
2
Yu S, Gu C, Huang K, Li P. Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension. Sci Adv 2024; 10:eadn7744. PMID: 38781343. PMCID: PMC11114233. DOI: 10.1126/sciadv.adn7744.
Abstract
Current large language models (LLMs) rely on word prediction as their backbone pretraining task. Although word prediction is an important mechanism underlying language processing, human language comprehension occurs at multiple levels, involving the integration of words and sentences to achieve a full understanding of discourse. This study models language comprehension by using the next sentence prediction (NSP) task to investigate mechanisms of discourse-level comprehension. We show that NSP pretraining enhanced a model's alignment with brain data especially in the right hemisphere and in the multiple demand network, highlighting the contributions of nonclassical language regions to high-level language understanding. Our results also suggest that NSP can enable the model to better capture human comprehension performance and to better encode contextual information. Our study demonstrates that the inclusion of diverse learning objectives in a model leads to more human-like representations, and investigating the neurocognitive plausibility of pretraining tasks in LLMs can shed light on outstanding questions in language neuroscience.
Affiliation(s)
- Shaoyun Yu
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Chanyuan Gu
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Kexin Huang
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Ping Li
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Centre for Immersive Learning and Metaverse in Education, The Hong Kong Polytechnic University, Hong Kong SAR, China
3
Fernandino L, Binder JR. How does the "default mode" network contribute to semantic cognition? Brain Lang 2024; 252:105405. PMID: 38579461. PMCID: PMC11135161. DOI: 10.1016/j.bandl.2024.105405.
Abstract
This review examines whether and how the "default mode" network (DMN) contributes to semantic processing. We review evidence implicating the DMN in the processing of individual word meanings and in sentence- and discourse-level semantics. Next, we argue that the areas comprising the DMN contribute to semantic processing by coordinating and integrating the simultaneous activity of local neuronal ensembles across multiple unimodal and multimodal cortical regions, creating a transient, global neuronal ensemble. The resulting ensemble implements an integrated simulation of phenomenological experience - that is, an embodied situation model - constructed from various modalities of experiential memory traces. These situation models, we argue, are necessary not only for semantic processing but also for aspects of cognition that are not traditionally considered semantic. Although many aspects of this proposal remain provisional, we believe it provides new insights into the relationships between semantic and non-semantic cognition and into the functions of the DMN.
Affiliation(s)
- Leonardo Fernandino
- Department of Neurology, Medical College of Wisconsin, USA; Department of Biomedical Engineering, Medical College of Wisconsin, USA.
- Jeffrey R Binder
- Department of Neurology, Medical College of Wisconsin, USA; Department of Biophysics, Medical College of Wisconsin, USA
4
Hinzen W, Palaniyappan L. The 'L-factor': Language as a transdiagnostic dimension in psychopathology. Prog Neuropsychopharmacol Biol Psychiatry 2024; 131:110952. PMID: 38280712. DOI: 10.1016/j.pnpbp.2024.110952.
Abstract
Thoughts and moods constituting our mental life incessantly change. When the steady flow of this dynamics diverges in clinical directions, the possible pathways involved are captured through discrete diagnostic labels. Yet a single vulnerable neurocognitive system may be causally involved in psychopathological deviations transdiagnostically. We argue that language viewed as integrating cortical functions is the best current candidate, whose forms of breakdown along its different dimensions are then manifest as symptoms - from prosodic abnormalities and rumination in depression to distortions of speech perception in verbal hallucinations, distortions of meaning and content in delusions, or disorganized speech in formal thought disorder. Spontaneous connected speech provides continuous objective readouts generating a highly accessible bio-behavioral marker with the potential of revolutionizing neuropsychological measurement. This argument turns language into a transdiagnostic 'L-factor' providing an analytical and mechanistic substrate for previously proposed latent general factors of psychopathology ('p-factor') and cognitive functioning ('c-factor'). Together with immense practical opportunities afforded by rapidly advancing natural language processing (NLP) technologies and abundantly available data, this suggests a new era of translational clinical psychiatry, in which both psychopathology and language may be rethought together.
Affiliation(s)
- Wolfram Hinzen
- Department of Translation & Language Sciences, Universitat Pompeu Fabra, Barcelona, Spain; Institut Català de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
- Lena Palaniyappan
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal H4H1R3, Quebec, Canada; Robarts Research Institute & Lawson Health Research Institute, London, ON, Canada
5
Antonello R, Huth A. Predictive Coding or Just Feature Discovery? An Alternative Account of Why Language Models Fit Brain Data. Neurobiol Lang 2024; 5:64-79. PMID: 38645616. PMCID: PMC11025645. DOI: 10.1162/nol_a_00087.
Abstract
Many recent studies have shown that representations drawn from neural network language models are extremely effective at predicting brain responses to natural language. But why do these models work so well? One proposed explanation is that language models and brains are similar because they have the same objective: to predict upcoming words before they are perceived. This explanation is attractive because it lends support to the popular theory of predictive coding. We provide several analyses that cast doubt on this claim. First, we show that the ability to predict future words does not uniquely (or even best) explain why some representations are a better match to the brain than others. Second, we show that within a language model, representations that are best at predicting future words are strictly worse brain models than other representations. Finally, we argue in favor of an alternative explanation for the success of language models in neuroscience: These models are effective at predicting brain responses because they generally capture a wide variety of linguistic phenomena.
Affiliation(s)
- Richard Antonello
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Alexander Huth
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
6
Jain S, Vo VA, Wehbe L, Huth AG. Computational Language Modeling and the Promise of In Silico Experimentation. Neurobiol Lang 2024; 5:80-106. PMID: 38645624. PMCID: PMC11025654. DOI: 10.1162/nol_a_00101.
Abstract
Language neuroscience currently relies on two major experimental paradigms: controlled experiments using carefully hand-designed stimuli, and natural stimulus experiments. These approaches have complementary advantages which allow them to address distinct aspects of the neurobiology of language, but each approach also comes with drawbacks. Here we discuss a third paradigm, in silico experimentation using deep learning-based encoding models, that has been enabled by recent advances in cognitive computational neuroscience. This paradigm promises to combine the interpretability of controlled experiments with the generalizability and broad scope of natural stimulus experiments. We show four examples of simulating language neuroscience experiments in silico and then discuss both the advantages and caveats of this approach.
Affiliation(s)
- Shailee Jain
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Vy A. Vo
- Brain-Inspired Computing Lab, Intel Labs, Hillsboro, OR, USA
- Leila Wehbe
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Alexander G. Huth
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Department of Neuroscience, University of Texas at Austin, Austin, TX, USA
7
Fairhall SL. Sentence-level embeddings reveal dissociable word- and sentence-level cortical representation across coarse- and fine-grained levels of meaning. Brain Lang 2024; 250:105389. PMID: 38306958. DOI: 10.1016/j.bandl.2024.105389.
Abstract
In this large-sample (N = 64) fMRI study, sentence embeddings (text-embedding-ada-002, OpenAI) and representational similarity analysis were used to contrast sentence-level and word-level semantic representation. Overall, sentence-level information resulted in a 20-25% increase in the model's ability to capture neural representation when compared to word-level only information (word-order-scrambled embeddings). This increase was relatively undifferentiated across the cortex. However, when coarse-grained (across thematic category) and fine-grained (within thematic category) combinatorial meaning were separately assessed, word- and sentence-level representations were seen to strongly dissociate across the cortex and to do so differently as a function of grain. Coarse-grained sentence-level representations were evident in occipitotemporal, ventral temporal and medial prefrontal cortex, while fine-grained differences were seen in lateral prefrontal and parietal cortex, middle temporal gyrus, the precuneus, and medial prefrontal cortex. This result indicates that dissociable cortical substrates underlie single-concept versus combinatorial meaning and that different cortical regions specialise for fine- and coarse-grained meaning.
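The representational-similarity logic summarized in this abstract can be sketched in a few lines. This is a toy illustration only: random vectors stand in for the ada-002 sentence embeddings and for fMRI voxel patterns, and the function names (`rdm`, `rsa_score`) are illustrative rather than taken from the paper:

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of condition patterns (rows)."""
    return 1.0 - np.corrcoef(patterns)

def rsa_score(model_patterns, brain_patterns):
    """Correlate the upper triangles of the two RDMs (plain Pearson
    correlation here, for simplicity)."""
    m, b = rdm(model_patterns), rdm(brain_patterns)
    iu = np.triu_indices_from(m, k=1)
    return np.corrcoef(m[iu], b[iu])[0, 1]

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((20, 64))   # 20 sentences x embedding dim
mixing = rng.standard_normal((64, 500))      # hypothetical embedding-to-voxel map
voxels = embeddings @ mixing + 0.1 * rng.standard_normal((20, 500))

matched = rsa_score(embeddings, voxels)                            # patterns driven by the embeddings
unmatched = rsa_score(embeddings, rng.standard_normal((20, 500)))  # unrelated patterns
```

Scrambling word order before embedding, as in the study, would simply swap in a different `embeddings` matrix while keeping the same scoring pipeline.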
Affiliation(s)
- Scott L Fairhall
- Center for Mind/Brain Sciences (CIMeC), University of Trento, Italy.
8
He R, Palominos C, Zhang H, Alonso-Sánchez MF, Palaniyappan L, Hinzen W. Navigating the semantic space: Unraveling the structure of meaning in psychosis using different computational language models. Psychiatry Res 2024; 333:115752. PMID: 38280291. DOI: 10.1016/j.psychres.2024.115752.
Abstract
Speech in psychosis has long been described as involving a 'loosening of associations'. To elucidate the underlying cognitive mechanisms, we analysed picture descriptions from 94 subjects (29 healthy controls, 18 participants at clinical high risk, 29 with first-episode psychosis, and 18 with chronic schizophrenia), using five language models with different computational architectures: FastText, which represents meaning non-contextually/statically; BERT, which represents contextual meaning sensitive to grammar and context; Infersent and SBERT, which provide sentential representations; and CLIP, which evaluates speech relative to a visual stimulus. These models were used to quantify semantic distances crossed between successive tokens/sentences, and semantic perplexity indicating unexpectedness in continuations. Results showed that, among patients, semantic similarity increased when measured with FastText, Infersent, and SBERT, while it decreased with CLIP and BERT. Higher perplexity was observed in first-episode psychosis. Static semantic measures were associated with clinically measured impoverishment of thought, and referential semantic measures with disorganization. These patterns indicate a shrinking conceptual semantic space as represented by static language models, which co-occurs with a widening of the referential semantic space as represented by contextual models. This duality underlines the need to separate these two forms of meaning to understand the mechanisms of semantic change in psychosis.
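The successive-distance measure described in this abstract can be sketched as follows. Synthetic random-walk vectors stand in for FastText/SBERT embeddings of a transcript, and the function name is hypothetical:

```python
import numpy as np

def successive_distances(embeddings):
    """Cosine distance between each pair of successive embeddings
    (tokens or sentences): distance = 1 - cosine similarity."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = np.sum(e[:-1] * e[1:], axis=1)  # cosine similarity of consecutive pairs
    return 1.0 - sims

rng = np.random.default_rng(1)
# A slow drift through semantic space (small steps around a shared direction)
coherent = np.cumsum(rng.standard_normal((30, 50)) * 0.1, axis=0) + 1.0
# Jumps between unrelated meanings (independent random vectors)
scattered = rng.standard_normal((30, 50))

# Coherent speech crosses smaller successive distances than scattered speech.
print(successive_distances(coherent).mean() < successive_distances(scattered).mean())
```

Averaging these per-transition distances over a transcript yields a single similarity score per speaker, which is the kind of summary measure the group comparisons above rely on.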
Affiliation(s)
- Rui He
- Department of Translation & Language Sciences, Universitat Pompeu Fabra, Carrer Roc Boronat, 138, Barcelona, 08018, Spain.
- Claudio Palominos
- Department of Translation & Language Sciences, Universitat Pompeu Fabra, Carrer Roc Boronat, 138, Barcelona, 08018, Spain
- Han Zhang
- Department of Translation & Language Sciences, Universitat Pompeu Fabra, Carrer Roc Boronat, 138, Barcelona, 08018, Spain
- Lena Palaniyappan
- Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, Quebec, Canada; Department of Medical Biophysics, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Robarts Research Institute, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Wolfram Hinzen
- Department of Translation & Language Sciences, Universitat Pompeu Fabra, Carrer Roc Boronat, 138, Barcelona, 08018, Spain; Institut Català de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
9
Skrill D, Norman-Haignere SV. Large language models transition from integrating across position-yoked, exponential windows to structure-yoked, power-law windows. Adv Neural Inf Process Syst 2023; 36:638-654. PMID: 38434255. PMCID: PMC10907028.
Abstract
Modern language models excel at integrating across long temporal scales needed to encode linguistic meaning and show non-trivial similarities to biological neural systems. Prior work suggests that human brain responses to language exhibit hierarchically organized "integration windows" that substantially constrain the overall influence of an input token (e.g., a word) on the neural response. However, little prior work has attempted to use integration windows to characterize computations in large language models (LLMs). We developed a simple word-swap procedure for estimating integration windows from black-box language models that does not depend on access to gradients or knowledge of the model architecture (e.g., attention weights). Using this method, we show that trained LLMs exhibit stereotyped integration windows that are well-fit by a convex combination of an exponential and a power-law function, with a partial transition from exponential to power-law dynamics across network layers. We then introduce a metric for quantifying the extent to which these integration windows vary with structural boundaries (e.g., sentence boundaries), and using this metric, we show that integration windows become increasingly yoked to structure at later network layers. None of these findings were observed in an untrained model, which as expected integrated uniformly across its input. These results suggest that LLMs learn to integrate information in natural language using a stereotyped pattern: integrating across position-yoked, exponential windows at early layers, followed by structure-yoked, power-law windows at later layers. The methods we describe in this paper provide a general-purpose toolkit for understanding temporal integration in language models, facilitating cross-disciplinary research at the intersection of biological and artificial intelligence.
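The black-box word-swap procedure can be illustrated with a toy model whose true integration window is known: here, an exponentially weighted sum of token embeddings. Everything below is a sketch under that assumption, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(2)
VOCAB, DIM = 100, 16
emb = rng.standard_normal((VOCAB, DIM))

def toy_model(tokens, tau=3.0):
    """Stand-in for a black-box LM: the 'response' at the last position is
    an exponentially weighted sum of token embeddings (window scale tau)."""
    w = np.exp(-np.arange(len(tokens))[::-1] / tau)
    return w @ emb[tokens]

def estimate_influence(model, seq_len=20, n_trials=200):
    """Word-swap procedure: swap the token at each distance d from the
    final position and record the average change in the model response.
    Requires no gradients or knowledge of the architecture."""
    influence = np.zeros(seq_len)
    for _ in range(n_trials):
        tokens = rng.integers(0, VOCAB, seq_len)
        base = model(tokens)
        for d in range(seq_len):
            swapped = tokens.copy()
            swapped[seq_len - 1 - d] = rng.integers(0, VOCAB)
            influence[d] += np.linalg.norm(model(swapped) - base)
    return influence / n_trials

infl = estimate_influence(toy_model)
# The estimated influence decays with distance, recovering the model's window.
print(infl[0] > infl[5] > infl[15])
```

Fitting exponential versus power-law curves to `infl`, layer by layer, is then the kind of analysis the abstract describes.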
Affiliation(s)
- David Skrill
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY 14642
- Sam V Norman-Haignere
- Departments of Biostatistics and Computational Biology, Neuroscience, University of Rochester Medical Center, Rochester, NY 14642
- Departments of Brain and Cognitive Sciences, Biomedical Engineering, University of Rochester, Rochester, NY 14642
10
Bruera A, Tao Y, Anderson A, Çokal D, Haber J, Poesio M. Modeling Brain Representations of Words' Concreteness in Context Using GPT-2 and Human Ratings. Cogn Sci 2023; 47:e13388. PMID: 38103208. DOI: 10.1111/cogs.13388.
Abstract
The meaning of most words in language depends on their context. Understanding how the human brain extracts contextualized meaning, and identifying where in the brain this takes place, remain important scientific challenges. But technological and computational advances in neuroscience and artificial intelligence now provide unprecedented opportunities to study the human brain in action as language is read and understood. Recent contextualized language models seem to be able to capture homonymic meaning variation ("bat", in a baseball vs. a vampire context), as well as more nuanced differences of meaning, for example, polysemous words such as "book", which can be interpreted in distinct but related senses ("explain a book", information, vs. "open a book", object) whose differences are fine-grained. We study these subtle differences in lexical meaning along the concrete/abstract dimension, as they are triggered by verb-noun semantic composition. We analyze functional magnetic resonance imaging (fMRI) activations elicited by Italian verb phrases containing nouns whose interpretation is affected by the verb to different degrees. By using a contextualized language model and human concreteness ratings, we shed light on where in the brain such fine-grained meaning variation takes place and how it is coded. Our results show that phrase concreteness judgments and the contextualized model can predict BOLD activation associated with semantic composition within the language network. Importantly, representations derived from a complex, nonlinear composition process consistently outperform simpler composition approaches. This is compatible with a holistic view of semantic composition in the brain, where semantic representations are modified by the process of composition itself.
When looking at individual brain areas, we find that encoding performance is statistically significant, although with differing patterns of results, suggesting differential involvement, in the posterior superior temporal sulcus, inferior frontal gyrus and anterior temporal lobe, and in motor areas previously associated with processing of concreteness/abstractness.
Affiliation(s)
- Andrea Bruera
- School of Electronic Engineering and Computer Science, Cognitive Science Research Group, Queen Mary University of London
- Lise Meitner Research Group Cognition and Plasticity, Max Planck Institute for Human Cognitive and Brain Sciences
- Yuan Tao
- Department of Cognitive Science, Johns Hopkins University
- Derya Çokal
- Department of German Language and Literature I-Linguistics, University of Cologne
- Janosch Haber
- School of Electronic Engineering and Computer Science, Cognitive Science Research Group, Queen Mary University of London
- Chattermill, London
- Massimo Poesio
- School of Electronic Engineering and Computer Science, Cognitive Science Research Group, Queen Mary University of London
- Department of Information and Computing Sciences, University of Utrecht
11
Möhring L, Gläscher J. Prediction errors drive dynamic changes in neural patterns that guide behavior. Cell Rep 2023; 42:112931. PMID: 37540597. DOI: 10.1016/j.celrep.2023.112931.
Abstract
Learning describes the process by which our internal expectation models of the world are updated by surprising outcomes (prediction errors [PEs]) to improve predictions of future events. However, the mechanisms through which error signals dynamically influence existing neural representations are unknown. Here, we use functional magnetic resonance imaging (fMRI) in humans solving a two-step Markov decision task to investigate changes in neural activation patterns following PEs. Using a dynamic multivariate pattern analysis, we can show that PE-related fMRI responses in error-coding regions predict trial-by-trial changes in multivariate neural patterns in the orbitofrontal cortex, the precuneus, and the ventromedial prefrontal cortex (vmPFC). Importantly, the dynamics of these pattern changes in the vmPFC also predicted upcoming changes in choice strategies and thus highlight the importance of these pattern changes for behavior.
Affiliation(s)
- Leon Möhring
- Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany.
- Jan Gläscher
- Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany.
12
Murphy E. ROSE: A Neurocomputational Architecture for Syntax. arXiv 2023; arXiv:2303.08877v1. PMID: 36994166. PMCID: PMC10055479.
Abstract
A comprehensive model of natural language processing in the brain must accommodate four components: representations, operations, structures and encoding. It further requires a principled account of how these different components mechanistically, and causally, relate to each other. While previous models have isolated regions of interest for structure-building and lexical access, and have utilized specific neural recording measures to expose possible signatures of syntax, many gaps remain with respect to bridging distinct scales of analysis that map onto these four components. By expanding existing accounts of how neural oscillations can index various linguistic processes, this article proposes a neurocomputational architecture for syntax, termed the ROSE model (Representation, Operation, Structure, Encoding). Under ROSE, the basic data structures of syntax are atomic features, types of mental representations (R), and are coded at the single-unit and ensemble level. Elementary computations (O) that transform these units into manipulable objects accessible to subsequent structure-building levels are coded via high frequency broadband γ activity. Low frequency synchronization and cross-frequency coupling code for recursive categorial inferences (S). Distinct forms of low frequency coupling and phase-amplitude coupling (δ-θ coupling via pSTS-IFG; θ-γ coupling via IFG to conceptual hubs in lateral and ventral temporal cortex) then encode these structures onto distinct workspaces (E). Causally connecting R to O is spike-phase/LFP coupling; connecting O to S is phase-amplitude coupling; connecting S to E is a system of frontotemporal traveling oscillations; connecting E back to lower levels is low-frequency phase resetting of spike-LFP coupling. This compositional neural code has important implications for algorithmic accounts, since it makes concrete predictions for the appropriate level of study for psycholinguistic parsing models.
ROSE is reliant on neurophysiologically plausible mechanisms, is supported at all four levels by a range of recent empirical research, and provides an anatomically precise and falsifiable grounding for the basic property of natural language syntax: hierarchical, recursive structure-building.
Affiliation(s)
- Elliot Murphy
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, UTHealth, Houston, TX, USA
- Texas Institute for Restorative Neurotechnologies, UTHealth, Houston, TX, USA
13
Angular gyrus: an anatomical case study for association cortex. Brain Struct Funct 2023; 228:131-143. PMID: 35906433. DOI: 10.1007/s00429-022-02537-3.
Abstract
The angular gyrus is associated with a spectrum of higher order cognitive functions. This mini-review undertakes a broad survey of putative neuroanatomical substrates, guided by the premise that area-specific specializations derive from a combination of extrinsic connections and intrinsic area properties. Three levels of spatial resolution are discussed: cellular, supracellular connectivity, and synaptic micro-scale, with examples necessarily drawn mainly from experimental work with nonhuman primates. A significant factor in the functional specialization of the human parietal cortex is its pronounced enlargement. In addition to "more" cells, synapses, and connections, however, the heterogeneity itself can be considered an important property. Multiple anatomical features support the idea of overlapping and temporally dynamic membership in several brain-wide subnetworks, but how these features operate in the context of higher cognitive functions remains a question for continued investigation.
14
Caucheteux C, Gramfort A, King JR. Deep language algorithms predict semantic comprehension from brain activity. Sci Rep 2022; 12:16327. PMID: 36175483. PMCID: PMC9522791. DOI: 10.1038/s41598-022-20460-9.
Abstract
Deep language algorithms, like GPT-2, have demonstrated remarkable abilities to process text, and now constitute the backbone of automatic translation, summarization and dialogue. However, whether these models encode information that relates to human comprehension still remains controversial. Here, we show that the representations of GPT-2 not only map onto the brain responses to spoken stories, but they also predict the extent to which subjects understand the corresponding narratives. To this end, we analyze 101 subjects recorded with functional Magnetic Resonance Imaging while listening to 70 min of short stories. We then fit a linear mapping model to predict brain activity from GPT-2's activations. Finally, we show that this mapping reliably correlates ([Formula: see text]) with subjects' comprehension scores as assessed for each story. This effect peaks in the angular, medial temporal and supra-marginal gyri, and is best accounted for by the long-distance dependencies generated in the deep layers of GPT-2. Overall, this study shows how deep language models help clarify the brain computations underlying language comprehension.
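The core linear mapping step can be sketched with synthetic data standing in for GPT-2 activations and fMRI responses; the paper's actual pipeline (hemodynamic delays, cross-validated ridge penalties, per-story comprehension scores) is richer than this minimal illustration:

```python
import numpy as np

def fit_ridge(X, Y, alpha=10.0):
    """Closed-form ridge regression mapping model activations X
    (time x features) to voxel responses Y (time x voxels)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def brain_score(W, X_test, Y_test):
    """Per-voxel Pearson correlation between predicted and measured
    activity on held-out data, averaged over voxels."""
    pred = X_test @ W
    pc = pred - pred.mean(0)
    yc = Y_test - Y_test.mean(0)
    r = (pc * yc).sum(0) / (np.linalg.norm(pc, axis=0) * np.linalg.norm(yc, axis=0))
    return r.mean()

rng = np.random.default_rng(3)
n_tr, n_feat, n_vox = 300, 40, 100
X = rng.standard_normal((n_tr, n_feat))            # stand-in for layer activations per TR
W_true = rng.standard_normal((n_feat, n_vox))
Y = X @ W_true + 2.0 * rng.standard_normal((n_tr, n_vox))  # noisy voxel responses

W = fit_ridge(X[:200], Y[:200])                    # fit on one split...
score = brain_score(W, X[200:], Y[200:])           # ...evaluate on the rest
```

Computing `score` per subject and correlating it with comprehension ratings is then the analysis the abstract reports.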
Affiliation(s)
- Charlotte Caucheteux
- Meta AI Research, Paris, France.
- Université Paris-Saclay, Inria, CEA, Palaiseau, France.
- Jean-Rémi King
- Meta AI Research, Paris, France
- École normale supérieure, PSL University, CNRS, Paris, France
15
Semantic Analysis Technology of English Translation Based on Deep Neural Network. Comput Intell Neurosci 2022; 2022:1176943. PMID: 35860648. PMCID: PMC9293510. DOI: 10.1155/2022/1176943.
Abstract
English translation plays an important role in the development of science and technology and in cultural exchange. As translation volume grows, machine translation becomes inevitable, yet semantic translation remains without an effective solution. To provide an effective improvement scheme, this paper studies the application of deep neural networks to semantic analysis in English translation. After a brief review of research on translation analysis and the current state of neural networks, a neural machine translation architecture is established and a deep neural network model for English translation analysis is proposed. To address the vanishing-gradient problem of RNN models, a GRU network is used, strengthening the handling of long-distance translation while reducing computational complexity. A bidirectional GRU model is also designed to translate according to context, and for some nonlinear translations, a deep neural network model based on part-of-speech sequence information is proposed to realize semantic analysis; experiments are designed to test the translation performance of the model. The simulation results show that English translation based on deep neural networks can improve translation quality, reduce errors, and increase the accuracy of semantic analysis, which offers a useful reference for improving the level of English translation.
Collapse
|
16
|
Zou H, Xiang K. Sentiment Classification Method Based on Blending of Emoticons and Short Texts. ENTROPY 2022; 24:e24030398. [PMID: 35327909 PMCID: PMC8965825 DOI: 10.3390/e24030398] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/05/2022] [Revised: 03/09/2022] [Accepted: 03/11/2022] [Indexed: 12/04/2022]
Abstract
With the development of Internet technology, short texts have gradually become the main medium through which people obtain information and communicate. By virtue of their brevity, short texts lower the threshold for producing and reading information, in line with the trend toward fragmented reading in today's fast-paced life. In addition, short texts contain emoticons that make communication more expressive. However, short texts carry relatively little information, which hinders the analysis of sentiment characteristics. This paper therefore proposes a sentiment classification method based on the blending of emoticons and short-text content. Emoticons and short-text content are transformed into vectors, and the corresponding word vectors and emoticon vectors are concatenated in turn into a sentence matrix. The sentence matrix is then input into a convolutional neural network classification model. The results indicate that, compared with existing methods, the proposed method improves classification accuracy.
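The blending step described above amounts to stacking word vectors and emoticon vectors into one matrix and running text-CNN feature extraction over it. The sketch below illustrates that pipeline with toy dimensions; all sizes and filter counts are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def build_sentence_matrix(word_vecs, emoticon_vecs):
    """Stack word vectors followed by emoticon vectors row-wise into a
    sentence matrix (rows = tokens, columns = embedding dimensions)."""
    return np.vstack(word_vecs + emoticon_vecs)

def conv_maxpool(matrix, filt):
    """1-D convolution over token windows followed by max-over-time pooling,
    the core feature extractor of a text CNN."""
    h = filt.shape[0]                             # window size in tokens
    feats = [np.sum(matrix[i:i + h] * filt)       # one activation per window
             for i in range(matrix.shape[0] - h + 1)]
    return max(feats)

rng = np.random.default_rng(0)
words = [rng.normal(size=4) for _ in range(3)]    # 3 word vectors, dim 4
emojis = [rng.normal(size=4)]                     # 1 emoticon vector, dim 4
sent = build_sentence_matrix(words, emojis)       # shape (4, 4)

filters = [rng.normal(size=(2, 4)) for _ in range(6)]   # 6 filters, window 2
features = np.array([conv_maxpool(sent, f) for f in filters])
```

The pooled `features` vector would then feed a small classifier head; in the actual method the vectors come from trained embeddings rather than random draws.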
Collapse
Affiliation(s)
- Haochen Zou
- Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3G 1M8, Canada
- Correspondence:
| | - Kun Xiang
- Department of Science and Engineering, Hosei University, Koganei 184-8584, Tokyo, Japan;
| |
Collapse
|
17
|
Bruera A, Poesio M. Exploring the Representations of Individual Entities in the Brain Combining EEG and Distributional Semantics. Front Artif Intell 2022; 5:796793. [PMID: 35280237 PMCID: PMC8905499 DOI: 10.3389/frai.2022.796793] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2021] [Accepted: 01/25/2022] [Indexed: 11/23/2022] Open
Abstract
Semantic knowledge about individual entities (i.e., the referents of proper names such as Jacinda Ardern) is fine-grained, episodic, and strongly social in nature compared with knowledge about generic entities (the referents of common nouns such as politician). We investigate the semantic representations of individual entities in the brain, and for the first time we approach this question using both neural data, in the form of newly acquired EEG data, and distributional models of word meaning, employing the latter to isolate semantic information regarding individual entities in the brain. We ran two sets of analyses. The first is concerned only with the evoked responses to individual entities and their categories. We find that it is possible to classify them according to both their coarse- and their fine-grained category at appropriate time points, but that it is hard to map representational information learned from individuals onto their categories. In the second set of analyses, we learn to decode distributional word vectors from the evoked responses. These results indicate that such a mapping can be learned successfully: this counts not only as a demonstration that representations of individuals can be discriminated in EEG responses, but also as a first brain-based validation of distributional semantic models as representations of individual entities. Finally, in-depth analyses of decoder performance provide additional evidence that the referents of proper names and categories have little in common when it comes to their representation in the brain.
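The decoding analysis in the second set of analyses is, in essence, a regularized linear map from evoked responses to distributional word vectors, evaluated by matching predictions back to candidate vectors. The sketch below shows that idea on synthetic data using closed-form ridge regression; the dimensions, regularization strength, and the nearest-neighbour evaluation are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X'X + lam*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def nearest_word(pred, vocab_vecs):
    """Decode by cosine similarity against all candidate word vectors."""
    sims = vocab_vecs @ pred / (
        np.linalg.norm(vocab_vecs, axis=1) * np.linalg.norm(pred) + 1e-12)
    return int(np.argmax(sims))

rng = np.random.default_rng(42)
n_words, eeg_dim, sem_dim = 50, 20, 10
W_true = rng.normal(size=(eeg_dim, sem_dim))            # hidden linear mapping
eeg = rng.normal(size=(n_words, eeg_dim))               # simulated EEG features
vectors = eeg @ W_true + 0.01 * rng.normal(size=(n_words, sem_dim))

W = ridge_fit(eeg, vectors, lam=0.1)                    # learn EEG -> semantics
pred = eeg[0] @ W                                       # predicted word vector
decoded = nearest_word(pred, vectors)                   # recovers word index 0
```

In a real analysis the fit would be cross-validated across held-out entities rather than evaluated on the training items as done here for brevity.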
Collapse
Affiliation(s)
- Andrea Bruera
- Cognitive Science Research Group, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
| | | |
Collapse
|
18
|
Kaiser D, Jacobs AM, Cichy RM. Modelling brain representations of abstract concepts. PLoS Comput Biol 2022; 18:e1009837. [PMID: 35120139 PMCID: PMC8849470 DOI: 10.1371/journal.pcbi.1009837] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2021] [Revised: 02/16/2022] [Accepted: 01/14/2022] [Indexed: 11/18/2022] Open
Abstract
Abstract conceptual representations are critical for human cognition. Despite their importance, key properties of these representations remain poorly understood. Here, we used computational models of distributional semantics to predict multivariate fMRI activity patterns during the activation and contextualization of abstract concepts. We devised a task in which participants had to embed abstract nouns into a story that they developed around a given background context. We found that representations in inferior parietal cortex were predicted by concept similarities emerging in models of distributional semantics. By constructing different model families, we reveal the models' learning trajectories and delineate how abstract and concrete training materials contribute to the formation of brain-like representations. These results inform theories about the format and emergence of abstract conceptual representations in the human brain.
How do we conceive abstract concepts, like love, peace, or truth? In this study, we investigate how our brains support the activation and contextualization of such abstract concepts. We asked participants to embed abstract nouns into a coherent story while we recorded functional MRI. Using multivariate analysis techniques, we computed how similarly different abstract concepts were represented during this task. We then modelled these neural similarities among concepts with computational models of distributional semantics, which capture the words' co-occurrence statistics in large natural language corpora. Our results reveal a correspondence between the computational models and brain representations in the inferior parietal cortex. This correspondence held even when the computational models were trained only on subsets of the corpora containing as few as 100,000 sentences and only abstract or concrete words.
Our findings establish a neural correlate of abstract concept representation in the inferior parietal cortex, and they provide a first characterization of the format of these representations.
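Comparing "neural similarities among concepts" with model-derived similarities is the standard representational similarity analysis (RSA) recipe: build a dissimilarity matrix for each representation and correlate their off-diagonal entries. The sketch below runs that recipe on synthetic data; the dimensions, the Pearson-distance RDM, and the Spearman comparison are generic RSA conventions assumed for illustration, not the paper's exact pipeline.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of row patterns (concepts)."""
    return 1.0 - np.corrcoef(patterns)

def upper_tri(mat):
    """Off-diagonal upper triangle, the usual input to RDM comparison."""
    i, j = np.triu_indices(mat.shape[0], k=1)
    return mat[i, j]

def spearman(a, b):
    """Spearman correlation via Pearson correlation of ranks (no ties assumed)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(7)
n_concepts = 12
model_vecs = rng.normal(size=(n_concepts, 50))    # distributional embeddings
# Simulated fMRI patterns that partially mirror the model geometry.
brain = model_vecs @ rng.normal(size=(50, 80)) + rng.normal(size=(n_concepts, 80))

fit = spearman(upper_tri(rdm(model_vecs)), upper_tri(rdm(brain)))
```

A positive `fit` indicates that concepts similar in the embedding space evoke similar activity patterns, which is the sense in which the models "predict" brain representations here.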
Collapse
Affiliation(s)
- Daniel Kaiser
- Mathematical Institute, Department of Mathematics and Computer Science, Physics, Geography, Justus-Liebig-Universität Gießen, Gießen, Germany
- Center for Mind, Brain and Behavior (CMBB), Philipps-Universität Marburg and Justus-Liebig-Universität Gießen, Marburg, Germany
- Department of Psychology, University of York, York, United Kingdom
- * E-mail:
| | - Arthur M. Jacobs
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Center for Cognitive Neuroscience Berlin, Freie Universität Berlin, Berlin, Germany
| | - Radoslaw M. Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
| |
Collapse
|