1. Chang M, Pu Z, Wang J. Oral reading promotes predictive processing in Chinese sentence reading: eye movement evidence. PeerJ 2024; 12:e18307. PMID: 39430560; PMCID: PMC11491060; DOI: 10.7717/peerj.18307.

Abstract
Background: Fluent sentence reading is widely acknowledged to depend on top-down contextual prediction, wherein sentential and contextual cues guide the pre-activation of linguistic representations before the corresponding stimuli are encountered, facilitating subsequent comprehension. The Prediction-by-Production hypothesis offers an explanation for predictive processes in language comprehension, suggesting that prediction during comprehension involves processes associated with language production. However, there is a lack of eye-movement evidence supporting this hypothesis in sentence reading contexts. Thus, we manipulated reading mode and word predictability to examine the influence of language production on predictive processing. Methods: Participants read sentences containing either high- or low-predictability target words, either silently or aloud. Eye movements were recorded with an EyeLink 1000 eye tracker. Results: Skipping rates were higher and fixation times shorter for high-predictability words than for low-predictability words, and for silent reading than for oral reading. Notably, interactions were observed in the first-pass time measures (first fixation duration, single fixation duration, and gaze duration), indicating that word predictability effects were more pronounced during oral reading than during silent reading. Discussion: This pattern of results suggests that activation of the production system enhances predictive processing during early lexical access, providing empirical support for the Prediction-by-Production hypothesis in eye-movement studies of sentence reading and extending the current understanding of the timing and nature of predictions in reading comprehension.
Affiliation(s)
- Min Chang: School of Education Science, Nantong University, Nantong, China
- Zhenying Pu: School of Education Science, Nantong University, Nantong, China
- Jingxin Wang: Key Research Base of Humanities and Social Sciences of the Ministry of Education, Academy of Psychology and Behavior, Tianjin Normal University, Tianjin, China; Faculty of Psychology, Tianjin Normal University, Tianjin, China
2. Song M, Wang J, Cai Q. The unique contribution of uncertainty reduction during naturalistic language comprehension. Cortex 2024; 181:12-25. PMID: 39447486; DOI: 10.1016/j.cortex.2024.09.007.

Abstract
Language comprehension is an incremental process with prediction. Delineating the various mental states during such a process is critical to understanding the relationship between human cognition and the properties of language. Entropy reduction, which indexes the dynamic decrease of uncertainty as language input unfolds, has been recognized as effective in predicting neural responses during comprehension. According to the entropy reduction hypothesis (Hale, 2006), entropy reduction is related to the processing difficulty of a word, the effect of which may overlap with other well-documented information-theoretical metrics such as surprisal or next-word entropy. However, processing difficulty has often been conflated with the information conveyed by a word, and the two have rarely been dissociated neurally. We propose that entropy reduction represents the cognitive neural process of information gain, which can be dissociated from processing difficulty. This study characterized various information-theoretical metrics using GPT-2 and identified the unique effects of entropy reduction in predicting fMRI time series acquired during language comprehension. Beyond the effects of surprisal and entropy, entropy reduction was associated with activations in the left inferior frontal gyrus, bilateral ventromedial prefrontal cortex, insula, thalamus, basal ganglia, and middle cingulate cortex. The reduction of uncertainty, rather than its fluctuation, proved to be an effective factor in modeling neural responses. The neural substrates underlying the reduction of uncertainty may reflect the brain's desire for information, regardless of processing difficulty.
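The three information-theoretic quantities compared in this entry can be sketched concretely. A minimal example, using made-up next-word distributions rather than actual GPT-2 outputs:

```python
import math

def surprisal(p):
    """Surprisal of an outcome with probability p, in bits."""
    return -math.log2(p)

def entropy(dist):
    """Shannon entropy (bits) of a next-word probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical next-word distributions before and after a word is read;
# real values would come from a language model such as GPT-2.
before = {"dog": 0.5, "cat": 0.3, "fish": 0.2}   # more uncertainty
after = {"dog": 0.9, "cat": 0.05, "fish": 0.05}  # the input narrowed the options

next_word_entropy = entropy(before)
# Entropy reduction (Hale, 2006): only decreases in uncertainty count as information gain.
entropy_reduction = max(entropy(before) - entropy(after), 0.0)
```

The point of the paper is that this third quantity makes a contribution to predicting neural responses above and beyond the first two.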
Affiliation(s)
- Ming Song: Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China; Shanghai Changning Mental Health Center, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- Jing Wang: Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China; Shanghai Changning Mental Health Center, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- Qing Cai: Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), Affiliated Mental Health Center (ECNU), Institute of Brain and Education Innovation, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China; Shanghai Changning Mental Health Center, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
3. Fedorenko E, Ivanova AA, Regev TI. Reply to 'The language network is topographically diverse and driven by rapid syntactic inferences'. Nat Rev Neurosci 2024; 25:706. PMID: 39123047; DOI: 10.1038/s41583-024-00853-7.
Affiliation(s)
- Evelina Fedorenko: Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; The Program in Speech and Hearing in Bioscience and Technology, Harvard University, Cambridge, MA, USA
- Anna A Ivanova: School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA
- Tamar I Regev: Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
4. Cometa A, Battaglini C, Artoni F, Greco M, Frank R, Repetto C, Bottoni F, Cappa SF, Micera S, Ricciardi E, Moro A. Brain and grammar: revealing electrophysiological basic structures with competing statistical models. Cereb Cortex 2024; 34:bhae317. PMID: 39098819; DOI: 10.1093/cercor/bhae317.

Abstract
Acoustic, lexical, and syntactic information are processed simultaneously in the brain, requiring complex strategies to distinguish their electrophysiological activity. Building on previous work that factors out acoustic information, we concentrated on the lexical and syntactic contributions to language processing by testing competing statistical models. We exploited electroencephalographic recordings and compared different surprisal models that selectively involve lexical information, part of speech, or syntactic structures in various combinations. Electroencephalographic responses were recorded from 32 participants while they listened to affirmative active declarative sentences. We compared the activation corresponding to basic syntactic structures, such as noun phrases vs. verb phrases. Lexical and syntactic processing activate different frequency bands, partially different time windows, and different networks. Moreover, surprisal models based on the part-of-speech inventory alone do not explain the electrophysiological data well, while those including syntactic information do. By disentangling acoustic, lexical, and syntactic information, we demonstrated differential brain sensitivity to syntactic information. These results confirm and extend previous measures obtained with intracranial recordings, supporting our hypothesis that syntactic structures are crucial in neural language processing. This study provides a detailed understanding of how the brain processes syntactic information, highlighting the importance of syntactic surprisal in shaping neural responses during language comprehension.
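The contrast between lexical and part-of-speech surprisal models can be illustrated with a toy sketch; the corpus and the unsmoothed bigram estimator below are illustrative assumptions, not the models used in the study:

```python
import math
from collections import Counter

# Toy corpus of (word, part-of-speech) pairs; hypothetical data standing in
# for the real training material behind the competing surprisal models.
corpus = [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
          ("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"),
          ("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")]

def bigram_surprisal(seq):
    """Per-token surprisal (bits) under an unsmoothed bigram model fit to seq."""
    bigrams = Counter(zip(seq, seq[1:]))
    unigrams = Counter(seq[:-1])
    return [-math.log2(bigrams[(a, b)] / unigrams[a]) for a, b in zip(seq, seq[1:])]

words = [w for w, _ in corpus]
tags = [t for _, t in corpus]
word_surprisal = bigram_surprisal(words)  # lexical surprisal model
pos_surprisal = bigram_surprisal(tags)    # part-of-speech surprisal model
```

In this toy corpus the part-of-speech sequence is fully predictable, so the POS-only model assigns near-zero surprisal everywhere, while the lexical model still discriminates among words; richer predictors (here, syntactic structure in the paper) are needed to capture the variance a coarse POS inventory misses.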
Affiliation(s)
- Andrea Cometa: MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy; The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy; Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Chiara Battaglini: Neurolinguistics and Experimental Pragmatics (NEP) Lab, University School for Advanced Studies IUSS Pavia, Piazza della Vittoria 15, Pavia 27100, Italy
- Fiorenzo Artoni: Department of Clinical Neurosciences, Faculty of Medicine, University of Geneva, 1, rue Michel-Servet, Genève 1211, Switzerland
- Matteo Greco: Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Robert Frank: Department of Linguistics, Yale University, 370 Temple St, New Haven, CT 06511, United States
- Claudia Repetto: Department of Psychology, Università Cattolica del Sacro Cuore, Largo A. Gemelli 1, Milan 20123, Italy
- Franco Bottoni: Istituto Clinico Humanitas, IRCCS, Via Alessandro Manzoni 56, Rozzano 20089, Italy
- Stefano F Cappa: Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy; Dementia Research Center, IRCCS Mondino Foundation National Institute of Neurology, Via Mondino 2, Pavia 27100, Italy
- Silvestro Micera: The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy; Bertarelli Foundation Chair in Translational NeuroEngineering, Center for Neuroprosthetics and School of Engineering, Ecole Polytechnique Federale de Lausanne, Campus Biotech, Chemin des Mines 9, Geneva, GE CH 1202, Switzerland
- Emiliano Ricciardi: MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy
- Andrea Moro: Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
5. Tuckute G, Kanwisher N, Fedorenko E. Language in Brains, Minds, and Machines. Annu Rev Neurosci 2024; 47:277-301. PMID: 38669478; DOI: 10.1146/annurev-neuro-120623-101142.

Abstract
It has long been argued that only humans could produce and understand language. But now, for the first time, artificial language models (LMs) achieve this feat. Here we survey the new purchase LMs are providing on the question of how language is implemented in the brain. We discuss why, a priori, LMs might be expected to share similarities with the human language system. We then summarize evidence that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding and decoding during language processing. Finally, we examine which LM properties (their architecture, task performance, or training) are critical for capturing human neural responses to language and review studies using LMs as in silico model organisms for testing hypotheses about language. These ongoing investigations bring us closer to understanding the representations and processes that underlie our ability to comprehend sentences and express thoughts in language.
Affiliation(s)
- Greta Tuckute: Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Nancy Kanwisher: Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Evelina Fedorenko: Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
6. Meylan SC, Griffiths TL. Word Forms Reflect Trade-Offs Between Speaker Effort and Robust Listener Recognition. Cogn Sci 2024; 48:e13478. PMID: 38980972; DOI: 10.1111/cogs.13478.

Abstract
How do cognitive pressures shape the lexicons of natural languages? Here, we reframe George Kingsley Zipf's proposed "law of abbreviation" within a more general framework that relates it to cognitive pressures affecting speakers and listeners. In this new framework, speakers' drive to reduce effort (Zipf's proposal) is counteracted by the need for low-frequency words to have word forms that are sufficiently distinctive to allow for accurate recognition by listeners. To support this framework, we replicate and extend recent work using the prevalence of subword phonemic sequences (phonotactic probability) to measure speakers' production effort in place of Zipf's measure of length. Across languages and corpora, phonotactic probability is more strongly correlated with word frequency than word length is. We also show that this measure of ease of speech production (phonotactic probability) is strongly correlated with a measure of perceptual difficulty that indexes the degree of competition from alternative interpretations in word recognition. This is consistent with the claim that there must be trade-offs between these two factors, and inconsistent with a recent proposal that phonotactic probability facilitates both perception and production. To our knowledge, this is the first work to offer an explanation of why long, phonotactically improbable word forms remain in the lexicons of natural languages.
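A minimal sketch of the phonotactic-probability measure, assuming a tiny made-up lexicon and a frequency-weighted biphone model (the paper's actual estimator and corpora differ):

```python
import math
from collections import Counter

# Tiny hypothetical lexicon of phoneme strings with token frequencies;
# the paper uses real corpora and phonemic transcriptions.
lexicon = {"kat": 50, "kab": 20, "tak": 5, "zrx": 1}

# Frequency-weighted biphone counts, with "#" marking the word onset.
bigrams, unigrams = Counter(), Counter()
for word, freq in lexicon.items():
    padded = "#" + word
    for a, b in zip(padded, padded[1:]):
        bigrams[(a, b)] += freq
        unigrams[a] += freq

def phonotactic_logprob(word):
    """Mean log2 biphone probability: higher values mean more probable
    (easier-to-produce) sound sequences. No smoothing, so this only works
    for biphones attested in the toy lexicon."""
    pairs = list(zip("#" + word, word))
    return sum(math.log2(bigrams[p] / unigrams[p[0]]) for p in pairs) / len(pairs)
```

On this toy lexicon the frequent form "kat" scores higher (more probable biphones) than the rare, phonotactically odd "zrx", mirroring the frequency-phonotactics correlation the paper reports.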
Affiliation(s)
- Stephan C Meylan: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
7. Kauf C, Kim HS, Lee EJ, Jhingan N, She JS, Taliaferro M, Gibson E, Fedorenko E. Linguistic inputs must be syntactically parsable to fully engage the language network. bioRxiv [Preprint] 2024:2024.06.21.599332. PMID: 38948870; PMCID: PMC11212959; DOI: 10.1101/2024.06.21.599332.

Abstract
Human language comprehension is remarkably robust to ill-formed inputs (e.g., word transpositions). This robustness has led some to argue that syntactic parsing is largely an illusion, and that incremental comprehension is more heuristic, shallow, and semantics-based than is often assumed. However, the available data are also consistent with the possibility that humans always perform rule-like symbolic parsing and simply deploy error correction mechanisms to reconstruct ill-formed inputs when needed. We put these hypotheses to a new stringent test by examining brain responses to a) stimuli that should pose a challenge for syntactic reconstruction but allow for complex meanings to be built within local contexts through associative/shallow processing (sentences presented in a backward word order), and b) grammatically well-formed but semantically implausible sentences that should impede semantics-based heuristic processing. Using a novel behavioral syntactic reconstruction paradigm, we demonstrate that backward-presented sentences indeed impede the recovery of grammatical structure during incremental comprehension. Critically, these backward-presented stimuli elicit a relatively low response in the language areas, as measured with fMRI. In contrast, semantically implausible but grammatically well-formed sentences elicit a response in the language areas similar in magnitude to naturalistic (plausible) sentences. In other words, the ability to build syntactic structures during incremental language processing is both necessary and sufficient to fully engage the language network. Taken together, these results provide the strongest support to date for a generalized reliance of human language comprehension on syntactic parsing.
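The backward-presentation manipulation itself is simple to sketch (a toy stimulus generator, not the authors' actual materials pipeline):

```python
def backward_present(sentence):
    """Reverse the word order: adjacent word pairs can still support shallow,
    associative meaning composition, but recovering the global grammatical
    structure becomes much harder."""
    return " ".join(reversed(sentence.split()))
```

For example, `backward_present("the dog chased the cat")` keeps every word (so lexical content is intact) while destroying the original constituent structure.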
Affiliation(s)
- Carina Kauf: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Hee So Kim: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Elizabeth J. Lee: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Niharika Jhingan: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Jingyuan Selena She: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Maya Taliaferro: Department of Psychology, New York University, New York, NY 10012 USA
- Edward Gibson: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Evelina Fedorenko: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA; The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138 USA
8. Mahowald K, Ivanova AA, Blank IA, Kanwisher N, Tenenbaum JB, Fedorenko E. Dissociating language and thought in large language models. Trends Cogn Sci 2024; 28:517-540. PMID: 38508911; DOI: 10.1016/j.tics.2024.01.011.

Abstract
Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of separate mechanisms specialized for formal versus functional linguistic competence.
9. Isono S. Category Locality Theory: A unified account of locality effects in sentence comprehension. Cognition 2024; 247:105766. PMID: 38583323; DOI: 10.1016/j.cognition.2024.105766.

Abstract
In real-time sentence comprehension, the comprehender is often required to establish syntactic dependencies between words that are linearly distant. Major models of sentence comprehension assume that longer dependencies are more difficult to process because of working memory limitations. While the expected effect of distance on reading times (locality effect) has been robustly observed in certain constructions, such as relative clauses in English, its generalizability to a wider range of constructions has been empirically questioned. The current study proposes a new metric of syntactic distance that capitalizes on the flexible constituency of Combinatory Categorial Grammar (CCG), and argues that it offers a unified account of the locality effects. It is shown that this metric correctly predicts both the presence of the locality effect in English relative clauses and its absence in verb-final languages, without assuming language- or dependency-specific differences in the sensitivity to the locality effect. It is further shown that the CCG-based distance is a significant predictor of the self-paced reading times from an English corpus, even when other known predictors such as dependency-based locality and surprisal are taken into account. These results suggest that human sentence comprehension involves rapid integration of input words into efficiently compressed syntactic representations, and CCG is a plausible theory of the grammar that subserves this process.
10. Shain C, Kean H, Casto C, Lipkin B, Affourtit J, Siegelman M, Mollica F, Fedorenko E. Distributed Sensitivity to Syntax and Semantics throughout the Language Network. J Cogn Neurosci 2024; 36:1427-1471. PMID: 38683732; DOI: 10.1162/jocn_a_02164.

Abstract
Human language is expressive because it is compositional: The meaning of a sentence (semantics) can be inferred from its structure (syntax). It is commonly believed that language syntax and semantics are processed by distinct brain regions. Here, we revisit this claim using precision fMRI methods to capture separation or overlap of function in the brains of individual participants. Contrary to prior claims, we find distributed sensitivity to both syntax and semantics throughout a broad frontotemporal brain network. Our results join a growing body of evidence for an integrated network for language in the human brain within which internal specialization is primarily a matter of degree rather than kind, in contrast with influential proposals that advocate distinct specialization of different brain areas for different types of linguistic functions.
Affiliation(s)
- Hope Kean: Massachusetts Institute of Technology
11. Fedorenko E, Ivanova AA, Regev TI. The language network as a natural kind within the broader landscape of the human brain. Nat Rev Neurosci 2024; 25:289-312. PMID: 38609551; DOI: 10.1038/s41583-024-00802-4.

Abstract
Language behaviour is complex, but neuroscientific evidence disentangles it into distinct components supported by dedicated brain areas or networks. In this Review, we describe the 'core' language network, which includes left-hemisphere frontal and temporal areas, and show that it is strongly interconnected, independent of input and output modalities, causally important for language and language-selective. We discuss evidence that this language network plausibly stores language knowledge and supports core linguistic computations related to accessing words and constructions from memory and combining them to interpret (decode) or generate (encode) linguistic messages. We emphasize that the language network works closely with, but is distinct from, both lower-level (perceptual and motor) mechanisms and higher-level systems of knowledge and reasoning. The perceptual and motor mechanisms process linguistic signals, but, in contrast to the language network, are sensitive only to these signals' surface properties, not their meanings; the systems of knowledge and reasoning (such as the system that supports social reasoning) are sometimes engaged during language use but are not language-selective. This Review lays a foundation both for in-depth investigations of these different components of the language processing pipeline and for probing inter-component interactions.
Affiliation(s)
- Evelina Fedorenko: Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; The Program in Speech and Hearing in Bioscience and Technology, Harvard University, Cambridge, MA, USA
- Anna A Ivanova: School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA
- Tamar I Regev: Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
12. Hosseini EA, Schrimpf M, Zhang Y, Bowman S, Zaslavsky N, Fedorenko E. Artificial Neural Network Language Models Predict Human Brain Responses to Language Even After a Developmentally Realistic Amount of Training. Neurobiol Lang (Camb) 2024; 5:43-63. PMID: 38645622; PMCID: PMC11025646; DOI: 10.1162/nol_a_00137.

Abstract
Artificial neural networks have emerged as computationally plausible models of human language processing. A major criticism of these models is that the amount of training data they receive far exceeds that of humans during language learning. Here, we use two complementary approaches to ask how the models' ability to capture human fMRI responses to sentences is affected by the amount of training data. First, we evaluate GPT-2 models trained on 1 million, 10 million, 100 million, or 1 billion words against an fMRI benchmark. We consider the 100-million-word model to be developmentally plausible in terms of the amount of training data, given that this amount is similar to what children are estimated to be exposed to during the first 10 years of life. Second, we test a GPT-2 model trained on a 9-billion-token dataset, sufficient to reach state-of-the-art next-word prediction performance, against the human benchmark at different stages during training. Across both approaches, we find that (i) the models trained on a developmentally plausible amount of data already achieve near-maximal performance in capturing fMRI responses to sentences. Further, (ii) lower perplexity (a measure of next-word prediction performance) is associated with stronger alignment with human data, suggesting that models that have received enough training to achieve sufficiently high next-word prediction performance also acquire representations of sentences that are predictive of human fMRI responses. In tandem, these findings establish that although some training is necessary for the models' predictive ability, a developmentally realistic amount of training (~100 million words) may suffice.
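The perplexity measure referenced here can be sketched as follows (made-up per-token probabilities for illustration, not outputs of the GPT-2 models in the paper):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over tokens;
    lower values mean better next-word prediction."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical natural-log probabilities assigned to the same four tokens
# at an early vs. a late stage of training.
early_training = [math.log(p) for p in (0.1, 0.2, 0.05, 0.1)]
late_training = [math.log(p) for p in (0.5, 0.4, 0.6, 0.3)]
```

As training progresses the model assigns higher probability to the observed tokens, so perplexity falls, and the paper's finding (ii) is that this drop tracks stronger alignment with human fMRI data.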
Affiliation(s)
- Eghbal A. Hosseini: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Martin Schrimpf: The MIT Quest for Intelligence Initiative, Cambridge, MA, USA; Swiss Federal Institute of Technology, Lausanne, Switzerland
- Yian Zhang: Computer Science Department, Stanford University, Stanford, CA, USA
- Samuel Bowman: Center for Data Science, New York University, New York, NY, USA; Department of Linguistics, New York University, New York, NY, USA; Department of Computer Science, New York University, New York, NY, USA
- Noga Zaslavsky: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Language Science, University of California, Irvine, CA, USA
- Evelina Fedorenko: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; The MIT Quest for Intelligence Initiative, Cambridge, MA, USA; Speech and Hearing Bioscience and Technology Program, Harvard University, Boston, MA, USA
13. Kauf C, Tuckute G, Levy R, Andreas J, Fedorenko E. Lexical-Semantic Content, Not Syntactic Structure, Is the Main Contributor to ANN-Brain Similarity of fMRI Responses in the Language Network. Neurobiol Lang (Camb) 2024; 5:7-42. PMID: 38645614; PMCID: PMC11025651; DOI: 10.1162/nol_a_00116.

Abstract
Representations from artificial neural network (ANN) language models have been shown to predict human brain activity in the language network. To understand what aspects of linguistic stimuli contribute to ANN-to-brain similarity, we used an fMRI data set of responses to n = 627 naturalistic English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which ANN representations were extracted. In particular, we (i) perturbed sentences' word order, (ii) removed different subsets of words, or (iii) replaced sentences with other sentences of varying semantic similarity. We found that the lexical-semantic content of the sentence (largely carried by content words), rather than the sentence's syntactic form (conveyed via word order or function words), is primarily responsible for the ANN-to-brain similarity. In follow-up analyses, we found that perturbation manipulations that adversely affect brain predictivity also lead to more divergent representations in the ANN's embedding space and decrease the ANN's ability to predict upcoming tokens in those stimuli. Further, the results are robust to whether the mapping model is trained on intact or perturbed stimuli and to whether the ANN sentence representations are conditioned on the same linguistic context that humans saw. The critical result, that lexical-semantic content is the main contributor to the similarity between ANN representations and neural ones, aligns with the idea that the goal of the human language system is to extract meaning from linguistic strings. Finally, this work highlights the strength of systematic experimental manipulations for evaluating how close we are to accurate and generalizable models of the human language network.
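Two of the perturbation manipulations can be sketched as follows; the function-word list and scrambling scheme are illustrative assumptions, not the paper's exact procedure:

```python
import random

def scramble_word_order(sentence, seed=0):
    """Perturb syntactic form (word order) while keeping lexical content fixed."""
    words = sentence.split()
    random.Random(seed).shuffle(words)  # deterministic for a given seed
    return " ".join(words)

# A small illustrative function-word list; the paper's actual stop list may differ.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "was"}

def content_words_only(sentence):
    """Remove function words, preserving the lexical-semantic content."""
    return " ".join(w for w in sentence.split() if w.lower() not in FUNCTION_WORDS)
```

The paper's finding is that ANN representations of the second kind of stimulus (content words only) retain most of the brain predictivity, consistent with lexical-semantic content, not syntactic form, driving ANN-to-brain similarity.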
Affiliation(s)
- Carina Kauf: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Greta Tuckute: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Roger Levy: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jacob Andreas: Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Evelina Fedorenko: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
14
Antonello R, Huth A. Predictive Coding or Just Feature Discovery? An Alternative Account of Why Language Models Fit Brain Data. Neurobiol Lang 2024; 5:64-79. PMID: 38645616; PMCID: PMC11025645; DOI: 10.1162/nol_a_00087. Received: 02/28/2022; Accepted: 10/26/2022.
Abstract
Many recent studies have shown that representations drawn from neural network language models are extremely effective at predicting brain responses to natural language. But why do these models work so well? One proposed explanation is that language models and brains are similar because they have the same objective: to predict upcoming words before they are perceived. This explanation is attractive because it lends support to the popular theory of predictive coding. We provide several analyses that cast doubt on this claim. First, we show that the ability to predict future words does not uniquely (or even best) explain why some representations are a better match to the brain than others. Second, we show that within a language model, representations that are best at predicting future words are strictly worse brain models than other representations. Finally, we argue in favor of an alternative explanation for the success of language models in neuroscience: These models are effective at predicting brain responses because they generally capture a wide variety of linguistic phenomena.
Affiliation(s)
- Richard Antonello: Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Alexander Huth: Department of Computer Science, University of Texas at Austin, Austin, TX, USA
15
Medrano J, Friston K, Zeidman P. Linking fast and slow: The case for generative models. Netw Neurosci 2024; 8:24-43. PMID: 38562283; PMCID: PMC10861163; DOI: 10.1162/netn_a_00343. Received: 08/08/2023; Accepted: 10/11/2023.
Abstract
A pervasive challenge in neuroscience is testing whether neuronal connectivity changes over time due to specific causes, such as stimuli, events, or clinical interventions. Recent hardware innovations and falling data storage costs enable longer, more naturalistic neuronal recordings. The implicit opportunity for understanding the self-organised brain calls for new analysis methods that link temporal scales: from the order of milliseconds over which neuronal dynamics evolve, to the order of minutes, days, or even years over which experimental observations unfold. This review article demonstrates how hierarchical generative models and Bayesian inference help to characterise neuronal activity across different time scales. Crucially, these methods go beyond describing statistical associations among observations and enable inference about underlying mechanisms. We offer an overview of fundamental concepts in state-space modeling and suggest a taxonomy for these methods. Additionally, we introduce key mathematical principles that underscore a separation of temporal scales, such as the slaving principle, and review Bayesian methods that are being used to test hypotheses about the brain with multiscale data. We hope that this review will serve as a useful primer for experimental and computational neuroscientists on the state of the art and current directions of travel in the complex systems modelling literature.
Affiliation(s)
- Johan Medrano: The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, UK
- Karl Friston: The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, UK
- Peter Zeidman: The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, UK
16
Jain S, Vo VA, Wehbe L, Huth AG. Computational Language Modeling and the Promise of In Silico Experimentation. Neurobiol Lang 2024; 5:80-106. PMID: 38645624; PMCID: PMC11025654; DOI: 10.1162/nol_a_00101. Received: 02/28/2022; Accepted: 01/18/2023.
Abstract
Language neuroscience currently relies on two major experimental paradigms: controlled experiments using carefully hand-designed stimuli, and natural stimulus experiments. These approaches have complementary advantages which allow them to address distinct aspects of the neurobiology of language, but each approach also comes with drawbacks. Here we discuss a third paradigm-in silico experimentation using deep learning-based encoding models-that has been enabled by recent advances in cognitive computational neuroscience. This paradigm promises to combine the interpretability of controlled experiments with the generalizability and broad scope of natural stimulus experiments. We show four examples of simulating language neuroscience experiments in silico and then discuss both the advantages and caveats of this approach.
Affiliation(s)
- Shailee Jain: Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Vy A. Vo: Brain-Inspired Computing Lab, Intel Labs, Hillsboro, OR, USA
- Leila Wehbe: Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Alexander G. Huth: Department of Computer Science, University of Texas at Austin, Austin, TX, USA; Department of Neuroscience, University of Texas at Austin, Austin, TX, USA
17
Huber E, Sauppe S, Isasi-Isasmendi A, Bornkessel-Schlesewsky I, Merlo P, Bickel B. Surprisal From Language Models Can Predict ERPs in Processing Predicate-Argument Structures Only if Enriched by an Agent Preference Principle. Neurobiol Lang 2024; 5:167-200. PMID: 38645615; PMCID: PMC11025647; DOI: 10.1162/nol_a_00121. Received: 11/03/2022; Accepted: 08/30/2023.
Abstract
Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human processing system relies on no other principles than the general architecture of language models and on sufficient linguistic input. Here, we test this hypothesis on N400 effects observed during the processing of verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, N400 amplitudes and topographies in the region of the verb are best predicted when model-based surprisals are complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has an unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which grammars allow unmarked NPs to be patients, a structural feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human processing profit from incorporating surprisal estimates in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
Affiliation(s)
- Eva Huber: Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Sebastian Sauppe: Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland; Department of Psychology, University of Zurich, Zurich, Switzerland
- Arrate Isasi-Isasmendi: Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Ina Bornkessel-Schlesewsky: Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, Australia
- Paola Merlo: Department of Linguistics, University of Geneva, Geneva, Switzerland; University Center for Computer Science, University of Geneva, Geneva, Switzerland
- Balthasar Bickel: Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
18
Hmamouche Y, Ochs M, Prévot L, Chaminade T. Interpretable prediction of brain activity during conversations from multimodal behavioral signals. PLoS One 2024; 19:e0284342. PMID: 38512831; PMCID: PMC10956754; DOI: 10.1371/journal.pone.0284342. Received: 10/07/2022; Accepted: 03/29/2023.
Abstract
We present an analytical framework aimed at predicting the local brain activity in uncontrolled experimental conditions based on multimodal recordings of participants' behavior, and its application to a corpus of participants having conversations with another human or a conversational humanoid robot. The framework consists in extracting high-level features from the raw behavioral recordings and applying a dynamic prediction of binarized fMRI-recorded local brain activity using these behavioral features. The objective is to identify behavioral features required for this prediction, and their relative weights, depending on the brain area under investigation and the experimental condition. In order to validate our framework, we use a corpus of uncontrolled conversations of participants with a human or a robotic agent, focusing on brain regions involved in speech processing, and more generally in social interactions. The framework not only predicts local brain activity significantly better than random, it also quantifies the weights of behavioral features required for this prediction, depending on the brain area under investigation and on the nature of the conversational partner. In the left Superior Temporal Sulcus, perceived speech is the most important behavioral feature for predicting brain activity, regardless of the agent, while several features, which differ between the human and robot interlocutors, contribute to the prediction in regions involved in social cognition, such as the TemporoParietal Junction. This framework therefore allows us to study how multiple behavioral signals from different modalities are integrated in individual brain regions during complex social interactions.
Affiliation(s)
- Youssef Hmamouche: International Artificial Intelligence Center of Morocco, University Mohammed VI Polytechnique, Rabat, Morocco
- Magalie Ochs: LIS UMR 7020, CNRS, Aix Marseille Université, Université de Toulon, Marseille, France
- Laurent Prévot: LPL UMR 7309, CNRS, Aix Marseille Université, Marseille, France
19
Shain C, Schuler W. A Deep Learning Approach to Analyzing Continuous-Time Cognitive Processes. Open Mind (Camb) 2024; 8:235-264. PMID: 38528907; PMCID: PMC10962694; DOI: 10.1162/opmi_a_00126. Received: 08/14/2023; Accepted: 01/31/2024.
Abstract
The dynamics of the mind are complex. Mental processes unfold continuously in time and may be sensitive to a myriad of interacting variables, especially in naturalistic settings. But statistical models used to analyze data from cognitive experiments often assume simplistic dynamics. Recent advances in deep learning have yielded startling improvements to simulations of dynamical cognitive processes, including speech comprehension, visual perception, and goal-directed behavior. But due to poor interpretability, deep learning is generally not used for scientific analysis. Here, we bridge this gap by showing that deep learning can be used, not just to imitate, but to analyze complex processes, providing flexible function approximation while preserving interpretability. To do so, we define and implement a nonlinear regression model in which the probability distribution over the response variable is parameterized by convolving the history of predictors over time using an artificial neural network, thereby allowing the shape and continuous temporal extent of effects to be inferred directly from time series data. Our approach relaxes standard simplifying assumptions (e.g., linearity, stationarity, and homoscedasticity) that are implausible for many cognitive processes and may critically affect the interpretation of data. We demonstrate substantial improvements on behavioral and neuroimaging data from the language processing domain, and we show that our model enables discovery of novel patterns in exploratory analyses, controls for diverse confounds in confirmatory analyses, and opens up research questions in cognitive (neuro)science that are otherwise hard to study.
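The core modeling idea in this abstract, parameterizing the response as a convolution of the predictor history with a continuous-time impulse response function, can be sketched as follows. The gamma-shaped kernel and all parameter values here are illustrative assumptions; in the actual approach the kernel is an artificial neural network whose shape is inferred from data.

```python
import numpy as np

def gamma_irf(t, shape=6.0, rate=1.0):
    """Gamma-shaped impulse response function (illustrative stand-in for a
    learned kernel): the weight given to a predictor impulse t seconds
    before the response sample."""
    t = np.maximum(np.asarray(t, dtype=float), 0.0)
    return t ** (shape - 1) * np.exp(-rate * t)

def convolve_history(event_times, event_values, response_times, irf=gamma_irf):
    """Continuous-time convolution: each response sample sums the IRF-weighted
    history of predictor impulses, with no discretization onto a fixed grid."""
    response_times = np.asarray(response_times, dtype=float)
    prediction = np.zeros_like(response_times)
    for t_ev, x_ev in zip(event_times, event_values):
        prediction += x_ev * irf(response_times - t_ev)
    return prediction

# A single word-onset event at t = 0 with predictor value 1.0 (e.g., a
# surprisal impulse), evaluated at arbitrary response timestamps:
resp = convolve_history([0.0], [1.0], [1.0, 5.0, 15.0])
# The modeled effect rises and decays over continuous time, peaking near
# t = (shape - 1) / rate = 5 s for this illustrative kernel.
```

Because `response_times` need not be evenly spaced, the same machinery handles irregularly sampled behavioral or neuroimaging data, which is what lets the effect's shape and temporal extent be inferred directly from time series.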
Affiliation(s)
- Cory Shain: Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- William Schuler: Department of Linguistics, The Ohio State University, Columbus, OH, USA
20
Giglio L, Ostarek M, Sharoh D, Hagoort P. Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proc Natl Acad Sci U S A 2024; 121:e2310766121. PMID: 38442171; PMCID: PMC10945772; DOI: 10.1073/pnas.2310766121. Received: 07/24/2023; Accepted: 01/31/2024.
Abstract
The neural correlates of sentence production are typically studied using task paradigms that differ considerably from the experience of speaking outside of an experimental setting. In this fMRI study, we aimed to gain a better understanding of syntactic processing in spontaneous production versus naturalistic comprehension in three regions of interest (BA44, BA45, and left posterior middle temporal gyrus). A group of participants (n = 16) was asked to speak about the events of an episode of a TV series in the scanner. Another group of participants (n = 36) listened to the spoken recall of a participant from the first group. To model syntactic processing, we extracted word-by-word metrics of phrase-structure building with a top-down and a bottom-up parser that make different hypotheses about the timing of structure building. While the top-down parser anticipates syntactic structure, sometimes before it is obvious to the listener, the bottom-up parser builds syntactic structure in an integratory way after all of the evidence has been presented. In comprehension, neural activity was found to be better modeled by the bottom-up parser, while in production, it was better modeled by the top-down parser. We additionally modeled structure building in production with two strategies that were developed here to make different predictions about the incrementality of structure building during speaking. We found evidence for highly incremental and anticipatory structure building in production, which was confirmed by a converging analysis of the pausing patterns in speech. Overall, this study shows the feasibility of studying the neural dynamics of spontaneous language production.
Affiliation(s)
- Laura Giglio: Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands; Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525 EN, The Netherlands
- Markus Ostarek: Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands
- Daniel Sharoh: Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands; Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525 EN, The Netherlands
- Peter Hagoort: Max Planck Institute for Psycholinguistics, Nijmegen 6525 XD, The Netherlands; Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525 EN, The Netherlands
21
Shain C. Word Frequency and Predictability Dissociate in Naturalistic Reading. Open Mind (Camb) 2024; 8:177-201. PMID: 38476662; PMCID: PMC10932590; DOI: 10.1162/opmi_a_00119. Received: 07/06/2023; Accepted: 01/10/2024.
Abstract
Many studies of human language processing have shown that readers slow down at less frequent or less predictable words, but there is debate about whether frequency and predictability effects reflect separable cognitive phenomena: are cognitive operations that retrieve words from the mental lexicon based on sensory cues distinct from those that predict upcoming words based on context? Previous evidence for a frequency-predictability dissociation is mostly based on small samples (both for estimating predictability and frequency and for testing their effects on human behavior), artificial materials (e.g., isolated constructed sentences), and implausible modeling assumptions (discrete-time dynamics, linearity, additivity, constant variance, and invariance over time), which raises the question: do frequency and predictability dissociate in ordinary language comprehension, such as story reading? This study leverages recent progress in open data and computational modeling to address this question at scale. A large collection of naturalistic reading data (six datasets, >2.2 M datapoints) is analyzed using nonlinear continuous-time regression, and frequency and predictability are estimated using statistical language models trained on more data than is currently typical in psycholinguistics. Despite the use of naturalistic data, strong predictability estimates, and flexible regression models, results converge with earlier experimental studies in supporting dissociable and additive frequency and predictability effects.
Affiliation(s)
- Cory Shain: Department of Brain & Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
22
Shain C, Meister C, Pimentel T, Cotterell R, Levy R. Large-scale evidence for logarithmic effects of word predictability on reading time. Proc Natl Acad Sci U S A 2024; 121:e2307876121. PMID: 38422017; DOI: 10.1073/pnas.2307876121. Received: 05/11/2023; Accepted: 11/11/2023.
Abstract
During real-time language comprehension, our minds rapidly decode complex meanings from sequences of words. The difficulty of doing so is known to be related to words' contextual predictability, but what cognitive processes do these predictability effects reflect? In one view, predictability effects reflect facilitation due to anticipatory processing of words that are predictable from context. This view predicts a linear effect of predictability on processing demand. In another view, predictability effects reflect the costs of probabilistic inference over sentence interpretations. This view predicts either a logarithmic or a superlogarithmic effect of predictability on processing demand, depending on whether it assumes pressures toward a uniform distribution of information over time. The empirical record is currently mixed. Here, we revisit this question at scale: We analyze six reading datasets, estimate next-word probabilities with diverse statistical language models, and model reading times using recent advances in nonlinear regression. Results support a logarithmic effect of word predictability on processing difficulty, which favors probabilistic inference as a key component of human language processing.
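To make the abstract's linear-vs-logarithmic contrast concrete (a sketch under the abstract's framing, not the paper's analysis code): under the logarithmic (surprisal) view, halving a word's contextual probability adds a constant increment of processing cost wherever it occurs on the probability scale, whereas under a linear view the cost of the same halving depends on the absolute probabilities involved.

```python
import math

def surprisal_bits(p):
    """Surprisal: the negative log probability of a word given its context."""
    return -math.log2(p)

# Halving probability adds exactly 1 bit of surprisal, at the top of the
# scale and at the bottom alike:
delta_high = surprisal_bits(0.4) - surprisal_bits(0.8)    # 1.0 bit
delta_low = surprisal_bits(0.05) - surprisal_bits(0.10)   # 1.0 bit

# A linear predictability effect instead scales with the raw probability
# difference, which differs by a factor of 8 between the same two halvings:
lin_high = 0.8 - 0.4     # 0.40
lin_low = 0.10 - 0.05    # 0.05
```

The paper's result, that reading time tracks the logarithm of probability, means processing difficulty behaves like the surprisal deltas above rather than the raw probability deltas.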
Affiliation(s)
- Cory Shain: Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Clara Meister: Department of Computer Science, Institute for Machine Learning, ETH Zürich, Zürich 8092, Switzerland
- Tiago Pimentel: Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, United Kingdom
- Ryan Cotterell: Department of Computer Science, Institute for Machine Learning, ETH Zürich, Zürich 8092, Switzerland
- Roger Levy: Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
23
Tuckute G, Sathe A, Srikant S, Taliaferro M, Wang M, Schrimpf M, Kay K, Fedorenko E. Driving and suppressing the human language network using large language models. Nat Hum Behav 2024; 8:544-561. PMID: 38172630; DOI: 10.1038/s41562-023-01783-7. Received: 05/06/2023; Accepted: 11/10/2023.
Abstract
Transformer models such as GPT generate human-like language and are predictive of human brain responses to language. Here, using functional-MRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. We then use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also non-invasively control neural activity in higher-level cortical areas, such as the language network.
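The encoding-model step, mapping model-derived sentence features to a scalar brain response with regularized linear regression and then ranking new candidate sentences by predicted response, can be sketched as below. The feature dimensionality, ridge penalty, and synthetic data are illustrative assumptions, not the study's actual GPT features or fMRI measurements.

```python
import numpy as np

def fit_ridge(X, y, alpha=10.0):
    """Closed-form ridge regression from sentence features to brain responses."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))                # 200 sentences x 16 feature dims
w_true = rng.standard_normal(16)
y = X @ w_true + 0.1 * rng.standard_normal(200)   # synthetic "brain responses"

w = fit_ridge(X, y)

# Once fit, the model can rank unseen candidate sentences by their predicted
# response, which is the logic behind selecting drive/suppress sentences for
# follow-up scanning in new individuals.
candidates = rng.standard_normal((5, 16))
ranking = np.argsort(candidates @ w)[::-1]        # highest predicted response first
```

The interesting move in the paper is the closed loop: predictions from a model like this are tested causally by presenting the selected sentences to new participants, rather than only evaluated on held-out data.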
Affiliation(s)
- Greta Tuckute: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Aalok Sathe: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Shashank Srikant: Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; MIT-IBM Watson AI Lab, Cambridge, MA, USA
- Maya Taliaferro: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Mingye Wang: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Martin Schrimpf: McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Quest for Intelligence, Massachusetts Institute of Technology, Cambridge, MA, USA; Neuro-X Institute, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Kendrick Kay: Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, USA
- Evelina Fedorenko: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
24
Malik-Moraleda S, Jouravlev O, Taliaferro M, Mineroff Z, Cucu T, Mahowald K, Blank IA, Fedorenko E. Functional characterization of the language network of polyglots and hyperpolyglots with precision fMRI. Cereb Cortex 2024; 34:bhae049. PMID: 38466812; PMCID: PMC10928488; DOI: 10.1093/cercor/bhae049. Received: 01/18/2023; Revised: 01/24/2024; Accepted: 01/25/2024.
Abstract
How do polyglots-individuals who speak five or more languages-process their languages, and what can this population tell us about the language system? Using fMRI, we identified the language network in each of 34 polyglots (including 16 hyperpolyglots with knowledge of 10+ languages) and examined its response to the native language, non-native languages of varying proficiency, and unfamiliar languages. All language conditions engaged all areas of the language network relative to a control condition. Languages that participants rated as higher proficiency elicited stronger responses, except for the native language, which elicited a similar or lower response than a non-native language of similar proficiency. Furthermore, unfamiliar languages that were typologically related to the participants' high-to-moderate-proficiency languages elicited a stronger response than unfamiliar unrelated languages. The results suggest that the language network's response magnitude scales with the degree of engagement of linguistic computations (e.g. related to lexical access and syntactic-structure building). We also replicated a prior finding of weaker responses to native language in polyglots than non-polyglot bilinguals. These results contribute to our understanding of how multiple languages coexist within a single brain and provide new evidence that the language network responds more strongly to stimuli that more fully engage linguistic computations.
Affiliation(s)
- Saima Malik-Moraleda: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA 02114, United States
- Olessia Jouravlev: Department of Cognitive Science, Carleton University, Ottawa K1S 5B6, Canada
- Maya Taliaferro: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Zachary Mineroff: Eberly Center, Carnegie Mellon University, Pittsburgh, PA 15289, United States
- Theodore Cucu: Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15289, United States
- Kyle Mahowald: Department of Linguistics, The University of Texas at Austin, Austin, TX 78712, United States
- Idan A Blank: Department of Psychology, University of California Los Angeles, Los Angeles, CA 90095, United States
- Evelina Fedorenko: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, United States; Program in Speech and Hearing Bioscience and Technology, Harvard University, Boston, MA 02114, United States
25
Li X, Qu Q. Verbal working memory capacity modulates semantic and phonological prediction in spoken comprehension. Psychon Bull Rev 2024; 31:249-258. PMID: 37558832; DOI: 10.3758/s13423-023-02348-5. Accepted: 07/23/2023.
Abstract
Mounting evidence suggests that people may use multiple cues to predict different levels of representation (e.g., semantic, syntactic, and phonological) during language comprehension. One question that has been less investigated is the relationship between general cognitive processing and the efficiency of prediction at various linguistic levels, such as semantic and phonological levels. To address this research gap, the present study investigated how working memory capacity (WMC) modulates different kinds of prediction behavior (i.e., semantic prediction and phonological prediction) in the visual world. Chinese speakers listened to the highly predictable sentences that contained a highly predictable target word, and viewed a visual display of objects. The visual display of objects contained a target object corresponding to the predictable word, a semantic or a phonological competitor that was semantically or phonologically related to the predictable word, and an unrelated object. We conducted a Chinese version of the reading span task to measure verbal WMC and grouped participants into high- and low-span groups. Participants showed semantic and phonological prediction with comparable size in both groups during language comprehension, with earlier semantic prediction in the high-span group, and a similar time course of phonological prediction in both groups. These results suggest that verbal working memory modulates predictive processing in language comprehension.
Affiliation(s)
- Xinjing Li
- Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing, China, 100101
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Qingqing Qu
- Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing, China, 100101.
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China.
26
Andrade PE, Müllensiefen D, Andrade OVCA, Dunstan J, Zuk J, Gaab N. Sequence Processing in Music Predicts Reading Skills in Young Readers: A Longitudinal Study. J Learn Disabil 2024; 57:43-60. PMID: 36935627. DOI: 10.1177/00222194231157722.
Abstract
Musical abilities, both in the pitch and temporal dimension, have been shown to be positively associated with phonological awareness and reading abilities in both children and adults. There is increasing evidence that the relationship between music and language relies primarily on the temporal dimension, including both meter and rhythm. It remains unclear to what extent skill level in these temporal aspects of music may uniquely contribute to the prediction of reading outcomes. A longitudinal design was used to test a group-administered musical sequence transcription task (MSTT). This task was designed to preferentially engage sequence processing skills while controlling for fine-grained pitch discrimination and rhythm in terms of temporal grouping. Forty-five children, native speakers of Portuguese (mean age = 7.4 years), completed the MSTT and a cognitive-linguistic protocol that included visual and auditory working memory tasks, as well as phonological awareness and reading tasks in second grade. Participants then completed reading assessments in third and fifth grades. Longitudinal regression models showed that MSTT and phonological awareness had comparable power to predict reading. The MSTT showed an overall classification accuracy for identifying low-achievement readers in Grades 2, 3, and 5 that was analogous to a comprehensive model including core predictors of reading disability. In addition, MSTT was the variable with the highest loading and the most discriminatory indicator of a phonological factor. These findings carry implications for the role of temporal sequence processing in contributing to the relationship between music and language and the potential use of MSTT as a language-independent, time- and cost-effective tool for the early identification of children at risk of reading disability.
27
van der Burght CL, Friederici AD, Maran M, Papitto G, Pyatigorskaya E, Schroën JAM, Trettenbrein PC, Zaccarella E. Cleaning up the Brickyard: How Theory and Methodology Shape Experiments in Cognitive Neuroscience of Language. J Cogn Neurosci 2023; 35:2067-2088. PMID: 37713672. DOI: 10.1162/jocn_a_02058.
Abstract
The capacity for language is a defining property of our species, yet despite decades of research, evidence on its neural basis is still mixed and a generalized consensus is difficult to achieve. We suggest that this is partly caused by researchers defining "language" in different ways, with focus on a wide range of phenomena, properties, and levels of investigation. Accordingly, there is very little agreement among cognitive neuroscientists of language on the operationalization of fundamental concepts to be investigated in neuroscientific experiments. Here, we review chains of derivation in the cognitive neuroscience of language, focusing on how the hypothesis under consideration is defined by a combination of theoretical and methodological assumptions. We first attempt to disentangle the complex relationship between linguistics, psychology, and neuroscience in the field. Next, we focus on how the conclusions that can be drawn from any experiment are inherently constrained by auxiliary assumptions, both theoretical and methodological, on which their validity rests. These issues are discussed in the context of classical experimental manipulations as well as study designs that employ novel approaches such as naturalistic stimuli and computational modeling. We conclude by proposing that a highly interdisciplinary field such as the cognitive neuroscience of language requires researchers to form explicit statements concerning the theoretical definitions, methodological choices, and other constraining factors involved in their work.
Affiliation(s)
- Angela D Friederici
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Matteo Maran
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- Giorgio Papitto
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- Elena Pyatigorskaya
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- Joëlle A M Schroën
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- Patrick C Trettenbrein
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany
- University of Göttingen, Göttingen, Germany
- Emiliano Zaccarella
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
28
Ryskin RA, Spivey MJ. Toward sophisticated models of naturalistic language behavior: Comment on "Beyond Simple Laboratory Studies" by A. Maselli et al. Phys Life Rev 2023; 47:191-194. PMID: 37926021. DOI: 10.1016/j.plrev.2023.10.022.
Affiliation(s)
- Rachel A Ryskin
- Department of Cognitive & Information Sciences, University of California, Merced, USA
- Michael J Spivey
- Department of Cognitive & Information Sciences, University of California, Merced, USA.
29
Kauf C, Ivanova AA, Rambelli G, Chersoni E, She JS, Chowdhury Z, Fedorenko E, Lenci A. Event Knowledge in Large Language Models: The Gap Between the Impossible and the Unlikely. Cogn Sci 2023; 47:e13386. PMID: 38009752. DOI: 10.1111/cogs.13386.
Abstract
Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign a higher likelihood to possible versus impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely versus unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
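The minimal-pair comparison described in this abstract can be sketched with a toy stand-in for the pretrained LLMs: score each sentence of a pair under a language model and check which member receives the higher likelihood. The bigram model, corpus, and sentence pair below are illustrative assumptions for the sketch, not the authors' models or materials (the study used pretrained LLMs from BERT to MPT).

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over a toy corpus (lists of tokens)."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def sentence_logprob(sent, uni, bi):
    """Add-one smoothed bigram log-probability: the 'LM score' of a sentence."""
    toks = ["<s>"] + sent + ["</s>"]
    vocab = len(uni)
    return sum(
        math.log((bi[(a, b)] + 1) / (uni[a] + vocab))
        for a, b in zip(toks, toks[1:])
    )

corpus = [
    "the teacher bought the laptop".split(),
    "the teacher graded the exam".split(),
    "the student bought the book".split(),
]
uni, bi = train_bigram(corpus)

plausible = "the teacher bought the laptop".split()
implausible = "the laptop bought the teacher".split()
# The minimal-pair test: does the model prefer the possible event?
print(sentence_logprob(plausible, uni, bi) > sentence_logprob(implausible, uni, bi))  # → True
```

The same comparison logic applies unchanged when the scoring function is a large pretrained model instead of this toy bigram model; the abstract's finding is that LLM scores nearly always separate possible from impossible events but separate likely from unlikely events less consistently.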
Affiliation(s)
- Carina Kauf
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
- McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Anna A Ivanova
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
- McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology
- Giulia Rambelli
- Department of Modern Languages, Literatures and Cultures, University of Bologna
- Emmanuele Chersoni
- Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University
- Jingyuan Selena She
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
- McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
- McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Alessandro Lenci
- Department of Philology, Literature, and Linguistics, University of Pisa
30
Ryskin R, Nieuwland MS. Prediction during language comprehension: what is next? Trends Cogn Sci 2023; 27:1032-1052. PMID: 37704456. DOI: 10.1016/j.tics.2023.08.003.
Abstract
Prediction is often regarded as an integral aspect of incremental language comprehension, but little is known about the cognitive architectures and mechanisms that support it. We review studies showing that listeners and readers use all manner of contextual information to generate multifaceted predictions about upcoming input. The nature of these predictions may vary between individuals owing to differences in language experience, among other factors. We then turn to unresolved questions which may guide the search for the underlying mechanisms. (i) Is prediction essential to language processing or an optional strategy? (ii) Are predictions generated from within the language system or by domain-general processes? (iii) What is the relationship between prediction and memory? (iv) Does prediction in comprehension require simulation via the production system? We discuss promising directions for making progress in answering these questions and for developing a mechanistic understanding of prediction in language.
Affiliation(s)
- Rachel Ryskin
- Department of Cognitive and Information Sciences, University of California Merced, 5200 Lake Road, Merced, CA 95343, USA.
- Mante S Nieuwland
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands
31
Matchin W, den Ouden DB, Basilakos A, Stark BC, Fridriksson J, Hickok G. Grammatical Parallelism in Aphasia: A Lesion-Symptom Mapping Study. Neurobiol Lang 2023; 4:550-574. PMID: 37946730. PMCID: PMC10631800. DOI: 10.1162/nol_a_00117.
Abstract
Sentence structure, or syntax, is potentially a uniquely creative aspect of the human mind. Neuropsychological experiments in the 1970s suggested parallel syntactic production and comprehension deficits in agrammatic Broca's aphasia, thought to result from damage to syntactic mechanisms in Broca's area in the left frontal lobe. This hypothesis was sometimes termed overarching agrammatism, converging with developments in linguistic theory concerning central syntactic mechanisms supporting language production and comprehension. However, the evidence supporting an association among receptive syntactic deficits, expressive agrammatism, and damage to frontal cortex is equivocal. In addition, the relationship among a distinct grammatical production deficit in aphasia, paragrammatism, and receptive syntax has not been assessed. We used lesion-symptom mapping in three partially overlapping groups of left-hemisphere stroke patients to investigate these issues: grammatical production deficits in a primary group of 53 subjects and syntactic comprehension in larger sample sizes (N = 130, 218) that overlapped with the primary group. Paragrammatic production deficits were significantly associated with multiple analyses of syntactic comprehension, particularly when incorporating lesion volume as a covariate, but agrammatic production deficits were not. The lesion correlates of impaired performance of syntactic comprehension were significantly associated with damage to temporal lobe regions, which were also implicated in paragrammatism, but not with the inferior and middle frontal regions implicated in expressive agrammatism. Our results provide strong evidence against the overarching agrammatism hypothesis. By contrast, our results suggest the possibility of an alternative grammatical parallelism hypothesis rooted in paragrammatism and a central syntactic system in the posterior temporal lobe.
Affiliation(s)
- William Matchin
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Dirk-Bart den Ouden
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Alexandra Basilakos
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Brielle Caserta Stark
- Department of Speech, Language and Hearing Sciences, Program for Neuroscience, Indiana University Bloomington, Bloomington, IN, USA
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Gregory Hickok
- Department of Cognitive Sciences, Department of Language Science, University of California, Irvine, Irvine, CA, USA
32
Tuckute G, Sathe A, Srikant S, Taliaferro M, Wang M, Schrimpf M, Kay K, Fedorenko E. Driving and suppressing the human language network using large language models. bioRxiv 2023:2023.04.16.537080. PMID: 37090673. PMCID: PMC10120732. DOI: 10.1101/2023.04.16.537080.
Abstract
Transformer models such as GPT generate human-like language and are highly predictive of human brain responses to language. Here, using fMRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of brain response associated with each sentence. Then, we use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress activity of human language areas in new individuals. A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also noninvasively control neural activity in higher-level cortical areas, like the language network.
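The encoding-model approach described here, regressing model-derived sentence features onto measured brain responses, can be sketched as follows. This is a minimal illustration on simulated data: the feature dimensionality, ridge penalty, and train/test split are assumptions for the sketch, not the study's actual pipeline (which mapped GPT hidden states to fMRI responses for 1,000 sentences).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: 1000 sentences x 64-dim "LM embeddings" and one simulated
# brain response per sentence (a noisy linear function of the features).
X = rng.normal(size=(1000, 64))
true_w = rng.normal(size=64)
y = X @ true_w + rng.normal(scale=0.5, size=1000)

def fit_ridge(X, y, alpha=1.0):
    """Ridge-regression encoding model, closed form: w = (X'X + aI)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

tr, te = slice(0, 800), slice(800, 1000)  # simple train/test split
w = fit_ridge(X[tr], y[tr])
pred = X[te] @ w
r = np.corrcoef(pred, y[te])[0, 1]  # held-out prediction accuracy
print(f"held-out correlation r = {r:.2f}")
```

Once such a model predicts held-out responses well, it can be run "in reverse" as the study does: search over candidate sentences for those whose features are predicted to maximally drive or suppress the response.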
Affiliation(s)
- Greta Tuckute
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Aalok Sathe
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Shashank Srikant
- Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- MIT-IBM Watson AI Lab, Cambridge, MA 02142, USA
- Maya Taliaferro
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Mingye Wang
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Martin Schrimpf
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Quest for Intelligence, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Neuro-X Institute, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Kendrick Kay
- Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN 55455 USA
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138 USA
33
Alotaibi S, Alsaleh A, Wuerger S, Meyer G. Rapid neural changes during novel speech-sound learning: An fMRI and DTI study. Brain Lang 2023; 245:105324. PMID: 37741162. DOI: 10.1016/j.bandl.2023.105324.
Abstract
While the functional and microstructural changes that occur when we learn new language skills are well documented, relatively little is known about their time course. Here we present a combined functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) study that tracks neural change over three days of learning Arabic phonetic categorization as a new language (L-training). Twenty adult native English-speaking (L-native) participants were scanned before and after training to perceive and produce L-training phonetic contrasts for one hour on three consecutive days. A third language (Chinese) served as a control (L-control). Behavioral results show significant performance improvement for L-training in both trained tasks, perception and production. Imaging analyses reveal training-related increases in hemodynamic fMRI signal and fractional anisotropy (FA) in the left inferior frontal gyrus (LIFG) that correlate positively with behavioral improvement. Moreover, post-training functional connectivity between the LIFG and the left inferior parietal lobule increases significantly for L-training. These results indicate that three hours of phonetic categorization learning causes functional and microstructural changes typically associated with much longer-term learning.
Affiliation(s)
- Sahal Alotaibi
- Radiology Dept, Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia; Faculty of Health & Life Sciences, University of Liverpool, Liverpool L69 7ZA, United Kingdom
- Alanood Alsaleh
- Radiological Sciences Dept, Applied Medical Sciences, King Saud University, Riyadh, Saudi Arabia
- Sophie Wuerger
- Clinical and Cognitive Neuroscience Group, Dept of Psychology, University of Liverpool, Liverpool L69 7ZA, United Kingdom
- Georg Meyer
- Clinical and Cognitive Neuroscience Group, Dept of Psychology, University of Liverpool, Liverpool L69 7ZA, United Kingdom; Virtual Engineering Centre, Digital Innovation Facility, University of Liverpool, L69 3RF, United Kingdom
34
Philips M, Schneck SM, Levy DF, Wilson SM. Modality-Specificity of the Neural Correlates of Linguistic and Non-Linguistic Demand. Neurobiol Lang 2023; 4:516-535. PMID: 37841966. PMCID: PMC10575553. DOI: 10.1162/nol_a_00114.
Abstract
Imaging studies of language processing in clinical populations can be complicated to interpret for several reasons, one being the difficulty of matching the effortfulness of processing across individuals or tasks. To better understand how effortful linguistic processing is reflected in functional activity, we investigated the neural correlates of task difficulty in linguistic and non-linguistic contexts in the auditory modality and then compared our findings to a recent analogous experiment in the visual modality in a different cohort. Nineteen neurologically normal individuals were scanned with fMRI as they performed a linguistic task (semantic matching) and a non-linguistic task (melodic matching), each with two levels of difficulty. We found that left hemisphere frontal and temporal language regions, as well as the right inferior frontal gyrus, were modulated by linguistic demand and not by non-linguistic demand. This was broadly similar to what was previously observed in the visual modality. In contrast, the multiple demand (MD) network, a set of brain regions thought to support cognitive flexibility in many contexts, was modulated neither by linguistic demand nor by non-linguistic demand in the auditory modality. This finding was in striking contradistinction to what was previously observed in the visual modality, where the MD network was robustly modulated by both linguistic and non-linguistic demand. Our findings suggest that while the language network is modulated by linguistic demand irrespective of modality, modulation of the MD network by linguistic demand is not inherent to linguistic processing, but rather depends on specific task factors.
Affiliation(s)
- Mackenzie Philips
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Sarah M. Schneck
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Deborah F. Levy
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Stephen M. Wilson
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- School of Health and Rehabilitation Sciences, University of Queensland, Brisbane, Australia
35
Desbordes T, Lakretz Y, Chanoine V, Oquab M, Badier JM, Trébuchon A, Carron R, Bénar CG, Dehaene S, King JR. Dimensionality and Ramping: Signatures of Sentence Integration in the Dynamics of Brains and Deep Language Models. J Neurosci 2023; 43:5350-5364. PMID: 37217308. PMCID: PMC10359032. DOI: 10.1523/jneurosci.1163-22.2023.
Abstract
A sentence is more than the sum of its words: its meaning depends on how they combine with one another. The brain mechanisms underlying such semantic composition remain poorly understood. To shed light on the neural vector code underlying semantic composition, we introduce two hypotheses: (1) the intrinsic dimensionality of the space of neural representations should increase as a sentence unfolds, paralleling the growing complexity of its semantic representation; and (2) this progressive integration should be reflected in ramping and sentence-final signals. To test these predictions, we designed a dataset of closely matched normal and jabberwocky sentences (composed of meaningless pseudo words) and displayed them to deep language models and to 11 human participants (5 men and 6 women) monitored with simultaneous MEG and intracranial EEG. In both deep language models and electrophysiological data, we found that representational dimensionality was higher for meaningful sentences than jabberwocky. Furthermore, multivariate decoding of normal versus jabberwocky confirmed three dynamic patterns: (1) a phasic pattern following each word, peaking in temporal and parietal areas; (2) a ramping pattern, characteristic of bilateral inferior and middle frontal gyri; and (3) a sentence-final pattern in left superior frontal gyrus and right orbitofrontal cortex. These results provide a first glimpse into the neural geometry of semantic integration and constrain the search for a neural code of linguistic composition.

SIGNIFICANCE STATEMENT: Starting from general linguistic concepts, we make two sets of predictions about neural signals evoked by reading multiword sentences. First, the intrinsic dimensionality of the representation should grow with additional meaningful words. Second, the neural dynamics should exhibit signatures of encoding, maintaining, and resolving semantic composition. We successfully validated these hypotheses in deep neural language models, artificial neural networks trained on text that perform very well on many natural language processing tasks. Then, using a unique combination of MEG and intracranial electrodes, we recorded high-resolution brain data from human participants while they read a controlled set of sentences. Time-resolved dimensionality analysis showed increasing dimensionality with meaning, and multivariate decoding allowed us to isolate the three dynamic patterns we had hypothesized.
Affiliation(s)
- Théo Desbordes
- Meta AI Research, Paris 75002, France; and Cognitive Neuroimaging Unit, NeuroSpin center, 91191 Gif-sur-Yvette, France
- Yair Lakretz
- Cognitive Neuroimaging Unit, NeuroSpin center, 91191 Gif-sur-Yvette, France
- Valérie Chanoine
- Institute of Language, Communication and the Brain, Aix-en-Provence, 13100, France; and Aix-Marseille Université, Centre National de la Recherche Scientifique, LPL, Aix-en-Provence, 13100, France
- Jean-Michel Badier
- Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100; and Inst Neurosci Syst, Marseille, 13005, France
- Agnès Trébuchon
- Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France; and Assistance Publique Hopitaux de Marseille, Timone hospital, Epileptology and Cerebral Rythmology, Marseille, 13385, France
- Romain Carron
- Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France; and Assistance Publique Hopitaux de Marseille, Timone hospital, Functional and Stereotactic Neurosurgery, Marseille, 13385, France
- Christian-G Bénar
- Aix Marseille Université, Institut National de la Santé et de la Recherche Médicale, CNRS, LPL, Aix-en-Provence 13100, France; and Inst Neurosci Syst, Marseille, 13005, France
- Stanislas Dehaene
- Université Paris Saclay, Institut National de la Santé et de la Recherche Médicale, Commissariat à l'Energie Atomique, Cognitive Neuroimaging Unit, NeuroSpin center, Saclay, 91191, France; and Collège de France, PSL University, Paris, 75231, France
- Jean-Rémi King
- Meta AI Research, Paris 75002, France; and Cognitive Neuroimaging Unit, NeuroSpin center, 91191 Gif-sur-Yvette, France
- LSP, École normale supérieure, PSL (Paris Sciences & Lettres) University, CNRS, 75005 Paris, France
36
Wilmskoetter J, Busby N, He X, Caciagli L, Roth R, Kristinsson S, Davis KA, Rorden C, Bassett DS, Fridriksson J, Bonilha L. Dynamic network properties of the superior temporal gyrus mediate the impact of brain age gap on chronic aphasia severity. Commun Biol 2023; 6:727. PMID: 37452209. PMCID: PMC10349039. DOI: 10.1038/s42003-023-05119-z.
Abstract
Brain structure deteriorates with aging and predisposes an individual to more severe language impairments (aphasia) after a stroke. However, the underlying mechanisms of this relation are not well understood. Here we use an approach to model brain network properties outside the stroke lesion, network controllability, to investigate relations among individualized structural brain connections, brain age, and aphasia severity in 93 participants with chronic post-stroke aphasia. Controlling for the stroke lesion size, we observe that lower average controllability of the posterior superior temporal gyrus (STG) mediates the relation between advanced brain aging and aphasia severity. Lower controllability of the left posterior STG signifies that activity in the left posterior STG is less likely to yield a response in other brain regions due to the topological properties of the structural brain networks. These results indicate that advanced brain aging among individuals with post-stroke aphasia is associated with disruption of dynamic properties of a critical language-related area, the STG, which contributes to worse aphasic symptoms. Because brain aging is variable among individuals with aphasia, our results provide further insight into the mechanisms underlying the variance in clinical trajectories in post-stroke aphasia.
Affiliation(s)
- Janina Wilmskoetter
- Department of Health and Rehabilitation Sciences, College of Health Professions, Medical University of South Carolina, Charleston, SC, USA
- Natalie Busby
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Xiaosong He
- Department of Psychology, University of Science and Technology of China, Beijing, China
- Lorenzo Caciagli
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Rebecca Roth
- Department of Neurology, Emory University, Atlanta, GA, USA
- Sigfus Kristinsson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
- Kathryn A Davis
- Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Chris Rorden
- Department of Psychology, University of South Carolina, Columbia, SC, USA
- Dani S Bassett
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Electrical and Systems Engineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Department of Physics & Astronomy, School of Arts & Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Santa Fe Institute, Santa Fe, NM, USA
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC, USA
37
Stanojević M, Brennan JR, Dunagan D, Steedman M, Hale JT. Modeling Structure-Building in the Brain With CCG Parsing and Large Language Models. Cogn Sci 2023; 47:e13312. [PMID: 37417470 DOI: 10.1111/cogs.13312]
Abstract
To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad-coverage tools from natural-language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context-free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory categorial grammars (CCGs) are sufficiently expressive, directly compositional models of grammar with flexible constituency that affords incremental interpretation. In this work, we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with functional magnetic resonance imaging (fMRI) while participants listen to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next-word predictability from a transformer neural network language model. Such a comparison reveals unique contributions of CCG structure-building predominantly in the left posterior temporal lobe: CCG-derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure-building are thus separable from predictability during naturalistic listening, and those effects are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds.
Affiliation(s)
- John T Hale
  - Google DeepMind
  - Department of Linguistics, University of Georgia
38
Chen X, Affourtit J, Ryskin R, Regev TI, Norman-Haignere S, Jouravlev O, Malik-Moraleda S, Kean H, Varley R, Fedorenko E. The human language system, including its inferior frontal component in "Broca's area," does not support music perception. Cereb Cortex 2023; 33:7904-7929. [PMID: 37005063 PMCID: PMC10505454 DOI: 10.1093/cercor/bhad087]
Abstract
Language and music are two human-unique capacities whose relationship remains debated. Some have argued for overlap in processing mechanisms, especially for structure processing. Such claims often concern the inferior frontal component of the language system located within "Broca's area." However, others have failed to find overlap. Using a robust individual-subject fMRI approach, we examined the responses of language brain regions to music stimuli, and probed the musical abilities of individuals with severe aphasia. Across 4 experiments, we obtained a clear answer: music perception does not engage the language system, and judgments about music structure are possible even in the presence of severe damage to the language network. In particular, the language regions' responses to music are generally low, often below the fixation baseline, and never exceed responses elicited by nonmusic auditory conditions, like animal sounds. Furthermore, the language regions are not sensitive to music structure: they show low responses to both intact and structure-scrambled music, and to melodies with vs. without structural violations. Finally, in line with past patient investigations, individuals with aphasia, who cannot judge sentence grammaticality, perform well on melody well-formedness judgments. Thus, the mechanisms that process structure in language do not appear to process music, including music syntax.
Affiliation(s)
- Xuanyi Chen
  - Department of Cognitive Sciences, Rice University, TX 77005, United States
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Josef Affourtit
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Rachel Ryskin
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - Department of Cognitive & Information Sciences, University of California, Merced, Merced, CA 95343, United States
- Tamar I Regev
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Samuel Norman-Haignere
  - Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Neuroscience, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
  - Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, United States
- Olessia Jouravlev
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - Department of Cognitive Science, Carleton University, Ottawa, ON, Canada
- Saima Malik-Moraleda
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States
- Hope Kean
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Rosemary Varley
  - Psychology & Language Sciences, UCL, London, WC1N 1PF, United Kingdom
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States
39
Shain C, Paunov A, Chen X, Lipkin B, Fedorenko E. No evidence of theory of mind reasoning in the human language network. Cereb Cortex 2023; 33:6299-6319. [PMID: 36585774 PMCID: PMC10183748 DOI: 10.1093/cercor/bhac505]
Abstract
Language comprehension and the ability to infer others' thoughts (theory of mind [ToM]) are interrelated during development and language use. However, neural evidence that bears on the relationship between language and ToM mechanisms is mixed. Although robust dissociations have been reported in brain disorders, brain activations for contrasts that target language and ToM bear similarities, and some have reported overlap. We take another look at the language-ToM relationship by evaluating the response of the language network, as measured with fMRI, to verbal and nonverbal ToM across 151 participants. Individual-participant analyses reveal that all core language regions respond more strongly when participants read vignettes about false beliefs compared to the control vignettes. However, we show that these differences are largely due to linguistic confounds, and no such effects appear in a nonverbal ToM task. These results argue against cognitive and neural overlap between language processing and ToM. In exploratory analyses, we find responses to social processing in the "periphery" of the language network-right-hemisphere homotopes of core language areas and areas in bilateral angular gyri-but these responses are not selectively ToM-related and may reflect general visual semantic processing.
Affiliation(s)
- Cory Shain
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT Bldg 46-3160, 77 Massachusetts Avenue, Cambridge, MA 02139, United States
- Alexander Paunov
  - INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, Gif sur Yvette 91191, France
- Xuanyi Chen
  - Department of Cognitive Sciences, Rice University, 6100 Main Street, Houston, TX 77005, United States
- Benjamin Lipkin
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT Bldg 46-3160, 77 Massachusetts Avenue, Cambridge, MA 02139, United States
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, MIT Bldg 46-3160, 77 Massachusetts Avenue, Cambridge, MA 02139, United States
  - Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, 260 Longwood Avenue, TMEC 333, Boston, MA 02115, United States
40
Kauf C, Tuckute G, Levy R, Andreas J, Fedorenko E. Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network. bioRxiv 2023:2023.05.05.539646 [Preprint]. [PMID: 37205405 PMCID: PMC10187317 DOI: 10.1101/2023.05.05.539646]
Abstract
Representations from artificial neural network (ANN) language models have been shown to predict human brain activity in the language network. To understand what aspects of linguistic stimuli contribute to ANN-to-brain similarity, we used an fMRI dataset of responses to n=627 naturalistic English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which ANN representations were extracted. In particular, we i) perturbed sentences' word order, ii) removed different subsets of words, or iii) replaced sentences with other sentences of varying semantic similarity. We found that the lexical semantic content of the sentence (largely carried by content words) rather than the sentence's syntactic form (conveyed via word order or function words) is primarily responsible for the ANN-to-brain similarity. In follow-up analyses, we found that perturbation manipulations that adversely affect brain predictivity also lead to more divergent representations in the ANN's embedding space and decrease the ANN's ability to predict upcoming tokens in those stimuli. Further, results are robust to whether the mapping model is trained on intact or perturbed stimuli, and whether the ANN sentence representations are conditioned on the same linguistic context that humans saw. The critical result-that lexical-semantic content is the main contributor to the similarity between ANN representations and neural ones-aligns with the idea that the goal of the human language system is to extract meaning from linguistic strings. Finally, this work highlights the strength of systematic experimental manipulations for evaluating how close we are to accurate and generalizable models of the human language network.
Affiliation(s)
- Carina Kauf
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
  - McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Greta Tuckute
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
  - McGovern Institute for Brain Research, Massachusetts Institute of Technology
- Roger Levy
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
- Jacob Andreas
  - Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
  - McGovern Institute for Brain Research, Massachusetts Institute of Technology
  - Program in Speech and Hearing Bioscience and Technology, Harvard University
41
Hauptman M, Blank I, Fedorenko E. Non-literal language processing is jointly supported by the language and theory of mind networks: Evidence from a novel meta-analytic fMRI approach. Cortex 2023; 162:96-114. [PMID: 37023480 PMCID: PMC10210011 DOI: 10.1016/j.cortex.2023.01.013]
Abstract
Going beyond the literal meaning of language is key to communicative success. However, the mechanisms that support non-literal inferences remain debated. Using a novel meta-analytic approach, we evaluate the contribution of linguistic, social-cognitive, and executive mechanisms to non-literal interpretation. We identified 74 fMRI experiments (n = 1,430 participants) from 2001 to 2021 that contrasted non-literal language comprehension with a literal control condition, spanning ten phenomena (e.g., metaphor, irony, indirect speech). Applying the activation likelihood estimation approach to the 825 activation peaks yielded six left-lateralized clusters. We then evaluated the locations of both the individual-study peaks and the clusters against probabilistic functional atlases (cf. anatomical locations, as is typically done) for three candidate brain networks-the language-selective network (Fedorenko, Behr, & Kanwisher, 2011), which supports language processing, the Theory of Mind (ToM) network (Saxe & Kanwisher, 2003), which supports social inferences, and the domain-general Multiple-Demand (MD) network (Duncan, 2010), which supports executive control. These atlases were created by overlaying individual activation maps of participants who performed robust and extensively validated 'localizer' tasks that selectively target each network in question (n = 806 for language; n = 198 for ToM; n = 691 for MD). We found that both the individual-study peaks and the ALE clusters fell primarily within the language network and the ToM network. These results suggest that non-literal processing is supported by both i) mechanisms that process literal linguistic meaning, and ii) mechanisms that support general social inference. They thus undermine a strong divide between literal and non-literal aspects of language and challenge the claim that non-literal processing requires additional executive resources.
Affiliation(s)
- Miriam Hauptman
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, USA
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, USA
  - Department of Psychological & Brain Sciences, Johns Hopkins University, Baltimore, MD 21218, USA
- Idan Blank
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, USA
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, USA
  - Department of Psychology, UCLA, Los Angeles, CA 90095, USA
  - Department of Linguistics, UCLA, Los Angeles, CA 90095, USA
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, USA
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, USA
  - Program in Speech and Hearing in Bioscience and Technology, Harvard University, Boston, MA 02114, USA
42
Woolnough O, Donos C, Murphy E, Rollo PS, Roccaforte ZJ, Dehaene S, Tandon N. Spatiotemporally distributed frontotemporal networks for sentence reading. Proc Natl Acad Sci U S A 2023; 120:e2300252120. [PMID: 37068244 PMCID: PMC10151604 DOI: 10.1073/pnas.2300252120]
Abstract
Reading a sentence entails integrating the meanings of individual words to infer more complex, higher-order meaning. This highly rapid and complex human behavior is known to engage the inferior frontal gyrus (IFG) and middle temporal gyrus (MTG) in the language-dominant hemisphere, yet whether there are distinct contributions of these regions to sentence reading is still unclear. To probe these neural spatiotemporal dynamics, we used direct intracranial recordings to measure neural activity while reading sentences, meaning-deficient Jabberwocky sentences, and lists of words or pseudowords. We isolated two functionally and spatiotemporally distinct frontotemporal networks, each sensitive to distinct aspects of word and sentence composition. The first distributed network engages the IFG and MTG, with IFG activity preceding MTG. Activity in this network ramps up over the duration of a sentence and is reduced or absent during Jabberwocky and word lists, implying its role in the derivation of sentence-level meaning. The second network engages the superior temporal gyrus and the IFG, with temporal responses leading those in frontal lobe, and shows greater activation for each word in a list than those in sentences, suggesting that sentential context enables greater efficiency in the lexical and/or phonological processing of individual words. These adjacent, yet spatiotemporally dissociable neural mechanisms for word- and sentence-level processes shed light on the richly layered semantic networks that enable us to fluently read. These results imply distributed, dynamic computation across the frontotemporal language network rather than a clear dichotomy between the contributions of frontal and temporal structures.
Affiliation(s)
- Oscar Woolnough
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030
- Cristian Donos
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Faculty of Physics, University of Bucharest, 050663 Bucharest, Romania
- Elliot Murphy
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030
- Patrick S. Rollo
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030
- Zachary J. Roccaforte
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030
- Stanislas Dehaene
  - Cognitive Neuroimaging Unit, Université Paris-Saclay, INSERM, CEA, NeuroSpin Center, 91191 Gif-sur-Yvette, France
  - Collège de France, 75005 Paris, France
- Nitin Tandon
  - Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX 77030
  - Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030
  - Memorial Hermann Hospital, Texas Medical Center, Houston, TX 77030
43
Weiss KL, Hawelka S, Hutzler F, Schuster S. Stronger functional connectivity during reading contextually predictable words in slow readers. Sci Rep 2023; 13:5989. [PMID: 37045976 PMCID: PMC10097649 DOI: 10.1038/s41598-023-33231-x]
Abstract
The effect of word predictability is well-documented in terms of local brain activation, but less is known about the functional connectivity among those regions associated with processing predictable words. Evidence from eye movement studies showed that the effect is much more pronounced in slow than in fast readers, suggesting that speed-impaired readers rely more on sentence context to compensate for their difficulties with visual word recognition. The present study aimed to investigate differences in functional connectivity of fast and slow readers within core regions associated with processing predictable words. We hypothesize a stronger synchronization between higher-order language areas, such as the left middle temporal (MTG) and inferior frontal gyrus (IFG), and the left occipito-temporal cortex (OTC) in slow readers. Our results show that slow readers exhibit more functional correlations among these connections, especially between the left IFG and OTC. We interpret our results in terms of the lexical quality hypothesis, which postulates a stronger involvement of semantics on orthographic processing in (speed-)impaired readers.
Affiliation(s)
- Stefan Hawelka
  - Department of Psychology, Centre for Cognitive Neuroscience, Paris-Lodron-University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria
- Florian Hutzler
  - Department of Psychology, Centre for Cognitive Neuroscience, Paris-Lodron-University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria
- Sarah Schuster
  - Department of Psychology, Centre for Cognitive Neuroscience, Paris-Lodron-University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria
44
Hu J, Small H, Kean H, Takahashi A, Zekelman L, Kleinman D, Ryan E, Nieto-Castañón A, Ferreira V, Fedorenko E. Precision fMRI reveals that the language-selective network supports both phrase-structure building and lexical access during language production. Cereb Cortex 2023; 33:4384-4404. [PMID: 36130104 PMCID: PMC10110436 DOI: 10.1093/cercor/bhac350]
Abstract
A fronto-temporal brain network has long been implicated in language comprehension. However, this network's role in language production remains debated. In particular, it remains unclear whether all or only some language regions contribute to production, and which aspects of production these regions support. Across 3 functional magnetic resonance imaging experiments that rely on robust individual-subject analyses, we characterize the language network's response to high-level production demands. We report 3 novel results. First, sentence production, spoken or typed, elicits a strong response throughout the language network. Second, the language network responds to both phrase-structure building and lexical access demands, although the response to phrase-structure building is stronger and more spatially extensive, present in every language region. Finally, contra some proposals, we find no evidence of brain regions-within or outside the language network-that selectively support phrase-structure building in production relative to comprehension. Instead, all language regions respond more strongly during production than comprehension, suggesting that production incurs a greater cost for the language network. Together, these results align with the idea that language comprehension and production draw on the same knowledge representations, which are stored in a distributed manner within the language-selective network and are used to both interpret and generate linguistic utterances.
Affiliation(s)
- Jennifer Hu
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
- Hannah Small
  - Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21218, United States
- Hope Kean
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Atsushi Takahashi
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
- Leo Zekelman
  - Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States
- Elizabeth Ryan
  - St. George’s Medical School, St. George’s University, Grenada, West Indies
- Alfonso Nieto-Castañón
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA 02215, United States
- Victor Ferreira
  - Department of Psychology, UCSD, La Jolla, CA 92093, United States
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States
  - McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States
  - Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States
45
Caucheteux C, Gramfort A, King JR. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nat Hum Behav 2023; 7:430-441. [PMID: 36864133 PMCID: PMC10038805 DOI: 10.1038/s41562-022-01516-2]
Abstract
Considerable progress has recently been made in natural language processing: deep learning algorithms are increasingly able to generate, summarize, translate and classify texts. Yet, these language models still fail to match the language abilities of humans. Predictive coding theory offers a tentative explanation for this discrepancy: while language models are optimized to predict nearby words, the human brain would continuously predict a hierarchy of representations that spans multiple timescales. To test this hypothesis, we analysed the functional magnetic resonance imaging brain signals of 304 participants listening to short stories. First, we confirmed that the activations of modern language models linearly map onto the brain responses to speech. Second, we showed that enhancing these algorithms with predictions that span multiple timescales improves this brain mapping. Finally, we showed that these predictions are organized hierarchically: frontoparietal cortices predict higher-level, longer-range and more contextual representations than temporal cortices. Overall, these results strengthen the role of hierarchical predictive coding in language processing and illustrate how the synergy between neuroscience and artificial intelligence can unravel the computational bases of human cognition.
Affiliation(s)
- Charlotte Caucheteux
  - Meta AI, Paris, France
  - Université Paris-Saclay, Inria, Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Paris, France
- Alexandre Gramfort
  - Meta AI, Paris, France
  - Université Paris-Saclay, Inria, Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Paris, France
- Jean-Rémi King
  - Meta AI, Paris, France
  - Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
46
Mak M, Faber M, Willems RM. Different kinds of simulation during literary reading: Insights from a combined fMRI and eye-tracking study. Cortex 2023; 162:115-135. [PMID: 37023479 DOI: 10.1016/j.cortex.2023.01.014]
Abstract
Mental simulation is an important aspect of narrative reading. In a previous study, we found that gaze durations are differentially impacted by different kinds of mental simulation. Motor simulation, perceptual simulation, and mentalizing as elicited by literary short stories influenced eye movements in distinguishable ways (Mak & Willems, 2019). In the current study, we investigated the existence of a common neural locus for these different kinds of simulation. We additionally investigated whether individual differences during reading, as indexed by the eye movements, are reflected in domain-specific activations in the brain. We found a variety of brain areas activated by simulation-eliciting content, both modality-specific brain areas and a general simulation area. Individual variation in percent signal change in activated areas was related to measures of story appreciation as well as personal characteristics (i.e., transportability, perspective taking). Taken together, these findings suggest that mental simulation is supported by both domain-specific processes grounded in previous experiences, and by the neural mechanisms that underlie higher-order language processing (e.g., situation model building, event indexing, integration).
Affiliation(s)
- Marloes Mak
  - Centre for Language Studies, Radboud University Nijmegen, Erasmusplein 1, 6525 HT Nijmegen, the Netherlands
- Myrthe Faber
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Kapittelweg 29, 6525 EN Nijmegen, the Netherlands
  - Department of Communication and Cognition, Tilburg Center for Cognition and Communication, Tilburg University, Warandelaan 2, 5037 AB Tilburg, the Netherlands
- Roel M Willems
  - Centre for Language Studies, Radboud University Nijmegen, Erasmusplein 1, 6525 HT Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Kapittelweg 29, 6525 EN Nijmegen, the Netherlands
  - Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, the Netherlands
47
Gillis M, Kries J, Vandermosten M, Francart T. Neural tracking of linguistic and acoustic speech representations decreases with advancing age. Neuroimage 2023; 267:119841. [PMID: 36584758 PMCID: PMC9878439 DOI: 10.1016/j.neuroimage.2022.119841]
Abstract
BACKGROUND Older adults process speech differently, but it is not yet clear how aging affects different levels of processing natural, continuous speech, both in terms of bottom-up acoustic analysis and top-down generation of linguistic-based predictions. We studied natural speech processing across the adult lifespan via electroencephalography (EEG) measurements of neural tracking. GOALS Our goals were to analyze the unique contribution of linguistic speech processing across the adult lifespan using natural speech, while controlling for the influence of acoustic processing. Moreover, we also studied acoustic processing across age, focusing on changes in spatial and temporal activation patterns in response to natural speech across the lifespan. METHODS 52 normal-hearing adults between 17 and 82 years of age listened to a naturally spoken story while the EEG signal was recorded. We investigated the effect of age on acoustic and linguistic processing of speech. Because age correlated with hearing capacity and measures of cognition, we investigated whether the observed age effect is mediated by these factors. Furthermore, we investigated whether there is an effect of age on hemisphere lateralization and on spatiotemporal patterns of the neural responses. RESULTS Our EEG results showed that linguistic speech processing declines with advancing age. Moreover, as age increased, the neural response latency to certain aspects of linguistic speech processing increased. Acoustic neural tracking (NT) also decreased with increasing age, which is at odds with the literature. In contrast to linguistic processing, older subjects showed shorter latencies for early acoustic responses to speech. No evidence was found for hemispheric lateralization in either younger or older adults during linguistic speech processing. Most of the observed aging effects on acoustic and linguistic processing were not explained by age-related decline in hearing capacity or cognition. However, our results suggest that the decrease in word-level linguistic neural tracking with advancing age is partially due to an age-related decline in cognition rather than a robust effect of age alone. CONCLUSION Spatial and temporal characteristics of the neural responses to continuous speech change across the adult lifespan for both acoustic and linguistic speech processing. These changes may be traces of structural and/or functional change that occurs with advancing age.
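Neural tracking of the kind measured above is commonly computed with a temporal response function (TRF): a lagged linear model that predicts the EEG from speech features, with the correlation between predicted and measured EEG taken as the tracking score. The numpy sketch below illustrates the idea on synthetic data; the sampling rate, lag window, regularization, and signals are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 64                       # sampling rate in Hz (illustrative)
n = fs * 60                   # one minute of signal
envelope = rng.standard_normal(n)            # stand-in for a speech envelope
# synthetic EEG: a short causal response to the envelope plus noise
eeg = (0.5 * np.roll(envelope, 1) + 0.3 * np.roll(envelope, 2)
       + 0.5 * rng.standard_normal(n))

# lagged design matrix covering roughly 0-250 ms of response
lags = np.arange(int(0.25 * fs))
X = np.stack([np.roll(envelope, lag) for lag in lags], axis=1)
X[: lags.max()] = 0           # discard samples wrapped around by np.roll

# ridge-regularized TRF weights (closed form)
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)

# neural tracking = correlation between predicted and measured EEG
tracking = np.corrcoef(X @ w, eeg)[0, 1]
```

In practice such models are fit with cross-validation and with separate acoustic and linguistic feature sets, so that the unique linguistic contribution can be isolated; it is this tracking correlation that the study reports as declining with age.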
Affiliation(s)
- Marlies Gillis
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Jill Kries
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Maaike Vandermosten
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Tom Francart
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
48
Hollenstein N, Tröndle M, Plomecka M, Kiegeland S, Özyurt Y, Jäger LA, Langer N. The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data. Front Psychol 2023; 13:1028824. [PMID: 36710838 PMCID: PMC9878684 DOI: 10.3389/fpsyg.2022.1028824]
Abstract
We present a new machine learning benchmark for reading task classification with the goal of advancing EEG and eye-tracking research at the intersection between computational language processing and cognitive neuroscience. The benchmark task consists of a cross-subject classification to distinguish between two reading paradigms: normal reading and task-specific reading. The data for the benchmark is based on the Zurich Cognitive Language Processing Corpus (ZuCo 2.0), which provides simultaneous eye-tracking and EEG signals from natural reading of English sentences. The training dataset is publicly available, and we present a newly recorded hidden test set. We provide multiple solid baseline methods for this task and discuss future improvements. We release our code and provide an easy-to-use interface to evaluate new approaches with an accompanying public leaderboard: www.zuco-benchmark.com.
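The cross-subject setup described above can be illustrated with a leave-one-subject-out evaluation: train on all but one participant, test on the held-out one, so the classifier must generalize across individual physiology. The sketch below uses only numpy and fully synthetic features; the feature names, dataset shapes, and the nearest-centroid baseline are illustrative assumptions, not the benchmark's actual data or baselines (those live at www.zuco-benchmark.com).

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, trials_per_class, n_feat = 6, 20, 4

# synthetic per-trial features (e.g., fixation durations, EEG band power),
# with a subject-specific offset to mimic cross-subject variability
X, y, subj = [], [], []
for s in range(n_subjects):
    offset = rng.normal(0.0, 0.5, size=n_feat)
    for label in (0, 1):          # 0 = normal reading, 1 = task-specific reading
        feats = rng.normal(label, 1.0, size=(trials_per_class, n_feat)) + offset
        X.append(feats)
        y += [label] * trials_per_class
        subj += [s] * trials_per_class
X, y, subj = np.vstack(X), np.array(y), np.array(subj)

# leave-one-subject-out evaluation with a nearest-centroid baseline
accs = []
for s in range(n_subjects):
    train, test = subj != s, subj == s
    centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(X[test, None, :] - centroids[None, :, :], axis=2)
    pred = dists.argmin(axis=1)
    accs.append((pred == y[test]).mean())
mean_acc = float(np.mean(accs))
```

Because every test trial comes from a subject never seen in training, the score reflects generalization across readers rather than memorization of individual signal idiosyncrasies, which is the point of the benchmark's hidden test set.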
Affiliation(s)
- Nora Hollenstein
- Center for Language Technology, University of Copenhagen, Copenhagen, Denmark
- Marius Tröndle
- Department of Psychology, University of Zurich, Zurich, Switzerland
- Martyna Plomecka
- Department of Psychology, University of Zurich, Zurich, Switzerland
- Lena A. Jäger
- Department of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Department of Computer Science, University of Potsdam, Potsdam, Germany
- Nicolas Langer
- Department of Psychology, University of Zurich, Zurich, Switzerland
49
Yoshioka A, Tanabe HC, Nakagawa E, Sumiya M, Koike T, Sadato N. The Role of the Left Inferior Frontal Gyrus in Introspection during Verbal Communication. Brain Sci 2023; 13:brainsci13010111. [PMID: 36672092 PMCID: PMC9856826 DOI: 10.3390/brainsci13010111]
Abstract
Conversation enables the sharing of our subjective experiences through verbalizing introspected thoughts and feelings. The mentalizing network represents introspection, and successful conversation is characterized by alignment through imitation mediated by the mirror neuron system (MNS). Therefore, we hypothesized that the interaction between the mentalizing network and MNS mediates the conversational exchange of introspection. To test this, we performed hyperscanning functional magnetic resonance imaging during structured real-time conversations between 19 pairs of healthy participants. The participants first evaluated their preference for and familiarity with a presented object and then disclosed it. The control was the object feature identification task. When contrasted with the control, the preference/familiarity evaluation phase activated the dorso-medial prefrontal cortex, anterior cingulate cortex, precuneus, left hippocampus, right cerebellum, and orbital portion of the left inferior frontal gyrus (IFG), which represents introspection. The left IFG was activated when the two participants' statements of introspection were mismatched during the disclosure. Disclosing introspection enhanced the functional connectivity of the left IFG with the bilateral superior temporal gyrus and primary motor cortex, representing the auditory MNS. Thus, the mentalizing system and MNS are hierarchically linked in the left IFG during a conversation, allowing for the sharing of introspection of the self and others.
Affiliation(s)
- Ayumi Yoshioka
- Department of Cognitive and Psychological Sciences, Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- Japan Society for the Promotion of Science, Tokyo 102-0083, Japan
- Division of Cerebral Integration, Department of System Neuroscience, National Institute for Physiological Sciences (NIPS), Okazaki 444-8585, Japan
- Research Organization of Science and Technology, Ritsumeikan University, Kusatsu 525-8577, Japan
- Hiroki C. Tanabe
- Department of Cognitive and Psychological Sciences, Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- Correspondence: (H.C.T.); (N.S.); Tel.: +81-52-789-2256 (H.C.T.); +81-564-55-7841 (N.S.); Fax: +81-52-789-2256 (H.C.T.); +81-564-55-7843 (N.S.)
- Eri Nakagawa
- Division of Cerebral Integration, Department of System Neuroscience, National Institute for Physiological Sciences (NIPS), Okazaki 444-8585, Japan
- Motofumi Sumiya
- Japan Society for the Promotion of Science, Tokyo 102-0083, Japan
- Division of Cerebral Integration, Department of System Neuroscience, National Institute for Physiological Sciences (NIPS), Okazaki 444-8585, Japan
- Takahiko Koike
- Division of Cerebral Integration, Department of System Neuroscience, National Institute for Physiological Sciences (NIPS), Okazaki 444-8585, Japan
- Norihiro Sadato
- Division of Cerebral Integration, Department of System Neuroscience, National Institute for Physiological Sciences (NIPS), Okazaki 444-8585, Japan
- Research Organization of Science and Technology, Ritsumeikan University, Kusatsu 525-8577, Japan
- Correspondence: (H.C.T.); (N.S.); Tel.: +81-52-789-2256 (H.C.T.); +81-564-55-7841 (N.S.); Fax: +81-52-789-2256 (H.C.T.); +81-564-55-7843 (N.S.)
50
MacGregor LJ, Gilbert RA, Balewski Z, Mitchell DJ, Erzinçlioğlu SW, Rodd JM, Duncan J, Fedorenko E, Davis MH. Causal Contributions of the Domain-General (Multiple Demand) and the Language-Selective Brain Networks to Perceptual and Semantic Challenges in Speech Comprehension. Neurobiology of Language 2022; 3:665-698. [PMID: 36742011 PMCID: PMC9893226 DOI: 10.1162/nol_a_00081]
Abstract
Listening to spoken language engages domain-general multiple demand (MD; frontoparietal) regions of the human brain, in addition to domain-selective (frontotemporal) language regions, particularly when comprehension is challenging. However, there is limited evidence that the MD network makes a functional contribution to core aspects of understanding language. In a behavioural study of volunteers (n = 19) with chronic brain lesions, but without aphasia, we assessed the causal role of these networks in perceiving, comprehending, and adapting to spoken sentences made more challenging by acoustic degradation or lexico-semantic ambiguity. We measured perception of and adaptation to acoustically degraded (noise-vocoded) sentences with a word report task before and after training. Participants with greater damage to MD but not language regions required more vocoder channels to achieve 50% word report, indicating impaired perception. Perception improved following training, reflecting adaptation to acoustic degradation, but adaptation was unrelated to lesion location or extent. Comprehension of spoken sentences with semantically ambiguous words was measured with a sentence coherence judgement task. Accuracy was high and unaffected by lesion location or extent. Adaptation to semantic ambiguity was measured in a subsequent word association task, which showed that availability of lower-frequency meanings of ambiguous words increased following their comprehension (word-meaning priming). Word-meaning priming was reduced for participants with greater damage to language but not MD regions. Language and MD networks make dissociable contributions to challenging speech comprehension: Using recent experience to update word meaning preferences depends on language-selective regions, whereas the domain-general MD network plays a causal role in reporting words from degraded speech.
Affiliation(s)
- Lucy J. MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Rebecca A. Gilbert
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Zuzanna Balewski
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA
- Daniel J. Mitchell
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Jennifer M. Rodd
- Psychology and Language Sciences, University College London, London, UK
- John Duncan
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA
- Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK