1
Mahowald K, Ivanova AA, Blank IA, Kanwisher N, Tenenbaum JB, Fedorenko E. Dissociating language and thought in large language models. Trends Cogn Sci 2024; 28:517-540. [PMID: 38508911 DOI: 10.1016/j.tics.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 01/31/2024] [Accepted: 01/31/2024] [Indexed: 03/22/2024]
Abstract
Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of separate mechanisms specialized for formal versus functional linguistic competence.
2
Yu L, Dugan P, Doyle W, Devinsky O, Friedman D, Flinker A. A left-lateralized dorsolateral prefrontal network for naming. bioRxiv: The Preprint Server for Biology 2024:2024.05.15.594403. [PMID: 38798614 PMCID: PMC11118423 DOI: 10.1101/2024.05.15.594403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The ability to connect the form and meaning of a concept, known as word retrieval, is fundamental to human communication. While various input modalities could lead to identical word retrieval, the exact neural dynamics supporting this convergence relevant to daily auditory discourse remain poorly understood. Here, we leveraged neurosurgical electrocorticographic (ECoG) recordings from 48 patients and dissociated two key language networks that highly overlap in time and space integral to word retrieval. Using unsupervised temporal clustering techniques, we found a semantic processing network located in the middle and inferior frontal gyri. This network was distinct from an articulatory planning network in the inferior frontal and precentral gyri, which was agnostic to input modalities. Functionally, we confirmed that the semantic processing network encodes word surprisal during sentence perception. Our findings characterize how humans integrate ongoing auditory semantic information over time, a critical linguistic function from passive comprehension to daily discourse.
Affiliation(s)
- Leyao Yu
  - Department of Biomedical Engineering, New York University, New York, 10016, New York, the United States
  - Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Patricia Dugan
  - Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Werner Doyle
  - Department of Neurosurgery, School of Medicine, New York University, New York, 10016, New York, the United States
- Orrin Devinsky
  - Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Daniel Friedman
  - Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
- Adeen Flinker
  - Department of Biomedical Engineering, New York University, New York, 10016, New York, the United States
  - Department of Neurology, School of Medicine, New York University, New York, 10016, New York, the United States
3
Lopopolo A, Fedorenko E, Levy R, Rabovsky M. Cognitive Computational Neuroscience of Language: Using Computational Models to Investigate Language Processing in the Brain. Neurobiology of Language (Cambridge, Mass.) 2024; 5:1-6. [PMID: 38645621 PMCID: PMC11025655 DOI: 10.1162/nol_e_00131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Affiliation(s)
- Evelina Fedorenko
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  - McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Roger Levy
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Milena Rabovsky
  - Department of Psychology, University of Potsdam, Potsdam, Germany
4
Sugimoto Y, Yoshida R, Jeong H, Koizumi M, Brennan JR, Oseki Y. Localizing Syntactic Composition with Left-Corner Recurrent Neural Network Grammars. Neurobiology of Language (Cambridge, Mass.) 2024; 5:201-224. [PMID: 38645619 PMCID: PMC11025653 DOI: 10.1162/nol_a_00118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 07/24/2023] [Indexed: 04/23/2024]
Abstract
In computational neurolinguistics, it has been demonstrated that hierarchical models such as recurrent neural network grammars (RNNGs), which jointly generate word sequences and their syntactic structures via the syntactic composition, better explained human brain activity than sequential models such as long short-term memory networks (LSTMs). However, the vanilla RNNG has employed the top-down parsing strategy, which has been pointed out in the psycholinguistics literature as suboptimal especially for head-final/left-branching languages, and alternatively the left-corner parsing strategy has been proposed as the psychologically plausible parsing strategy. In this article, building on this line of inquiry, we investigate not only whether hierarchical models like RNNGs better explain human brain activity than sequential models like LSTMs, but also which parsing strategy is more neurobiologically plausible, by developing a novel fMRI corpus where participants read newspaper articles in a head-final/left-branching language, namely Japanese, through the naturalistic fMRI experiment. The results revealed that left-corner RNNGs outperformed both LSTMs and top-down RNNGs in the left inferior frontal and temporal-parietal regions, suggesting that there are certain brain regions that localize the syntactic composition with the left-corner parsing strategy.
Affiliation(s)
- Yushi Sugimoto
  - Graduate School of Arts and Sciences, University of Tokyo, Tokyo, Japan
- Ryo Yoshida
  - Graduate School of Arts and Sciences, University of Tokyo, Tokyo, Japan
- Hyeonjeong Jeong
  - Graduate School of International Cultural Studies, Tohoku University, Sendai, Japan
- Masatoshi Koizumi
  - Department of Linguistics, Graduate School of Arts and Letters, Tohoku University, Sendai, Japan
- Yohei Oseki
  - Graduate School of Arts and Sciences, University of Tokyo, Tokyo, Japan
5
Huber E, Sauppe S, Isasi-Isasmendi A, Bornkessel-Schlesewsky I, Merlo P, Bickel B. Surprisal From Language Models Can Predict ERPs in Processing Predicate-Argument Structures Only if Enriched by an Agent Preference Principle. Neurobiology of Language (Cambridge, Mass.) 2024; 5:167-200. [PMID: 38645615 PMCID: PMC11025647 DOI: 10.1162/nol_a_00121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 08/30/2023] [Indexed: 04/23/2024]
Abstract
Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human processing system relies on no other principles than the general architecture of language models and on sufficient linguistic input. Here, we test this hypothesis on N400 effects observed during the processing of verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, N400 amplitudes and topographies in the region of the verb are best predicted when model-based surprisals are complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has an unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which grammars allow unmarked NPs to be patients, a structural feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human processing profit from incorporating surprisal estimates in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
Affiliation(s)
- Eva Huber
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Sebastian Sauppe
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
  - Department of Psychology, University of Zurich, Zurich, Switzerland
- Arrate Isasi-Isasmendi
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Ina Bornkessel-Schlesewsky
  - Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, Australia
- Paola Merlo
  - Department of Linguistics, University of Geneva, Geneva, Switzerland
  - University Center for Computer Science, University of Geneva, Geneva, Switzerland
- Balthasar Bickel
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
6
Giglio L, Ostarek M, Sharoh D, Hagoort P. Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proc Natl Acad Sci U S A 2024; 121:e2310766121. [PMID: 38442171 PMCID: PMC10945772 DOI: 10.1073/pnas.2310766121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 01/31/2024] [Indexed: 03/07/2024] Open
Abstract
The neural correlates of sentence production are typically studied using task paradigms that differ considerably from the experience of speaking outside of an experimental setting. In this fMRI study, we aimed to gain a better understanding of syntactic processing in spontaneous production versus naturalistic comprehension in three regions of interest (BA44, BA45, and left posterior middle temporal gyrus). A group of participants (n = 16) was asked to speak about the events of an episode of a TV series in the scanner. Another group of participants (n = 36) listened to the spoken recall of a participant from the first group. To model syntactic processing, we extracted word-by-word metrics of phrase-structure building with a top-down and a bottom-up parser that make different hypotheses about the timing of structure building. While the top-down parser anticipates syntactic structure, sometimes before it is obvious to the listener, the bottom-up parser builds syntactic structure in an integratory way after all of the evidence has been presented. In comprehension, neural activity was found to be better modeled by the bottom-up parser, while in production, it was better modeled by the top-down parser. We additionally modeled structure building in production with two strategies that were developed here to make different predictions about the incrementality of structure building during speaking. We found evidence for highly incremental and anticipatory structure building in production, which was confirmed by a converging analysis of the pausing patterns in speech. Overall, this study shows the feasibility of studying the neural dynamics of spontaneous language production.
Affiliation(s)
- Laura Giglio
  - Max Planck Institute for Psycholinguistics, Nijmegen 6525XD, The Netherlands
  - Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525EN, The Netherlands
- Markus Ostarek
  - Max Planck Institute for Psycholinguistics, Nijmegen 6525XD, The Netherlands
- Daniel Sharoh
  - Max Planck Institute for Psycholinguistics, Nijmegen 6525XD, The Netherlands
  - Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525EN, The Netherlands
- Peter Hagoort
  - Max Planck Institute for Psycholinguistics, Nijmegen 6525XD, The Netherlands
  - Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen 6525EN, The Netherlands
7
Stanojević M, Brennan JR, Dunagan D, Steedman M, Hale JT. Modeling Structure-Building in the Brain With CCG Parsing and Large Language Models. Cogn Sci 2023; 47:e13312. [PMID: 37417470 DOI: 10.1111/cogs.13312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 06/07/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023]
Abstract
To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad-coverage tools from natural-language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context-free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory categorial grammars (CCGs) are sufficiently expressive directly compositional models of grammar with flexible constituency that affords incremental interpretation. In this work, we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with functional magnetic resonance imaging (fMRI) while participants listen to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next-word predictability from a transformer neural network language model. Such a comparison reveals unique contributions of CCG structure-building predominantly in the left posterior temporal lobe: CCG-derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure-building are thus separable from predictability during naturalistic listening, and those effects are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds.
Affiliation(s)
- John T Hale
  - Google DeepMind
  - Department of Linguistics, University of Georgia
8
Syntax through the looking glass: A review on two-word linguistic processing across behavioral, neuroimaging and neurostimulation studies. Neurosci Biobehav Rev 2022; 142:104881. [DOI: 10.1016/j.neubiorev.2022.104881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 09/13/2022] [Accepted: 09/15/2022] [Indexed: 11/23/2022]
9
Abstract
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, and using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
10
Heilbron M, Armeni K, Schoffelen JM, Hagoort P, de Lange FP. A hierarchy of linguistic predictions during natural language comprehension. Proc Natl Acad Sci U S A 2022; 119:e2201968119. [PMID: 35921434 DOI: 10.1101/2020.12.03.410399] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 05/21/2023] Open
Abstract
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, and using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.
Affiliation(s)
- Micha Heilbron
  - Donders Institute, Radboud University, 6525 EN Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Kristijan Armeni
  - Donders Institute, Radboud University, 6525 EN Nijmegen, The Netherlands
- Peter Hagoort
  - Donders Institute, Radboud University, 6525 EN Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Floris P de Lange
  - Donders Institute, Radboud University, 6525 EN Nijmegen, The Netherlands
11
Abstract
Speech processing in the human brain is grounded in non-specific auditory processing in the general mammalian brain, but relies on human-specific adaptations for processing speech and language. For this reason, many recent neurophysiological investigations of speech processing have turned to the human brain, with an emphasis on continuous speech. Substantial progress has been made using the phenomenon of "neural speech tracking", in which neurophysiological responses time-lock to the rhythm of auditory (and other) features in continuous speech. One broad category of investigations concerns the extent to which speech tracking measures are related to speech intelligibility, which has clinical applications in addition to its scientific importance. Recent investigations have also focused on disentangling different neural processes that contribute to speech tracking. The two lines of research are closely related, since processing stages throughout auditory cortex contribute to speech comprehension, in addition to subcortical processing and higher order and attentional processes.
Affiliation(s)
- Christian Brodbeck
  - Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
- Jonathan Z. Simon
  - Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, U.S.A
  - Department of Biology, University of Maryland, College Park, Maryland 20742, U.S.A
12
Nieuwland MS, Kazanina N. The Neural Basis of Linguistic Prediction: Introduction to the Special Issue. Neuropsychologia 2020; 146:107532. [PMID: 32553845 DOI: 10.1016/j.neuropsychologia.2020.107532] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Affiliation(s)
- Mante S Nieuwland
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Nijmegen, the Netherlands; Heinrich-Heine-University, Düsseldorf, Germany
- Nina Kazanina
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom; Institute of Cognitive Neuroscience, National Research University Higher School of Economics, Moscow, Russian Federation