1. Nour Eddine S, Brothers T, Wang L, Spratling M, Kuperberg GR. A predictive coding model of the N400. Cognition 2024; 246:105755. [PMID: 38428168 PMCID: PMC10984641 DOI: 10.1016/j.cognition.2024.105755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 02/14/2024] [Accepted: 02/19/2024] [Indexed: 03/03/2024]
Abstract
The N400 event-related component has been widely used to investigate the neural mechanisms underlying real-time language comprehension. However, despite decades of research, there is still no unifying theory that can explain both its temporal dynamics and functional properties. In this work, we show that predictive coding - a biologically plausible algorithm for approximating Bayesian inference - offers a promising framework for characterizing the N400. Using an implemented predictive coding computational model, we demonstrate how the N400 can be formalized as the lexico-semantic prediction error produced as the brain infers meaning from the linguistic form of incoming words. We show that the magnitude of lexico-semantic prediction error mirrors the functional sensitivity of the N400 to various lexical variables, priming, contextual effects, as well as their higher-order interactions. We further show that the dynamics of the predictive coding algorithm provides a natural explanation for the temporal dynamics of the N400, and a biologically plausible link to neural activity. Together, these findings directly situate the N400 within the broader context of predictive coding research. More generally, they raise the possibility that the brain may use the same computational mechanism for inference across linguistic and non-linguistic domains.
Affiliation(s)
- Samer Nour Eddine
  - Department of Psychology and Center for Cognitive Science, Tufts University, United States of America
- Trevor Brothers
  - Department of Psychology and Center for Cognitive Science, Tufts University, United States of America; Department of Psychology, North Carolina A&T, United States of America
- Lin Wang
  - Department of Psychology and Center for Cognitive Science, Tufts University, United States of America; Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, United States of America
- Gina R Kuperberg
  - Department of Psychology and Center for Cognitive Science, Tufts University, United States of America; Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, United States of America
2. Michaelov JA, Bardolph MD, Van Petten CK, Bergen BK, Coulson S. Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects. Neurobiology of Language (Cambridge, Mass.) 2024; 5:107-135. [PMID: 38645623 PMCID: PMC11025652 DOI: 10.1162/nol_a_00105] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/06/2022] [Accepted: 03/24/2023] [Indexed: 04/23/2024]
Abstract
Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the stimulus is semantically similar to its preceding context, or both. We use state-of-the-art machine learning tools to investigate which of these three accounts is best supported by the evidence. GPT-3, a neural language model trained to compute the conditional probability of any word based on the words that precede it, was used to operationalize contextual predictability. In particular, we used an information-theoretic construct known as surprisal (the negative logarithm of the conditional probability). Contextual semantic similarity was operationalized by using two high-quality co-occurrence-derived vector-based meaning representations for words: GloVe and fastText. The cosine between the vector representation of the sentence frame and final word was used to derive contextual cosine similarity estimates. A series of regression models were constructed, where these variables, along with cloze probability and plausibility ratings, were used to predict single trial N400 amplitudes recorded from healthy adults as they read sentences whose final word varied in its predictability, plausibility, and semantic relationship to the likeliest sentence completion. Statistical model comparison indicated GPT-3 surprisal provided the best account of N400 amplitude and suggested that apparently disparate N400 effects of expectancy, plausibility, and contextual semantic similarity can be reduced to variation in the predictability of words. The results are argued to support predictive coding in the human language network.
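The two predictors named in this abstract can be illustrated with a minimal sketch. Surprisal follows the definition given above (the negative logarithm of a conditional probability). The abstract does not say how the sentence-frame vector was built, so the element-wise mean used here is an assumption, and all function names and toy vectors are hypothetical rather than taken from the paper.

```python
import math
from typing import List, Sequence


def surprisal(prob: float, base: float = 2.0) -> float:
    """Negative log of a conditional probability (in bits when base is 2)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log(prob, base)


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of word vectors.

    ASSUMPTION: used here as a stand-in for the sentence-frame
    representation; the abstract does not specify the actual method.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    frame = mean_vector([[1.0, 0.0], [0.0, 1.0]])
    print(surprisal(0.25), cosine_similarity(frame, [1.0, 1.0]))
```

In the study itself, the probabilities came from GPT-3 and the vectors from GloVe and fastText; the sketch only shows the arithmetic applied to such inputs.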
Affiliation(s)
- James A. Michaelov
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Megan D. Bardolph
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Cyma K. Van Petten
  - Department of Psychology, Binghamton University, State University of New York, Binghamton, NY, USA
- Benjamin K. Bergen
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Seana Coulson
  - Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
3. Lopopolo A, Rabovsky M. Tracking Lexical and Semantic Prediction Error Underlying the N400 Using Artificial Neural Network Models of Sentence Processing. Neurobiology of Language (Cambridge, Mass.) 2024; 5:136-166. [PMID: 38645617 PMCID: PMC11025650 DOI: 10.1162/nol_a_00134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 12/18/2023] [Indexed: 04/23/2024]
Abstract
Recent research has shown that the internal dynamics of an artificial neural network model of sentence comprehension displayed a similar pattern to the amplitude of the N400 in several conditions known to modulate this event-related potential. These results led Rabovsky et al. (2018) to suggest that the N400 might reflect change in an implicit predictive representation of meaning corresponding to semantic prediction error. This explanation stands as an alternative to the hypothesis that the N400 reflects lexical prediction error as estimated by word surprisal (Frank et al., 2015). In the present study, we directly model the amplitude of the N400 elicited during naturalistic sentence processing by using as predictor the update of the distributed representation of sentence meaning generated by a sentence gestalt model (McClelland et al., 1989) trained on a large-scale text corpus. This enables a quantitative prediction of N400 amplitudes based on a cognitively motivated model, as well as quantitative comparison of this model to alternative models of the N400. Specifically, we compare the update measure from the sentence gestalt model to surprisal estimated by a comparable language model trained on next-word prediction. Our results suggest that both sentence gestalt update and surprisal predict aspects of N400 amplitudes. Thus, we argue that N400 amplitudes might reflect two distinct but probably closely related sub-processes that contribute to the processing of a sentence.
Affiliation(s)
- Milena Rabovsky
  - Department of Psychology, University of Potsdam, Potsdam, Germany
4. Ryskin R, Nieuwland MS. Prediction during language comprehension: what is next? Trends Cogn Sci 2023; 27:1032-1052. [PMID: 37704456 DOI: 10.1016/j.tics.2023.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 08/03/2023] [Accepted: 08/04/2023] [Indexed: 09/15/2023]
Abstract
Prediction is often regarded as an integral aspect of incremental language comprehension, but little is known about the cognitive architectures and mechanisms that support it. We review studies showing that listeners and readers use all manner of contextual information to generate multifaceted predictions about upcoming input. The nature of these predictions may vary between individuals owing to differences in language experience, among other factors. We then turn to unresolved questions which may guide the search for the underlying mechanisms. (i) Is prediction essential to language processing or an optional strategy? (ii) Are predictions generated from within the language system or by domain-general processes? (iii) What is the relationship between prediction and memory? (iv) Does prediction in comprehension require simulation via the production system? We discuss promising directions for making progress in answering these questions and for developing a mechanistic understanding of prediction in language.
Affiliation(s)
- Rachel Ryskin
  - Department of Cognitive and Information Sciences, University of California Merced, 5200 Lake Road, Merced, CA 95343, USA
- Mante S Nieuwland
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, The Netherlands
5. Carter GA, Nieuwland MS. Predicting Definite and Indefinite Referents During Discourse Comprehension: Evidence from Event-Related Potentials. Cogn Sci 2022; 46:e13092. [PMID: 35122304 PMCID: PMC9286847 DOI: 10.1111/cogs.13092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Revised: 10/10/2021] [Accepted: 12/14/2021] [Indexed: 11/30/2022]
Abstract
Linguistic predictions may be generated from and evaluated against a representation of events and referents described in the discourse. Compatible with this idea, recent work shows that predictions about novel noun phrases include their definiteness. In the current follow‐up study, we ask whether people engage similar prediction‐related processes for definite and indefinite referents. This question is relevant for linguistic theories that imply a processing difference between definite and indefinite noun phrases, typically because definiteness is thought to require a uniquely identifiable referent in the discourse. We addressed this question in an event‐related potential (ERP) study (N = 48) with preregistration of data acquisition, preprocessing, and Bayesian analysis. Participants read Dutch mini‐stories with a definite or indefinite novel noun phrase (e.g., “het/een huis,” the/a house), wherein (in)definiteness of the article was either expected or unexpected and the noun was always strongly expected. Unexpected articles elicited enhanced N400s, but unexpectedly indefinite articles also elicited a positive ERP effect at frontal channels compared to expectedly indefinite articles. We tentatively link this effect to an antiuniqueness violation, which may force people to introduce a new referent over and above the already anticipated one. Interestingly, expectedly definite nouns elicited larger N400s than unexpectedly definite nouns (replicating a previous surprising finding) and indefinite nouns. Although the exact nature of these noun effects remains unknown, expectedly definite nouns may have triggered the strongest semantic activation because they alone refer to specific and concrete referents. In sum, results from both the articles and nouns clearly demonstrate that definiteness marking has a rapid effect on processing, counter to recent claims regarding definiteness processing.
Affiliation(s)
- Georgia-Ann Carter
  - Max Planck Institute for Psycholinguistics, Nijmegen; ILCC, School of Informatics, University of Edinburgh, Edinburgh
- Mante S Nieuwland
  - Max Planck Institute for Psycholinguistics, Nijmegen; Donders Institute for Brain, Cognition and Behaviour, Nijmegen
6. Nour Eddine S, Brothers T, Kuperberg GR. The N400 in silico: A review of computational models. Psychology of Learning and Motivation 2022. [DOI: 10.1016/bs.plm.2022.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
7. Hörberg T, Jaeger TF. A Rational Model of Incremental Argument Interpretation: The Comprehension of Swedish Transitive Clauses. Front Psychol 2021; 12:674202. [PMID: 34721134 PMCID: PMC8554243 DOI: 10.3389/fpsyg.2021.674202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2021] [Accepted: 09/21/2021] [Indexed: 11/16/2022]
Abstract
A central component of sentence understanding is verb-argument interpretation, determining how the referents in the sentence are related to the events or states expressed by the verb. Previous work has found that comprehenders change their argument interpretations incrementally as the sentence unfolds, based on morphosyntactic (e.g., case, agreement), lexico-semantic (e.g., animacy, verb-argument fit), and discourse cues (e.g., givenness). However, it is still unknown whether these cues have a privileged role in language processing, or whether their effects on argument interpretation originate in implicit expectations based on the joint distribution of these cues with argument assignments experienced in previous language input. We compare the former, linguistic account against the latter, expectation-based account, using data from production and comprehension of transitive clauses in Swedish. Based on a large corpus of Swedish, we develop a rational (Bayesian) model of incremental argument interpretation. This model predicts the processing difficulty experienced at different points in the sentence as a function of the Bayesian surprise associated with changes in expectations over possible argument interpretations. We then test the model against reading times from a self-paced reading experiment on Swedish. We find Bayesian surprise to be a significant predictor of reading times, complementing effects of word surprisal. Bayesian surprise also captures the qualitative effects of morpho-syntactic and lexico-semantic cues. Additional model comparisons find that it—with a single degree of freedom—captures much, if not all, of the effects associated with these cues. This suggests that the effects of form- and meaning-based cues to argument interpretation are mediated through expectation-based processing.
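As a rough illustration of the quantity this abstract refers to, Bayesian surprise is commonly defined as the Kullback-Leibler divergence from the prior to the posterior distribution. The sketch below assumes that common definition; the abstract does not give the paper's exact formulation, and the function name and the two-interpretation example (subject-first versus object-first) are hypothetical.

```python
import math
from typing import Sequence


def bayesian_surprise(prior: Sequence[float],
                      posterior: Sequence[float],
                      base: float = 2.0) -> float:
    """KL(posterior || prior) over a discrete set of interpretations.

    ASSUMPTION: this is the common textbook definition of Bayesian
    surprise, not necessarily the exact quantity used in the paper.
    """
    if len(prior) != len(posterior):
        raise ValueError("distributions must have the same length")
    for dist in (prior, posterior):
        if any(p < 0.0 for p in dist):
            raise ValueError("probabilities must be non-negative")
        if not math.isclose(sum(dist), 1.0, abs_tol=1e-9):
            raise ValueError("each distribution must sum to 1")
    total = 0.0
    for post, pri in zip(posterior, prior):
        if post == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if pri == 0.0:
            raise ValueError("posterior has mass where the prior has none")
        total += post * math.log(post / pri, base)
    return total


if __name__ == "__main__":
    # Hypothetical beliefs over [subject-first, object-first] before and
    # after a disambiguating cue; the numbers are illustrative only.
    print(bayesian_surprise([0.5, 0.5], [0.9, 0.1]))
```

Under this definition the value is zero when a word leaves the beliefs unchanged and grows as the cue shifts probability toward an interpretation that was previously unlikely.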
Affiliation(s)
- Thomas Hörberg
  - Department of Linguistics, Stockholm University, Stockholm, Sweden; Department of Computational Science and Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- T Florian Jaeger
  - Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, United States; Department of Computer Science, University of Rochester, Rochester, NY, United States
8. Ryskin R, Stearns L, Bergen L, Eddy M, Fedorenko E, Gibson E. An ERP index of real-time error correction within a noisy-channel framework of human communication. Neuropsychologia 2021; 158:107855. [PMID: 33865848 DOI: 10.1016/j.neuropsychologia.2021.107855] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/14/2020] [Revised: 03/31/2021] [Accepted: 04/06/2021] [Indexed: 10/21/2022]
Abstract
Recent evidence suggests that language processing is well-adapted to noise in the input (e.g., spelling or speech errors, misreading or mishearing) and that comprehenders readily correct the input via rational inference over possible intended sentences given probable noise corruptions. In the current study, we probed the processing of noisy linguistic input, asking whether well-studied ERP components may serve as useful indices of this inferential process. In particular, we examined sentences where semantic violations could be attributed to noise (for example, "The storyteller could turn any incident into an amusing antidote", where the implausible word "antidote" is orthographically and phonologically close to the intended "anecdote"). We found that the processing of such sentences, in which the probability that the message was corrupted by noise exceeds the probability that it was produced intentionally and perceived accurately, was associated with a reduced (less negative) N400 effect and an increased P600 effect, compared to semantic violations which are unlikely to be attributed to noise ("The storyteller could turn any incident into an amusing hearse"). Further, the magnitudes of these ERP effects were correlated with the probability that the comprehender retrieved a plausible alternative. This work thus adds to the growing body of literature that suggests that many aspects of language processing are optimized for dealing with noise in the input, and opens the door to electrophysiologic investigations of the computations that support the processing of imperfect input.
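The rational inference described in this abstract can be sketched as Bayes' rule over candidate intended words: the posterior for each candidate is proportional to its prior plausibility in context multiplied by the probability that noise turned it into the perceived word. The function name and every probability below are hypothetical illustrations, not values or code from the study.

```python
from typing import Dict


def noisy_channel_posterior(prior: Dict[str, float],
                            likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior over intended words given one perceived word.

    prior[w]      : P(w intended | context)
    likelihood[w] : P(perceived word | w intended), i.e. the noise model
    """
    if prior.keys() != likelihood.keys():
        raise ValueError("prior and likelihood must cover the same candidates")
    unnormalized = {w: prior[w] * likelihood[w] for w in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("no candidate has non-zero probability")
    return {w: value / total for w, value in unnormalized.items()}


if __name__ == "__main__":
    # Perceived word: "antidote". All numbers are HYPOTHETICAL.
    prior = {"anecdote": 0.9, "antidote": 0.001}       # fit to the context
    likelihood = {"anecdote": 0.01, "antidote": 0.99}  # chance of perceiving "antidote"
    print(noisy_channel_posterior(prior, likelihood))
```

With these toy numbers the plausible near neighbour "anecdote" ends up more probable than the literal input. That is the situation the abstract associates with a reduced N400 effect and an increased P600 effect. For a word such as "hearse", which has no plausible near neighbour, the noise likelihood of every alternative would be small and the literal reading would dominate.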
Affiliation(s)
- Leon Bergen
  - University of California, San Diego, United States
- Marianna Eddy
  - Massachusetts Institute of Technology, United States
- Evelina Fedorenko
  - Massachusetts Institute of Technology, United States; McGovern Institute for Brain Research, United States
- Edward Gibson
  - Massachusetts Institute of Technology, United States
9. Alemán Bañón J, Martin C. The role of crosslinguistic differences in second language anticipatory processing: An event-related potentials study. Neuropsychologia 2021; 155:107797. [PMID: 33610614 DOI: 10.1016/j.neuropsychologia.2021.107797] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/03/2020] [Revised: 12/24/2020] [Accepted: 02/07/2021] [Indexed: 11/29/2022]
Abstract
The present study uses event-related potentials to investigate how crosslinguistic (dis)similarities modulate anticipatory processing in the second language (L2). Participants read predictive stories in English that made a genitive construction consisting of a third-person singular possessive pronoun and a kinship noun (e.g., his mother) likely in an upcoming continuation. The possessive pronoun's form depended on the antecedent's natural gender, which had been previously established in the stories. The continuation included either the expected genitive construction or an unexpected one with a possessive pronoun of the opposite gender. We manipulated crosslinguistic (dis)similarity by comparing advanced English learners with either Swedish or Spanish as their L1. While Swedish has equivalent possessive pronouns that mark the antecedent's natural gender (i.e., hans/hennes "his/her"), Spanish does not. In fact, Spanish possessive pronouns mark the syntactic features (number, gender) of the possessed noun (e.g., nosotros queremos a nuestra madre "we-MASC love our-FEM mother-FEM"). Twenty-four native speakers of English elicited an N400 effect for prenominal possessives that were unexpected based on the possessor noun's natural gender, consistent with the possibility that they activated the pronoun's form or its semantic features (natural gender). Thirty-two Swedish-speaking learners yielded a qualitatively and quantitatively native-like N400 for unexpected prenominal possessives. In contrast, twenty-five Spanish-speaking learners showed a P600 effect for unexpected possessives, consistent with the possibility that they experienced difficulty integrating a pronoun that mismatched the expected gender. Results suggest that differences with respect to the features encoded in the activated representation result in different predictive mechanisms among adult L2 learners.
Affiliation(s)
- José Alemán Bañón
  - Centre for Research on Bilingualism, Department of Swedish and Multilingualism, Stockholm University, Universitetsvägen 10 (D355), 10691, Stockholm, Sweden
- Clara Martin
  - Basque Center on Cognition, Brain and Language, Paseo Mikeletegi 69, 20009, Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, María Díaz de Haro 3, 48013, Bilbao, Spain
10. Fleur DS, Flecken M, Rommers J, Nieuwland MS. Definitely saw it coming? The dual nature of the pre-nominal prediction effect. Cognition 2020; 204:104335. [DOI: 10.1016/j.cognition.2020.104335] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/16/2020] [Revised: 05/15/2020] [Accepted: 05/22/2020] [Indexed: 01/29/2023]
11. Nieuwland MS, Kazanina N. The Neural Basis of Linguistic Prediction: Introduction to the Special Issue. Neuropsychologia 2020; 146:107532. [PMID: 32553845 DOI: 10.1016/j.neuropsychologia.2020.107532] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Affiliation(s)
- Mante S Nieuwland
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Nijmegen, the Netherlands; Heinrich-Heine-University, Düsseldorf, Germany
- Nina Kazanina
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom; Institute of Cognitive Neuroscience, National Research University Higher School of Economics, Moscow, Russian Federation