1. Shain C, Schuler W. A Deep Learning Approach to Analyzing Continuous-Time Cognitive Processes. Open Mind (Camb) 2024; 8:235-264. PMID: 38528907; PMCID: PMC10962694; DOI: 10.1162/opmi_a_00126. Received 2023-08-14; accepted 2024-01-31.
Abstract
The dynamics of the mind are complex. Mental processes unfold continuously in time and may be sensitive to a myriad of interacting variables, especially in naturalistic settings. But statistical models used to analyze data from cognitive experiments often assume simplistic dynamics. Recent advances in deep learning have yielded startling improvements to simulations of dynamical cognitive processes, including speech comprehension, visual perception, and goal-directed behavior. But due to poor interpretability, deep learning is generally not used for scientific analysis. Here, we bridge this gap by showing that deep learning can be used, not just to imitate, but to analyze complex processes, providing flexible function approximation while preserving interpretability. To do so, we define and implement a nonlinear regression model in which the probability distribution over the response variable is parameterized by convolving the history of predictors over time using an artificial neural network, thereby allowing the shape and continuous temporal extent of effects to be inferred directly from time series data. Our approach relaxes standard simplifying assumptions (e.g., linearity, stationarity, and homoscedasticity) that are implausible for many cognitive processes and may critically affect the interpretation of data. We demonstrate substantial improvements on behavioral and neuroimaging data from the language processing domain, and we show that our model enables discovery of novel patterns in exploratory analyses, controls for diverse confounds in confirmatory analyses, and opens up research questions in cognitive (neuro)science that are otherwise hard to study.
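The convolutional idea at the heart of this model can be sketched in a few lines. The exponential kernel and names below are illustrative assumptions for a minimal sketch, not the authors' implementation (which parameterizes the impulse response with a neural network and fits a full probability distribution over the response):

```python
import math

def exp_kernel(delay, rate=1.0):
    """Illustrative impulse response: an event's effect decays exponentially
    with the continuous time elapsed since the event."""
    return math.exp(-rate * delay) if delay >= 0 else 0.0

def predict_response(t, events, rate=1.0):
    """Predicted response at time t: convolve each past predictor value with
    the kernel evaluated at its delay. `events` is a list of
    (event_time, predictor_value) pairs, e.g., word onsets and surprisals."""
    return sum(x * exp_kernel(t - t_e, rate) for t_e, x in events if t_e <= t)

# Word onsets (seconds) paired with a hypothetical predictor value per word
events = [(0.0, 1.0), (0.3, 2.0), (0.7, 0.5)]
y = predict_response(1.0, events)
```

Because the kernel is a function of continuous delay, irregularly spaced events (as in naturalistic reading or listening) need no binning; the shape and temporal extent of each effect fall out of the fitted kernel.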
Affiliation(s)
- Cory Shain — Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- William Schuler — Department of Linguistics, The Ohio State University, Columbus, OH, USA
2. Shain C. Word Frequency and Predictability Dissociate in Naturalistic Reading. Open Mind (Camb) 2024; 8:177-201. PMID: 38476662; PMCID: PMC10932590; DOI: 10.1162/opmi_a_00119. Received 2023-07-06; accepted 2024-01-10.
Abstract
Many studies of human language processing have shown that readers slow down at less frequent or less predictable words, but there is debate about whether frequency and predictability effects reflect separable cognitive phenomena: are cognitive operations that retrieve words from the mental lexicon based on sensory cues distinct from those that predict upcoming words based on context? Previous evidence for a frequency-predictability dissociation is mostly based on small samples (both for estimating predictability and frequency and for testing their effects on human behavior), artificial materials (e.g., isolated constructed sentences), and implausible modeling assumptions (discrete-time dynamics, linearity, additivity, constant variance, and invariance over time), which raises the question: do frequency and predictability dissociate in ordinary language comprehension, such as story reading? This study leverages recent progress in open data and computational modeling to address this question at scale. A large collection of naturalistic reading data (six datasets, >2.2 M datapoints) is analyzed using nonlinear continuous-time regression, and frequency and predictability are estimated using statistical language models trained on more data than is currently typical in psycholinguistics. Despite the use of naturalistic data, strong predictability estimates, and flexible regression models, results converge with earlier experimental studies in supporting dissociable and additive frequency and predictability effects.
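The distinction the abstract draws can be made concrete with a toy corpus: frequency is a context-independent property of a word, while predictability depends on the preceding context, so the two measures can come apart. The bigram model and corpus below are illustrative assumptions, far simpler than the large statistical language models used in the study:

```python
import math
from collections import Counter

corpus = ("the cat sat on the mat . "
          "the dog sat on the rug .").split()

# Frequency cost: negative log unigram probability (ignores context)
unigram = Counter(corpus)
total = sum(unigram.values())
def log_freq_cost(word):
    return -math.log2(unigram[word] / total)

# Predictability cost: surprisal under a toy bigram model (uses context)
bigram = Counter(zip(corpus, corpus[1:]))
def surprisal(word, prev):
    return -math.log2(bigram[(prev, word)] / unigram[prev])

# "sat" is rarer than "the" corpus-wide, yet perfectly predictable
# after "cat" here: high frequency cost, zero surprisal.
```

A dissociation of this kind in human reading times — each measure contributing its own effect — is what additive, separable frequency and predictability effects would look like.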
Affiliation(s)
- Cory Shain — Department of Brain & Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
3. Shain C, Meister C, Pimentel T, Cotterell R, Levy R. Large-scale evidence for logarithmic effects of word predictability on reading time. Proc Natl Acad Sci U S A 2024; 121:e2307876121. PMID: 38422017; DOI: 10.1073/pnas.2307876121. Received 2023-05-11; accepted 2023-11-11.
Abstract
During real-time language comprehension, our minds rapidly decode complex meanings from sequences of words. The difficulty of doing so is known to be related to words' contextual predictability, but what cognitive processes do these predictability effects reflect? In one view, predictability effects reflect facilitation due to anticipatory processing of words that are predictable from context. This view predicts a linear effect of predictability on processing demand. In another view, predictability effects reflect the costs of probabilistic inference over sentence interpretations. This view predicts either a logarithmic or a superlogarithmic effect of predictability on processing demand, depending on whether it assumes pressures toward a uniform distribution of information over time. The empirical record is currently mixed. Here, we revisit this question at scale: We analyze six reading datasets, estimate next-word probabilities with diverse statistical language models, and model reading times using recent advances in nonlinear regression. Results support a logarithmic effect of word predictability on processing difficulty, which favors probabilistic inference as a key component of human language processing.
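The competing linking functions the abstract contrasts have a simple quantitative signature. The snippet below is a schematic illustration of that signature, not the paper's analysis code; the linear cost function shown is one illustrative choice:

```python
import math

def surprisal(p):
    """Logarithmic linking function: processing cost in bits is -log2 p."""
    return -math.log2(p)

def linear_cost(p):
    """Linear linking function (illustrative): cost falls linearly in p."""
    return 1.0 - p

# Under the logarithmic view, halving a word's probability adds a constant
# cost (1 bit) regardless of the baseline probability ...
delta_hi = surprisal(0.2) - surprisal(0.4)    # 1.0 bit
delta_lo = surprisal(0.02) - surprisal(0.04)  # also 1.0 bit

# ... whereas under the linear view, the same halving matters far more
# for high-probability words than for low-probability ones.
lin_hi = linear_cost(0.2) - linear_cost(0.4)    # 0.2
lin_lo = linear_cost(0.02) - linear_cost(0.04)  # 0.02
```

Fitting flexible (nonlinear) regressions of reading time on probability lets the data adjudicate: a log-shaped curve supports the probabilistic-inference account, a straight line the anticipatory-facilitation account.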
Affiliation(s)
- Cory Shain — Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Clara Meister — Department of Computer Science, Institute for Machine Learning, ETH Zürich, Zürich 8092, Switzerland
- Tiago Pimentel — Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, United Kingdom
- Ryan Cotterell — Department of Computer Science, Institute for Machine Learning, ETH Zürich, Zürich 8092, Switzerland
- Roger Levy — Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
4. Wang S, Zhang X, Zhang J, Zong C. A synchronized multimodal neuroimaging dataset for studying brain language processing. Sci Data 2022; 9:590. PMID: 36180444; PMCID: PMC9525723; DOI: 10.1038/s41597-022-01708-5. Received 2022-04-15; accepted 2022-08-22.
Abstract
We present a synchronized multimodal neuroimaging dataset for studying brain language processing (SMN4Lang) that contains functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) data from the same 12 healthy volunteers while they listened to 6 hours of naturalistic stories, as well as high-resolution structural (T1, T2), diffusion MRI, and resting-state fMRI data for each participant. We also provide rich linguistic annotations for the stimuli, including word frequencies, syntactic tree structures, time-aligned characters and words, and various types of word and character embeddings. Quality assessment indicators verify that this is a high-quality neuroimaging dataset. Because the same participants listened to the same story materials first during fMRI and then during MEG, the synchronized data are well suited to studying the dynamic processing of language comprehension, such as when and where different linguistic features are encoded in the brain. In addition, this dataset, comprising a large vocabulary from stories on various topics, can serve as a brain benchmark to evaluate and improve computational language models.

Measurement(s): functional brain measurement; magnetoencephalography
Technology Type(s): functional magnetic resonance imaging; magnetoencephalography
Factor Type(s): naturalistic stimuli listening
Sample Characteristic - Organism: human beings
Affiliation(s)
- Shaonan Wang — National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Xiaohan Zhang — National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Jiajun Zhang — National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Chengqing Zong — National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
5. Oh BD, Clark C, Schuler W. Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators. Front Artif Intell 2022; 5:777963. PMID: 35310956; PMCID: PMC8929193; DOI: 10.3389/frai.2022.777963. Received 2021-09-16; accepted 2022-01-31.
Abstract
Expectation-based theories of sentence processing posit that processing difficulty is determined by predictability in context. While predictability quantified via surprisal has gained empirical support, this representation-agnostic measure leaves open the question of how to best approximate the human comprehender's latent probability model. This article first describes an incremental left-corner parser that incorporates information about common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules as a computational-level model of sentence processing. The article then evaluates a variety of structural parsers and deep neural language models as cognitive models of sentence processing by comparing the predictive power of their surprisal estimates on self-paced reading, eye-tracking, and fMRI data collected during real-time language processing. The results show that surprisal estimates from the proposed left-corner processing model deliver comparable and often superior fits to self-paced reading and eye-tracking data when compared to those from neural language models trained on much more data. This suggests that the strong linguistic generalizations made by the proposed processing model may help predict humanlike processing costs that manifest in latency-based measures, even when the amount of training data is limited. Additionally, experiments using Transformer-based language models sharing the same primary architecture and training data show a surprising negative correlation between parameter count and fit to self-paced reading and eye-tracking data. These findings suggest that large-scale neural language models are making weaker generalizations based on patterns of lexical items rather than stronger, more humanlike generalizations based on linguistic structure.
6. Wehbe L, Blank IA, Shain C, Futrell R, Levy R, von der Malsburg T, Smith N, Gibson E, Fedorenko E. Incremental Language Comprehension Difficulty Predicts Activity in the Language Network but Not the Multiple Demand Network. Cereb Cortex 2021; 31:4006-4023. PMID: 33895807; DOI: 10.1093/cercor/bhab065. Received 2020-04-15; revised 2021-01-15; accepted 2021-02-21.
Abstract
What role do domain-general executive functions play in human language comprehension? To address this question, we examine the relationship between behavioral measures of comprehension and neural activity in the domain-general "multiple demand" (MD) network, which has been linked to constructs like attention, working memory, inhibitory control, and selection, and implicated in diverse goal-directed behaviors. Specifically, functional magnetic resonance imaging data collected during naturalistic story listening are compared with theory-neutral measures of online comprehension difficulty and incremental processing load (reading times and eye-fixation durations). Critically, to ensure that variance in these measures is driven by features of the linguistic stimulus rather than reflecting participant- or trial-level variability, the neuroimaging and behavioral datasets were collected in nonoverlapping samples. We find no behavioral-neural link in functionally localized MD regions; instead, this link is found in the domain-specific, fronto-temporal "core language network," in both left-hemispheric areas and their right hemispheric homotopic areas. These results argue against strong involvement of domain-general executive circuits in language comprehension.
Affiliation(s)
- Leila Wehbe — Carnegie Mellon University, Machine Learning Department, PA 15213, USA
- Idan Asher Blank — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA; University of California Los Angeles, Department of Psychology, CA 90095, USA
- Cory Shain — Ohio State University, Department of Linguistics, OH 43210, USA
- Richard Futrell — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA; University of California Irvine, Department of Linguistics, CA 92697, USA
- Roger Levy — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA; University of California San Diego, Department of Linguistics, CA 92161, USA
- Titus von der Malsburg — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA; University of Stuttgart, Institute of Linguistics, 70049 Stuttgart, Germany
- Nathaniel Smith — University of California San Diego, Department of Linguistics, CA 92161, USA
- Edward Gibson — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA
- Evelina Fedorenko — Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, MA 02139, USA; Massachusetts Institute of Technology, McGovern Institute for Brain Research, MA 02139, USA