1
Benítez-Burraco A. The cognitive science of language diversity: achievements and challenges. Cogn Process 2025:10.1007/s10339-025-01262-z. [PMID: 39998596 DOI: 10.1007/s10339-025-01262-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Accepted: 02/10/2025] [Indexed: 02/27/2025]
Abstract
Linguistics needs to embrace all the way down a key feature of language: its diversity. In this paper, we build on recent experimental findings and theoretical discussions about the neuroscience and the cognitive science of linguistic variation, but also on proposals by theoretical biology, to advance some future directions for a more solid neurocognitive approach to language diversity. We argue that the cognitive foundations and the neuroscience of human language will be better understood if we pursue a unitary explanation of four key dimensions of linguistic variation: the different functions performed by language, the diversity of sociolinguistic phenomena, the typological differences between human languages, and the diverse developmental paths to language. Succeeding in the cognitive and neurobiological examination and explanation of these four dimensions will not only result in a more comprehensive understanding of how our brain processes language, but also of how language evolved and the core properties of human language(s).
Affiliation(s)
- Antonio Benítez-Burraco
  - Department of Spanish, Linguistics and Theory of Literature (Linguistics), Faculty of Philology, University of Seville, C/Palos de la Frontera s/n, 41004, Seville, Spain.
2
Coopmans CW, de Hoop H, Tezcan F, Hagoort P, Martin AE. Language-specific neural dynamics extend syntax into the time domain. PLoS Biol 2025; 23:e3002968. [PMID: 39836653 PMCID: PMC11750093 DOI: 10.1371/journal.pbio.3002968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 12/05/2024] [Indexed: 01/23/2025] Open
Abstract
Studies of perception have long shown that the brain adds information to its sensory analysis of the physical environment. A touchstone example for humans is language use: to comprehend a physical signal like speech, the brain must add linguistic knowledge, including syntax. Yet, syntactic rules and representations are widely assumed to be atemporal (i.e., abstract and not bound by time), so they must be translated into time-varying signals for speech comprehension and production. Here, we test 3 different models of the temporal spell-out of syntactic structure against brain activity of people listening to Dutch stories: an integratory bottom-up parser, a predictive top-down parser, and a mildly predictive left-corner parser. These models build exactly the same structure but differ in when syntactic information is added by the brain; this difference is captured in the (temporal distribution of the) complexity metric "incremental node count." Using temporal response function models with both acoustic and information-theoretic control predictors, node counts were regressed against source-reconstructed delta-band activity acquired with magnetoencephalography. Neural dynamics in left frontal and temporal regions most strongly reflect node counts derived by the top-down method, which postulates syntax early in time, suggesting that predictive structure building is an important component of Dutch sentence comprehension. The absence of strong effects of the left-corner model further suggests that its mildly predictive strategy does not represent Dutch language comprehension well, in contrast to what has been found for English. Understanding when the brain projects its knowledge of syntax onto speech, and whether this is done in language-specific ways, will inform and constrain the development of mechanistic models of syntactic structure building in the brain.
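Illustrative sketch (not from the paper): the abstract states that the three parsers build the same structure but differ in when nodes are added. One simplified reading is that a top-down parser counts a node at the first word it dominates, a bottom-up parser at the last word it dominates, and a left-corner parser at the word that completes its leftmost daughter. The nested-list tree encoding, the function name `incremental_node_counts`, and the English example sentence are assumptions made for illustration; the authors' actual implementation may differ.

```python
def incremental_node_counts(tree):
    """Per-word node counts under three simplified parsing strategies.

    Assumed encoding: a word is a string, a nonterminal node is a
    non-empty list of children. This is a sketch, not the authors' code.
    """
    words = []
    spans = []  # (first_word, left_corner_end, last_word) per nonterminal

    def visit(node):
        if isinstance(node, str):
            words.append(node)
            idx = len(words) - 1
            return idx, idx
        child_spans = [visit(child) for child in node]
        first = child_spans[0][0]             # first word the node dominates
        left_corner_end = child_spans[0][1]   # word completing leftmost daughter
        last = child_spans[-1][1]             # last word the node dominates
        spans.append((first, left_corner_end, last))
        return first, last

    visit(tree)
    counts = {name: [0] * len(words)
              for name in ("top_down", "bottom_up", "left_corner")}
    for first, left_corner_end, last in spans:
        counts["top_down"][first] += 1               # opened before its first word
        counts["bottom_up"][last] += 1               # closed after its last word
        counts["left_corner"][left_corner_end] += 1  # projected from its left corner
    return words, counts


# Hypothetical example: [S [NP the dog] [VP chased [NP the cat]]]
example_tree = [["the", "dog"], ["chased", ["the", "cat"]]]
example_words, example_counts = incremental_node_counts(example_tree)
for strategy, per_word in example_counts.items():
    print(strategy, list(zip(example_words, per_word)))
```

Every strategy assigns the same total number of nodes; only the distribution over words changes, which is what makes the counts usable as competing time-resolved predictors.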
Affiliation(s)
- Cas W. Coopmans
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
  - Centre for Language Studies, Radboud University, Nijmegen, the Netherlands
- Helen de Hoop
  - Centre for Language Studies, Radboud University, Nijmegen, the Netherlands
- Filiz Tezcan
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
- Peter Hagoort
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
- Andrea E. Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
  - Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, the Netherlands
3
Weissbart H, Martin AE. The structure and statistics of language jointly shape cross-frequency neural dynamics during spoken language comprehension. Nat Commun 2024; 15:8850. [PMID: 39397036 PMCID: PMC11471778 DOI: 10.1038/s41467-024-53128-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 09/30/2024] [Indexed: 10/15/2024] Open
Abstract
Humans excel at extracting structurally-determined meaning from speech despite inherent physical variability. This study explores the brain's ability to predict and understand spoken language robustly. It investigates the relationship between structural and statistical language knowledge in brain dynamics, focusing on phase and amplitude modulation. Using syntactic features from constituent hierarchies and surface statistics from a transformer model as predictors of forward encoding models, we reconstructed cross-frequency neural dynamics from MEG data during audiobook listening. Our findings challenge a strict separation of linguistic structure and statistics in the brain, with both aiding neural signal reconstruction. Syntactic features have a more temporally spread impact, and both word entropy and the number of closing syntactic constituents are linked to the phase-amplitude coupling of neural dynamics, implying a role in temporal prediction and cortical oscillation alignment during speech processing. Our results indicate that structured and statistical information jointly shape neural dynamics during spoken language comprehension and suggest an integration process via a cross-frequency coupling mechanism.
Affiliation(s)
- Hugo Weissbart
  - Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands.
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
- Andrea E Martin
  - Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
4
Chalas N, Meyer L, Lo CW, Park H, Kluger DS, Abbasi O, Kayser C, Nitsch R, Gross J. Dissociating prosodic from syntactic delta activity during natural speech comprehension. Curr Biol 2024; 34:3537-3549.e5. [PMID: 39047734 DOI: 10.1016/j.cub.2024.06.072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 06/24/2024] [Accepted: 06/27/2024] [Indexed: 07/27/2024]
Abstract
Decoding human speech requires the brain to segment the incoming acoustic signal into meaningful linguistic units, ranging from syllables and words to phrases. Integrating these linguistic constituents into a coherent percept sets the root of compositional meaning and hence understanding. One important cue for segmentation in natural speech is prosodic cues, such as pauses, but their interplay with higher-level linguistic processing is still unknown. Here, we dissociate the neural tracking of prosodic pauses from the segmentation of multi-word chunks using magnetoencephalography (MEG). We find that manipulating the regularity of pauses disrupts slow speech-brain tracking bilaterally in auditory areas (below 2 Hz) and in turn increases left-lateralized coherence of higher-frequency auditory activity at speech onsets (around 25-45 Hz). Critically, we also find that multi-word chunks (defined as short, coherent bundles of inter-word dependencies) are processed through the rhythmic fluctuations of low-frequency activity (below 2 Hz) bilaterally and independently of prosodic cues. Importantly, low-frequency alignment at chunk onsets increases the accuracy of an encoding model in bilateral auditory and frontal areas while controlling for the effect of acoustics. Our findings provide novel insights into the neural basis of speech perception, demonstrating that both acoustic features (prosodic cues) and abstract linguistic processing at the multi-word timescale are underpinned independently by low-frequency electrophysiological brain activity in the delta frequency range.
Affiliation(s)
- Nikos Chalas
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany; Institute for Translational Neuroscience, University of Münster, Münster, Germany.
- Lars Meyer
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Chia-Wen Lo
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Hyojin Park
  - Centre for Human Brain Health (CHBH), School of Psychology, University of Birmingham, Birmingham, UK
- Daniel S Kluger
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Omid Abbasi
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Christoph Kayser
  - Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Robert Nitsch
  - Institute for Translational Neuroscience, University of Münster, Münster, Germany
- Joachim Gross
  - Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
5
Zhao J, Martin AE, Coopmans CW. Structural and sequential regularities modulate phrase-rate neural tracking. Sci Rep 2024; 14:16603. [PMID: 39025957 PMCID: PMC11258220 DOI: 10.1038/s41598-024-67153-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 07/08/2024] [Indexed: 07/20/2024] Open
Abstract
Electrophysiological brain activity has been shown to synchronize with the quasi-regular repetition of grammatical phrases in connected speech, so-called phrase-rate neural tracking. Current debate centers around whether this phenomenon is best explained in terms of the syntactic properties of phrases or in terms of syntax-external information, such as the sequential repetition of parts of speech. As these two factors were confounded in previous studies, much of the literature is compatible with both accounts. Here, we used electroencephalography (EEG) to determine if and when the brain is sensitive to both types of information. Twenty native speakers of Mandarin Chinese listened to isochronously presented streams of monosyllabic words, which contained either grammatical two-word phrases (e.g., catch fish, sell house) or non-grammatical word combinations (e.g., full lend, bread far). Within the grammatical conditions, we varied two structural factors: the position of the head of each phrase and the type of attachment. Within the non-grammatical conditions, we varied the consistency with which parts of speech were repeated. Tracking was quantified through evoked power and inter-trial phase coherence, both derived from the frequency-domain representation of EEG responses. As expected, neural tracking at the phrase rate was stronger in grammatical sequences than in non-grammatical sequences without syntactic structure. Moreover, it was modulated by both attachment type and head position, revealing the structure-sensitivity of phrase-rate tracking. We additionally found that the brain tracks the repetition of parts of speech in non-grammatical sequences. These data provide an integrative perspective on the current debate about neural tracking effects, revealing that the brain utilizes regularities computed over multiple levels of linguistic representation in guiding rhythmic computation.
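Illustrative sketch (not from the paper): the abstract quantifies tracking through evoked power and inter-trial phase coherence (ITPC), both derived from a frequency-domain representation of the EEG. A common textbook formulation, assumed here, takes one complex Fourier coefficient per trial at the frequency of interest; ITPC is the length of the mean unit-length phase vector across trials, and evoked power is the squared magnitude of the trial-averaged coefficient. The sampling rate, target frequency, and function names below are hypothetical, and normalization conventions vary between toolboxes.

```python
import cmath
import math


def fourier_coefficient(signal, freq_hz, fs_hz):
    """Complex Fourier coefficient of `signal` at a single frequency."""
    return sum(x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
               for n, x in enumerate(signal))


def itpc_and_evoked_power(trials, freq_hz, fs_hz):
    """Return (ITPC, evoked power) at `freq_hz` across trials.

    Assumes every trial has non-zero energy at `freq_hz`;
    otherwise the phase is undefined and the division fails.
    """
    coeffs = [fourier_coefficient(trial, freq_hz, fs_hz) for trial in trials]
    # ITPC: discard amplitude, average the unit phase vectors.
    itpc = abs(sum(c / abs(c) for c in coeffs)) / len(coeffs)
    # Evoked power: average the complex coefficients first, then square.
    evoked_power = abs(sum(coeffs) / len(coeffs)) ** 2
    return itpc, evoked_power


def make_trial(phase, freq_hz=2.0, fs_hz=100.0, n_samples=100):
    """Hypothetical one-second trial: a pure cosine with a given phase."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz + phase)
            for n in range(n_samples)]


# Same phase on every trial versus opposite phases across trials.
itpc_same, evoked_same = itpc_and_evoked_power(
    [make_trial(0.3) for _ in range(4)], 2.0, 100.0)
itpc_opposed, evoked_opposed = itpc_and_evoked_power(
    [make_trial(0.0), make_trial(math.pi)], 2.0, 100.0)
print(itpc_same, itpc_opposed)
```

Phase-consistent trials give an ITPC near 1, while trials with opposite phases cancel and give values near 0 for both measures.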
Affiliation(s)
- Junyuan Zhao
  - Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Andrea E Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Cas W Coopmans
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands.
6
Ding R, Ten Oever S, Martin AE. Delta-band Activity Underlies Referential Meaning Representation during Pronoun Resolution. J Cogn Neurosci 2024; 36:1472-1492. [PMID: 38652108 DOI: 10.1162/jocn_a_02163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2024]
Abstract
Human language offers a variety of ways to create meaning, one of which is referring to entities, objects, or events in the world. One such meaning maker is understanding to whom or to what a pronoun in a discourse refers to. To understand a pronoun, the brain must access matching entities or concepts that have been encoded in memory from previous linguistic context. Models of language processing propose that internally stored linguistic concepts, accessed via exogenous cues such as phonological input of a word, are represented as (a)synchronous activities across a population of neurons active at specific frequency bands. Converging evidence suggests that delta band activity (1-3 Hz) is involved in temporal and representational integration during sentence processing. Moreover, recent advances in the neurobiology of memory suggest that recollection engages neural dynamics similar to those which occurred during memory encoding. Integrating from these two research lines, we here tested the hypothesis that neural dynamic patterns, especially in delta frequency range, underlying referential meaning representation, would be reinstated during pronoun resolution. By leveraging neural decoding techniques (i.e., representational similarity analysis) on a magnetoencephalogram data set acquired during a naturalistic story-listening task, we provide evidence that delta-band activity underlies referential meaning representation. Our findings suggest that, during spoken language comprehension, endogenous linguistic representations such as referential concepts may be proactively retrieved and represented via activation of their underlying dynamic neural patterns.
Affiliation(s)
- Rong Ding
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Sanne Ten Oever
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Radboud University Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
  - Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Andrea E Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Radboud University Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
7
Corsini A, Tomassini A, Pastore A, Delis I, Fadiga L, D'Ausilio A. Speech perception difficulty modulates theta-band encoding of articulatory synergies. J Neurophysiol 2024; 131:480-491. [PMID: 38323331 DOI: 10.1152/jn.00388.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 01/04/2024] [Accepted: 01/25/2024] [Indexed: 02/08/2024] Open
Abstract
The human brain tracks available speech acoustics and extrapolates missing information such as the speaker's articulatory patterns. However, the extent to which articulatory reconstruction supports speech perception remains unclear. This study explores the relationship between articulatory reconstruction and task difficulty. Participants listened to sentences and performed a speech-rhyming task. Real kinematic data of the speaker's vocal tract were recorded via electromagnetic articulography (EMA) and aligned to corresponding acoustic outputs. We extracted articulatory synergies from the EMA data with principal component analysis (PCA) and employed partial information decomposition (PID) to separate the electroencephalographic (EEG) encoding of acoustic and articulatory features into unique, redundant, and synergistic atoms of information. We median-split sentences into easy (ES) and hard (HS) based on participants' performance and found that greater task difficulty involved greater encoding of unique articulatory information in the theta band. We conclude that fine-grained articulatory reconstruction plays a complementary role in the encoding of speech acoustics, lending further support to the claim that motor processes support speech perception.
NEW & NOTEWORTHY: Top-down processes originating from the motor system contribute to speech perception through the reconstruction of the speaker's articulatory movement. This study investigates the role of such articulatory simulation under variable task difficulty. We show that more challenging listening tasks lead to increased encoding of articulatory kinematics in the theta band and suggest that, in such situations, fine-grained articulatory reconstruction complements acoustic encoding.
Affiliation(s)
- Alessandro Corsini
  - Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
  - Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alice Tomassini
  - Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
  - Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Aldo Pastore
  - Laboratorio NEST, Scuola Normale Superiore, Pisa, Italy
- Ioannis Delis
  - School of Biomedical Sciences, University of Leeds, Leeds, United Kingdom
- Luciano Fadiga
  - Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
  - Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alessandro D'Ausilio
  - Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
  - Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
8
Ten Oever S, Martin AE. Interdependence of "What" and "When" in the Brain. J Cogn Neurosci 2024; 36:167-186. [PMID: 37847823 DOI: 10.1162/jocn_a_02067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2023]
Abstract
From a brain's-eye-view, when a stimulus occurs and what it is are interrelated aspects of interpreting the perceptual world. Yet in practice, the putative perceptual inferences about sensory content and timing are often dichotomized and not investigated as an integrated process. We here argue that neural temporal dynamics can influence what is perceived, and in turn, stimulus content can influence the time at which perception is achieved. This computational principle results from the highly interdependent relationship of what and when in the environment. Both brain processes and perceptual events display strong temporal variability that is not always modeled; we argue that understanding (and, minimally, modeling) this temporal variability is key for theories of how the brain generates unified and consistent neural representations and that we ignore temporal variability in our analysis practice at the peril of both data interpretation and theory-building. Here, we review what and when interactions in the brain, demonstrate via simulations how temporal variability can result in misguided interpretations and conclusions, and outline how to integrate and synthesize what and when in theories and models of brain computation.
Affiliation(s)
- Sanne Ten Oever
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
  - Maastricht University, The Netherlands
- Andrea E Martin
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
  - Donders Centre for Cognitive Neuroimaging, Nijmegen, The Netherlands
9
Mai G, Wang WSY. Distinct roles of delta- and theta-band neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing. Hum Brain Mapp 2023; 44:6149-6172. [PMID: 37818940 PMCID: PMC10619373 DOI: 10.1002/hbm.26503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 08/17/2023] [Accepted: 09/13/2023] [Indexed: 10/13/2023] Open
Abstract
The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz) including delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded when normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with the partialling procedure that singles out unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. 
We argue these results indicate that delta-band tracking may play a role in predictive coding leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. Early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solitarily explained by attentional demands or listening efforts. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit with the hierarchical Predictive Coding framework. Together, we suggest the distinct roles of delta and theta neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing.
Affiliation(s)
- Guangting Mai
  - Hearing Theme, National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, UK
  - Academic Unit of Mental Health and Clinical Neurosciences, School of Medicine, The University of Nottingham, Nottingham, UK
  - Division of Psychology and Language Sciences, Faculty of Brain Sciences, University College London, London, UK
- William S-Y Wang
  - Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hung Hom, Hong Kong
  - Language Engineering Laboratory, The Chinese University of Hong Kong, Hong Kong, China
10
Coopmans CW, Mai A, Slaats S, Weissbart H, Martin AE. What oscillations can do for syntax depends on your theory of structure building. Nat Rev Neurosci 2023; 24:723. [PMID: 37696998 DOI: 10.1038/s41583-023-00734-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 09/13/2023]
Affiliation(s)
- Cas W Coopmans
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands.
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
- Anna Mai
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Sophie Slaats
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Hugo Weissbart
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Andrea E Martin
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
11
Mohammadi Y, Graversen C, Østergaard J, Andersen OK, Reichenbach T. Phase-locking of Neural Activity to the Envelope of Speech in the Delta Frequency Band Reflects Differences between Word Lists and Sentences. J Cogn Neurosci 2023; 35:1301-1311. [PMID: 37379482 DOI: 10.1162/jocn_a_02016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/30/2023]
Abstract
The envelope of a speech signal is tracked by neural activity in the cerebral cortex. The cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has been mostly associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information of words and word sequences. However, much regarding the more specific association between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists in different levels of signal-to-noise ratios (SNRs) that lead to different levels of speech comprehension as well as listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for the random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend that the PLV in the delta band might reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.
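Illustrative sketch (not from the paper): the abstract relates EEG to the speech envelope through the phase-locking value (PLV). The standard definition, assumed here, is the magnitude of the time-averaged unit vector of the phase difference between the two signals. The sketch takes instantaneous phases as given; in practice they would typically come from band-pass filtering and a Hilbert transform, which the abstract does not detail. The function name and the synthetic phase series are hypothetical.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """PLV = |mean over samples of exp(i * (phase_a - phase_b))|."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and equally long")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


# Hypothetical phase series in radians, for illustration only.
n_samples = 200
eeg_phase = [2 * math.pi * 2.0 * t / 100.0 for t in range(n_samples)]

# Envelope phase with a constant lag: perfectly phase-locked.
locked_envelope = [p - 0.5 for p in eeg_phase]

# Envelope phase whose lag sweeps once around the full circle: no locking.
drifting_envelope = [p - 2 * math.pi * k / n_samples
                     for k, p in enumerate(eeg_phase)]

plv_locked = phase_locking_value(eeg_phase, locked_envelope)
plv_drifting = phase_locking_value(eeg_phase, drifting_envelope)
print(plv_locked, plv_drifting)
```

A constant phase lag yields a PLV near 1 regardless of the size of the lag, whereas a lag that is spread evenly around the circle yields a PLV near 0.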
12
Tezcan F, Weissbart H, Martin AE. A tradeoff between acoustic and linguistic feature encoding in spoken language comprehension. eLife 2023; 12:e82386. [PMID: 37417736 PMCID: PMC10328533 DOI: 10.7554/elife.82386] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/02/2022] [Accepted: 06/18/2023] [Indexed: 07/08/2023] Open
Abstract
When we comprehend language from speech, the phase of the neural response aligns with particular features of the speech input, resulting in a phenomenon referred to as neural tracking. In recent years, a large body of work has demonstrated the tracking of the acoustic envelope and abstract linguistic units at the phoneme and word levels, and beyond. However, the degree to which speech tracking is driven by acoustic edges of the signal, or by internally-generated linguistic units, or by the interplay of both, remains contentious. In this study, we used naturalistic story-listening to investigate (1) whether phoneme-level features are tracked over and above acoustic edges, (2) whether word entropy, which can reflect sentence- and discourse-level constraints, impacted the encoding of acoustic and phoneme-level features, and (3) whether the tracking of acoustic edges was enhanced or suppressed during comprehension of a first language (Dutch) compared to a statistically familiar but uncomprehended language (French). We first show that encoding models with phoneme-level linguistic features, in addition to acoustic features, uncovered an increased neural tracking response; this signal was further amplified in a comprehended language, putatively reflecting the transformation of acoustic features into internally generated phoneme-level representations. Phonemes were tracked more strongly in a comprehended language, suggesting that language comprehension functions as a neural filter over acoustic edges of the speech signal as it transforms sensory signals into abstract linguistic units. We then show that word entropy enhances neural tracking of both acoustic and phonemic features when sentence- and discourse-context are less constraining. When language was not comprehended, acoustic features, but not phonemic ones, were more strongly modulated, but in contrast, when a native language is comprehended, phoneme features are more strongly modulated. 
Taken together, our findings highlight the flexible modulation of acoustic, and phonemic features by sentence and discourse-level constraint in language comprehension, and document the neural transformation from speech perception to language comprehension, consistent with an account of language processing as a neural filter from sensory to abstract representations.
Affiliation(s)
- Filiz Tezcan
- Language and Computation in Neural Systems Group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Hugo Weissbart
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Andrea E Martin
- Language and Computation in Neural Systems Group, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
13
Slaats S, Weissbart H, Schoffelen JM, Meyer AS, Martin AE. Delta-Band Neural Responses to Individual Words Are Modulated by Sentence Processing. J Neurosci 2023; 43:4867-4883. [PMID: 37221093 PMCID: PMC10312058 DOI: 10.1523/jneurosci.0964-22.2023] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/06/2022] [Revised: 04/17/2023] [Accepted: 04/27/2023] [Indexed: 05/25/2023]
Abstract
To understand language, we need to recognize words and combine them into phrases and sentences. During this process, responses to the words themselves are changed. In a step toward understanding how the brain builds sentence structure, the present study concerns the neural readout of this adaptation. We ask whether low-frequency neural readouts associated with words change as a function of being in a sentence. To this end, we analyzed an MEG dataset by Schoffelen et al. (2019) of 102 human participants (51 women) listening to sentences and word lists, the latter lacking any syntactic structure and combinatorial meaning. Using temporal response functions and a cumulative model-fitting approach, we disentangled delta- and theta-band responses to lexical information (word frequency), from responses to sensory and distributional variables. The results suggest that delta-band responses to words are affected by sentence context in time and space, over and above entropy and surprisal. In both conditions, the word frequency response spanned left temporal and posterior frontal areas; however, the response appeared later in word lists than in sentences. In addition, sentence context determined whether inferior frontal areas were responsive to lexical information. In the theta band, the amplitude was larger in the word list condition ∼100 milliseconds in right frontal areas. We conclude that low-frequency responses to words are changed by sentential context. The results of this study show how the neural representation of words is affected by structural context and as such provide insight into how the brain instantiates compositionality in language.
SIGNIFICANCE STATEMENT: Human language is unprecedented in its combinatorial capacity: we are capable of producing and understanding sentences we have never heard before. Although the mechanisms underlying this capacity have been described in formal linguistics and cognitive science, how they are implemented in the brain remains to a large extent unknown. A large body of earlier work from the cognitive neuroscientific literature implies a role for delta-band neural activity in the representation of linguistic structure and meaning. In this work, we combine these insights and techniques with findings from psycholinguistics to show that meaning is more than the sum of its parts; the delta-band MEG signal differentially reflects lexical information inside and outside sentence structures.
Affiliation(s)
- Sophie Slaats
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- The International Max Planck Research School for Language Sciences, 6525 XD Nijmegen, The Netherlands
- Hugo Weissbart
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
- Jan-Mathijs Schoffelen
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
- Antje S Meyer
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
- Andrea E Martin
- Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 EN Nijmegen, The Netherlands
14
De Clercq P, Vanthornhout J, Vandermosten M, Francart T. Beyond linear neural envelope tracking: a mutual information approach. J Neural Eng 2023; 20. [PMID: 36812597 DOI: 10.1088/1741-2552/acbe1d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 02/22/2023] [Indexed: 02/24/2023]
Abstract
Objective. The human brain tracks the temporal envelope of speech, which contains essential cues for speech understanding. Linear models are the most common tool to study neural envelope tracking. However, information on how speech is processed can be lost since nonlinear relations are precluded. Analysis based on mutual information (MI), on the other hand, can detect both linear and nonlinear relations and is gradually becoming more popular in the field of neural envelope tracking. Yet, several different approaches to calculating MI are applied with no consensus on which approach to use. Furthermore, the added value of nonlinear techniques remains a subject of debate in the field. The present paper aims to resolve these open questions.
Approach. We analyzed electroencephalography (EEG) data of participants listening to continuous speech and applied MI analyses and linear models.
Main results. Comparing the different MI approaches, we conclude that results are most reliable and robust using the Gaussian copula approach, which first transforms the data to standard Gaussians. With this approach, the MI analysis is a valid technique for studying neural envelope tracking. Like linear models, it allows spatial and temporal interpretations of speech processing, peak latency analyses, and applications to multiple EEG channels combined. In a final analysis, we tested whether nonlinear components were present in the neural response to the envelope by first removing all linear components in the data. We robustly detected nonlinear components on the single-subject level using the MI analysis.
Significance. We demonstrate that the human brain processes speech in a nonlinear way. Unlike linear models, the MI analysis detects such nonlinear relations, proving its added value to neural envelope tracking. In addition, the MI analysis retains spatial and temporal characteristics of speech processing, an advantage lost when using more complex (nonlinear) deep neural networks.
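The Gaussian copula step mentioned in the abstract above (transforming the data to standard Gaussians before estimating MI) can be illustrated with a minimal sketch. This is not the authors' implementation: it handles only two one-dimensional signals with no time lags, assumes continuous data without ties, and the function names `copula_normalize` and `gaussian_copula_mi` are hypothetical. After the rank-based transform, MI in bits is computed from the correlation r as -0.5 * log2(1 - r^2), the closed form for a bivariate Gaussian.

```python
import math
import random
from statistics import NormalDist


def copula_normalize(values):
    """Map a 1-D sample to standard-normal quantiles via its ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) keeps the empirical CDF strictly inside (0, 1)
        out[idx] = std_normal.inv_cdf(rank / (n + 1))
    return out


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_copula_mi(x, y):
    """Simplified bivariate Gaussian-copula MI estimate, in bits."""
    r = pearson(copula_normalize(x), copula_normalize(y))
    if r * r >= 1.0:
        # Perfectly rank-coupled signals: the Gaussian formula diverges
        return math.inf
    return -0.5 * math.log2(1.0 - r * r)


# Toy data (assumption): a random "envelope" and a noisy, nonlinearly
# but monotonically distorted "response" to it.
rng = random.Random(0)
envelope = [rng.gauss(0.0, 1.0) for _ in range(2000)]
response = [math.exp(e + 0.5 * rng.gauss(0.0, 1.0)) for e in envelope]
print(gaussian_copula_mi(envelope, response))
```

Because only ranks enter the estimate, any strictly increasing transformation of either signal leaves the result unchanged. A purely non-monotonic relation (for example y = x^2 with x symmetric around zero) is not captured by this bivariate form.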
Affiliation(s)
- Pieter De Clercq
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Jonas Vanthornhout
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Maaike Vandermosten
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Tom Francart
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
15
Lo CW, Tung TY, Ke AH, Brennan JR. Hierarchy, Not Lexical Regularity, Modulates Low-Frequency Neural Synchrony During Language Comprehension. Neurobiology of Language (Cambridge, Mass.) 2022; 3:538-555. [PMID: 37215342 PMCID: PMC10158645 DOI: 10.1162/nol_a_00077] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/02/2022] [Accepted: 06/20/2022] [Indexed: 05/24/2023]
Abstract
Neural responses appear to synchronize with sentence structure. However, researchers have debated whether this response in the delta band (0.5-3 Hz) really reflects hierarchical information or simply lexical regularities. Computational simulations in which sentences are represented simply as sequences of high-dimensional numeric vectors that encode lexical information seem to give rise to power spectra similar to those observed for sentence synchronization, suggesting that sentence-level cortical tracking findings may reflect sequential lexical or part-of-speech information, and not necessarily hierarchical syntactic information. Using electroencephalography (EEG) data and the frequency-tagging paradigm, we develop a novel experimental condition to tease apart the predictions of the lexical and the hierarchical accounts of the attested low-frequency synchronization. Under a lexical model, synchronization should be observed even when words are reversed within their phrases (e.g., "sheep white grass eat" instead of "white sheep eat grass"), because the same lexical items are preserved at the same regular intervals. Critically, such stimuli are not syntactically well-formed; thus a hierarchical model does not predict synchronization of phrase- and sentence-level structure in the reversed phrase condition. Computational simulations confirm these diverging predictions. EEG data from N = 31 native speakers of Mandarin show robust delta synchronization to syntactically well-formed isochronous speech. Importantly, no such pattern is observed for reversed phrases, consistent with the hierarchical, but not the lexical, accounts.
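The logic of the frequency-tagging readout described above can be sketched with synthetic data. This is a toy illustration, not the study's analysis or data: the presentation rates (4 Hz words, 2 Hz phrases, 1 Hz sentences), the sampling rate, the noise level, and the names `simulate` and `power_at` are all assumptions introduced here. One signal carries components at all three rates, standing in for the hierarchical prediction for well-formed speech. The other carries only the word-rate component, standing in for the hierarchical prediction for reversed phrases.

```python
import cmath
import math
import random


def simulate(rates_hz, duration_s=20.0, sample_rate_hz=50.0, noise_sd=1.0, seed=0):
    """Sum of unit-amplitude sinusoids at the given rates plus Gaussian noise."""
    rng = random.Random(seed)
    n = int(duration_s * sample_rate_hz)
    return [
        sum(math.sin(2 * math.pi * f * k / sample_rate_hz) for f in rates_hz)
        + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def power_at(signal, freq_hz, sample_rate_hz):
    """Power of a single discrete Fourier component, normalised by length."""
    n = len(signal)
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, s in enumerate(signal)
    )
    return abs(acc / n) ** 2


# Hypothetical rates: sentence = 1 Hz, phrase = 2 Hz, word = 4 Hz
well_formed = simulate([1.0, 2.0, 4.0], seed=1)
reversed_phrases = simulate([4.0], seed=2)

for label, rate in [("sentence", 1.0), ("phrase", 2.0), ("word", 4.0)]:
    p_well = power_at(well_formed, rate, 50.0)
    p_rev = power_at(reversed_phrases, rate, 50.0)
    print(f"{label:8s} {rate:.0f} Hz  well-formed={p_well:.4f}  reversed={p_rev:.4f}")
```

In this toy setup, a lexical account would instead predict that the reversed-phrase signal keeps its 1 Hz and 2 Hz components, which is the contrast the experimental condition is designed to test.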
Affiliation(s)
- Chia-Wen Lo
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Tzu-Yun Tung
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Alan Hezao Ke
- Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
- Department of Linguistics, Languages and Cultures, Michigan State University, East Lansing, MI, USA