1
Sorensen DO, Avcu E, Lynch S, Ahlfors SP, Gow DW. Neural representation of phonological wordform in temporal cortex. Psychon Bull Rev 2024. [PMID: 38689188] [DOI: 10.3758/s13423-024-02511-6]
Abstract
While the neural bases of the earliest stages of speech categorization have been widely explored using neural decoding methods, there is still a lack of consensus on questions as basic as how wordforms are represented and in what way this word-level representation influences downstream processing in the brain. Isolating and localizing the neural representations of wordform is challenging because spoken words activate a variety of representations (e.g., segmental, semantic, articulatory) in addition to form-based representations. We addressed these challenges through a novel integrated neural decoding and effective connectivity design using region of interest (ROI)-based, source-reconstructed magnetoencephalography/electroencephalography (MEG/EEG) data collected during a lexical decision task. To identify wordform representations, we trained classifiers on words and nonwords from different phonological neighborhoods and then tested the classifiers' ability to discriminate between untrained target words that overlapped phonologically with the trained items. Training with word neighbors supported significantly better decoding than training with nonword neighbors in the period immediately following target presentation. Decoding regions included mostly right hemisphere regions in the posterior temporal lobe implicated in phonetic and lexical representation. Additionally, neighbors that aligned with target word beginnings (critical for word recognition) supported decoding, but equivalent phonological overlap with word codas did not, suggesting lexical mediation. Effective connectivity analyses showed a rich pattern of interaction between ROIs that support decoding based on training with lexical neighbors, especially driven by right posterior middle temporal gyrus. Collectively, these results evidence functional representation of wordforms in temporal lobes isolated from phonemic or semantic representations.
Affiliation(s)
- David O Sorensen
- Division of Medical Sciences, Harvard Medical School, Cambridge, MA, USA
- Enes Avcu
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Skyla Lynch
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Seppo P Ahlfors
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- David W Gow
- Division of Medical Sciences, Harvard Medical School, Cambridge, MA, USA
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Psychology, Salem State University, Salem, MA, USA
- Neurodynamics and Neural Decoding Group, Massachusetts General Hospital, 65 Landsdowne Street, rm 219, Cambridge, MA, 02139, USA
2
Lyu B, Marslen-Wilson WD, Fang Y, Tyler LK. Finding structure during incremental speech comprehension. eLife 2024; 12:RP89311. [PMID: 38577982] [PMCID: PMC10997333] [DOI: 10.7554/elife.89311]
Abstract
A core aspect of human speech comprehension is the ability to incrementally integrate consecutive words into a structured and coherent interpretation, aligning with the speaker's intended meaning. This rapid process is subject to multidimensional probabilistic constraints, including both linguistic knowledge and non-linguistic information within specific contexts, and it is their interpretative coherence that drives successful comprehension. To study the neural substrates of this process, we extracted word-by-word measures of sentential structure from BERT, a deep language model, which effectively approximates the coherent outcomes of the dynamic interplay among various types of constraints. Using representational similarity analysis, we tested BERT parse depths and relevant corpus-based measures against the spatiotemporally resolved brain activity recorded by electro-/magnetoencephalography when participants were listening to the same sentences. Our results provide a detailed picture of the neurobiological processes involved in the incremental construction of structured interpretations. These findings show when and where coherent interpretations emerge through the evaluation and integration of multifaceted constraints in the brain, which engages bilateral brain regions extending beyond the classical fronto-temporal language system. Furthermore, this study provides empirical evidence supporting the use of artificial neural networks as computational models for revealing the neural dynamics underpinning complex cognitive processes in the brain.
Affiliation(s)
- William D Marslen-Wilson
- Centre for Speech, Language and the Brain, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Yuxing Fang
- Centre for Speech, Language and the Brain, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Lorraine K Tyler
- Centre for Speech, Language and the Brain, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
3
Wang L, Brothers T, Jensen O, Kuperberg GR. Dissociating the pre-activation of word meaning and form during sentence comprehension: Evidence from EEG representational similarity analysis. Psychon Bull Rev 2024; 31:862-873. [PMID: 37783897] [PMCID: PMC10985416] [DOI: 10.3758/s13423-023-02385-0]
Abstract
During language comprehension, the processing of each incoming word is facilitated in proportion to its predictability. Here, we asked whether anticipated upcoming linguistic information is actually pre-activated before new bottom-up input becomes available, and if so, whether this pre-activation is limited to the level of semantic features, or whether it extends to representations of individual word-forms (orthography/phonology). We carried out Representational Similarity Analysis on EEG data while participants read highly constraining sentences. Prior to the onset of the expected target words, sentence pairs predicting semantically related words (financial "bank" - "loan") and form-related words (financial "bank" - river "bank") produced more similar neural patterns than pairs predicting unrelated words ("bank" - "lesson"). This provides direct neural evidence for item-specific semantic and form predictive pre-activation. Moreover, the semantic pre-activation effect preceded the form pre-activation effect, suggesting that top-down pre-activation is propagated from higher to lower levels of the linguistic hierarchy over time.
Affiliation(s)
- Lin Wang
- Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA, 02129, USA
- Department of Psychology, Tufts University, Medford, MA, USA
- Trevor Brothers
- Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA, 02129, USA
- Department of Psychology, Tufts University, Medford, MA, USA
- Ole Jensen
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
- Gina R Kuperberg
- Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA, 02129, USA
- Department of Psychology, Tufts University, Medford, MA, USA
4
Tabas A, von Kriegstein K. Multiple Concurrent Predictions Inform Prediction Error in the Human Auditory Pathway. J Neurosci 2024; 44:e2219222023. [PMID: 37949655] [PMCID: PMC10851690] [DOI: 10.1523/jneurosci.2219-22.2023]
Abstract
The key assumption of the predictive coding framework is that internal representations are used to generate predictions about what the sensory input will look like in the immediate future. These predictions are tested against the actual input by so-called prediction error units, which encode the residuals of the predictions. What happens to prediction errors, however, if predictions drawn by different stages of the sensory hierarchy contradict each other? To answer this question, we conducted two fMRI experiments while female and male human participants listened to sequences of sounds: pure tones in the first experiment and frequency-modulated sweeps in the second experiment. In both experiments, we used repetition to induce predictions based on stimulus statistics (stats-informed predictions) and abstract rules disclosed in the task instructions to induce an orthogonal set of (task-informed) predictions. We tested three alternative scenarios: neural responses in the auditory sensory pathway encode prediction error with respect to (1) the stats-informed predictions, (2) the task-informed predictions, or (3) a combination of both. Results showed that neural populations in all recorded regions (bilateral inferior colliculus, medial geniculate body, and primary and secondary auditory cortices) encode prediction error with respect to a combination of the two orthogonal sets of predictions. The findings suggest that predictive coding exploits the non-linear architecture of the auditory pathway for the transmission of predictions. Such non-linear transmission of predictions might be crucial for the predictive coding of complex auditory signals like speech.
Significance Statement
Sensory systems exploit our subjective expectations to make sense of an overwhelming influx of sensory signals. It is still unclear how expectations at each stage of the processing pipeline are used to predict the representations at the other stages. The current view is that this transmission is hierarchical and linear. Here we measured fMRI responses in auditory cortex, sensory thalamus, and midbrain while we induced two sets of mutually inconsistent expectations on the sensory input, each putatively encoded at a different stage. We show that responses at all stages are concurrently shaped by both sets of expectations. The results challenge the hypothesis that expectations are transmitted linearly and provide a normative explanation of the non-linear physiology of the corticofugal sensory system.
Affiliation(s)
- Alejandro Tabas
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Department of Psychology, Technische Universität Dresden, 01062 Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany
- Katharina von Kriegstein
- Department of Psychology, Technische Universität Dresden, 01062 Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany
5
Sorensen DO, Avcu E, Lynch S, Ahlfors SP, Gow DW. Neural representation of phonological wordform in bilateral posterior temporal cortex. bioRxiv 2023. [PMID: 37503242] [PMCID: PMC10370090] [DOI: 10.1101/2023.07.19.549751]
Abstract
While the neural bases of the earliest stages of speech categorization have been widely explored using neural decoding methods, there is still a lack of consensus on questions as basic as how wordforms are represented and in what way this word-level representation influences downstream processing in the brain. Isolating and localizing the neural representations of wordform is challenging because spoken words evoke activation of a variety of representations (e.g., segmental, semantic, articulatory) in addition to form-based representations. We addressed these challenges through a novel integrated neural decoding and effective connectivity design using region of interest (ROI)-based, source-reconstructed magnetoencephalography/electroencephalography (MEG/EEG) data collected during a lexical decision task. To localize wordform representations, we trained classifiers on words and nonwords from different phonological neighborhoods and then tested the classifiers' ability to discriminate between untrained target words that overlapped phonologically with the trained items. Training with either word or nonword neighbors supported decoding in many brain regions during an early analysis window (100-400 ms) reflecting primarily incremental phonological processing. Training with word neighbors, but not nonword neighbors, supported decoding in a bilateral set of temporal lobe ROIs, in a later time window (400-600 ms) reflecting activation related to word recognition. These ROIs included bilateral posterior temporal regions implicated in wordform representation. Effective connectivity analyses among regions within this subset indicated that word-evoked activity influenced the decoding accuracy more than nonword-evoked activity did. Taken together, these results evidence functional representation of wordforms in bilateral temporal lobes isolated from phonemic or semantic representations.
6
Bruffaerts R, Pongos A, Shain C, Lipkin B, Siegelman M, Wens V, Sjøgård M, Pantazis D, Blank I, Goldman S, De Tiège X, Fedorenko E. Functional identification of language-responsive channels in individual participants in MEG investigations. bioRxiv 2023. [PMID: 36993378] [PMCID: PMC10055362] [DOI: 10.1101/2023.03.23.533424]
Abstract
Making meaningful inferences about the functional architecture of the language system requires the ability to refer to the same neural units across individuals and studies. Traditional brain imaging approaches align and average brains together in a common space. However, lateral frontal and temporal cortex, where the language system resides, is characterized by high structural and functional inter-individual variability. This variability reduces the sensitivity and functional resolution of group-averaging analyses. This problem is compounded by the fact that language areas often lie in close proximity to regions of other large-scale networks with different functional profiles. A solution inspired by other fields of cognitive neuroscience (e.g., vision) is to identify language areas functionally in each individual brain using a 'localizer' task (e.g., a language comprehension task). This approach has proven productive in fMRI, yielding a number of discoveries about the language system, and has been successfully extended to intracranial recording investigations. Here, we apply this approach to MEG. Across two experiments (one in Dutch speakers, n=19; one in English speakers, n=23), we examined neural responses to the processing of sentences and a control condition (nonword sequences). We demonstrated that the neural response to language is spatially consistent at the individual level. The language-responsive sensors of interest were, as expected, less responsive to the nonwords condition. Clear inter-individual differences were present in the topography of the neural response to language, leading to greater sensitivity when the data were analyzed at the individual level compared to the group level. Thus, as in fMRI, functional localization yields benefits in MEG, opening the door to probing fine-grained distinctions in space and time in future MEG investigations of language processing.
Affiliation(s)
- Rose Bruffaerts
- Computational Neurology, Experimental Neurobiology Unit (ENU), Department of Biomedical Sciences, University of Antwerp, Belgium
- Department of Neurosciences, KU Leuven, Belgium
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Alvince Pongos
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Bioengineering, UC Berkeley-UCSF, San Francisco, CA, USA
- Cory Shain
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Benjamin Lipkin
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Matthew Siegelman
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Columbia University, New York, NY, USA
- Vincent Wens
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles, Brussels, Belgium
- Martin Sjøgård
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles, Brussels, Belgium
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Idan Blank
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, University of California Los Angeles, CA, USA
- Serge Goldman
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles, Brussels, Belgium
- Xavier De Tiège
- Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles, Brussels, Belgium
- Evelina Fedorenko
- Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
7
McMurray B, Sarrett ME, Chiu S, Black AK, Wang A, Canale R, Aslin RN. Decoding the temporal dynamics of spoken word and nonword processing from EEG. Neuroimage 2022; 260:119457. [PMID: 35842096] [PMCID: PMC10875705] [DOI: 10.1016/j.neuroimage.2022.119457]
Abstract
The efficiency of spoken word recognition is essential for real-time communication. There is consensus that this efficiency relies on an implicit process of activating multiple word candidates that compete for recognition as the acoustic signal unfolds in real time. However, few methods capture the neural basis of this dynamic competition on a msec-by-msec basis. This is crucial for understanding the neuroscience of language, and for understanding hearing, language and cognitive disorders in people for whom current behavioral methods are not suitable. We applied machine-learning techniques to standard EEG signals to decode which word was heard on each trial and analyzed the patterns of confusion over time. Results mirrored psycholinguistic findings: Early on, the decoder was equally likely to report the target (e.g., baggage) or a similar sounding competitor (badger), but by around 500 msec, competitors were suppressed. Follow-up analyses show that this is robust across EEG systems (gel and saline), with fewer channels, and with fewer trials. Results are robust within individuals and show high reliability. This suggests a powerful and simple paradigm that can assess the neural dynamics of speech decoding, with potential applications for understanding lexical development in a variety of clinical disorders.
Affiliation(s)
- Bob McMurray
- Dept. of Psychological and Brain Sciences, Dept. of Communication Sciences and Disorders, Dept. of Linguistics and Dept. of Otolaryngology, University of Iowa
- McCall E Sarrett
- Interdisciplinary Graduate Program in Neuroscience, University of Iowa
- Samantha Chiu
- Dept. of Psychological and Brain Sciences, University of Iowa
- Alexis K Black
- School of Audiology and Speech Sciences, University of British Columbia, Haskins Laboratories
- Alice Wang
- Dept. of Psychology, University of Oregon, Haskins Laboratories
- Rebecca Canale
- Dept. of Psychological Sciences, University of Connecticut, Haskins Laboratories
- Richard N Aslin
- Haskins Laboratories, Department of Psychology and Child Study Center, Yale University, Department of Psychology, University of Connecticut