1. Luo C, Ding N. Cortical encoding of hierarchical linguistic information when syllabic rhythms are obscured by echoes. Neuroimage 2024; 300:120875. PMID: 39341475; DOI: 10.1016/j.neuroimage.2024.120875.
Abstract
In speech perception, low-frequency cortical activity tracks hierarchical linguistic units (e.g., syllables, phrases, and sentences) on top of acoustic features (e.g., the speech envelope). Since fluctuations of the speech envelope typically correspond to syllabic boundaries, one common interpretation is that the acoustic envelope underlies the extraction of discrete syllables from continuous speech for subsequent linguistic processing. However, it remains unclear whether and how cortical activity encodes linguistic information when the speech envelope does not provide acoustic correlates of syllables. To address this issue, we introduced a frequency-tagging speech stream in which the syllabic rhythm was obscured by echoic envelopes and investigated neural encoding of hierarchical linguistic information using electroencephalography (EEG). When listeners attended to the echoic speech, cortical activity showed reliable tracking of the syllable, phrase, and sentence levels, among which the higher-level linguistic units elicited more robust neural responses. When attention was diverted from the echoic speech, reliable neural tracking of the syllable level was still observed, in contrast to deteriorated neural tracking of the phrase and sentence levels. Further analyses revealed that an envelope aligned with the syllabic rhythm could be recovered from the echoic speech through a neural adaptation model, and this reconstructed envelope yielded higher predictive power for the neural tracking responses than either the original echoic envelope or the anechoic envelope. Taken together, these results suggest that neural adaptation and attentional modulation jointly contribute to the neural encoding of linguistic information in distorted speech where the syllabic rhythm is obscured by echoes.
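The frequency-tagging approach used in this study relies on presenting linguistic units at fixed rates, so that neural responses locked to syllables, phrases, and sentences appear as narrow peaks in the EEG spectrum at those exact frequencies. A minimal sketch of that logic on simulated data (the rates, duration, and noise level below are hypothetical, not taken from the study):

```python
import numpy as np

fs = 100.0                       # sampling rate of simulated EEG (Hz)
t = np.arange(0, 60, 1 / fs)     # 60 s of data
rng = np.random.default_rng(0)

# Hypothetical tagged rates: sentences at 1 Hz, phrases at 2 Hz, syllables at 4 Hz
rates = [1.0, 2.0, 4.0]
eeg = sum(np.sin(2 * np.pi * f * t) for f in rates) + rng.standard_normal(t.size)

# Frequency-tagged responses show up as narrow spectral peaks at the tagged rates
spectrum = np.abs(np.fft.rfft(eeg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)

for f in rates:
    idx = np.argmin(np.abs(freqs - f))
    noise_floor = spectrum[np.abs(freqs - f).argsort()[2:12]].mean()
    print(f"{f:.0f} Hz peak is {spectrum[idx] / noise_floor:.0f}x the neighbouring bins")
```

In the actual study the peaks would sit at the stimulus's syllable, phrase, and sentence presentation rates, and their relative heights index tracking at each linguistic level.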
Affiliation(s)
- Cheng Luo
- Zhejiang Lab, Hangzhou 311121, China.
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310027, China
2. Naeije G, Niesen M, Vander Ghinst M, Bourguignon M. Simultaneous EEG recording of cortical tracking of speech and movement kinematics. Neuroscience 2024; 561:1-10. PMID: 39395635; DOI: 10.1016/j.neuroscience.2024.10.013.
Abstract
RATIONALE: Cortical activity is coupled with streams of sensory stimulation. The coupling with the temporal envelope of heard speech is known as the cortical tracking of speech (CTS), and that with movement kinematics as corticokinematic coupling (CKC). Simultaneous measurement of both couplings is desirable in clinical settings, but it is unknown whether the inherent dual-tasking condition affects CTS or CKC. AIM: We aimed to determine whether and how CTS and CKC levels are affected when recorded simultaneously. METHODS: Twenty-three healthy young adults underwent 64-channel EEG recordings while listening to stories and while performing repetitive finger-tapping movements in three conditions: separately (audio- or tapping-only) or simultaneously (audio-tapping). CTS and CKC values were estimated using coherence analysis between each EEG signal and the speech temporal envelope (CTS) or finger acceleration (CKC). CTS was also estimated as the reconstruction accuracy of a decoding model. RESULTS: Across recordings, CTS assessed with reconstruction accuracy was significant in 85% of the subjects at the phrasal frequency (0.5 Hz) and in 68% at syllabic frequencies (4-8 Hz), and CKC was significant in over 85% of the subjects at the movement frequency and its first harmonic. Comparing CTS and CKC values from separate recordings with those from simultaneous recordings revealed no significant differences and moderate-to-high levels of correlation. CONCLUSION: Despite subtle behavioral effects, CTS and CKC are not evidently altered by the dual-task setting inherent to recording them simultaneously and can be evaluated simultaneously using EEG in clinical settings.
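The coherence estimator named in the methods can be sketched on simulated signals: coherence between an "EEG" channel and a driving speech envelope is high in the envelope's frequency band and near zero elsewhere. The sampling rate, neural lag, and noise level here are invented for illustration, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, coherence, filtfilt

fs = 200.0                       # EEG sampling rate (Hz)
t = np.arange(0, 120, 1 / fs)    # 2 min of simulated data
rng = np.random.default_rng(1)

# Hypothetical speech envelope: noise band-limited to syllabic rates (4-8 Hz)
b, a = butter(4, [4, 8], btype="bandpass", fs=fs)
envelope = filtfilt(b, a, rng.standard_normal(t.size))

# Simulated EEG: envelope delayed by a 100 ms neural lag, plus background noise
eeg = np.roll(envelope, int(0.1 * fs)) + 0.5 * rng.standard_normal(t.size)

# CTS estimated as magnitude-squared coherence in 4 s Welch segments
freqs, coh = coherence(eeg, envelope, fs=fs, nperseg=int(4 * fs))
syllabic = coh[(freqs >= 4) & (freqs <= 8)].mean()
control = coh[(freqs >= 20) & (freqs <= 40)].mean()
print(f"coherence 4-8 Hz: {syllabic:.2f}, 20-40 Hz: {control:.2f}")
```

For CKC the envelope would simply be replaced by finger acceleration, with coherence evaluated at the movement frequency and its harmonics.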
Affiliation(s)
- Gilles Naeije
- Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Centre de Référence Neuromusculaire, Department of Neurology, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium.
- Maxime Niesen
- Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Marc Vander Ghinst
- Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Service d'ORL et de chirurgie cervico-faciale, HUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Mathieu Bourguignon
- Laboratoire de Neuroanatomie et Neuroimagerie Translationnelles, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; Laboratory of Neurophysiology and Movement Biomechanics, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
3. Déaux EC, Piette T, Gaunet F, Legou T, Arnal L, Giraud AL. Dog-human vocal interactions match dogs' sensory-motor tuning. PLoS Biol 2024; 22:e3002789. PMID: 39352912; PMCID: PMC11444399; DOI: 10.1371/journal.pbio.3002789.
Abstract
Within species, vocal and auditory systems presumably coevolved to converge on a critical temporal acoustic structure that can be best produced and perceived. While dogs cannot produce articulated sounds, they respond to speech, raising the question of whether this heterospecific receptive ability could be shaped by exposure to speech or remains bounded by their own sensorimotor capacity. Using acoustic analyses of dog vocalisations, we show that their main production rhythm is slower than the dominant (syllabic) speech rate, and that human dog-directed speech falls halfway in between. Comparative exploration of neural (electroencephalography) and behavioural responses to speech reveals that comprehension in dogs relies on slower speech-rhythm tracking (delta) than humans' (theta), even though dogs are equally sensitive to speech content and prosody. Thus, dog audio-motor tuning differs from humans', and we hypothesise that humans may adjust their speech rate to this shared temporal channel as a means to improve communication efficacy.
Affiliation(s)
- Eloïse C. Déaux
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Théophane Piette
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Florence Gaunet
- Aix-Marseille University and CNRS, Laboratoire de Psychologie Cognitive (UMR 7290), Marseille, France
- Thierry Legou
- Aix Marseille University and CNRS, Laboratoire Parole et Langage (UMR 6057), Aix-en-Provence, France
- Luc Arnal
- Université Paris Cité, Institut Pasteur, AP-HP, Inserm, Fondation Pour l’Audition, Institut de l’Audition, IHU reConnect, F-75012 Paris, France
- Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Université Paris Cité, Institut Pasteur, AP-HP, Inserm, Fondation Pour l’Audition, Institut de l’Audition, IHU reConnect, F-75012 Paris, France
4. Kasten FH, Busson Q, Zoefel B. Opposing neural processing modes alternate rhythmically during sustained auditory attention. Commun Biol 2024; 7:1125. PMID: 39266696; PMCID: PMC11393317; DOI: 10.1038/s42003-024-06834-x.
Abstract
During continuous tasks, humans show spontaneous fluctuations in performance, putatively caused by varying attentional resources allocated to processing external information. If neural resources are used to process other, presumably "internal" information, sensory input can be missed, explaining an apparent dichotomy of "internal" versus "external" attention. In the current study, we extract presumed neural signatures of these attentional modes in human electroencephalography (EEG): neural entrainment and α-oscillations (~10 Hz), linked to the processing and suppression of sensory information, respectively. We test whether they exhibit structured fluctuations over time while listeners attend to an ecologically relevant stimulus, like speech, and complete a task that requires full and continuous attention. Results show an antagonistic relation between neural entrainment to speech and spontaneous α-oscillations in two distinct brain networks: one specialized in the processing of external information, the other reminiscent of the dorsal attention network. These opposing neural modes undergo slow, periodic fluctuations at ~0.07 Hz and are related to the detection of auditory targets. Our study might have tapped into a general attentional mechanism that is conserved across species and has important implications for situations in which sustained attention to sensory information is critical.
Affiliation(s)
- Florian H Kasten
- Department for Cognitive, Affective, Behavioral Neuroscience with Focus Neurostimulation, Institute of Psychology, University of Trier, Trier, Germany.
- Centre de Recherche Cerveau & Cognition, CNRS, Toulouse, France.
- Université Toulouse III Paul Sabatier, Toulouse, France.
- Benedikt Zoefel
- Centre de Recherche Cerveau & Cognition, CNRS, Toulouse, France.
- Université Toulouse III Paul Sabatier, Toulouse, France.
5. Çetinçelik M, Jordan-Barros A, Rowland CF, Snijders TM. The effect of visual speech cues on neural tracking of speech in 10-month-old infants. Eur J Neurosci 2024; 60:5381-5399. PMID: 39188179; DOI: 10.1111/ejn.16492.
Abstract
While infants' sensitivity to visual speech cues and the benefit of these cues have been well established by behavioural studies, there is little evidence on the effect of visual speech cues on infants' neural processing of continuous auditory speech. In this study, we investigated whether visual speech cues, such as the movements of the lips, jaw, and larynx, facilitate infants' neural speech tracking. Ten-month-old Dutch-learning infants watched videos of a speaker reciting passages in infant-directed speech while electroencephalography (EEG) was recorded. In the videos, either the full face of the speaker was displayed or the speaker's mouth and jaw were masked with a block, obstructing the visual speech cues. To assess neural tracking, speech-brain coherence (SBC) was calculated, focusing particularly on the stress and syllabic rates (1-1.75 and 2.5-3.5 Hz, respectively, in our stimuli). First, overall SBC was compared to surrogate data; then, differences in SBC between the two conditions were tested at the frequencies of interest. Our results indicated that infants show significant tracking at both stress and syllabic rates. However, no differences were identified between the two conditions, meaning that infants' neural tracking was not modulated further by the presence of visual speech cues. Furthermore, we demonstrated that infants' neural tracking of low-frequency information is related to their subsequent vocabulary development at 18 months. Overall, this study provides evidence that infants' neural tracking of speech is not necessarily impaired when visual speech cues are not fully visible and that neural tracking may be a potential mechanism in successful language acquisition.
Affiliation(s)
- Melis Çetinçelik
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Department of Experimental Psychology, Utrecht University, Utrecht, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Antonia Jordan-Barros
- Centre for Brain and Cognitive Development, Department of Psychological Science, Birkbeck, University of London, London, UK
- Experimental Psychology, University College London, London, UK
- Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
6. Zoefel B, Abbasi O, Gross J, Kotz SA. Entrainment echoes in the cerebellum. Proc Natl Acad Sci U S A 2024; 121:e2411167121. PMID: 39136991; PMCID: PMC11348099; DOI: 10.1073/pnas.2411167121.
Abstract
Evidence accumulates that the cerebellum's role in the brain is not restricted to motor functions. Rather, cerebellar activity seems to be crucial for a variety of tasks that rely on precise event timing and prediction. Due to its complex structure and importance in communication, human speech requires a particularly precise and predictive coordination of neural processes to be successfully comprehended. Recent studies proposed that the cerebellum is indeed a major contributor to speech processing, but how this contribution is achieved mechanistically remains poorly understood. The current study aimed to reveal a mechanism underlying cortico-cerebellar coordination and demonstrate its speech-specificity. In a reanalysis of magnetoencephalography data, we found that activity in the cerebellum aligned to rhythmic sequences of noise-vocoded speech, irrespective of its intelligibility. We then tested whether these "entrained" responses persist, and how they interact with other brain regions, when a rhythmic stimulus stopped and temporal predictions had to be updated. We found that only intelligible speech produced sustained rhythmic responses in the cerebellum. During this "entrainment echo," but not during rhythmic speech itself, cerebellar activity was coupled with that in the left inferior frontal gyrus, and specifically at rates corresponding to the preceding stimulus rhythm. This finding represents evidence for specific cerebellum-driven temporal predictions in speech processing and their relay to cortical regions.
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition, CNRS, Toulouse 31100, France
- Université Paul Sabatier Toulouse III, Toulouse 31400, France
- Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster 48149, Germany
- Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster 48149, Germany
- Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster 48149, Germany
- Sonja A. Kotz
- Department of Neuropsychology and Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht 6229, The Netherlands
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
7. Issa MF, Khan I, Ruzzoli M, Molinaro N, Lizarazu M. On the speech envelope in the cortical tracking of speech. Neuroimage 2024; 297:120675. PMID: 38885886; DOI: 10.1016/j.neuroimage.2024.120675.
Abstract
The synchronization between the speech envelope and neural activity in auditory regions, referred to as cortical tracking of speech (CTS), plays a key role in speech processing. The method selected for extracting the envelope is a crucial step in CTS measurement, and the absence of a consensus on best practices among the various methods can influence analysis outcomes and interpretation. Here, we systematically compare five standard envelope extraction methods (the absolute value of the Hilbert transform (absHilbert), gammatone filterbanks, a heuristic approach, the Bark scale, and vocalic energy), analyzing their impact on CTS. We present performance metrics for each method based on recordings of brain activity from participants listening to speech in clear and noisy conditions, utilizing intracranial EEG, MEG, and EEG data. As expected, we observed significant CTS in temporal brain regions below 10 Hz across all datasets, regardless of the extraction method. In general, the gammatone filterbanks approach consistently demonstrated superior performance compared to the other methods. Results from our study can guide scientists in the field toward informed decisions about the optimal analysis for extracting CTS, contributing to advancing the understanding of the neuronal mechanisms implicated in CTS.
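As a concrete reference point, the absHilbert method listed first can be sketched in a few lines: take the magnitude of the analytic signal, then low-pass filter it to keep the slow amplitude fluctuations. The toy carrier, modulation rate, and 10 Hz cutoff below are illustrative choices, not the paper's exact pipeline:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 16000.0                     # audio sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)      # 2 s toy signal

# Toy "speech": a 500 Hz carrier amplitude-modulated at a 4 Hz syllabic rate
modulation = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
audio = modulation * np.sin(2 * np.pi * 500 * t)

# absHilbert: envelope = |analytic signal|, then low-pass filtered below 10 Hz
raw_env = np.abs(hilbert(audio))
b, a = butter(4, 10, btype="lowpass", fs=fs)
envelope = filtfilt(b, a, raw_env)

# The recovered envelope should track the imposed modulation closely
r = np.corrcoef(envelope, modulation)[0, 1]
print(f"correlation with the true modulation: {r:.3f}")
```

The gammatone-filterbank approach that performed best in this comparison instead extracts subband envelopes from a cochlea-inspired filterbank and combines them before the same low-pass step.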
Affiliation(s)
- Mohamed F Issa
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Department of Scientific Computing, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt.
- Izhar Khan
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
- Manuela Ruzzoli
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Nicola Molinaro
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Mikel Lizarazu
- BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
8. Chalas N, Meyer L, Lo CW, Park H, Kluger DS, Abbasi O, Kayser C, Nitsch R, Gross J. Dissociating prosodic from syntactic delta activity during natural speech comprehension. Curr Biol 2024; 34:3537-3549.e5. PMID: 39047734; DOI: 10.1016/j.cub.2024.06.072.
Abstract
Decoding human speech requires the brain to segment the incoming acoustic signal into meaningful linguistic units, ranging from syllables and words to phrases. Integrating these linguistic constituents into a coherent percept sets the root of compositional meaning and hence understanding. One important cue for segmentation in natural speech is prosody, such as pauses, but its interplay with higher-level linguistic processing is still unknown. Here, we dissociate the neural tracking of prosodic pauses from the segmentation of multi-word chunks using magnetoencephalography (MEG). We find that manipulating the regularity of pauses disrupts slow speech-brain tracking bilaterally in auditory areas (below 2 Hz) and in turn increases left-lateralized coherence of higher-frequency auditory activity at speech onsets (around 25-45 Hz). Critically, we also find that multi-word chunks (defined as short, coherent bundles of inter-word dependencies) are processed through the rhythmic fluctuations of low-frequency activity (below 2 Hz) bilaterally and independently of prosodic cues. Importantly, low-frequency alignment at chunk onsets increases the accuracy of an encoding model in bilateral auditory and frontal areas while controlling for the effect of acoustics. Our findings provide novel insights into the neural basis of speech perception, demonstrating that both acoustic features (prosodic cues) and abstract linguistic processing at the multi-word timescale are underpinned independently by low-frequency electrophysiological brain activity in the delta frequency range.
Affiliation(s)
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany; Institute for Translational Neuroscience, University of Münster, Münster, Germany.
- Lars Meyer
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Chia-Wen Lo
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Hyojin Park
- Centre for Human Brain Health (CHBH), School of Psychology, University of Birmingham, Birmingham, UK
- Daniel S Kluger
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Robert Nitsch
- Institute for Translational Neuroscience, University of Münster, Münster, Germany
- Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany; Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
9. Cometa A, Battaglini C, Artoni F, Greco M, Frank R, Repetto C, Bottoni F, Cappa SF, Micera S, Ricciardi E, Moro A. Brain and grammar: revealing electrophysiological basic structures with competing statistical models. Cereb Cortex 2024; 34:bhae317. PMID: 39098819; DOI: 10.1093/cercor/bhae317.
Abstract
Acoustic, lexical, and syntactic information are simultaneously processed in the brain, requiring complex strategies to distinguish their electrophysiological activity. Capitalizing on previous work that factors out acoustic information, we concentrated on the lexical and syntactic contributions to language processing by testing competing statistical models. We exploited electroencephalographic recordings and compared different surprisal models selectively involving lexical information, part of speech, or syntactic structures in various combinations. Electroencephalographic responses were recorded in 32 participants while they listened to affirmative active declarative sentences. We compared the activation corresponding to basic syntactic structures, such as noun phrases vs. verb phrases. Lexical and syntactic processing activate different frequency bands, partially different time windows, and different networks. Moreover, surprisal models based only on the part-of-speech inventory do not explain the electrophysiological data well, while those including syntactic information do. By disentangling acoustic, lexical, and syntactic information, we demonstrated differential brain sensitivity to syntactic information. These results confirm and extend previous measures obtained with intracranial recordings, supporting our hypothesis that syntactic structures are crucial in neural language processing. This study provides a detailed understanding of how the brain processes syntactic information, highlighting the importance of syntactic surprisal in shaping neural responses during language comprehension.
Affiliation(s)
- Andrea Cometa
- MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Chiara Battaglini
- Neurolinguistics and Experimental Pragmatics (NEP) Lab, University School for Advanced Studies IUSS Pavia, Piazza della Vittoria 15, Pavia 27100, Italy
- Fiorenzo Artoni
- Department of Clinical Neurosciences, Faculty of Medicine, University of Geneva, 1, rue Michel-Servet, Genéve 1211, Switzerland
- Matteo Greco
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Robert Frank
- Department of Linguistics, Yale University, 370 Temple St, New Haven, CT 06511, United States
- Claudia Repetto
- Department of Psychology, Università Cattolica del Sacro Cuore, Largo A. Gemelli 1, Milan 20123, Italy
- Franco Bottoni
- Istituto Clinico Humanitas, IRCCS, Via Alessandro Manzoni 56, Rozzano 20089, Italy
- Stefano F Cappa
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
- Dementia Research Center, IRCCS Mondino Foundation National Institute of Neurology, Via Mondino 2, Pavia 27100, Italy
- Silvestro Micera
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera 56025, Italy
- Bertarelli Foundation Chair in Translational NeuroEngineering, Center for Neuroprosthetics and School of Engineering, Ecole Polytechnique Federale de Lausanne, Campus Biotech, Chemin des Mines 9, Geneva, GE CH 1202, Switzerland
- Emiliano Ricciardi
- MoMiLab, IMT School for Advanced Studies Lucca, Piazza S.Francesco, 19, Lucca 55100, Italy
- Andrea Moro
- Cognitive Neuroscience (ICoN) Center, University School for Advanced Studies IUSS, Piazza Vittoria 15, Pavia 27100, Italy
10. Iverson P, Song J. Neural Tracking of Speech Acoustics in Noise Is Coupled with Lexical Predictability as Estimated by Large Language Models. eNeuro 2024; 11:ENEURO.0507-23.2024. PMID: 39095091; PMCID: PMC11335968; DOI: 10.1523/eneuro.0507-23.2024.
Abstract
Adults heard recordings of two spatially separated speakers reading newspaper and magazine articles. They were asked to listen to one of them and ignore the other, and EEG was recorded to assess their neural processing. Machine learning extracted neural sources that tracked the target and distractor speakers at three levels: the acoustic envelope of speech (delta- and theta-band modulations), lexical frequency for individual words, and the contextual predictability of individual words estimated by GPT-4 and earlier lexical models. To provide a broader view of speech perception, half of the subjects completed a simultaneous visual task, and the listeners included both native and non-native English speakers. Distinct neural components were extracted for these levels of auditory and lexical processing, demonstrating that native English speakers had greater target-distractor separation compared with non-native English speakers on most measures, and that lexical processing was reduced by the visual task. Moreover, there was a novel interaction of lexical predictability and frequency with auditory processing; acoustic tracking was stronger for lexically harder words, suggesting that people listened harder to the acoustics when needed for lexical selection. This demonstrates that speech perception is not simply a feedforward process from acoustic processing to the lexicon. Rather, the adaptable context-sensitive processing long known to occur at a lexical level has broader consequences for perception, coupling with the acoustic tracking of individual speakers in noise.
Affiliation(s)
- Paul Iverson
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, United Kingdom
- Jieun Song
- School of Digital Humanities and Computational Social Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
11. Teng X, Larrouy-Maestri P, Poeppel D. Segmenting and Predicting Musical Phrase Structure Exploits Neural Gain Modulation and Phase Precession. J Neurosci 2024; 44:e1331232024. PMID: 38926087; PMCID: PMC11270514; DOI: 10.1523/jneurosci.1331-23.2024.
Abstract
Music, like spoken language, is often characterized by hierarchically organized structure. Previous experiments have shown neural tracking of notes and beats, but little work touches on the more abstract question: how does the brain establish high-level musical structures in real time? We presented Bach chorales to participants (20 females and 9 males) undergoing electroencephalogram (EEG) recording to investigate how the brain tracks musical phrases. We removed the main temporal cues to phrasal structures, so that listeners could only rely on harmonic information to parse a continuous musical stream. Phrasal structures were disrupted by locally or globally reversing the harmonic progression, so that our observations on the original music could be controlled and compared. We first replicated the findings on neural tracking of musical notes and beats, substantiating the positive correlation between musical training and neural tracking. Critically, we discovered a neural signature in the frequency range ∼0.1 Hz (modulations of EEG power) that reliably tracks musical phrasal structure. Next, we developed an approach to quantify the phrasal phase precession of the EEG power, revealing that phrase tracking is indeed an operation of active segmentation involving predictive processes. We demonstrate that the brain establishes complex musical structures online over long timescales (>5 s) and actively segments continuous music streams in a manner comparable to language processing. These two neural signatures, phrase tracking and phrasal phase precession, provide new conceptual and technical tools to study the processes underpinning high-level structure building using noninvasive recording techniques.
Affiliation(s)
- Xiangbin Teng
- Department of Psychology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Pauline Larrouy-Maestri
- Music Department, Max-Planck-Institute for Empirical Aesthetics, Frankfurt 60322, Germany
- Center for Language, Music, and Emotion (CLaME), New York, New York 10003
- David Poeppel
- Center for Language, Music, and Emotion (CLaME), New York, New York 10003
- Department of Psychology, New York University, New York, New York 10003
- Ernst Struengmann Institute for Neuroscience, Frankfurt 60528, Germany
- Music and Audio Research Laboratory (MARL), New York, New York 11201
12
Pérez-Navarro J, Klimovich-Gray A, Lizarazu M, Piazza G, Molinaro N, Lallier M. Early language experience modulates the tradeoff between acoustic-temporal and lexico-semantic cortical tracking of speech. iScience 2024; 27:110247. [PMID: 39006483 PMCID: PMC11246002 DOI: 10.1016/j.isci.2024.110247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2023] [Revised: 03/14/2024] [Accepted: 06/07/2024] [Indexed: 07/16/2024] Open
Abstract
Cortical tracking of speech is relevant for the development of speech perception skills. However, no study to date has explored whether and how cortical tracking of speech is shaped by accumulated language experience, the central question of this study. In 35 bilingual children (6 years old) with considerably greater experience in one language, we collected electroencephalography data while they listened to continuous speech in their two languages. Cortical tracking of speech was assessed at acoustic-temporal and lexico-semantic levels. Children showed more robust acoustic-temporal tracking in the less experienced language, and more sensitive cortical tracking of semantic information in the more experienced language. Additionally, and only for the more experienced language, acoustic-temporal tracking was specifically linked to phonological abilities, and lexico-semantic tracking to vocabulary knowledge. Our results indicate that accumulated linguistic experience is a relevant maturational factor for the cortical tracking of speech at different levels during early language acquisition.
Affiliation(s)
- Jose Pérez-Navarro
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Mikel Lizarazu
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Giorgio Piazza
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Nicola Molinaro
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
- Ikerbasque, Basque Foundation for Science, 48009 Bilbao, Spain
- Marie Lallier
- Basque Center on Cognition, Brain and Language (BCBL), 20009 Donostia-San Sebastian, Spain
13
Xiao Q, Zheng X, Wen Y, Yuan Z, Chen Z, Lan Y, Li S, Huang X, Zhong H, Xu C, Zhan C, Pan J, Xie Q. Individualized music induces theta-gamma phase-amplitude coupling in patients with disorders of consciousness. Front Neurosci 2024; 18:1395627. [PMID: 39010944 PMCID: PMC11248187 DOI: 10.3389/fnins.2024.1395627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2024] [Accepted: 06/18/2024] [Indexed: 07/17/2024] Open
Abstract
Objective This study aimed to determine whether patients with disorders of consciousness (DoC) could experience neural entrainment to individualized music, exploring the cross-modal influences of music on patients with DoC through phase-amplitude coupling (PAC). Furthermore, the study assessed the efficacy of individualized or preferred music (PM) versus relaxing music (RM) in impacting patient outcomes, and examined the role of cross-modal influences in determining these outcomes. Methods Thirty-two patients with DoC [17 with vegetative state/unresponsive wakefulness syndrome (VS/UWS) and 15 with minimally conscious state (MCS)], alongside 16 healthy controls (HCs), were recruited for this study. Neural activity in the frontal-parietal network was recorded using scalp electroencephalography (EEG) during baseline (BL), RM, and PM conditions. Cerebral-acoustic coherence (CACoh) was computed to investigate participants' ability to track music, while PAC was used to evaluate the cross-modal influences of music. Three months post-intervention, the outcomes of patients with DoC were followed up using the Coma Recovery Scale-Revised (CRS-R). Results HCs and patients with MCS showed higher CACoh compared to VS/UWS patients within the musical pulse frequency (p = 0.016, p = 0.045; p < 0.001, p = 0.048, for RM and PM, respectively, following Bonferroni correction). Only theta-gamma PAC demonstrated a significant interaction effect between groups and music conditions (F(2,44) = 2.685, p = 0.036). For HCs, theta-gamma PAC in the frontal-parietal network was stronger in the PM condition than in the RM (p = 0.016) and BL (p < 0.001) conditions. For patients with MCS, theta-gamma PAC was stronger in the PM than in the BL condition (p = 0.040), while no difference was observed among the three conditions in patients with VS/UWS. Additionally, MCS patients who showed improved outcomes after 3 months exhibited evident neural responses to preferred music (p = 0.019). Furthermore, the ratio of theta-gamma coupling changes in PM relative to BL predicted clinical outcomes in MCS patients (r = 0.992, p < 0.001). Conclusion Individualized music may serve as a potential therapeutic method for patients with DoC through cross-modal influences, which rely on enhanced theta-gamma PAC within the consciousness-related network.
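The theta-gamma PAC measure at the heart of this abstract can be illustrated with a minimal sketch using the mean-vector-length estimator, one common way to quantify PAC. The band edges, sampling rate, and toy signals below are illustrative assumptions, not the study's actual pipeline or parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    # Zero-phase band-pass filter between lo and hi Hz.
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def pac_mvl(x, fs, phase_band=(4, 8), amp_band=(30, 45)):
    # Mean vector length PAC: gamma amplitude weighted by theta phase.
    theta_phase = np.angle(hilbert(bandpass(x, *phase_band, fs)))
    gamma_amp = np.abs(hilbert(bandpass(x, *amp_band, fs)))
    return float(np.abs(np.mean(gamma_amp * np.exp(1j * theta_phase))))

# Toy signals: gamma bursts riding theta peaks (coupled) vs. constant gamma (flat).
rng = np.random.default_rng(0)
fs = 250
t = np.arange(0, 20, 1 / fs)
theta = np.sin(2 * np.pi * 6 * t)
gamma = 0.3 * np.sin(2 * np.pi * 40 * t)
coupled = theta + (1 + theta) * gamma + 0.1 * rng.standard_normal(t.size)
flat = theta + gamma + 0.1 * rng.standard_normal(t.size)
assert pac_mvl(coupled, fs) > pac_mvl(flat, fs)
```

In practice, raw PAC values are usually compared against surrogate distributions (e.g., from phase-shuffled data) rather than interpreted directly, since filter settings and signal amplitude bias the estimate.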
Affiliation(s)
- Qiuyi Xiao
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Xiaochun Zheng
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Yun Wen
- Music and Reflection Incorporated, Guangzhou, Guangdong, China
- Zhanxing Yuan
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Zerong Chen
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Yue Lan
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Shuiyan Li
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Xiyan Huang
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Haili Zhong
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Chengwei Xu
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Chang'an Zhan
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, Guangdong, China
- Jiahui Pan
- School of Software, South China Normal University, Guangzhou, Guangdong, China
- Qiuyou Xie
- Joint Research Centre for Disorders of Consciousness, Department of Rehabilitation Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
- Department of Hyperbaric Oxygen, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- School of Rehabilitation Sciences, Southern Medical University, Guangzhou, Guangdong, China
14
Baus C, Millan I, Chen XJ, Blanco-Elorrieta E. Exploring the Interplay Between Language Comprehension and Cortical Tracking: The Bilingual Test Case. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2024; 5:484-496. [PMID: 38911463 PMCID: PMC11192516 DOI: 10.1162/nol_a_00141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 03/04/2024] [Indexed: 06/25/2024]
Abstract
Cortical tracking, the synchronization of brain activity to linguistic rhythms, is a well-established phenomenon. However, its nature has been heavily contested: Is it purely epiphenomenal or does it play a fundamental role in speech comprehension? Previous research has used intelligibility manipulations to examine this topic. Here, we instead varied listeners' language comprehension skills while keeping the auditory stimulus constant. To do so, we tested 22 native English speakers and 22 Spanish/Catalan bilinguals learning English as a second language (SL) in an EEG cortical entrainment experiment and correlated the responses with the magnitude of the N400 component in a semantic comprehension task. As expected, native listeners effectively tracked sentential, phrasal, and syllabic linguistic structures. In contrast, SL listeners exhibited limitations in tracking sentential structures but successfully tracked phrasal and syllabic rhythms. Importantly, the amplitude of the neural entrainment correlated with the amplitude of the detection of semantic incongruities in SL listeners, showing a direct connection between tracking and the ability to understand speech. Together, these findings shed light on the interplay between language comprehension and cortical tracking, identifying neural entrainment as a fundamental principle for speech comprehension.
Affiliation(s)
- Cristina Baus
- Department of Cognition, Development and Educational Psychology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Esti Blanco-Elorrieta
- Department of Psychology, New York University, New York, NY, USA
- Department of Neural Science, New York University, New York, NY, USA
15
Kries J, De Clercq P, Gillis M, Vanthornhout J, Lemmens R, Francart T, Vandermosten M. Exploring neural tracking of acoustic and linguistic speech representations in individuals with post-stroke aphasia. Hum Brain Mapp 2024; 45:e26676. [PMID: 38798131 PMCID: PMC11128780 DOI: 10.1002/hbm.26676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Revised: 03/04/2024] [Accepted: 03/21/2024] [Indexed: 05/29/2024] Open
Abstract
Aphasia is a communication disorder that affects processing of language at different levels (e.g., acoustic, phonological, semantic). Recording brain activity via electroencephalography while people listen to a continuous story allows analysis of brain responses to acoustic and linguistic properties of speech. When the neural activity aligns with these speech properties, it is referred to as neural tracking. Even though measuring neural tracking of speech may present an interesting approach to studying aphasia in an ecologically valid way, it has not yet been investigated in individuals with stroke-induced aphasia. Here, we explored processing of acoustic and linguistic speech representations in individuals with aphasia in the chronic phase after stroke and in age-matched healthy controls. We found decreased neural tracking of acoustic speech representations (envelope and envelope onsets) in individuals with aphasia. In addition, word surprisal displayed decreased amplitudes in individuals with aphasia around 195 ms over frontal electrodes, although this effect was not corrected for multiple comparisons. These results show that there is potential to capture language processing impairments in individuals with aphasia by measuring neural tracking of continuous speech. However, more research is needed to validate these results. Nonetheless, this exploratory study shows that neural tracking of naturalistic, continuous speech presents a powerful approach to studying aphasia.
Affiliation(s)
- Jill Kries
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Department of Psychology, Stanford University, Stanford, California, USA
- Pieter De Clercq
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Marlies Gillis
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Jonas Vanthornhout
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Robin Lemmens
- Experimental Neurology, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Laboratory of Neurobiology, VIB-KU Leuven Center for Brain and Disease Research, Leuven, Belgium
- Department of Neurology, University Hospitals Leuven, Leuven, Belgium
- Tom Francart
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
- Maaike Vandermosten
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Leuven, Belgium
16
Nora A, Rinkinen O, Renvall H, Service E, Arkkila E, Smolander S, Laasonen M, Salmelin R. Impaired Cortical Tracking of Speech in Children with Developmental Language Disorder. J Neurosci 2024; 44:e2048232024. [PMID: 38589232 PMCID: PMC11140678 DOI: 10.1523/jneurosci.2048-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 03/25/2024] [Accepted: 03/26/2024] [Indexed: 04/10/2024] Open
Abstract
In developmental language disorder (DLD), learning to comprehend and express oneself with spoken language is impaired, but the reason for this remains unknown. Using millisecond-scale magnetoencephalography recordings combined with machine learning models, we investigated whether the possible neural basis of this disruption lies in poor cortical tracking of speech. The stimuli were common spoken Finnish words (e.g., dog, car, hammer) and sounds with corresponding meanings (e.g., dog bark, car engine, hammering). In both children with DLD (10 boys and 7 girls) and typically developing (TD) control children (14 boys and 3 girls), aged 10-15 years, the cortical activation to spoken words was best modeled as time-locked to the unfolding speech input at ∼100 ms latency between sound and cortical activation. Amplitude envelope (amplitude changes) and spectrogram (detailed time-varying spectral content) of the spoken words, but not other sounds, were very successfully decoded based on time-locked brain responses in bilateral temporal areas; based on the cortical responses, the models could tell at ∼75-85% accuracy which of the two sounds had been presented to the participant. However, the cortical representation of the amplitude envelope information was poorer in children with DLD compared with TD children at longer latencies (at ∼200-300 ms lag). We interpret this effect as reflecting poorer retention of acoustic-phonetic information in short-term memory. This impaired tracking could potentially affect the processing and learning of words as well as continuous speech. The present results offer an explanation for the problems in language comprehension and acquisition in DLD.
Affiliation(s)
- Anni Nora
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
- Oona Rinkinen
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
- Hanna Renvall
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
- BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, Helsinki FI-00029, Finland
- Elisabet Service
- Department of Linguistics and Languages, Centre for Advanced Research in Experimental and Applied Linguistics (ARiEAL), McMaster University, Hamilton, Ontario L8S 4L8, Canada
- Department of Psychology and Logopedics, University of Helsinki, Helsinki FI-00014, Finland
- Eva Arkkila
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
- Sini Smolander
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
- Research Unit of Logopedics, University of Oulu, Oulu FI-90014, Finland
- Department of Logopedics, University of Eastern Finland, Joensuu FI-80101, Finland
- Marja Laasonen
- Department of Otorhinolaryngology and Phoniatrics, Head and Neck Center, Helsinki University Hospital and University of Helsinki, Helsinki FI-00014, Finland
- Department of Logopedics, University of Eastern Finland, Joensuu FI-80101, Finland
- Riitta Salmelin
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo FI-00076, Finland
- Aalto NeuroImaging (ANI), Aalto University, Espoo FI-00076, Finland
17
Aldag N, Nogueira W. Psychoacoustic and electroencephalographic responses to changes in amplitude modulation depth and frequency in relation to speech recognition in cochlear implantees. Sci Rep 2024; 14:8181. [PMID: 38589483 PMCID: PMC11002021 DOI: 10.1038/s41598-024-58225-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Accepted: 03/26/2024] [Indexed: 04/10/2024] Open
Abstract
Temporal envelope modulations (TEMs) are one of the most important features that cochlear implant (CI) users rely on to understand speech. Electroencephalographic assessment of TEM encoding could help clinicians to predict speech recognition more objectively, even in patients unable to provide active feedback. The acoustic change complex (ACC) and the auditory steady-state response (ASSR) evoked by low-frequency amplitude-modulated pulse trains can be used to assess TEM encoding with electrical stimulation of individual CI electrodes. In this study, we focused on amplitude modulation detection (AMD) and amplitude modulation frequency discrimination (AMFD) with stimulation of a basal versus an apical electrode. In twelve adult CI users, we (a) assessed behavioral AMFD thresholds and (b) recorded cortical auditory evoked potentials (CAEPs), AMD-ACC, AMFD-ACC, and ASSR in a combined 3-stimulus paradigm. We found that the electrophysiological responses were significantly higher for apical than for basal stimulation. Peak amplitudes of AMFD-ACC were small and (therefore) did not correlate with speech-in-noise recognition. We found significant correlations between speech-in-noise recognition and (a) behavioral AMFD thresholds and (b) AMD-ACC peak amplitudes. AMD and AMFD hold potential to develop a clinically applicable tool for assessing TEM encoding to predict speech recognition in CI users.
Affiliation(s)
- Nina Aldag
- Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany
- Waldo Nogueira
- Department of Otolaryngology, Hannover Medical School and Cluster of Excellence 'Hearing4all', Hanover, Germany
18
Riddle J, Schooler JW. Hierarchical consciousness: the Nested Observer Windows model. Neurosci Conscious 2024; 2024:niae010. [PMID: 38504828 PMCID: PMC10949963 DOI: 10.1093/nc/niae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Revised: 01/31/2024] [Accepted: 02/26/2024] [Indexed: 03/21/2024] Open
Abstract
Foremost in our experience is the intuition that we possess a unified conscious experience. However, many observations run counter to this intuition: we experience paralyzing indecision when faced with two appealing behavioral choices, we simultaneously hold contradictory beliefs, and the content of our thought is often characterized by an internal debate. Here, we propose the Nested Observer Windows (NOW) Model, a framework for hierarchical consciousness wherein information processed across many spatiotemporal scales of the brain feeds into subjective experience. The model likens the mind to a hierarchy of nested mosaic tiles: an image is composed of mosaic tiles, and each of these tiles is itself an image composed of mosaic tiles. Unitary consciousness exists at the apex of this nested hierarchy, where perceptual constructs become fully integrated and complex behaviors are initiated via abstract commands. We define an observer window as a spatially and temporally constrained system within which information is integrated, e.g., in functional brain regions and neurons. Three principles from the signal analysis of electrical activity describe the nested hierarchy and generate testable predictions. First, nested observer windows disseminate information across spatiotemporal scales with cross-frequency coupling. Second, observer windows are characterized by a high degree of internal synchrony (with zero phase lag). Third, observer windows at the same spatiotemporal level share information with each other through coherence (with non-zero phase lag). The theoretical framework of the NOW Model accounts for a wide range of subjective experiences and offers a novel approach for integrating prominent theories of consciousness.
Affiliation(s)
- Justin Riddle
- Department of Psychology, Florida State University, 1107 W Call St, Tallahassee, FL 32304, USA
- Jonathan W Schooler
- Department of Psychological & Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
19
Corsini A, Tomassini A, Pastore A, Delis I, Fadiga L, D'Ausilio A. Speech perception difficulty modulates theta-band encoding of articulatory synergies. J Neurophysiol 2024; 131:480-491. [PMID: 38323331 DOI: 10.1152/jn.00388.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Revised: 01/04/2024] [Accepted: 01/25/2024] [Indexed: 02/08/2024] Open
Abstract
The human brain tracks available speech acoustics and extrapolates missing information such as the speaker's articulatory patterns. However, the extent to which articulatory reconstruction supports speech perception remains unclear. This study explores the relationship between articulatory reconstruction and task difficulty. Participants listened to sentences and performed a speech-rhyming task. Real kinematic data of the speaker's vocal tract were recorded via electromagnetic articulography (EMA) and aligned to corresponding acoustic outputs. We extracted articulatory synergies from the EMA data with principal component analysis (PCA) and employed partial information decomposition (PID) to separate the electroencephalographic (EEG) encoding of acoustic and articulatory features into unique, redundant, and synergistic atoms of information. We median-split sentences into easy (ES) and hard (HS) based on participants' performance and found that greater task difficulty involved greater encoding of unique articulatory information in the theta band. We conclude that fine-grained articulatory reconstruction plays a complementary role in the encoding of speech acoustics, lending further support to the claim that motor processes support speech perception. NEW & NOTEWORTHY: Top-down processes originating from the motor system contribute to speech perception through the reconstruction of the speaker's articulatory movement. This study investigates the role of such articulatory simulation under variable task difficulty. We show that more challenging listening tasks lead to increased encoding of articulatory kinematics in the theta band and suggest that, in such situations, fine-grained articulatory reconstruction complements acoustic encoding.
Affiliation(s)
- Alessandro Corsini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alice Tomassini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Aldo Pastore
- Laboratorio NEST, Scuola Normale Superiore, Pisa, Italy
- Ioannis Delis
- School of Biomedical Sciences, University of Leeds, Leeds, United Kingdom
- Luciano Fadiga
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alessandro D'Ausilio
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
20
Momtaz S, Bidelman GM. Effects of Stimulus Rate and Periodicity on Auditory Cortical Entrainment to Continuous Sounds. eNeuro 2024; 11:ENEURO.0027-23.2024. [PMID: 38253583 PMCID: PMC10913036 DOI: 10.1523/eneuro.0027-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Revised: 01/14/2024] [Accepted: 01/16/2024] [Indexed: 01/24/2024] Open
Abstract
The neural mechanisms underlying the exogenous coding and neural entrainment to repetitive auditory stimuli have seen a recent surge of interest. However, few studies have characterized how parametric changes in stimulus presentation alter entrained responses. We examined the degree to which the brain entrains to repeated speech (i.e., /ba/) and nonspeech (i.e., click) sounds using phase-locking value (PLV) analysis applied to multichannel human electroencephalogram (EEG) data. Passive cortico-acoustic tracking was investigated in N = 24 normal young adults utilizing EEG source analyses that isolated neural activity stemming from both auditory temporal cortices. We parametrically manipulated the rate and periodicity of repetitive, continuous speech and click stimuli to investigate how speed and jitter in ongoing sound streams affect oscillatory entrainment. Neuronal synchronization to speech was enhanced at 4.5 Hz (the putative universal rate of speech) and showed a differential pattern to that of clicks, particularly at higher rates. PLV to speech decreased with increasing jitter but remained superior to clicks. Surprisingly, PLV entrainment to clicks was invariant to periodicity manipulations. Our findings provide evidence that the brain's neural entrainment to complex sounds is enhanced and more sensitized when processing speech-like stimuli, even at the syllable level, relative to nonspeech sounds. The fact that this specialization is apparent even under passive listening suggests a priority of the auditory system for synchronizing to behaviorally relevant signals.
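The phase-locking value (PLV) analysis used in this study can be illustrated with a minimal sketch: the magnitude of the trial-averaged phase difference between a stimulus and a response. The 4.5 Hz rate echoes the abstract's "putative universal rate of speech," but the trial counts, noise levels, and Hilbert-based phase extraction below are illustrative assumptions rather than the authors' source-level pipeline.

```python
import numpy as np
from scipy.signal import hilbert

def plv(x_trials, y_trials):
    # PLV across trials: |mean of exp(i * phase difference)| at each time point.
    phase_x = np.angle(hilbert(x_trials, axis=-1))
    phase_y = np.angle(hilbert(y_trials, axis=-1))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y)), axis=0))

rng = np.random.default_rng(1)
fs = 500
t = np.arange(0, 2, 1 / fs)
stim = np.sin(2 * np.pi * 4.5 * t)          # 4.5 Hz, near the syllable rate
stim_trials = np.tile(stim, (30, 1))

# "Entrained" responses share the stimulus phase; phase-jittered ones do not.
locked = np.stack([stim + 0.5 * rng.standard_normal(t.size) for _ in range(30)])
jittered = np.stack([np.sin(2 * np.pi * 4.5 * t + rng.uniform(0, 2 * np.pi))
                     for _ in range(30)])
assert plv(locked, stim_trials).mean() > plv(jittered, stim_trials).mean()
```

Because PLV is bounded below by a chance level that depends on trial count, comparisons across conditions with different numbers of trials typically require trial matching or surrogate baselines.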
Affiliation(s)
- Sara Momtaz
- School of Communication Sciences & Disorders, University of Memphis, Memphis, Tennessee 38152
- Boys Town National Research Hospital, Boys Town, Nebraska 68131
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, Indiana 47408
- Program in Neuroscience, Indiana University, Bloomington, Indiana 47405
21
Karunathilake IMD, Brodbeck C, Bhattasali S, Resnik P, Simon JZ. Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.02.578603. [PMID: 38352332 PMCID: PMC10862830 DOI: 10.1101/2024.02.02.578603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/22/2024]
Abstract
When we listen to speech, our brain's neurophysiological responses "track" its acoustic features, but it is less well understood how these auditory responses are modulated by linguistic content. Here, we recorded magnetoencephalography (MEG) responses while subjects listened to four types of continuous-speech-like passages: speech-envelope modulated noise, English-like non-words, scrambled words, and a narrative passage. Temporal response function (TRF) analysis provides strong neural evidence for the emergent features of speech processing in cortex, from acoustics to higher-level linguistics, as incremental steps in neural speech processing. Critically, we show a stepwise hierarchical progression of progressively higher-order features over time, reflected in both bottom-up (early) and top-down (late) processing stages. Linguistically driven top-down mechanisms take the form of late N400-like responses, suggesting a central role for predictive coding mechanisms at multiple levels. As expected, the neural processing of lower-level acoustic feature responses is bilateral or right-lateralized, with left lateralization emerging only for lexical-semantic features. Finally, our results identify potential neural markers of the computations underlying speech perception and comprehension.
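The TRF analysis named in the abstract is, at its core, a regularized regression from time-lagged stimulus features to the neural response. The sketch below uses invented parameters (lag range, ridge penalty, toy kernel) and a plain ridge solver; dedicated toolboxes implement the full boosting or cross-validated ridge pipelines used in studies like this one.

```python
import numpy as np

def lagged_design(stim, n_lags):
    # Columns are the stimulus delayed by 0 .. n_lags-1 samples.
    X = np.zeros((stim.size, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[:stim.size - k]
    return X

def fit_trf(stim, resp, n_lags, lam=1.0):
    # Ridge regression: solve (X'X + lam*I) w = X'y for the lag weights w.
    X = lagged_design(stim, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ resp)

# Toy check: a response generated by convolving the stimulus with a known
# kernel should yield a TRF peaking at the kernel's peak lag.
rng = np.random.default_rng(2)
stim = rng.standard_normal(5000)
kernel = np.array([0.0, 0.5, 1.0, 0.5, 0.0])
resp = np.convolve(stim, kernel)[:stim.size] + 0.1 * rng.standard_normal(stim.size)
trf = fit_trf(stim, resp, n_lags=5)
assert np.argmax(trf) == np.argmax(kernel)
```

With multiple features (envelope, word onsets, surprisal), the design matrix simply concatenates one lagged block per feature, and each feature then receives its own TRF.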
Affiliation(s)
| | - Christian Brodbeck
- Department of Computing and Software, McMaster University, Hamilton, ON, Canada
| | - Shohini Bhattasali
- Department of Language Studies, University of Toronto, Scarborough, Canada
| | - Philip Resnik
- Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA
| | - Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA
- Department of Biology, University of Maryland, College Park, MD, USA
- Institute for Systems Research, University of Maryland, College Park, MD, USA
22
Zoefel B, Kösem A. Neural tracking of continuous acoustics: properties, speech-specificity and open questions. Eur J Neurosci 2024; 59:394-414. [PMID: 38151889] [DOI: 10.1111/ejn.16221]
Abstract
Human speech is a particularly relevant acoustic stimulus for our species, due to its role in information transmission during communication. Speech is inherently a dynamic signal, and a recent line of research has focused on neural activity following the temporal structure of speech. We review findings that characterise neural dynamics in the processing of continuous acoustics and that allow us to compare these dynamics with temporal aspects of human speech. We highlight properties and constraints that neural and speech dynamics share, suggesting that auditory neural systems are optimised to process human speech. We then discuss the speech-specificity of neural dynamics and their potential mechanistic origins, and summarise open questions in the field.
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition (CerCo), CNRS UMR 5549, Toulouse, France
- Université de Toulouse III Paul Sabatier, Toulouse, France
- Anne Kösem
- Lyon Neuroscience Research Center (CRNL), INSERM U1028, Bron, France
23
Dikker S, Brito NH, Dumas G. It takes a village: A multi-brain approach to studying multigenerational family communication. Dev Cogn Neurosci 2024; 65:101330. [PMID: 38091864] [PMCID: PMC10716709] [DOI: 10.1016/j.dcn.2023.101330]
Abstract
Grandparents play a critical role in child rearing across the globe. Yet, there is a shortage of neurobiological research examining the relationship between grandparents and their grandchildren. We employ multi-brain neurocomputational models to simulate how changes in neurophysiological processes in both development and healthy aging affect multigenerational inter-brain coupling - a neural marker that has been linked to a range of socio-emotional and cognitive outcomes. The simulations suggest that grandparent-child interactions may be paired with higher inter-brain coupling than parent-child interactions, raising the possibility that the former may be more advantageous under certain conditions. Critically, this enhancement of inter-brain coupling for grandparent-child interactions is more pronounced in tri-generational interactions that also include a parent, which may speak to findings that grandparent involvement in childrearing is most beneficial if the parent is also an active household member. Together, these findings underscore that a better understanding of the neurobiological basis of cross-generational interactions is vital, and that such knowledge can be helpful in guiding interventions that consider the whole family. We advocate for a community neuroscience approach in developmental social neuroscience to capture the diversity of child-caregiver relationships in real-world settings.
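Multi-brain simulations of the kind described above are often built from coupled phase oscillators, with inter-brain coupling quantified as phase locking between the simulated "brains". A toy sketch of that logic (this is not the authors' neurocomputational model; the frequencies, coupling strength, and noise level are all invented):

```python
import numpy as np

def simulate_pair(w1, w2, k, t_max=50.0, dt=0.01, seed=0):
    """Two Kuramoto phase oscillators with mutual coupling k;
    returns the phase-locking value (1 = locked, 0 = independent)."""
    rng = np.random.default_rng(seed)
    steps = int(t_max / dt)
    th1, th2 = 0.0, np.pi / 2
    phase_diff = np.empty(steps)
    for i in range(steps):
        noise = rng.standard_normal(2) * 0.05
        dth1 = w1 + k * np.sin(th2 - th1) + noise[0]
        dth2 = w2 + k * np.sin(th1 - th2) + noise[1]
        th1 += dth1 * dt
        th2 += dth2 * dt
        phase_diff[i] = th1 - th2
    return np.abs(np.mean(np.exp(1j * phase_diff)))

# Oscillators with closer intrinsic frequencies (e.g., more similar
# "neural rhythms") lock more easily at the same coupling strength
plv_similar = simulate_pair(w1=6.0, w2=6.2, k=1.0)
plv_dissimilar = simulate_pair(w1=6.0, w2=9.0, k=1.0)
print(plv_similar, plv_dissimilar)
```

The point of the sketch is only that inter-brain coupling in such models depends jointly on coupling strength and on how well-matched the two oscillators' intrinsic dynamics are.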
24
Gao J, Chen H, Fang M, Ding N. Original speech and its echo are segregated and separately processed in the human brain. PLoS Biol 2024; 22:e3002498. [PMID: 38358954] [PMCID: PMC10868781] [DOI: 10.1371/journal.pbio.3002498]
Abstract
Speech recognition crucially relies on slow temporal modulations (<16 Hz) in speech. Recent studies, however, have demonstrated that the long-delay echoes, which are common during online conferencing, can eliminate crucial temporal modulations in speech but do not affect speech intelligibility. Here, we investigated the underlying neural mechanisms. MEG experiments demonstrated that cortical activity can effectively track the temporal modulations eliminated by an echo, which cannot be fully explained by basic neural adaptation mechanisms. Furthermore, cortical responses to echoic speech can be better explained by a model that segregates speech from its echo than by a model that encodes echoic speech as a whole. The speech segregation effect was observed even when attention was diverted but would disappear when segregation cues, i.e., speech fine structure, were removed. These results strongly suggested that, through mechanisms such as stream segregation, the auditory system can build an echo-insensitive representation of speech envelope, which can support reliable speech recognition.
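The way a long-delay echo eliminates temporal modulations is simple comb filtering of the envelope: an echo delayed by half a modulation period fills the troughs left by the anechoic envelope. A minimal numeric sketch (the 4 Hz rate and 125 ms delay are invented for illustration, not the study's stimulus parameters):

```python
import numpy as np

fs = 1000
t = np.arange(0, 2.0, 1 / fs)
f_mod = 4.0                         # ~syllabic modulation rate (Hz)

# Anechoic envelope: 4 Hz modulation, as at syllable boundaries
env = 1.0 + np.cos(2 * np.pi * f_mod * t)

# Echo delayed by half a modulation period: peaks fill troughs
delay = int(fs / (2 * f_mod))       # 125 ms at 4 Hz
echo = np.concatenate([np.zeros(delay), env[:-delay]])
env_echoic = env + echo

def mod_depth(x, skip):
    x = x[skip:]                    # ignore the echo onset transient
    return (x.max() - x.min()) / (x.max() + x.min())

# Modulation depth: 1.0 for the anechoic envelope, ~0 after the echo
print(mod_depth(env, delay), mod_depth(env_echoic, delay))
```

The modulation at the tagged rate vanishes from the summed envelope even though the underlying speech signal, and hence its intelligibility cues, are still present, which is why envelope-based accounts struggle with echoic speech.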
Affiliation(s)
- Jiaxin Gao
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Honghua Chen
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Mingxuan Fang
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nanhu Brain-computer Interface Institute, Hangzhou, China
- The State key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou, China
25
Silva Pereira S, Özer EE, Sebastian-Galles N. Complexity of STG signals and linguistic rhythm: a methodological study for EEG data. Cereb Cortex 2024; 34:bhad549. [PMID: 38236741] [DOI: 10.1093/cercor/bhad549]
Abstract
The superior temporal and Heschl's gyri of the human brain play a fundamental role in speech processing. Neurons synchronize their activity to the amplitude envelope of the speech signal to extract acoustic and linguistic features, a process known as neural tracking or entrainment. Electroencephalography (EEG) has been used extensively in language-related research due to its high temporal resolution and reduced cost, but it does not allow for precise source localization. Motivated by the lack of a unified methodology for the interpretation of source-reconstructed signals, we propose a method based on modularity and signal complexity. The procedure was tested on data from an experiment in which we investigated the impact of native language on the tracking of linguistic rhythms in two groups: English natives and Spanish natives. In the experiment, we found no effect of native language but an effect of language rhythm. Here, we compare source-projected signals in the auditory areas of both hemispheres across conditions using nonparametric permutation tests, modularity, and a dynamical complexity measure. We found increasing values of complexity for decreased regularity in the stimuli, allowing us to conclude that languages with less complex rhythms are easier for the auditory cortex to track.
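Signal-complexity measures of this kind can be illustrated with a simple Lempel-Ziv phrase count on a binarized signal: a regular rhythm yields fewer novel phrases than an irregular one. This sketch uses an LZ78-style phrase counter on synthetic signals; the paper's actual dynamical complexity measure is assumed to differ, and the 4 Hz "rhythm" is invented:

```python
import numpy as np

def lz_phrase_count(x):
    """LZ78-style complexity: binarize a signal at its median,
    then count the novel phrases seen while scanning it."""
    med = np.median(x)
    bits = ''.join('1' if v > med else '0' for v in x)
    phrases, cur = set(), ''
    for ch in bits:
        cur += ch
        if cur not in phrases:      # a phrase not seen before
            phrases.add(cur)
            cur = ''
    return len(phrases)

rng = np.random.default_rng(6)
fs = 100
t = np.arange(0, 10, 1 / fs)

regular = np.cos(2 * np.pi * 4 * t)        # perfectly regular rhythm
irregular = rng.standard_normal(len(t))    # no rhythmic structure

c_reg, c_irr = lz_phrase_count(regular), lz_phrase_count(irregular)
print(c_reg, c_irr)
```

The irregular signal produces a substantially higher phrase count, mirroring the reported pattern of higher complexity for less regular stimuli.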
Affiliation(s)
- Silvana Silva Pereira
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Ege Ekin Özer
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Nuria Sebastian-Galles
- Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
26
Mongold SJ, Georgiev C, Legrand T, Bourguignon M. Afferents to Action: Cortical Proprioceptive Processing Assessed with Corticokinematic Coherence Specifically Relates to Gross Motor Skills. eNeuro 2024; 11:ENEURO.0384-23.2023. [PMID: 38164580] [PMCID: PMC10849019] [DOI: 10.1523/eneuro.0384-23.2023]
Abstract
Voluntary motor control is thought to be predicated on the ability to efficiently integrate and process somatosensory afferent information. However, current approaches in the field of motor control have not factored in objective markers of how the brain tracks incoming somatosensory information. Here, we asked whether motor performance relates to such markers obtained with an analysis of the coupling between peripheral kinematics and cortical oscillations during continuous movements, best known as corticokinematic coherence (CKC). Motor performance was evaluated by measuring both gross and fine motor skills using the Box and Blocks Test (BBT) and the Purdue Pegboard Test (PPT), respectively, and with a biomechanics measure of coordination. A total of 61 participants completed the BBT, while equipped with electroencephalography and electromyography, and the PPT. We evaluated CKC, from the signals collected during the BBT, as the coherence between movement rhythmicity and brain activity, and coordination as the cross-correlation between muscle activity. CKC at movements' first harmonic was positively associated with BBT scores (r = 0.41, p = 0.001), and alone showed no relationship with PPT scores (r = 0.07, p = 0.60), but in synergy with BBT scores, participants with lower PPT scores had higher CKC than expected based on their BBT score. Coordination was not associated with motor performance or CKC (p > 0.05). These findings demonstrate that cortical somatosensory processing in the form of strengthened brain-peripheral coupling is specifically associated with better gross motor skills and thus may be considered as a valuable addition to classical tests of proprioception acuity.
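Corticokinematic coherence is, at its core, magnitude-squared coherence between a kinematic signal and brain activity, with peaks at the movement rate and its harmonics. A hedged sketch on synthetic signals (a 3 Hz "movement" with a harmonic and a noisy simulated "EEG" copy; nothing here reproduces the study's recordings or preprocessing):

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(1)
fs = 200
t = np.arange(0, 60, 1 / fs)
f_move = 3.0                              # repetitive movement rate (Hz)

# Kinematics: rhythmic movement with some harmonic distortion
kin = np.cos(2 * np.pi * f_move * t) + 0.5 * np.cos(2 * np.pi * 2 * f_move * t)

# "EEG": a weak copy of the kinematics buried in noise
eeg = 0.2 * kin + rng.standard_normal(len(t))

# Welch coherence with 0.5 Hz resolution
f, coh = coherence(kin, eeg, fs=fs, nperseg=fs * 2)
peak = f[np.argmax(coh)]
print(f"coherence peaks at {peak:.1f} Hz")
```

The coherence spectrum peaks at the movement frequency, and a smaller peak appears at the first harmonic, which is the component the study relates to gross motor skill.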
Affiliation(s)
- Scott J Mongold
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- Christian Georgiev
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- Thomas Legrand
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- University College Dublin (UCD), School of Electrical and Electronic Engineering, D04 V1W8 Dublin, Ireland
- Mathieu Bourguignon
- Université libre de Bruxelles (ULB), UNI-ULB Neuroscience Institute, Laboratory of Neurophysiology and Movement Biomechanics, 1070 Brussels, Belgium
- Université libre de Bruxelles (ULB), UNI - ULB Neurosciences Institute, Laboratoire de Neuroanatomie et de Neuroimagerie translationnelles (LN2T), 1070 Brussels, Belgium
- BCBL, Basque Center on Cognition, Brain and Language, 20009 San Sebastian, Spain
27
MacIntyre AD, Carlyon RP, Goehring T. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation. Trends Hear 2024; 28:23312165241266316. [PMID: 39183533] [PMCID: PMC11345737] [DOI: 10.1177/23312165241266316]
Abstract
During continuous speech perception, endogenous neural activity becomes time-locked to acoustic stimulus features, such as the speech amplitude envelope. This speech-brain coupling can be decoded using non-invasive brain imaging techniques, including electroencephalography (EEG). Neural decoding may provide clinical use as an objective measure of stimulus encoding by the brain-for example during cochlear implant listening, wherein the speech signal is severely spectrally degraded. Yet, interplay between acoustic and linguistic factors may lead to top-down modulation of perception, thereby complicating audiological applications. To address this ambiguity, we assess neural decoding of the speech envelope under spectral degradation with EEG in acoustically hearing listeners (n = 38; 18-35 years old) using vocoded speech. We dissociate sensory encoding from higher-order processing by employing intelligible (English) and non-intelligible (Dutch) stimuli, with auditory attention sustained using a repeated-phrase detection task. Subject-specific and group decoders were trained to reconstruct the speech envelope from held-out EEG data, with decoder significance determined via random permutation testing. Whereas speech envelope reconstruction did not vary by spectral resolution, intelligible speech was associated with better decoding accuracy in general. Results were similar across subject-specific and group analyses, with less consistent effects of spectral degradation in group decoding. Permutation tests revealed possible differences in decoder statistical significance by experimental condition. In general, while robust neural decoding was observed at the individual and group level, variability within participants would most likely prevent the clinical use of such a measure to differentiate levels of spectral degradation and intelligibility on an individual basis.
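Backward decoding of this kind reconstructs the stimulus envelope from multichannel EEG with time-lagged regression, scoring accuracy as the correlation between reconstructed and actual envelopes on held-out data. A self-contained sketch on synthetic data (the channel filters, lag range, and ridge parameter are all invented; real decoders are trained with cross-validation):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_ch, n_lags = 3000, 8, 16

# Synthetic envelope and EEG: each channel sees the envelope
# through a random short filter, plus channel noise
env = rng.standard_normal(n)
eeg = np.stack([np.convolve(env, rng.standard_normal(5), 'same')
                + rng.standard_normal(n) for _ in range(n_ch)], axis=1)

# Decoder design matrix: all channels at lags 0..15
# (the EEG response follows the stimulus in time)
X = np.column_stack([np.roll(eeg[:, c], -k)
                     for c in range(n_ch) for k in range(n_lags)])
X[-n_lags:] = 0                     # discard wrapped-around samples

# Train on the first half, evaluate on the held-out second half
half = n // 2
lam = 10.0
w = np.linalg.solve(X[:half].T @ X[:half] + lam * np.eye(X.shape[1]),
                    X[:half].T @ env[:half])
r = np.corrcoef(X[half:] @ w, env[half:])[0, 1]
print(f"held-out reconstruction accuracy: r = {r:.2f}")
```

Decoder significance in the study was assessed against a permutation-based null distribution, which this sketch omits.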
Affiliation(s)
- Robert P. Carlyon
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Tobias Goehring
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
28
Batterink LJ, Mulgrew J, Gibbings A. Rhythmically Modulating Neural Entrainment during Exposure to Regularities Influences Statistical Learning. J Cogn Neurosci 2024; 36:107-127. [PMID: 37902580] [DOI: 10.1162/jocn_a_02079]
Abstract
The ability to discover regularities in the environment, such as syllable patterns in speech, is known as statistical learning. Previous studies have shown that statistical learning is accompanied by neural entrainment, in which neural activity temporally aligns with repeating patterns over time. However, it is unclear whether these rhythmic neural dynamics play a functional role in statistical learning or whether they largely reflect the downstream consequences of learning, such as the enhanced perception of learned words in speech. To better understand this issue, we manipulated participants' neural entrainment during statistical learning using continuous rhythmic visual stimulation. Participants were exposed to a speech stream of repeating nonsense words while viewing either (1) a visual stimulus with a "congruent" rhythm that aligned with the word structure, (2) a visual stimulus with an incongruent rhythm, or (3) a static visual stimulus. Statistical learning was subsequently measured using both an explicit and implicit test. Participants in the congruent condition showed a significant increase in neural entrainment over auditory regions at the relevant word frequency, over and above effects of passive volume conduction, indicating that visual stimulation successfully altered neural entrainment within relevant neural substrates. Critically, during the subsequent implicit test, participants in the congruent condition showed an enhanced ability to predict upcoming syllables and stronger neural phase synchronization to component words, suggesting that they had gained greater sensitivity to the statistical structure of the speech stream relative to the incongruent and static groups. This learning benefit could not be attributed to strategic processes, as participants were largely unaware of the contingencies between the visual stimulation and embedded words. 
These results indicate that manipulating neural entrainment during exposure to regularities influences statistical learning outcomes, suggesting that neural entrainment may functionally contribute to statistical learning. Our findings encourage future studies using non-invasive brain stimulation methods to further understand the role of entrainment in statistical learning.
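The word-frequency entrainment measure rests on frequency tagging: when tri-syllabic words are embedded in a constant-rate syllable stream, learning shows up as a spectral peak at the word rate (syllable rate / 3). A toy simulation of that logic (the 4 Hz syllable rate and the response amplitudes are invented, not the study's):

```python
import numpy as np

fs, dur = 100, 30.0
t = np.arange(0, dur, 1 / fs)
f_syll = 4.0                    # syllable rate (Hz)
f_word = f_syll / 3             # tri-syllabic words -> ~1.33 Hz
rng = np.random.default_rng(3)

def word_rate_amplitude(learned):
    """Simulated EEG: syllable tracking is always present; a
    word-rate component appears only once words are learned."""
    x = np.cos(2 * np.pi * f_syll * t)
    if learned:
        x = x + 0.5 * np.cos(2 * np.pi * f_word * t)
    x = x + rng.standard_normal(len(t))
    spec = np.abs(np.fft.rfft(x)) / len(t)
    freqs = np.fft.rfftfreq(len(t), 1 / fs)
    return spec[np.argmin(np.abs(freqs - f_word))]   # word-rate bin

amp_learned = word_rate_amplitude(True)
amp_naive = word_rate_amplitude(False)
print(amp_learned, amp_naive)
```

Note the 30 s window gives a 1/30 Hz frequency resolution, so the 4/3 Hz word rate falls exactly on an FFT bin; frequency-tagging designs choose durations with this in mind.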
29
Ahmed F, Nidiffer AR, Lalor EC. The effect of gaze on EEG measures of multisensory integration in a cocktail party scenario. Front Hum Neurosci 2023; 17:1283206. [PMID: 38162285] [PMCID: PMC10754997] [DOI: 10.3389/fnhum.2023.1283206]
Abstract
Seeing the speaker's face greatly improves our speech comprehension in noisy environments. This is due to the brain's ability to combine the auditory and the visual information around us, a process known as multisensory integration. Selective attention also strongly influences what we comprehend in scenarios with multiple speakers-an effect known as the cocktail-party phenomenon. However, the interaction between attention and multisensory integration is not fully understood, especially when it comes to natural, continuous speech. In a recent electroencephalography (EEG) study, we explored this issue and showed that multisensory integration is enhanced when an audiovisual speaker is attended compared to when that speaker is unattended. Here, we extend that work to investigate how this interaction varies depending on a person's gaze behavior, which affects the quality of the visual information they have access to. To do so, we recorded EEG from 31 healthy adults as they performed selective attention tasks in several paradigms involving two concurrently presented audiovisual speakers. We then modeled how the recorded EEG related to the audio speech (envelope) of the presented speakers. Crucially, we compared two classes of model - one that assumed underlying multisensory integration (AV) versus another that assumed two independent unisensory audio and visual processes (A+V). This comparison revealed evidence of strong attentional effects on multisensory integration when participants were looking directly at the face of an audiovisual speaker. This effect was not apparent when the speaker's face was in the peripheral vision of the participants. Overall, our findings suggest a strong influence of attention on multisensory integration when high fidelity visual (articulatory) speech information is available. 
More generally, this suggests that the interplay between attention and multisensory integration during natural audiovisual speech is dynamic and is adaptable based on the specific task and environment.
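The AV-versus-A+V comparison can be mimicked with two linear models: one including an audiovisual interaction term and one treating the audio and visual features as independent, compared on held-out prediction accuracy. A synthetic sketch of that comparison (the simulated interaction term stands in for multisensory integration; nothing here is the authors' TRF pipeline, and all coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4000
aud = rng.standard_normal(n)           # audio envelope feature
vis = rng.standard_normal(n)           # visual (articulatory) feature

# Simulated EEG with a genuine multisensory interaction term
eeg = 0.8 * aud + 0.6 * vis + 0.5 * aud * vis + rng.standard_normal(n)

def heldout_r(X, y, half=n // 2, lam=1.0):
    """Ridge fit on the first half, correlation on the second."""
    w = np.linalg.solve(X[:half].T @ X[:half] + lam * np.eye(X.shape[1]),
                        X[:half].T @ y[:half])
    return np.corrcoef(X[half:] @ w, y[half:])[0, 1]

X_av = np.column_stack([aud, vis, aud * vis])    # integrative AV model
X_sum = np.column_stack([aud, vis])              # independent A+V model

r_av, r_sum = heldout_r(X_av, eeg), heldout_r(X_sum, eeg)
print(r_av, r_sum)
```

When a true interaction is present, the AV model out-predicts the A+V model on held-out data; the study draws its inference from exactly this kind of model-comparison advantage.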
Affiliation(s)
- Edmund C. Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY, United States
30
Karunathilake IMD, Kulasingham JP, Simon JZ. Neural tracking measures of speech intelligibility: Manipulating intelligibility while keeping acoustics unchanged. Proc Natl Acad Sci U S A 2023; 120:e2309166120. [PMID: 38032934] [PMCID: PMC10710032] [DOI: 10.1073/pnas.2309166120]
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRFs analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
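Noise vocoding of the kind used for these stimuli splits the signal into frequency bands, keeps each band's envelope, and replaces its fine structure with band-limited noise. A rough three-band sketch with SciPy (the band edges, filter orders, and test signal are invented, not the study's vocoder settings):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def vocode(x, fs, edges=(80, 500, 1500, 4000), seed=0):
    """Noise-vocode x: keep each band's Hilbert envelope and
    impose it on a band-limited noise carrier (3 bands here)."""
    rng = np.random.default_rng(seed)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(3, [lo, hi], btype='bandpass', fs=fs)
        band = filtfilt(b, a, x)
        env = np.abs(hilbert(band))            # band envelope
        noise = filtfilt(b, a, rng.standard_normal(len(x)))
        out += env * noise                     # envelope on noise carrier
    return out

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
# Crude speech-like test signal: a 200 Hz carrier with 4 Hz modulation
speechlike = np.sin(2 * np.pi * 200 * t) * (1 + np.sin(2 * np.pi * 4 * t))
voc = vocode(speechlike, fs)
print(voc.shape)
```

The vocoded output preserves the slow envelope within each band while discarding spectral detail, which is what lets the study hold acoustics fixed while priming changes intelligibility.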
Affiliation(s)
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742
- Department of Biology, University of Maryland, College Park, MD 20742
- Institute for Systems Research, University of Maryland, College Park, MD 20742
31
Çetinçelik M, Rowland CF, Snijders TM. Ten-month-old infants' neural tracking of naturalistic speech is not facilitated by the speaker's eye gaze. Dev Cogn Neurosci 2023; 64:101297. [PMID: 37778275] [PMCID: PMC10543766] [DOI: 10.1016/j.dcn.2023.101297]
Abstract
Eye gaze is a powerful ostensive cue in infant-caregiver interactions, with demonstrable effects on language acquisition. While the link between gaze following and later vocabulary is well-established, the effects of eye gaze on other aspects of language, such as speech processing, are less clear. In this EEG study, we examined the effects of the speaker's eye gaze on ten-month-old infants' neural tracking of naturalistic audiovisual speech, a marker for successful speech processing. Infants watched videos of a speaker telling stories, addressing the infant with direct or averted eye gaze. We assessed infants' speech-brain coherence at stress (1-1.75 Hz) and syllable (2.5-3.5 Hz) rates, tested for differences in attention by comparing looking times and EEG theta power in the two conditions, and investigated whether neural tracking predicts later vocabulary. Our results showed that infants' brains tracked the speech rhythm both at the stress and syllable rates, and that infants' neural tracking at the syllable rate predicted later vocabulary. However, speech-brain coherence did not significantly differ between direct and averted gaze conditions and infants did not show greater attention to direct gaze. Overall, our results suggest significant neural tracking at ten months, related to vocabulary development, but not modulated by speaker's gaze.
Affiliation(s)
- Melis Çetinçelik
- Department of Experimental Psychology, Utrecht University, Utrecht, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands.
- Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands; Cognitive Neuropsychology Department, Tilburg University, Tilburg, the Netherlands
32
Mai G, Wang WSY. Distinct roles of delta- and theta-band neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing. Hum Brain Mapp 2023; 44:6149-6172. [PMID: 37818940] [PMCID: PMC10619373] [DOI: 10.1002/hbm.26503]
Abstract
The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz) including delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded when normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with the partialling procedure that singles out unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. 
We argue that these results indicate that delta-band tracking may play a role in predictive coding, leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. The early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solely explained by attentional demands or listening effort. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit with the hierarchical Predictive Coding framework. Together, we suggest distinct roles of delta and theta neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing.
Affiliation(s)
- Guangting Mai
- Hearing Theme, National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, UK
- Academic Unit of Mental Health and Clinical Neurosciences, School of Medicine, The University of Nottingham, Nottingham, UK
- Division of Psychology and Language Sciences, Faculty of Brain Sciences, University College London, London, UK
- William S-Y Wang
- Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Language Engineering Laboratory, The Chinese University of Hong Kong, Hong Kong, China
33
Ni G, Xu Z, Bai Y, Zheng Q, Zhao R, Wu Y, Ming D. EEG-based assessment of temporal fine structure and envelope effect in Mandarin syllable and tone perception. Cereb Cortex 2023; 33:11287-11299. [PMID: 37804238] [DOI: 10.1093/cercor/bhad366]
Abstract
In recent years, speech perception research has benefited from low-frequency rhythm entrainment tracking of the speech envelope. However, the respective roles of the speech envelope and the temporal fine structure in speech perception remain controversial, especially for Mandarin. This study aimed to assess the dependence of Mandarin syllable and tone perception on the speech envelope and the temporal fine structure. We recorded the electroencephalogram (EEG) of subjects under three acoustic conditions constructed with sound-chimera analysis: (i) the original speech, (ii) the speech envelope combined with a sinusoidal carrier, and (iii) the temporal fine structure combined with the envelope of a non-speech (white noise) sound. We found that syllable perception mainly depended on the speech envelope, whereas tone perception depended on the temporal fine structure. Delta-band activity was prominent, and the parietal and prefrontal lobes were the main activated brain areas, for both syllable and tone perception. Finally, we decoded the spatiotemporal features of Mandarin perception from the microstate sequence. The spatiotemporal feature sequence of the EEG elicited by the speech materials was found to be specific, suggesting a new perspective for auditory brain-computer interfaces. These results provide a new scheme for the coding strategies of hearing aids for native Mandarin speakers.
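The chimera construction referenced above pairs the Hilbert envelope of one signal with the temporal fine structure (instantaneous phase) of another. A single-band sketch of the idea (real chimera stimuli are built band-by-band across an analysis filterbank; this simplified version, with invented signals, is only illustrative):

```python
import numpy as np
from scipy.signal import hilbert

def chimera(env_src, tfs_src):
    """Impose the Hilbert envelope of env_src on the temporal fine
    structure (cosine of instantaneous phase) of tfs_src."""
    env = np.abs(hilbert(env_src))
    tfs = np.cos(np.angle(hilbert(tfs_src)))
    return env * tfs

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
# "Syllable-like" 4 Hz envelope on a 300 Hz carrier, vs. flat noise
speechlike = (1 + np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 300 * t)
noise = np.random.default_rng(4).standard_normal(len(t))

# Envelope from speech with noise fine structure, and vice versa
env_speech_tfs_noise = chimera(speechlike, noise)
env_noise_tfs_speech = chimera(noise, speechlike)
print(env_speech_tfs_noise.shape)
```

Listening tests with such chimeras are what allow envelope and fine-structure contributions (here, to syllable versus tone perception) to be dissociated.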
Affiliation(s)
- Guangjian Ni
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin 300392 China
| | - Zihao Xu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China
| | - Yanru Bai
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China
| | - Qi Zheng
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
| | - Ran Zhao
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China
| | - Yubo Wu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
| | - Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin 300392 China
| |
Collapse
|
34
|
Ortiz-Barajas MC, Guevara R, Gervain J. Neural oscillations and speech processing at birth. iScience 2023; 26:108187. [PMID: 37965146 PMCID: PMC10641252 DOI: 10.1016/j.isci.2023.108187]
Abstract
Are neural oscillations biologically endowed building blocks of the neural architecture for speech processing from birth, or do they require experience to emerge? In adults, delta, theta, and low-gamma oscillations support the simultaneous processing of phrasal, syllabic, and phonemic units in the speech signal, respectively. Using electroencephalography to investigate neural oscillations in the newborn brain, we reveal that delta and theta oscillations differ for rhythmically different languages, suggesting that these bands underlie newborns' universal ability to discriminate languages on the basis of rhythm. Additionally, higher theta activity during post-stimulus rest as compared to pre-stimulus rest suggests that stimulation after-effects are present from birth.
Affiliation(s)
- Maria Clemencia Ortiz-Barajas: Integrative Neuroscience and Cognition Center, CNRS & Université Paris Cité, 45 rue des Saints-Pères, 75006 Paris, France
- Ramón Guevara: Department of Physics and Astronomy, University of Padua, Via Marzolo 8, 35131 Padua, Italy
- Judit Gervain: Integrative Neuroscience and Cognition Center, CNRS & Université Paris Cité, 45 rue des Saints-Pères, 75006 Paris, France; Department of Developmental and Social Psychology, University of Padua, Via Venezia 8, 35131 Padua, Italy

35
Zhang X, Li J, Li Z, Hong B, Diao T, Ma X, Nolte G, Engel AK, Zhang D. Leading and following: Noise differently affects semantic and acoustic processing during naturalistic speech comprehension. Neuroimage 2023; 282:120404. [PMID: 37806465 DOI: 10.1016/j.neuroimage.2023.120404]
Abstract
Despite the distortion of speech signals caused by unavoidable noise in daily life, our ability to comprehend speech in noisy environments is relatively stable. However, the neural mechanisms underlying reliable speech-in-noise comprehension remain to be elucidated. The present study investigated the neural tracking of acoustic and semantic speech information during noisy naturalistic speech comprehension. Participants listened to narrative audio recordings mixed with spectrally matched stationary noise at three signal-to-noise ratio (SNR) levels (no noise, 3 dB, -3 dB), and 60-channel electroencephalography (EEG) signals were recorded. A temporal response function (TRF) method was employed to derive event-related-like responses to the continuous speech stream at both the acoustic and the semantic levels. Whereas the amplitude envelope of the naturalistic speech was taken as the acoustic feature, word entropy and word surprisal were extracted via natural language processing methods as two semantic features. Theta-band frontocentral TRF responses to the acoustic feature were observed at around 400 ms following speech fluctuation onset over all three SNR levels, and the response latencies were more delayed with increasing noise. Delta-band frontal TRF responses to the semantic feature of word entropy were observed at around 200 to 600 ms preceding speech fluctuation onset over all three SNR levels. The response latencies became more leading with increasing noise and decreasing speech comprehension and intelligibility. While the following responses to speech acoustics were consistent with previous studies, our study revealed the robustness of leading responses to speech semantics, which suggests a possible predictive mechanism at the semantic level for maintaining reliable speech comprehension in noisy environments.
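The temporal response function analysis used in this and several later entries is, at its core, a regularized lagged regression from a stimulus feature (envelope, word surprisal, etc.) onto the EEG. A minimal single-channel sketch of that idea, not any author's actual pipeline (studies typically use multichannel data and dedicated toolboxes), could look like:

```python
import numpy as np

def estimate_trf(stim, eeg, fs, tmin, tmax, lam=1e-2):
    """Ridge-regress a lagged stimulus feature onto one EEG channel.

    stim, eeg : 1-D arrays of equal length, sampled at fs Hz
    tmin, tmax: TRF window in seconds (e.g. 0.0 to 0.5)
    Returns (lags_in_seconds, trf_weights).
    """
    lags = np.arange(int(round(tmin * fs)), int(round(tmax * fs)) + 1)
    T = len(stim)
    X = np.zeros((T, len(lags)))
    for j, L in enumerate(lags):
        if L >= 0:
            X[L:, j] = stim[:T - L]   # column j holds stim delayed by L samples
        else:
            X[:T + L, j] = stim[-L:]  # negative lags: stim advanced
    # Ridge solution: (X'X + lam*I) w = X'y
    w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w

# Synthetic check: EEG = stimulus convolved with a known 0-200 ms kernel.
fs = 100
rng = np.random.default_rng(0)
stim = rng.standard_normal(3000)
kernel = np.hanning(21)  # "true" TRF over lags 0..20 samples
eeg = np.convolve(stim, kernel)[:3000] + 0.01 * rng.standard_normal(3000)
lags, w = estimate_trf(stim, eeg, fs, 0.0, 0.2)
```

With white-noise stimulation and light regularization, the estimated weights recover the simulated kernel closely; real EEG requires cross-validated choice of `lam`.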
Affiliation(s)
- Xinmiao Zhang: Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Jiawei Li: Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Federal Republic of Germany
- Zhuoran Li: Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Bo Hong: Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China; Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
- Tongxiang Diao: Department of Otolaryngology, Head and Neck Surgery, Peking University People's Hospital, Beijing 100044, China
- Xin Ma: Department of Otolaryngology, Head and Neck Surgery, Peking University People's Hospital, Beijing 100044, China
- Guido Nolte: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Federal Republic of Germany
- Andreas K Engel: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Federal Republic of Germany
- Dan Zhang: Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China

36
Quimby AE, Wei K, Adewole D, Eliades S, Cullen DK, Brant JA. Signal processing and stimulation potential within the ascending auditory pathway: a review. Front Neurosci 2023; 17:1277627. [PMID: 38027521 PMCID: PMC10658786 DOI: 10.3389/fnins.2023.1277627]
Abstract
The human auditory system encodes sound with a high degree of temporal and spectral resolution. When hearing fails, existing neuroprosthetics such as cochlear implants may partially restore hearing through stimulation of auditory neurons at the level of the cochlea, though not without limitations inherent to electrical stimulation. Novel approaches to hearing restoration, such as optogenetics, offer the potential of improved performance. We review signal processing in the ascending auditory pathway and the current state of conventional and emerging neural stimulation strategies at various levels of the auditory system.
Affiliation(s)
- Alexandra E. Quimby: Department of Otolaryngology and Communication Sciences, SUNY Upstate Medical University, Syracuse, NY, United States
- Kimberly Wei: Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Dayo Adewole: Corporal Michael J. Crescenz Veterans Affairs Medical Center, Philadelphia, PA, United States
- Steven Eliades: Department of Head and Neck Surgery and Communication Sciences, Duke University, Durham, NC, United States
- D. Kacy Cullen: Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Corporal Michael J. Crescenz Veterans Affairs Medical Center, Philadelphia, PA, United States
- Jason A. Brant: Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Department of Otorhinolaryngology – Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, United States

37
Vogeti S, Faramarzi M, Herrmann CS. Alpha transcranial alternating current stimulation modulates auditory perception. Brain Stimul 2023; 16:1646-1652. [PMID: 37949295 DOI: 10.1016/j.brs.2023.11.002]
Abstract
BACKGROUND Studies using transcranial alternating current stimulation (tACS), a type of non-invasive brain stimulation, have demonstrated a relationship between the positive versus negative phase of both alpha and delta/theta oscillations and variable near-threshold auditory perception. These findings have not been directly compared before. Furthermore, as perception was better in the positive versus negative phase of two different frequencies, it is unclear whether changes in polarity (independent of a specific frequency) could also modulate auditory perception. OBJECTIVE We investigated whether auditory perception depends on the phase of alpha, delta/theta, or polarity alone. METHODS We stimulated participants with alpha, delta, and positive and negative direct current (DC) over temporal and central scalp sites while they identified near-threshold tones-in-noise. A Sham condition without tACS served as a control. A repeated-measures analysis of variance was used to assess differences in the proportion of hits between conditions and polarities. Permutation-based circular-logistic regressions were used to assess the relationship between circular predictors and single-trial behavioral responses. An exploratory analysis compared the full circular-logistic regression model to the intercept-only model. RESULTS Overall, there was a greater proportion of hits in the Alpha condition than in the Delta, DC, and Sham conditions. We also found an interaction between polarity and stimulation condition; post-hoc analyses revealed a greater proportion of hits in the positive versus negative phase of Alpha tACS. In contrast, no significant differences were found in the Delta, DC, or Sham conditions. The permutation-based circular-logistic regressions did not reveal a statistically significant difference between the obtained RMS of the sine and cosine coefficients and the mean of the surrogate distribution for any of the conditions. However, our exploratory analysis revealed that circular predictors explained the behavioral data significantly better than an intercept-only model for the Alpha condition, but not for the other three conditions. CONCLUSION These findings suggest that alpha tACS, and not delta tACS or polarity alone, modulates auditory perception.
Affiliation(s)
- Sreekari Vogeti: Experimental Psychology Lab, Department of Psychology, European Medical School, Cluster for Excellence "Hearing for All", Carl von Ossietzky University, Oldenburg, Germany
- Maryam Faramarzi: Experimental Psychology Lab, Department of Psychology, European Medical School, Cluster for Excellence "Hearing for All", Carl von Ossietzky University, Oldenburg, Germany
- Christoph S Herrmann: Experimental Psychology Lab, Department of Psychology, European Medical School, Cluster for Excellence "Hearing for All", Carl von Ossietzky University, Oldenburg, Germany; Neuroimaging Unit, European Medical School, Carl von Ossietzky University, Oldenburg, Germany; Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany

38
Menn KH, Männel C, Meyer L. Does Electrophysiological Maturation Shape Language Acquisition? Perspect Psychol Sci 2023; 18:1271-1281. [PMID: 36753616 PMCID: PMC10623610 DOI: 10.1177/17456916231151584]
Abstract
Infants master the temporal patterns of their native language on a developmental trajectory from slow to fast: shortly after birth, they recognize the slow acoustic modulations specific to their native language before tuning in to faster language-specific patterns between 6 and 12 months of age. We propose here that this trajectory is constrained by neuronal maturation, in particular the gradual emergence of high-frequency neural oscillations in the infant electroencephalogram. Infants' initial focus on slow prosodic modulations is consistent with the prenatal availability of slow electrophysiological activity (i.e., theta- and delta-band oscillations). Our proposal is consistent with the temporal patterns of infant-directed speech, which initially amplifies slow modulations, approaching the faster modulation range of adult-directed speech only as infants' language has advanced sufficiently. Moreover, our proposal agrees with evidence from premature infants showing that maturational age is a stronger predictor of language development than ex utero exposure to speech, indicating that premature infants cannot exploit their earlier availability of speech because of electrophysiological constraints. In sum, we provide a new perspective on language acquisition, emphasizing neuronal development as a critical driving force of infants' language development.
Affiliation(s)
- Katharina H. Menn: Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; International Max Planck Research School on Neuroscience of Communication: Function, Structure, and Plasticity, Leipzig, Germany
- Claudia Männel: Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Audiology and Phoniatrics, Charité – Universitätsmedizin Berlin, Berlin, Germany
- Lars Meyer: Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Clinic for Phoniatrics and Pedaudiology, University Hospital Münster, Münster, Germany

39
Schmidt F, Chen YP, Keitel A, Rösch S, Hannemann R, Serman M, Hauswald A, Weisz N. Neural speech tracking shifts from the syllabic to the modulation rate of speech as intelligibility decreases. Psychophysiology 2023; 60:e14362. [PMID: 37350379 PMCID: PMC10909526 DOI: 10.1111/psyp.14362]
Abstract
The most prominent acoustic features in speech are intensity modulations, represented by the amplitude envelope of speech. Synchronization of neural activity with these modulations supports speech comprehension. As the acoustic modulation of speech is related to the production of syllables, investigations of neural speech tracking commonly do not distinguish between lower-level acoustic (envelope modulation) and higher-level linguistic (syllable rate) information. Here we manipulated speech intelligibility using noise-vocoded speech and investigated the spectral dynamics of neural speech processing, across two studies at cortical and subcortical levels of the auditory hierarchy, using magnetoencephalography. Overall, cortical regions mostly track the syllable rate, whereas subcortical regions track the acoustic envelope. Furthermore, with less intelligible speech, tracking of the modulation rate becomes more dominant. Our study highlights the importance of distinguishing between envelope modulation and syllable rate and provides novel possibilities to better understand differences between auditory processing and speech/language processing disorders.
Affiliation(s)
- Fabian Schmidt: Center for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Department of Psychology, University of Salzburg, Salzburg, Austria
- Ya-Ping Chen: Center for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Department of Psychology, University of Salzburg, Salzburg, Austria
- Anne Keitel: Psychology, School of Social Sciences, University of Dundee, Dundee, UK
- Sebastian Rösch: Department of Otorhinolaryngology, Paracelsus Medical University, Salzburg, Austria
- Maja Serman: Audiological Research Unit, Sivantos GmbH, Erlangen, Germany
- Anne Hauswald: Center for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Department of Psychology, University of Salzburg, Salzburg, Austria
- Nathan Weisz: Center for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Department of Psychology, University of Salzburg, Salzburg, Austria; Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg, Austria

40
Van Hirtum T, Somers B, Dieudonné B, Verschueren E, Wouters J, Francart T. Neural envelope tracking predicts speech intelligibility and hearing aid benefit in children with hearing loss. Hear Res 2023; 439:108893. [PMID: 37806102 DOI: 10.1016/j.heares.2023.108893]
Abstract
Early assessment of hearing aid benefit is crucial, as the extent to which hearing aids provide audible speech information predicts speech and language outcomes. A growing body of research has proposed neural envelope tracking as an objective measure of speech intelligibility, particularly for individuals unable to provide reliable behavioral feedback. However, its potential for evaluating speech intelligibility and hearing aid benefit in children with hearing loss remains unexplored. In this study, we investigated neural envelope tracking in children with permanent hearing loss through two separate experiments. EEG data were recorded while children listened to age-appropriate stories (Experiment 1) or an animated movie (Experiment 2) under aided and unaided conditions (using personal hearing aids) at multiple stimulus intensities. Neural envelope tracking was evaluated using a linear decoder reconstructing the speech envelope from the EEG in the delta band (0.5-4 Hz). Additionally, we calculated temporal response functions (TRFs) to investigate the spatio-temporal dynamics of the response. In both experiments, neural tracking increased with increasing stimulus intensity, but only in the unaided condition. In the aided condition, neural tracking remained stable across a wide range of intensities, as long as speech intelligibility was maintained. Similarly, TRF amplitudes increased with increasing stimulus intensity in the unaided condition, while in the aided condition significant differences were found in TRF latency rather than TRF amplitude. This suggests that decreasing stimulus intensity does not necessarily impact neural tracking. Furthermore, the use of personal hearing aids significantly enhanced neural envelope tracking, particularly in challenging speech conditions that would be inaudible when unaided. Finally, we found a strong correlation between neural envelope tracking and behaviorally measured speech intelligibility for both narrated stories (Experiment 1) and movie stimuli (Experiment 2). Altogether, these findings indicate that neural envelope tracking could be a valuable tool for predicting speech intelligibility benefits derived from personal hearing aids in hearing-impaired children. Incorporating narrated stories or engaging movies expands the accessibility of these methods even in clinical settings, offering new avenues for using objective speech measures to guide pediatric audiology decision-making.
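The linear decoder described in this abstract is the backward counterpart of the TRF: a regularized regression from lagged multichannel EEG onto the speech envelope, scored by correlating the reconstruction with the true envelope on held-out data. A toy sketch under simplified assumptions (simulated data, a single train/test split; the authors' actual pipeline is more elaborate):

```python
import numpy as np

def decode_envelope(eeg, env, fs, max_lag_s=0.25, lam=0.1):
    """Reconstruct env from lagged EEG; return the held-out correlation.

    eeg: (T, C) array; env: (T,) envelope.
    The decoder uses EEG samples from t to t + max_lag_s (EEG lags the stimulus).
    """
    T, C = eeg.shape
    lags = np.arange(int(max_lag_s * fs) + 1)
    X = np.zeros((T, C * len(lags)))
    for j, L in enumerate(lags):
        X[:T - L if L else T, j * C:(j + 1) * C] = eeg[L:]  # X[t] holds eeg[t+L]
    half = T // 2  # train on the first half, test on the second
    w = np.linalg.solve(X[:half].T @ X[:half] + lam * np.eye(X.shape[1]),
                        X[:half].T @ env[:half])
    pred = X[half:] @ w
    return np.corrcoef(pred, env[half:])[0, 1]

# Simulation: each channel is a delayed, noisy copy of a smooth envelope.
fs, T = 64, 4000
rng = np.random.default_rng(1)
h = np.hanning(33)
h /= np.linalg.norm(h)
env = np.convolve(rng.standard_normal(T + 32), h, mode="valid")  # length T
eeg = np.stack(
    [np.concatenate([np.zeros(d), env[:T - d]]) + 0.2 * rng.standard_normal(T)
     for d in (2, 4, 6, 8)], axis=1)
r = decode_envelope(eeg, env, fs, max_lag_s=0.15)
```

Because the simulated delays all fall inside the lag window, the held-out reconstruction correlates strongly with the true envelope; with real EEG, correlations are far smaller and `lam` must be tuned by cross-validation.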
Affiliation(s)
- Tilde Van Hirtum: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Ben Somers: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Benjamin Dieudonné: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Eline Verschueren: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Jan Wouters: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium
- Tom Francart: KU Leuven - University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, 3000 Leuven, Belgium

41
Yu L, Huang D, Wang S, Zhang Y. Reduced Neural Specialization for Word-level Linguistic Prosody in Children with Autism. J Autism Dev Disord 2023; 53:4351-4367. [PMID: 36038793 DOI: 10.1007/s10803-022-05720-x]
Abstract
Children with autism often show atypical brain lateralization for speech and language processing; however, it is unclear which linguistic component contributes to this phenomenon. Here we measured event-related potential (ERP) responses in 21 school-age autistic children and 25 age-matched neurotypical (NT) peers while they listened to word-level prosodic stimuli. We found that both groups displayed larger late negative response (LNR) amplitudes to native prosody than to nonnative prosody; however, unlike the NT group, which exhibited left-lateralized LNR distinction of prosodic phonology, the autism group showed no evidence of LNR lateralization. Moreover, in both groups, the LNR effects were present only for prosodic phonology and not for phoneme-free prosodic acoustics. These results extend the findings of inadequate neural specialization for language in autism to sub-lexical prosodic structures.
Affiliation(s)
- Luodi Yu: Center for Autism Research, School of Education, Guangzhou University, Wenyi Bldg, Guangzhou, China; Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
- Dan Huang: Guangzhou Rehabilitation & Research Center for Children with ASD, Guangzhou Cana School, Guangzhou, China
- Suiping Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
- Yang Zhang: Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA

42
Dejean C, Dupont T, Verpy E, Gonçalves N, Coqueran S, Michalski N, Pucheu S, Bourgeron T, Gourévitch B. Detecting Central Auditory Processing Disorders in Awake Mice. Brain Sci 2023; 13:1539. [PMID: 38002499 PMCID: PMC10669832 DOI: 10.3390/brainsci13111539]
Abstract
Mice are increasingly used as models of human acquired neurological or neurodevelopmental conditions, such as autism, schizophrenia, and Alzheimer's disease. All these conditions involve central auditory processing disorders, which have been little investigated despite their potential for providing interesting insights into the mechanisms behind such disorders. Alterations of the auditory steady-state response to 40 Hz click trains are associated with an imbalance between neuronal excitation and inhibition, a mechanism thought to be common to many neurological disorders. Here, we demonstrate the value of presenting click trains at various rates to mice with chronically implanted pins above the inferior colliculus and the auditory cortex for obtaining easy, reliable, and long-lasting access to subcortical and cortical complex auditory processing in awake mice. Using this protocol on a mutant mouse model of autism with a defect of the Shank3 gene, we show that the neural response is impaired at high click rates (above 60 Hz) and that this impairment is visible subcortically, two results that cannot be obtained with classical protocols for cortical EEG recordings in response to stimulation at 40 Hz. These results demonstrate the value and necessity of a more complete investigation of central auditory processing disorders in mouse models of neurological or neurodevelopmental disorders.
Affiliation(s)
- Camille Dejean: Institut Pasteur, Université Paris Cité, INSERM, Institut de l’Audition, Plasticity of Central Auditory Circuits, F-75012 Paris, France; Cilcare Company, F-34080 Montpellier, France; Sorbonne Université, Ecole Doctorale Complexité du Vivant, F-75005 Paris, France
- Typhaine Dupont: Institut Pasteur, Université Paris Cité, INSERM, Institut de l’Audition, Plasticity of Central Auditory Circuits, F-75012 Paris, France
- Elisabeth Verpy: Institut Pasteur, Université Paris Cité, CNRS, IUF, Human Genetics and Cognitive Functions, F-75015 Paris, France
- Noémi Gonçalves: Institut Pasteur, Université Paris Cité, INSERM, Institut de l’Audition, Plasticity of Central Auditory Circuits, F-75012 Paris, France
- Sabrina Coqueran: Institut Pasteur, Université Paris Cité, CNRS, IUF, Human Genetics and Cognitive Functions, F-75015 Paris, France
- Nicolas Michalski: Institut Pasteur, Université Paris Cité, INSERM, Institut de l’Audition, Plasticity of Central Auditory Circuits, F-75012 Paris, France
- Thomas Bourgeron: Institut Pasteur, Université Paris Cité, CNRS, IUF, Human Genetics and Cognitive Functions, F-75015 Paris, France
- Boris Gourévitch: Institut Pasteur, Université Paris Cité, INSERM, Institut de l’Audition, Plasticity of Central Auditory Circuits, F-75012 Paris, France; CNRS, F-75016 Paris, France

43
Karunathilake ID, Kulasingham JP, Simon JZ. Neural Tracking Measures of Speech Intelligibility: Manipulating Intelligibility while Keeping Acoustics Unchanged. bioRxiv 2023:2023.05.18.541269. [PMID: 37292644 PMCID: PMC10245672 DOI: 10.1101/2023.05.18.541269]
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography (MEG) recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (non-degraded) version of the speech. This intermediate priming, which generates a 'pop-out' percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. TRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming, but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word-processing stage, in prefrontal cortex (PFC), in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
Affiliation(s)
- Jonathan Z. Simon: Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA; Department of Biology, University of Maryland, College Park, MD 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD 20742, USA

44
Quique YM, Gnanateja GN, Dickey MW, Evans WS, Chandrasekaran B. Examining cortical tracking of the speech envelope in post-stroke aphasia. Front Hum Neurosci 2023; 17:1122480. [PMID: 37780966 PMCID: PMC10538638 DOI: 10.3389/fnhum.2023.1122480]
Abstract
Introduction: People with aphasia have been shown to benefit from rhythmic elements for language production during aphasia rehabilitation. However, it is unknown whether rhythmic processing is associated with such benefits. Cortical tracking of the speech envelope (CTenv) may provide a measure of encoding of speech rhythmic properties and serve as a predictor of candidacy for rhythm-based aphasia interventions. Methods: Electroencephalography was used to capture electrophysiological responses while Spanish speakers with aphasia (n = 9) listened to a continuous speech narrative (audiobook). The temporal response function was used to estimate CTenv in the delta (associated with word- and phrase-level properties), theta (syllable-level properties), and alpha (attention-related properties) bands. CTenv estimates were used to predict aphasia severity, performance in rhythmic perception and production tasks, and treatment response in a sentence-level rhythm-based intervention. Results: CTenv in the delta and theta bands, but not alpha, predicted aphasia severity. CTenv in none of the three bands predicted performance in rhythmic perception or production tasks. Some evidence supported that CTenv in theta could predict sentence-level learning in aphasia, but alpha and delta did not. Conclusion: CTenv of syllable-level properties was relatively preserved in individuals with less language impairment. In contrast, encoding of word- and phrase-level properties was relatively impaired and was predictive of more severe language impairments. The relationship between CTenv and treatment response to sentence-level rhythm-based interventions needs further investigation.
Affiliation(s)
- Yina M. Quique
- Center for Education in Health Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- G. Nike Gnanateja
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, Madison, WI, United States
- Michael Walsh Dickey
- VA Pittsburgh Healthcare System, Pittsburgh, PA, United States
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Roxelyn and Richard Pepper Department of Communication Science and Disorders, School of Communication, Northwestern University, Evanston, IL, United States
|
45
|
Ahmed F, Nidiffer AR, Lalor EC. The effect of gaze on EEG measures of multisensory integration in a cocktail party scenario. bioRxiv 2023:2023.08.23.554451. [PMID: 37662393 PMCID: PMC10473711 DOI: 10.1101/2023.08.23.554451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/05/2023]
Abstract
Seeing the speaker's face greatly improves our speech comprehension in noisy environments. This is due to the brain's ability to combine the auditory and the visual information around us, a process known as multisensory integration. Selective attention also strongly influences what we comprehend in scenarios with multiple speakers, an effect known as the cocktail-party phenomenon. However, the interaction between attention and multisensory integration is not fully understood, especially when it comes to natural, continuous speech. In a recent electroencephalography (EEG) study, we explored this issue and showed that multisensory integration is enhanced when an audiovisual speaker is attended compared to when that speaker is unattended. Here, we extend that work to investigate how this interaction varies depending on a person's gaze behavior, which affects the quality of the visual information they have access to. To do so, we recorded EEG from 31 healthy adults as they performed selective attention tasks in several paradigms involving two concurrently presented audiovisual speakers. We then modeled how the recorded EEG related to the audio speech (envelope) of the presented speakers. Crucially, we compared two classes of model: one that assumed underlying multisensory integration (AV) and another that assumed two independent unisensory audio and visual processes (A+V). This comparison revealed evidence of strong attentional effects on multisensory integration when participants were looking directly at the face of an audiovisual speaker. This effect was not apparent when the speaker's face was in the participants' peripheral vision. Overall, our findings suggest a strong influence of attention on multisensory integration when high-fidelity visual (articulatory) speech information is available. More generally, they suggest that the interplay between attention and multisensory integration during natural audiovisual speech is dynamic and adaptable to the specific task and environment.
Affiliation(s)
- Farhin Ahmed
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
- Aaron R. Nidiffer
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
- Edmund C. Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
|
46
|
Jia Z, Xu C, Li J, Gao J, Ding N, Luo B, Zou J. Phase Property of Envelope-Tracking EEG Response Is Preserved in Patients with Disorders of Consciousness. eNeuro 2023; 10:ENEURO.0130-23.2023. [PMID: 37500493 PMCID: PMC10420405 DOI: 10.1523/eneuro.0130-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Revised: 07/16/2023] [Accepted: 07/20/2023] [Indexed: 07/29/2023] Open
Abstract
When listening to speech, the low-frequency cortical response below 10 Hz can track the speech envelope. Previous studies have demonstrated that the phase lag between the speech envelope and the cortical response can reflect the mechanism by which the envelope-tracking response is generated. Here, we analyze whether the mechanism generating the envelope-tracking response is modulated by the level of consciousness, by studying how the stimulus-response phase lag is affected in patients with disorders of consciousness (DoC). We observed that patients with DoC generally show less reliable neural tracking of speech. Nevertheless, for the DoC patients who do show reliable cortical tracking of speech, the stimulus-response phase lag changes linearly with frequency between 3.5 and 8 Hz, regardless of the consciousness state. The mean phase lag is also consistent across these patients. These results suggest that the envelope-tracking response to speech can be generated by an automatic process that is barely modulated by the consciousness state.
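A phase lag that changes linearly with frequency implies a constant response latency, which can be read off the slope of the stimulus-response phase across frequencies. A minimal numpy sketch of that idea (the frequency grid and estimator details are illustrative assumptions, not the paper's analysis):

```python
import numpy as np

def stimulus_response_latency(stim, resp, fs, freqs):
    """Estimate response latency from the slope of the stimulus-response
    phase lag across frequencies (assumes a linear phase relationship)."""
    fax = np.fft.rfftfreq(len(stim), 1 / fs)
    S, R = np.fft.rfft(stim), np.fft.rfft(resp)
    # phase lag of the response relative to the stimulus at each frequency
    lag = [np.angle(S[np.argmin(np.abs(fax - f))] *
                    np.conj(R[np.argmin(np.abs(fax - f))])) for f in freqs]
    slope = np.polyfit(freqs, np.unwrap(lag), 1)[0]  # radians per Hz
    return slope / (2 * np.pi)                       # latency in seconds

# toy check: a "response" that is the stimulus delayed by 100 ms
fs = 100
rng = np.random.default_rng(0)
stim = rng.standard_normal(10 * fs)
resp = np.roll(stim, int(0.1 * fs))  # circular 100 ms delay
lat = stimulus_response_latency(stim, resp, fs, np.arange(3.5, 8.5, 0.5))
```

The 3.5-8 Hz grid mirrors the frequency range over which the paper reports linear phase behavior; `np.unwrap` removes the 2-pi jumps before the line fit.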
Affiliation(s)
- Ziting Jia
- The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan 250033, China
- Chuan Xu
- Department of Neurology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310019, China
- Jingqi Li
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Jian Gao
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Benyan Luo
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310003, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
|
47
|
Mohammadi Y, Graversen C, Østergaard J, Andersen OK, Reichenbach T. Phase-locking of Neural Activity to the Envelope of Speech in the Delta Frequency Band Reflects Differences between Word Lists and Sentences. J Cogn Neurosci 2023; 35:1301-1311. [PMID: 37379482 DOI: 10.1162/jocn_a_02016] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/30/2023]
Abstract
The envelope of a speech signal is tracked by neural activity in the cerebral cortex. This cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has mostly been associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information of words and word sequences. However, much about the specific association between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists at different signal-to-noise ratios (SNRs), leading to different levels of speech comprehension as well as listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend that the PLV in the delta band might reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.
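The phase-locking value used here is computed from the instantaneous phases of the band-limited envelope and EEG. A minimal numpy sketch (the brick-wall FFT filter is a deliberate simplification; published analyses typically use proper FIR/IIR filters and `scipy.signal.hilbert`):

```python
import numpy as np

def analytic(x):
    """Analytic signal via the frequency-domain Hilbert construction."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    return np.fft.ifft(np.fft.fft(x) * h)

def bandpass(x, fs, lo, hi):
    """Brick-wall FFT band-pass (illustration only, not a clean filter)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    X[(f < lo) | (f > hi)] = 0
    return np.fft.irfft(X, len(x))

def plv(env, eeg, fs, lo=1.0, hi=4.0):
    """Phase-locking value between speech envelope and EEG in one band."""
    p_env = np.angle(analytic(bandpass(env, fs, lo, hi)))
    p_eeg = np.angle(analytic(bandpass(eeg, fs, lo, hi)))
    return np.abs(np.mean(np.exp(1j * (p_env - p_eeg))))

# toy check: a phase-shifted copy is fully locked, noise is not
fs = 100
t = np.arange(0, 10, 1 / fs)
env = np.sin(2 * np.pi * 2 * t)            # 2 Hz "envelope" in the delta band
locked = plv(env, np.roll(env, 5), fs)     # constant phase shift
rng = np.random.default_rng(0)
unlocked = plv(env, rng.standard_normal(t.size), fs)
```

A constant phase difference yields a PLV near 1; phase differences that drift randomly average toward 0, which is what makes the PLV a measure of consistent stimulus-brain phase alignment rather than of amplitude.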
|
48
|
Deoisres S, Lu Y, Vanheusden FJ, Bell SL, Simpson DM. Continuous speech with pauses inserted between words increases cortical tracking of speech envelope. PLoS One 2023; 18:e0289288. [PMID: 37498891 PMCID: PMC10374040 DOI: 10.1371/journal.pone.0289288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2022] [Accepted: 07/17/2023] [Indexed: 07/29/2023] Open
Abstract
The decoding multivariate Temporal Response Function (decoder), or speech envelope reconstruction approach, is a well-known tool for assessing cortical tracking of the speech envelope. It is used to analyse the correlation between the speech stimulus and the neural response. It is known that auditory late responses are enhanced with longer gaps between stimuli, but it is not clear whether this applies to the decoder, and whether the addition of gaps/pauses in continuous speech could be used to increase the envelope reconstruction accuracy. We investigated this in normal-hearing participants who listened to continuous speech with no added pauses (natural speech), and then with short (250 ms) or long (500 ms) silent pauses inserted between each word. The total durations of the continuous speech stimuli with no, short, and long pauses were approximately 10, 16, and 21 minutes, respectively. EEG and the speech envelope were simultaneously acquired and then filtered into the delta (1-4 Hz) and theta (4-8 Hz) frequency bands. In addition to analysing responses to the whole speech envelope, the speech envelope was also segmented to analyse responses to onset and non-onset regions of speech separately. Our results show that continuous speech with additional pauses inserted between words significantly increases the speech envelope reconstruction correlations compared to natural speech, in both the delta and theta frequency bands. These increases in speech envelope reconstruction accuracy also appear to be dominated by the onset regions of the speech envelope. Introducing pauses in speech stimuli has potential clinical benefit for increasing auditory evoked response detectability, though with the disadvantage of the speech sounding less natural. The strong effect of pauses and onsets on the decoder should be considered when comparing results from different speech corpora. Whether the increased cortical response, when longer pauses are introduced, reflects improved intelligibility requires further investigation.
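The envelope-reconstruction decoder is, in essence, a ridge regression from time-lagged multichannel EEG onto the envelope, with reconstruction accuracy taken as the correlation between reconstructed and actual envelopes. A minimal sketch under simplifying assumptions (no cross-validation, arbitrary regularization; mTRF-style toolboxes add both):

```python
import numpy as np

def lagged(eeg, lags):
    """Design matrix of time-lagged copies of every EEG channel."""
    n, ch = eeg.shape
    X = np.empty((n, ch * len(lags)))
    for j, lag in enumerate(lags):
        X[:, j * ch:(j + 1) * ch] = np.roll(eeg, lag, axis=0)
    return X

def train_decoder(eeg, env, lags, lam=1.0):
    """Backward model: ridge regression from lagged EEG to the envelope."""
    X = lagged(eeg, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ env)

def reconstruction_accuracy(eeg, env, w, lags):
    """Correlation between the reconstructed and actual envelope."""
    rec = lagged(eeg, lags) @ w
    return np.corrcoef(rec, env)[0, 1]

# toy check: two channels carrying the envelope at different neural delays
rng = np.random.default_rng(0)
env = rng.standard_normal(2000)
eeg = np.column_stack([np.roll(env, 3), np.roll(env, 7)])
eeg += 0.2 * rng.standard_normal(eeg.shape)
lags = np.arange(-10, 1)   # negative lags: the decoder reads EEG after the stimulus
w = train_decoder(eeg, env, lags)
acc = reconstruction_accuracy(eeg, env, w, lags)
```

Because the EEG lags behind the stimulus, the decoder's lag window points backwards in time relative to the envelope sample being reconstructed; the pause manipulation in the study changes `env` and the recorded `eeg`, not this fitting machinery.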
Affiliation(s)
- Suwijak Deoisres
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom
- Yuhan Lu
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Frederique J Vanheusden
- Department of Engineering, School of Science and Technology, Nottingham Trent University, Nottingham, United Kingdom
- Steven L Bell
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom
- David M Simpson
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom
|
49
|
Abbasi O, Steingräber N, Chalas N, Kluger DS, Gross J. Spatiotemporal dynamics characterise spectral connectivity profiles of continuous speaking and listening. PLoS Biol 2023; 21:e3002178. [PMID: 37478152 DOI: 10.1371/journal.pbio.3002178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Accepted: 05/31/2023] [Indexed: 07/23/2023] Open
Abstract
Speech production and perception are fundamental processes of human cognition that both rely on intricate processing mechanisms that are still poorly understood. Here, we study these processes by using magnetoencephalography (MEG) to comprehensively map connectivity of regional brain activity within the brain and to the speech envelope during continuous speaking and listening. Our results reveal not only a partly shared neural substrate for both processes but also a dissociation in space, delay, and frequency. Neural activity in motor and frontal areas is coupled to upcoming speech in the delta band (1 to 3 Hz), whereas theta-range coupling in temporal areas follows speech during speaking. Connectivity results further showed a separation of bottom-up and top-down signalling in distinct frequency bands during speaking, indicating that frequency-specific connectivity channels for bottom-up and top-down signalling support continuous speaking and listening. These findings shed further light on the complex interplay between different brain regions involved in speech production and perception.
Affiliation(s)
- Omid Abbasi
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Nadine Steingräber
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Daniel S Kluger
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
- Joachim Gross
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, Münster, Germany
- Otto-Creutzfeldt-Center for Cognitive and Behavioral Neuroscience, University of Münster, Münster, Germany
|
50
|
Van Herck S, Economou M, Bempt FV, Ghesquière P, Vandermosten M, Wouters J. Pulsatile modulation greatly enhances neural synchronization at syllable rate in children. Neuroimage 2023:120223. [PMID: 37315772 DOI: 10.1016/j.neuroimage.2023.120223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2022] [Revised: 05/22/2023] [Accepted: 06/11/2023] [Indexed: 06/16/2023] Open
Abstract
Neural processing of the speech envelope is of crucial importance for speech perception and comprehension. This envelope processing is often investigated by measuring neural synchronization to sinusoidal amplitude-modulated stimuli at different modulation frequencies. However, it has been argued that these stimuli lack ecological validity. Pulsatile amplitude-modulated stimuli, on the other hand, are suggested to be more ecologically valid and efficient, and have increased potential to uncover the neural mechanisms behind some developmental disorders such as dyslexia. Nonetheless, pulsatile stimuli have not yet been investigated in pre-reading and beginning-reading children, a crucial age range for developmental reading research. We performed a longitudinal study to examine the potential of pulsatile stimuli in this age range. Fifty-two typically reading children were tested at three time points from the middle of their last year of kindergarten (5 years old) to the end of first grade (7 years old). Using electroencephalography, we measured neural synchronization to syllable-rate and phoneme-rate sinusoidal and pulsatile amplitude-modulated stimuli. Our results revealed that the pulsatile stimuli significantly enhance neural synchronization at syllable rate compared to the sinusoidal stimuli. Additionally, the pulsatile stimuli at syllable rate elicited a different hemispheric specialization, more closely resembling natural speech envelope tracking. We postulate that pulsatile stimuli could greatly increase EEG data acquisition efficiency relative to the common sinusoidal amplitude-modulated stimuli in research with younger children and in developmental reading research.
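The contrast between sinusoidal and pulsatile amplitude modulation amounts to swapping the modulator applied to a tone carrier. A hedged sketch: the raised-cosine pulse shape, duty cycle, and 1 kHz carrier below are illustrative assumptions, not the stimulus parameters used in the study:

```python
import numpy as np

def am_stimulus(fs, dur, fmod, fc=1000.0, shape="sine", duty=0.25):
    """Tone carrier with either a sinusoidal or a pulsatile modulator.
    The pulsatile modulator is a train of narrow raised-cosine pulses
    repeating at the modulation rate, silent between pulses."""
    t = np.arange(int(fs * dur)) / fs
    carrier = np.sin(2 * np.pi * fc * t)
    if shape == "sine":
        mod = 0.5 * (1.0 + np.sin(2 * np.pi * fmod * t))
    else:
        phase = (t * fmod) % 1.0   # position within each modulation cycle
        mod = np.where(phase < duty,
                       0.5 * (1.0 - np.cos(2 * np.pi * phase / duty)),
                       0.0)
    return mod * carrier

# 4 Hz modulation, roughly a syllable rate
sine_am = am_stimulus(16000, 0.5, 4.0, shape="sine")
pulse_am = am_stimulus(16000, 0.5, 4.0, shape="pulse")
```

The pulsatile version concentrates the modulation energy into brief, sharp envelope rises separated by silence, which is the property argued above to drive stronger neural synchronization than the smooth sinusoidal envelope.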
Affiliation(s)
- Shauni Van Herck
- Research Group ExpORL, Department of Neurosciences, KU Leuven, Belgium; Parenting and Special Education Research Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
- Maria Economou
- Research Group ExpORL, Department of Neurosciences, KU Leuven, Belgium; Parenting and Special Education Research Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
- Femke Vanden Bempt
- Research Group ExpORL, Department of Neurosciences, KU Leuven, Belgium; Parenting and Special Education Research Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
- Pol Ghesquière
- Parenting and Special Education Research Unit, Faculty of Psychology and Educational Sciences, KU Leuven, Belgium
- Jan Wouters
- Research Group ExpORL, Department of Neurosciences, KU Leuven, Belgium
|