101
Natural Infant-Directed Speech Facilitates Neural Tracking of Prosody. Neuroimage 2022; 251:118991. [PMID: 35158023] [DOI: 10.1016/j.neuroimage.2022.118991]
Abstract
Infants prefer to be addressed with infant-directed speech (IDS). IDS benefits language acquisition through amplified low-frequency amplitude modulations. It has been reported that this amplification increases electrophysiological tracking of IDS compared to adult-directed speech (ADS). It is still unknown which particular frequency band drives this effect. Here, we compare tracking at the rates of syllables and prosodic stress, which are both critical to word segmentation and recognition. In mother-infant dyads (n=30), mothers described novel objects to their 9-month-olds while infants' EEG was recorded. For IDS, mothers were instructed to speak to their children as they typically do, while for ADS, mothers described the objects as if speaking with an adult. Phonetic analyses confirmed that pitch features were more prototypically infant-directed in the IDS condition than in the ADS condition. Neural tracking of speech was assessed by speech-brain coherence, which measures the synchronization between the speech envelope and the EEG. Results revealed significant speech-brain coherence at both the syllabic and prosodic stress rates, indicating that infants track both IDS and ADS at both rates. Speech-brain coherence was significantly higher for IDS than for ADS at the prosodic stress rate but not at the syllabic rate, indicating that the IDS benefit arises primarily from enhanced prosodic stress. Thus, neural tracking is sensitive to parents' speech adaptations during natural interactions, possibly facilitating higher-level inferential processes such as word segmentation from continuous speech.
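The speech-brain coherence measure described above can be illustrated with a minimal, self-contained sketch. The signals, sampling rate, and the 2 Hz "prosodic" rate below are illustrative assumptions, not the study's data or pipeline:

```python
import numpy as np
from scipy.signal import coherence

# Synthetic illustration: an "EEG" trace that partially follows a 2 Hz
# amplitude envelope, standing in for a prosodic stress rate.
fs = 250                       # sample rate in Hz (assumed)
t = np.arange(0, 60, 1 / fs)   # 60 s of signal
rng = np.random.default_rng(0)

envelope = 1 + 0.5 * np.sin(2 * np.pi * 2.0 * t)   # 2 Hz stress-rate envelope
eeg = 0.6 * envelope + rng.normal(size=t.size)     # EEG partially tracking it

# Speech-brain coherence: Welch-based magnitude-squared coherence between
# the speech envelope and the EEG signal; values near 1 = strong tracking.
f, cxy = coherence(envelope, eeg, fs=fs, nperseg=4 * fs)

mask = f > 0.5                                     # skip the DC region
peak = f[mask][np.argmax(cxy[mask])]
print(f"coherence peaks at {peak:.2f} Hz")
```

In the study, coherence was evaluated per frequency band and compared across IDS and ADS conditions; here the spectral peak simply recovers the simulated 2 Hz envelope rate.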
102
Palana J, Schwartz S, Tager-Flusberg H. Evaluating the Use of Cortical Entrainment to Measure Atypical Speech Processing: A Systematic Review. Neurosci Biobehav Rev 2021; 133:104506. [PMID: 34942267] [DOI: 10.1016/j.neubiorev.2021.12.029]
Abstract
BACKGROUND Cortical entrainment has emerged as a promising means of measuring continuous speech processing in young, neurotypical adults. However, its utility for capturing atypical speech processing has not been systematically reviewed. OBJECTIVES To synthesize evidence regarding the merit of measuring cortical entrainment to capture atypical speech processing and to recommend avenues for future research. METHOD We systematically reviewed publications investigating entrainment to continuous speech in populations with auditory processing differences. RESULTS Of the 25 publications reviewed, most studied older and/or hearing-impaired adults, in whom slow-wave entrainment to speech was often heightened relative to controls. Research on populations with neurodevelopmental disorders, in whom slow-wave entrainment was often reduced, was less common. Across publications, findings highlighted associations between cortical entrainment and differences in speech processing performance. CONCLUSIONS Measures of cortical entrainment offer a useful means of capturing speech processing differences, and future research should leverage them more extensively when studying populations with neurodevelopmental disorders.
Affiliation(s)
- Joseph Palana
- Department of Psychological and Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA, 02215, USA; Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Harvard Medical School, Boston Children's Hospital, 1 Autumn Street, Boston, MA, 02215, USA
- Sophie Schwartz
- Department of Psychological and Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA, 02215, USA
- Helen Tager-Flusberg
- Department of Psychological and Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA, 02215, USA.
103
Yu L, Zeng J, Wang S, Zhang Y. Phonetic Encoding Contributes to the Processing of Linguistic Prosody at the Word Level: Cross-Linguistic Evidence From Event-Related Potentials. J Speech Lang Hear Res 2021; 64:4791-4801. [PMID: 34731592] [DOI: 10.1044/2021_jslhr-21-00037]
Abstract
PURPOSE This study examined whether abstract knowledge of word-level linguistic prosody is independent of or integrated with phonetic knowledge. METHOD Event-related potential (ERP) responses were measured from 18 adult listeners while they listened to native and nonnative word-level prosody in speech and in nonspeech. The prosodic phonology (speech) conditions included disyllabic pseudowords spoken in Chinese and in English matched for syllabic structure, duration, and intensity. The prosodic acoustic (nonspeech) conditions were hummed versions of the speech stimuli, which eliminated the phonetic content while preserving the acoustic prosodic features. RESULTS We observed a language-specific ERP effect: native stimuli elicited a larger late negative response (LNR) than nonnative stimuli in the prosodic phonology conditions. No such effect was observed in the phoneme-free prosodic acoustic control conditions. CONCLUSIONS The results support the integration view that word-level linguistic prosody likely relies on the phonetic content in which the acoustic cues are embedded. It remains to be examined whether the LNR may serve as a neural signature for language-specific processing of prosodic phonology beyond auditory processing of the critical acoustic cues at the suprasyllabic level.
Affiliation(s)
- Luodi Yu
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, South China Normal University, Guangzhou
- School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou
- Jiajing Zeng
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, South China Normal University, Guangzhou
- School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou
- Suiping Wang
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, South China Normal University, Guangzhou
- School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou
- Yang Zhang
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
104
Neural oscillations track natural but not artificial fast speech: Novel insights from speech-brain coupling using MEG. Neuroimage 2021; 244:118577. [PMID: 34525395] [DOI: 10.1016/j.neuroimage.2021.118577]
Abstract
Neural oscillations contribute to speech parsing via cortical tracking of hierarchical linguistic structures, including the syllable rate. While the properties of neural entrainment have largely been probed with speech stimuli at either normal or artificially accelerated rates, the important case of natural fast speech has been mostly overlooked. Using magnetoencephalography, we found that listening to naturally produced speech was associated with cortico-acoustic coupling at both normal (∼6 syllables/s) and fast (∼9 syllables/s) rates, with a corresponding shift in the peak entrainment frequency. Interestingly, time-compressed sentences did not yield such coupling, despite being generated at the same rate as the natural fast sentences. Additionally, neural activity in right motor cortex exhibited stronger tuning to natural fast than to artificially accelerated speech and showed evidence of stronger phase-coupling with left temporo-parietal and motor areas. These findings are highly relevant for our understanding of the role played by auditory and motor cortex oscillations in the perception of naturally produced speech.
105
Gransier R, Wouters J. Neural auditory processing of parameterized speech envelopes. Hear Res 2021; 412:108374. [PMID: 34800800] [DOI: 10.1016/j.heares.2021.108374]
Abstract
Speech perception depends strongly on neural processing of the speech envelope. Several auditory processing deficits are hypothesized to reduce the fidelity of the neural representation of the speech envelope across the auditory pathway, and this reduced fidelity is associated with supra-threshold speech processing deficits. Investigating the mechanisms that affect the neural encoding of the speech envelope can therefore offer insight into the different mechanisms that account for this reduced neural representation, and can inform stimulation strategies for hearing prostheses that aim to restore it. In this perspective, we discuss the importance of assessing neural phase-locking to the speech envelope from an audiological point of view and introduce the Temporal Envelope Speech Tracking (TEMPEST) stimulus framework, which enables systematic and standardized electrophysiological assessment of envelope processing across the auditory pathway. We postulate that this framework can be used to gain insight into the salience of speech-like temporal envelopes in the neural code and to evaluate the effectiveness of stimulation strategies that aim to restore temporal processing across the auditory pathway with auditory prostheses.
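A classic parameterized envelope stimulus for probing phase-locking is sinusoidally amplitude-modulated (SAM) noise; the sketch below generates one. The modulation rate, depth, and sample rate are illustrative assumptions, not the actual TEMPEST parameters:

```python
import numpy as np

# SAM noise: a broadband noise carrier whose amplitude follows a sinusoidal
# envelope at a chosen modulation rate, a standard probe of neural
# phase-locking to the temporal envelope.
fs = 16000                      # audio sample rate in Hz (assumed)
t = np.arange(0, 2.0, 1 / fs)   # 2 s stimulus
rng = np.random.default_rng(4)

def sam_noise(mod_rate_hz, mod_depth):
    carrier = rng.normal(size=t.size)                             # noise carrier
    envelope = 1 + mod_depth * np.sin(2 * np.pi * mod_rate_hz * t)
    return envelope * carrier

stim = sam_noise(mod_rate_hz=4.0, mod_depth=0.75)   # 4 Hz envelope, 75% depth
print(stim.shape)
```

Sweeping the modulation rate and depth of such stimuli is one way to map envelope processing systematically along the auditory pathway.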
Affiliation(s)
- Robin Gransier
- ExpORL, Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium.
- Jan Wouters
- ExpORL, Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium
106
Attaheri A, Choisdealbha ÁN, Di Liberto GM, Rocha S, Brusini P, Mead N, Olawole-Scott H, Boutris P, Gibbon S, Williams I, Grey C, Flanagan S, Goswami U. Delta- and theta-band cortical tracking and phase-amplitude coupling to sung speech by infants. Neuroimage 2021; 247:118698. [PMID: 34798233] [DOI: 10.1016/j.neuroimage.2021.118698]
Abstract
The amplitude envelope of speech carries crucial low-frequency acoustic information that assists linguistic decoding at multiple time scales. Neurophysiological signals are known to track the amplitude envelope of adult-directed speech (ADS), particularly in the theta band. Acoustic analysis of infant-directed speech (IDS) has revealed significantly greater modulation energy than ADS in an amplitude-modulation (AM) band centred on ∼2 Hz. Accordingly, cortical tracking of IDS by delta-band neural signals may be key to language acquisition. Speech also contains acoustic information within its higher-frequency bands (beta, gamma). Adult EEG and MEG studies reveal an oscillatory hierarchy, whereby low-frequency (delta, theta) neural phase dynamics temporally organize the amplitude of high-frequency signals (phase-amplitude coupling, PAC). Whilst consensus is growing around the role of PAC in the mature adult brain, its role in the development of speech processing is unexplored. Here, we examined the presence and maturation of low-frequency (<12 Hz) cortical speech tracking in infants by recording EEG longitudinally from 60 participants at 4, 7 and 11 months of age as they listened to nursery rhymes. After establishing stimulus-related neural signals in delta and theta, cortical tracking at each age was assessed in the delta, theta and alpha (control) bands using a multivariate temporal response function (mTRF) method. Delta-beta, delta-gamma, theta-beta and theta-gamma PAC was also assessed. Significant delta and theta but not alpha tracking was found. Significant PAC was present at all ages, with both delta- and theta-driven coupling observed.
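The phase-amplitude coupling analysis mentioned above can be sketched with a common mean-vector-length estimator. The signal below is synthetic and all parameters (bands, rates, noise level) are illustrative assumptions, not the study's pipeline:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Synthetic "EEG": a 2 Hz delta rhythm whose phase modulates the amplitude
# of a 40 Hz gamma component, plus noise.
fs = 500
t = np.arange(0, 30, 1 / fs)
rng = np.random.default_rng(1)

delta = np.sin(2 * np.pi * 2 * t)                        # 2 Hz rhythm
gamma = 0.5 * (1 + delta) * np.sin(2 * np.pi * 40 * t)   # 40 Hz bursts locked to delta phase
eeg = delta + gamma + 0.5 * rng.normal(size=t.size)

def bandpass(x, lo, hi, fs, order=4):
    # Second-order sections keep the narrow low-frequency filter stable.
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

phase = np.angle(hilbert(bandpass(eeg, 1, 4, fs)))   # low-frequency phase
amp = np.abs(hilbert(bandpass(eeg, 30, 50, fs)))     # high-frequency amplitude envelope

# Mean vector length: 0 = no coupling; larger values mean gamma amplitude
# concentrates at a preferred delta phase.
mvl = np.abs(np.mean(amp * np.exp(1j * phase))) / np.mean(amp)
print(f"PAC (mean vector length): {mvl:.2f}")
```

In practice PAC estimates of this kind are compared against surrogate data (e.g. phase-shuffled signals) to assess significance.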
Affiliation(s)
- Adam Attaheri
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Áine Ní Choisdealbha
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Giovanni M Di Liberto
- Laboratoire des Systèmes Perceptifs, UMR 8248, CNRS, France; Ecole Normale Supérieure, PSL University, France; Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering and Trinity Institute of Neuroscience, Trinity College, The University of Dublin, Ireland; School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Ireland.
- Sinead Rocha
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Perrine Brusini
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom; Institute of Population Health, Waterhouse Building, Block B, Brownlow Street, Liverpool L69 3GF, United Kingdom.
- Natasha Mead
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Helen Olawole-Scott
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Panagiotis Boutris
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Samuel Gibbon
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Isabel Williams
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Christina Grey
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Sheila Flanagan
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
- Usha Goswami
- Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
107
Malaia EA, Borneman SC, Krebs J, Wilbur RB. Low-Frequency Entrainment to Visual Motion Underlies Sign Language Comprehension. IEEE Trans Neural Syst Rehabil Eng 2021; 29:2456-2463. [PMID: 34762589] [PMCID: PMC8720261] [DOI: 10.1109/tnsre.2021.3127724]
Abstract
When people listen to speech, neural activity tracks the entropy fluctuations in the acoustic envelope of the signal. This signal-based entrainment has been shown to underlie speech parsing and comprehension. In this electroencephalography (EEG) study, we compute sign language users' cortical tracking of changes in the visual dynamics of the communicative signal in time-direct videos of sign language and their time-reversed counterparts, and assess the relative contribution of response frequencies between 0.2 and 12.4 Hz to comprehension using a machine learning approach to brain state classification. Lower frequencies of the EEG response (0.2–4 Hz) yield 100% classification accuracy, while cortical tracking of the visual envelope at higher frequencies is less informative. This suggests that signers rely on lower-frequency visual information, such as the envelope of the visual signal, for sign language comprehension. In the context of real-time language processing, given the speed of comprehension responses, this suggests that fluent signers employ a predictive processing heuristic based on sign language knowledge.
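The band-resolved classification idea can be illustrated with a toy midpoint-threshold classifier on band power. The data are synthetic and the parameters assumed; this is not the authors' machine learning model:

```python
import numpy as np
from scipy.signal import welch

# Two synthetic "EEG" conditions that differ only in low-frequency content
# (standing in for time-direct vs. time-reversed stimuli), classified from
# band power in a low vs. a higher frequency band.
fs = 100
rng = np.random.default_rng(2)

def trial(low_amp):
    t = np.arange(0, 10, 1 / fs)
    return (low_amp * np.sin(2 * np.pi * 1.5 * t)   # low-frequency tracking component
            + np.sin(2 * np.pi * 9 * t)             # higher-frequency component, shared
            + rng.normal(size=t.size))

def band_power(x, lo, hi):
    f, p = welch(x, fs=fs, nperseg=2 * fs)
    return p[(f >= lo) & (f <= hi)].mean()

fwd = [trial(2.0) for _ in range(20)]   # "time-direct": strong low-frequency tracking
rev = [trial(0.2) for _ in range(20)]   # "time-reversed": weak tracking

accs = {}
for lo, hi, label in [(0.2, 4.0, "low band"), (8.0, 12.0, "high band")]:
    pf = np.array([band_power(x, lo, hi) for x in fwd])
    pr = np.array([band_power(x, lo, hi) for x in rev])
    thr = (pf.mean() + pr.mean()) / 2               # midpoint decision threshold
    accs[label] = ((pf > thr).mean() + (pr < thr).mean()) / 2
    print(f"{label}: accuracy {accs[label]:.2f}")
```

Because only the low band carries condition information here, low-band power classifies the trials well while the higher band hovers near chance, mirroring the qualitative pattern the study reports.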
108
Memory Specific to Temporal Features of Sound Is Formed by Cue-Selective Enhancements in Temporal Coding Enabled by Inhibition of an Epigenetic Regulator. J Neurosci 2021; 41:9192-9209. [PMID: 34544835] [DOI: 10.1523/jneurosci.0691-21.2021]
Abstract
Recent investigations of memory-related functions in the auditory system have capitalized on the use of memory-modulating molecules to probe the relationship between memory and its substrates in auditory system coding. For example, epigenetic mechanisms, which regulate the gene expression necessary for memory consolidation, are powerful modulators of learning-induced neuroplasticity and long-term memory (LTM) formation. Inhibition of the epigenetic regulator histone deacetylase 3 (HDAC3) promotes LTM that is highly specific for spectral features of sound. The present work demonstrates for the first time that HDAC3 inhibition also enables memory for temporal features of sound. Adult male rats trained in an amplitude modulation (AM) rate discrimination task and treated with a selective inhibitor of HDAC3 formed memory that was highly specific to the AM rate paired with reward. Sound-specific memory revealed behaviorally was associated with a signal-specific enhancement in temporal coding in the auditory system; stronger phase locking specific to the rewarded AM rate was revealed in both the surface-recorded frequency-following response and auditory cortical multiunit activity in rats treated with the HDAC3 inhibitor. Furthermore, HDAC3 inhibition increased trial-to-trial cortical response consistency (relative to naive and trained vehicle-treated rats), which generalized across different AM rates. Stronger signal-specific phase locking correlated with individual behavioral differences in memory specificity for the AM signal. These findings support the view that epigenetic mechanisms regulate activity-dependent processes that enhance the discriminability of sensory cues encoded into LTM in both spectral and temporal domains, which may be important for remembering spectrotemporal features of sounds, for example, as in human voices and speech.
SIGNIFICANCE STATEMENT Epigenetic mechanisms have recently been implicated in memory and information processing.
Here, we use a pharmacological inhibitor of HDAC3 in a sensory model of learning to reveal the ability of HDAC3 to enable precise memory for amplitude-modulated sound cues. In so doing, we uncover neural substrates for memory's specificity for temporal sound cues. Memory specificity was supported by auditory cortical changes in temporal coding, including greater response consistency and stronger phase locking. HDAC3 appears to regulate effects across domains that determine specific cue saliency for behavior. Thus, epigenetic players may gate how sensory information is stored in long-term memory and can be leveraged to reveal the neural substrates of sensory details stored in memory.
109
Renvall H, Seol J, Tuominen R, Sorger B, Riecke L, Salmelin R. Selective auditory attention within naturalistic scenes modulates reactivity to speech sounds. Eur J Neurosci 2021; 54:7626-7641. [PMID: 34697833] [PMCID: PMC9298413] [DOI: 10.1111/ejn.15504]
Abstract
Rapid recognition and categorization of sounds are essential for humans and animals alike, both for understanding and reacting to our surroundings and for daily communication and social interaction. For humans, perception of speech sounds is of crucial importance. In real life, this task is complicated by the presence of a multitude of meaningful non-speech sounds. The present behavioural, magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) study set out to address how attention to speech versus attention to natural non-speech sounds within complex auditory scenes influences cortical processing. The stimuli were superimpositions of spoken words and environmental sounds, with parametric variation of the speech-to-environmental sound intensity ratio. The participants' task was to detect a repetition in either the speech or the environmental sound. We found that, specifically when participants attended to speech within the superimposed stimuli, higher speech-to-environmental sound ratios resulted in shorter sustained MEG responses, stronger BOLD fMRI signals (especially in the left supratemporal auditory cortex), and improved behavioural performance. No such effects of speech-to-environmental sound ratio were observed when participants attended to the environmental sound part within the exact same stimuli. These findings suggest stronger saliency of speech compared with other meaningful sounds during processing of natural auditory scenes, likely linked to speech-specific top-down and bottom-up mechanisms activated during speech perception that are needed for tracking speech in real-life-like auditory environments.
Affiliation(s)
- Hanna Renvall
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland; BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, University of Helsinki and Aalto University School of Science, Helsinki, Finland
- Jaeho Seol
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Riku Tuominen
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
- Bettina Sorger
- Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Lars Riecke
- Department of Cognitive Neuroscience, Maastricht University, Maastricht, The Netherlands
- Riitta Salmelin
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto NeuroImaging, Aalto University, Espoo, Finland
110
Ríos-López P, Molinaro N, Bourguignon M, Lallier M. Right-hemisphere coherence to speech at pre-reading stages predicts reading performance one year later. J Cogn Psychol 2021. [DOI: 10.1080/20445911.2021.1986514]
Affiliation(s)
- Paula Ríos-López
- BCBL, Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Leibniz Institute for Neurobiology, Magdeburg, Germany
- Centre for Behavioral and Brain Sciences, Magdeburg, Germany
- Nicola Molinaro
- BCBL, Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Mathieu Bourguignon
- BCBL, Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Laboratoire de Cartographie Fonctionnelle du Cerveau, Université Libre de Bruxelles, Bruxelles, Belgium
- Marie Lallier
- BCBL, Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
111
Ramos-Escobar N, Segura E, Olivé G, Rodriguez-Fornells A, François C. Oscillatory activity and EEG phase synchrony of concurrent word segmentation and meaning-mapping in 9-year-old children. Dev Cogn Neurosci 2021; 51:101010. [PMID: 34461393] [PMCID: PMC8403737] [DOI: 10.1016/j.dcn.2021.101010]
Abstract
When learning a new language, one must segment words from continuous speech and associate them with meanings. These complex processes can be boosted by attentional mechanisms triggered by multi-sensory information. Previous electrophysiological studies suggest that brain oscillations are sensitive to different hierarchical complexity levels of the input, making them a plausible neural substrate for speech parsing. Here, we investigated the functional role of brain oscillations during concurrent speech segmentation and meaning acquisition in sixty 9-year-old children. We collected EEG data during an audio-visual statistical learning task in which children were exposed to a learning condition with consistent word-picture associations and a random condition with inconsistent word-picture associations before being tested on their ability to recall words and word-picture associations. We capitalized on the tendency of neural activity to align with the rate of an external rhythmic stimulus to explore modulations of neural synchronization to the stimulus and of phase synchronization between electrodes during multi-sensory word learning. Results showed enhanced power at both the word and syllable rates and increased EEG phase synchronization between frontal and occipital regions in the learning condition compared with the random condition. These findings suggest that multi-sensory cueing and attentional mechanisms play an essential role in children's successful word learning.
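Inter-electrode phase synchronization of the kind reported here is commonly quantified with the phase-locking value (PLV). Below is a minimal sketch on synthetic two-channel data; the 4 Hz rhythm, noise level, and band limits are illustrative assumptions, not the study's pipeline:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Two synthetic channels sharing a 4 Hz (syllable-rate) rhythm with a fixed
# lag, standing in for frontal and occipital electrodes.
fs = 250
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(3)

shared = np.sin(2 * np.pi * 4 * t)
frontal = shared + 0.8 * rng.normal(size=t.size)
occipital = np.roll(shared, 10) + 0.8 * rng.normal(size=t.size)  # same rhythm, fixed lag

def narrowband_phase(x, lo, hi):
    # Instantaneous phase in a narrow band via the analytic signal.
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    return np.angle(hilbert(sosfiltfilt(sos, x)))

dphi = narrowband_phase(frontal, 3, 5) - narrowband_phase(occipital, 3, 5)
plv = np.abs(np.mean(np.exp(1j * dphi)))   # 1 = perfect phase locking, ~0 = none
print(f"PLV at 4 Hz: {plv:.2f}")
```

A consistent phase lag between channels yields a PLV near 1 even though the two signals are not identical, which is why PLV-style measures are well suited to detecting frontal-occipital coupling.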
Affiliation(s)
- Neus Ramos-Escobar
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
- Emma Segura
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
- Guillem Olivé
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
- Antoni Rodriguez-Fornells
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Catalan Institution for Research and Advanced Studies, ICREA, Barcelona, Spain.
112
Vander Ghinst M, Bourguignon M, Wens V, Naeije G, Ducène C, Niesen M, Hassid S, Choufani G, Goldman S, De Tiège X. Inaccurate cortical tracking of speech in adults with impaired speech perception in noise. Brain Commun 2021; 3:fcab186. [PMID: 34541530] [PMCID: PMC8445395] [DOI: 10.1093/braincomms/fcab186]
Abstract
Impaired speech perception in noise despite normal peripheral auditory function is a common problem in young adults. Despite a growing body of research, the pathophysiology of this impairment remains unknown. This magnetoencephalography study characterizes the cortical tracking of speech in a multi-talker background in a group of highly selected adult subjects with impaired speech perception in noise without peripheral auditory dysfunction. Magnetoencephalographic signals were recorded from 13 subjects with impaired speech perception in noise (six females, mean age: 30 years) and matched healthy subjects while they were listening to 5 different recordings of stories merged with a multi-talker background at different signal-to-noise ratios (No Noise, +10, +5, 0 and −5 dB). The cortical tracking of speech was quantified with coherence between magnetoencephalographic signals and the temporal envelope of (i) the global auditory scene (i.e. the attended speech stream and the multi-talker background noise), (ii) the attended speech stream only and (iii) the multi-talker background noise. Functional connectivity was then estimated between brain areas showing altered cortical tracking of speech in noise in subjects with impaired speech perception in noise and the rest of the brain. All participants demonstrated a selective cortical representation of the attended speech stream in noisy conditions, but subjects with impaired speech perception in noise displayed reduced cortical tracking of speech at the syllable rate (i.e. 4–8 Hz) in all noisy conditions. Increased functional connectivity was observed in subjects with impaired speech perception in noise, in both the noiseless and speech-in-noise conditions, between supratemporal auditory cortices and left-dominant brain areas involved in semantic and attention processes.
The difficulty in understanding speech in a multi-talker background in subjects with impaired speech perception in noise appears to be related to inaccurate auditory cortex tracking of speech at the syllable rate. The increased functional connectivity between supratemporal auditory cortices and language/attention-related neocortical areas probably aims at supporting speech perception and subsequent recognition in adverse auditory scenes. Overall, this study argues for a central origin of impaired speech perception in noise in the absence of any peripheral auditory dysfunction.
Affiliation(s)
- Marc Vander Ghinst
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Service d'ORL et de chirurgie cervico-faciale, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Mathieu Bourguignon
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Laboratory of Neurophysiology and Movement Biomechanics, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Basque Center on Cognition, Brain and Language (BCBL), Donostia/San Sebastian 20009, Spain
- Vincent Wens
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Clinics of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Gilles Naeije
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Service de Neurologie, ULB-Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels 1070, Belgium
- Cecile Ducène
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Service d'ORL et de chirurgie cervico-faciale, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Maxime Niesen
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Service d'ORL et de chirurgie cervico-faciale, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Sergio Hassid
- Service d'ORL et de chirurgie cervico-faciale, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Georges Choufani
- Service d'ORL et de chirurgie cervico-faciale, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Serge Goldman
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Clinics of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
- Xavier De Tiège
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI-ULB Neuroscience Institute, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium; Clinics of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université Libre de Bruxelles (ULB), Brussels 1070, Belgium
|
113
|
Bhandari P, Demberg V, Kray J. Semantic Predictability Facilitates Comprehension of Degraded Speech in a Graded Manner. Front Psychol 2021; 12:714485. [PMID: 34566795 PMCID: PMC8459870 DOI: 10.3389/fpsyg.2021.714485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2021] [Accepted: 08/06/2021] [Indexed: 01/02/2023] Open
Abstract
Previous studies have shown that at moderate levels of spectral degradation, semantic predictability facilitates language comprehension. It is argued that when speech is degraded, listeners have narrowed expectations about the sentence endings; i.e., semantic prediction may be limited to only the most highly predictable sentence completions. The main objectives of this study were to (i) examine whether listeners form narrowed expectations or whether they form predictions across a wide range of probable sentence endings, (ii) assess whether the facilitatory effect of semantic predictability is modulated by perceptual adaptation to degraded speech, and (iii) use and establish a sensitive metric for the measurement of language comprehension. To this end, we created 360 German Subject-Verb-Object sentences that varied in the semantic predictability of a sentence-final target word in a graded manner (high, medium, and low) and in the level of spectral degradation (1, 4, 6, and 8 channels of noise-vocoding). These sentences were presented auditorily to two groups: one group (n = 48) performed a listening task in an unpredictable channel context in which the degraded speech levels were randomized, while the other group (n = 50) performed the task in a predictable channel context in which the degraded speech levels were blocked. The results showed that at 4 channels of noise-vocoding, response accuracy was higher for high-predictability sentences than for medium-predictability sentences, which in turn was higher than for low-predictability sentences. This suggests that, in contrast to the narrowed-expectations view, comprehension of moderately degraded speech is facilitated in a graded manner across low-, medium-, and high-predictability sentences; listeners probabilistically preactivate upcoming words from a wide semantic space rather than limiting predictions to only the most probable sentence endings.
Additionally, in both channel contexts we did not observe learning effects; i.e., response accuracy did not increase over the course of the experiment, and response accuracy was higher in the predictable than in the unpredictable channel context. We speculate from these observations that when there is no trial-by-trial variation in the level of speech degradation, listeners adapt to speech quality over a long timescale; however, when there is trial-by-trial variation in a high-level semantic feature (e.g., sentence predictability), listeners do not adapt to a low-level perceptual property (e.g., speech quality) over a short timescale.
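Noise-vocoding, the degradation method used in this study, keeps each spectral band's slow temporal envelope but replaces its fine structure with noise. A minimal sketch (the band edges, filter order, and function name are illustrative assumptions, not the authors' exact vocoder):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=4, fmin=100.0, fmax=7000.0):
    """Crude n-channel noise vocoder: per band, extract the envelope and
    use it to modulate band-limited noise (illustrative sketch)."""
    # Logarithmically spaced channel edges between fmin and fmax
    edges = np.geomspace(fmin, fmax, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)                # analysis band
        env = np.abs(hilbert(band))                    # its temporal envelope
        carrier = sosfiltfilt(sos, rng.normal(size=speech.size))  # band noise
        out += env * carrier                           # envelope-modulated noise
    return out
```

With fewer channels the spectral detail available to the listener shrinks, which is how the 1/4/6/8-channel conditions grade intelligibility.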
Affiliation(s)
- Pratik Bhandari
- Department of Psychology, Saarland University, Saarbrücken, Germany
- Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Vera Demberg
- Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Department of Computer Science, Saarland University, Saarbrücken, Germany
- Jutta Kray
- Department of Psychology, Saarland University, Saarbrücken, Germany
|
114
|
θ-Band Cortical Tracking of the Speech Envelope Shows the Linear Phase Property. eNeuro 2021; 8:ENEURO.0058-21.2021. [PMID: 34380659 PMCID: PMC8387159 DOI: 10.1523/eneuro.0058-21.2021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Revised: 07/10/2021] [Accepted: 07/29/2021] [Indexed: 11/30/2022] Open
Abstract
When listening to speech, low-frequency cortical activity tracks the speech envelope. It remains controversial, however, whether such envelope-tracking neural activity reflects entrainment of neural oscillations or the superposition of transient responses evoked by sound features. Recently, it has been suggested that the phase of envelope-tracking activity can potentially distinguish entrained oscillations from evoked responses. Here, we analyze the phase of envelope tracking in humans during passive listening and observe that the phase lag between cortical activity and the speech envelope tends to change linearly across frequency in the θ band (4–8 Hz), suggesting that θ-band envelope-tracking activity can be readily modeled as evoked responses.
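The "linear phase property" rests on a basic identity: a response that is simply a delayed copy of the stimulus has phase lag φ(f) = −2πf·τ, so the slope of phase against frequency recovers a constant latency τ, as expected of an evoked response. A toy check (the 80 ms latency is an arbitrary assumed value):

```python
import numpy as np

delay = 0.080  # assumed constant response latency (s)
freqs = np.arange(4.0, 8.5, 0.5)        # theta-band frequencies (Hz)
phase_lag = -2 * np.pi * freqs * delay  # linear phase: phi(f) = -2*pi*f*delay

# Fitting a line to phase vs. frequency recovers the latency (group delay)
slope, intercept = np.polyfit(freqs, phase_lag, 1)
estimated_delay = -slope / (2 * np.pi)
```

Entrained oscillations, by contrast, need not show a phase lag that grows linearly across frequency in this way.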
|
115
|
Ortiz-Mantilla S, Roesler CP, Realpe-Bonilla T, Benasich AA. Modulation of Theta Phase Synchrony during Syllable Processing as a Function of Interactive Acoustic Experience in Infancy. Cereb Cortex 2021; 32:919-932. [PMID: 34403462 PMCID: PMC8889996 DOI: 10.1093/cercor/bhab256] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2021] [Revised: 07/02/2021] [Accepted: 07/03/2021] [Indexed: 11/13/2022] Open
Abstract
Plasticity, a prominent characteristic of the infant brain, supports formation of cortical representations as infants begin to interact with and adapt to environmental sensory events. Enhanced acoustic processing efficiency along with improved allocation of attentional resources at 7 months and establishment of well-defined phonemic maps at 9 months have been shown to be facilitated by early interactive acoustic experience (IAE). In this study, using an oddball paradigm and measures of theta phase synchrony at source level, we examined short- and long-term effects of nonspeech IAE on syllable processing. Results demonstrated that beyond maturation alone, IAE increased the efficiency of syllabic representation and discrimination, an effect that endured well beyond the immediate training period. Compared with naive controls, the IAE-trained group at 7, 9, and 18 months showed less theta phase synchrony for the standard syllable and at 7 and 18 months for the deviant syllable. The decreased theta phase synchrony exhibited by the trained group suggests more mature, efficient acoustic processing and, thus, better cortical representation and discrimination of syllabic content. Further, the IAE modulatory effect observed on theta phase synchrony in left auditory cortex at 7 and 9 months was differentially associated with receptive and expressive language scores at 12 and 18 months of age.
Affiliation(s)
- Silvia Ortiz-Mantilla
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102, USA
- Cynthia P Roesler
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102, USA
- Teresa Realpe-Bonilla
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102, USA
- April A Benasich
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102, USA
|
116
|
Soni S, Tata MS. Brain electrical dynamics in speech segmentation depends upon prior experience with the language. BRAIN AND LANGUAGE 2021; 219:104967. [PMID: 34022679 DOI: 10.1016/j.bandl.2021.104967] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Revised: 04/26/2021] [Accepted: 05/10/2021] [Indexed: 06/12/2023]
Abstract
It remains unclear whether the process of speech tracking, which facilitates speech segmentation, reflects top-down mechanisms related to prior linguistic models, stimulus-driven mechanisms, or both. To address this, we recorded electroencephalography (EEG) responses from native and non-native speakers of English who had different prior experience with the English language but heard acoustically identical stimuli. Despite a significant difference in the ability to segment and perceive speech, our EEG results showed that theta-band tracking of the speech envelope did not depend significantly on prior experience with the language. However, tracking in the theta band did show changes across repetitions of the same sentence, suggesting a priming effect. Furthermore, native and non-native speakers showed different phase dynamics at word boundaries, suggesting differences in segmentation mechanisms. Finally, we found that the correlation between higher-frequency dynamics reflecting phoneme-level processing and perceptual segmentation of words might depend on prior experience with the spoken language.
Affiliation(s)
- Shweta Soni
- The University of Lethbridge, Lethbridge, AB, Canada.
|
117
|
Tune S, Alavash M, Fiedler L, Obleser J. Neural attentional-filter mechanisms of listening success in middle-aged and older individuals. Nat Commun 2021; 12:4533. [PMID: 34312388 PMCID: PMC8313676 DOI: 10.1038/s41467-021-24771-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Accepted: 07/01/2021] [Indexed: 12/12/2022] Open
Abstract
Successful listening crucially depends on intact attentional filters that separate relevant from irrelevant information. Research into their neurobiological implementation has focused on two potential auditory filter strategies: the lateralization of alpha power and selective neural speech tracking. However, the functional interplay of the two neural filter strategies and their potency to index listening success in an ageing population remains unclear. Using electroencephalography and a dual-talker task in a representative sample of listeners (N = 155, aged 39–80 years), we here demonstrate an often-missed link from single-trial behavioural outcomes back to trial-by-trial changes in neural attentional filtering. First, we observe preserved attentional-cue-driven modulation of both neural filters across chronological age and hearing levels. Second, neural filter states vary independently of one another, demonstrating complementary neurobiological solutions of spatial selective attention. Stronger neural speech tracking, but not alpha lateralization, boosts trial-to-trial behavioural performance. Our results highlight the translational potential of neural speech tracking as an individualized neural marker of adaptive listening behaviour.
Affiliation(s)
- Sarah Tune
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Center for Brain, Behavior, and Metabolism, University of Lübeck, Lübeck, Germany
- Mohsen Alavash
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Center for Brain, Behavior, and Metabolism, University of Lübeck, Lübeck, Germany
- Lorenz Fiedler
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Center for Brain, Behavior, and Metabolism, University of Lübeck, Lübeck, Germany
- Eriksholm Research Centre, Snekkersten, Denmark
- Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Center for Brain, Behavior, and Metabolism, University of Lübeck, Lübeck, Germany
|
118
|
Meng X, Sun C, Du B, Liu L, Zhang Y, Dong Q, Georgiou GK, Nan Y. The development of brain rhythms at rest and its impact on vocabulary acquisition. Dev Sci 2021; 25:e13157. [PMID: 34258830 DOI: 10.1111/desc.13157] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2021] [Revised: 06/28/2021] [Accepted: 07/01/2021] [Indexed: 11/27/2022]
Abstract
A long-standing question in developmental science is how the neurodevelopment of the brain influences cognitive functions. Here, we examined the developmental change of resting EEG power and its links to vocabulary acquisition in school-age children. We further explored what mechanisms may mediate the relation between brain rhythm maturation and vocabulary knowledge. Eyes-open resting-state EEG data were recorded from 53 typically developing Chinese children every 2 years between the ages of 7 and 11. Our results showed, first, that delta, theta, and gamma power decreased over time, whereas alpha and beta power increased over time. Second, after controlling for general cognitive abilities, age, home literacy environment, and phonological skills, theta decreases explained 6.9% and 14.4% of unique variance in expressive vocabulary at ages 9 and 11, respectively. We also found that beta increase from age 7 to 9 significantly predicted receptive vocabulary at age 11. Finally, theta decrease predicted expressive vocabulary through the effects of phoneme deletion at age 9 and tone discrimination at age 11. These results substantiate the important role of brain oscillations at rest, especially the theta rhythm, in language development. The developmental change of brain rhythms could serve as a sensitive biomarker for vocabulary development in school-age children, which would be of great value in identifying children at risk of language impairment.
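Resting power in the canonical frequency bands, as analyzed in this study, is typically estimated from the power spectral density. A sketch using one common set of band edges (the edges, sampling rate, and synthetic signal are assumptions; the study's exact definitions may differ):

```python
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_power(x, fs):
    """Mean power spectral density within each canonical EEG band."""
    freqs, psd = welch(x, fs=fs, nperseg=fs * 2)
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

fs = 250
rng = np.random.default_rng(4)
t = np.arange(0, 30, 1 / fs)

# Synthetic resting EEG: a dominant 10 Hz alpha rhythm plus broadband noise
eeg = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.normal(size=t.size)

power = band_power(eeg, fs)  # alpha power should dominate here
```

Developmental analyses of the kind reported above would compare such per-band estimates across recording sessions.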
Affiliation(s)
- Xiangyun Meng
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Chen Sun
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Boqi Du
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Li Liu
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Yuxuan Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Qi Dong
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- George K Georgiou
- Department of Educational Psychology, University of Alberta, Edmonton, Alberta, Canada
- Yun Nan
- State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
|
119
|
Effects of long-term unilateral cochlear implant use on large-scale network synchronization in adolescents. Hear Res 2021; 409:108308. [PMID: 34343851 DOI: 10.1016/j.heares.2021.108308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/04/2021] [Revised: 06/25/2021] [Accepted: 06/29/2021] [Indexed: 11/20/2022]
Abstract
Unilateral cochlear implantation (CI) limits deafness-related changes in the auditory pathways but promotes abnormal cortical preference for the stimulated ear and leaves the opposite ear with little protection from auditory deprivation. In the present study, time-frequency analyses of event-related potentials elicited from stimuli presented to each ear were used to determine effects of unilateral CI use on cortical synchrony. CI-elicited activity in 34 adolescents (15.4 ± 1.9 years of age) who had listened with unilateral CIs for most of their lives prior to bilateral implantation was compared to responses elicited by a 500 Hz tone burst in normal-hearing peers. Phase-locking values between 4 and 60 Hz were calculated for 171 pairs of 19 cephalic recording electrodes. Ear-specific results were found in the normal-hearing group: higher synchronization in low frequency bands (theta and alpha) from left-ear stimulation in the right hemisphere and more high frequency activity (gamma band) from right-ear stimulation in the left hemisphere. In the CI group, increased phase synchronization in the theta and beta frequencies with bursts of gamma activity were elicited by the experienced-right CI between frontal, temporal and parietal cortical regions in both hemispheres, consistent with increased recruitment of cortical areas involved in attention and higher-order processes, potentially to support unilateral listening. By contrast, activity was globally desynchronized in response to initial stimulation of the naïve-left ear, suggesting decoupling of these pathways from the cortical hearing network. These data reveal asymmetric auditory development promoted by unilateral CI use, resulting in an abnormally mature neural network.
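Phase-locking values of the kind computed here between electrode pairs quantify the consistency of the phase difference between two signals: PLV = |⟨e^{iΔφ}⟩|, with 0 meaning no phase coupling and 1 a perfectly constant phase difference. A minimal single-epoch sketch on synthetic signals (the frequencies, durations, and averaging over time rather than over trials are simplifying assumptions):

```python
import numpy as np
from scipy.signal import hilbert

def plv(x, y):
    """Phase-locking value between two signals via their Hilbert phases."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * dphi)))

fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)

theta = np.sin(2 * np.pi * 6 * t)            # a 6 Hz theta rhythm
coupled = np.sin(2 * np.pi * 6 * t + 0.5)    # same rhythm, fixed phase offset
unrelated = rng.normal(size=t.size)          # independent broadband noise

plv_coupled = plv(theta, coupled)    # near 1: constant phase difference
plv_noise = plv(theta, unrelated)    # low: phase difference is random
```

In practice, PLV is usually computed across trials at each time-frequency point after band-pass filtering; averaging over time within one epoch, as above, is a simplification for illustration.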
|
120
|
Hashemnia S, Grasse L, Soni S, Tata MS. Human EEG and Recurrent Neural Networks Exhibit Common Temporal Dynamics During Speech Recognition. Front Syst Neurosci 2021; 15:617605. [PMID: 34305540 PMCID: PMC8296978 DOI: 10.3389/fnsys.2021.617605] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Accepted: 06/10/2021] [Indexed: 11/13/2022] Open
Abstract
Recent deep-learning artificial neural networks have shown remarkable success in recognizing natural human speech; however, the reasons for their success are not entirely understood. Their success might stem from the fact that state-of-the-art networks use recurrent layers or dilated convolutional layers, which enable the network to use a time-dependent feature space. The importance of time-dependent features in human cortical mechanisms of speech perception, measured by electroencephalography (EEG) and magnetoencephalography (MEG), has also been of particular recent interest. It is possible that recurrent neural networks (RNNs) achieve their success by emulating aspects of cortical dynamics, albeit through very different computational mechanisms. In that case, we should observe commonalities in the temporal dynamics of deep-learning models, particularly in recurrent layers, and brain electrical activity (EEG) during speech perception. We explored this prediction by presenting the same sentences to both human listeners and the Deep Speech RNN and considered the temporal dynamics of the EEG and RNN units for identical sentences. We tested whether the recently discovered phenomenon of envelope phase tracking in the human EEG is also evident in RNN hidden layers. We furthermore predicted that the clustering of dissimilarity between model representations of pairs of stimuli would be similar in both RNN and EEG dynamics. We found that the dynamics of both the recurrent layer of the network and human EEG signals exhibit envelope phase tracking with similar time lags. We also computed the representational distance matrices (RDMs) of brain and network responses to speech stimuli. The model RDMs became more similar to the brain RDM when going from early network layers to later ones, and eventually peaked at the recurrent layer. These results suggest that the Deep Speech RNN captures a representation of temporal features of speech in a manner similar to the human brain.
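Representational distance matrices of the kind compared in this study reduce each system's responses to pairwise dissimilarities between stimuli, so brain and network activity can be compared in a common space regardless of their native dimensionality. A toy sketch (the dimensions, noise level, and correlation-distance choice are assumptions, not the authors' exact analysis):

```python
import numpy as np

def rdm(responses):
    """RDM: 1 - Pearson correlation between the response patterns
    evoked by each pair of stimuli (rows = stimuli)."""
    return 1 - np.corrcoef(responses)

def rdm_similarity(a, b):
    """Correlate the upper triangles of two RDMs (second-order similarity)."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

rng = np.random.default_rng(2)
stimuli = rng.normal(size=(10, 50))     # 10 stimuli x 50 latent features

# A "brain" whose responses share the stimulus geometry, plus sensor noise
projection = rng.normal(size=(50, 200))  # latent features -> 200 channels
brain = stimuli @ projection + 0.5 * rng.normal(size=(10, 200))

similarity = rdm_similarity(rdm(stimuli), rdm(brain))  # high when geometry is shared
```

Computing this second-order similarity layer by layer is what allows statements like "model RDMs peak in similarity to the brain RDM at the recurrent layer."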
Affiliation(s)
- Matthew S. Tata
- Canadian Centre for Behavioural Neuroscience, Department of Neuroscience, University of Lethbridge, Lethbridge, AB, Canada
|
121
|
Kraus F, Tune S, Ruhe A, Obleser J, Wöstmann M. Unilateral Acoustic Degradation Delays Attentional Separation of Competing Speech. Trends Hear 2021; 25:23312165211013242. [PMID: 34184964 PMCID: PMC8246482 DOI: 10.1177/23312165211013242] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
Abstract
Hearing loss is often asymmetric such that hearing thresholds differ substantially between the two ears. The extreme case of such asymmetric hearing is single-sided deafness. A unilateral cochlear implant (CI) on the more severely impaired ear is an effective treatment to restore hearing. The interactive effects of unilateral acoustic degradation and spatial attention to one sound source in multitalker situations are at present unclear. Here, we simulated some features of listening with a unilateral CI in young, normal-hearing listeners (N = 22) who were presented with 8-band noise-vocoded speech to one ear and intact speech to the other ear. Neural responses were recorded in the electroencephalogram to obtain the spectrotemporal response function to speech. Listeners made more mistakes when answering questions about vocoded (vs. intact) attended speech. At the neural level, we asked how unilateral acoustic degradation would impact the attention-induced amplification of tracking target versus distracting speech. Interestingly, unilateral degradation did not per se reduce the attention-induced amplification but instead delayed it in time: Speech encoding accuracy, modelled on the basis of the spectrotemporal response function, was significantly enhanced for attended versus ignored intact speech at earlier neural response latencies (<∼250 ms). This attentional enhancement was not absent but delayed for vocoded speech. These findings suggest that attentional selection of unilateral, degraded speech is feasible but induces delayed neural separation of competing speech, which might explain listening challenges experienced by unilateral CI users.
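A spectrotemporal response function of the kind modelled here (simplified below to a single-feature temporal response function) can be estimated by regularized regression of the EEG on time-lagged copies of the stimulus envelope. A minimal sketch with a synthetic response kernel (the exponential kernel, lag count, and ridge parameter are assumptions):

```python
import numpy as np

def lagged_design(stim, n_lags):
    """Design matrix whose k-th column is the stimulus delayed by k samples."""
    X = np.zeros((stim.size, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[:stim.size - k]
    return X

rng = np.random.default_rng(3)
fs = 100                                   # assumed sampling rate (Hz)
stim = rng.normal(size=20 * fs)            # 20 s of "speech envelope"
true_trf = np.exp(-np.arange(30) / 10.0)   # assumed 300 ms response kernel
eeg = np.convolve(stim, true_trf)[:stim.size] + rng.normal(size=stim.size)

# Ridge-regularized least squares: trf = (X'X + lam*I)^(-1) X'y
X = lagged_design(stim, n_lags=30)
lam = 1e-2
trf = np.linalg.solve(X.T @ X + lam * np.eye(30), X.T @ eeg)
```

Comparing the predicted and measured EEG from such a model, lag by lag, is what yields the latency-resolved "speech encoding accuracy" contrasted between attended and ignored speech above.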
Affiliation(s)
- Frauke Kraus
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Sarah Tune
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Anna Ruhe
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Jonas Obleser
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Malte Wöstmann
- Department of Psychology, University of Lübeck, Lübeck, Germany
|
122
|
McAuley JD, Shen Y, Smith T, Kidd GR. Effects of speech-rhythm disruption on selective listening with a single background talker. Atten Percept Psychophys 2021; 83:2229-2240. [PMID: 33782913 PMCID: PMC10612531 DOI: 10.3758/s13414-021-02298-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/05/2021] [Indexed: 11/08/2022]
Abstract
Recent work by McAuley et al. (Attention, Perception, & Psychophysics, 82, 3222-3233, 2020) using the Coordinate Response Measure (CRM) paradigm with a multitalker background revealed that altering the natural rhythm of target speech amidst background speech worsens target recognition (a target-rhythm effect), while altering background speech rhythm improves target recognition (a background-rhythm effect). Here, we used a single-talker background to examine the role of specific properties of target and background sound patterns on selective listening without the complexity of multiple background stimuli. Experiment 1 manipulated the sex of the background talker, presented with a male target talker, to assess target and background-rhythm effects with and without a strong pitch cue to aid perceptual segregation. Experiment 2 used a vocoded single-talker background to examine target and background-rhythm effects with envelope-based speech rhythms preserved, but without semantic content or temporal fine structure. While a target-rhythm effect was present with all backgrounds, the background-rhythm effect was only observed for the same-sex background condition. Results provide additional support for a selective entrainment hypothesis, while also showing that the background-rhythm effect is not driven by envelope-based speech rhythm alone, and may be reduced or eliminated when pitch or other acoustic differences provide a strong basis for selective listening.
Affiliation(s)
- J Devin McAuley
- Department of Psychology, Michigan State University, East Lansing, MI, 48824, USA
- Yi Shen
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Toni Smith
- Department of Psychology, Michigan State University, East Lansing, MI, 48824, USA
- Gary R Kidd
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
|
123
|
Zoefel B. Visual speech cues recruit neural oscillations to optimise auditory perception: Ways forward for research on human communication. CURRENT RESEARCH IN NEUROBIOLOGY 2021; 2:100015. [PMID: 36246513 PMCID: PMC9559900 DOI: 10.1016/j.crneur.2021.100015] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Revised: 06/21/2021] [Accepted: 06/21/2021] [Indexed: 11/22/2022] Open
Abstract
In pandemic times, when visual speech cues are masked, it becomes particularly evident how much we rely on them to communicate. Recent research points to a key role of neural oscillations for cross-modal predictions during speech perception. This article bridges several fields of research - neural oscillations, cross-modal speech perception and brain stimulation - to propose ways forward for research on human communication. Future research can test: (1) whether "speech is special" for oscillatory processes underlying cross-modal predictions; (2) whether "visual control" of oscillatory processes in the auditory system is strongest in moments of reduced acoustic regularity; and (3) whether providing information to the brain via electric stimulation can overcome deficits associated with cross-modal information processing in certain pathological conditions.
Affiliation(s)
- Benedikt Zoefel
- Centre de Recherche Cerveau et Cognition (CerCo), CNRS UMR 5549, CHU Purpan, Pavillon Baudot, 31052, Toulouse, France
|
124
|
Yao B, Taylor JR, Banks B, Kotz SA. Reading direct speech quotes increases theta phase-locking: Evidence for cortical tracking of inner speech? Neuroimage 2021; 239:118313. [PMID: 34175425 DOI: 10.1016/j.neuroimage.2021.118313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Revised: 05/28/2021] [Accepted: 06/24/2021] [Indexed: 11/25/2022] Open
Abstract
Growing evidence shows that theta-band (4-7 Hz) activity in the auditory cortex phase-locks to rhythms of overt speech. Does theta activity also encode the rhythmic dynamics of inner speech? Previous research established that silent reading of direct speech quotes (e.g., Mary said: "This dress is lovely!") elicits more vivid inner speech than indirect speech quotes (e.g., Mary said that the dress was lovely). As we cannot directly track the phase alignment between theta activity and inner speech over time, we used EEG to measure the brain's phase-locked responses to the onset of speech quote reading. We found that direct (vs. indirect) quote reading was associated with increased theta phase synchrony over trials at 250-500 ms post-reading onset, with sources of the evoked activity estimated in the speech processing network. An eye-tracking control experiment confirmed that increased theta phase synchrony in direct quote reading was not driven by eye movement patterns, and more likely reflects synchronous phase resetting at the onset of inner speech. These findings suggest a functional role of theta phase modulation in reading-induced inner speech.
Affiliation(s)
- Bo Yao
- Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PL, United Kingdom
- Jason R Taylor
- Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PL, United Kingdom
- Briony Banks
- Department of Psychology, Lancaster University, Lancaster LA1 4YF, United Kingdom
- Sonja A Kotz
- Department of Neuropsychology & Psychopharmacology, Maastricht University, Maastricht 6211 LK, Netherlands; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
|
125
|
Sun Y, Michalareas G, Poeppel D. The impact of phase entrainment on auditory detection is highly variable: Revisiting a key finding. Eur J Neurosci 2021; 55:3373-3390. [PMID: 34155728 DOI: 10.1111/ejn.15367] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2020] [Revised: 06/13/2021] [Accepted: 06/15/2021] [Indexed: 11/29/2022]
Abstract
Ample evidence shows that the human brain carefully tracks acoustic temporal regularities in the input, perhaps by entraining cortical neural oscillations to the rate of the stimulation. To what extent the entrained oscillatory activity influences processing of upcoming auditory events remains debated. Here, we revisit a critical finding from Hickok et al. (2015) that demonstrated a clear impact of auditory entrainment on subsequent auditory detection. Participants were asked to detect tones embedded in stationary noise, following a noise that was amplitude modulated at 3 Hz. Tonal targets occurred at various phases relative to the preceding noise modulation. The original study (N = 5) showed that the detectability of the tones (presented at near-threshold intensity) fluctuated cyclically at the same rate as the preceding noise modulation. We conducted an exact replication of the original paradigm (N = 23) and a conceptual replication using a shorter experimental procedure (N = 24). Neither experiment revealed significant entrainment effects at the group level. A restricted analysis on the subset of participants (36%) who did show the entrainment effect revealed no consistent phase alignment between detection facilitation and the preceding rhythmic modulation. Interestingly, both experiments showed group-wide presence of a non-cyclic behavioural pattern, wherein participants' detection of the tonal targets was lower at early and late time points of the target period. The two experiments highlight both the sensitivity of the task to elicit oscillatory entrainment and the striking individual variability in performance.
Affiliation(s)
- Yue Sun
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
- Georgios Michalareas
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
- David Poeppel
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany; Department of Psychology, New York University, New York, New York, USA; Max Planck-NYU Center for Language, Music, and Emotion (CLaME), New York, New York, USA; Ernst Strüngmann Institute for Neuroscience, Frankfurt, Germany
126
Bröhl F, Kayser C. Delta/theta band EEG differentially tracks low and high frequency speech-derived envelopes. Neuroimage 2021; 233:117958. [PMID: 33744458 PMCID: PMC8204264 DOI: 10.1016/j.neuroimage.2021.117958] [Received: 11/10/2020] [Revised: 03/08/2021] [Accepted: 03/09/2021] [Indexed: 11/01/2022]
Abstract
The representation of speech in the brain is often examined by measuring the alignment of rhythmic brain activity to the speech envelope. To conveniently quantify this alignment (termed 'speech tracking'), many studies consider the broadband speech envelope, which combines acoustic fluctuations across the spectral range. Using EEG recordings, we show that using this broadband envelope can provide a distorted picture of speech encoding. We systematically investigated the encoding of spectrally-limited speech-derived envelopes presented by individual and multiple noise carriers in the human brain. Tracking in the 1 to 6 Hz EEG bands differentially reflected low (0.2 - 0.83 kHz) and high (2.66 - 8 kHz) frequency speech-derived envelopes. This was independent of the specific carrier frequency but sensitive to attentional manipulations, and may reflect the context-dependent emphasis of information from distinct spectral ranges of the speech envelope in low frequency brain activity. As low and high frequency speech envelopes relate to distinct phonemic features, our results suggest that functionally distinct processes contribute to speech tracking in the same EEG bands, and are easily confounded when considering the broadband speech envelope.
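Spectrally-limited envelopes of the kind used here are commonly obtained by band-passing the audio and taking the magnitude of the analytic signal. A minimal numpy-only sketch (function names are illustrative, and real pipelines typically use filter-bank band-passing rather than this crude FFT masking):

```python
import numpy as np

def analytic_envelope(x):
    """Amplitude envelope via the FFT-based analytic signal (Hilbert transform)."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spec * h))

def band_envelope(x, fs, lo, hi):
    """Envelope of one spectral band: FFT-mask band-pass, then envelope."""
    freqs = np.abs(np.fft.fftfreq(len(x), d=1 / fs))
    mask = (freqs >= lo) & (freqs <= hi)
    band = np.real(np.fft.ifft(np.fft.fft(x) * mask))
    return analytic_envelope(band)

fs = 8000
t = np.arange(fs) / fs                  # 1 s of signal
x = np.cos(2 * np.pi * 400 * t)         # a 400-Hz carrier
env = band_envelope(x, fs, 200, 830)    # "low" band, as in the study
```

For a pure tone inside the band, the recovered envelope is flat at the tone's amplitude, which is the sanity check used above.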
Affiliation(s)
- Felix Bröhl
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany.
- Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany
127
Kolozsvári OB, Xu W, Gerike G, Parviainen T, Nieminen L, Noiray A, Hämäläinen JA. Coherence Between Brain Activation and Speech Envelope at Word and Sentence Levels Showed Age-Related Differences in Low Frequency Bands. Neurobiol Lang 2021; 2:226-253. [PMID: 37216146 PMCID: PMC10158622 DOI: 10.1162/nol_a_00033] [Received: 07/17/2020] [Accepted: 02/17/2021] [Indexed: 05/24/2023]
Abstract
Speech perception is dynamic and shows changes across development. In parallel, functional differences in brain development over time have been well documented and these differences may interact with changes in speech perception during infancy and childhood. Further, there is evidence that the two hemispheres contribute unequally to speech segmentation at the sentence and phonemic levels. To disentangle those contributions, we studied the cortical tracking of various sized units of speech that are crucial for spoken language processing in children (4.7-9.3 years old, N = 34) and adults (N = 19). We measured participants' magnetoencephalogram (MEG) responses to syllables, words, and sentences, calculated the coherence between the speech signal and MEG responses at the level of words and sentences, and further examined auditory evoked responses to syllables. Age-related differences were found for coherence values at the delta and theta frequency bands. Both frequency bands showed an effect of stimulus type, although this was attributed to the length of the stimulus and not the linguistic unit size. There was no difference between hemispheres at the source level either in coherence values for word or sentence processing or in evoked response to syllables. Results highlight the importance of the lower frequencies for speech tracking in the brain across different lexical units. Further, stimulus length affects the speech-brain associations suggesting methodological approaches should be selected carefully when studying speech envelope processing at the neural level. Speech tracking in the brain seems decoupled from more general maturation of the auditory cortex.
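Speech–brain coherence of the kind computed here (and in several other entries in this list) is, in essence, magnitude-squared coherence between the speech envelope and the MEG/EEG signal, estimated from segment-averaged cross-spectra. A minimal numpy sketch, assuming Hann windowing and a segment length that are illustrative choices rather than the study's parameters:

```python
import numpy as np

def msc(x, y, fs, nperseg=256):
    """Magnitude-squared coherence between signals x and y,
    from Hann-windowed, segment-averaged (cross-)spectra."""
    nseg = len(x) // nperseg
    win = np.hanning(nperseg)
    X = np.fft.rfft(x[:nseg * nperseg].reshape(nseg, nperseg) * win, axis=1)
    Y = np.fft.rfft(y[:nseg * nperseg].reshape(nseg, nperseg) * win, axis=1)
    sxy = (X * np.conj(Y)).mean(axis=0)
    sxx = (np.abs(X) ** 2).mean(axis=0)
    syy = (np.abs(Y) ** 2).mean(axis=0)
    freqs = np.fft.rfftfreq(nperseg, d=1 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)

rng = np.random.default_rng(0)
envelope = rng.standard_normal(256 * 40)   # stand-in "speech envelope"
meg = envelope + 0.0                       # perfectly coherent recording
freqs, coh = msc(envelope, meg, fs=1000)   # coherence = 1 at every frequency
```

With real data the two signals share only a fraction of their variance, so coherence values are small and are compared against surrogate data for significance.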
Affiliation(s)
- Orsolya B. Kolozsvári
- Department of Psychology, University of Jyväskylä, Finland
- Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Finland
- Weiyong Xu
- Department of Psychology, University of Jyväskylä, Finland
- Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Finland
- Georgia Gerike
- Department of Psychology, University of Jyväskylä, Finland
- Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Finland
- Niilo Mäki Institute, Jyväskylä, Finland
- Tiina Parviainen
- Department of Psychology, University of Jyväskylä, Finland
- Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Finland
- Lea Nieminen
- Centre for Applied Language Studies, University of Jyväskylä, Finland
- Aude Noiray
- Laboratory for Oral Language Acquisition (LOLA), University of Potsdam, Germany
- Jarmo A. Hämäläinen
- Department of Psychology, University of Jyväskylä, Finland
- Centre for Interdisciplinary Brain Research (CIBR), University of Jyväskylä, Finland
128
Xu C, Zou J, He F, Wen X, Li J, Gao J, Ding N, Luo B. Neural Tracking of Sound Rhythms Correlates With Diagnosis, Severity, and Prognosis of Disorders of Consciousness. Front Neurosci 2021; 15:646543. [PMID: 33994924 PMCID: PMC8113690 DOI: 10.3389/fnins.2021.646543] [Received: 12/27/2020] [Accepted: 03/19/2021] [Indexed: 12/03/2022]
Abstract
Effective diagnosis and prognosis of patients with disorders of consciousness (DOC) provides a basis for family counseling, decision-making, and the design of rehabilitation programs. However, effective and objective bedside evaluation is a challenging problem. In this study, we explored electroencephalography (EEG) responses tracking sound rhythms as potential neural markers for DOC evaluation. We analyzed the responses to natural speech and to tones modulated at 2 and 41 Hz. At the population level, patients with positive outcomes (DOC-P) showed higher cortical synchronization to modulated tones at 41 Hz compared with patients with negative outcomes (DOC-N). At the individual level, phase coherence to modulated tones at 41 Hz was significantly correlated with Coma Recovery Scale-Revised (CRS-R) and Glasgow Outcome Scale-Extended (GOS-E) scores. Furthermore, SVM classifiers, trained using phase coherences in higher frequency bands or a combination of the low-frequency auditory steady-state response (aSSR) and speech-tracking responses, performed very well in the diagnosis and prognosis of DOC. These findings show that the EEG response to auditory rhythms is a potential tool for evaluating the diagnosis, severity, and prognosis of DOC.
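The phase-coherence measure referred to here is typically inter-trial phase coherence at the stimulation frequency: the magnitude of the mean unit-length phase vector across trials. A hedged numpy sketch with illustrative names and shapes (not the study's implementation):

```python
import numpy as np

def inter_trial_phase_coherence(trials, fs, freq):
    """ITPC at `freq`: 1.0 means identical phase on every trial, ~0 means
    random phases. `trials` has shape (n_trials, n_samples)."""
    n = trials.shape[1]
    k = int(round(freq * n / fs))            # FFT bin nearest the target frequency
    spec = np.fft.rfft(trials, axis=1)[:, k]
    return np.abs(np.mean(spec / np.abs(spec)))

fs, n = 1000, 1000
t = np.arange(n) / fs
# 20 trials with identical 41-Hz phase but varying amplitude -> ITPC = 1
locked = np.array([(i + 1) * np.cos(2 * np.pi * 41 * t + 0.3) for i in range(20)])
itpc = inter_trial_phase_coherence(locked, fs, 41)
```

Because the unit vectors discard amplitude, ITPC isolates phase consistency, which is why it is robust to trial-to-trial gain differences.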
Affiliation(s)
- Chuan Xu
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China; Research Center for Advanced Artificial Intelligence Theory, Zhejiang Lab, Hangzhou, China
- Fangping He
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Xinrui Wen
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Jingqi Li
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou, China
- Jian Gao
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China; Research Center for Advanced Artificial Intelligence Theory, Zhejiang Lab, Hangzhou, China
- Benyan Luo
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
129
Speech Perception with Noise Vocoding and Background Noise: An EEG and Behavioral Study. J Assoc Res Otolaryngol 2021; 22:349-363. [PMID: 33851289 DOI: 10.1007/s10162-021-00787-2] [Received: 03/13/2020] [Accepted: 01/26/2021] [Indexed: 10/21/2022]
Abstract
This study explored the physiological response of the human brain to degraded speech syllables. The degradation was introduced using noise vocoding and/or background noise. The goal was to identify physiological features of auditory-evoked potentials (AEPs) that may explain speech intelligibility. Ten human subjects with normal hearing participated in syllable-detection tasks, while their AEPs were recorded with 32-channel electroencephalography. Subjects were presented with six syllables in the form of consonant-vowel-consonant or vowel-consonant-vowel. Noise vocoding with 22 or 4 frequency channels was applied to the syllables. When examining the peak heights in the AEPs (P1, N1, and P2), vocoding alone showed no consistent effect. P1 was not consistently reduced by background noise, N1 was sometimes reduced by noise, and P2 was almost always highly reduced. Two other physiological metrics were examined: (1) classification accuracy of the syllables based on AEPs, which indicated whether AEPs were distinguishable for different syllables, and (2) cross-condition correlation of AEPs (rcc) between the clean and degraded speech, which indicated the brain's ability to extract speech-related features and suppress response to noise. Both metrics decreased with degraded speech quality. We further tested if the two metrics can explain cross-subject variations in their behavioral performance. A significant correlation existed for rcc, as well as classification based on early AEPs, in the fronto-central areas. Because rcc indicates similarities between clean and degraded speech, our finding suggests that high speech intelligibility may be a result of the brain's ability to ignore noise in the sound carrier and/or background.
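The cross-condition correlation metric (rcc) described here can be sketched as a Pearson correlation between trial-averaged AEP waveforms from the clean and degraded conditions; the names, shapes, and simulated waveform below are illustrative assumptions, not the study's data or code:

```python
import numpy as np

def cross_condition_correlation(clean_trials, degraded_trials):
    """rcc: Pearson correlation between the trial-averaged evoked responses
    of two conditions. High values mean the degraded-speech response preserves
    the shape of the clean-speech response. Inputs: (n_trials, n_samples)."""
    clean_avg = clean_trials.mean(axis=0)
    degraded_avg = degraded_trials.mean(axis=0)
    return np.corrcoef(clean_avg, degraded_avg)[0, 1]

rng = np.random.default_rng(1)
t = np.linspace(0, 0.5, 500)
# toy AEP: an early positive peak followed by a negative deflection
aep = np.exp(-((t - 0.1) / 0.02) ** 2) - 0.5 * np.exp(-((t - 0.2) / 0.04) ** 2)
clean = aep + 0.05 * rng.standard_normal((50, 500))
degraded = 0.6 * aep + 0.05 * rng.standard_normal((50, 500))  # attenuated, same shape
rcc = cross_condition_correlation(clean, degraded)            # close to 1
```

Note that rcc is insensitive to uniform attenuation (as in the toy example) and drops only when the response waveform itself changes shape or is swamped by noise.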
130
Differential contributions of synaptic and intrinsic inhibitory currents to speech segmentation via flexible phase-locking in neural oscillators. PLoS Comput Biol 2021; 17:e1008783. [PMID: 33852573 PMCID: PMC8104450 DOI: 10.1371/journal.pcbi.1008783] [Received: 02/06/2020] [Revised: 05/07/2021] [Accepted: 02/05/2021] [Indexed: 01/07/2023]
Abstract
Current hypotheses suggest that speech segmentation—the initial division and grouping of the speech stream into candidate phrases, syllables, and phonemes for further linguistic processing—is executed by a hierarchy of oscillators in auditory cortex. Theta (∼3-12 Hz) rhythms play a key role by phase-locking to recurring acoustic features marking syllable boundaries. Reliable synchronization to quasi-rhythmic inputs, whose variable frequency can dip below cortical theta frequencies (down to ∼1 Hz), requires “flexible” theta oscillators whose underlying neuronal mechanisms remain unknown. Using biophysical computational models, we found that the flexibility of phase-locking in neural oscillators depended on the types of hyperpolarizing currents that paced them. Simulated cortical theta oscillators flexibly phase-locked to slow inputs when these inputs caused both (i) spiking and (ii) the subsequent buildup of outward current sufficient to delay further spiking until the next input. The greatest flexibility in phase-locking arose from a synergistic interaction between intrinsic currents that was not replicated by synaptic currents at similar timescales. Flexibility in phase-locking enabled improved entrainment to speech input, optimal at mid-vocalic channels, which in turn supported syllabic-timescale segmentation through identification of vocalic nuclei. Our results suggest that synaptic and intrinsic inhibition contribute to frequency-restricted and -flexible phase-locking in neural oscillators, respectively. Their differential deployment may enable neural oscillators to play diverse roles, from reliable internal clocking to adaptive segmentation of quasi-regular sensory inputs like speech.
Oscillatory activity in auditory cortex is believed to play an important role in auditory and speech processing. One suggested function of these rhythms is to divide the speech stream into candidate phonemes, syllables, words, and phrases, to be matched with learned linguistic templates. This requires brain rhythms to flexibly synchronize with regular acoustic features of the speech stream. How neuronal circuits implement this task remains unknown. In this study, we explored the contribution of inhibitory currents to flexible phase-locking in neuronal theta oscillators, believed to perform initial syllabic segmentation. We found that a combination of specific intrinsic inhibitory currents at multiple timescales, present in a large class of cortical neurons, enabled exceptionally flexible phase-locking, which could be used to precisely segment speech by identifying vowels at mid-syllable. This suggests that the cells exhibiting these currents are a key component in the brain’s auditory and speech processing architecture.
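The flexible phase-locking idea can be illustrated (far more simply than the paper's biophysical models) with a single forced phase oscillator: a theta-frequency oscillator locks to a slower rhythmic input only when its coupling is strong enough to overcome the frequency mismatch. All parameters below are illustrative assumptions:

```python
import numpy as np

def phase_locking_value(f_osc, f_input, coupling, dur=10.0, dt=1e-3):
    """Drive a phase oscillator (intrinsic rate f_osc, in Hz) with a rhythm at
    f_input; return the phase-locking value of the input-oscillator phase
    difference (near 1 = locked, near 0 = drifting)."""
    theta = 0.0
    diffs = []
    for ti in np.arange(0.0, dur, dt):
        phi_in = 2 * np.pi * f_input * ti
        # Euler step of d(theta)/dt = omega_0 + K * sin(phase difference)
        theta += dt * (2 * np.pi * f_osc + coupling * np.sin(phi_in - theta))
        diffs.append(phi_in - theta)
    d = np.array(diffs[len(diffs) // 2:])     # discard the initial transient
    return np.abs(np.mean(np.exp(1j * d)))    # phase-locking value

locked = phase_locking_value(6.0, 5.0, coupling=20.0)  # strong coupling: locks
free = phase_locking_value(6.0, 5.0, coupling=0.0)     # no coupling: drifts
```

The locking condition for this toy model is that the coupling exceed the frequency mismatch (|2π·Δf| rad/s); the paper's contribution is showing how specific inhibitory currents widen that locking range in biophysically realistic neurons.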
131
Chen F, Zhang H, Ding H, Wang S, Peng G, Zhang Y. Neural coding of formant-exaggerated speech and nonspeech in children with and without autism spectrum disorders. Autism Res 2021; 14:1357-1374. [PMID: 33792205 DOI: 10.1002/aur.2509] [Received: 05/18/2020] [Revised: 03/09/2021] [Accepted: 03/16/2021] [Indexed: 12/15/2022]
Abstract
The presence of vowel exaggeration in infant-directed speech (IDS) may adapt to the age-appropriate demands in speech and language acquisition. Previous studies have provided behavioral evidence of atypical auditory processing towards IDS in children with autism spectrum disorders (ASD), while the underlying neurophysiological mechanisms remain unknown. This event-related potential (ERP) study investigated the neural coding of formant-exaggerated speech and nonspeech in 24 4- to 11-year-old children with ASD and 24 typically-developing (TD) peers. The EEG data were recorded using an alternating block design, in which each stimulus type (exaggerated/non-exaggerated sound) was presented with equal probability. ERP waveform analysis revealed an enhanced P1 for vowel formant exaggeration in the TD group but not in the ASD group. This speech-specific atypical processing in ASD was not found for the nonspeech stimuli, which showed similar P1 enhancement in both ASD and TD groups. Moreover, the time-frequency analysis indicated that children with ASD showed differences in neural synchronization in the delta-theta bands for processing acoustic formant changes embedded in nonspeech. Collectively, the results add substantiating neurophysiological evidence (i.e., a lack of neural enhancement effect of vowel exaggeration) for atypical auditory processing of IDS in children with ASD, which may exert a negative effect on phonetic encoding and language learning. LAY SUMMARY: Atypical responses to motherese might act as a potential early marker of risk for children with ASD. This study investigated the neural responses to such socially relevant stimuli in the ASD brain, and the results suggested a lack of neural enhancement in response to motherese, even in individuals without intellectual disability.
Affiliation(s)
- Fei Chen
- School of Foreign Languages, Hunan University, Changsha, China; Research Centre for Language, Cognition, and Neuroscience & Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China; Department of Speech-Language-Hearing Sciences & Center for Neurobehavioral Development, University of Minnesota, Twin Cities, Minnesota, USA
- Hao Zhang
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
- Hongwei Ding
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
- Suiping Wang
- School of Psychology, South China Normal University, Guangzhou, China
- Gang Peng
- Research Centre for Language, Cognition, and Neuroscience & Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Yang Zhang
- Department of Speech-Language-Hearing Sciences & Center for Neurobehavioral Development, University of Minnesota, Twin Cities, Minnesota, USA
132
Lizarazu M, Carreiras M, Bourguignon M, Zarraga A, Molinaro N. Language Proficiency Entails Tuning Cortical Activity to Second Language Speech. Cereb Cortex 2021; 31:3820-3831. [PMID: 33791775 DOI: 10.1093/cercor/bhab051] [Received: 07/21/2020] [Revised: 02/15/2021] [Accepted: 02/15/2021] [Indexed: 11/12/2022]
Abstract
Cortical tracking of linguistic structures in speech, such as phrases (<3 Hz, delta band) and syllables (3-8 Hz, theta band), is known to be crucial for speech comprehension. However, it has not been established whether this effect is related to language proficiency. Here, we investigate how auditory cortical activity in second language (L2) learners tracked L2 speech. Using magnetoencephalography, we recorded brain activity from participants listening to Spanish and Basque. Participants were Spanish native (L1) language speakers studying Basque (L2) at the same language center at three different levels: beginner (Grade 1), intermediate (Grade 2), and advanced (Grade 3). We found that 1) both delta and theta tracking to L2 speech in the auditory cortex were related to L2 learning proficiency and that 2) top-down modulations of activity in the left auditory regions during L2 speech listening-by the left inferior frontal and motor regions in delta band and by the left middle temporal regions in theta band-were also related to L2 proficiency. Altogether, these results indicate that the ability to learn an L2 is related to successful cortical tracking of L2 speech and its modulation by neuronal oscillations in higher-order cortical regions.
Affiliation(s)
- Mikel Lizarazu
- BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, 20009, Spain; Laboratoire de Sciences Cognitives et Psycholinguistique, Département d'Etudes Cognitives, Ecole Normale Supérieure, EHESS, CNRS, PSL University, Paris 75005, France
- Manuel Carreiras
- BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, 20009, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, 48009, Spain
- Mathieu Bourguignon
- BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, 20009, Spain; Laboratoire de Cartographie fonctionnelle du Cerveau, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, 1050, Belgium; Laboratory of neurophysiology and movement biomechanics, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, 1050, Belgium
- Asier Zarraga
- BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, 20009, Spain
- Nicola Molinaro
- BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, 20009, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, 48009, Spain
133
Brown M, Tanenhaus MK, Dilley L. Syllable Inference as a Mechanism for Spoken Language Understanding. Top Cogn Sci 2021; 13:351-398. [PMID: 33780156 DOI: 10.1111/tops.12529] [Received: 08/05/2019] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 01/25/2023]
Abstract
A classic problem in spoken language comprehension is how listeners perceive speech as being composed of discrete words, given the variable time-course of information in continuous signals. We propose a syllable inference account of spoken word recognition and segmentation, according to which alternative hierarchical models of syllables, words, and phonemes are dynamically posited, which are expected to maximally predict incoming sensory input. Generative models are combined with current estimates of context speech rate drawn from neural oscillatory dynamics, which are sensitive to amplitude rises. Over time, models which result in local minima in error between predicted and recently experienced signals give rise to perceptions of hearing words. Three experiments using the visual world eye-tracking paradigm with a picture-selection task tested hypotheses motivated by this framework. Materials were sentences that were acoustically ambiguous in numbers of syllables, words, and phonemes they contained (cf. English plural constructions, such as "saw (a) raccoon(s) swimming," which have two loci of grammatical information). Time-compressing, or expanding, speech materials permitted determination of how temporal information at, or in the context of, each locus affected looks to, and selection of, pictures with a singular or plural referent (e.g., one or more than one raccoon). Supporting our account, listeners probabilistically interpreted identical chunks of speech as consistent with a singular or plural referent to a degree that was based on the chunk's gradient rate in relation to its context. We interpret these results as evidence that arriving temporal information, judged in relation to language model predictions generated from context speech rate evaluated on a continuous scale, informs inferences about syllables, thereby giving rise to perceptual experiences of understanding spoken language as words separated in time.
Affiliation(s)
- Meredith Brown
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA; Department of Psychiatry and Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, USA; Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Michael K Tanenhaus
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA; School of Psychology, Nanjing Normal University, Nanjing, China
- Laura Dilley
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan, USA
134
Visual Familiarity Induced 5-Hz Oscillations and Improved Orientation and Direction Selectivities in V1. J Neurosci 2021; 41:2656-2667. [PMID: 33563727 PMCID: PMC8018737 DOI: 10.1523/jneurosci.1337-20.2021] [Received: 05/28/2020] [Revised: 01/12/2021] [Accepted: 01/17/2021] [Indexed: 11/25/2022]
Abstract
Neural oscillations play critical roles in information processing, communication between brain areas, learning, and memory. We have recently discovered that familiar visual stimuli can robustly induce 5-Hz oscillations in the primary visual cortex (V1) of awake mice after the visual experience. To gain more mechanistic insight into this phenomenon, we used in vivo patch-clamp recordings to monitor the subthreshold activity of individual neurons during these oscillations. We analyzed the visual tuning properties of V1 neurons in naive and experienced mice to assess the effect of visual experience on the orientation and direction selectivity. Using optogenetic stimulation through the patch pipette in vivo, we measured the synaptic strength of specific intracortical and thalamocortical projections in vivo in the visual cortex before and after the visual experience. We found 5-Hz oscillations in membrane potential (Vm) and firing rates evoked in single neurons in response to the familiar stimulus, consistent with previous studies. Following the visual experience, the average firing rates of visual responses were reduced while the orientation and direction selectivities were increased. Light-evoked EPSCs were significantly increased for layer 5 (L5) projections to other layers of V1 after the visual experience, while the thalamocortical synaptic strength was decreased.
In addition, we developed a computational model that could reproduce 5-Hz oscillations with enhanced neuronal selectivity following synaptic plasticity within the recurrent network and decreased feedforward input. SIGNIFICANCE STATEMENT Neural oscillations at around 5 Hz are involved in visual working memory and temporal expectations in primary visual cortex (V1). However, how the oscillations modulate the visual response properties of neurons in V1 and their underlying mechanism is poorly understood. Here, we show that these oscillations may alter the orientation and direction selectivity of the layer 2/3 (L2/3) neurons and correlate with the synaptic plasticity within V1. Our computational recurrent network model reproduces all these observations and provides a mechanistic framework for studying the role of 5-Hz oscillations in visual familiarity.
135
de Lange P, Boto E, Holmes N, Hill RM, Bowtell R, Wens V, De Tiège X, Brookes MJ, Bourguignon M. Measuring the cortical tracking of speech with optically-pumped magnetometers. Neuroimage 2021; 233:117969. [PMID: 33744453 DOI: 10.1016/j.neuroimage.2021.117969] [Received: 08/17/2020] [Revised: 01/08/2021] [Accepted: 03/04/2021] [Indexed: 11/25/2022]
Abstract
During continuous speech listening, brain activity tracks speech rhythmicity at frequencies matching with the repetition rate of phrases (0.2-1.5 Hz), words (2-4 Hz) and syllables (4-8 Hz). Here, we evaluated the applicability of wearable MEG based on optically-pumped magnetometers (OPMs) to measure such cortical tracking of speech (CTS). Measuring CTS with OPMs is a priori challenging given the complications associated with OPM measurements at frequencies below 4 Hz, due to increased intrinsic interference and head movement artifacts. Still, this represents an important development as OPM-MEG provides lifespan compliance and substantially improved spatial resolution compared with classical MEG. In this study, four healthy right-handed adults listened to continuous speech for 9 min. The radial component of the magnetic field was recorded simultaneously with 45-46 OPMs evenly covering the scalp surface and fixed to an additively manufactured helmet which fitted all 4 participants. We estimated CTS with reconstruction accuracy and coherence, and determined the number of dominant principal components (PCs) to remove from the data (as a preprocessing step) for optimal estimation. We also identified the dominant source of CTS using a minimum norm estimate. CTS estimated with reconstruction accuracy and coherence was significant in all 4 participants at phrasal and word rates, and in 3 participants (reconstruction accuracy) or 2 (coherence) at syllabic rate. Overall, close-to-optimal CTS estimation was obtained when the 3 (reconstruction accuracy) or 10 (coherence) first PCs were removed from the data. Importantly, values of reconstruction accuracy (~0.4 for 0.2-1.5-Hz CTS and ~0.1 for 2-8-Hz CTS) were remarkably close to those previously reported in classical MEG studies. Finally, source reconstruction localized the main sources of CTS to bilateral auditory cortices. In conclusion, this study demonstrates that OPMs can be used for the purpose of CTS assessment.
This finding opens new research avenues to unravel the neural network involved in CTS across the lifespan and potential alterations in, e.g., language developmental disorders. Data also suggest that OPMs are generally suitable for recording neural activity at frequencies below 4 Hz provided PCA is used as a preprocessing step; 0.2-1.5-Hz being the lowest frequency range successfully investigated here.
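The PCA-based preprocessing step described here — projecting out the few largest-variance components, which are dominated by interference, before estimating CTS — can be sketched via an SVD. Shapes and the simulated artifact below are illustrative; real pipelines operate on filtered, epoched OPM recordings:

```python
import numpy as np

def remove_top_components(data, k):
    """Zero out the k largest principal components of a
    (channels x samples) recording and reconstruct the remainder."""
    centered = data - data.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    s[:k] = 0.0                    # discard the k dominant components
    return u @ np.diag(s) @ vt

rng = np.random.default_rng(2)
n_chan, n_samp = 45, 5000
t = np.arange(n_samp) / 1000.0
# large low-frequency interference shared across all channels, plus sensor noise
interference = 50.0 * np.sin(2 * np.pi * 1.0 * t)
data = np.outer(np.ones(n_chan), interference) + rng.standard_normal((n_chan, n_samp))
cleaned = remove_top_components(data, k=1)   # shared artifact captured by PC 1
```

Because the interference is shared across sensors, it concentrates in the leading component, so removing a handful of PCs (3 or 10 in the study, depending on the metric) suppresses it while leaving most neural variance intact.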
Affiliation(s)
- Paul de Lange
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), 808 Lennik Street, Brussels 1070, Belgium
- Elena Boto
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
- Niall Holmes
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
- Ryan M Hill
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
- Richard Bowtell
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
- Vincent Wens
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), 808 Lennik Street, Brussels 1070, Belgium; Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Xavier De Tiège
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), 808 Lennik Street, Brussels 1070, Belgium; Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Matthew J Brookes
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom
- Mathieu Bourguignon
- Laboratoire de Cartographie fonctionnelle du Cerveau, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), 808 Lennik Street, Brussels 1070, Belgium; Laboratory of neurophysiology and movement biomechanics, UNI - ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium; BCBL, Basque Center on Cognition, Brain and Language, San Sebastian 20009, Spain.
| |
Collapse
|
136
|
Erkens J, Schulte M, Vormann M, Wilsch A, Herrmann CS. Hearing Impaired Participants Improve More Under Envelope-Transcranial Alternating Current Stimulation When Signal to Noise Ratio Is High. Neurosci Insights 2021; 16:2633105520988854. [PMID: 33709079 PMCID: PMC7907945 DOI: 10.1177/2633105520988854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Accepted: 12/31/2020] [Indexed: 11/16/2022] Open
Abstract
An issue commonly expressed by hearing aid users is difficulty understanding speech in complex hearing scenarios, that is, when speech is presented together with background noise or in situations with multiple speakers. Conventional hearing aids are already designed with these issues in mind, using beamforming to only enhance sound from a specific direction, but they are limited in solving these issues as they can only modulate incoming sound at the cochlear level. However, evidence exists that age-related hearing loss might partially be caused later in the hearing process by brain processes slowing down and becoming less efficient. In this study, we tested whether it would be possible to improve the hearing process at the cortical level by improving neural tracking of speech. The speech envelopes of target sentences were transformed into an electrical signal and applied to elderly participants' cortices using transcranial alternating current stimulation (tACS). We compared 2 different signal to noise ratios (SNRs) with 5 different delays between sound presentation and stimulation ranging from 50 ms to 150 ms, and the differences in effects between elderly normal hearing and elderly hearing impaired participants. When the task was performed at a high SNR, hearing impaired participants appeared to gain more from envelope-tACS than when the task was performed at a lower SNR. This was not the case for normal hearing participants. Furthermore, a post-hoc analysis of the different time-lags suggests that elderly participants were significantly better at a stimulation time-lag of 150 ms when the task was presented at a high SNR. In this paper, we outline why these effects are worth exploring further, and what they tell us about the optimal tACS time-lag.
Collapse
Affiliation(s)
- Jules Erkens
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
| | | | | | - Anna Wilsch
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
| | - Christoph S Herrmann
- Department of Psychology, Cluster of Excellence “Hearing4All,” European Medical School, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
| |
Collapse
|
137
|
Meyer L, Lakatos P, He Y. Language Dysfunction in Schizophrenia: Assessing Neural Tracking to Characterize the Underlying Disorder(s)? Front Neurosci 2021; 15:640502. [PMID: 33692672 PMCID: PMC7937925 DOI: 10.3389/fnins.2021.640502] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2020] [Accepted: 02/03/2021] [Indexed: 12/19/2022] Open
Abstract
Deficits in language production and comprehension are characteristic of schizophrenia. To date, it remains unclear whether these deficits arise from dysfunctional linguistic knowledge, or dysfunctional predictions derived from the linguistic context. Alternatively, the deficits could be a result of dysfunctional neural tracking of auditory information, resulting in decreased auditory information fidelity and even distorted information. Here, we discuss possible ways for clinical neuroscientists to employ neural tracking methodology to independently characterize deficiencies at the auditory-sensory and abstract linguistic levels. This might lead to a mechanistic understanding of the deficits underlying language-related disorder(s) in schizophrenia. We propose to combine naturalistic stimulation, measures of speech-brain synchronization, and computational modeling of abstract linguistic knowledge and predictions. These independent but likely interacting assessments may be exploited for an objective and differential diagnosis of schizophrenia, as well as a better understanding of the disorder on the functional level, illustrating the potential of neural tracking methodology as a translational tool in a range of psychotic populations.
Collapse
Affiliation(s)
- Lars Meyer
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Clinic for Phoniatrics and Pedaudiology, University Hospital Münster, Münster, Germany
| | - Peter Lakatos
- Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, Orangeburg, NY, United States
| | - Yifei He
- Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany
| |
Collapse
|
138
|
van Bree S, Sohoglu E, Davis MH, Zoefel B. Sustained neural rhythms reveal endogenous oscillations supporting speech perception. PLoS Biol 2021; 19:e3001142. [PMID: 33635855 PMCID: PMC7946281 DOI: 10.1371/journal.pbio.3001142] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2020] [Revised: 03/10/2021] [Accepted: 02/08/2021] [Indexed: 12/23/2022] Open
Abstract
Rhythmic sensory or electrical stimulation will produce rhythmic brain responses. These rhythmic responses are often interpreted as endogenous neural oscillations aligned (or "entrained") to the stimulus rhythm. However, stimulus-aligned brain responses can also be explained as a sequence of evoked responses, which only appear regular due to the rhythmicity of the stimulus, without necessarily involving underlying neural oscillations. To distinguish evoked responses from true oscillatory activity, we tested whether rhythmic stimulation produces oscillatory responses which continue after the end of the stimulus. Such sustained effects provide evidence for true involvement of neural oscillations. In Experiment 1, we found that rhythmic intelligible, but not unintelligible speech produces oscillatory responses in magnetoencephalography (MEG) which outlast the stimulus at parietal sensors. In Experiment 2, we found that transcranial alternating current stimulation (tACS) leads to rhythmic fluctuations in speech perception outcomes after the end of electrical stimulation. We further report that the phase relation between electroencephalography (EEG) responses and rhythmic intelligible speech can predict the tACS phase that leads to most accurate speech perception. Together, we provide fundamental results for several lines of research-including neural entrainment and tACS-and reveal endogenous neural oscillations as a key underlying principle for speech perception.
Collapse
Affiliation(s)
- Sander van Bree
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Centre for Cognitive Neuroimaging, University of Glasgow, Glasgow, United Kingdom
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
| | - Ediz Sohoglu
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- School of Psychology, University of Sussex, Brighton, United Kingdom
| | - Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
| | - Benedikt Zoefel
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Centre de Recherche Cerveau et Cognition, CNRS, Toulouse, France
- Université Toulouse III Paul Sabatier, Toulouse, France
| |
Collapse
|
139
|
Beier EJ, Chantavarin S, Rehrig G, Ferreira F, Miller LM. Cortical Tracking of Speech: Toward Collaboration between the Fields of Signal and Sentence Processing. J Cogn Neurosci 2021; 33:574-593. [PMID: 33475452 DOI: 10.1162/jocn_a_01676] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
In recent years, a growing number of studies have used cortical tracking methods to investigate auditory language processing. Although most studies that employ cortical tracking stem from the field of auditory signal processing, this approach should also be of interest to psycholinguistics-particularly the subfield of sentence processing-given its potential to provide insight into dynamic language comprehension processes. However, there has been limited collaboration between these fields, which we suggest is partly because of differences in theoretical background and methodological constraints, some mutually exclusive. In this paper, we first review the theories and methodological constraints that have historically been prioritized in each field and provide concrete examples of how some of these constraints may be reconciled. We then elaborate on how further collaboration between the two fields could be mutually beneficial. Specifically, we argue that the use of cortical tracking methods may help resolve long-standing debates in the field of sentence processing that commonly used behavioral and neural measures (e.g., ERPs) have failed to adjudicate. Similarly, signal processing researchers who use cortical tracking may be able to reduce noise in the neural data and broaden the impact of their results by controlling for linguistic features of their stimuli and by using simple comprehension tasks. Overall, we argue that a balance between the methodological constraints of the two fields will lead to an overall improved understanding of language processing as well as greater clarity on what mechanisms cortical tracking of speech reflects. Increased collaboration will help resolve debates in both fields and will lead to new and exciting avenues for research.
Collapse
|
140
|
Ortiz Barajas MC, Guevara R, Gervain J. The origins and development of speech envelope tracking during the first months of life. Dev Cogn Neurosci 2021; 48:100915. [PMID: 33515956 PMCID: PMC7847966 DOI: 10.1016/j.dcn.2021.100915] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 11/25/2020] [Accepted: 01/08/2021] [Indexed: 11/30/2022] Open
Abstract
- The adult brain tracks the modulation of the amplitude of speech, i.e. its envelope.
- We tested whether preverbal infants, i.e. newborns and 6-month-olds, track the speech envelope.
- Infants track the envelope phase at both ages, in the native language and in unfamiliar languages.
- Infants track the envelope amplitude in the native language at birth but not at 6 months.
- This suggests that phase tracking is unrelated to language experience, whereas amplitude tracking is shaped by experience.
When humans listen to speech, their neural activity tracks the slow amplitude fluctuations of the speech signal over time, known as the speech envelope. Studies suggest that the quality of this tracking is related to the quality of speech comprehension. However, a critical unanswered question is how envelope tracking arises and what role it plays in language development. Relatedly, its causal role in comprehension remains unclear, as some studies have found it to be present even for unintelligible speech. Using electroencephalography, we investigated whether the neural activity of newborns and 6-month-olds is able to track the speech envelope of familiar and unfamiliar languages in order to explore the developmental origins and functional role of envelope tracking. Our results show that amplitude and phase tracking take place at birth for familiar and unfamiliar languages alike, i.e. independently of prenatal experience. However, by 6 months language familiarity modulates the ability to track the amplitude of the speech envelope, while phase tracking continues to be universal. Our findings support the hypothesis that amplitude and phase tracking could represent two different neural mechanisms of oscillatory synchronisation and may thus play different roles in speech perception.
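The envelope-tracking measures discussed above can be illustrated with a minimal sketch: extract the amplitude envelope via the analytic signal, then quantify phase tracking with a phase-locking value. This is a generic stand-in, not the study's actual pipeline; the filter orders, frequency bands, and function names are assumptions.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def speech_envelope(audio, fs, cutoff=10.0):
    """Broadband amplitude envelope: magnitude of the analytic signal,
    low-pass filtered to keep only slow fluctuations."""
    env = np.abs(hilbert(audio))
    sos = butter(4, cutoff, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, env)

def phase_locking(env, eeg, fs, band=(1.0, 8.0)):
    """Phase-locking value between envelope and EEG in a narrow band
    (0 = no consistent phase relation, 1 = perfect locking)."""
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    ph_env = np.angle(hilbert(sosfiltfilt(sos, env)))
    ph_eeg = np.angle(hilbert(sosfiltfilt(sos, eeg)))
    return np.abs(np.mean(np.exp(1j * (ph_env - ph_eeg))))
```

A phase-based measure like this is insensitive to overall amplitude, which is one way phase tracking and amplitude tracking can dissociate, as the abstract reports.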
Collapse
Affiliation(s)
| | - Ramón Guevara
- Department of Physics and Astronomy, University of Padua, Padua, Italy
| | - Judit Gervain
- Integrative Neuroscience and Cognition Center, CNRS & Université de Paris, Paris, France; Department of Developmental Psychology and Socialization, University of Padua, Padua, Italy
| |
Collapse
|
141
|
Modulation Spectra Capture EEG Responses to Speech Signals and Drive Distinct Temporal Response Functions. eNeuro 2021; 8:ENEURO.0399-20.2020. [PMID: 33272971 PMCID: PMC7810259 DOI: 10.1523/eneuro.0399-20.2020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Revised: 11/08/2020] [Accepted: 11/14/2020] [Indexed: 11/26/2022] Open
Abstract
Speech signals have a unique shape of long-term modulation spectrum that is distinct from environmental noise, music, and non-speech vocalizations. Does the human auditory system adapt to the speech long-term modulation spectrum and efficiently extract critical information from speech signals? To answer this question, we tested whether neural responses to speech signals can be captured by specific modulation spectra of non-speech acoustic stimuli. We generated amplitude-modulated (AM) noise with the speech modulation spectrum and 1/f modulation spectra of different exponents to imitate the temporal dynamics of different natural sounds. We presented these AM stimuli and a 10-min piece of natural speech to 19 human participants undergoing electroencephalography (EEG) recording. We derived temporal response functions (TRFs) to the AM stimuli of different spectrum shapes and found distinct neural dynamics for each type of TRF. We then used the TRFs of AM stimuli to predict neural responses to the speech signals and found that (1) the TRFs of AM stimuli with modulation-spectrum exponents of 1, 1.5, and 2 preferentially captured EEG responses to speech signals in the δ band and (2) EEG responses to speech in the θ band were best captured by the AM stimuli with an exponent of 0.75. Our results suggest that the human auditory system shows specificity to the long-term modulation spectrum and is equipped with characteristic neural algorithms tailored to extract critical acoustic information from speech signals.
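A temporal response function of the kind derived here is a linear kernel mapping stimulus to EEG, typically estimated by ridge regression over time-lagged copies of the stimulus. The following is a generic sketch, not the study's code; the lag range, regularisation strength, and simulated kernel are arbitrary illustrative choices.

```python
import numpy as np

def lagged_design(stim, lags):
    """Design matrix whose column k holds the stimulus delayed by lags[k]
    samples (non-negative lags only, for brevity)."""
    X = np.zeros((len(stim), len(lags)))
    for k, lag in enumerate(lags):
        X[lag:, k] = stim[:len(stim) - lag]
    return X

def fit_trf(stim, resp, lags, lam=1.0):
    """Ridge-regularised temporal response function: solves
    (X'X + lam*I) w = X'resp for the kernel w over the given lags."""
    X = lagged_design(stim, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ resp)
```

With a white-noise stimulus and a simulated response (stimulus convolved with a known kernel plus noise), the estimated TRF recovers the kernel's shape; predicting held-out EEG from such TRFs is the cross-stimulus step the abstract describes.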
Collapse
|
142
|
Ford LKW, Borneman J, Krebs J, Malaia E, Ames B. Classification of visual comprehension based on EEG data using sparse optimal scoring. J Neural Eng 2021; 18. [PMID: 33440368 DOI: 10.1088/1741-2552/abdb3b] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2020] [Accepted: 01/13/2021] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Understanding and differentiating brain states is an important task in the field of cognitive neuroscience with applications in health diagnostics (such as detecting neurotypical development vs. Autism Spectrum or coma/vegetative state vs. locked-in state). Electroencephalography (EEG) analysis is a particularly useful tool for this task as EEG data can detect millisecond-level changes in brain activity across a range of frequencies in a non-invasive and relatively inexpensive fashion. The goal of this study is to apply machine learning methods to EEG data in order to classify visual language comprehension across multiple participants. APPROACH 26-channel EEG was recorded for 24 Deaf participants while they watched videos of sign language sentences played in time-direct and time-reverse formats to simulate interpretable vs. uninterpretable sign language, respectively. Sparse Optimal Scoring (SOS) was applied to EEG data in order to classify which type of video a participant was watching, time-direct or time-reversed. The use of SOS also served to reduce the dimensionality of the features to improve model interpretability. MAIN RESULTS The analysis of frequency-domain EEG data resulted in an average out-of-sample classification accuracy of 98.89%, which was far superior to the time-domain analysis. This high classification accuracy suggests this model can accurately identify common neural responses to visual linguistic stimuli. SIGNIFICANCE The significance of this work is in determining necessary and sufficient neural features for classifying the high-level neural process of visual language comprehension across multiple participants.
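Sparse optimal scoring recasts discriminant analysis as a regression problem with an l1 penalty, so that only a few EEG features carry weight and the model stays interpretable. As a rough stand-in (not the authors' SOS implementation), an l1-penalised logistic regression fitted by proximal gradient descent shows the same feature-selecting behaviour; all parameter values below are illustrative assumptions.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the l1 norm: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def sparse_logistic(X, y, lam=0.05, lr=0.5, n_iter=1000):
    """L1-penalised logistic regression (labels y in {0, 1}) via proximal
    gradient descent; returns a sparse weight vector (bias omitted for brevity)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted class probabilities
        grad = X.T @ (p - y) / len(y)      # gradient of the logistic loss
        w = soft_threshold(w - lr * grad, lr * lam)
    return w
```

On data where only two of twenty features are informative, the penalty drives the uninformative weights to (near) zero, which is the dimensionality-reduction property the abstract highlights.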
Collapse
Affiliation(s)
| | - Joshua Borneman
- Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, Indiana, 47907-2122, UNITED STATES
| | - Julia Krebs
- Center for Cognitive Neuroscience, University of Salzburg, Hellbrunnerstraße 34, Salzburg, 5020, AUSTRIA
| | - Evguenia Malaia
- Communicative Disorders, The University of Alabama, Box 870242, Tuscaloosa, Alabama, 35487, UNITED STATES
| | - Brendan Ames
- Mathematics, The University of Alabama, Box 870350, Tuscaloosa, Alabama, 35487-0350, UNITED STATES
| |
Collapse
|
143
|
Meng Q, Hegner YL, Giblin I, McMahon C, Johnson BW. Lateralized Cerebral Processing of Abstract Linguistic Structure in Clear and Degraded Speech. Cereb Cortex 2021; 31:591-602. [PMID: 32901245 DOI: 10.1093/cercor/bhaa245] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 11/12/2022] Open
Abstract
Human cortical activity measured with magnetoencephalography (MEG) has been shown to track the temporal regularity of linguistic information in connected speech. In the current study, we investigate the underlying neural sources of these responses and test the hypothesis that they can be directly modulated by changes in speech intelligibility. MEG responses were measured to natural and spectrally degraded (noise-vocoded) speech in 19 normal hearing participants. Results showed that cortical coherence to "abstract" linguistic units with no accompanying acoustic cues (phrases and sentences) was lateralized to the left hemisphere and changed parametrically with the intelligibility of speech. In contrast, responses coherent to words/syllables accompanied by acoustic onsets were bilateral and insensitive to intelligibility changes. This dissociation suggests that cerebral responses to linguistic information are directly affected by intelligibility but also powerfully shaped by physical cues in speech. This explains why previous studies have reported widely inconsistent effects of speech intelligibility on cortical entrainment and, within a single experiment, provides clear support for conclusions about language lateralization derived from a large number of separately conducted neuroimaging studies. Since noise-vocoded speech resembles the signals provided by a cochlear implant device, the current methodology has potential clinical utility for assessment of cochlear implant performance.
Collapse
Affiliation(s)
- Qingqing Meng
- The HEARing CRC, Audiology, Hearing and Speech Sciences, University of Melbourne, Melbourne, Victoria 3053, Australia; Department of Cognitive Science, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Yiwen Li Hegner
- The HEARing CRC, Audiology, Hearing and Speech Sciences, University of Melbourne, Melbourne, Victoria 3053, Australia; Department of Linguistics, Macquarie University, Sydney, New South Wales 2109, Australia; MEG-Center, University of Tübingen, Tübingen 72074, Germany
| | - Iain Giblin
- Department of Linguistics, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Catherine McMahon
- The HEARing CRC, Audiology, Hearing and Speech Sciences, University of Melbourne, Melbourne, Victoria 3053, Australia; Department of Linguistics, Macquarie University, Sydney, New South Wales 2109, Australia; H:EAR Centre, Macquarie University, New South Wales 2109, Australia
| | - Blake W Johnson
- The HEARing CRC, Audiology, Hearing and Speech Sciences, University of Melbourne, Melbourne, Victoria 3053, Australia; Department of Cognitive Science, Macquarie University, Sydney, New South Wales 2109, Australia
| |
Collapse
|
144
|
Undurraga JA, Van Yper L, Bance M, McAlpine D, Vickers D. Neural encoding of spectro-temporal cues at slow and near speech-rate in cochlear implant users. Hear Res 2020; 403:108160. [PMID: 33461048 DOI: 10.1016/j.heares.2020.108160] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Revised: 12/17/2020] [Accepted: 12/21/2020] [Indexed: 10/22/2022]
Abstract
The ability to process rapid modulations in the spectro-temporal structure of sounds is critical for speech comprehension. For users of cochlear implants (CIs), spectral cues in speech are conveyed by differential stimulation of electrode contacts along the cochlea, and temporal cues in terms of the amplitude of stimulating electrical pulses, which track the amplitude-modulated (AM) envelope of speech sounds. Whilst survival of inner-ear neurons and spread of electrical current are known factors that limit the representation of speech information in CI listeners, limitations in the neural representation of dynamic spectro-temporal cues common to speech are also likely to play a role. We assessed the ability of CI listeners to process spectro-temporal cues varying at rates typically present in human speech. Employing an auditory change complex (ACC) paradigm, and a slow (0.5 Hz) alternating rate between stimulating electrodes, or different AM frequencies, to evoke a transient cortical ACC, we demonstrate that CI listeners, like normal-hearing listeners, are sensitive to transitions in the spectral and temporal domains. However, CI listeners showed impaired cortical responses when either spectral or temporal cues were alternated at faster, speech-like (6-7 Hz) rates. Specifically, auditory change following responses, reliably obtained in normal-hearing listeners, were small or absent in CI users, indicating that cortical adaptation to alternating cues at speech-like rates is stronger under electrical stimulation. In CI listeners, temporal processing was also influenced by the polarity of electrical pulses (behaviourally) and by their rate of presentation (both neurally and behaviourally). Limitations in the ability to process dynamic spectro-temporal cues will likely impact speech comprehension in CI users.
Collapse
Affiliation(s)
- Jaime A Undurraga
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia.
| | - Lindsey Van Yper
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
| | - Manohar Bance
- Cambridge Hearing Group, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, CB2 0QQ, UK
| | - David McAlpine
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
| | - Deborah Vickers
- Cambridge Hearing Group, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, CB2 0QQ, UK
| |
Collapse
|
145
|
Luo C, Ding N. Cortical encoding of acoustic and linguistic rhythms in spoken narratives. eLife 2020; 9:60433. [PMID: 33345775 PMCID: PMC7775109 DOI: 10.7554/elife.60433] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2020] [Accepted: 12/20/2020] [Indexed: 11/13/2022] Open
Abstract
Speech contains rich acoustic and linguistic information. Using highly controlled speech materials, previous studies have demonstrated that cortical activity is synchronous to the rhythms of perceived linguistic units, for example, words and phrases, on top of basic acoustic features, for example, the speech envelope. When listening to natural speech, it remains unclear, however, how cortical activity jointly encodes acoustic and linguistic information. Here we investigate the neural encoding of words using electroencephalography and observe neural activity synchronous to multi-syllabic words when participants naturally listen to narratives. An amplitude modulation (AM) cue for word rhythm enhances the word-level response, but the effect is only observed during passive listening. Furthermore, words and the AM cue are encoded by spatially separable neural responses that are differentially modulated by attention. These results suggest that bottom-up acoustic cues and top-down linguistic knowledge separately contribute to cortical encoding of linguistic units in spoken narratives.
Collapse
Affiliation(s)
- Cheng Luo
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
| | - Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China; Research Center for Advanced Artificial Intelligence Theory, Zhejiang Lab, Hangzhou, China
| |
Collapse
|
146
|
Kulasingham JP, Brodbeck C, Presacco A, Kuchinsky SE, Anderson S, Simon JZ. High gamma cortical processing of continuous speech in younger and older listeners. Neuroimage 2020; 222:117291. [PMID: 32835821 PMCID: PMC7736126 DOI: 10.1016/j.neuroimage.2020.117291] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Revised: 08/12/2020] [Accepted: 08/16/2020] [Indexed: 12/11/2022] Open
Abstract
Neural processing along the ascending auditory pathway is often associated with a progressive reduction in characteristic processing rates. For instance, the well-known frequency-following response (FFR) of the auditory midbrain, as measured with electroencephalography (EEG), is dominated by frequencies from ∼100 Hz to several hundred Hz, phase-locking to the acoustic stimulus at those frequencies. In contrast, cortical responses, whether measured by EEG or magnetoencephalography (MEG), are typically characterized by frequencies of a few Hz to a few tens of Hz, time-locking to acoustic envelope features. In this study we investigated a crossover case: cortically generated responses time-locked to continuous speech features at FFR-like rates. Using MEG, we analyzed responses in the high gamma range of 70-200 Hz to continuous speech using neural source-localized reverse correlation and the corresponding temporal response functions (TRFs). Continuous speech stimuli were presented to 40 subjects (17 younger, 23 older adults) with clinically normal hearing and their MEG responses were analyzed in the 70-200 Hz band. Consistent with the relative insensitivity of MEG to many subcortical structures, the spatiotemporal profile of these response components indicated a cortical origin with ∼40 ms peak latency and a right hemisphere bias. TRF analysis was performed using two separate aspects of the speech stimuli: a) the 70-200 Hz carrier of the speech, and b) the 70-200 Hz temporal modulations in the spectral envelope of the speech stimulus. The response was dominantly driven by the envelope modulation, with a much weaker contribution from the carrier. Age-related differences were also analyzed to investigate a reversal previously seen along the ascending auditory pathway, whereby older listeners show weaker midbrain FFR responses than younger listeners, but, paradoxically, have stronger cortical low frequency responses.
In contrast to both these earlier results, this study did not find clear age-related differences in high gamma cortical responses to continuous speech. Cortical responses at FFR-like frequencies shared some properties with midbrain responses at the same frequencies and with cortical responses at much lower frequencies.
Collapse
Affiliation(s)
- Joshua P Kulasingham
- (a) Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States.
| | - Christian Brodbeck
- (b) Institute for Systems Research, University of Maryland, College Park, Maryland, United States.
| | - Alessandro Presacco
- (b) Institute for Systems Research, University of Maryland, College Park, Maryland, United States.
| | - Stefanie E Kuchinsky
- (c) Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland, United States.
| | - Samira Anderson
- (d) Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland, United States.
| | - Jonathan Z Simon
- (a) Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; (b) Institute for Systems Research, University of Maryland, College Park, Maryland, United States; (e) Department of Biology, University of Maryland, College Park, Maryland, United States.
| |
Collapse
|
147
|
Abstract
The role of isochrony in speech—the hypothetical division of speech units into equal duration intervals—has been the subject of a long-standing debate. Current approaches in neuroscience have brought new perspectives in that debate through the theoretical framework of predictive coding and cortical oscillations. Here we assess the comparative roles of naturalness and isochrony in the intelligibility of speech in noise for French and English, two languages representative of two well-established contrastive rhythm classes. We show that both top-down predictions associated with the natural timing of speech and, to a lesser extent, bottom-up predictions associated with isochrony at a syllabic timescale improve intelligibility. We found a similar pattern of results for both languages, suggesting that temporal characterisation of speech from different rhythm classes could be unified around a single core speech unit, with neurophysiologically defined duration and linguistically anchored temporal location. Taken together, our results suggest that isochrony does not seem to be a main dimension of speech processing, but may be a consequence of neurobiological processing constraints, manifesting in behavioural performance and ultimately explaining why isochronous stimuli occupy a particular status in speech and human perception in general.
148
Sohoglu E, Davis MH. Rapid computations of spectrotemporal prediction error support perception of degraded speech. eLife 2020; 9:e58077. [PMID: 33147138 PMCID: PMC7641582 DOI: 10.7554/elife.58077] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2020] [Accepted: 10/19/2020] [Indexed: 12/15/2022] Open
Abstract
Human speech perception can be described as Bayesian perceptual inference, but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g. spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations derived from prior written text on the quality of neural representations: increased signal quality enhanced neural representations of speech that mismatched prior expectations, but led to greater suppression of speech that matched them. This interaction is a unique neural signature of prediction-error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive-coding frameworks.
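The prediction-error scheme tested in this abstract can be caricatured in a few lines: the neural response is modelled as the residual between the heard spectrotemporal features and a top-down prediction, so a matching prior "explains away" the input while a mismatching prior leaves a large residual to be encoded. This is a toy illustration only, not the authors' actual model; the feature matrices and the simple absolute-difference error are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy spectrotemporal feature matrix for a heard word (freq bins x time bins)
heard = rng.random((8, 10))

# Prior expectation from preceding written text: matching vs mismatching
matching_prior = heard.copy()
mismatching_prior = rng.random((8, 10))

def prediction_error(signal: np.ndarray, prior: np.ndarray) -> float:
    """Mean residual left after subtracting the top-down prediction."""
    return float(np.abs(signal - prior).mean())

# A matching prior explains away the input (little error left to represent);
# a mismatching prior leaves a large residual that must be encoded.
print(prediction_error(heard, matching_prior))
print(prediction_error(heard, mismatching_prior))
```

In this scheme, improving signal quality sharpens whichever residual survives, which is one way to read the reported interaction between signal quality and prior expectations.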
Affiliation(s)
- Ediz Sohoglu
- School of Psychology, University of Sussex, Brighton, United Kingdom
- Matthew H Davis
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
149
Ríos‐López P, Molinaro N, Bourguignon M, Lallier M. Development of neural oscillatory activity in response to speech in children from 4 to 6 years old. Dev Sci 2020; 23:e12947. [PMID: 32043677 PMCID: PMC7685108 DOI: 10.1111/desc.12947] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2018] [Revised: 11/18/2019] [Accepted: 02/05/2020] [Indexed: 11/30/2022]
Abstract
Recent neurophysiological theories propose that the cerebral hemispheres collaborate to resolve the complex temporal nature of speech, such that left-hemisphere (or bilateral) gamma-band oscillatory activity would specialize in coding information at fast rates (phonemic information), whereas right-hemisphere delta- and theta-band activity would code for speech's slow temporal components (syllabic and prosodic information). Despite the relevance that neural entrainment to speech might have for reading acquisition and for core speech-perception operations such as the perception of intelligible speech, no study had yet explored its development in young children. In the current study, speech-brain entrainment was recorded via EEG in a cohort of children at three time points between the ages of 4-5 and 6-7 years. Our results showed that speech-brain entrainment occurred only at delta frequencies (0.5 Hz) at all testing times. From the longitudinal perspective, the fact that coherence increased in bilateral temporal electrodes suggests that, contrary to previous hypotheses positing an innate right-hemispheric bias for processing prosodic information, at 7 years of age the low-frequency components of speech are processed bilaterally. Lastly, delta speech-brain entrainment in the right hemisphere was related to an indirect measure of intelligibility, providing preliminary evidence that the entrainment phenomenon might support core linguistic operations from early childhood.
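Speech-brain entrainment of the kind measured in this study is commonly quantified as magnitude-squared coherence between the speech amplitude envelope and the EEG signal at each frequency. A minimal sketch with simulated signals (the sampling rate, recording length, and strength of the 0.5 Hz delta-rate component are all assumed for illustration):

```python
import numpy as np
from scipy.signal import coherence

fs = 250                       # assumed EEG sampling rate (Hz)
t = np.arange(0, 120, 1 / fs)  # two minutes of simulated recording
rng = np.random.default_rng(0)

# Simulated speech envelope dominated by a 0.5 Hz (delta-rate) modulation
envelope = 1 + 0.5 * np.sin(2 * np.pi * 0.5 * t)

# Simulated EEG channel that weakly tracks the envelope, buried in noise
eeg = 0.3 * envelope + rng.normal(scale=1.0, size=t.size)

# Magnitude-squared coherence in 8-s segments (0.125 Hz resolution)
f, Cxy = coherence(envelope, eeg, fs=fs, nperseg=fs * 8)

delta_idx = np.argmin(np.abs(f - 0.5))  # frequency bin nearest 0.5 Hz
print(f"coherence at {f[delta_idx]:.3f} Hz: {Cxy[delta_idx]:.2f}")
```

High coherence confined to the delta bin would indicate frequency-specific entrainment; in real data, significance is assessed against surrogate pairings (e.g. time-shuffled envelope-EEG combinations) rather than read off a single estimate.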
Affiliation(s)
- Paula Ríos-López
- BCBL - Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Nicola Molinaro
- BCBL - Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Mathieu Bourguignon
- BCBL - Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
- Laboratoire de Cartographie fonctionnelle du Cerveau, Université libre de Bruxelles, Brussels, Belgium
- Marie Lallier
- BCBL - Basque Center on Cognition, Brain and Language, Donostia/San Sebastian, Spain
150
Thézé R, Giraud AL, Mégevand P. The phase of cortical oscillations determines the perceptual fate of visual cues in naturalistic audiovisual speech. Sci Adv 2020; 6(45):eabc6348. [PMID: 33148648 PMCID: PMC7673697 DOI: 10.1126/sciadv.abc6348] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/05/2020] [Accepted: 09/17/2020] [Indexed: 06/11/2023]
Abstract
When we see our interlocutor, our brain seamlessly extracts visual cues from their face and processes them along with the sound of their voice, making speech an intrinsically multimodal signal. Visual cues are especially important in noisy environments, when the auditory signal is less reliable. Neuronal oscillations might be involved in the cortical processing of audiovisual speech by selecting which sensory channel contributes more to perception. To test this, we designed computer-generated naturalistic audiovisual speech stimuli in which one mismatched phoneme-viseme pair in a key word of each sentence created bistable perception. Neurophysiological recordings (high-density scalp and intracranial electroencephalography) revealed that the precise phase angle of theta-band oscillations in the posterior temporal and occipital cortex of the right hemisphere determined whether the auditory or the visual speech cue drove perception. We demonstrate that the phase of cortical oscillations acts as an instrument for sensory selection in audiovisual speech processing.
Affiliation(s)
- Raphaël Thézé
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland
- Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland
- Pierre Mégevand
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1202 Geneva, Switzerland
- Division of Neurology, Department of Clinical Neurosciences, Geneva University Hospitals, 1205 Geneva, Switzerland