1
Desai M, Field AM, Hamilton LS. A comparison of EEG encoding models using audiovisual stimuli and their unimodal counterparts. PLoS Comput Biol 2024; 20:e1012433. PMID: 39250485; PMCID: PMC11412666; DOI: 10.1371/journal.pcbi.1012433.
Abstract
Communication in the real world is inherently multimodal. When having a conversation, typically sighted and hearing people use both auditory and visual cues to understand one another. For example, objects may make sounds as they move in space, or we may use the movement of a person's mouth to better understand what they are saying in a noisy environment. Still, many neuroscience experiments rely on unimodal stimuli to understand encoding of sensory features in the brain. The extent to which visual information may influence encoding of auditory information and vice versa in natural environments is thus unclear. Here, we addressed this question by recording scalp electroencephalography (EEG) in 11 subjects as they listened to and watched movie trailers in audiovisual (AV), visual (V) only, and audio (A) only conditions. We then fit linear encoding models that described the relationship between the brain responses and the acoustic, phonetic, and visual information in the stimuli. We also compared whether auditory and visual feature tuning was the same when stimuli were presented in the original AV format versus when visual or auditory information was removed. In these stimuli, visual and auditory information was relatively uncorrelated, and included spoken narration over a scene as well as animated or live-action characters talking with and without their face visible. For this stimulus, we found that auditory feature tuning was similar in the AV and A-only conditions, and similarly, tuning for visual information was similar when stimuli were presented with the audio present (AV) and when the audio was removed (V only). In a cross prediction analysis, we investigated whether models trained on AV data predicted responses to A or V only test data similarly to models trained on unimodal data. Overall, prediction performance using AV training and V test sets was similar to using V training and V test sets, suggesting that the auditory information has a relatively smaller effect on EEG. In contrast, prediction performance using AV training and A only test set was slightly worse than using matching A only training and A only test sets. This suggests the visual information has a stronger influence on EEG, though this makes no qualitative difference in the derived feature tuning. In effect, our results show that researchers may benefit from the richness of multimodal datasets, which can then be used to answer more than one research question.
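As a concrete illustration of the linear encoding models described above, the sketch below fits a time-lagged ridge regression from stimulus features to EEG and scores it by cross-validated prediction correlation. This is a generic sketch rather than the authors' pipeline; the lag window, regularization strength, feature layout, and sampling rate are assumptions.

```python
# Minimal sketch of a time-lagged linear (ridge) encoding model for EEG.
# Assumed inputs: `stim` is (n_samples, n_features) acoustic/phonetic/visual
# features and `eeg` is (n_samples, n_channels), both sampled at `fs` Hz.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def lag_features(stim, fs, t_min=-0.1, t_max=0.4):
    """Stack time-lagged copies of the stimulus features (a simple TRF design matrix)."""
    lags = np.arange(int(t_min * fs), int(t_max * fs) + 1)
    n, f = stim.shape
    X = np.zeros((n, f * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(stim, lag, axis=0)
        # zero out samples that wrapped around the edges
        if lag > 0:
            shifted[:lag] = 0
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * f:(i + 1) * f] = shifted
    return X

def encoding_model_scores(stim, eeg, fs, alpha=1e3, n_splits=5):
    """Cross-validated prediction correlation (r) for each EEG channel."""
    X = lag_features(stim, fs)
    scores = []
    for train, test in KFold(n_splits=n_splits).split(X):
        model = Ridge(alpha=alpha).fit(X[train], eeg[train])
        pred = model.predict(X[test])
        r = [np.corrcoef(pred[:, c], eeg[test, c])[0, 1] for c in range(eeg.shape[1])]
        scores.append(r)
    return np.asarray(scores).mean(axis=0)  # mean prediction r per channel
```

Comparing such scores, and the fitted weights, across AV, A-only, and V-only training and test sets is one way to operationalize the cross-prediction analysis the abstract describes.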
Affiliation(s)
- Maansi Desai
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Alyssa M Field
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Liberty S Hamilton
- Department of Speech, Language, and Hearing Sciences, Moody College of Communication, The University of Texas at Austin, Austin, Texas, United States of America
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, Texas, United States of America
2
Senkowski D, Engel AK. Multi-timescale neural dynamics for multisensory integration. Nat Rev Neurosci 2024; 25:625-642. PMID: 39090214; DOI: 10.1038/s41583-024-00845-7.
Abstract
Carrying out any everyday task, be it driving in traffic, conversing with friends or playing basketball, requires rapid selection, integration and segregation of stimuli from different sensory modalities. At present, even the most advanced artificial intelligence-based systems are unable to replicate the multisensory processes that the human brain routinely performs, but how neural circuits in the brain carry out these processes is still not well understood. In this Perspective, we discuss recent findings that shed fresh light on the oscillatory neural mechanisms that mediate multisensory integration (MI), including power modulations, phase resetting, phase-amplitude coupling and dynamic functional connectivity. We then consider studies that also suggest multi-timescale dynamics in intrinsic ongoing neural activity and during stimulus-driven bottom-up and cognitive top-down neural network processing in the context of MI. We propose a new concept of MI that emphasizes the critical role of neural dynamics at multiple timescales within and across brain networks, enabling the simultaneous integration, segregation, hierarchical structuring and selection of information in different time windows. To highlight predictions from our multi-timescale concept of MI, real-world scenarios in which multi-timescale processes may coordinate MI in a flexible and adaptive manner are considered.
Affiliation(s)
- Daniel Senkowski
- Department of Psychiatry and Neurosciences, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Andreas K Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
3
Çetinçelik M, Jordan-Barros A, Rowland CF, Snijders TM. The effect of visual speech cues on neural tracking of speech in 10-month-old infants. Eur J Neurosci 2024; 60:5381-5399. PMID: 39188179; DOI: 10.1111/ejn.16492.
Abstract
While infants' sensitivity to visual speech cues and the benefit of these cues have been well-established by behavioural studies, there is little evidence on the effect of visual speech cues on infants' neural processing of continuous auditory speech. In this study, we investigated whether visual speech cues, such as the movements of the lips, jaw, and larynx, facilitate infants' neural speech tracking. Ten-month-old Dutch-learning infants watched videos of a speaker reciting passages in infant-directed speech while electroencephalography (EEG) was recorded. In the videos, either the full face of the speaker was displayed or the speaker's mouth and jaw were masked with a block, obstructing the visual speech cues. To assess neural tracking, speech-brain coherence (SBC) was calculated, focusing particularly on the stress and syllabic rates (1-1.75 and 2.5-3.5 Hz respectively in our stimuli). First, overall, SBC was compared to surrogate data, and then, differences in SBC in the two conditions were tested at the frequencies of interest. Our results indicated that infants show significant tracking at both stress and syllabic rates. However, no differences were identified between the two conditions, meaning that infants' neural tracking was not modulated further by the presence of visual speech cues. Furthermore, we demonstrated that infants' neural tracking of low-frequency information is related to their subsequent vocabulary development at 18 months. Overall, this study provides evidence that infants' neural tracking of speech is not necessarily impaired when visual speech cues are not fully visible and that neural tracking may be a potential mechanism in successful language acquisition.
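As a rough illustration of the speech-brain coherence (SBC) measure referred to above, the sketch below computes Welch-style magnitude-squared coherence between the speech amplitude envelope and each EEG channel and averages it within the stress (1-1.75 Hz) and syllabic (2.5-3.5 Hz) bands. It is a simplified, assumed implementation, not the authors' analysis; the window length, sampling rate, and variable names are illustrative.

```python
# Illustrative speech-brain coherence (SBC) in the stress and syllabic bands.
# Assumed inputs: `envelope` is the speech amplitude envelope (n_samples,) and
# `eeg` is (n_channels, n_samples), both sampled at `fs` Hz.
import numpy as np
from scipy.signal import coherence

def speech_brain_coherence(envelope, eeg, fs, win_sec=4.0):
    nperseg = int(win_sec * fs)  # ~0.25 Hz resolution with 4-s windows
    f, _ = coherence(envelope, eeg[0], fs=fs, nperseg=nperseg)
    cxy = np.vstack([coherence(envelope, ch, fs=fs, nperseg=nperseg)[1] for ch in eeg])
    return f, cxy  # cxy has shape (n_channels, n_freqs)

def band_mean(f, cxy, band):
    lo, hi = band
    return cxy[:, (f >= lo) & (f <= hi)].mean(axis=1)

# Frequency bands taken from the abstract:
# f, cxy = speech_brain_coherence(envelope, eeg, fs=500)
# stress_sbc   = band_mean(f, cxy, (1.0, 1.75))
# syllable_sbc = band_mean(f, cxy, (2.5, 3.5))
```

Comparing these band-averaged values against coherence computed on surrogate (e.g., time-shifted) pairings is one common way to test whether tracking exceeds chance.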
Affiliation(s)
- Melis Çetinçelik
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Department of Experimental Psychology, Utrecht University, Utrecht, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Antonia Jordan-Barros
- Centre for Brain and Cognitive Development, Department of Psychological Science, Birkbeck, University of London, London, UK
- Experimental Psychology, University College London, London, UK
- Caroline F Rowland
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Tineke M Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
4
Ahn E, Majumdar A, Lee TG, Brang D. Evidence for a Causal Dissociation of the McGurk Effect and Congruent Audiovisual Speech Perception via TMS to the Left pSTS. Multisens Res 2024; 37:341-363. PMID: 39191410; PMCID: PMC11388023; DOI: 10.1163/22134808-bja10129.
Abstract
Congruent visual speech improves speech perception accuracy, particularly in noisy environments. Conversely, mismatched visual speech can alter what is heard, leading to an illusory percept that differs from the auditory and visual components, known as the McGurk effect. While prior transcranial magnetic stimulation (TMS) and neuroimaging studies have identified the left posterior superior temporal sulcus (pSTS) as a causal region involved in the generation of the McGurk effect, it remains unclear whether this region is critical only for this illusion or also for the more general benefits of congruent visual speech (e.g., increased accuracy and faster reaction times). Indeed, recent correlative research suggests that the benefits of congruent visual speech and the McGurk effect rely on largely independent mechanisms. To better understand how these different features of audiovisual integration are causally generated by the left pSTS, we used single-pulse TMS to temporarily disrupt processing within this region while subjects were presented with either congruent or incongruent (McGurk) audiovisual combinations. Consistent with past research, we observed that TMS to the left pSTS reduced the strength of the McGurk effect. Importantly, however, left pSTS stimulation had no effect on the positive benefits of congruent audiovisual speech (increased accuracy and faster reaction times), demonstrating a causal dissociation between the two processes. Our results are consistent with models proposing that the pSTS is but one of multiple critical areas supporting audiovisual speech interactions. Moreover, these data add to a growing body of evidence suggesting that the McGurk effect is an imperfect surrogate measure for more general and ecologically valid audiovisual speech behaviors.
Affiliation(s)
- EunSeon Ahn
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- Areti Majumdar
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- Taraz G Lee
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
5
Sun L, Wang Q, Ai J. The underlying roles and neurobiological mechanisms of music-based intervention in Alzheimer's disease: A comprehensive review. Ageing Res Rev 2024; 96:102265. PMID: 38479478; DOI: 10.1016/j.arr.2024.102265.
Abstract
Non-pharmacological therapy has gained popularity as an intervention for Alzheimer's disease (AD) due to its apparent therapeutic effectiveness and the limitations of biological drugs. A wealth of research indicates that music interventions can enhance cognition, mood and behavior in individuals with AD. Nonetheless, the underlying mechanisms behind these improvements have yet to be fully and systematically delineated. This review aims to examine holistically how music-based intervention (MBI) ameliorates abnormal emotion, cognitive decline, and behavioral changes in AD patients. We cover several key dimensions: the regulation of cerebral blood flow (CBF) by MBIs, their impact on neurotransmission (including GABAergic and monoaminergic transmission), modulation of synaptic plasticity, and hormonal release. Additionally, we summarize the clinical applications and limitations of active music-based intervention (AMBI), passive music-based intervention (PMBI), and hybrid music-based intervention (HMBI). This thorough analysis enhances our understanding of the role of MBI in AD and supports the development of non-pharmacological therapeutic strategies.
Affiliation(s)
- Liyang Sun
- Department of Pharmacology (The State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), College of Pharmacy of Harbin Medical University, 157 Baojian Road, Harbin 150086, China
- Qin Wang
- Department of Pharmacology (The State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), College of Pharmacy of Harbin Medical University, 157 Baojian Road, Harbin 150086, China; Department of Breast Surgery, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin 150040, China; Heilongjiang Academy of Medical Sciences, 157 Baojian Road, Harbin 150086, China
- Jing Ai
- Department of Pharmacology (The State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), College of Pharmacy of Harbin Medical University, 157 Baojian Road, Harbin 150086, China; National Key Laboratory of Frigid Zone Cardiovascular Diseases, 157 Baojian Road, Harbin 150086, China.
6
Ahveninen J, Lee HJ, Yu HY, Lee CC, Chou CC, Ahlfors SP, Kuo WJ, Jääskeläinen IP, Lin FH. Visual Stimuli Modulate Local Field Potentials But Drive No High-Frequency Activity in Human Auditory Cortex. J Neurosci 2024; 44:e0890232023. PMID: 38129133; PMCID: PMC10869150; DOI: 10.1523/jneurosci.0890-23.2023.
Abstract
Neuroimaging studies suggest cross-sensory visual influences in human auditory cortices (ACs). Whether these influences reflect active visual processing in human ACs, which drives neuronal firing and concurrent broadband high-frequency activity (BHFA; >70 Hz), or whether they merely modulate sound processing is still debatable. Here, we presented auditory, visual, and audiovisual stimuli to 16 participants (7 women, 9 men) with stereo-EEG depth electrodes implanted near ACs for presurgical monitoring. Anatomically normalized group analyses were facilitated by inverse modeling of intracranial source currents. Analyses of intracranial event-related potentials (iERPs) suggested cross-sensory responses to visual stimuli in ACs, which lagged the earliest auditory responses by several tens of milliseconds. Visual stimuli also modulated the phase of intrinsic low-frequency oscillations and triggered 15-30 Hz event-related desynchronization in ACs. However, BHFA, a putative correlate of neuronal firing, was not significantly increased in ACs after visual stimuli, not even when they coincided with auditory stimuli. Intracranial recordings demonstrate cross-sensory modulations, but no indication of active visual processing in human ACs.
Affiliation(s)
- Jyrki Ahveninen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, Massachusetts 02129
- Department of Radiology, Harvard Medical School, Boston, Massachusetts 02115
- Hsin-Ju Lee
- Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario M4N 3M5, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 1L7, Canada
- Hsiang-Yu Yu
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei 11217, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
- Cheng-Chia Lee
- School of Medicine, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
- Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei 11217, Taiwan
- Chien-Chen Chou
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei 11217, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
- Seppo P Ahlfors
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, Massachusetts 02129
- Department of Radiology, Harvard Medical School, Boston, Massachusetts 02115
- Wen-Jui Kuo
- Institute of Neuroscience, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
- Iiro P Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, FI-00076 AALTO, Finland
- International Laboratory of Social Neurobiology, Institute of Cognitive Neuroscience, Higher School of Economics, Moscow 101000, Russia
- Fa-Hsuan Lin
- Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario M4N 3M5, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 1L7, Canada
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, FI-00076 AALTO, Finland
7
Molho W, Raymond N, Reinhart RMG, Trotti R, Grover S, Keshavan M, Lizano P. Lesion network guided delta frequency neuromodulation improves cognition in patients with psychosis spectrum disorders: A pilot study. Asian J Psychiatr 2024; 92:103887. PMID: 38183737; DOI: 10.1016/j.ajp.2023.103887.
Abstract
BACKGROUND Transcranial electric stimulation (tES) may improve cognition in psychosis spectrum disorders. However, few studies have used novel tES approaches, such as high-definition tES (HD-tES), to target specific brain circuits. Recently, the extrastriate visual cortex (V5/MT) has been causally linked to visual hallucinations through lesion network mapping, making it a promising target for improving cognition. OBJECTIVE We aim to determine whether causal lesion network guided HD-tES to V5/MT improves cognitive performance as measured by the Brief Assessment of Cognition in Schizophrenia (BACS). METHODS A single-blind pilot study with a within-subjects crossover design was performed to characterize the effect of cathodal HD-transcranial direct current stimulation (tDCS) and 2 Hz HD-transcranial alternating current stimulation (tACS) on cognition. Enrolled patients received 20 min of HD-tES twice daily for 5 consecutive days, applied bilaterally to V5/MT, with a washout between conditions. BACS assessments were performed at baseline, day 5, and 1 month. RESULTS Six participants with psychosis spectrum disorders were enrolled; six received cathodal HD-tDCS and four received 2 Hz HD-tACS. HD-tACS resulted in significant (p < 0.1) baseline-to-1-month improvements for Digit Sequencing, Verbal Fluency, and Tower of London, whereas HD-tDCS did not result in significant improvement on any task. CONCLUSIONS HD-tACS targeting V5/MT may be a promising treatment to improve cognitive abilities in individuals with psychosis. By promoting delta oscillations, tACS may enhance cortico-cortical communication across brain networks to improve verbal working memory, processing speed, and executive function. Large-scale investigations are needed to replicate these results.
Affiliation(s)
- Willa Molho
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA; Division of Translational Neuroscience, Beth Israel Deaconess Medical Center, Boston, MA, USA.
- Nicolas Raymond
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA; Division of Translational Neuroscience, Beth Israel Deaconess Medical Center, Boston, MA, USA.
- Robert M G Reinhart
- Department of Psychological and Brain Science, Boston University, Boston, MA, USA
- Rebekah Trotti
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA; Division of Translational Neuroscience, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Shrey Grover
- Division of Translational Neuroscience, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Matcheri Keshavan
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Paulo Lizano
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, USA; Division of Translational Neuroscience, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA.
8
Frei V, Schmitt R, Meyer M, Giroud N. Processing of Visual Speech Cues in Speech-in-Noise Comprehension Depends on Working Memory Capacity and Enhances Neural Speech Tracking in Older Adults With Hearing Impairment. Trends Hear 2024; 28:23312165241287622. PMID: 39444375; PMCID: PMC11520018; DOI: 10.1177/23312165241287622.
Abstract
Comprehending speech in noise (SiN) poses a challenge for older hearing-impaired listeners, requiring auditory and working memory resources. Visual speech cues provide additional sensory information supporting speech understanding, while the extent of such visual benefit is characterized by large variability, which might be accounted for by individual differences in working memory capacity (WMC). In the current study, we investigated behavioral and neurofunctional (i.e., neural speech tracking) correlates of auditory and audio-visual speech comprehension in babble noise and the associations with WMC. Healthy older adults with hearing impairment quantified by pure-tone hearing loss (threshold average: 31.85-57 dB, N = 67) listened to sentences in babble noise in audio-only, visual-only and audio-visual speech modality and performed a pattern matching and a comprehension task, while electroencephalography (EEG) was recorded. Behaviorally, no significant difference in task performance was observed across modalities. However, we did find a significant association between individual working memory capacity and task performance, suggesting a more complex interplay between audio-visual speech cues, working memory capacity and real-world listening tasks. Furthermore, we found that the visual speech presentation was accompanied by increased cortical tracking of the speech envelope, particularly in a right-hemispheric auditory topographical cluster. Post-hoc, we investigated the potential relationships between the behavioral performance and neural speech tracking but were not able to establish a significant association. Overall, our results show an increase in neurofunctional correlates of speech associated with congruent visual speech cues, specifically in a right auditory cluster, suggesting multisensory integration.
Affiliation(s)
- Vanessa Frei
- Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, Zurich, Switzerland
- International Max Planck Research School for the Life Course: Evolutionary and Ontogenetic Dynamics (LIFE), Berlin, Germany
- Raffael Schmitt
- Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, Zurich, Switzerland
- International Max Planck Research School for the Life Course: Evolutionary and Ontogenetic Dynamics (LIFE), Berlin, Germany
- Competence Center Language & Medicine, Center of Medical Faculty and Faculty of Arts and Sciences, University of Zurich, Zurich, Switzerland
- Martin Meyer
- Competence Center Language & Medicine, Center of Medical Faculty and Faculty of Arts and Sciences, University of Zurich, Zurich, Switzerland
- University of Zurich, University Research Priority Program Dynamics of Healthy Aging, Zurich, Switzerland
- Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland
- Evolutionary Neuroscience of Language, Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
- Cognitive Psychology Unit, Alpen-Adria University, Klagenfurt, Austria
- Nathalie Giroud
- Computational Neuroscience of Speech and Hearing, Department of Computational Linguistics, University of Zurich, Zurich, Switzerland
- International Max Planck Research School for the Life Course: Evolutionary and Ontogenetic Dynamics (LIFE), Berlin, Germany
- Competence Center Language & Medicine, Center of Medical Faculty and Faculty of Arts and Sciences, University of Zurich, Zurich, Switzerland
- Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland
9
Batterink LJ, Mulgrew J, Gibbings A. Rhythmically Modulating Neural Entrainment during Exposure to Regularities Influences Statistical Learning. J Cogn Neurosci 2024; 36:107-127. PMID: 37902580; DOI: 10.1162/jocn_a_02079.
Abstract
The ability to discover regularities in the environment, such as syllable patterns in speech, is known as statistical learning. Previous studies have shown that statistical learning is accompanied by neural entrainment, in which neural activity temporally aligns with repeating patterns over time. However, it is unclear whether these rhythmic neural dynamics play a functional role in statistical learning or whether they largely reflect the downstream consequences of learning, such as the enhanced perception of learned words in speech. To better understand this issue, we manipulated participants' neural entrainment during statistical learning using continuous rhythmic visual stimulation. Participants were exposed to a speech stream of repeating nonsense words while viewing either (1) a visual stimulus with a "congruent" rhythm that aligned with the word structure, (2) a visual stimulus with an incongruent rhythm, or (3) a static visual stimulus. Statistical learning was subsequently measured using both an explicit and implicit test. Participants in the congruent condition showed a significant increase in neural entrainment over auditory regions at the relevant word frequency, over and above effects of passive volume conduction, indicating that visual stimulation successfully altered neural entrainment within relevant neural substrates. Critically, during the subsequent implicit test, participants in the congruent condition showed an enhanced ability to predict upcoming syllables and stronger neural phase synchronization to component words, suggesting that they had gained greater sensitivity to the statistical structure of the speech stream relative to the incongruent and static groups. This learning benefit could not be attributed to strategic processes, as participants were largely unaware of the contingencies between the visual stimulation and embedded words. These results indicate that manipulating neural entrainment during exposure to regularities influences statistical learning outcomes, suggesting that neural entrainment may functionally contribute to statistical learning. Our findings encourage future studies using non-invasive brain stimulation methods to further understand the role of entrainment in statistical learning.
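One common way to quantify the kind of neural entrainment described above is frequency tagging: measuring the phase consistency of the EEG across epochs at the embedded word rate (and the syllable rate). The sketch below is a simplified, assumed implementation of such an inter-trial coherence measure, not the authors' analysis; the rates (e.g., 3.3 Hz syllables forming 1.1 Hz trisyllabic words), sampling rate, and data layout are hypothetical.

```python
# Illustrative inter-trial coherence (ITC) at a tagged frequency, a standard
# frequency-tagging index of neural entrainment during statistical learning.
import numpy as np

def itc_at_frequency(epochs, fs, freq):
    """Phase consistency across epochs at one frequency.

    epochs: (n_epochs, n_samples) single-channel EEG segments time-locked to
    the (hypothetical) word onsets; fs: sampling rate in Hz.
    """
    n_epochs, n_samples = epochs.shape
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    bin_idx = int(np.argmin(np.abs(freqs - freq)))  # nearest FFT bin
    spectrum = np.fft.rfft(epochs, axis=1)[:, bin_idx]
    unit_phasors = spectrum / np.abs(spectrum)      # keep phase, discard amplitude
    return float(np.abs(unit_phasors.mean()))       # 0 = no locking, 1 = perfect locking

# word_itc     = itc_at_frequency(epochs, fs=250, freq=1.1)  # word rate (assumed)
# syllable_itc = itc_at_frequency(epochs, fs=250, freq=3.3)  # syllable rate (assumed)
```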
10
Ahn E, Majumdar A, Lee T, Brang D. Evidence for a Causal Dissociation of the McGurk Effect and Congruent Audiovisual Speech Perception via TMS. bioRxiv [Preprint] 2023:2023.11.27.568892. PMID: 38077093; PMCID: PMC10705272; DOI: 10.1101/2023.11.27.568892.
Abstract
Congruent visual speech improves speech perception accuracy, particularly in noisy environments. Conversely, mismatched visual speech can alter what is heard, leading to an illusory percept known as the McGurk effect. This illusion has been widely used to study audiovisual speech integration, illustrating that auditory and visual cues are combined in the brain to generate a single coherent percept. While prior transcranial magnetic stimulation (TMS) and neuroimaging studies have identified the left posterior superior temporal sulcus (pSTS) as a causal region involved in the generation of the McGurk effect, it remains unclear whether this region is critical only for this illusion or also for the more general benefits of congruent visual speech (e.g., increased accuracy and faster reaction times). Indeed, recent correlative research suggests that the benefits of congruent visual speech and the McGurk effect reflect largely independent mechanisms. To better understand how these different features of audiovisual integration are causally generated by the left pSTS, we used single-pulse TMS to temporarily impair processing while subjects were presented with either incongruent (McGurk) or congruent audiovisual combinations. Consistent with past research, we observed that TMS to the left pSTS significantly reduced the strength of the McGurk effect. Importantly, however, left pSTS stimulation did not affect the positive benefits of congruent audiovisual speech (increased accuracy and faster reaction times), demonstrating a causal dissociation between the two processes. Our results are consistent with models proposing that the pSTS is but one of multiple critical areas supporting audiovisual speech interactions. Moreover, these data add to a growing body of evidence suggesting that the McGurk effect is an imperfect surrogate measure for more general and ecologically valid audiovisual speech behaviors.
Affiliation(s)
- EunSeon Ahn
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Areti Majumdar
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Taraz Lee
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
11
Krason A, Vigliocco G, Mailend ML, Stoll H, Varley R, Buxbaum LJ. Benefit of visual speech information for word comprehension in post-stroke aphasia. Cortex 2023; 165:86-100. PMID: 37271014; PMCID: PMC10850036; DOI: 10.1016/j.cortex.2023.04.011.
Abstract
Aphasia is a language disorder that often involves speech comprehension impairments affecting communication. In face-to-face settings, speech is accompanied by mouth and facial movements, but little is known about the extent to which they benefit aphasic comprehension. This study investigated the benefit of visual information accompanying speech for word comprehension in people with aphasia (PWA) and the neuroanatomic substrates of any benefit. Thirty-six PWA and 13 neurotypical matched control participants performed a picture-word verification task in which they indicated whether a picture of an animate/inanimate object matched a subsequent word produced by an actress in a video. Stimuli were either audiovisual (with visible mouth and facial movements) or auditory-only (still picture of a silhouette) with audio being clear (unedited) or degraded (6-band noise-vocoding). We found that visual speech information was more beneficial for neurotypical participants than PWA, and more beneficial for both groups when speech was degraded. A multivariate lesion-symptom mapping analysis for the degraded speech condition showed that lesions to superior temporal gyrus, underlying insula, primary and secondary somatosensory cortices, and inferior frontal gyrus were associated with reduced benefit of audiovisual compared to auditory-only speech, suggesting that the integrity of these fronto-temporo-parietal regions may facilitate cross-modal mapping. These findings provide initial insights into our understanding of the impact of audiovisual information on comprehension in aphasia and the brain regions mediating any benefit.
Affiliation(s)
- Anna Krason
- Experimental Psychology, University College London, UK; Moss Rehabilitation Research Institute, Elkins Park, PA, USA.
- Gabriella Vigliocco
- Experimental Psychology, University College London, UK; Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Marja-Liisa Mailend
- Moss Rehabilitation Research Institute, Elkins Park, PA, USA; Department of Special Education, University of Tartu, Tartu Linn, Estonia
- Harrison Stoll
- Moss Rehabilitation Research Institute, Elkins Park, PA, USA; Applied Cognitive and Brain Science, Drexel University, Philadelphia, PA, USA
- Laurel J Buxbaum
- Moss Rehabilitation Research Institute, Elkins Park, PA, USA; Department of Rehabilitation Medicine, Thomas Jefferson University, Philadelphia, PA, USA
12
Ahmed F, Nidiffer AR, O'Sullivan AE, Zuk NJ, Lalor EC. The integration of continuous audio and visual speech in a cocktail-party environment depends on attention. Neuroimage 2023; 274:120143. PMID: 37121375; DOI: 10.1016/j.neuroimage.2023.120143.
Abstract
In noisy environments, our ability to understand speech benefits greatly from seeing the speaker's face. This is attributed to the brain's ability to integrate audio and visual information, a process known as multisensory integration. In addition, selective attention plays an enormous role in what we understand, the so-called cocktail-party phenomenon. But how attention and multisensory integration interact remains incompletely understood, particularly in the case of natural, continuous speech. Here, we addressed this issue by analyzing EEG data recorded from participants who undertook a multisensory cocktail-party task using natural speech. To assess multisensory integration, we modeled the EEG responses to the speech in two ways. The first assumed that audiovisual speech processing is simply a linear combination of audio speech processing and visual speech processing (i.e., an A + V model), while the second allows for the possibility of audiovisual interactions (i.e., an AV model). Applying these models to the data revealed that EEG responses to attended audiovisual speech were better explained by an AV model, providing evidence for multisensory integration. In contrast, unattended audiovisual speech responses were best captured using an A + V model, suggesting that multisensory integration is suppressed for unattended speech. Follow up analyses revealed some limited evidence for early multisensory integration of unattended AV speech, with no integration occurring at later levels of processing. We take these findings as evidence that the integration of natural audio and visual speech occurs at multiple levels of processing in the brain, each of which can be differentially affected by attention.
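The A + V versus AV comparison described above can be operationalized roughly as follows (a hedged sketch, not the authors' code): fit separate audio-only and visual-only encoding models and sum their predictions (the additive A + V model), fit a joint model on the audiovisual responses (the AV model), and compare cross-validated prediction accuracy on audiovisual test data. `Xa`/`Xv` stand for pre-computed time-lagged audio and visual feature matrices, and the unimodal training data are assumptions about the data layout.

```python
# Sketch of the additive (A + V) versus joint (AV) model comparison.
import numpy as np
from sklearn.linear_model import Ridge

def pooled_r(pred, actual):
    """Crude pooled prediction correlation across channels (for illustration)."""
    return np.corrcoef(pred.ravel(), actual.ravel())[0, 1]

def compare_av_models(Xa, Xv, eeg_a, eeg_v, eeg_av, train, test, alpha=1e3):
    # A + V model: unimodal models whose predictions are summed (no AV interaction).
    model_a = Ridge(alpha=alpha).fit(Xa[train], eeg_a[train])
    model_v = Ridge(alpha=alpha).fit(Xv[train], eeg_v[train])
    pred_additive = model_a.predict(Xa[test]) + model_v.predict(Xv[test])
    r_additive = pooled_r(pred_additive, eeg_av[test])

    # AV model: fit directly on audiovisual responses, so any interaction between
    # the auditory and visual streams can be absorbed by the weights.
    Xav = np.hstack([Xa, Xv])
    model_av = Ridge(alpha=alpha).fit(Xav[train], eeg_av[train])
    r_av = pooled_r(model_av.predict(Xav[test]), eeg_av[test])

    # r_av > r_additive is taken as evidence for audiovisual integration.
    return r_additive, r_av
```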
Affiliation(s)
- Farhin Ahmed
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA
- Aaron R Nidiffer
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA
- Aisling E O'Sullivan
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA; School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Nathaniel J Zuk
- Edmond & Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem, Israel
- Edmund C Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA; School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland.
13
Chalas N, Omigie D, Poeppel D, van Wassenhove V. Hierarchically nested networks optimize the analysis of audiovisual speech. iScience 2023; 26:106257. PMID: 36909667; PMCID: PMC9993032; DOI: 10.1016/j.isci.2023.106257.
Abstract
In conversational settings, seeing the speaker's face elicits internal predictions about the upcoming acoustic utterance. Understanding how the listener's cortical dynamics tune to the temporal statistics of audiovisual (AV) speech is thus essential. Using magnetoencephalography, we explored how large-scale frequency-specific dynamics of human brain activity adapt to AV speech delays. First, we show that the amplitude of phase-locked responses parametrically decreases with natural AV speech synchrony, a pattern that is consistent with predictive coding. Second, we show that the temporal statistics of AV speech affect large-scale oscillatory networks at multiple spatial and temporal resolutions. We demonstrate a spatial nestedness of oscillatory networks during the processing of AV speech: these oscillatory hierarchies are such that high-frequency activity (beta, gamma) is contingent on the phase response of low-frequency (delta, theta) networks. Our findings suggest that the endogenous temporal multiplexing of speech processing confers adaptability within the temporal regimes that are essential for speech comprehension.
Affiliation(s)
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, P.C., 48149 Münster, Germany
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
- School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, P.C., 54124 Thessaloniki, Greece
- Diana Omigie
- Department of Psychology, Goldsmiths University London, London, UK
- David Poeppel
- Department of Psychology, New York University, New York, NY 10003, USA
- Ernst Struengmann Institute for Neuroscience, 60528 Frankfurt am Main, Frankfurt, Germany
- Virginie van Wassenhove
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
14
Jiang Z, An X, Liu S, Wang L, Yin E, Yan Y, Ming D. The effect of prestimulus low-frequency neural oscillations on the temporal perception of audiovisual speech. Front Neurosci 2023; 17:1067632. PMID: 36816126; PMCID: PMC9935937; DOI: 10.3389/fnins.2023.1067632.
Abstract
Objective: Perceptual integration and segregation are modulated by the phase of ongoing neural oscillations whose period is broader than the size of the temporal binding window (TBW). Studies have shown that the perception of abstract beep-flash stimuli, with a TBW of about 100 ms, is modulated by the alpha-band phase. We therefore hypothesized that the temporal perception of speech, with a TBW of several hundred milliseconds, might be affected by the delta-theta phase. Methods: We conducted a speech-based audiovisual simultaneity judgment (SJ) experiment. Twenty human participants (12 females) took part in this study while 62-channel EEG was recorded. Results: Behaviorally, visual-leading TBWs were broader than auditory-leading ones [273.37 ± 24.24 ms vs. 198.05 ± 19.28 ms (mean ± SEM)]. We used the Phase Opposition Sum (POS) to quantify differences in mean phase angle and phase concentration between synchronous and asynchronous responses. The POS results indicated that the delta-theta phase differed significantly between synchronous and asynchronous responses in the A50V condition (50% synchronous responses at the auditory-leading SOA), whereas in the V50A condition (50% synchronous responses at the visual-leading SOA) we found only a delta-band effect. In both conditions, post hoc Rayleigh tests showed no consistency of phase across subjects for either perceptual response (all ps > 0.05), suggesting that the phase may not reflect neuronal excitability, which would require the phases for a given perceptual response to concentrate on the same angle across subjects rather than being uniformly distributed. However, a V-test showed that the phase difference between synchronous and asynchronous responses exhibited significant phase opposition across subjects (all ps < 0.05), which is compatible with the POS result. Conclusion: These results indicate that the temporal perception of speech depends on the alignment of stimulus onset with an optimal phase of neural oscillations whose period may be broader than the size of the TBW. The role of the oscillatory phase may be to encode temporal information, which varies across subjects, rather than to reflect neuronal excitability. Given the rich temporal structure of spoken-language stimuli, the conclusion that phase encodes temporal information is plausible and valuable for future research.
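For reference, the Phase Opposition Sum used above is commonly defined as POS = ITC_sync + ITC_async − 2·ITC_all, where ITC is the inter-trial coherence of the single-trial phases within each response group and over all trials pooled. The sketch below is a minimal, assumed implementation with a label-permutation test; variable names and shapes are illustrative rather than taken from the study.

```python
# Minimal sketch of the Phase Opposition Sum (POS) with a permutation test.
import numpy as np

def itc(phases):
    """Inter-trial coherence of a 1-D array of phase angles (radians)."""
    return np.abs(np.mean(np.exp(1j * phases)))

def phase_opposition_sum(phases_sync, phases_async):
    """POS = ITC_sync + ITC_async - 2 * ITC_all (per channel/frequency/time point)."""
    all_phases = np.concatenate([phases_sync, phases_async])
    return itc(phases_sync) + itc(phases_async) - 2 * itc(all_phases)

def pos_permutation_p(phases_sync, phases_async, n_perm=1000, seed=0):
    """p-value from shuffling the synchronous/asynchronous labels."""
    rng = np.random.default_rng(seed)
    observed = phase_opposition_sum(phases_sync, phases_async)
    pooled = np.concatenate([phases_sync, phases_async])
    n_sync = len(phases_sync)
    null = np.empty(n_perm)
    for i in range(n_perm):
        rng.shuffle(pooled)
        null[i] = phase_opposition_sum(pooled[:n_sync], pooled[n_sync:])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)
```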
Affiliation(s)
- Zeliang Jiang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Lu Wang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
15
Becker R, Hervais-Adelman A. Individual theta-band cortical entrainment to speech in quiet predicts word-in-noise comprehension. Cereb Cortex Commun 2023; 4:tgad001. PMID: 36726796; PMCID: PMC9883620; DOI: 10.1093/texcom/tgad001.
Abstract
Speech elicits brain activity time-locked to its amplitude envelope. The resulting speech-brain synchrony (SBS) is thought to be crucial to speech parsing and comprehension. It has been shown that higher speech-brain coherence is associated with increased speech intelligibility. However, studies depending on the experimental manipulation of speech stimuli do not allow conclusion about the causality of the observed tracking. Here, we investigate whether individual differences in the intrinsic propensity to track the speech envelope when listening to speech-in-quiet is predictive of individual differences in speech-recognition-in-noise, in an independent task. We evaluated the cerebral tracking of speech in source-localized magnetoencephalography, at timescales corresponding to the phrases, words, syllables and phonemes. We found that individual differences in syllabic tracking in right superior temporal gyrus and in left middle temporal gyrus (MTG) were positively associated with recognition accuracy in an independent words-in-noise task. Furthermore, directed connectivity analysis showed that this relationship is partially mediated by top-down connectivity from premotor cortex-associated with speech processing and active sensing in the auditory domain-to left MTG. Thus, the extent of SBS-even during clear speech-reflects an active mechanism of the speech processing system that may confer resilience to noise.
Affiliation(s)
- Robert Becker
- Neurolinguistics, Department of Psychology, University of Zurich (UZH), Zurich, Switzerland
- Alexis Hervais-Adelman
- Neurolinguistics, Department of Psychology, University of Zurich, Zurich 8050, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, Zurich 8057, Switzerland
16
Van Engen KJ, Dey A, Sommers MS, Peelle JE. Audiovisual speech perception: Moving beyond McGurk. J Acoust Soc Am 2022; 152:3216. PMID: 36586857; PMCID: PMC9894660; DOI: 10.1121/10.0015262.
Abstract
Although it is clear that sighted listeners use both auditory and visual cues during speech perception, the manner in which multisensory information is combined is a matter of debate. One approach to measuring multisensory integration is to use variants of the McGurk illusion, in which discrepant auditory and visual cues produce auditory percepts that differ from those based on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we join the voices of others in the field to argue that McGurk tasks are ill-suited for studying real-life multisensory speech perception: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility to McGurk tasks does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication: namely, words, sentences, and narratives with congruent auditory and visual speech cues.
Affiliation(s)
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University, St. Louis, Missouri 63130, USA
- Avanti Dey
- PLOS ONE, 1265 Battery Street, San Francisco, California 94111, USA
- Mitchell S Sommers
- Department of Psychological and Brain Sciences, Washington University, St. Louis, Missouri 63130, USA
- Jonathan E Peelle
- Department of Otolaryngology, Washington University, St. Louis, Missouri 63130, USA
17
Neurodevelopmental oscillatory basis of speech processing in noise. Dev Cogn Neurosci 2022; 59:101181. PMID: 36549148; PMCID: PMC9792357; DOI: 10.1016/j.dcn.2022.101181.
Abstract
Humans' extraordinary ability to understand speech in noise relies on multiple processes that develop with age. Using magnetoencephalography (MEG), we characterize the underlying neuromaturational basis by quantifying how cortical oscillations in 144 participants (aged 5-27 years) track phrasal and syllabic structures in connected speech mixed with different types of noise. While the extraction of prosodic cues from clear speech was stable during development, its maintenance in a multi-talker background matured rapidly up to age 9 and was associated with speech comprehension. Furthermore, while the extraction of subtler information provided by syllables matured at age 9, its maintenance in noisy backgrounds progressively matured until adulthood. Altogether, these results highlight distinct behaviorally relevant maturational trajectories for the neuronal signatures of speech perception. In accordance with grain-size proposals, neuromaturational milestones are reached increasingly late for linguistic units of decreasing size, with further delays incurred by noise.
18
Unraveling the functional attributes of the language connectome: crucial subnetworks, flexibility and variability. Neuroimage 2022; 263:119672. PMID: 36209795; DOI: 10.1016/j.neuroimage.2022.119672.
Abstract
Language processing is a highly integrative function, intertwining linguistic operations (processing the language code intentionally used for communication) and extra-linguistic processes (e.g., attention monitoring, predictive inference, long-term memory). This synergetic cognitive architecture requires a distributed and specialized neural substrate. Brain systems have mainly been examined at rest. However, task-related functional connectivity provides additional and valuable information about how information is processed when various cognitive states are involved. We gathered thirteen language fMRI tasks in a unique database of one hundred and fifty neurotypical adults (InLang [Interactive networks of Language] database), providing the opportunity to assess language features across a wide range of linguistic processes. Using this database, we applied network theory as a computational tool to model the task-related functional connectome of language (LANG atlas). The organization of this data-driven neurocognitive atlas of language was examined at multiple levels, uncovering its major components (or crucial subnetworks), and its anatomical and functional correlates. In addition, we estimated its reconfiguration as a function of linguistic demand (flexibility) or several factors such as age or gender (variability). We observed that several discrete networks could be specifically shaped to promote key functional features of language: coding-decoding (Net1), control-executive (Net2), abstract-knowledge (Net3), and sensorimotor (Net4) functions. The architecture of these systems and the functional connectivity of the pivotal brain regions varied according to the nature of the linguistic process, gender, or age. By accounting for the multifaceted nature of language and modulating factors, this study can contribute to enriching and refining existing neurocognitive models of language. The LANG atlas can also be considered a reference for comparative or clinical studies involving various patients and conditions.
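To make the network-theoretic step described above more tangible, the sketch below builds a graph from a task-related functional connectivity matrix and partitions it into candidate subnetworks with modularity-based community detection. This is a generic illustration under assumed choices (correlation-based connectivity, proportional thresholding, greedy modularity), not the pipeline used to build the LANG atlas.

```python
# Illustrative decomposition of a task-related functional connectome into
# candidate subnetworks via modularity-based community detection.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def connectome_communities(timeseries, density=0.15):
    """timeseries: (n_timepoints, n_regions) task fMRI signals; density: kept edge fraction."""
    fc = np.corrcoef(timeseries.T)                   # region-by-region correlation
    np.fill_diagonal(fc, 0.0)
    thresh = np.quantile(fc[fc > 0], 1.0 - density)  # keep the strongest positive edges
    adj = np.where(fc >= thresh, fc, 0.0)
    graph = nx.from_numpy_array(adj)
    communities = greedy_modularity_communities(graph, weight="weight")
    return [sorted(c) for c in communities]          # each list = one candidate subnetwork
```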
19
Xiao YJ, Wang L, Liu YZ, Chen J, Zhang H, Gao Y, He H, Zhao Z, Wang Z. Excitatory Crossmodal Input to a Widespread Population of Primary Sensory Cortical Neurons. Neurosci Bull 2022; 38:1139-1152. PMID: 35429324; PMCID: PMC9554107; DOI: 10.1007/s12264-022-00855-4.
Abstract
Crossmodal information processing in sensory cortices has been reported in sparsely distributed neurons under normal conditions and can undergo experience- or activity-induced plasticity. Given the potential role in brain function as indicated by previous reports, crossmodal connectivity in the sensory cortex needs to be further explored. Using perforated whole-cell recording in anesthetized adult rats, we found that almost all neurons recorded in the primary somatosensory, auditory, and visual cortices exhibited significant membrane-potential responses to crossmodal stimulation, as recorded when brain activity states were pharmacologically down-regulated in light anesthesia. These crossmodal cortical responses were excitatory and subthreshold, and further seemed to be relayed primarily by the sensory thalamus, but not the sensory cortex, of the stimulated modality. Our experiments indicate a sensory cortical presence of widespread excitatory crossmodal inputs, which might play roles in brain functions involving crossmodal information processing or plasticity.
Affiliation(s)
- Yuan-Jie Xiao
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
- Lidan Wang
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
- Yu-Zhang Liu
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
- Department of Neuroscience, University of Pittsburgh, Pittsburgh, 15260, USA
- Jiayu Chen
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
- Haoyu Zhang
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
| | - Yan Gao
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China
| | - Hua He
- Department of Neurosurgery, Third Affiliated Hospital of the Navy Military Medical University, Shanghai, 200438, China
| | - Zheng Zhao
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China.
| | - Zhiru Wang
- Institute and Key Laboratory of Brain Functional Genomics of the Chinese Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Life Sciences, East China Normal University, Shanghai, 200062, China.
20
Ross LA, Molholm S, Butler JS, Bene VAD, Foxe JJ. Neural correlates of multisensory enhancement in audiovisual narrative speech perception: a fMRI investigation. Neuroimage 2022; 263:119598. [PMID: 36049699 DOI: 10.1016/j.neuroimage.2022.119598] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Revised: 08/26/2022] [Accepted: 08/28/2022] [Indexed: 11/25/2022] Open
Abstract
This fMRI study investigated the effect of seeing articulatory movements of a speaker while listening to a naturalistic narrative stimulus. Its goal was to identify regions of the language network showing multisensory enhancement under synchronous audiovisual conditions. We expected this enhancement to emerge in regions known to underlie the integration of auditory and visual information, such as the posterior superior temporal gyrus, as well as parts of the broader language network, including the semantic system. To this end, we presented 53 participants with a continuous narration of a story in auditory alone, visual alone, and both synchronous and asynchronous audiovisual speech conditions while recording brain activity using BOLD fMRI. We found multisensory enhancement in an extensive network of regions underlying multisensory integration and parts of the semantic network, as well as extralinguistic regions not usually associated with multisensory integration, namely the primary visual cortex and the bilateral amygdala. Analysis also revealed involvement of thalamic brain regions along the visual and auditory pathways more commonly associated with early sensory processing. We conclude that under natural listening conditions, multisensory enhancement not only involves sites of multisensory integration but also many regions of the wider semantic network, and includes regions associated with extralinguistic sensory, perceptual and cognitive processing.
Affiliation(s)
- Lars A Ross
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; Department of Imaging Sciences, University of Rochester Medical Center, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
| | - Sophie Molholm
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA
| | - John S Butler
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; School of Mathematical Sciences, Technological University Dublin, Kevin Street Campus, Dublin, Ireland
| | - Victor A Del Bene
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; University of Alabama at Birmingham, Heersink School of Medicine, Department of Neurology, Birmingham, Alabama, 35233, USA
| | - John J Foxe
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
21
David W, Gransier R, Wouters J. Evaluation of phase-locking to parameterized speech envelopes. Front Neurol 2022; 13:852030. [PMID: 35989900 PMCID: PMC9382131 DOI: 10.3389/fneur.2022.852030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2022] [Accepted: 06/29/2022] [Indexed: 12/04/2022] Open
Abstract
Humans rely on the temporal processing ability of the auditory system to perceive speech during everyday communication. The temporal envelope of speech is essential for speech perception, particularly envelope modulations below 20 Hz. In the literature, the neural representation of this speech envelope is usually investigated by recording neural phase-locked responses to speech stimuli. However, these phase-locked responses are not only associated with envelope modulation processing, but also with processing of linguistic information at a higher-order level when speech is comprehended. It is thus difficult to disentangle the responses into components from the acoustic envelope itself and the linguistic structures in speech (such as words, phrases and sentences). Another way to investigate neural modulation processing is to use sinusoidal amplitude-modulated stimuli at different modulation frequencies to obtain the temporal modulation transfer function. However, these transfer functions are considerably variable across modulation frequencies and individual listeners. To tackle the issues of both speech and sinusoidal amplitude-modulated stimuli, the recently introduced Temporal Speech Envelope Tracking (TEMPEST) framework proposed the use of stimuli with a distribution of envelope modulations. The framework aims to assess the brain's capability to process temporal envelopes in different frequency bands using stimuli with speech-like envelope modulations. In this study, we provide a proof-of-concept of the framework using stimuli with modulation frequency bands around the syllable and phoneme rate in natural speech. We evaluated whether the evoked phase-locked neural activity correlates with the speech-weighted modulation transfer function measured using sinusoidal amplitude-modulated stimuli in normal-hearing listeners. Since many studies on modulation processing employ different metrics and comparing their results is difficult, we included different power- and phase-based metrics and investigated how these metrics relate to each other. Results reveal a strong correspondence across listeners between the neural activity evoked by the speech-like stimuli and the activity evoked by the sinusoidal amplitude-modulated stimuli. Furthermore, strong correspondence was also apparent between the different metrics, facilitating comparisons between studies using different metrics. These findings indicate the potential of the TEMPEST framework to efficiently assess the neural capability to process temporal envelope modulations within a frequency band that is important for speech perception.
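As a rough illustration of the kind of power- and phase-based metrics such studies compare (not the authors' implementation), the sketch below computes evoked power and inter-trial phase coherence at a single modulation frequency from simulated epoched EEG. The sampling rate, epoch count, and 4 Hz target frequency are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 250.0                                   # sampling rate (Hz)
f_mod = 4.0                                  # modulation frequency of interest (Hz)
n_trials, n_samples = 100, int(2 * fs)       # 2-s epochs
t = np.arange(n_samples) / fs

# Simulated single-channel epochs: weak 4 Hz phase-locked response plus noise.
signal = 0.5 * np.sin(2 * np.pi * f_mod * t)
epochs = signal + rng.standard_normal((n_trials, n_samples))

# Complex Fourier coefficient at the modulation frequency, one per trial.
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)
idx = np.argmin(np.abs(freqs - f_mod))
coefs = np.fft.rfft(epochs, axis=1)[:, idx]

# Power-based metric: evoked power at f_mod (power of the trial-averaged spectrum).
evoked_power = np.abs(coefs.mean()) ** 2

# Phase-based metric: inter-trial phase coherence (phase-locking value across trials).
itpc = np.abs(np.mean(coefs / np.abs(coefs)))

print(f"Evoked power at {f_mod} Hz: {evoked_power:.3f}")
print(f"Inter-trial phase coherence: {itpc:.3f}")
```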
Affiliation(s)
- Wouter David
- ExpORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
22
Aller M, Økland HS, MacGregor LJ, Blank H, Davis MH. Differential Auditory and Visual Phase-Locking Are Observed during Audio-Visual Benefit and Silent Lip-Reading for Speech Perception. J Neurosci 2022; 42:6108-6120. [PMID: 35760528 PMCID: PMC9351641 DOI: 10.1523/jneurosci.2476-21.2022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 04/04/2022] [Accepted: 04/12/2022] [Indexed: 11/21/2022] Open
Abstract
Speech perception in noisy environments is enhanced by seeing facial movements of communication partners. However, the neural mechanisms by which audio and visual speech are combined are not fully understood. We explore MEG phase-locking to auditory and visual signals in MEG recordings from 14 human participants (6 females, 8 males) that reported words from single spoken sentences. We manipulated the acoustic clarity and visual speech signals such that critical speech information is present in auditory, visual, or both modalities. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase-locking to auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase-locking to auditory signals in visual cortex was enhanced for AV speech compared with audio-only speech that was matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus did not show above-chance partial coherence with visual speech signals during AV conditions but did show partial coherence in visual-only conditions. Hence, visual speech enabled stronger phase-locking to auditory signals in visual areas, whereas phase-locking of visual speech in auditory regions only occurred during silent lip-reading. Differences in these cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms during speech perception.SIGNIFICANCE STATEMENT Verbal communication in noisy environments is challenging, especially for hearing-impaired individuals. Seeing facial movements of communication partners improves speech perception when auditory signals are degraded or absent. The neural mechanisms supporting lip-reading or audio-visual benefit are not fully understood. Using MEG recordings and partial coherence analysis, we show that speech information is used differently in brain regions that respond to auditory and visual speech. While visual areas use visual speech to improve phase-locking to auditory speech signals, auditory areas do not show phase-locking to visual speech unless auditory speech is absent and visual speech is used to substitute for missing auditory signals. These findings highlight brain processes that combine visual and auditory signals to support speech understanding.
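Partial coherence of the kind used in this analysis can be sketched from cross-spectral densities: estimate brain-envelope coherence, then recompute it after partialling out a correlated visual (lip) signal. The simulated signals, the 3 Hz drive, and the 2-6 Hz band below are assumptions for illustration, not the study's MEG pipeline.

```python
import numpy as np
from scipy.signal import csd

rng = np.random.default_rng(2)
fs = 100.0
n = 6000                         # 60 s of data
t = np.arange(n) / fs

# Simulated 3 Hz "syllable-rate" drive shared by the auditory envelope,
# a correlated visual (lip-aperture) signal, and a noisy brain response.
drive = np.sin(2 * np.pi * 3.0 * t)
audio = drive + 0.3 * rng.standard_normal(n)
visual = np.roll(drive, int(0.1 * fs)) + 0.3 * rng.standard_normal(n)
brain = 0.8 * audio + 0.4 * visual + rng.standard_normal(n)

nper = 512
f, S_ba = csd(brain, audio, fs=fs, nperseg=nper)
_, S_bv = csd(brain, visual, fs=fs, nperseg=nper)
_, S_av = csd(audio, visual, fs=fs, nperseg=nper)
_, S_bb = csd(brain, brain, fs=fs, nperseg=nper)
_, S_aa = csd(audio, audio, fs=fs, nperseg=nper)
_, S_vv = csd(visual, visual, fs=fs, nperseg=nper)

# Ordinary coherence between brain and auditory envelope.
coh = np.abs(S_ba) ** 2 / (np.real(S_bb) * np.real(S_aa))

# Partial coherence: brain vs. audio after removing what is predictable from the visual signal.
S_ba_v = S_ba - S_bv * np.conj(S_av) / np.real(S_vv)
S_bb_v = np.real(S_bb) - np.abs(S_bv) ** 2 / np.real(S_vv)
S_aa_v = np.real(S_aa) - np.abs(S_av) ** 2 / np.real(S_vv)
pcoh = np.abs(S_ba_v) ** 2 / (S_bb_v * S_aa_v)

band = (f >= 2) & (f <= 6)
print(f"2-6 Hz coherence:         {coh[band].mean():.3f}")
print(f"2-6 Hz partial coherence: {pcoh[band].mean():.3f}")
```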
Affiliation(s)
- Máté Aller
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
| | - Heidi Solberg Økland
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
| | - Lucy J MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
| | - Helen Blank
- University Medical Center Hamburg-Eppendorf, Hamburg, 20246, Germany
| | - Matthew H Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
23
Modulation transfer functions for audiovisual speech. PLoS Comput Biol 2022; 18:e1010273. [PMID: 35852989 PMCID: PMC9295967 DOI: 10.1371/journal.pcbi.1010273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2022] [Accepted: 06/01/2022] [Indexed: 11/19/2022] Open
Abstract
Temporal synchrony between facial motion and acoustic modulations is a hallmark feature of audiovisual speech. The moving face and mouth during natural speech are known to be correlated with low-frequency acoustic envelope fluctuations (below 10 Hz), but the precise rates at which envelope information is synchronized with motion in different parts of the face are less clear. Here, we used regularized canonical correlation analysis (rCCA) to learn speech envelope filters whose outputs correlate with motion in different parts of the speaker's face. We leveraged recent advances in video-based 3D facial landmark estimation, allowing us to examine statistical envelope-face correlations across a large number of speakers (∼4000). Specifically, rCCA was used to learn modulation transfer functions (MTFs) for the speech envelope that significantly predict correlation with facial motion across different speakers. The AV analysis revealed bandpass speech envelope filters at distinct temporal scales. A first set of MTFs showed peaks around 3-4 Hz and were correlated with mouth movements. A second set of MTFs captured envelope fluctuations in the 1-2 Hz range correlated with more global face and head motion. These two distinctive timescales emerged only as a property of natural AV speech statistics across many speakers. A similar analysis of fewer speakers performing a controlled speech task highlighted only the well-known temporal modulations around 4 Hz correlated with orofacial motion. The different bandpass ranges of AV correlation align notably with the average rates at which syllables (3-4 Hz) and phrases (1-2 Hz) are produced in natural speech. Whereas periodicities at the syllable rate are evident in the envelope spectrum of the speech signal itself, slower 1-2 Hz regularities thus only become prominent when considering crossmodal signal statistics. This may indicate a motor origin of temporal regularities at the timescales of syllables and phrases in natural speech.
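A stripped-down version of the envelope-face analysis can be sketched with a band-passed envelope feature bank and a canonical correlation step. scikit-learn's plain CCA is used below as a stand-in for the regularized variant (rCCA) described in the paper, and all signals, bands, and lags are invented for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(3)
fs, n = 50.0, 5000               # 100 s of envelope/motion data at 50 Hz
t = np.arange(n) / fs

# Stand-ins for a broadband speech envelope and a mouth-motion trace that
# follows its ~3.5 Hz content (purely illustrative, not real AV recordings).
env = np.abs(rng.standard_normal(n)) + 0.6 * np.sin(2 * np.pi * 3.5 * t)
mouth = np.roll(np.sin(2 * np.pi * 3.5 * t), int(0.12 * fs)) + 0.5 * rng.standard_normal(n)

# Envelope "feature bank": the envelope band-passed in several modulation bands.
bands = [(0.5, 1.5), (1.5, 3.0), (3.0, 5.0), (5.0, 8.0)]
X = np.column_stack([
    sosfiltfilt(butter(3, (lo, hi), btype="band", fs=fs, output="sos"), env)
    for lo, hi in bands
])
Y = mouth[:, None]

# One pair of canonical components: the weights over modulation bands act as a
# crude modulation-transfer-function-like profile relating envelope to motion.
cca = CCA(n_components=1)
Xc, Yc = cca.fit_transform(X, Y)
r = np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1]
print("Canonical correlation:", round(r, 3))
print("Band weights (0.5-1.5, 1.5-3, 3-5, 5-8 Hz):", np.round(cca.x_weights_[:, 0], 3))
```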
24
Brang D, Plass J, Sherman A, Stacey WC, Wasade VS, Grabowecky M, Ahn E, Towle VL, Tao JX, Wu S, Issa NP, Suzuki S. Visual cortex responds to sound onset and offset during passive listening. J Neurophysiol 2022; 127:1547-1563. [PMID: 35507478 DOI: 10.1152/jn.00164.2021] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Sounds enhance our ability to detect, localize, and respond to co-occurring visual targets. Research suggests that sounds improve visual processing by resetting the phase of ongoing oscillations in visual cortex. However, it remains unclear what information is relayed from the auditory system to visual areas and if sounds modulate visual activity even in the absence of visual stimuli (e.g., during passive listening). Using intracranial electroencephalography (iEEG) in humans, we examined the sensitivity of visual cortex to three forms of auditory information during a passive listening task: auditory onset responses, auditory offset responses, and rhythmic entrainment to sounds. Because some auditory neurons respond to both sound onsets and offsets, visual timing and duration processing may benefit from each. Additionally, if auditory entrainment information is relayed to visual cortex, it could support the processing of complex stimulus dynamics that are aligned between auditory and visual stimuli. Results demonstrate that in visual cortex, amplitude-modulated sounds elicited transient onset and offset responses in multiple areas, but no entrainment to sound modulation frequencies. These findings suggest that activity in visual cortex (as measured with iEEG in response to auditory stimuli) may not be affected by temporally fine-grained auditory stimulus dynamics during passive listening (though it remains possible that this signal may be observable with simultaneous auditory-visual stimuli). Moreover, auditory responses were maximal in low-level visual cortex, potentially implicating a direct pathway for rapid interactions between auditory and visual cortices. This mechanism may facilitate perception by time-locking visual computations to environmental events marked by auditory discontinuities.
Affiliation(s)
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI, United States
| | - John Plass
- Department of Psychology, University of Michigan, Ann Arbor, MI, United States
| | - Aleksandra Sherman
- Department of Cognitive Science, Occidental College, Los Angeles, CA, United States
| | - William C Stacey
- Department of Neurology, University of Michigan, Ann Arbor, MI, United States
| | | | - Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, IL, United States
| | - EunSeon Ahn
- Department of Psychology, University of Michigan, Ann Arbor, MI, United States
| | - Vernon L Towle
- Department of Neurology, The University of Chicago, Chicago, IL, United States
| | - James X Tao
- Department of Neurology, The University of Chicago, Chicago, IL, United States
| | - Shasha Wu
- Department of Neurology, The University of Chicago, Chicago, IL, United States
| | - Naoum P Issa
- Department of Neurology, The University of Chicago, Chicago, IL, United States
| | - Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, IL, United States
25
Bröhl F, Keitel A, Kayser C. MEG Activity in Visual and Auditory Cortices Represents Acoustic Speech-Related Information during Silent Lip Reading. eNeuro 2022; 9:ENEURO.0209-22.2022. [PMID: 35728955 PMCID: PMC9239847 DOI: 10.1523/eneuro.0209-22.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Accepted: 06/06/2022] [Indexed: 11/21/2022] Open
Abstract
Speech is an intrinsically multisensory signal, and seeing the speaker's lips forms a cornerstone of communication in acoustically impoverished environments. Still, it remains unclear how the brain exploits visual speech for comprehension. Previous work debated whether lip signals are mainly processed along the auditory pathways or whether the visual system directly implements speech-related processes. To probe this, we systematically characterized dynamic representations of multiple acoustic and visual speech-derived features in source localized MEG recordings that were obtained while participants listened to speech or viewed silent speech. Using a mutual-information framework we provide a comprehensive assessment of how well temporal and occipital cortices reflect the physically presented signals and unique aspects of acoustic features that were physically absent but may be critical for comprehension. Our results demonstrate that both cortices feature a functionally specific form of multisensory restoration: during lip reading, they reflect unheard acoustic features, independent of co-existing representations of the visible lip movements. This restoration emphasizes the unheard pitch signature in occipital cortex and the speech envelope in temporal cortex and is predictive of lip-reading performance. These findings suggest that when seeing the speaker's lips, the brain engages both visual and auditory pathways to support comprehension by exploiting multisensory correspondences between lip movements and spectro-temporal acoustic cues.
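The mutual-information logic used in this kind of analysis can be illustrated with a simple plug-in (histogram) estimator on simulated signals. The conditioning step below, which regresses out a correlated lip trace before re-estimating MI, is only a rough proxy for the framework used in the study, and all variables are hypothetical.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Plug-in mutual information estimate (in bits) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(4)
n = 20000
# Stand-ins: an occipital MEG time course that partially reflects an acoustic
# feature (e.g. the pitch contour of speech that is not physically presented)
# as well as the visible lip signal.
pitch = rng.standard_normal(n)
lip = 0.7 * pitch + 0.7 * rng.standard_normal(n)
meg = 0.5 * pitch + 0.5 * lip + rng.standard_normal(n)

# "Unique" information about pitch beyond the lip signal is approximated here by
# regressing the lip trace out of the pitch feature first (a crude proxy only).
resid = pitch - (np.dot(pitch, lip) / np.dot(lip, lip)) * lip
print(f"MI(MEG; pitch)              = {binned_mi(meg, pitch):.3f} bits")
print(f"MI(MEG; pitch residualized) = {binned_mi(meg, resid):.3f} bits")
```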
Affiliation(s)
- Felix Bröhl
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Bielefeld 33615, Germany
| | - Anne Keitel
- Psychology, University of Dundee, Dundee DD1 4HN, United Kingdom
| | - Christoph Kayser
- Department for Cognitive Neuroscience, Faculty of Biology, Bielefeld University, Bielefeld 33615, Germany
26
Holroyd CB. Interbrain synchrony: on wavy ground. Trends Neurosci 2022; 45:346-357. [PMID: 35236639 DOI: 10.1016/j.tins.2022.02.002] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2021] [Revised: 01/08/2022] [Accepted: 02/04/2022] [Indexed: 12/15/2022]
Abstract
In recent years the study of dynamic, between-brain coupling mechanisms has taken social neuroscience by storm. In particular, interbrain synchrony (IBS) is a putative neural mechanism said to promote social interactions by enabling the functional integration of multiple brains. In this article, I argue that this research is beset with three pervasive and interrelated problems. First, the field lacks a widely accepted definition of IBS. Second, IBS wants for theories that can guide the design and interpretation of experiments. Third, a potpourri of tasks and empirical methods permits undue flexibility when testing the hypothesis. These factors synergistically undermine IBS as a theoretical construct. I finish by recommending measures that can address these issues.
Affiliation(s)
- Clay B Holroyd
- Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, 9000 Gent, Belgium.
27
Woolnough O, Forseth KJ, Rollo PS, Roccaforte ZJ, Tandon N. Event-Related Phase Synchronization Propagates Rapidly across Human Ventral Visual Cortex. Neuroimage 2022; 256:119262. [PMID: 35504563 PMCID: PMC9382906 DOI: 10.1016/j.neuroimage.2022.119262] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Revised: 03/31/2022] [Accepted: 04/27/2022] [Indexed: 11/01/2022] Open
Abstract
Visual inputs to early visual cortex integrate with semantic, linguistic and memory inputs in higher visual cortex, in a manner that is rapid and accurate, and enables complex computations such as face recognition and word reading. This implies the existence of fundamental organizational principles that enable such efficiency. To elaborate on this, we performed intracranial recordings in 82 individuals while they performed tasks of varying visual and cognitive complexity. We discovered that visual inputs induce highly organized posterior-to-anterior propagating patterns of phase modulation across the ventral occipitotemporal cortex. At individual electrodes there was a stereotyped temporal pattern of phase progression following both stimulus onset and offset, consistent across trials and tasks. The phase of low frequency activity in anterior regions was predicted by the prior phase in posterior cortical regions. This spatiotemporal propagation of phase likely serves as a feed-forward organizational influence enabling the integration of information across the ventral visual stream. This phase modulation manifests as the early components of the event related potential; one of the most commonly used measures in human electrophysiology. These findings illuminate fundamental organizational principles of the higher order visual system that enable the rapid recognition and characterization of a variety of inputs.
Affiliation(s)
- Oscar Woolnough
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
| | - Kiefer J Forseth
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
| | - Patrick S Rollo
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
| | - Zachary J Roccaforte
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
| | - Nitin Tandon
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America; Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America; Memorial Hermann Hospital, Texas Medical Center, Houston, TX, 77030, United States of America.
28
Flaten E, Marshall SA, Dittrich A, Trainor L. Evidence for Top-down Meter Perception in Infancy as Shown by Primed Neural Responses to an Ambiguous Rhythm. Eur J Neurosci 2022; 55:2003-2023. [PMID: 35445451 DOI: 10.1111/ejn.15671] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Revised: 03/23/2022] [Accepted: 03/24/2022] [Indexed: 11/30/2022]
Abstract
From auditory rhythm patterns, listeners extract the underlying steady beat, and perceptually group beats to form meters. While previous studies show infants discriminate different auditory meters, it remains unknown whether they can maintain (imagine) a metrical interpretation of an ambiguous rhythm through top-down processes. We investigated this via electroencephalographic mismatch responses. We primed 6-month-old infants (N = 24) to hear a 6-beat ambiguous rhythm either in duple meter (n = 13), or in triple meter (n = 11) through loudness accents either on every second or every third beat. Periods of priming were inserted before sequences of the ambiguous unaccented rhythm. To elicit mismatch responses, occasional pitch deviants occurred on either beat 4 (strong beat in triple meter; weak in duple) or beat 5 (strong in duple; weak in triple) of the unaccented trials. At frontal left sites, we found a significant interaction between beat and priming group in the predicted direction. Post-hoc analyses showed mismatch response amplitudes were significantly larger for beat 5 in the duple- than triple-primed group (p = .047) and were non-significantly larger for beat 4 in the triple- than duple-primed group. Further, amplitudes were generally larger in infants with musically experienced parents. At frontal right sites, mismatch responses were generally larger for those in the duple compared to triple group, which may reflect a processing advantage for duple meter. These results indicate infants can impose a top-down, internally generated meter on ambiguous auditory rhythms, an ability that would aid early language and music learning.
Affiliation(s)
- Erica Flaten
- Department of Psychology, Neuroscience and Behaviour, McMaster University
| | - Sara A Marshall
- Department of Psychology, Neuroscience and Behaviour, McMaster University
| | - Angela Dittrich
- Department of Psychology, Neuroscience and Behaviour, McMaster University
| | - Laurel Trainor
- Department of Psychology, Neuroscience and Behaviour, McMaster University; McMaster Institute for Music and the Mind, McMaster University; Rotman Research Institute, Baycrest Hospital, Toronto, ON, Canada
29
Kershner JR. Multisensory deficits in dyslexia may result from a locus coeruleus attentional network dysfunction. Neuropsychologia 2021; 161:108023. [PMID: 34530025 DOI: 10.1016/j.neuropsychologia.2021.108023] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Revised: 08/06/2021] [Accepted: 09/11/2021] [Indexed: 12/13/2022]
Abstract
A fundamental educational requirement of beginning reading is to learn, access, and rapidly process associations between novel visuospatial symbols and their phonological representations in speech. Children with difficulties in such cross-modal integration are often divided into dyslexia subtypes, based on whether their primary problem is with the written or spoken component of decoding. The present review suggests that starting in infancy, perceptions of audiovisual speech are integrated by mutual oscillatory phase-resetting between sensory cortices, and throughout development visual and auditory experiences are coupled into unified perceptions. Entirely separate subtypes are incompatible with this view. Visual or auditory deficits will invariably affect processing to some degree in both domains. It is suggested that poor auditory/visual integration may be diagnostic for both forms of dyslexia, stemming from an encoding weakness in the early cross-sensory binding of audiovisual speech. The review presents a model of dyslexia as a dysfunction of the large-scale ventral and dorsal attention networks controlling such binding. Excessive glutamatergic neuronal excitability of the attention networks by the locus coeruleus-norepinephrine system may interfere with multisensory integration, with deleterious effects on the acquisition of reading by degrading grapheme/phoneme conversion.
Affiliation(s)
- John R Kershner
- Dept. of Applied Psychology and Human Resources University of Toronto, ON, M5S 1A1, Canada.
30
Kulkarni A, Kegler M, Reichenbach T. Effect of visual input on syllable parsing in a computational model of a neural microcircuit for speech processing. J Neural Eng 2021; 18. [PMID: 34547737 DOI: 10.1088/1741-2552/ac28d3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Accepted: 09/21/2021] [Indexed: 11/12/2022]
Abstract
Objective. Seeing a person talking can help us understand them, particularly in a noisy environment. However, how the brain integrates the visual information with the auditory signal to enhance speech comprehension remains poorly understood. Approach. Here we address this question in a computational model of a cortical microcircuit for speech processing. The model consists of an excitatory and an inhibitory neural population that together create oscillations in the theta frequency range. When stimulated with speech, the theta rhythm becomes entrained to the onsets of syllables, such that the onsets can be inferred from the network activity. We investigate how well the obtained syllable parsing performs when different types of visual stimuli are added. In particular, we consider currents related to the rate of syllables as well as currents related to the mouth-opening area of the talking faces. Main results. We find that currents that target the excitatory neuronal population can influence speech comprehension, either boosting or impeding it, depending on the temporal delay and on whether the currents are excitatory or inhibitory. In contrast, currents that act on the inhibitory neurons do not impact speech comprehension significantly. Significance. Our results suggest neural mechanisms for the integration of visual information with the acoustic information in speech and make experimentally testable predictions.
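The model class described here (coupled excitatory and inhibitory rate populations driven by syllable-like input, with an extra current standing in for visual information) can be sketched in a few lines. The toy Wilson-Cowan-style code below is not the authors' model; all parameters, time constants, and the simple read-out are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # firing-rate nonlinearity

def simulate(t_stop=4.0, dt=1e-3, visual_gain=0.0, seed=5):
    """Toy excitatory/inhibitory rate pair driven by syllable-like bursts."""
    rng = np.random.default_rng(seed)
    steps = int(t_stop / dt)
    t = np.arange(steps) * dt

    # Syllable-like input: bursts roughly every 250 ms (~4 Hz) with jitter.
    onsets = np.cumsum(0.25 + 0.05 * rng.standard_normal(40))
    speech = np.zeros(steps)
    for o in onsets[onsets < t_stop - 0.1]:
        speech += np.exp(-((t - o) ** 2) / (2 * 0.02 ** 2))
    visual = visual_gain * np.roll(speech, -int(0.1 / dt))   # visual cue leads by ~100 ms

    E, I = np.zeros(steps), np.zeros(steps)
    tauE, tauI = 0.02, 0.06
    for k in range(steps - 1):
        drive_e = 10.0 * E[k] - 10.0 * I[k] + 2.0 * speech[k] + visual[k] - 2.0
        drive_i = 12.0 * E[k] - 2.0 * I[k] - 3.5
        E[k + 1] = E[k] + dt / tauE * (-E[k] + sigmoid(drive_e))
        I[k + 1] = I[k] + dt / tauI * (-I[k] + sigmoid(drive_i))
    return t, E, speech

t, E_av, speech = simulate(visual_gain=1.0)
_, E_a, _ = simulate(visual_gain=0.0)

# Crude read-out of "parsing": how well network excitation tracks the input bursts.
print("corr(E, speech), audio only :", round(float(np.corrcoef(E_a, speech)[0, 1]), 3))
print("corr(E, speech), audiovisual:", round(float(np.corrcoef(E_av, speech)[0, 1]), 3))
```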
Affiliation(s)
- Anirudh Kulkarni
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, South Kensington Campus, SW7 2AZ London, United Kingdom
| | - Mikolaj Kegler
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, South Kensington Campus, SW7 2AZ London, United Kingdom
| | - Tobias Reichenbach
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, South Kensington Campus, SW7 2AZ London, United Kingdom; Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Konrad-Zuse-Strasse 3/5, Erlangen, 91056, Germany
31
Ramos-Escobar N, Segura E, Olivé G, Rodriguez-Fornells A, François C. Oscillatory activity and EEG phase synchrony of concurrent word segmentation and meaning-mapping in 9-year-old children. Dev Cogn Neurosci 2021; 51:101010. [PMID: 34461393 PMCID: PMC8403737 DOI: 10.1016/j.dcn.2021.101010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2020] [Revised: 08/25/2021] [Accepted: 08/26/2021] [Indexed: 10/28/2022] Open
Abstract
When learning a new language, one must segment words from continuous speech and associate them with meanings. These complex processes can be boosted by attentional mechanisms triggered by multi-sensory information. Previous electrophysiological studies suggest that brain oscillations are sensitive to different hierarchical complexity levels of the input, making them a plausible neural substrate for speech parsing. Here, we investigated the functional role of brain oscillations during concurrent speech segmentation and meaning acquisition in sixty 9-year-old children. We collected EEG data during an audio-visual statistical learning task during which children were exposed to a learning condition with consistent word-picture associations and a random condition with inconsistent word-picture associations before being tested on their ability to recall words and word-picture associations. We capitalized on the brain dynamics to align neural activity to the same rate as an external rhythmic stimulus to explore modulations of neural synchronization and phase synchronization between electrodes during multi-sensory word learning. Results showed enhanced power at both word- and syllabic-rate and increased EEG phase synchronization between frontal and occipital regions in the learning compared to the random condition. These findings suggest that multi-sensory cueing and attentional mechanisms play an essential role in children's successful word learning.
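Inter-site phase synchronization of the kind reported here is commonly quantified with a phase-locking value between channels after band-pass filtering around the rate of interest. The sketch below does this for simulated "frontal" and "occipital" EEG in a learning-like and a random-like condition; rates, bands, and the simulation are assumptions, not the study's pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

rng = np.random.default_rng(10)
fs = 250.0
n_trials, n_samples = 60, int(4 * fs)        # 4-s exposure epochs
t = np.arange(n_samples) / fs
word_rate = 1.33                             # Hz, e.g. trisyllabic words at a 4 Hz syllable rate

def make_channel(lag, locked):
    # Per-trial phase jitter destroys consistent locking in the "random" condition.
    jitter = 0.0 if locked else rng.uniform(-np.pi, np.pi, (n_trials, 1))
    return (np.cos(2 * np.pi * word_rate * t + lag + jitter)
            + rng.standard_normal((n_trials, n_samples)))

def inter_site_plv(x, y):
    sos = butter(3, (1.0, 2.0), btype="band", fs=fs, output="sos")   # band around word rate
    px = np.angle(hilbert(sosfiltfilt(sos, x, axis=1), axis=1))
    py = np.angle(hilbert(sosfiltfilt(sos, y, axis=1), axis=1))
    return np.abs(np.exp(1j * (px - py)).mean())                     # phase-locking value

frontal_learn, occip_learn = make_channel(0.0, True), make_channel(0.6, True)
frontal_rand, occip_rand = make_channel(0.0, False), make_channel(0.6, False)

print("fronto-occipital PLV, learning condition:", round(inter_site_plv(frontal_learn, occip_learn), 3))
print("fronto-occipital PLV, random condition:  ", round(inter_site_plv(frontal_rand, occip_rand), 3))
```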
Affiliation(s)
- Neus Ramos-Escobar
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
| | - Emma Segura
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
| | - Guillem Olivé
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain
| | - Antoni Rodriguez-Fornells
- Dept. of Cognition, Development and Educational Science, Institute of Neuroscience, University of Barcelona, L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, 08097, Spain; Catalan Institution for Research and Advanced Studies, ICREA, Barcelona, Spain.
32
Expertise Modulates Neural Stimulus-Tracking. eNeuro 2021; 8:ENEURO.0065-21.2021. [PMID: 34341067 PMCID: PMC8371925 DOI: 10.1523/eneuro.0065-21.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2021] [Revised: 06/14/2021] [Accepted: 06/16/2021] [Indexed: 11/21/2022] Open
Abstract
How does the brain anticipate information in language? When people perceive speech, low-frequency (<10 Hz) activity in the brain synchronizes with bursts of sound and visual motion. This phenomenon, called cortical stimulus-tracking, is thought to be one way that the brain predicts the timing of upcoming words, phrases, and syllables. In this study, we test whether stimulus-tracking depends on domain-general expertise or on language-specific prediction mechanisms. We go on to examine how the effects of expertise differ between frontal and sensory cortex. We recorded electroencephalography (EEG) from human participants who were experts in either sign language or ballet, and we compared stimulus-tracking between groups while participants watched videos of sign language or ballet. We measured stimulus-tracking by computing coherence between EEG recordings and visual motion in the videos. Results showed that stimulus-tracking depends on domain-general expertise, and not on language-specific prediction mechanisms. At frontal channels, fluent signers showed stronger coherence to sign language than to dance, whereas expert dancers showed stronger coherence to dance than to sign language. At occipital channels, however, the two groups of participants did not show different patterns of coherence. These results are difficult to explain by entrainment of endogenous oscillations, because neither sign language nor dance show any periodicity at the frequencies of significant expertise-dependent stimulus-tracking. These results suggest that the brain may rely on domain-general predictive mechanisms to optimize perception of temporally-predictable stimuli such as speech, sign language, and dance.
33
O'Sullivan AE, Crosse MJ, Liberto GMD, de Cheveigné A, Lalor EC. Neurophysiological Indices of Audiovisual Speech Processing Reveal a Hierarchy of Multisensory Integration Effects. J Neurosci 2021; 41:4991-5003. [PMID: 33824190 PMCID: PMC8197638 DOI: 10.1523/jneurosci.0906-20.2021] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2020] [Revised: 03/16/2021] [Accepted: 03/22/2021] [Indexed: 12/27/2022] Open
Abstract
Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here, we sought to provide insight on these questions by examining EEG responses in humans (males and females) to natural audiovisual (AV), audio, and visual speech in quiet and in noise. We represented our speech stimuli in terms of their spectrograms and their phonetic features and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis (CCA). The encoding of both spectrotemporal and phonetic features was shown to be more robust in AV speech responses than what would have been expected from the summation of the audio and visual speech responses, suggesting that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing. We also found evidence to suggest that the integration effects may change with listening conditions; however, this was an exploratory analysis and future work will be required to examine this effect using a within-subject design. These findings demonstrate that integration of audio and visual speech occurs at multiple stages along the speech processing hierarchy.SIGNIFICANCE STATEMENT During conversation, visual cues impact our perception of speech. Integration of auditory and visual speech is thought to occur at multiple stages of speech processing and vary flexibly depending on the listening conditions. Here, we examine audiovisual (AV) integration at two stages of speech processing using the speech spectrogram and a phonetic representation, and test how AV integration adapts to degraded listening conditions. We find significant integration at both of these stages regardless of listening conditions. These findings reveal neural indices of multisensory interactions at different stages of processing and provide support for the multistage integration framework.
Affiliation(s)
- Aisling E O'Sullivan
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
| | - Michael J Crosse
- X, The Moonshot Factory, Mountain View, CA and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
| | - Giovanni M Di Liberto
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
| | - Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
- University College London Ear Institute, University College London, London WC1X 8EE, United Kingdom
| | - Edmund C Lalor
- School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, New York 14627
34
Cummings AE, Wu YC, Ogiela DA. Phonological Underspecification: An Explanation for How a Rake Can Become Awake. Front Hum Neurosci 2021; 15:585817. [PMID: 33679342 PMCID: PMC7925882 DOI: 10.3389/fnhum.2021.585817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Accepted: 01/25/2021] [Indexed: 11/13/2022] Open
Abstract
Neural markers, such as the mismatch negativity (MMN), have been used to examine the phonological underspecification of English feature contrasts using the Featurally Underspecified Lexicon (FUL) model. However, neural indices have not been examined within the approximant phoneme class, even though there is evidence suggesting processing asymmetries between liquid (e.g., /ɹ/) and glide (e.g., /w/) phonemes. The goal of this study was to determine whether glide phonemes elicit electrophysiological asymmetries related to [consonantal] underspecification when contrasted with liquid phonemes in adult English speakers. Specifically, /ɹɑ/ is categorized as [+consonantal] while /wɑ/ is not specified [i.e., (-consonantal)]. Following the FUL framework, if /w/ is less specified than /ɹ/, the former phoneme should elicit a larger MMN response than the latter phoneme. Fifteen English-speaking adults were presented with two syllables, /ɹɑ/ and /wɑ/, in an event-related potential (ERP) oddball paradigm in which both syllables served as the standard and deviant stimulus in opposite stimulus sets. Three types of analyses were used: (1) traditional mean amplitude measurements; (2) cluster-based permutation analyses; and (3) event-related spectral perturbation (ERSP) analyses. The less specified /wɑ/ elicited a large MMN, while a much smaller MMN was elicited by the more specified /ɹɑ/. In the standard and deviant ERP waveforms, /wɑ/ elicited a significantly larger negative response than did /ɹɑ/. Theta activity elicited by /ɹɑ/ was significantly greater than that elicited by /wɑ/ in the 100-300 ms time window. Also, low gamma activation was significantly lower for /ɹɑ/ vs. /wɑ/ deviants over the left hemisphere, as compared to the right, in the 100-150 ms window. These outcomes suggest that the [consonantal] feature follows the underspecification predictions of FUL previously tested with the place of articulation and voicing features. Thus, this study provides new evidence for phonological underspecification. Moreover, as neural oscillation patterns have not previously been discussed in the underspecification literature, the ERSP analyses identified potential new indices of phonological underspecification.
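Cluster-based permutation testing of ERP differences, one of the analyses used here, can be sketched as follows: threshold pointwise t-values, sum them within contiguous clusters, and compare each observed cluster mass against a sign-flip null distribution. The implementation and the simulated /wɑ/ versus /ɹɑ/ data below are illustrative, not the authors' code.

```python
import numpy as np
from scipy import stats

def cluster_perm_test(a, b, n_perm=1000, alpha=0.05, seed=6):
    """1-D cluster-based permutation test for two paired ERP conditions.

    a, b: arrays of shape (n_subjects, n_times). Returns the observed clusters
    (as slices over time) with their permutation p-values.
    """
    rng = np.random.default_rng(seed)
    n_sub, _ = a.shape
    t_crit = stats.t.ppf(1 - alpha / 2, df=n_sub - 1)

    def clusters_and_mass(diff):
        t_vals = stats.ttest_1samp(diff, 0.0, axis=0).statistic
        above = np.abs(t_vals) > t_crit
        out, start = [], None
        for i, flag in enumerate(np.append(above, False)):
            if flag and start is None:
                start = i
            elif not flag and start is not None:
                out.append((slice(start, i), np.abs(t_vals[start:i]).sum()))
                start = None
        return out

    observed = clusters_and_mass(a - b)
    null_max = np.zeros(n_perm)
    for p in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n_sub, 1))     # sign-flip permutation
        perm = clusters_and_mass(signs * (a - b))
        null_max[p] = max((m for _, m in perm), default=0.0)
    return [(sl, (null_max >= m).mean()) for sl, m in observed]

# Illustrative data: a difference wave that is larger between 150 and 250 ms for
# one condition (a stand-in for the /wa/ vs /ra/ MMN contrast, 15 participants).
rng = np.random.default_rng(7)
times = np.arange(0, 500)                    # ms, 1 kHz sampling
erp_w = rng.standard_normal((15, times.size))
erp_r = rng.standard_normal((15, times.size))
erp_w[:, 150:250] -= 1.0                     # stronger (more negative) MMN for /wa/

for sl, pval in cluster_perm_test(erp_w, erp_r):
    print(f"cluster {times[sl][0]}-{times[sl][-1]} ms, p = {pval:.3f}")
```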
Affiliation(s)
- Alycia E. Cummings
- Department of Communication Sciences and Disorders, Idaho State University, Meridian, ID, United States
| | - Ying C. Wu
- Swartz Center for Computational Neuroscience, University of California, San Diego, San Diego, CA, United States
| | - Diane A. Ogiela
- Department of Communication Sciences and Disorders, Idaho State University, Meridian, ID, United States
35
Differential attention-dependent adjustment of frequency, power and phase in primary sensory and frontoparietal areas. Cortex 2021; 137:179-193. [PMID: 33636631 DOI: 10.1016/j.cortex.2021.01.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2020] [Revised: 10/13/2020] [Accepted: 01/22/2021] [Indexed: 11/23/2022]
Abstract
Continuously prioritizing behaviourally relevant information from the environment for improved stimulus processing is a crucial function of attention. In the current MEG study, we investigated how ongoing oscillatory activity of both sensory and non-sensory brain regions is differentially impacted by attentional focus. Low-frequency phase alignment of neural activity in primary sensory areas, with respect to attended/ignored features, has been suggested to support top-down prioritization. However, phase adjustment in frontoparietal regions has not been widely studied, despite the general implication of these regions in top-down selection of information. To investigate this, we let participants perform an established intermodal selective attention task, where low-frequency auditory (1.6 Hz) and visual (1.8 Hz) stimuli were presented simultaneously. We instructed them to either attend to the auditory or to the visual stimuli and to detect targets while ignoring the other stimulus stream. As expected, the strongest phase adjustment was observed in primary sensory regions for auditory and for visual stimulation, independent of attentional focus. We found greater differences in phase locking between attended and ignored stimulation for the visual modality. Interestingly, auditory temporal regions show small but significant attention-dependent neural entrainment even for visual stimulation. Extending findings from invasive recordings in non-human primates, we demonstrate an effect of attentional focus on the phase of the entrained oscillations in auditory and visual cortex, which may be driven by phase-locked increases of induced power. While sensory areas adjusted the phase of the respective stimulation frequencies, attentional focus adjusted the peak frequencies in nonsensory areas. Spatially, these areas show a striking overlap with core regions of the dorsal attention network and the frontoparietal network. This suggests that these areas prioritize the attended modality by optimally exploiting the temporal structure of stimulation. Overall, our study complements and extends previous work by showing a differential effect of attentional focus on entrained oscillations (or phase adjustment) in primary sensory areas and frontoparietal areas.
36
Delta/Theta band EEG activity shapes the rhythmic perceptual sampling of auditory scenes. Sci Rep 2021; 11:2370. [PMID: 33504860 PMCID: PMC7840678 DOI: 10.1038/s41598-021-82008-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Accepted: 01/13/2021] [Indexed: 11/08/2022] Open
Abstract
Many studies speak in favor of a rhythmic mode of listening, by which the encoding of acoustic information is structured by rhythmic neural processes at the time scale of about 1 to 4 Hz. Indeed, psychophysical data suggest that humans sample acoustic information in extended soundscapes not uniformly, but weigh the evidence at different moments for their perceptual decision at the time scale of about 2 Hz. We here test the critical prediction that such rhythmic perceptual sampling is directly related to the state of ongoing brain activity prior to the stimulus. Human participants judged the direction of frequency sweeps in 1.2 s long soundscapes while their EEG was recorded. We computed the perceptual weights attributed to different epochs within these soundscapes contingent on the phase or power of pre-stimulus EEG activity. This revealed a direct link between 4 Hz EEG phase and power prior to the stimulus and the phase of the rhythmic component of these perceptual weights. Hence, the temporal pattern by which the acoustic information is sampled over time for behavior is directly related to pre-stimulus brain activity in the delta/theta band. These results close a gap in the mechanistic picture linking ongoing delta band activity with its role in shaping the segmentation and perceptual influence of subsequent acoustic information.
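The perceptual-weighting analysis can be approximated with a reverse-correlation-style logistic regression: regress single-trial choices on the per-epoch evidence, separately for trials grouped by pre-stimulus phase. Everything in the sketch below (the observer model, epoch count, and phase bins) is a hypothetical stand-in, not the study's analysis code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n_trials, n_epochs = 2000, 6                 # 6 evidence epochs per 1.2-s soundscape

# Per-epoch sensory evidence for "upward" frequency sweeps (stand-in values).
evidence = rng.standard_normal((n_trials, n_epochs))

# Simulated observer that weighs epochs rhythmically (~2 Hz over 1.2 s) and whose
# weighting phase depends on a pre-stimulus 4 Hz EEG phase (illustrative only).
pre_phase = rng.uniform(-np.pi, np.pi, n_trials)
epoch_times = np.linspace(0, 1.2, n_epochs, endpoint=False)
weights = 1 + 0.5 * np.cos(2 * np.pi * 2 * epoch_times[None, :] - pre_phase[:, None])
choice = ((evidence * weights).sum(axis=1) + 0.5 * rng.standard_normal(n_trials)) > 0

# Recover perceptual weights separately for two pre-stimulus phase bins.
for name, mask in [("phase near 0 ", np.abs(pre_phase) < np.pi / 2),
                   ("phase near pi", np.abs(pre_phase) >= np.pi / 2)]:
    clf = LogisticRegression().fit(evidence[mask], choice[mask])
    print(name, "weights:", np.round(clf.coef_[0], 2))
```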
37
Beier EJ, Chantavarin S, Rehrig G, Ferreira F, Miller LM. Cortical Tracking of Speech: Toward Collaboration between the Fields of Signal and Sentence Processing. J Cogn Neurosci 2021; 33:574-593. [PMID: 33475452 DOI: 10.1162/jocn_a_01676] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
In recent years, a growing number of studies have used cortical tracking methods to investigate auditory language processing. Although most studies that employ cortical tracking stem from the field of auditory signal processing, this approach should also be of interest to psycholinguistics, particularly the subfield of sentence processing, given its potential to provide insight into dynamic language comprehension processes. However, there has been limited collaboration between these fields, which we suggest is partly because of differences in theoretical background and methodological constraints, some mutually exclusive. In this paper, we first review the theories and methodological constraints that have historically been prioritized in each field and provide concrete examples of how some of these constraints may be reconciled. We then elaborate on how further collaboration between the two fields could be mutually beneficial. Specifically, we argue that the use of cortical tracking methods may help resolve long-standing debates in the field of sentence processing that commonly used behavioral and neural measures (e.g., ERPs) have failed to adjudicate. Similarly, signal processing researchers who use cortical tracking may be able to reduce noise in the neural data and broaden the impact of their results by controlling for linguistic features of their stimuli and by using simple comprehension tasks. Overall, we argue that a balance between the methodological constraints of the two fields will lead to an overall improved understanding of language processing as well as greater clarity on what mechanisms cortical tracking of speech reflects. Increased collaboration will help resolve debates in both fields and will lead to new and exciting avenues for research.
38
Lalonde K, Werner LA. Development of the Mechanisms Underlying Audiovisual Speech Perception Benefit. Brain Sci 2021; 11:49. [PMID: 33466253 PMCID: PMC7824772 DOI: 10.3390/brainsci11010049] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2020] [Revised: 12/30/2020] [Accepted: 12/30/2020] [Indexed: 02/07/2023] Open
Abstract
The natural environments in which infants and children learn speech and language are noisy and multimodal. Adults rely on the multimodal nature of speech to compensate for noisy environments during speech communication. Multiple mechanisms underlie mature audiovisual benefit to speech perception, including reduced uncertainty as to when auditory speech will occur, use of correlations between the amplitude envelope of auditory and visual signals in fluent speech, and use of visual phonetic knowledge for lexical access. This paper reviews evidence regarding infants' and children's use of temporal and phonetic mechanisms in audiovisual speech perception benefit. The ability to use temporal cues for audiovisual speech perception benefit emerges in infancy. Although infants are sensitive to the correspondence between auditory and visual phonetic cues, the ability to use this correspondence for audiovisual benefit may not emerge until age four. A more cohesive account of the development of audiovisual speech perception may follow from a more thorough understanding of the development of sensitivity to and use of various temporal and phonetic cues.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE 68131, USA
| | - Lynne A. Werner
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98105, USA;
39
Auditory detection is modulated by theta phase of silent lip movements. CURRENT RESEARCH IN NEUROBIOLOGY 2021; 2:100014. [PMID: 36246505 PMCID: PMC9559921 DOI: 10.1016/j.crneur.2021.100014] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2020] [Revised: 05/12/2021] [Accepted: 05/19/2021] [Indexed: 11/23/2022] Open
Abstract
Audiovisual speech perception relies, among other things, on our expertise in mapping a speaker's lip movements onto speech sounds. This multimodal matching is facilitated by salient syllable features that align lip movements and acoustic envelope signals in the 4–8 Hz theta band. Although non-exclusive, the predominance of theta rhythms in speech processing has been firmly established by studies showing that neural oscillations track the acoustic envelope in the primary auditory cortex. Equivalently, theta oscillations in the visual cortex entrain to lip movements, and the auditory cortex is recruited during silent speech perception. These findings suggest that neuronal theta oscillations may play a functional role in organising information flow across visual and auditory sensory areas. We presented silent speech movies while participants performed a pure tone detection task to test whether entrainment to lip movements directs the auditory system and drives behavioural outcomes. We showed that auditory detection varied depending on the ongoing theta phase conveyed by lip movements in the movies. In a complementary experiment presenting the same movies while recording participants' electro-encephalogram (EEG), we found that silent lip movements entrained neural oscillations in the visual and auditory cortices, with the visual phase leading the auditory phase. These results support the idea that the visual cortex, entrained by lip movements, filtered the sensitivity of the auditory cortex via theta phase synchronization.
Highlights:
- Subjects entrain to visual activity conveyed by speakers' lip movements.
- Visual entrainment modulates auditory perception and performance.
- Silent lip perception recruits both visual and auditory cortices.
- Visual and auditory cortices synchronize via theta phase coupling.
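The behavioural part of such an analysis can be sketched by extracting theta phase from the lip-aperture trace with a band-pass filter and the Hilbert transform, then binning detection rates by the phase at probe onset. The simulated lip signal and detection model below are purely illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

rng = np.random.default_rng(9)
fs, dur = 100.0, 300.0                        # 5 minutes of "lip aperture" signal at 100 Hz
n = int(fs * dur)
t = np.arange(n) / fs

# Stand-in lip-aperture trace with ~4 Hz (theta) content.
lips = np.sin(2 * np.pi * 4.0 * t) + 0.5 * rng.standard_normal(n)

# Theta phase of the lip signal via band-pass filtering and the Hilbert transform.
b, a = butter(3, (4.0, 8.0), btype="band", fs=fs)
phase = np.angle(hilbert(filtfilt(b, a, lips)))

# Tone-in-noise probes presented at random times; detection is (by construction
# here) more likely at one theta phase, mimicking the reported modulation.
probe_idx = rng.integers(int(fs), n - int(fs), size=400)
p_hit = 0.5 + 0.2 * np.cos(phase[probe_idx])
hits = rng.random(400) < p_hit

# Bin hit rate by the lip-movement theta phase at probe onset.
edges = np.linspace(-np.pi, np.pi, 7)
for lo, hi in zip(edges[:-1], edges[1:]):
    sel = (phase[probe_idx] >= lo) & (phase[probe_idx] < hi)
    print(f"phase [{lo:+.2f}, {hi:+.2f}) rad: hit rate = {hits[sel].mean():.2f}")
```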
40
Mégevand P, Mercier MR, Groppe DM, Zion Golumbic E, Mesgarani N, Beauchamp MS, Schroeder CE, Mehta AD. Crossmodal Phase Reset and Evoked Responses Provide Complementary Mechanisms for the Influence of Visual Speech in Auditory Cortex. J Neurosci 2020; 40:8530-8542. [PMID: 33023923 PMCID: PMC7605423 DOI: 10.1523/jneurosci.0555-20.2020] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2020] [Revised: 07/27/2020] [Accepted: 08/31/2020] [Indexed: 12/26/2022] Open
Abstract
Natural conversation is multisensory: when we can see the speaker's face, visual speech cues improve our comprehension. The neuronal mechanisms underlying this phenomenon remain unclear. The two main alternatives are visually mediated phase modulation of neuronal oscillations (excitability fluctuations) in auditory neurons and visual input-evoked responses in auditory neurons. Investigating this question using naturalistic audiovisual speech with intracranial recordings in humans of both sexes, we find evidence for both mechanisms. Remarkably, auditory cortical neurons track the temporal dynamics of purely visual speech using the phase of their slow oscillations and phase-related modulations in broadband high-frequency activity. Consistent with known perceptual enhancement effects, the visual phase reset amplifies the cortical representation of concomitant auditory speech. In contrast to this, and in line with earlier reports, visual input reduces the amplitude of evoked responses to concomitant auditory input. We interpret the combination of improved phase tracking and reduced response amplitude as evidence for more efficient and reliable stimulus processing in the presence of congruent auditory and visual speech inputs.

SIGNIFICANCE STATEMENT: Watching the speaker can facilitate our understanding of what is being said. The mechanisms responsible for this influence of visual cues on the processing of speech remain incompletely understood. We studied these mechanisms by recording the electrical activity of the human brain through electrodes implanted surgically inside the brain. We found that visual inputs can operate by directly activating auditory cortical areas, and also indirectly by modulating the strength of cortical responses to auditory input. Our results help to understand the mechanisms by which the brain merges auditory and visual speech into a unitary perception.
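As a rough illustration of the phase-reset side of this analysis, the sketch below computes inter-trial phase coherence (ITC) on synthetic epochs; the sampling rate, band limits, and epoch layout are assumptions, not the study's parameters.

```python
# Minimal sketch: inter-trial phase coherence (ITC) quantifies how consistently
# the phase of a slow oscillation is reset across trials, e.g. time-locked to
# visual speech onsets. Data below are synthetic.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500.0
n_trials, n_samples = 60, int(1.0 * fs)       # 60 one-second epochs
rng = np.random.default_rng(0)
epochs = rng.standard_normal((n_trials, n_samples))
t = np.arange(n_samples) / fs
epochs += 0.8 * np.sin(2 * np.pi * 3.0 * t)   # phase-locked 3 Hz component

# Band-pass in a delta/theta band and take the analytic phase per trial ...
b, a = butter(4, [2 / (fs / 2), 7 / (fs / 2)], btype="band")
phase = np.angle(hilbert(filtfilt(b, a, epochs, axis=-1), axis=-1))

# ... then average the unit phase vectors across trials: ITC lies in [0, 1].
itc = np.abs(np.mean(np.exp(1j * phase), axis=0))
print(itc.max())                              # high values = consistent phase reset
```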
Affiliation(s)
- Pierre Mégevand
- Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
- Feinstein Institutes for Medical Research, Manhasset, New York 11030
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1211 Geneva, Switzerland
- Manuel R Mercier
- Department of Neurology, Montefiore Medical Center, Bronx, New York 10467
- Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- Institut de Neurosciences des Systèmes, Aix Marseille University, INSERM, 13005 Marseille, France
- David M Groppe
- Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
- Feinstein Institutes for Medical Research, Manhasset, New York 11030
- The Krembil Neuroscience Centre, University Health Network, Toronto, Ontario M5T 1M8, Canada
- Elana Zion Golumbic
- The Gonda Brain Research Center, Bar Ilan University, Ramat Gan 5290002, Israel
- Nima Mesgarani
- Department of Electrical Engineering, Columbia University, New York, New York 10027
- Michael S Beauchamp
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas 77030
- Charles E Schroeder
- Nathan S. Kline Institute, Orangeburg, New York 10962
- Department of Psychiatry, Columbia University, New York, New York 10032
- Ashesh D Mehta
- Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
- Feinstein Institutes for Medical Research, Manhasset, New York 11030
41
Dunbar TA, Gorman JC. Using Communication to Modulate Neural Synchronization in Teams. Front Hum Neurosci 2020; 14:332. [PMID: 33100984 PMCID: PMC7506512 DOI: 10.3389/fnhum.2020.00332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2020] [Accepted: 07/28/2020] [Indexed: 11/14/2022] Open
Abstract
Throughout training and team performance, teams may be assessed based on their communication patterns to identify which behaviors contributed to the team’s performance; however, this process of establishing meaning in communication is burdensome and time consuming despite the low monetary cost. A current topic in team research is developing covert measures, which are easier to analyze in real-time, to identify team processes as they occur during team performance; however, little is known about how overt and covert measures of team process relate to one another. In this study, we investigated the relationship between overt (communication) and covert (neural) measures of team process by manipulating the interaction partner (participant or experimenter) team members worked with and the type of task (decision-making or action-based) teams performed to assess their effects on team neural synchronization (measured as neurodynamic entropy) and communication (measured as both flow and content). The results indicated that the type of task affected how the teams structured their communication but had unpredictable effects on the neural synchronization of the team when averaged across the task session. The interaction partner did not affect team neural synchronization when averaged. However, there were significant relationships when communication and neural processes were examined over time between the neurodynamic entropy and the communication flow time series due to both the type of task and the interaction partner. Specifically, significant relationships across time were observed when participants were interacting with the other participant, during the second task trial, and across different regions of the cortex depending on the type of task being performed. The findings from the time series analyses suggest that factors that are previously known to affect communication (interaction partner and task type) also structure the relationship between team communication and neural synchronization—cross-level effects—but only when examined across time. Future research should consider these factors when developing new conceptualizations of team process measurement for measuring team performance over time.
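For readers unfamiliar with the moving-window approach, here is a purely illustrative sketch (not the authors' pipeline) of relating a sliding-window entropy series to a communication-flow series; the symbolized "team states", window length, and Poisson utterance counts are invented placeholders.

```python
# Illustrative sketch only: relate a "neurodynamic entropy" time series to a
# communication-flow time series. Both series are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(1)
team_states = rng.integers(0, 4, 3000)            # symbolized EEG "team states"
comm_flow = rng.poisson(2.0, 3000).astype(float)  # utterances per time step

def sliding_entropy(symbols, win=100, n_symbols=4):
    """Shannon entropy of symbol counts in consecutive non-overlapping windows."""
    out = []
    for start in range(0, symbols.size - win + 1, win):
        counts = np.bincount(symbols[start:start + win], minlength=n_symbols)
        p = counts / counts.sum()
        p = p[p > 0]
        out.append(-(p * np.log2(p)).sum())
    return np.array(out)

entropy_ts = sliding_entropy(team_states)
# Average communication flow on the same window grid and correlate the series.
flow_ts = comm_flow[: entropy_ts.size * 100].reshape(-1, 100).mean(axis=1)
r = np.corrcoef(entropy_ts, flow_ts)[0, 1]
print(round(r, 3))
```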
Affiliation(s)
- Terri A Dunbar
- Systems Psychology Laboratory, School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
- Jamie C Gorman
- Systems Psychology Laboratory, School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
42
Forseth KJ, Hickok G, Rollo PS, Tandon N. Language prediction mechanisms in human auditory cortex. Nat Commun 2020; 11:5240. [PMID: 33067457 PMCID: PMC7567874 DOI: 10.1038/s41467-020-19010-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2019] [Accepted: 08/12/2020] [Indexed: 01/31/2023] Open
Abstract
Spoken language, both perception and production, is thought to be facilitated by an ensemble of predictive mechanisms. We obtain intracranial recordings in 37 patients using depth probes implanted along the anteroposterior extent of the supratemporal plane during rhythm listening, speech perception, and speech production. These reveal two predictive mechanisms in early auditory cortex with distinct anatomical and functional characteristics. The first, localized to bilateral Heschl's gyri and indexed by low-frequency phase, predicts the timing of acoustic events. The second, localized to planum temporale only in language-dominant cortex and indexed by high-gamma power, shows a transient response to acoustic stimuli that is uniquely suppressed during speech production. Chronometric stimulation of Heschl's gyrus selectively disrupts speech perception, while stimulation of planum temporale selectively disrupts speech production. This work illuminates the fundamental acoustic infrastructure, both architecture and function, for spoken language, grounding cognitive models of speech perception and production in human neurobiology.
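The two neural signals contrasted here, low-frequency phase and high-gamma power, can be extracted along the lines sketched below; the channel, band edges, and filter settings are assumptions, and the data are synthetic rather than the study's recordings.

```python
# Rough sketch: extract low-frequency phase and high-gamma power from a single
# synthetic intracranial channel.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0
x = np.random.randn(int(10 * fs))             # 10 s of a synthetic channel

def band(sig, lo, hi):
    """Zero-phase band-pass filter between lo and hi Hz."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, sig)

low_phase = np.angle(hilbert(band(x, 1, 4)))               # delta-band phase
high_gamma_power = np.abs(hilbert(band(x, 70, 150))) ** 2  # broadband gamma power

# Either signal can then be related to event timing or task condition.
print(low_phase.shape, round(high_gamma_power.mean(), 3))
```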
Affiliation(s)
- K J Forseth
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, Houston, TX, USA
- G Hickok
- Department of Cognitive Sciences, University of California, Irvine, CA, USA
- P S Rollo
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, Houston, TX, USA
- N Tandon
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, Houston, TX, USA.
- Memorial Hermann Hospital, Texas Medical Center, Houston, TX, USA.
43
Li Y, Luo H, Tian X. Mental operations in rhythm: Motor-to-sensory transformation mediates imagined singing. PLoS Biol 2020; 18:e3000504. [PMID: 33017389 PMCID: PMC7561264 DOI: 10.1371/journal.pbio.3000504] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2019] [Revised: 10/15/2020] [Accepted: 09/01/2020] [Indexed: 11/21/2022] Open
Abstract
What enables the mental activities of thinking verbally or humming in our mind? We hypothesized that the interaction between motor and sensory systems induces speech and melodic mental representations, and this motor-to-sensory transformation forms the neural basis that enables our verbal thinking and covert singing. Analogous with the neural entrainment to auditory stimuli, participants imagined singing lyrics of well-known songs rhythmically while their neural electromagnetic signals were recorded using magnetoencephalography (MEG). We found that when participants imagined singing the same song in similar durations across trials, the delta frequency band (1–3 Hz, similar to the rhythm of the songs) showed more consistent phase coherence across trials. This neural phase tracking of imagined singing was observed in a frontal-parietal-temporal network: the proposed motor-to-sensory transformation pathway, including the inferior frontal gyrus (IFG), insula (INS), premotor area, intra-parietal sulcus (IPS), temporal-parietal junction (TPJ), primary auditory cortex (Heschl’s gyrus [HG]), and superior temporal gyrus (STG) and sulcus (STS). These results suggest that neural responses can entrain the rhythm of mental activity. Moreover, the theta-band (4–8 Hz) phase coherence was localized in the auditory cortices. The mu (9–12 Hz) and beta (17–20 Hz) bands were observed in the right-lateralized sensorimotor systems that were consistent with the singing context. The gamma band was broadly manifested in the observed network. The coherent and frequency-specific activations in the motor-to-sensory transformation network mediate the internal construction of perceptual representations and form the foundation of neural computations for mental operations. What enables our mental activities for thinking verbally or humming in our mind? Using an imagined singing paradigm with magnetoencephalography recordings, this study shows that neural oscillations in the motor-to-sensory transformation network tracked inner speech and covert singing.
Affiliation(s)
- Yanzhu Li
- New York University Shanghai, Shanghai, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- Huan Luo
- Peking University, Beijing, China
- Xing Tian
- New York University Shanghai, Shanghai, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
44
Boasen J, Giroux F, Duchesneau MO, Sénécal S, Léger PM, Ménard JF. High-fidelity vibrokinetic stimulation induces sustained changes in intercortical coherence during a cinematic experience. J Neural Eng 2020; 17:046046. [PMID: 32756020 DOI: 10.1088/1741-2552/abaca2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
OBJECTIVE: High-fidelity vibrokinetic (HFVK) technology is widely used to enhance the immersiveness of audiovisual (AV) entertainment experiences. However, despite evidence that HFVK technology does subjectively enhance AV immersion, the underlying mechanism has not been clarified. Neurophysiological studies could provide important evidence to illuminate this mechanism, thereby benefiting HFVK stimulus design and facilitating expansion of HFVK technology.

APPROACH: We conducted a between-subjects (VK, N = 11; Control, N = 9) exploratory study to measure the effect of HFVK stimulation through an HFVK seat on electroencephalographic cortical activity during an AV cinematic experience. Subjective appreciation of the experience was assessed and incorporated into statistical models exploring the effects of HFVK stimulation across cortical brain areas. We separately analyzed alpha-band (8-12 Hz) and theta-band (5-7 Hz) activities as indices of engagement and sensory processing, respectively. We also performed theta-band (5-7 Hz) coherence analyses using cortical seed areas identified from the theta activity analysis.

MAIN RESULTS: The right fusiform gyrus, inferotemporal gyrus, and supramarginal gyrus, known for emotion, AV-spatial, and vestibular processing, were identified as seeds from theta analyses. Coherence from these areas was uniformly enhanced in HFVK subjects in right motor areas, albeit predominantly in those who were appreciative. Meanwhile, compared to control subjects, HFVK subjects exhibited uniform interhemispheric decoherence with the left insula, which is important for self-processing.

SIGNIFICANCE: The results collectively point to sustained decoherence between sensory and self-processing as a possible mechanism for how HFVK increases immersion, and that coordination of emotional, spatial, and vestibular processing hubs with the motor system may be required for appreciation of the HFVK-enhanced experience. Overall, this study offers the first ever demonstration that HFVK stimulation has a real and sustained effect on brain activity during a cinematic experience.
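A minimal sketch of a theta-band coherence estimate between a seed channel and a target channel is given below; the sampling rate, segment length, and the synthetic "seed" and "motor" signals are assumptions rather than the study's actual source-level pipeline.

```python
# Minimal sketch: magnitude-squared coherence between a seed channel and a
# target channel, averaged over the 5-7 Hz theta band. Data are synthetic.
import numpy as np
from scipy.signal import coherence

fs = 256.0
rng = np.random.default_rng(2)
seed = rng.standard_normal(int(120 * fs))              # e.g. a fusiform "seed"
target = 0.4 * seed + rng.standard_normal(seed.size)   # e.g. a motor-area channel

# Welch-based coherence spectrum, then average within the theta band.
f, cxy = coherence(seed, target, fs=fs, nperseg=int(4 * fs))
theta = (f >= 5) & (f <= 7)
print(round(cxy[theta].mean(), 3))                     # theta-band coherence estimate
```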
Affiliation(s)
- J Boasen
- Tech3Lab, HEC Montréal, Montréal, Canada
- Faculty of Health Sciences, Hokkaido University, Sapporo, Japan
45
Vilà-Balló A, Marti-Marca A, Torres-Ferrús M, Alpuente A, Gallardo VJ, Pozo-Rosich P. Neurophysiological correlates of abnormal auditory processing in episodic migraine during the interictal period. Cephalalgia 2020; 41:45-57. [PMID: 32838536 DOI: 10.1177/0333102420951509] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
BACKGROUND: The characteristics of the hypersensitivity to auditory stimuli during the interictal period in episodic migraine remain a matter of discussion. The combined use of event-related potentials, time-frequency power and phase-synchronization can provide relevant information about the time-course of sensory-attentional processing in migraine and its underlying mechanisms.

OBJECTIVE: The aim of this nested case-control study was to examine these processes in young, female, episodic migraine patients interictally and compare them to controls using an active auditory oddball task.

METHOD: Using 20 channels, we recorded the electrophysiological brain activity of 21 women with episodic migraine without aura and 21 healthy matched controls without a family history of migraine during a novelty oddball paradigm. We collected sociodemographic and clinical data as well as scores related to disability, quality of life, anxiety and depression. We calculated behavioural measures including reaction times, hit rates and false alarms. Spectral power and phase-synchronization of oscillatory activity as well as event-related potentials were obtained for standard stimuli. For target and novel stimuli, event-related potentials were acquired.

RESULTS: There were no significant differences at the behavioural level. In migraine patients, we found increased phase-synchronization in the theta frequency range and a higher N1 response to standard trials. No differences were observed in spectral power. No evidence for a lack of habituation in any of the measures was seen between migraine patients and controls. The Reorienting Negativity was reduced in migraine patients as compared to controls on novel but not on target trials.

CONCLUSION: Our findings suggest that migraine patients process stimuli as more salient, seem to allocate more of their attentional resources to their surrounding environment, and have less available resources to reorient attention back to the main task.
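The ERP side of such an analysis can be illustrated with a small sketch: baseline-corrected averaging over standard trials and an N1 read-out in a fixed latency window. The epoch layout, latency window, and synthetic data are assumptions, not the study's pipeline.

```python
# Sketch: average event-related potential to standard tones and read off an N1
# amplitude in a fixed post-stimulus window. Data are synthetic.
import numpy as np

fs = 500.0
rng = np.random.default_rng(3)
n_trials = 200
times = np.arange(-0.1, 0.5, 1 / fs)                  # epoch from -100 to 500 ms
epochs = rng.standard_normal((n_trials, times.size)) * 5.0          # "microvolts"
epochs += -4.0 * np.exp(-((times - 0.1) ** 2) / (2 * 0.02 ** 2))    # fake N1 at ~100 ms

# Baseline-correct each trial using the pre-stimulus interval, then average.
baseline = epochs[:, times < 0].mean(axis=1, keepdims=True)
erp = (epochs - baseline).mean(axis=0)

# N1 amplitude: most negative value between 80 and 150 ms post-stimulus.
win = (times >= 0.08) & (times <= 0.15)
print(round(erp[win].min(), 2), "uV")
```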
Affiliation(s)
- Adrià Vilà-Balló
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Angela Marti-Marca
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Marta Torres-Ferrús
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Headache Unit, Department of Neurology, Hospital Universitari Vall d'Hebron, Barcelona, Spain
- Alicia Alpuente
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Headache Unit, Department of Neurology, Hospital Universitari Vall d'Hebron, Barcelona, Spain
- Victor José Gallardo
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Patricia Pozo-Rosich
- Headache and Neurological Pain Research Group, Vall d'Hebron Research Institute, Department of Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Headache Unit, Department of Neurology, Hospital Universitari Vall d'Hebron, Barcelona, Spain
46
Fletcher MD, Song H, Perry SW. Electro-haptic stimulation enhances speech recognition in spatially separated noise for cochlear implant users. Sci Rep 2020; 10:12723. [PMID: 32728109 PMCID: PMC7391652 DOI: 10.1038/s41598-020-69697-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Accepted: 07/14/2020] [Indexed: 11/10/2022] Open
Abstract
Hundreds of thousands of profoundly hearing-impaired people perceive sounds through electrical stimulation of the auditory nerve using a cochlear implant (CI). However, CI users are often poor at understanding speech in noisy environments and separating sounds that come from different locations. We provided missing speech and spatial hearing cues through haptic stimulation to augment the electrical CI signal. After just 30 min of training, we found this “electro-haptic” stimulation substantially improved speech recognition in multi-talker noise when the speech and noise came from different locations. Our haptic stimulus was delivered to the wrists at an intensity that can be produced by a compact, low-cost, wearable device. These findings represent a significant step towards the production of a non-invasive neuroprosthetic that can improve CI users’ ability to understand speech in realistic noisy environments.
Affiliation(s)
- Mark D Fletcher
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK.
- Haoheng Song
- Faculty of Engineering and Physical Sciences, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Samuel W Perry
- University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
47
Plass J, Brang D, Suzuki S, Grabowecky M. Vision perceptually restores auditory spectral dynamics in speech. Proc Natl Acad Sci U S A 2020; 117:16920-16927. [PMID: 32632010 PMCID: PMC7382243 DOI: 10.1073/pnas.2002887117] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time-frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
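The core predictive step, mapping time-varying mouth-shape features onto formant trajectories, can be sketched with regularized linear regression; the three mouth features, the ridge penalty, and the synthetic F2 track are illustrative assumptions, not the authors' model.

```python
# Hedged sketch: predict a formant (e.g. F2) trajectory from time-varying
# mouth-shape features with closed-form ridge regression. Data are synthetic.
import numpy as np

rng = np.random.default_rng(4)
n_frames = 2000
mouth = rng.standard_normal((n_frames, 3))       # e.g. lip width, height, area
true_w = np.array([300.0, -150.0, 80.0])
f2 = 1500.0 + mouth @ true_w + 50.0 * rng.standard_normal(n_frames)

# Closed-form ridge regression: w = (X^T X + lambda I)^(-1) X^T y
X = np.column_stack([mouth, np.ones(n_frames)])  # add an intercept column
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ f2)

pred = X @ w
r = np.corrcoef(pred, f2)[0, 1]
print(round(r, 2))                               # prediction accuracy (correlation)
```

In practice, prediction accuracy would be evaluated on held-out frames rather than on the training data as done in this toy example.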
Affiliation(s)
- John Plass
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109;
- Department of Psychology, Northwestern University, Evanston, IL 60208
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
- Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
48
Genuine cross-frequency coupling networks in human resting-state electrophysiological recordings. PLoS Biol 2020; 18:e3000685. [PMID: 32374723 PMCID: PMC7233600 DOI: 10.1371/journal.pbio.3000685] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2019] [Revised: 05/18/2020] [Accepted: 04/02/2020] [Indexed: 12/28/2022] Open
Abstract
Phase synchronization of neuronal oscillations in specific frequency bands coordinates anatomically distributed neuronal processing and communication. Typically, oscillations and synchronization take place concurrently in many distinct frequencies, which serve separate computational roles in cognitive functions. While within-frequency phase synchronization has been studied extensively, less is known about the mechanisms that govern neuronal processing distributed across frequencies and brain regions. Such integration of processing between frequencies could be achieved via cross-frequency coupling (CFC), either by phase–amplitude coupling (PAC) or by n:m-cross–frequency phase synchrony (CFS). So far, studies have mostly focused on local CFC in individual brain regions, whereas the presence and functional organization of CFC between brain areas have remained largely unknown. We posit that interareal CFC may be essential for large-scale coordination of neuronal activity and investigate here whether genuine CFC networks are present in human resting-state (RS) brain activity. To assess the functional organization of CFC networks, we identified brain-wide CFC networks at mesoscale resolution from stereoelectroencephalography (SEEG) and at macroscale resolution from source-reconstructed magnetoencephalography (MEG) data. We developed a novel, to our knowledge, graph-theoretical method to distinguish genuine CFC from spurious CFC that may arise from nonsinusoidal signals ubiquitous in neuronal activity. We show that genuine interareal CFC is present in human RS activity in both SEEG and MEG data. Both CFS and PAC networks coupled theta and alpha oscillations with higher frequencies in large-scale networks connecting anterior and posterior brain regions. CFS and PAC networks had distinct spectral patterns and opposing distribution of low- and high-frequency network hubs, implying that they constitute distinct CFC mechanisms. The strength of CFS networks was also predictive of cognitive performance in a separate neuropsychological assessment. In conclusion, these results provide evidence for interareal CFS and PAC being 2 distinct mechanisms for coupling oscillations across frequencies in large-scale brain networks. Genuine interareal cross-frequency coupling (CFC) can be identified from human resting state activity using magnetoencephalography, stereoelectroencephalography, and novel network approaches. CFC couples slow theta and alpha oscillations to faster oscillations across brain regions.
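A minimal sketch of one of the two coupling measures, phase-amplitude coupling, is shown below using a mean-vector-length estimate on a synthetic signal; the band edges and normalization are assumptions, and a genuine analysis would additionally control for nonsinusoidal waveforms, as the paper describes.

```python
# Minimal sketch: phase-amplitude coupling (PAC) between a slow oscillation's
# phase and a fast oscillation's amplitude, via a mean-vector-length measure.
# The signal is synthetic and coupled by construction.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0
t = np.arange(0, 30, 1 / fs)
theta = np.sin(2 * np.pi * 6.0 * t)
gamma = (1 + 0.7 * theta) * np.sin(2 * np.pi * 80.0 * t)   # gamma amplitude follows theta phase
x = theta + gamma + 0.5 * np.random.randn(t.size)

def band(sig, lo, hi):
    """Zero-phase band-pass filter between lo and hi Hz."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, sig)

phase = np.angle(hilbert(band(x, 4, 8)))          # slow-band phase
amp = np.abs(hilbert(band(x, 60, 100)))           # fast-band amplitude envelope
pac = np.abs(np.mean(amp * np.exp(1j * phase))) / amp.mean()
print(round(pac, 3))                              # ~0 if no coupling is present
```

Cross-frequency phase synchrony (CFS), the other measure discussed above, would instead compare n:m-scaled phases of the two bands rather than phase and amplitude.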
49
Carmona L, Diez PF, Laciar E, Mut V. Multisensory Stimulation and EEG Recording Below the Hair-Line: A New Paradigm on Brain Computer Interfaces. IEEE Trans Neural Syst Rehabil Eng 2020; 28:825-831. [PMID: 32149649 DOI: 10.1109/tnsre.2020.2979684] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The aim was to test the feasibility of implementing multisensory (auditory and visual) stimulation in combination with electrodes placed on non-hair positions to design more efficient and comfortable brain-computer interfaces (BCIs). Fifteen volunteers participated in the experiments. They were stimulated by visual, auditory and multisensory stimuli set at 37, 38, 39 and 40 Hz and at different phases (0°, 90°, 180° and 270°). The electroencephalogram (EEG) was measured from Oz, T7, T8, Tp9 and Tp10 positions. To evaluate the amplitude of the visual and auditory evoked potentials, the signal-to-noise ratio (SNR) was used, and the accuracy of detection was calculated using canonical correlation analysis. Additionally, the volunteers were asked about the discomfort of each kind of stimulus. Multisensory stimulation attained a higher SNR on every electrode. Non-hair (Tp9 and Tp10) positions attained SNR and accuracy similar to those obtained from occipital positions during visual stimulation. No significant difference was found in the discomfort produced by each kind of stimulation. The results demonstrated that multisensory stimulation can help in obtaining high-amplitude steady-state evoked responses with a similar discomfort level. It is therefore possible to design a more efficient and comfortable hybrid BCI based on multisensory stimulation and electrodes on non-hair positions. The current article proposes a new paradigm for hybrid BCIs based on steady-state evoked potentials measured from the area behind the ears and elicited by multisensory stimulation, thus allowing subjects to achieve performance similar to that of a visual-occipital BCI while measuring the EEG at a more comfortable electrode location.
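The CCA-based detection step can be sketched as follows: canonical correlation between a multichannel EEG segment and sine/cosine reference sets at each candidate frequency, picking the frequency with the highest correlation. The synthetic EEG, the two-channel montage, and the single-harmonic reference set are assumptions, not the study's exact setup.

```python
# Sketch: steady-state response detection with canonical correlation analysis
# (CCA). EEG here is synthetic; real use would substitute the Oz/Tp9/Tp10 data.
import numpy as np
from sklearn.cross_decomposition import CCA

fs = 256.0
t = np.arange(0, 4, 1 / fs)                      # a 4-second stimulation segment
eeg = np.column_stack([
    np.sin(2 * np.pi * 39.0 * t + 0.3),          # the "true" 39 Hz response
    np.random.randn(t.size),
]) + 0.5 * np.random.randn(t.size, 2)

def reference(freq):
    """Sine/cosine reference set at the stimulation frequency and its 2nd harmonic."""
    return np.column_stack([
        np.sin(2 * np.pi * freq * t), np.cos(2 * np.pi * freq * t),
        np.sin(4 * np.pi * freq * t), np.cos(4 * np.pi * freq * t),
    ])

scores = {}
for freq in (37, 38, 39, 40):
    xs, ys = CCA(n_components=1).fit_transform(eeg, reference(freq))
    scores[freq] = np.corrcoef(xs[:, 0], ys[:, 0])[0, 1]
print(max(scores, key=scores.get), scores)       # detected stimulation frequency
```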
50
Zhou H, Hua L, Jiang H, Dai Z, Han Y, Lin P, Wang H, Lu Q, Yao Z. Autonomic Nervous System Is Related to Inhibitory and Control Function Through Functional Inter-Region Connectivities of OFC in Major Depression. Neuropsychiatr Dis Treat 2020; 16:235-247. [PMID: 32021217 PMCID: PMC6982460 DOI: 10.2147/ndt.s238044] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/10/2019] [Accepted: 12/30/2019] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE: To investigate the mechanism of interactions between the autonomic nervous system (ANS) and cognitive function in major depression (MD) using magnetoencephalography (MEG) measurements.

METHODS: Participants with MD (n = 20) and healthy controls (HCs, n = 18) completed MEG measurements while performing a go/no-go task. Heart rate variability (HRV) indices (SDANN and RMSSD) were derived from the raw MEG data. Correlations between HRV and functional connectivity in different brain regions were assessed with Pearson's r in both groups.

RESULTS: Go/no-go task performance was better in HCs than in MD patients, and HRV indices were lower in the MD group. In the no-go condition, an MEG functional connectivity analysis seeded in the orbitofrontal cortex (OFC) showed increased inter-region connectivity of the OFC in the MD group. HRV indices correlated with different OFC inter-region connectivity networks in the two groups.

CONCLUSION: The ANS is related to inhibitory and control function through inter-region connectivity networks of the OFC in MD. These findings have important implications for understanding the pathophysiology of MD, and MEG may provide an image-guided tool for interventions.
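The two HRV indices named above have standard definitions that a short sketch can make concrete; the R-R interval series below is synthetic, and the 5-minute block length for SDANN follows the usual convention rather than anything stated in the abstract.

```python
# Minimal sketch: RMSSD and SDANN computed from a series of R-R (NN) intervals
# in milliseconds. The interval series is synthetic.
import numpy as np

rng = np.random.default_rng(5)
rr_ms = 800 + 40 * rng.standard_normal(2500)     # synthetic R-R intervals (~33 min)

# RMSSD: root mean square of successive R-R differences (short-term variability).
rmssd = np.sqrt(np.mean(np.diff(rr_ms) ** 2))

# SDANN: standard deviation of mean R-R computed over successive 5-minute blocks.
t_ms = np.cumsum(rr_ms)
block = (t_ms // (5 * 60 * 1000)).astype(int)    # 5-minute block index per beat
block_means = [rr_ms[block == b].mean() for b in np.unique(block)]
sdann = np.std(block_means)

print(round(rmssd, 1), round(sdann, 1))
```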
Affiliation(s)
- Hongliang Zhou
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Lingling Hua
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Haiteng Jiang
- School of Biological Sciences & Medical Engineering, Southeast University, Nanjing 210096, People’s Republic of China
- Zongpeng Dai
- School of Biological Sciences & Medical Engineering, Southeast University, Nanjing 210096, People’s Republic of China
- Yinglin Han
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Pinhua Lin
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Haofei Wang
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Qing Lu
- School of Biological Sciences & Medical Engineering, Southeast University, Nanjing 210096, People’s Republic of China
- Child Development and Learning Science, Key Laboratory of Ministry of Education, Nanjing 210096, People’s Republic of China
- Zhijian Yao
- Department of Psychiatry, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing 210029, People’s Republic of China
- Nanjing Brain Hospital, Medical School of Nanjing University, Nanjing 210093, People’s Republic of China