1
Berthault E, Chen S, Falk S, Morillon B, Schön D. Auditory and motor priming of metric structure improves understanding of degraded speech. Cognition 2024; 248:105793. PMID: 38636164; DOI: 10.1016/j.cognition.2024.105793.
Abstract
Speech comprehension is enhanced when preceded (or accompanied) by a congruent rhythmic prime reflecting the metrical sentence structure. Although these phenomena have been described for auditory and motor primes separately, their respective and synergistic contributions have not been addressed. In this experiment, participants performed a speech comprehension task on degraded speech signals that were preceded by a rhythmic prime that could be auditory, motor, or audiomotor. Both auditory and audiomotor rhythmic primes facilitated speech comprehension speed. While the presence of a purely motor prime (unpaced tapping) did not globally benefit speech comprehension, comprehension accuracy scaled with the regularity of motor tapping. To investigate inter-individual variability, participants also performed a Spontaneous Speech Synchronization test. The strength of the estimated perception-production coupling correlated positively with overall speech comprehension scores. These findings are discussed in the framework of the dynamic attending and active sensing theories.
Affiliation(s)
- Emma Berthault: Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France.
- Sophie Chen: Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France.
- Simone Falk: Department of Linguistics and Translation, University of Montreal, Canada; International Laboratory for Brain, Music and Sound Research, Montreal, Canada.
- Benjamin Morillon: Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France.
- Daniele Schön: Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France.
2
Perron M, Liu Q, Tremblay P, Alain C. Enhancing speech perception in noise through articulation. Ann N Y Acad Sci 2024. PMID: 38924165; DOI: 10.1111/nyas.15179.
Abstract
Considerable debate exists about the interplay between auditory and motor speech systems. Some argue for common neural mechanisms, whereas others assert that there are few shared resources. In four experiments, we tested the hypothesis that priming the speech motor system by repeating syllable pairs aloud improves subsequent syllable discrimination in noise compared with a priming discrimination task involving same-different judgments via button presses. Our results consistently showed that participants who engaged in syllable repetition performed better in syllable discrimination in noise than those who engaged in the priming discrimination task. This gain in accuracy was observed for primed and new syllable pairs, highlighting increased sensitivity to phonological details. The benefits were comparable whether the priming tasks involved auditory or visual presentation. When a 1-h delay was inserted between the priming tasks and the syllable-in-noise task, the benefits persisted but were confined to primed syllable pairs. Finally, we demonstrated the effectiveness of this approach in older adults. Our findings substantiate the existence of a speech production-perception relationship. They also have clinical relevance, as they raise the possibility of production-based interventions to improve speech perception ability. This would be particularly relevant for older adults, who often encounter difficulties in perceiving speech in noise.
Affiliation(s)
- Maxime Perron: Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Baycrest Academy for Research and Education, Rotman Research Institute, North York, Ontario, Canada.
- Qiying Liu: Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Baycrest Academy for Research and Education, Rotman Research Institute, North York, Ontario, Canada.
- Pascale Tremblay: CERVO Brain Research Center, Quebec City, Quebec, Canada; École de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Quebec, Canada.
- Claude Alain: Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Baycrest Academy for Research and Education, Rotman Research Institute, North York, Ontario, Canada; Institute of Medical Sciences, University of Toronto, Toronto, Ontario, Canada; Music and Health Science Research Collaboratory, University of Toronto, Toronto, Ontario, Canada.
3
Zou T, Li L, Huang X, Deng C, Wang X, Gao Q, Chen H, Li R. Dynamic causal modeling analysis reveals the modulation of motor cortex and integration in superior temporal gyrus during multisensory speech perception. Cogn Neurodyn 2024; 18:931-946. PMID: 38826672; PMCID: PMC11143173; DOI: 10.1007/s11571-023-09945-z.
Abstract
The processing of speech information from various sensory modalities is crucial for human communication. Both the left posterior superior temporal gyrus (pSTG) and the motor cortex are importantly involved in multisensory speech perception. However, how primary sensory regions dynamically integrate with the pSTG and the motor cortex remains unclear. Here, we implemented a behavioral experiment based on the classical McGurk effect paradigm and acquired task functional magnetic resonance imaging (fMRI) data during synchronized audiovisual syllabic perception from 63 normal adults. We conducted dynamic causal modeling (DCM) analysis to explore the cross-modal interactions among the left pSTG, left precentral gyrus (PrG), left middle superior temporal gyrus (mSTG), and left fusiform gyrus (FuG). Bayesian model selection favored a winning model that included modulations of connections to PrG (mSTG → PrG, FuG → PrG), from PrG (PrG → mSTG, PrG → FuG), and to pSTG (mSTG → pSTG, FuG → pSTG). Moreover, the coupling strength of these connections correlated with behavioral McGurk susceptibility. In addition, significant differences were found in the coupling strength of these connections between strong and weak McGurk perceivers. Strong perceivers modulated less inhibitory visual influence, allowed less excitatory auditory information to flow into PrG, but integrated more audiovisual information in pSTG. Taken together, our findings show that the PrG and pSTG interact dynamically with primary cortices during audiovisual speech perception, and support the view that the motor cortex plays a specific functional role in modulating the gain and salience between auditory and visual modalities.
Affiliation(s)
- Ting Zou, Liyuan Li, Xinju Huang, Chijun Deng, Xuyang Wang, Qing Gao, Huafu Chen, Rong Li: The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, People’s Republic of China.
4
Rizzi R, Bidelman GM. Functional benefits of continuous vs. categorical listening strategies on the neural encoding and perception of noise-degraded speech. bioRxiv [Preprint] 2024:2024.05.15.594387. PMID: 38798410; PMCID: PMC11118460; DOI: 10.1101/2024.05.15.594387.
Abstract
Acoustic information in speech changes continuously, yet listeners form discrete perceptual categories to ease the demands of perception. Being a more continuous/gradient as opposed to a discrete/categorical listener may be further advantageous for understanding speech in noise by increasing perceptual flexibility and resolving ambiguity. The degree to which a listener's responses to a continuum of speech sounds are categorical versus continuous can be quantified using visual analog scaling (VAS) during speech labeling tasks. Here, we recorded event-related brain potentials (ERPs) to vowels along an acoustic-phonetic continuum (/u/ to /a/) while listeners categorized phonemes in both clean and noise conditions. Behavior was assessed using standard two-alternative forced choice (2AFC) and VAS paradigms to evaluate categorization under task structures that promote discrete (2AFC) vs. continuous (VAS) hearing, respectively. Behaviorally, identification curves were steeper under 2AFC vs. VAS categorization but were relatively immune to noise, suggesting robust access to abstract phonetic categories even under signal degradation. Behavioral slopes were positively correlated with listeners' QuickSIN scores, suggesting a speech-in-noise comprehension advantage conferred by a gradient listening strategy. At the neural level, electrode-level data revealed that P2 peak amplitudes of the ERPs were modulated by task and noise; responses were larger under VAS vs. 2AFC categorization and showed a larger noise-related delay in latency in the VAS vs. 2AFC condition. More gradient responders also had smaller shifts in ERP latency with noise, suggesting their neural encoding of speech was more resilient to noise degradation. Interestingly, source-resolved ERPs showed that more gradient listening was also correlated with stronger neural responses in left superior temporal gyrus. Our results demonstrate that listening strategy (i.e., being a discrete vs. continuous listener) modulates the categorical organization of speech and behavioral success, with continuous/gradient listening being more advantageous to speech-in-noise perception.
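The identification-curve slope discussed here comes from fitting a psychometric function to labeling responses along the continuum; steeper slopes indicate more categorical (discrete) listening. Below is a minimal Python sketch of that quantification, not the authors' code: the response proportions are invented for illustration, and a two-parameter logistic is only one common choice of psychometric function.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Two-parameter logistic psychometric function: P(/a/ response)."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

# Hypothetical 7-step /u/-/a/ continuum and proportion of /a/ responses
steps = np.arange(1, 8)
p_a_2afc = np.array([0.02, 0.05, 0.15, 0.55, 0.90, 0.97, 0.99])  # steep (categorical)
p_a_vas = np.array([0.10, 0.20, 0.35, 0.50, 0.68, 0.82, 0.90])   # shallow (gradient)

for label, p in [("2AFC", p_a_2afc), ("VAS", p_a_vas)]:
    (x0, k), _ = curve_fit(logistic, steps, p, p0=[4.0, 1.0])
    print(f"{label}: midpoint = {x0:.2f}, slope = {k:.2f}")
```

With these made-up data, the 2AFC fit yields a much larger slope than the VAS fit, which is the pattern the abstract reports.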
5
Tseng HC, Hsieh IH. Effects of absolute pitch on brain activation and functional connectivity during hearing-in-noise perception. Cortex 2024; 174:1-18. PMID: 38484435; DOI: 10.1016/j.cortex.2024.02.011.
Abstract
Hearing-in-noise (HIN) ability is crucial in speech and music communication. Recent evidence suggests that absolute pitch (AP), the ability to identify isolated musical notes, is associated with HIN benefits. A theoretical account postulates a link between AP ability and neural network indices of segregation. However, how AP ability modulates the brain activation and functional connectivity underlying HIN perception remains unclear. Here we used functional magnetic resonance imaging to contrast brain responses among a sample (n = 45) comprising 15 AP musicians, 15 non-AP musicians, and 15 non-musicians in perceiving Mandarin speech and melody targets under varying signal-to-noise ratios (SNRs: No-Noise, 0, -9 dB). Results reveal that AP musicians exhibited increased activation in auditory and superior frontal regions across both HIN domains (music and speech), irrespective of noise levels. Notably, substantially higher sensorimotor activation was found in AP musicians when the target was music compared to speech. Furthermore, we examined AP effects on neural connectivity using psychophysiological interaction analysis with the auditory cortex as the seed region. AP musicians showed decreased functional connectivity with the sensorimotor and middle frontal gyrus compared to non-AP musicians. Crucially, AP differentially affected connectivity with parietal and frontal brain regions depending on whether the HIN domain was music or speech. These findings suggest that AP plays a critical role in HIN perception, manifested by increased activation and functional independence between auditory and sensorimotor regions for perceiving music and speech streams.
Affiliation(s)
- Hung-Chen Tseng: Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan.
- I-Hui Hsieh: Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan; Cognitive Intelligence and Precision Healthcare Center, National Central University, Taoyuan City, Taiwan.
6
Li Z, Zhang D. How does the human brain process noisy speech in real life? Insights from the second-person neuroscience perspective. Cogn Neurodyn 2024; 18:371-382. PMID: 38699619; PMCID: PMC11061069; DOI: 10.1007/s11571-022-09924-w.
Abstract
Comprehending speech in the presence of background noise is of great importance for human life. In the past decades, a large body of psychological, cognitive, and neuroscientific research has explored the neurocognitive mechanisms of speech-in-noise comprehension. However, limited by the low ecological validity of the speech stimuli and the experimental paradigms, as well as inadequate attention to higher-order linguistic and extralinguistic processes, much remains unknown about how the brain processes noisy speech in real-life scenarios. A recently emerging approach, the second-person neuroscience approach, provides a novel conceptual framework. It measures the neural activities of both the speaker and the listener, and estimates the speaker-listener neural coupling, taking the speaker's production-related neural activity as a standardized reference. The second-person approach not only promotes the use of naturalistic speech but also allows for free communication between speaker and listener, as in a close-to-life context. In this review, we first briefly review previous discoveries about how the brain processes speech in noise; then, we introduce the principles and advantages of the second-person neuroscience approach and discuss its implications for unraveling the linguistic and extralinguistic processes during speech-in-noise comprehension; finally, we conclude by proposing some critical issues and calling for more research interest in the second-person approach, which would further extend present knowledge about how people comprehend speech in noise.
Affiliation(s)
- Zhuoran Li: Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, 100084, China.
- Dan Zhang: Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, 100084, China.
7
Gonzalez JE, Nieto N, Brusco P, Gravano A, Kamienkowski JE. Speech-induced suppression during natural dialogues. Commun Biol 2024; 7:291. PMID: 38459110; PMCID: PMC10923813; DOI: 10.1038/s42003-024-05945-9.
Abstract
When engaged in a conversation, one receives auditory information from the other's speech but also from one's own speech. However, self-generated speech is processed differently, an effect called speech-induced suppression (SIS). Here, we studied the brain representation of acoustic properties of speech in natural unscripted dialogues, using electroencephalography (EEG) and high-quality speech recordings from both participants. Using encoding techniques, we were able to reproduce a broad range of previous findings on listening to another's speech, achieving even better performance when predicting the EEG signal in this complex scenario. Furthermore, we found no response when listening to oneself, across different acoustic features (spectrogram, envelope, etc.) and frequency bands, evidencing a strong SIS effect. The present work shows that this mechanism is present, and even stronger, during natural dialogues. Moreover, the methodology presented here opens the possibility of a deeper understanding of the related mechanisms in a wider range of contexts.
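Encoding techniques of the kind mentioned here typically regress the EEG signal onto time-lagged acoustic features and score the model by its held-out prediction correlation. The following is a self-contained Python sketch of that idea on simulated data; it is not the authors' pipeline (which uses real speech features, multichannel EEG, and proper cross-validation), and all signal parameters below are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy encoding model: predict one EEG channel from lagged speech-envelope features
fs = 128                                           # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
envelope = np.abs(rng.standard_normal(fs * 60))    # 60 s of simulated envelope
lags = np.arange(0, 32)                            # 0-250 ms of causal lags
X = np.column_stack([np.roll(envelope, l) for l in lags])  # X[t, l] = env[t - l]
eeg = X @ rng.standard_normal(len(lags)) + rng.standard_normal(len(envelope))

model = Ridge(alpha=1.0).fit(X[: fs * 50], eeg[: fs * 50])       # train on first 50 s
r = np.corrcoef(model.predict(X[fs * 50 :]), eeg[fs * 50 :])[0, 1]  # held-out r
print(f"held-out prediction correlation: {r:.2f}")
```

Under SIS, the analogous model fit to segments where the participant hears their own speech would yield prediction correlations near zero, which is the "no response when listening to oneself" result.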
Affiliation(s)
- Joaquin E Gonzalez: Laboratorio de Inteligencia Artificial Aplicada, Instituto de Ciencias de la Computación (Universidad de Buenos Aires - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Buenos Aires, Argentina.
- Nicolás Nieto: Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional, sinc(i) (Universidad Nacional del Litoral - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Santa Fe, Argentina; Instituto de Matemática Aplicada del Litoral, IMAL-UNL/CONICET, Santa Fe, Argentina.
- Pablo Brusco: Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina.
- Agustín Gravano: Laboratorio de Inteligencia Artificial, Universidad Torcuato Di Tella, Buenos Aires, Argentina; Escuela de Negocios, Universidad Torcuato Di Tella, Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina.
- Juan E Kamienkowski: Laboratorio de Inteligencia Artificial Aplicada, Instituto de Ciencias de la Computación (Universidad de Buenos Aires - Consejo Nacional de Investigaciones Cientificas y Tecnicas), Buenos Aires, Argentina; Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina; Maestria de Explotación de Datos y Descubrimiento del Conocimiento, Facultad de Ciencias Exactas y Naturales - Facultad de Ingenieria, Universidad de Buenos Aires, Buenos Aires, Argentina.
8
Slade K, Beat A, Taylor J, Plack CJ, Nuttall HE. The effect of motor resource suppression on speech perception in noise in younger and older listeners: An online study. Psychon Bull Rev 2024; 31:389-400. PMID: 37653280; PMCID: PMC10866784; DOI: 10.3758/s13423-023-02361-8.
Abstract
Speech motor resources may be recruited to assist challenging speech perception in younger normally hearing listeners, but the extent to which this occurs for older adult listeners is unclear. We investigated if speech motor resources are also recruited in older adults during speech perception. Specifically, we investigated if suppression of speech motor resources via sub-vocal rehearsal affects speech perception compared to non-speech motor suppression (jaw movement) and passive listening. Participants identified words in speech-shaped noise at signal-to-noise ratios (SNRs) from -16 to +16 dB in three listening conditions during which participants: (1) opened and closed their jaw (non-speech movement); (2) sub-vocally mimed 'the' (articulatory suppression); (3) produced no concurrent movement (passive listening). Data from 46 younger adults (M age = 20.17 years, SD = 1.61, 36 female) and 41 older adults (M age = 69 years, SD = 5.82, 21 female) were analysed. Linear mixed effects modelling investigated the impact of age, listening condition, and self-reported hearing ability on speech perception (d′). Results indicated that speech perception ability was significantly worse in older adults relative to younger adults across all listening conditions. A significant interaction between age group and listening condition indicated that younger adults showed poorer performance during articulatory suppression compared to passive listening, but older adults performed equivalently across conditions. This finding suggests that speech motor resources are less available to support speech perception in older adults, providing important insights for auditory-motor integration for speech understanding and communication in ageing.
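The d′ outcome here is the standard signal-detection sensitivity index, computed from hit and false-alarm rates. As a reference, this is a minimal Python sketch, illustrative only: the trial counts are made up, and the log-linear correction is one common convention, not necessarily the one the authors used.

```python
import numpy as np
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' with a log-linear correction to avoid
    infinite z-scores when hit or false-alarm rates equal 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Example: 38 hits / 10 misses, 6 false alarms / 42 correct rejections
print(d_prime(38, 10, 6, 42))  # ~1.9
```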
Affiliation(s)
- Kate Slade: Neuroscience of Speech and Action Laboratory, Department of Psychology, Lancaster University, Lancaster, UK; Lancaster Medical School, Lancaster University, Lancaster, UK.
- Alanna Beat: Neuroscience of Speech and Action Laboratory, Department of Psychology, Lancaster University, Lancaster, UK.
- Jennifer Taylor: Neuroscience of Speech and Action Laboratory, Department of Psychology, Lancaster University, Lancaster, UK.
- Christopher J Plack: Neuroscience of Speech and Action Laboratory, Department of Psychology, Lancaster University, Lancaster, UK; Manchester Centre for Audiology and Deafness, School of Health Sciences, University of Manchester, Manchester, UK.
- Helen E Nuttall: Neuroscience of Speech and Action Laboratory, Department of Psychology, Lancaster University, Lancaster, UK.
9
Assaneo MF, Orpella J. Rhythms in Speech. Adv Exp Med Biol 2024; 1455:257-274. PMID: 38918356; DOI: 10.1007/978-3-031-60183-5_14.
Abstract
Speech can be defined as the human ability to communicate through a sequence of vocal sounds. Consequently, speech requires an emitter (the speaker) capable of generating the acoustic signal and a receiver (the listener) able to successfully decode the sounds produced by the emitter (i.e., the acoustic signal). Time plays a central role at both ends of this interaction. On the one hand, speech production requires precise and rapid coordination, typically within the order of milliseconds, of the upper vocal tract articulators (i.e., tongue, jaw, lips, and velum), their composite movements, and the activation of the vocal folds. On the other hand, the generated acoustic signal unfolds in time, carrying information at different timescales. This information must be parsed and integrated by the receiver for the correct transmission of meaning. This chapter describes the temporal patterns that characterize the speech signal and reviews research that explores the neural mechanisms underlying the generation of these patterns and the role they play in speech comprehension.
Affiliation(s)
- M Florencia Assaneo: Instituto de Neurobiología, Universidad Autónoma de México, Santiago de Querétaro, Mexico.
- Joan Orpella: Department of Neuroscience, Georgetown University Medical Center, Washington, DC, USA.
10
Ahn E, Majumdar A, Lee T, Brang D. Evidence for a Causal Dissociation of the McGurk Effect and Congruent Audiovisual Speech Perception via TMS. bioRxiv [Preprint] 2023:2023.11.27.568892. PMID: 38077093; PMCID: PMC10705272; DOI: 10.1101/2023.11.27.568892.
Abstract
Congruent visual speech improves speech perception accuracy, particularly in noisy environments. Conversely, mismatched visual speech can alter what is heard, leading to an illusory percept known as the McGurk effect. This illusion has been widely used to study audiovisual speech integration, illustrating that auditory and visual cues are combined in the brain to generate a single coherent percept. While prior transcranial magnetic stimulation (TMS) and neuroimaging studies have identified the left posterior superior temporal sulcus (pSTS) as a causal region involved in the generation of the McGurk effect, it remains unclear whether this region is critical only for this illusion or also for the more general benefits of congruent visual speech (e.g., increased accuracy and faster reaction times). Indeed, recent correlative research suggests that the benefits of congruent visual speech and the McGurk effect reflect largely independent mechanisms. To better understand how these different features of audiovisual integration are causally generated by the left pSTS, we used single-pulse TMS to temporarily impair processing while subjects were presented with either incongruent (McGurk) or congruent audiovisual combinations. Consistent with past research, we observed that TMS to the left pSTS significantly reduced the strength of the McGurk effect. Importantly, however, left pSTS stimulation did not affect the positive benefits of congruent audiovisual speech (increased accuracy and faster reaction times), demonstrating a causal dissociation between the two processes. Our results are consistent with models proposing that the pSTS is but one of multiple critical areas supporting audiovisual speech interactions. Moreover, these data add to a growing body of evidence suggesting that the McGurk effect is an imperfect surrogate measure for more general and ecologically valid audiovisual speech behaviors.
Affiliation(s)
- EunSeon Ahn, Areti Majumdar, Taraz Lee, David Brang: Department of Psychology, University of Michigan, Ann Arbor, MI 48109.
11
Li L, Pasco G, Ali JB, Johnson MH, Jones EJH, Charman T. Associations between early language, motor abilities, and later autism traits in infants with typical and elevated likelihood of autism. Autism Res 2023; 16:2184-2197. PMID: 37698295; PMCID: PMC10899446; DOI: 10.1002/aur.3023.
Abstract
Slower acquisition of language and motor milestones is common in infants with later autism, and studies have indicated that motor skills predict the rate of language development, suggesting these domains of development may be interlinked. However, the inter-relationships between the two domains over development and emerging autistic traits are not fully established. We studied language and motor development using standardized observational and parent-report measures in infants with (n = 271) and without (n = 137) a family history of autism across four waves of data collection from 10 to 36 months. We used Random Intercept Cross-Lagged Panel Models to examine contemporaneous and longitudinal associations between language and motor development in both elevated and typical likelihood groups. We estimated paths between language and motor abilities at 10, 14, 24, and 36 months and autism trait scores at 36 months, to test whether the domains were interrelated and how they related to emerging autism traits. Results revealed consistent bidirectional Expressive Language (EL) and Fine Motor (FM) cross-lagged effects from 10 to 24 months and a unidirectional EL to FM effect from 24 to 36 months, as well as significantly correlated random intercepts between Gross Motor (GM) and Receptive Language (RL), indicating stable concurrent associations over time. However, only the associations between GM and RL were associated with later autism traits. Early motor and language abilities are linked, but only gross motor and receptive language are jointly associated with autistic traits in infants with an autism family history.
Affiliation(s)
- Leyan Li: Department of Psychology, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.
- Greg Pasco: Department of Psychology, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.
- Jannath Begum Ali: Department of Psychological Sciences, Centre for Brain and Cognitive Development, Birkbeck, University of London, London, UK.
- Mark H. Johnson: Department of Psychological Sciences, Centre for Brain and Cognitive Development, Birkbeck, University of London, London, UK; Department of Psychology, University of Cambridge, Cambridge, UK.
- Emily J. H. Jones: Department of Psychological Sciences, Centre for Brain and Cognitive Development, Birkbeck, University of London, London, UK.
- Tony Charman: Department of Psychology, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.
12
Berger JI, Gander PE, Kim S, Schwalje AT, Woo J, Na YM, Holmes A, Hong JM, Dunn CC, Hansen MR, Gantz BJ, McMurray B, Griffiths TD, Choi I. Neural Correlates of Individual Differences in Speech-in-Noise Performance in a Large Cohort of Cochlear Implant Users. Ear Hear 2023; 44:1107-1120. PMID: 37144890; PMCID: PMC10426791; DOI: 10.1097/aud.0000000000001357.
Abstract
OBJECTIVES: Understanding speech-in-noise (SiN) is a complex task that recruits multiple cortical subsystems. Individuals vary in their ability to understand SiN. This cannot be explained by simple peripheral hearing profiles, but recent work by our group (Kim et al. 2021, Neuroimage) highlighted central neural factors underlying the variance in SiN ability in normal-hearing (NH) subjects. The present study examined neural predictors of SiN ability in a large cohort of cochlear-implant (CI) users.
DESIGN: We recorded electroencephalography in 114 postlingually deafened CI users while they completed the California Consonant Test: a word-in-noise task. In many subjects, data were also collected on two other commonly used clinical measures of speech perception: a word-in-quiet task (consonant-nucleus-consonant [CNC] words) and a sentence-in-noise task (AzBio sentences). Neural activity was assessed at a vertex electrode (Cz), which could help maximize eventual generalizability to clinical situations. The N1-P2 complex of event-related potentials (ERPs) at this location was included in multiple linear regression analyses, along with several other demographic and hearing factors, as predictors of SiN performance.
RESULTS: In general, there was good agreement between the scores on the three speech perception tasks. ERP amplitudes did not predict AzBio performance, which was predicted by the duration of device use, low-frequency hearing thresholds, and age. However, ERP amplitudes were strong predictors of performance for both word recognition tasks: the California Consonant Test (conducted simultaneously with electroencephalography recording) and the consonant-nucleus-consonant task (conducted offline). These correlations held even after accounting for known predictors of performance, including residual low-frequency hearing thresholds. In CI users, better performance was predicted by an increased cortical response to the target word, in contrast to previous reports in normal-hearing subjects, in whom speech perception ability was accounted for by the ability to suppress noise.
CONCLUSIONS: These data indicate a neurophysiological correlate of SiN performance, revealing a richer profile of an individual's hearing performance than shown by psychoacoustic measures alone. These results also highlight important differences between sentence and word recognition measures of performance and suggest that individual differences in these measures may be underwritten by different mechanisms. Finally, the contrast with prior reports of NH listeners in the same task suggests that CI users' performance may be explained by a different weighting of neural processes than in NH listeners.
Affiliation(s)
- Joel I. Berger, Phillip E. Gander: Department of Neurosurgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa, USA.
- Subong Kim: Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, USA.
- Adam T. Schwalje: Department of Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa, USA.
- Jihwan Woo, Young-min Na: Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea.
- Ann Holmes: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky, USA.
- Jean M. Hong, Camille C. Dunn, Marlan R. Hansen, Bruce J. Gantz: Department of Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa, USA.
- Bob McMurray: Department of Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa, USA; Department of Psychological and Brain Sciences, University of Iowa, Iowa City, Iowa, USA; Department of Communication Sciences and Disorders, University of Iowa, Iowa City, Iowa, USA.
- Timothy D. Griffiths: Biosciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom.
- Inyong Choi: Department of Otolaryngology – Head and Neck Surgery, University of Iowa Hospitals and Clinics, Iowa City, Iowa, USA; Department of Communication Sciences and Disorders, University of Iowa, Iowa City, Iowa, USA.
13
Zhang Y, Rennig J, Magnotti JF, Beauchamp MS. Multivariate fMRI responses in superior temporal cortex predict visual contributions to, and individual differences in, the intelligibility of noisy speech. Neuroimage 2023; 278:120271. PMID: 37442310; PMCID: PMC10460966; DOI: 10.1016/j.neuroimage.2023.120271.
Abstract
Humans have the unique ability to decode the rapid stream of language elements that constitute speech, even when it is contaminated by noise. Two reliable observations about noisy speech perception are that seeing the face of the talker improves intelligibility and that individuals differ in their ability to perceive noisy speech. We introduce a multivariate BOLD fMRI measure that explains both observations. In two independent fMRI studies, clear and noisy speech was presented in visual, auditory, and audiovisual formats to thirty-seven participants who rated intelligibility. An event-related design was used to sort noisy speech trials by their intelligibility. Individual-differences multidimensional scaling was applied to fMRI response patterns in superior temporal cortex, and the dissimilarity between responses to clear speech and noisy (but intelligible) speech was measured. Neural dissimilarity was less for audiovisual speech than auditory-only speech, corresponding to the greater intelligibility of noisy audiovisual speech. Dissimilarity was less in participants with better noisy speech perception, corresponding to individual differences. These relationships held for both single-word and entire-sentence stimuli, suggesting that they were driven by intelligibility rather than the specific stimuli tested. A neural measure of perceptual intelligibility may aid in the development of strategies for helping those with impaired speech perception.
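The core quantity here is a dissimilarity between multivoxel response patterns to clear and to noisy speech. The toy Python sketch below uses correlation distance on simulated patterns to convey the idea; it does not reproduce the study's individual-differences multidimensional scaling, and all array shapes and noise levels are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
clear = rng.standard_normal((20, 500))                 # 20 trials x 500 voxels, clear speech
noisy = clear + 0.8 * rng.standard_normal((20, 500))   # noisy-but-intelligible trials

def pattern_dissimilarity(a, b):
    """1 - Pearson r between mean response patterns (correlation distance)."""
    r = np.corrcoef(a.mean(axis=0), b.mean(axis=0))[0, 1]
    return 1.0 - r

print(pattern_dissimilarity(clear, noisy))  # smaller = noisy response more 'clear-like'
```

On the study's logic, smaller clear-vs-noisy dissimilarity would accompany higher intelligibility (e.g., audiovisual trials, or better-performing participants).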
Affiliation(s)
- Yue Zhang: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States.
- Johannes Rennig: Division of Neuropsychology, Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany.
- John F Magnotti: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
- Michael S Beauchamp: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
14
McHaney JR, Hancock KE, Polley DB, Parthasarathy A. Sensory representations and pupil-indexed listening effort provide complementary contributions to multi-talker speech intelligibility. bioRxiv [Preprint] 2023:2023.08.13.553131. PMID: 37645975; PMCID: PMC10462058; DOI: 10.1101/2023.08.13.553131.
Abstract
Optimal speech perception in noise requires successful separation of the target speech stream from multiple competing background speech streams. The ability to segregate these competing speech streams depends on the fidelity of bottom-up neural representations of sensory information in the auditory system and top-down influences of effortful listening. Here, we use objective neurophysiological measures of bottom-up temporal processing, envelope-following responses (EFRs) to amplitude-modulated tones, and investigate their interactions with pupil-indexed listening effort, as they relate to performance on the Quick Speech-in-Noise (QuickSIN) test in young adult listeners with clinically normal hearing thresholds. We developed an approach using ear-canal electrodes and adjusting electrode montages for modulation rate ranges, which extended the range of reliable EFR measurements to rates as high as 1024 Hz. Pupillary responses revealed changes in listening effort at the two most difficult signal-to-noise ratios (SNRs), but behavioral deficits at the hardest SNR only. Neither pupil-indexed listening effort nor the slope of the EFR decay function independently related to QuickSIN performance. However, a linear model using the combination of EFR and pupil metrics significantly explained variance in QuickSIN performance. These results suggest a synergistic interaction between bottom-up sensory coding and top-down measures of listening effort as they relate to speech perception in noise. These findings can inform the development of next-generation tests for hearing deficits in listeners with normal hearing thresholds that incorporate a multi-dimensional approach to understanding speech intelligibility deficits.
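The key statistical claim, that neither predictor works alone but their combination explains variance, corresponds to a multiple linear regression with both measures entered as regressors. A hypothetical Python sketch with simulated data follows; the variable names, effect sizes, and sample size are assumptions for illustration, not the authors' values.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 40
efr_slope = rng.normal(size=n)   # EFR decay-function slope (simulated)
pupil = rng.normal(size=n)       # pupil-indexed listening effort (simulated)
# Simulated outcome in which both predictors carry modest, independent signal
quicksin = 0.4 * efr_slope + 0.4 * pupil + rng.normal(scale=0.5, size=n)

X = sm.add_constant(np.column_stack([efr_slope, pupil]))
fit = sm.OLS(quicksin, X).fit()
print(fit.rsquared, fit.pvalues)  # joint model explains variance in QuickSIN scores
```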
Affiliation(s)
- Jacie R. McHaney: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA.
- Kenneth E. Hancock: Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, MA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA.
- Daniel B. Polley: Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, MA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA.
- Aravindakshan Parthasarathy: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA.
15
Liang B, Li Y, Zhao W, Du Y. Bilateral human laryngeal motor cortex in perceptual decision of lexical tone and voicing of consonant. Nat Commun 2023; 14:4710. PMID: 37543659; PMCID: PMC10404239; DOI: 10.1038/s41467-023-40445-0.
Abstract
Speech perception is believed to recruit the left motor cortex. However, the exact role of the laryngeal subregion and its right counterpart in speech perception, as well as their temporal patterns of involvement, remains unclear. To address these questions, we conducted a hypothesis-driven study, applying transcranial magnetic stimulation to the left or right dorsal laryngeal motor cortex (dLMC) while participants performed perceptual decisions on Mandarin lexical tone or consonant voicing contrasts presented with or without noise. We used psychometric functions and a hierarchical drift-diffusion model to disentangle perceptual sensitivity and dynamic decision-making parameters. Results showed that bilateral dLMCs were engaged with effector specificity, and this engagement was left-lateralized with right upregulation in noise. Furthermore, the dLMC contributed to various decision stages depending on the hemisphere and task difficulty. These findings substantially advance our understanding of the hemispheric lateralization and temporal dynamics of bilateral dLMC in sensorimotor integration during speech perceptual decision-making.
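In drift-diffusion models like the one used here, noisy evidence accumulates toward a decision boundary: the drift rate indexes perceptual sensitivity, while the boundary separation and non-decision time capture other decision stages. A toy Python simulation of the basic (non-hierarchical) process is sketched below; the parameter values are assumed for illustration and the authors fit a hierarchical Bayesian variant rather than simulating forward like this.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_ddm(drift, boundary=1.0, non_decision=0.3, dt=0.001, n=500):
    """Simulate first-passage times of a symmetric drift-diffusion process."""
    rts, choices = [], []
    for _ in range(n):
        x, t = 0.0, 0.0
        while abs(x) < boundary:                    # accumulate until a bound is hit
            x += drift * dt + rng.normal(0.0, np.sqrt(dt))
            t += dt
        rts.append(t + non_decision)                # add non-decision time
        choices.append(x > 0)                       # upper bound = correct response
    return np.array(rts), np.array(choices)

rt, choice = simulate_ddm(drift=1.5)  # higher drift = stronger perceptual evidence
print(f"mean RT: {rt.mean():.2f} s, accuracy: {choice.mean():.2f}")
```

Lowering the drift rate in this simulation (e.g., to mimic added noise or TMS disruption) slows responses and reduces accuracy, which is the kind of parameter-level effect the study dissected.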
Affiliation(s)
- Baishen Liang: Institute of Psychology, CAS Key Laboratory of Behavioral Science, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China.
- Yanchang Li: Institute of Psychology, CAS Key Laboratory of Behavioral Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Wanying Zhao: Institute of Psychology, CAS Key Laboratory of Behavioral Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Yi Du: Institute of Psychology, CAS Key Laboratory of Behavioral Science, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, 100049, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, 200031, China; Chinese Institute for Brain Research, Beijing, 102206, China.
16
Viswanathan V, Bharadwaj HM, Heinz MG, Shinn-Cunningham BG. Induced alpha and beta electroencephalographic rhythms covary with single-trial speech intelligibility in competition. Sci Rep 2023; 13:10216. PMID: 37353552; PMCID: PMC10290148; DOI: 10.1038/s41598-023-37173-2.
Abstract
Neurophysiological studies suggest that intrinsic brain oscillations influence sensory processing, especially of rhythmic stimuli like speech. Prior work suggests that brain rhythms may mediate perceptual grouping and selective attention to speech amidst competing sound, as well as more linguistic aspects of speech processing like predictive coding. However, we know of no prior studies that have directly tested, at the single-trial level, whether brain oscillations relate to speech-in-noise outcomes. Here, we recorded electroencephalography while simultaneously measuring the intelligibility of spoken sentences amidst two different interfering sounds: multi-talker babble or speech-shaped noise. We find that induced parieto-occipital alpha (7-15 Hz; thought to modulate attentional focus) and frontal beta (13-30 Hz; associated with maintenance of the current sensorimotor state and predictive coding) oscillations covary with trial-wise percent-correct scores; importantly, alpha and beta power provide significant independent contributions to predicting single-trial behavioral outcomes. These results can inform models of speech processing and guide noninvasive measures to index different neural processes that together support complex listening.
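Relating band-limited power to single-trial outcomes typically involves estimating alpha and beta power on each trial and entering both as predictors of trial accuracy. Here is a simplified Python sketch with simulated data: the band edges follow the abstract, but everything else (Welch spectra, a logistic model, the data dimensions) is an assumption for illustration, and true "induced" power would additionally require removing the evoked response.

```python
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression

fs = 256  # Hz, assumed sampling rate
rng = np.random.default_rng(0)
eeg = rng.standard_normal((200, fs * 3))   # 200 trials x 3 s of (simulated) EEG
correct = rng.integers(0, 2, size=200)     # 1 = sentence reported correctly

def band_power(x, lo, hi):
    """Log mean Welch power in [lo, hi] Hz, per trial."""
    f, pxx = welch(x, fs=fs, nperseg=fs)
    return np.log(pxx[:, (f >= lo) & (f <= hi)].mean(axis=1))

X = np.column_stack([band_power(eeg, 7, 15),    # parieto-occipital alpha
                     band_power(eeg, 13, 30)])  # frontal beta
model = LogisticRegression().fit(X, correct)
print(model.coef_)  # separate alpha and beta contributions to trial accuracy
```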
Affiliation(s)
- Vibha Viswanathan: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.
- Hari M Bharadwaj: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
- Michael G Heinz: Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907, USA.
17
Cope TE, Sohoglu E, Peterson KA, Jones PS, Rua C, Passamonti L, Sedley W, Post B, Coebergh J, Butler CR, Garrard P, Abdel-Aziz K, Husain M, Griffiths TD, Patterson K, Davis MH, Rowe JB. Temporal lobe perceptual predictions for speech are instantiated in motor cortex and reconciled by inferior frontal cortex. Cell Rep 2023; 42:112422. PMID: 37099422; DOI: 10.1016/j.celrep.2023.112422.
Abstract
Humans use predictions to improve speech perception, especially in noisy environments. Here we use 7-T functional MRI (fMRI) to decode brain representations of written phonological predictions and degraded speech signals in healthy humans and people with selective frontal neurodegeneration (non-fluent variant primary progressive aphasia [nfvPPA]). Multivariate analyses of item-specific patterns of neural activation indicate dissimilar representations of verified and violated predictions in left inferior frontal gyrus, suggestive of processing by distinct neural populations. In contrast, precentral gyrus represents a combination of phonological information and weighted prediction error. In the presence of intact temporal cortex, frontal neurodegeneration results in inflexible predictions. This manifests neurally as a failure to suppress incorrect predictions in anterior superior temporal gyrus and reduced stability of phonological representations in precentral gyrus. We propose a tripartite speech perception network in which inferior frontal gyrus supports prediction reconciliation in echoic memory, and precentral gyrus invokes a motor model to instantiate and refine perceptual predictions for speech.
Affiliation(s)
- Thomas E Cope: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; Cambridge University Hospitals NHS Trust, Cambridge CB2 0QQ, UK.
- Ediz Sohoglu: Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; School of Psychology, University of Sussex, Brighton BN1 9RH, UK.
- Katie A Peterson: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Department of Radiology, University of Cambridge, Cambridge CB2 0QQ, UK.
- P Simon Jones: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK.
- Catarina Rua: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK.
- Luca Passamonti: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK.
- William Sedley: Biosciences Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK.
- Brechtje Post: Theoretical and Applied Linguistics, Faculty of Modern & Medieval Languages & Linguistics, University of Cambridge, Cambridge CB3 9DA, UK.
- Jan Coebergh: Ashford and St Peter's Hospital, Ashford TW15 3AA, UK; St George's Hospital, London SW17 0QT, UK.
- Christopher R Butler: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK; Faculty of Medicine, Department of Brain Sciences, Imperial College London, London W12 0NN, UK.
- Peter Garrard: St George's Hospital, London SW17 0QT, UK; Molecular and Clinical Sciences Research Institute, St. George's, University of London, London SW17 0RE, UK.
- Khaled Abdel-Aziz: Ashford and St Peter's Hospital, Ashford TW15 3AA, UK; St George's Hospital, London SW17 0QT, UK.
- Masud Husain: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK.
- Timothy D Griffiths: Biosciences Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK.
- Karalyn Patterson: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK.
- Matthew H Davis: Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK.
- James B Rowe: Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; Cambridge University Hospitals NHS Trust, Cambridge CB2 0QQ, UK.
18
Viswanathan V, Bharadwaj HM, Heinz MG, Shinn-Cunningham BG. Induced alpha and beta electroencephalographic rhythms covary with single-trial speech intelligibility in competition. bioRxiv [Preprint] 2023:2022.12.31.522365. PMID: 36712081; PMCID: PMC9884507; DOI: 10.1101/2022.12.31.522365.
Abstract
Neurophysiological studies suggest that intrinsic brain oscillations influence sensory processing, especially of rhythmic stimuli like speech. Prior work suggests that brain rhythms may mediate perceptual grouping and selective attention to speech amidst competing sound, as well as more linguistic aspects of speech processing like predictive coding. However, we know of no prior studies that have directly tested, at the single-trial level, whether brain oscillations relate to speech-in-noise outcomes. Here, we recorded electroencephalography while simultaneously measuring the intelligibility of spoken sentences amidst two different interfering sounds: multi-talker babble or speech-shaped noise. We find that induced parieto-occipital alpha (7-15 Hz; thought to modulate attentional focus) and frontal beta (13-30 Hz; associated with maintenance of the current sensorimotor state and predictive coding) oscillations covary with trial-wise percent-correct scores; importantly, alpha and beta power provide significant independent contributions to predicting single-trial behavioral outcomes. These results can inform models of speech processing and guide noninvasive measures to index different neural processes that together support complex listening.
Affiliation(s)
- Vibha Viswanathan: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213.
- Hari M. Bharadwaj: Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260.
- Michael G. Heinz: Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907.
19
Zhang L, Wang X, Alain C, Du Y. Successful aging of musicians: Preservation of sensorimotor regions aids audiovisual speech-in-noise perception. Sci Adv 2023; 9:eadg7056. PMID: 37126550; PMCID: PMC10132752; DOI: 10.1126/sciadv.adg7056.
Abstract
Musicianship can mitigate age-related declines in audiovisual speech-in-noise perception. We tested whether this benefit originates from functional preservation or functional compensation by comparing fMRI responses of older musicians, older nonmusicians, and young nonmusicians identifying noise-masked audiovisual syllables. Older musicians outperformed older nonmusicians and showed comparable performance to young nonmusicians. Notably, older musicians retained neural specificity of speech representations in sensorimotor areas similar to that of young nonmusicians, while older nonmusicians showed degraded neural representations. In the same region, older musicians showed higher neural alignment to young nonmusicians than older nonmusicians did, which was associated with their training intensity. In older nonmusicians, the degree of neural alignment predicted better performance. In addition, older musicians showed greater activation in frontal-parietal, speech motor, and visual motion regions and greater deactivation in the angular gyrus than older nonmusicians, which predicted higher neural alignment in sensorimotor areas. Together, these findings suggest that the musicianship-related benefit in audiovisual speech-in-noise processing is rooted in preserving youth-like representations in sensorimotor regions.
Affiliation(s)
- Lei Zhang: CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China.
- Xiuyi Wang: CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China.
- Claude Alain: Rotman Research Institute, Baycrest Centre for Geriatric Care, Toronto, ON M6A 2E1, Canada; Department of Psychology, University of Toronto, ON M8V 2S4, Canada.
- Yi Du: CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai 200031, China; Chinese Institute for Brain Research, Beijing 102206, China.
20
|
Li N, Ma W, Ren F, Li X, Li F, Zong W, Wu L, Dai Z, Hui SCN, Edden RAE, Li M, Gao F. Neurochemical and functional reorganization of the cognitive-ear link underlies cognitive impairment in presbycusis. Neuroimage 2023; 268:119861. [PMID: 36610677 PMCID: PMC10026366 DOI: 10.1016/j.neuroimage.2023.119861] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2022] [Revised: 12/11/2022] [Accepted: 01/03/2023] [Indexed: 01/06/2023] Open
Abstract
Recent studies suggest that the interaction between presbycusis and cognitive impairment may be partially explained by the cognitive-ear link. However, the underlying neurophysiological mechanisms remain largely unknown. In this study, we combined magnetic resonance spectroscopy (MRS) and resting-state functional magnetic resonance imaging (fMRI) to investigate auditory gamma-aminobutyric acid (GABA) and glutamate (Glu) levels, intra- and inter-network functional connectivity, and their relationships with auditory and cognitive function in 51 presbycusis patients and 51 well-matched healthy controls. Our results confirmed reorganization of the cognitive-ear link in presbycusis, including decreased auditory GABA and Glu levels and aberrant functional connectivity involving auditory networks (AN) and cognitive-related networks, which were associated with reduced speech perception or cognitive impairment. Moreover, mediation analyses revealed that decreased auditory GABA levels and dysconnectivity between the AN and default mode network (DMN) mediated the association between hearing loss and impaired information processing speed in presbycusis. These findings highlight the importance of AN-DMN dysconnectivity in cognitive-ear link reorganization leading to cognitive impairment, and hearing loss may drive reorganization via decreased auditory GABA levels. Modulation of GABA neurotransmission may lead to new treatment strategies for cognitive impairment in presbycusis patients.
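The mediation claim can be illustrated with a small bootstrap sketch of the indirect effect (hearing loss to auditory GABA to processing speed); the simulated variables are hypothetical, and the published analysis may differ in detail:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical per-participant measures: hearing loss (e.g., pure-tone
# average), auditory GABA level (MRS), and processing speed score.
rng = np.random.default_rng(1)
n = 102
hearing_loss = rng.normal(size=n)
gaba = -0.5 * hearing_loss + rng.normal(scale=0.8, size=n)
speed = 0.6 * gaba - 0.1 * hearing_loss + rng.normal(scale=0.8, size=n)

def indirect_effect(x, m, y):
    """a*b product: the x -> m path times the m -> y path (controlling for x)."""
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]
    b = sm.OLS(y, sm.add_constant(np.column_stack([m, x]))).fit().params[1]
    return a * b

# Bootstrap confidence interval for the indirect (mediated) effect.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(indirect_effect(hearing_loss[idx], gaba[idx], speed[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect 95% CI: [{lo:.3f}, {hi:.3f}]")  # CI excluding 0 -> mediation
```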
Collapse
Affiliation(s)
- Ning Li
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
| | - Wen Ma
- Department of Otolaryngology, the Central Hospital of Jinan City, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Fuxin Ren
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Xiao Li
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Fuyan Li
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Wei Zong
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Lili Wu
- CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
| | - Zongrui Dai
- Westa College, Southwest University, Chongqing, China
| | - Steve C N Hui
- Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD, USA; F. M. Kirby Research Center for Functional Brain Imaging, Kennedy Krieger Institute, Baltimore, MD, USA
| | - Richard A E Edden
- Russell H. Morgan Department of Radiology and Radiological Science, The Johns Hopkins University School of Medicine, Baltimore, MD, USA; F. M. Kirby Research Center for Functional Brain Imaging, Kennedy Krieger Institute, Baltimore, MD, USA
| | - Muwei Li
- Vanderbilt University Institute of Imaging Science, Nashville, TN, USA
| | - Fei Gao
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China.
| |
Collapse
|
21
|
Blank H, Alink A, Büchel C. Multivariate functional neuroimaging analyses reveal that strength-dependent face expectations are represented in higher-level face-identity areas. Commun Biol 2023; 6:135. [PMID: 36725984 PMCID: PMC9892564 DOI: 10.1038/s42003-023-04508-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Accepted: 01/19/2023] [Indexed: 02/03/2023] Open
Abstract
Perception is an active inference in which prior expectations are combined with sensory input. It is still unclear how the strength of prior expectations is represented in the human brain. The strength, or precision, of a prior could be represented with its content, potentially in higher-level sensory areas. We used multivariate analyses of functional magnetic resonance imaging data to test whether expectation strength is represented together with the expected face in high-level face-sensitive regions. Participants were trained to associate images of scenes with subsequently presented images of different faces. Each scene predicted three faces, each with either low, intermediate, or high probability. We found that anticipation enhances the similarity of response patterns in the face-sensitive anterior temporal lobe to response patterns specifically associated with the image of the expected face. In contrast, during face presentation, activity increased for unexpected faces in a typical prediction error network, containing areas such as the caudate and the insula. Our findings show that strength-dependent face expectations are represented in higher-level face-identity areas, supporting hierarchical theories of predictive processing according to which higher-level sensory regions represent weighted priors.
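A minimal sketch of the core pattern-similarity step implied here, correlating an anticipation-period voxel pattern with per-face template patterns; the ROI data and the names templates and anticipation are hypothetical:

```python
import numpy as np

# Hypothetical voxel patterns from a face-sensitive ROI: one template
# pattern per face identity, plus an anticipation-period pattern recorded
# while a scene predicted face_0 with high probability.
rng = np.random.default_rng(2)
n_voxels = 150
templates = {f"face_{i}": rng.normal(size=n_voxels) for i in range(3)}
anticipation = 0.4 * templates["face_0"] + rng.normal(size=n_voxels)

# Core similarity measure: correlate the anticipatory pattern with each
# face template; expectation should raise similarity to the expected face.
for face, template in templates.items():
    r = np.corrcoef(anticipation, template)[0, 1]
    print(f"{face}: r = {r:.2f}")
```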
Collapse
Affiliation(s)
- Helen Blank
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
| | - Arjen Alink
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
| | - Christian Büchel
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
| |
Collapse
|
22
|
Schelinski S, von Kriegstein K. Responses in left inferior frontal gyrus are altered for speech-in-noise processing, but not for clear speech in autism. Brain Behav 2023; 13:e2848. [PMID: 36575611 PMCID: PMC9927852 DOI: 10.1002/brb3.2848] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 11/10/2022] [Accepted: 11/28/2022] [Indexed: 12/29/2022] Open
Abstract
INTRODUCTION Autistic individuals often have difficulties with recognizing what another person is saying in noisy conditions such as in a crowded classroom or a restaurant. The underlying neural mechanisms of this speech perception difficulty are unclear. In typically developed individuals, three cerebral cortex regions are particularly related to speech-in-noise perception: the left inferior frontal gyrus (IFG), the right insula, and the left inferior parietal lobule (IPL). Here, we tested whether responses in these cerebral cortex regions are altered in speech-in-noise perception in autism. METHODS Seventeen autistic adults and 17 typically developed controls (matched pairwise on age, sex, and IQ) performed an auditory-only speech recognition task during functional magnetic resonance imaging (fMRI). Speech was presented either with noise (noise condition) or without noise (no noise condition, i.e., clear speech). RESULTS In the left IFG, blood-oxygenation-level-dependent (BOLD) responses were higher in the control compared to the autism group for recognizing speech-in-noise compared to clear speech. For this contrast, both groups had similar response magnitudes in the right insula and left IPL. Additionally, we replicated previous findings that BOLD responses in speech-related and auditory brain regions (including bilateral superior temporal sulcus and Heschl's gyrus) for clear speech were similar in both groups and that voice identity recognition was impaired for clear and noisy speech in autism. DISCUSSION Our findings show that in autism, the processing of speech is particularly reduced under noisy conditions in the left IFG, a dysfunction that might be important in explaining restricted speech comprehension in noisy environments.
Collapse
Affiliation(s)
- Stefanie Schelinski
- Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany; Max Planck Research Group Neural Mechanisms of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Katharina von Kriegstein
- Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany; Max Planck Research Group Neural Mechanisms of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| |
Collapse
|
23
|
MacGregor LJ, Gilbert RA, Balewski Z, Mitchell DJ, Erzinçlioğlu SW, Rodd JM, Duncan J, Fedorenko E, Davis MH. Causal Contributions of the Domain-General (Multiple Demand) and the Language-Selective Brain Networks to Perceptual and Semantic Challenges in Speech Comprehension. Neurobiol Lang 2022; 3:665-698. [PMID: 36742011 PMCID: PMC9893226 DOI: 10.1162/nol_a_00081] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Accepted: 09/07/2022] [Indexed: 06/18/2023]
Abstract
Listening to spoken language engages domain-general multiple demand (MD; frontoparietal) regions of the human brain, in addition to domain-selective (frontotemporal) language regions, particularly when comprehension is challenging. However, there is limited evidence that the MD network makes a functional contribution to core aspects of understanding language. In a behavioural study of volunteers (n = 19) with chronic brain lesions, but without aphasia, we assessed the causal role of these networks in perceiving, comprehending, and adapting to spoken sentences made more challenging by acoustic degradation or lexico-semantic ambiguity. We measured perception of and adaptation to acoustically degraded (noise-vocoded) sentences with a word report task before and after training. Participants with greater damage to MD but not language regions required more vocoder channels to achieve 50% word report, indicating impaired perception. Perception improved following training, reflecting adaptation to acoustic degradation, but adaptation was unrelated to lesion location or extent. Comprehension of spoken sentences with semantically ambiguous words was measured with a sentence coherence judgement task. Accuracy was high and unaffected by lesion location or extent. Adaptation to semantic ambiguity was measured in a subsequent word association task, which showed that availability of lower-frequency meanings of ambiguous words increased following their comprehension (word-meaning priming). Word-meaning priming was reduced for participants with greater damage to language but not MD regions. Language and MD networks make dissociable contributions to challenging speech comprehension: Using recent experience to update word meaning preferences depends on language-selective regions, whereas the domain-general MD network plays a causal role in reporting words from degraded speech.
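The "channels needed for 50% word report" measure can be illustrated by fitting a psychometric function to accuracy as a function of vocoder channels; the data points below are hypothetical, and the logistic form is an assumption the abstract does not specify:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical word-report accuracy as a function of vocoder channels.
channels = np.array([1, 2, 4, 8, 16, 32])
accuracy = np.array([0.05, 0.15, 0.40, 0.70, 0.90, 0.95])

def psychometric(x, x50, slope):
    """Logistic function of log2(channels); x50 is the 50%-report point."""
    return 1.0 / (1.0 + np.exp(-slope * (np.log2(x) - np.log2(x50))))

(x50, slope), _ = curve_fit(psychometric, channels, accuracy, p0=[4.0, 1.0])
print(f"channels needed for 50% word report: {x50:.1f}")
```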
Collapse
Affiliation(s)
- Lucy J. MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| | - Rebecca A. Gilbert
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| | - Zuzanna Balewski
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA
| | - Daniel J. Mitchell
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| | | | - Jennifer M. Rodd
- Psychology and Language Sciences, University College London, London, UK
| | - John Duncan
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| | - Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA
| | - Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
| |
Collapse
|
24
|
Guo Z, Chen F. Decoding lexical tones and vowels in imagined tonal monosyllables using fNIRS signals. J Neural Eng 2022; 19. [PMID: 36317255 DOI: 10.1088/1741-2552/ac9e1d] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2022] [Accepted: 10/27/2022] [Indexed: 11/11/2022]
Abstract
Objective. Speech is a common means of communication. Decoding verbal intent could provide a naturalistic communication channel for people with severe motor disabilities. The active brain-computer interface (BCI) speller is one of the most commonly used speech BCIs. To reduce the spelling time of Chinese words, identifying the vowels and tones embedded in imagined Chinese words is essential. Functional near-infrared spectroscopy (fNIRS) has been widely used in BCI because it is portable, non-invasive, safe, low cost, and has a relatively high spatial resolution. Approach. In this study, an active BCI speller based on fNIRS is presented, in which participants covertly rehearsed tonal monosyllables composed of vowels (i.e. /a/, /i/, /o/, and /u/) and the four lexical tones of Mandarin Chinese (i.e. tones 1, 2, 3, and 4) for 10 s. Main results. fNIRS results showed significant differences in the right superior temporal gyrus between imagined vowels with tone 2/3/4 and those with tone 1 (i.e. more activations and stronger connections to other brain regions for imagined vowels with tones 2/3/4 than for those with tone 1). Speech-related areas for tone imagery (i.e. the right hemisphere) provided the majority of information for identifying tones, while the left hemisphere had an advantage in vowel identification. Decoding both vowels and tones during the post-stimulus 15 s period yielded average classification accuracies exceeding 40% and 70% in multiclass (i.e. four classes) and binary settings, respectively. To spell words more quickly, the time window size for decoding was reduced from 15 s to 2.5 s without a significant reduction in classification accuracy. Significance. For the first time, this work demonstrates the possibility of discriminating lexical tones and vowels in imagined tonal syllables simultaneously. In addition, the reduced decoding time window indicates that the spelling time of Chinese words could be significantly reduced in fNIRS-based BCIs.
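A minimal sketch of the kind of cross-validated multiclass decoding reported here, using a linear SVM on hypothetical fNIRS features; the feature layout and classifier choice are assumptions, not the authors' exact pipeline:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical fNIRS features: one row per imagined syllable (e.g., mean
# HbO change per channel and time window), labeled by lexical tone (1-4).
rng = np.random.default_rng(3)
n_trials, n_features = 160, 48
X = rng.normal(size=(n_trials, n_features))
y = rng.integers(1, 5, size=n_trials)

# Cross-validated multiclass decoding; chance is 25% for four classes.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean decoding accuracy: {scores.mean():.2%}")
```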
Collapse
Affiliation(s)
- Zengzhi Guo
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, People's Republic of China; Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, People's Republic of China
| | - Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, People's Republic of China
| |
Collapse
|
25
|
Silva AB, Liu JR, Zhao L, Levy DF, Scott TL, Chang EF. A Neurosurgical Functional Dissection of the Middle Precentral Gyrus during Speech Production. J Neurosci 2022; 42:8416-8426. [PMID: 36351829 PMCID: PMC9665919 DOI: 10.1523/jneurosci.1614-22.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Accepted: 08/30/2022] [Indexed: 11/17/2022] Open
Abstract
Classical models have traditionally focused on the left posterior inferior frontal gyrus (Broca's area) as a key region for motor planning of speech production. However, converging evidence suggests that it is not critical for either speech motor planning or execution. Alternative cortical areas supporting high-level speech motor planning have yet to be defined. In this review, we focus on the precentral gyrus, whose role in speech production is often thought to be limited to lower-level articulatory muscle control. In particular, we highlight neurosurgical investigations that have shed light on a cortical region anatomically located near the midpoint of the precentral gyrus, hence called the middle precentral gyrus (midPrCG). The midPrCG is functionally located between dorsal hand and ventral orofacial cortical representations and exhibits unique sensorimotor and multisensory functions relevant for speech processing. This includes motor control of the larynx, auditory processing, as well as a role in reading and writing. Furthermore, direct electrical stimulation of midPrCG can evoke complex movements, such as vocalization, and selective injury can cause deficits in verbal fluency, such as pure apraxia of speech. Based on these findings, we propose that midPrCG is essential to phonological-motoric aspects of speech production, especially syllabic-level speech sequencing, a role traditionally ascribed to Broca's area. The midPrCG is a cortical brain area that should be included in contemporary models of speech production with a unique role in speech motor planning and execution.
Collapse
Affiliation(s)
- Alexander B Silva
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Medical Scientist Training Program, University of California, San Francisco, California, 94158
- Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
| | - Jessie R Liu
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
| | - Lingyun Zhao
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
| | - Deborah F Levy
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
| | - Terri L Scott
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California, 94158
- Weill Institute for Neurosciences, University of California, San Francisco, California, 94158
- Graduate Program in Bioengineering, University of California, Berkeley, California 94720, & University of California, San Francisco, California, 94158
| |
Collapse
|
26
|
Dole M, Vilain C, Haldin C, Baciu M, Cousin E, Lamalle L, Lœvenbruck H, Vilain A, Schwartz JL. Comparing the selectivity of vowel representations in cortical auditory vs. motor areas: A repetition-suppression study. Neuropsychologia 2022; 176:108392. [DOI: 10.1016/j.neuropsychologia.2022.108392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2022] [Revised: 09/22/2022] [Accepted: 10/03/2022] [Indexed: 10/31/2022]
|
27
|
Domain-specific hearing-in-noise performance is associated with absolute pitch proficiency. Sci Rep 2022; 12:16344. [PMID: 36175508 PMCID: PMC9521875 DOI: 10.1038/s41598-022-20869-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2022] [Accepted: 09/20/2022] [Indexed: 11/22/2022] Open
Abstract
Recent evidence suggests that musicians may have an advantage over non-musicians in perceiving speech against noisy backgrounds. Previously, musicians have been compared as a homogenous group, despite demonstrated heterogeneity, which may contribute to discrepancies between studies. Here, we investigated whether “quasi”-absolute pitch (AP) proficiency, viewed as a general trait that varies across a spectrum, accounts for the musician advantage in hearing-in-noise (HIN) performance, irrespective of whether the streams are speech or musical sounds. A cohort of 12 non-musicians and 42 trained musicians stratified into high, medium, or low AP proficiency identified speech or melody targets masked in noise (speech-shaped, multi-talker, and multi-music) under four signal-to-noise ratios (0, −3, −6, and −9 dB). Cognitive abilities associated with HIN benefits, including auditory working memory and use of visuo-spatial cues, were assessed. AP proficiency was verified against pitch adjustment and relative pitch tasks. We found a domain-specific effect on HIN perception: quasi-AP abilities were related to improved perception of melody but not speech targets in noise. The quasi-AP advantage extended to tonal working memory and the use of spatial cues, but only during melodic stream segregation. Overall, the results do not support the putative musician advantage in speech-in-noise perception, but suggest a quasi-AP advantage in perceiving music under noisy environments.
Collapse
|
28
|
Wang X, Krieger-Redwood K, Zhang M, Cui Z, Wang X, Karapanagiotidis T, Du Y, Leech R, Bernhardt BC, Margulies DS, Smallwood J, Jefferies E. Physical distance to sensory-motor landmarks predicts language function. Cereb Cortex 2022; 33:4305-4318. [PMID: 36066439 PMCID: PMC10110440 DOI: 10.1093/cercor/bhac344] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Revised: 08/01/2022] [Accepted: 08/02/2022] [Indexed: 11/14/2022] Open
Abstract
Auditory language comprehension recruits cortical regions that are both close to sensory-motor landmarks (supporting auditory and motor features) and far from these landmarks (supporting word meaning). We investigated whether the responsiveness of these regions in task-based functional MRI is related to individual differences in their physical distance to primary sensorimotor landmarks. Parcels in the auditory network that were equally responsive across story and math tasks showed stronger activation in individuals who had a shorter distance between these parcels and the transverse temporal sulcus, in line with the predictions of the "tethering hypothesis," which suggests that greater proximity to input regions might increase the fidelity of sensory processing. Conversely, language and default mode parcels, which were more active for the story task, showed positive correlations between individual differences in activation and distance from primary sensory-motor landmarks, consistent with the view that physical separation from sensory-motor inputs supports aspects of cognition that draw on semantic memory. These results demonstrate that distance from sensorimotor regions provides an organizing principle of functional differentiation within the cortex. The relationship between activation and geodesic distance to sensory-motor landmarks is in opposite directions for cortical regions that are proximal to the heteromodal (DMN and language network) and unimodal ends of the principal gradient of intrinsic connectivity.
Collapse
Affiliation(s)
- Xiuyi Wang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Department of Psychology, University of York, Heslington, York YO10 5DD, UK
| | | | - Meichao Zhang
- Department of Psychology, University of York, Heslington, York YO10 5DD, UK
| | - Zaixu Cui
- Chinese Institute for Brain Research, Beijing 102206, China
| | - Xiaokang Wang
- Department of Biomedical Engineering, University of California, Davis, CA 95616, USA
| | | | - Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China; Chinese Institute for Brain Research, Beijing 102206, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai 200031, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Robert Leech
- Centre for Neuroimaging Science, Kings College London, London, UK
| | - Boris C Bernhardt
- McConnell Brain Imaging Centre, McGill University, Montreal, Quebec, Canada
| | - Daniel S Margulies
- Integrative Neuroscience and Cognition Center (UMR 8002), Centre National de la Recherche Scientifique (CNRS) and Université de Paris, Paris, France; Wellcome Centre for Integrative Neuroimaging, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | | | | |
Collapse
|
29
|
Wang M, Liu J, Kong L, Zhao Y, Diao T, Ma X. Subjective tinnitus patients with normal pure-tone hearing still suffer more informational masking in the noisy environment. Front Neurosci 2022; 16:983427. [PMID: 36090272 PMCID: PMC9448876 DOI: 10.3389/fnins.2022.983427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Accepted: 07/29/2022] [Indexed: 11/13/2022] Open
Abstract
Subjective tinnitus patients experience more hearing difficulties than normal peers in complex hearing environments, even though most of these patients have normal pure-tone hearing thresholds. Speech recognition tasks under different masking conditions can provide insight into whether the effects of tinnitus are lateralized and into the mechanisms behind these effects. By simulating sound-field recordings, we obtained a target sentence that is perceived as presented on one side, with noise or speech maskers presented with or without perceived spatial separation from it. Our study used this virtual sound field technique to investigate the difference in speech recognition ability between chronic subjective tinnitus patients and a normal-hearing control group under the four masking conditions (speech-spectrum noise masking or two-talker speech masking, with or without perceived spatial separation). Experiment 1 showed no differences as a function of the perceived location of the target speech (left or right), ruling out a lateralized effect of tinnitus. Experiment 2 further found that although tinnitus patients performed worse than normal-hearing listeners in very complex auditory scenarios, when a spatial cue for the target speech was available they made good use of it, compensating for their processing disadvantage and achieving performance similar to that of the normal-hearing group. In addition, the current study distinguished the effects of informational masking and energetic masking on speech recognition in patients with tinnitus and listeners with normal hearing. The results suggest that the impact of tinnitus on speech recognition is more likely to arise in the auditory center than in the periphery.
Collapse
Affiliation(s)
- Mengyuan Wang
- School of Psychology, Beijing Normal University, Beijing, China
| | - Jinjun Liu
- School of Psychology, Beijing Normal University, Beijing, China
| | - Lingzhi Kong
- School of Communication Sciences, Beijing Language and Culture University, Beijing, China
| | - Yixin Zhao
- Department of Otolaryngology, Head and Neck Surgery, People’s Hospital, Peking University, Beijing, China
| | - Tongxiang Diao
- Department of Otolaryngology, Head and Neck Surgery, People’s Hospital, Peking University, Beijing, China
| | - Xin Ma
- Department of Otolaryngology, Head and Neck Surgery, People’s Hospital, Peking University, Beijing, China
| |
Collapse
|
30
|
Li Z, Hong B, Wang D, Nolte G, Engel AK, Zhang D. Speaker-listener neural coupling reveals a right-lateralized mechanism for non-native speech-in-noise comprehension. Cereb Cortex 2022; 33:3701-3714. [PMID: 35975617 DOI: 10.1093/cercor/bhac302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 11/14/2022] Open
Abstract
While the increasingly globalized world has brought more and more demands for non-native language communication, the prevalence of background noise in everyday life poses a great challenge to non-native speech comprehension. The present study employed an interbrain approach based on functional near-infrared spectroscopy (fNIRS) to explore how people adapt to comprehend non-native speech information in noise. A group of Korean participants who acquired Chinese as their non-native language was invited to listen to Chinese narratives at 4 noise levels (no noise, -2 dB, -6 dB, and -9 dB). These narratives were real-life stories spoken by native Chinese speakers. Processing of the non-native speech was associated with significant fNIRS-based listener-speaker neural couplings mainly over the right hemisphere at both the listener's and the speaker's sides. More importantly, the neural couplings from the listener's right superior temporal gyrus, the right middle temporal gyrus, as well as the right postcentral gyrus were found to be positively correlated with their individual comprehension performance at the strongest noise level (-9 dB). These results provide interbrain evidence in support of the right-lateralized mechanism for non-native speech processing and suggest that both an auditory-based and a sensorimotor-based mechanism contributed to the non-native speech-in-noise comprehension.
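A minimal sketch of one way to quantify listener-speaker coupling from two fNIRS time courses via lagged cross-correlation; the signals are simulated, and the paper's actual coupling measure may differ:

```python
import numpy as np
from scipy.signal import correlate

# Hypothetical denoised fNIRS time courses (same sampling rate) from one
# speaker channel and one listener channel during narrative listening.
rng = np.random.default_rng(4)
fs = 10.0                                # sampling rate in Hz
n = 3000
speaker = rng.normal(size=n)
listener = np.roll(speaker, 50) + rng.normal(scale=2.0, size=n)  # 5 s lag

# Normalized cross-correlation: coupling strength and the lag at which
# the listener's signal best tracks the speaker's.
z_sp = (speaker - speaker.mean()) / speaker.std()
z_li = (listener - listener.mean()) / listener.std()
xcorr = correlate(z_li, z_sp, mode="full") / n
lags = np.arange(-n + 1, n)
best = int(np.argmax(xcorr))
print(f"peak coupling r = {xcorr[best]:.2f} at lag {lags[best] / fs:+.1f} s")
```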
Collapse
Affiliation(s)
- Zhuoran Li
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
| | - Bo Hong
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China; Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
| | - Daifa Wang
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
| | - Guido Nolte
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, 20246 Hamburg, Germany
| | - Andreas K Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, 20246 Hamburg, Germany
| | - Dan Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
| |
Collapse
|
31
|
Preisig BC, Riecke L, Hervais-Adelman A. Speech sound categorization: The contribution of non-auditory and auditory cortical regions. Neuroimage 2022; 258:119375. [PMID: 35700949 DOI: 10.1016/j.neuroimage.2022.119375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Revised: 05/13/2022] [Accepted: 06/10/2022] [Indexed: 11/26/2022] Open
Abstract
Which processes in the human brain lead to the categorical perception of speech sounds? Investigation of this question is hampered by the fact that categorical speech perception is normally confounded by acoustic differences in the stimulus. By using ambiguous sounds, however, it is possible to dissociate acoustic from perceptual stimulus representations. Twenty-seven normally hearing individuals took part in an fMRI study in which they were presented with an ambiguous syllable (intermediate between /da/ and /ga/) in one ear and with a disambiguating acoustic feature (the third formant, F3) in the other ear. Multi-voxel pattern searchlight analysis was used to identify brain areas that consistently differentiated between response patterns associated with different syllable reports. By comparing responses to different stimuli with identical syllable reports and identical stimuli with different syllable reports, we disambiguated whether these regions primarily differentiated the acoustics of the stimuli or the syllable report. We found that BOLD activity patterns in left perisylvian regions (STG, SMG), left inferior frontal regions (vMC, IFG, AI), left supplementary motor cortex (SMA/pre-SMA), and right motor and somatosensory regions (M1/S1) represent listeners' syllable report irrespective of stimulus acoustics. Most of these regions are outside of what is traditionally regarded as auditory or phonological processing areas. Our results indicate that the process of speech sound categorization implicates decision-making mechanisms and auditory-motor transformations.
Collapse
Affiliation(s)
- Basil C Preisig
- Donders Institute for Brain, Cognition, and Behaviour, Radboud University, 6500 HB Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, 6525 XD Nijmegen, The Netherlands; Department of Psychology, Neurolinguistics, University of Zurich, 8050 Zurich, Switzerland; Department of Comparative Language Science, Evolutionary Neuroscience of Language, University of Zurich, 8050 Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, 8057 Zurich, Switzerland.
| | - Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6229 ER Maastricht, The Netherlands
| | - Alexis Hervais-Adelman
- Department of Psychology, Neurolinguistics, University of Zurich, 8050 Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and Eidgenössische Technische Hochschule Zurich, 8057 Zurich, Switzerland
| |
Collapse
|
32
|
Zendel BR. The importance of the motor system in the development of music-based forms of auditory rehabilitation. Ann N Y Acad Sci 2022; 1515:10-19. [PMID: 35648040 DOI: 10.1111/nyas.14810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Hearing abilities decline with age, and one of the most commonly reported hearing issues in older adults is a difficulty understanding speech when there is loud background noise. Understanding speech in noise relies on numerous cognitive processes, including working memory, and is supported by multiple brain regions, including the motor and motor-planning systems. Indeed, many working memory processes are supported by motor and premotor cortical regions. Interestingly, lifelong musicians and nonmusicians given music training over the course of weeks or months show an improved ability to understand speech when there is loud background noise. These benefits are associated with enhanced working memory abilities, and enhanced activity in motor and premotor cortical regions. Accordingly, it is likely that music training improves the coupling between the auditory and motor systems and promotes plasticity in these regions and regions that feed into auditory/motor areas. This leads to an enhanced ability to dynamically process incoming acoustic information, and is likely the reason that musicians and those who receive laboratory-based music training are better able to understand speech when there is background noise. Critically, these findings suggest that music-based forms of auditory rehabilitation are possible and should focus on tasks that promote auditory-motor interactions.
Collapse
Affiliation(s)
- Benjamin Rich Zendel
- Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada; Aging Research Centre - Newfoundland and Labrador, Grenfell Campus, Memorial University, Corner Brook, Newfoundland and Labrador, Canada
| |
Collapse
|
33
|
Zhang L, Du Y. Lip movements enhance speech representations and effective connectivity in auditory dorsal stream. Neuroimage 2022; 257:119311. [PMID: 35589000 DOI: 10.1016/j.neuroimage.2022.119311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2022] [Revised: 05/09/2022] [Accepted: 05/11/2022] [Indexed: 11/25/2022] Open
Abstract
Viewing a speaker's lip movements facilitates speech perception, especially under adverse listening conditions, but the neural mechanisms of this perceptual benefit at the phonemic and feature levels remain unclear. This fMRI study addressed this question by quantifying regional multivariate representation and network organization underlying audiovisual speech-in-noise perception. Behaviorally, valid lip movements improved recognition of place of articulation to aid phoneme identification. Meanwhile, lip movements enhanced neural representations of phonemes in left auditory dorsal stream regions, including frontal speech motor areas and supramarginal gyrus (SMG). Moreover, neural representations of place of articulation and voicing features were promoted differentially by lip movements in these regions, with voicing enhanced in Broca's area while place of articulation was better encoded in left ventral premotor cortex and SMG. Next, dynamic causal modeling (DCM) analysis showed that such local changes were accompanied by strengthened effective connectivity along the dorsal stream. Moreover, the neurite orientation dispersion of the left arcuate fasciculus, the bearing skeleton of the auditory dorsal stream, predicted the visual enhancements of neural representations and effective connectivity. Our findings provide novel insight for speech science: lip movements promote both local phonemic and feature encoding and network connectivity in the dorsal pathway, and this functional enhancement is mediated by the microstructural architecture of the circuit.
Collapse
Affiliation(s)
- Lei Zhang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049
| | - Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China 200031; Chinese Institute for Brain Research, Beijing, China 102206.
| |
Collapse
|
34
|
Malloy JR, Nistal D, Heyne M, Tardif MC, Bohland JW. Delayed Auditory Feedback Elicits Specific Patterns of Serial Order Errors in a Paced Syllable Sequence Production Task. J Speech Lang Hear Res 2022; 65:1800-1821. [PMID: 35442719 DOI: 10.1044/2022_jslhr-21-00427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
PURPOSE Delayed auditory feedback (DAF) interferes with speech output. DAF causes distorted and disfluent productions and errors in the serial order of produced sounds. Although DAF has been studied extensively, the specific patterns of elicited speech errors are somewhat obscured by relatively small speech samples, differences across studies, and uncontrolled variables. The goal of this study was to characterize the types of serial order errors that increase under DAF in a systematic syllable sequence production task, which used a closed set of sounds and controlled for speech rate. METHOD Sixteen adult speakers repeatedly produced CVCVCV (C = consonant, V = vowel) sequences, paced to a "visual metronome," while hearing self-generated feedback with delays of 0-250 ms. Listeners transcribed recordings, and speech errors were classified based on the literature surrounding naturally occurring slips of the tongue. A series of mixed-effects models were used to assess the effects of delay for different error types, for error arrival time, and for speaking rate. RESULTS DAF had a significant effect on the overall error rate for delays of 100 ms or greater. Statistical models revealed significant effects (relative to zero delay) for vowel and syllable repetitions, vowel exchanges, vowel omissions, onset disfluencies, and distortions. Serial order errors were especially dominated by vowel and syllable repetitions. Errors occurred earlier on average within a trial for longer feedback delays. Although longer delays caused slower speech, this effect was mediated by the run number (time in the experiment) and was small compared with those reported in previous studies. CONCLUSIONS DAF drives a specific pattern of serial order errors. The dominant pattern of vowel and syllable repetition errors suggests possible mechanisms whereby DAF drives changes to the activity in speech planning representations, yielding errors. These mechanisms are outlined with reference to the GODIVA (Gradient Order Directions Into Velocities of Articulators) model of speech planning and production. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.19601785.
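A minimal sketch of the mixed-effects approach described here, fitting error rate against feedback delay with a random intercept per speaker; the simulated data and the simple random-intercept structure are assumptions, not the authors' exact models:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-speaker error rates under each auditory feedback delay
# in a paced syllable-sequence production task.
rng = np.random.default_rng(5)
rows = []
for subject in range(16):
    offset = rng.normal(scale=0.02)            # random speaker baseline
    for delay in [0, 50, 100, 150, 200, 250]:  # feedback delay in ms
        rate = 0.02 + offset + 0.0004 * delay + abs(rng.normal(scale=0.01))
        rows.append({"subject": subject, "delay": delay, "error_rate": rate})
df = pd.DataFrame(rows)

# Mixed-effects model: fixed effect of delay, random intercept per speaker.
model = smf.mixedlm("error_rate ~ delay", df, groups=df["subject"]).fit()
print(model.summary())
```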
Collapse
Affiliation(s)
| | - Dominic Nistal
- Department of Neurological Surgery, University of Washington, Seattle
| | - Matthias Heyne
- Department of Communication Science and Disorders, University of Pittsburgh, PA
| | - Monique C Tardif
- Department of Communication Science and Disorders, University of Pittsburgh, PA
- Center for the Neural Basis of Cognition, Pittsburgh, PA
| | - Jason W Bohland
- Department of Communication Science and Disorders, University of Pittsburgh, PA
- Center for the Neural Basis of Cognition, Pittsburgh, PA
| |
Collapse
|
35
|
Rovetti J, Copelli F, Russo FA. Audio and visual speech emotion activate the left pre-supplementary motor area. Cogn Affect Behav Neurosci 2022; 22:291-303. [PMID: 34811708 DOI: 10.3758/s13415-021-00961-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 10/03/2021] [Indexed: 06/13/2023]
Abstract
Sensorimotor brain areas have been implicated in the recognition of emotion expressed on the face and through nonverbal vocalizations. However, no previous study has assessed whether sensorimotor cortices are recruited during the perception of emotion in speech, a signal that includes both audio (speech sounds) and visual (facial speech movements) components. To address this gap in the literature, we recruited 24 participants to listen to speech clips produced in a way that was either happy, sad, or neutral in expression. These stimuli also were presented in one of three modalities: audio-only (hearing the voice but not seeing the face), video-only (seeing the face but not hearing the voice), or audiovisual. Brain activity was recorded using electroencephalography, subjected to independent component analysis, and source-localized. We found that the left presupplementary motor area was more active in response to happy and sad stimuli than neutral stimuli, as indexed by greater mu event-related desynchronization. This effect did not differ by the sensory modality of the stimuli. Activity levels in other sensorimotor brain areas did not differ by emotion, although they were greatest in response to visual-only and audiovisual stimuli. One possible explanation for the pre-SMA result is that this brain area may actively support speech emotion recognition by using our extensive experience expressing emotion to generate sensory predictions that in turn guide our perception.
Collapse
Affiliation(s)
- Joseph Rovetti
- Department of Psychology, Ryerson University, Toronto, ON, M5B 2K3, Canada
- Department of Psychology, Western University, London, ON, Canada
| | - Fran Copelli
- Department of Psychology, Ryerson University, Toronto, ON, M5B 2K3, Canada
| | - Frank A Russo
- Department of Psychology, Ryerson University, Toronto, ON, M5B 2K3, Canada.
| |
Collapse
|
36
|
Ylinen A, Wikman P, Leminen M, Alho K. Task-dependent cortical activations during selective attention to audiovisual speech. Brain Res 2022; 1775:147739. [PMID: 34843702 DOI: 10.1016/j.brainres.2021.147739] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Revised: 10/21/2021] [Accepted: 11/21/2021] [Indexed: 11/28/2022]
Abstract
Selective listening to speech depends on widespread networks of the brain, but how the involvement of different neural systems in speech processing is affected by factors such as the task performed by a listener and speech intelligibility remains poorly understood. We used functional magnetic resonance imaging to systematically examine the effects that performing different tasks has on neural activations during selective attention to continuous audiovisual speech in the presence of task-irrelevant speech. Participants viewed audiovisual dialogues and attended either to the semantic or the phonological content of speech, or ignored speech altogether and performed a visual control task. The tasks were factorially combined with good and poor auditory and visual speech qualities. Selective attention to speech engaged superior temporal regions and the left inferior frontal gyrus regardless of the task. Frontoparietal regions implicated in selective auditory attention to simple sounds (e.g., tones, syllables) were not engaged by the semantic task, suggesting that this network may not be as crucial when attending to continuous speech. The medial orbitofrontal cortex, implicated in social cognition, was most activated by the semantic task. Activity levels during the phonological task in the left prefrontal, premotor, and secondary somatosensory regions had a distinct temporal profile as well as the highest overall activity, possibly relating to the role of the dorsal speech processing stream in sub-lexical processing. Our results demonstrate that the task type influences neural activations during selective attention to speech, and emphasize the importance of ecologically valid experimental designs.
Collapse
Affiliation(s)
- Artturi Ylinen
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland.
| | - Patrik Wikman
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Neuroscience, Georgetown University, Washington D.C., USA
| | - Miika Leminen
- Analytics and Data Services, HUS Helsinki University Hospital, Helsinki, Finland
| | - Kimmo Alho
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
| |
Collapse
|
37
|
Hierarchical cortical networks of "voice patches" for processing voices in human brain. Proc Natl Acad Sci U S A 2021; 118:2113887118. [PMID: 34930846 DOI: 10.1073/pnas.2113887118] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/11/2021] [Indexed: 12/26/2022] Open
Abstract
Humans have an extraordinary ability to recognize and differentiate voices. It remains unclear whether voices are uniquely processed in the human brain. To explore the underlying neural mechanisms of voice processing, we recorded electrocorticographic signals from intracranial electrodes in epilepsy patients while they listened to six different categories of voice and nonvoice sounds. Subregions in the temporal lobe exhibited preferences for distinct voice stimuli, which were defined as "voice patches." Latency analyses suggested a dual hierarchical organization of the voice patches. We also found that voice patches were functionally connected under both task-engaged and resting states. Furthermore, the left motor areas were coactivated and correlated with the temporal voice patches during the sound-listening task. Taken together, this work reveals hierarchical cortical networks in the human brain for processing human voices.
Collapse
|
38
|
Effect of Noise Reduction on Cortical Speech-in-Noise Processing and Its Variance due to Individual Noise Tolerance. Ear Hear 2021; 43:849-861. [PMID: 34751679 PMCID: PMC9010348 DOI: 10.1097/aud.0000000000001144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Despite the widespread use of noise reduction (NR) in modern digital hearing aids, our neurophysiological understanding of how NR affects speech-in-noise perception and why its effect is variable is limited. The current study aimed to (1) characterize the effect of NR on the neural processing of target speech and (2) seek neural determinants of individual differences in the NR effect on speech-in-noise performance, hypothesizing that an individual's own capability to inhibit background noise would inversely predict NR benefits in speech-in-noise perception. DESIGN Thirty-six adult listeners with normal hearing participated in the study. Behavioral and electroencephalographic responses were simultaneously obtained during a speech-in-noise task in which natural monosyllabic words were presented at three different signal-to-noise ratios, each with NR off and on. A within-subject analysis assessed the effect of NR on cortical evoked responses to target speech in the temporal-frontal speech and language brain regions, including supramarginal gyrus and inferior frontal gyrus in the left hemisphere. In addition, an across-subject analysis related an individual's tolerance to noise, measured as the amplitude ratio of auditory-cortical responses to target speech and background noise, to their speech-in-noise performance. RESULTS At the group level, in the poorest signal-to-noise ratio condition, NR significantly increased early supramarginal gyrus activity and decreased late inferior frontal gyrus activity, indicating a switch to more immediate lexical access and less effortful cognitive processing, although no improvement in behavioral performance was found. The across-subject analysis revealed that the cortical index of individual noise tolerance significantly correlated with NR-driven changes in speech-in-noise performance. CONCLUSIONS NR can facilitate speech-in-noise processing despite no improvement in behavioral performance. Findings from the current study also indicate that people with lower noise tolerance are more likely to benefit from NR. Overall, results suggest that future research should take a mechanistic approach to NR outcomes and individual noise tolerance.
Collapse
|
39
|
Defenderfer J, Forbes S, Wijeakumar S, Hedrick M, Plyler P, Buss AT. Frontotemporal activation differs between perception of simulated cochlear implant speech and speech in background noise: An image-based fNIRS study. Neuroimage 2021; 240:118385. [PMID: 34256138 PMCID: PMC8503862 DOI: 10.1016/j.neuroimage.2021.118385] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2021] [Revised: 06/10/2021] [Accepted: 07/09/2021] [Indexed: 10/27/2022] Open
Abstract
In this study we used functional near-infrared spectroscopy (fNIRS) to investigate neural responses in normal-hearing adults as a function of speech recognition accuracy, intelligibility of the speech stimulus, and the manner in which speech is distorted. Participants listened to sentences and reported aloud what they heard. Speech quality was distorted artificially by vocoding (simulated cochlear implant speech) or naturally by adding background noise. Each type of distortion included high and low-intelligibility conditions. Sentences in quiet were used as baseline comparison. fNIRS data were analyzed using a newly developed image reconstruction approach. First, elevated cortical responses in the middle temporal gyrus (MTG) and middle frontal gyrus (MFG) were associated with speech recognition during the low-intelligibility conditions. Second, activation in the MTG was associated with recognition of vocoded speech with low intelligibility, whereas MFG activity was largely driven by recognition of speech in background noise, suggesting that the cortical response varies as a function of distortion type. Lastly, an accuracy effect in the MFG demonstrated significantly higher activation during correct perception relative to incorrect perception of speech. These results suggest that normal-hearing adults (i.e., untrained listeners of vocoded stimuli) do not exploit the same attentional mechanisms of the frontal cortex used to resolve naturally degraded speech and may instead rely on segmental and phonetic analyses in the temporal lobe to discriminate vocoded speech.
Collapse
Affiliation(s)
- Jessica Defenderfer
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States.
| | - Samuel Forbes
- Psychology, University of East Anglia, Norwich, England.
| | | | - Mark Hedrick
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States.
| | - Patrick Plyler
- Speech and Hearing Science, University of Tennessee Health Science Center, Knoxville, TN, United States.
| | - Aaron T Buss
- Psychology, University of Tennessee, Knoxville, TN, United States.
| |
Collapse
|
40
|
Reduced Semantic Context and Signal-to-Noise Ratio Increase Listening Effort As Measured Using Functional Near-Infrared Spectroscopy. Ear Hear 2021; 43:836-848. [PMID: 34623112 DOI: 10.1097/aud.0000000000001137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
OBJECTIVES Understanding speech-in-noise can be highly effortful. Decreasing the signal-to-noise ratio (SNR) of speech increases listening effort, but it is relatively unclear if decreasing the level of semantic context does as well. The current study used functional near-infrared spectroscopy to evaluate two primary hypotheses: (1) listening effort (operationalized as oxygenation of the left lateral PFC) increases as the SNR decreases and (2) listening effort increases as context decreases. DESIGN Twenty-eight younger adults with normal hearing completed the Revised Speech Perception in Noise Test, in which they listened to sentences and reported the final word. These sentences either had an easy SNR (+4 dB) or a hard SNR (-2 dB), and were either low in semantic context (e.g., "Tom could have thought about the sport") or high in context (e.g., "She had to vacuum the rug"). PFC oxygenation was measured throughout using functional near-infrared spectroscopy. RESULTS Accuracy on the Revised Speech Perception in Noise Test was worse when the SNR was hard than when it was easy, and worse for sentences low in semantic context than high in context. Similarly, oxygenation across the entire PFC (including the left lateral PFC) was greater when the SNR was hard, and left lateral PFC oxygenation was greater when context was low. CONCLUSIONS These results suggest that activation of the left lateral PFC (interpreted here as reflecting listening effort) increases to compensate for acoustic and linguistic challenges. This may reflect the increased engagement of domain-general and domain-specific processes subserved by the dorsolateral prefrontal cortex (e.g., cognitive control) and inferior frontal gyrus (e.g., predicting the sensory consequences of articulatory gestures), respectively.
|
41
|
Language statistical learning responds to reinforcement learning principles rooted in the striatum. PLoS Biol 2021; 19:e3001119. [PMID: 34491980 PMCID: PMC8448350 DOI: 10.1371/journal.pbio.3001119] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2021] [Revised: 09/17/2021] [Accepted: 08/02/2021] [Indexed: 11/23/2022] Open
Abstract
Statistical learning (SL) is the ability to extract regularities from the environment. In the domain of language, this ability is fundamental to the learning of words and structural rules. In the absence of reliable online measures, statistical word and rule learning have primarily been investigated using offline (post-familiarization) tests, which give limited insight into the dynamics of SL and its neural basis. Here, we capitalize on a novel task that tracks the online SL of simple syntactic structures, combined with computational modeling, to show that online SL responds to reinforcement learning principles rooted in striatal function. Specifically, we demonstrate, in 2 different cohorts, that a temporal difference model, which relies on prediction errors, accounts for participants' online learning behavior. We then show that the trial-by-trial development of predictions through learning strongly correlates with activity in both ventral and dorsal striatum. Our results thus provide a detailed mechanistic account of language-related SL and an explanation for the oft-cited implication of the striatum in SL tasks. This work, combining computational modelling and functional MRI, therefore bridges the long-standing gap between language learning and reinforcement learning at both the algorithmic and implementational levels.
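As a point of reference, the core of a temporal difference model is a value update driven by prediction errors. The sketch below shows a generic TD(0) learner; the authors' fitted model was tailored to their syntactic-structure task, so treat this only as an illustration of the principle.

```python
import numpy as np

def td_zero(episodes, n_states, alpha=0.1, gamma=0.9):
    """Generic TD(0): update state values from prediction errors,
    delta = r + gamma * V(s') - V(s)."""
    V = np.zeros(n_states)
    for s, r, s_next in episodes:              # (state, reward, next state)
        delta = r + gamma * V[s_next] - V[s]   # reward-prediction error
        V[s] += alpha * delta                  # learning step
    return V

# e.g., td_zero([(0, 0.0, 1), (1, 1.0, 2)], n_states=3)
```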
|
42
|
Viswanathan V, Bharadwaj HM, Shinn-Cunningham BG, Heinz MG. Modulation masking and fine structure shape neural envelope coding to predict speech intelligibility across diverse listening conditions. J Acoust Soc Am 2021; 150:2230. [PMID: 34598642 PMCID: PMC8483789 DOI: 10.1121/10.0006385] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Revised: 07/22/2021] [Accepted: 08/30/2021] [Indexed: 05/28/2023]
Abstract
A fundamental question in the neuroscience of everyday communication is how scene acoustics shape the neural processing of attended speech sounds and in turn impact speech intelligibility. While it is well known that the temporal envelopes in target speech are important for intelligibility, how the neural encoding of target-speech envelopes is influenced by background sounds or other acoustic features of the scene is unknown. Here, we combine human electroencephalography with simultaneous intelligibility measurements to address this key gap. We find that the neural envelope-domain signal-to-noise ratio in target-speech encoding, which is shaped by masker modulations, predicts intelligibility over a range of strategically chosen realistic listening conditions unseen by the predictive model. This provides neurophysiological evidence for modulation masking. Moreover, using high-resolution vocoding to carefully control peripheral envelopes, we show that target-envelope coding fidelity in the brain depends not only on envelopes conveyed by the cochlea, but also on the temporal fine structure (TFS), which supports scene segregation. Our results are consistent with the notion that temporal coherence of sound elements across envelopes and/or TFS influences scene analysis and attentive selection of a target sound. Our findings also inform speech-intelligibility models and technologies attempting to improve real-world speech communication.
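The paper's envelope-domain SNR was derived from EEG responses; as a purely acoustic analogue, one can compare target and masker power in the slow amplitude-envelope (modulation) domain. A sketch under that simplification, with an assumed 1-8 Hz syllabic-rate band.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def modulation_band_power(x, fs, band=(1.0, 8.0)):
    """Power of the slow amplitude envelope of `x` within `band` (Hz)."""
    envelope = np.abs(hilbert(x))
    sos = butter(2, band, btype="band", fs=fs, output="sos")
    return np.mean(sosfiltfilt(sos, envelope) ** 2)

def envelope_snr_db(target, masker, fs):
    # Acoustic stand-in for the neural envelope-domain SNR in the paper.
    return 10 * np.log10(modulation_band_power(target, fs)
                         / modulation_band_power(masker, fs))
```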
Affiliation(s)
- Vibha Viswanathan, Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907, USA
- Hari M Bharadwaj, Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Michael G Heinz, Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
|
43
|
Venezia JH, Richards VM, Hickok G. Speech-Driven Spectrotemporal Receptive Fields Beyond the Auditory Cortex. Hear Res 2021; 408:108307. [PMID: 34311190 PMCID: PMC8378265 DOI: 10.1016/j.heares.2021.108307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/08/2021] [Revised: 06/15/2021] [Accepted: 06/30/2021] [Indexed: 10/20/2022]
Abstract
We recently developed a method to estimate speech-driven spectrotemporal receptive fields (STRFs) using fMRI. The method uses spectrotemporal modulation filtering, a form of acoustic distortion that renders speech sometimes intelligible and sometimes unintelligible. Using this method, we found significant STRF responses only in classic auditory regions throughout the superior temporal lobes. However, our analysis was not optimized to detect small clusters of STRFs as might be expected in non-auditory regions. Here, we re-analyze our data using a more sensitive multivariate statistical test for cross-subject alignment of STRFs, and we identify STRF responses in non-auditory regions including the left dorsal premotor cortex (dPM), left inferior frontal gyrus (IFG), and bilateral calcarine sulcus (calcS). All three regions responded more to intelligible than unintelligible speech, but left dPM and calcS responded significantly to vocal pitch and demonstrated strong functional connectivity with early auditory regions. Left dPM's STRF generated the best predictions of activation on trials rated as unintelligible by listeners, a hallmark auditory profile. IFG, on the other hand, responded almost exclusively to intelligible speech and was functionally connected with classic speech-language regions in the superior temporal sulcus and middle temporal gyrus. IFG's STRF was also (weakly) able to predict activation on unintelligible trials, suggesting the presence of a partial 'acoustic trace' in the region. We conclude that left dPM is part of the human dorsal laryngeal motor cortex, a region previously shown to be capable of operating in an 'auditory mode' to encode vocal pitch. Further, given previous observations that IFG is involved in syntactic working memory and/or processing of linear order, we conclude that IFG is part of a higher-order speech circuit that exerts a top-down influence on processing of speech acoustics. Finally, because calcS is modulated by emotion, we speculate that changes in the quality of vocal pitch may have contributed to its response.
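For orientation, an STRF is a linear mapping from a time-frequency representation of the stimulus to a response time series. The ridge-regression estimator below is the textbook version, not the authors' fMRI-specific procedure (which relied on spectrotemporal modulation filtering).

```python
import numpy as np

def estimate_strf(spectrogram, response, n_lags, lam=1.0):
    """Textbook ridge-regression STRF: predict a response time series from
    time-lagged spectrogram frames. spectrogram: (T, F); response: (T,)."""
    T, F = spectrogram.shape
    X = np.zeros((T, F * n_lags))              # lagged design matrix
    for lag in range(n_lags):
        X[lag:, lag * F:(lag + 1) * F] = spectrogram[: T - lag]
    w = np.linalg.solve(X.T @ X + lam * np.eye(F * n_lags), X.T @ response)
    return w.reshape(n_lags, F)                # STRF: lags x frequencies
```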
Affiliation(s)
- Jonathan H Venezia, VA Loma Linda Healthcare System, Loma Linda, CA, United States; Dept. of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, United States.
- Virginia M Richards, Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, United States
- Gregory Hickok, Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, United States
|
44
|
Geller J, Holmes A, Schwalje A, Berger JI, Gander PE, Choi I, McMurray B. Validation of the Iowa Test of Consonant Perception. J Acoust Soc Am 2021; 150:2131. [PMID: 34598595 PMCID: PMC8637717 DOI: 10.1121/10.0006246] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 08/20/2021] [Accepted: 08/24/2021] [Indexed: 05/22/2023]
Abstract
Speech perception (especially in background noise) is a critical problem for hearing-impaired listeners and an important issue for cognitive hearing science. Despite a plethora of standardized measures, few single-word closed-set tests uniformly sample the most frequently used phonemes and use response choices that equally sample phonetic features like place and voicing. The Iowa Test of Consonant Perception (ITCP) attempts to solve this. It is a proportionally balanced phonemic word recognition task designed to assess perception of the initial consonant of monosyllabic consonant-vowel-consonant (CVC) words. The ITCP consists of 120 sampled CVC words. Words were recorded from four different talkers (two female) and uniformly sampled from all four quadrants of the vowel space to control for coarticulation. Response choices on each trial are balanced to equate difficulty and sample a single phonetic feature. This study evaluated the psychometric properties of the ITCP by examining reliability (test-retest) and validity in a sample of online normal-hearing participants. Ninety-eight participants completed two sessions of the ITCP along with standardized tests of word and sentence recognition in noise (CNC words and AzBio sentences). The ITCP showed good test-retest reliability and convergent validity with two popular tests presented in noise. All materials needed to administer the ITCP, or to construct a new version of it, are freely available [Geller, McMurray, Holmes, and Choi (2020). https://osf.io/hycdu/].
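Both psychometric checks reduce to correlations across participants: the same test twice (reliability) and the test against an established measure (validity). A toy sketch with hypothetical scores; the real data are at https://osf.io/hycdu/.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-participant proportion-correct scores (illustration only).
itcp_session1 = np.array([0.82, 0.75, 0.91, 0.68, 0.79])
itcp_session2 = np.array([0.80, 0.77, 0.88, 0.70, 0.81])
azbio_in_noise = np.array([0.78, 0.70, 0.90, 0.65, 0.76])

r_retest, _ = pearsonr(itcp_session1, itcp_session2)       # test-retest reliability
r_convergent, _ = pearsonr(itcp_session1, azbio_in_noise)  # convergent validity
print(f"test-retest r = {r_retest:.2f}, convergent r = {r_convergent:.2f}")
```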
Affiliation(s)
- Jason Geller, Department of Psychological and Brain Sciences, University of Iowa, G60 Psychological and Brain Sciences Building, Iowa City, Iowa 52242, USA
- Ann Holmes, Department of Psychological and Brain Sciences, University of Iowa, G60 Psychological and Brain Sciences Building, Iowa City, Iowa 52242, USA
- Adam Schwalje, Department of Otolaryngology-Head and Neck Surgery, University of Iowa, 200 Hawkins Drive, 21151 Pomerantz Family Pavilion, Iowa City, Iowa 52242, USA
- Joel I Berger, Department of Neurosurgery, University of Iowa, 200 Hawkins Drive, 1800 John Pappajohn Pavilion, Iowa City, Iowa 52242, USA
- Phillip E Gander, Department of Neurosurgery, University of Iowa, 200 Hawkins Drive, 1800 John Pappajohn Pavilion, Iowa City, Iowa 52242, USA
- Inyong Choi, Department of Communication Sciences and Disorders, University of Iowa, Wendell Johnson Speech and Hearing Center, Iowa City, Iowa 52242, USA
- Bob McMurray, Department of Psychological and Brain Sciences, University of Iowa, G60 Psychological and Brain Sciences Building, Iowa City, Iowa 52242, USA
|
45
|
Li Z, Li J, Hong B, Nolte G, Engel AK, Zhang D. Speaker-Listener Neural Coupling Reveals an Adaptive Mechanism for Speech Comprehension in a Noisy Environment. Cereb Cortex 2021; 31:4719-4729. [PMID: 33969389 DOI: 10.1093/cercor/bhab118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2021] [Revised: 03/25/2021] [Indexed: 01/01/2023] Open
Abstract
Comprehending speech in noise is an essential cognitive skill for verbal communication. However, it remains unclear how our brain adapts to noisy environments to achieve comprehension. The present study investigated the neural mechanisms of speech comprehension in noise using a functional near-infrared spectroscopy-based inter-brain approach. A group of speakers was invited to tell real-life stories. The recorded speech audio was mixed with meaningless white noise at four signal-to-noise levels and then played to listeners. Results showed that speaker-listener neural coupling in the listener's left inferior frontal gyrus (IFG; part of the sensorimotor system) and in the right middle temporal gyrus (MTG) and angular gyrus (AG; parts of the auditory system) was significantly higher in the listening conditions than in the baseline. More importantly, the correlation between neural coupling of the listener's left IFG and comprehension performance gradually became more positive with increasing noise level, indicating an adaptive role of the sensorimotor system in noisy speech comprehension; in contrast, significant behavioral correlations for the coupling of the listener's right MTG and AG were obtained only in mild noise conditions, indicating a different and less robust mechanism. In sum, speaker-listener coupling analysis provides added value and new insight into the neural mechanisms of speech-in-noise comprehension.
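Inter-brain coupling analyses of this kind generally ask how well the listener's signal tracks the speaker's signal at some neural delay. The sketch below uses peak lagged Pearson correlation as a simple stand-in; the study's exact coupling measure is not specified in the abstract.

```python
import numpy as np

def peak_lagged_coupling(speaker, listener, fs, max_lag_s=10.0):
    """Stand-in coupling measure: maximum Pearson correlation between the
    speaker's signal and the listener's signal delayed by 0..max_lag_s
    seconds (listener activity is expected to trail the speech).
    Assumes equal-length series longer than the maximum lag."""
    max_lag = int(max_lag_s * fs)
    best = -1.0
    for lag in range(max_lag + 1):
        r = np.corrcoef(speaker[: len(speaker) - lag], listener[lag:])[0, 1]
        best = max(best, r)
    return best
```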
Affiliation(s)
- Zhuoran Li, Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Jiawei Li, Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Bo Hong, Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China; Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
- Guido Nolte, Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, Hamburg 20246, Germany
- Andreas K Engel, Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, Hamburg 20246, Germany
- Dan Zhang, Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
|
46
|
Ren F, Ma W, Zong W, Li N, Li X, Li F, Wu L, Li H, Li M, Gao F. Brain Frequency-Specific Changes in the Spontaneous Neural Activity Are Associated With Cognitive Impairment in Patients With Presbycusis. Front Aging Neurosci 2021; 13:649874. [PMID: 34335224 PMCID: PMC8316979 DOI: 10.3389/fnagi.2021.649874] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2021] [Accepted: 06/11/2021] [Indexed: 11/13/2022] Open
Abstract
Presbycusis (PC) is characterized by preferential hearing loss at high frequencies and difficulty in speech recognition in noisy environments. Previous studies have linked PC to cognitive impairment, accelerated cognitive decline, and incident Alzheimer's disease. However, the neural mechanisms of cognitive impairment in patients with PC remain unclear. Although resting-state functional magnetic resonance imaging (rs-fMRI) studies have explored low-frequency oscillation (LFO) connectivity or amplitude of PC-related neural activity, it remains unclear whether the abnormalities occur across all frequency bands or within specific bands. Fifty-one PC patients and fifty-one well-matched normal-hearing controls participated in this study. LFO amplitudes were investigated using the amplitude of low-frequency fluctuation (ALFF) in different frequency bands (slow-4 and slow-5). PC patients showed abnormal LFO amplitudes in Heschl's gyrus, the dorsolateral prefrontal cortex (dlPFC), the frontal eye field, and key nodes of the speech network exclusively in slow-4, suggesting that abnormal spontaneous neural activity in PC is frequency dependent. Our findings also revealed stronger functional connectivity between the dlPFC and the posterodorsal stream of auditory processing, as well as weaker functional coupling between the posterior cingulate cortex (PCC) and key nodes of the default mode network (DMN), both of which were associated with cognitive impairment in PC patients. These results may reflect cross-modal plasticity and higher-order cognitive involvement of the auditory cortex after partial hearing deprivation. Our findings indicate that frequency-specific analysis of ALFF can provide valuable insights into functional alterations in the auditory cortex and in non-auditory regions involved in the cognitive impairment associated with PC.
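ALFF is simply the average spectral amplitude of a voxel's resting-state time series within a low-frequency band. A minimal sketch; the slow-4/slow-5 band edges follow the convention in the rs-fMRI literature and are an assumption about this study's exact values.

```python
import numpy as np

# Conventional sub-band edges from the rs-fMRI literature (assumed here):
SLOW_5 = (0.010, 0.027)   # Hz
SLOW_4 = (0.027, 0.073)   # Hz

def alff(timeseries, tr, band):
    """ALFF: mean spectral amplitude of a voxel's time series within `band`."""
    x = timeseries - np.mean(timeseries)          # remove the DC component
    freqs = np.fft.rfftfreq(len(x), d=tr)
    amplitude = np.abs(np.fft.rfft(x)) / len(x)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return np.mean(amplitude[in_band])

# e.g., alff(voxel_ts, tr=2.0, band=SLOW_4)
```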
Affiliation(s)
- Fuxin Ren, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Wen Ma, Department of Otolaryngology, The Central Hospital of Jinan City, Cheeloo College of Medicine, Shandong University, Jinan, China
- Wei Zong, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Ning Li, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Xiao Li, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Fuyan Li, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Lili Wu, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- Honghao Li, Department of Neurology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
- Muwei Li, Vanderbilt University Institute of Imaging Science, Nashville, TN, United States
- Fei Gao, Department of Radiology, Shandong Provincial Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
|
47
|
Wang J, Chen J, Yang X, Liu L, Wu C, Lu L, Li L, Wu Y. Common Brain Substrates Underlying Auditory Speech Priming and Perceived Spatial Separation. Front Neurosci 2021; 15:664985. [PMID: 34220425 PMCID: PMC8247760 DOI: 10.3389/fnins.2021.664985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2021] [Accepted: 05/10/2021] [Indexed: 11/22/2022] Open
Abstract
In a "cocktail party" environment, listeners can use prior knowledge of the content and voice of the target speech [i.e., auditory speech priming (ASP)] and perceived spatial separation to improve recognition of the target speech among masking speech. Previous studies suggest that these two unmasking cues are not processed independently. However, it is unclear whether their unmasking effects are supported by common neural bases. In the current study, we aimed to first confirm that ASP and perceived spatial separation contribute interactively to improved speech recognition in a multitalker condition, and then to investigate whether overlapping brain substrates underlie both unmasking effects, by introducing the two cues in a unified paradigm and using functional magnetic resonance imaging. The results showed that neural activations associated with the unmasking effects of ASP and perceived separation partly overlapped in the left pars triangularis (TriIFG) and pars orbitalis of the inferior frontal gyrus, the left inferior parietal lobule, the left supramarginal gyrus, and the bilateral putamen, all of which are involved in sensorimotor integration and speech production. Activation of the left TriIFG was correlated with the behavioral improvements produced by ASP and perceived separation. ASP and perceived separation also enhanced functional connectivity between the left IFG and brain areas related to the suppression of distracting speech signals: the anterior cingulate cortex and the left middle frontal gyrus, respectively. These findings suggest that the motor representation of speech is important for the unmasking effects of both ASP and perceived separation and highlight the critical role of the left IFG in these effects in "cocktail party" environments.
Affiliation(s)
- Junxian Wang, School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Jing Chen, Department of Machine Intelligence, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Xiaodong Yang, School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Lei Liu, School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Chao Wu, School of Nursing, Peking University, Beijing, China
- Lingxi Lu, Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing, China
- Liang Li, School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
- Yanhong Wu, School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
|
48
|
Levy DF, Wilson SM. Categorical Encoding of Vowels in Primary Auditory Cortex. Cereb Cortex 2020; 30:618-627. [PMID: 31241149 DOI: 10.1093/cercor/bhz112] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2019] [Revised: 04/05/2019] [Accepted: 05/02/2019] [Indexed: 11/14/2022] Open
Abstract
Speech perception involves mapping from a continuous and variable acoustic speech signal to discrete, linguistically meaningful units. However, it is unclear where in the auditory processing stream speech sound representations cease to be veridical (faithfully encoding precise acoustic properties) and become categorical (encoding sounds as linguistic categories). In this study, we used functional magnetic resonance imaging and multivariate pattern analysis to determine whether tonotopic primary auditory cortex (PAC), defined as tonotopic voxels falling within Heschl's gyrus, represents one class of speech sounds, vowels, veridically or categorically. For each of 15 participants, 4 individualized synthetic vowel stimuli were generated such that the vowels were equidistant in acoustic space yet straddled a categorical boundary (with the first 2 vowels perceived as [i] and the last 2 perceived as [ɪ]). Each participant's 4 vowels were then presented in a block design with an irrelevant but attention-demanding level change detection task. We found that in PAC bilaterally, neural discrimination between pairs of vowels that crossed the categorical boundary was more accurate than neural discrimination between equivalently spaced vowel pairs that fell within a category. These findings suggest that PAC does not represent vowel sounds veridically, but that encoding of vowels is shaped by linguistically relevant phonemic categories.
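The logic of the test is that a decoder should separate across-boundary vowel pairs better than within-category pairs if encoding is categorical. A sketch of cross-validated pairwise decoding; the classifier choice is an assumption, since the abstract specifies only multivariate pattern analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def pair_decoding_accuracy(patterns, labels, vowel_a, vowel_b):
    """Cross-validated accuracy for discriminating two vowels from voxel
    patterns (one row per trial, one integer label per row)."""
    keep = np.isin(labels, [vowel_a, vowel_b])
    X, y = patterns[keep], labels[keep]
    return cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()

# Categorical encoding predicts that the across-boundary pair (vowels 2 vs 3)
# decodes better than the within-category pairs (1 vs 2, and 3 vs 4).
```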
Affiliation(s)
- Deborah F Levy, Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Stephen M Wilson, Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
|
49
|
Li X, Zatorre RJ, Du Y. The Microstructural Plasticity of the Arcuate Fasciculus Undergirds Improved Speech in Noise Perception in Musicians. Cereb Cortex 2021; 31:3975-3985. [PMID: 34037726 PMCID: PMC8328222 DOI: 10.1093/cercor/bhab063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
Musical training is thought to be related to improved language skills, for example, understanding speech in background noise. Although studies have found that musicians and nonmusicians differ in the morphology of the bilateral arcuate fasciculus (AF), none has associated such white matter features with speech-in-noise (SIN) perception. Here, we tested both SIN perception and the diffusivity of bilateral AF segments in musicians and nonmusicians using diffusion tensor imaging. Compared with nonmusicians, musicians had higher fractional anisotropy (FA) in the right direct AF and lower radial diffusivity in the left anterior AF, both of which correlated with SIN performance. The FA-based laterality index showed stronger right lateralization of the direct AF and stronger left lateralization of the posterior AF in musicians than in nonmusicians, with the posterior AF laterality predicting SIN accuracy. Furthermore, hemodynamic activity in the right superior temporal gyrus obtained during a SIN task fully mediated the contribution of right direct AF diffusivity to SIN performance, thereby linking training-related white matter plasticity, brain hemodynamics, and speech perception ability. Our findings provide direct evidence that differential microstructural plasticity of bilateral AF segments may serve as a neural foundation for the cross-domain transfer of musical experience to speech perception amid competing noise.
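An FA-based laterality index is conventionally the normalized left-right difference, so positive values indicate left lateralization. A one-line sketch under that assumed convention; the paper's exact formula is not given in the abstract.

```python
def laterality_index(fa_left, fa_right):
    """Normalized left-right FA difference: positive = left-lateralized,
    negative = right-lateralized."""
    return (fa_left - fa_right) / (fa_left + fa_right)

# e.g., laterality_index(0.52, 0.48) -> +0.04 (weakly left-lateralized)
```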
Affiliation(s)
- Xiaonan Li, CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Robert J Zatorre, Montréal Neurological Institute, McGill University, Montréal, QC H3A 2B4, Canada; International Laboratory for Brain, Music, and Sound Research (BRAMS), Montréal, QC H3A 2B4, Canada; Centre for Research on Brain, Language and Music (CRBLM), Montreal, QC H3A 2B4, Canada
- Yi Du, CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai 200031, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China; Chinese Institute for Brain Research, Beijing 102206, China
|
50
|
Guediche S, de Bruin A, Caballero-Gaudes C, Baart M, Samuel AG. Second-language word recognition in noise: Interdependent neuromodulatory effects of semantic context and crosslinguistic interactions driven by word form similarity. Neuroimage 2021; 237:118168. [PMID: 34000398 DOI: 10.1016/j.neuroimage.2021.118168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2020] [Revised: 05/05/2021] [Accepted: 05/12/2021] [Indexed: 11/17/2022] Open
Abstract
Spoken language comprehension is a fundamental component of our cognitive skills. We are quite proficient at deciphering words from the auditory input despite the fact that the speech we hear is often masked by noise, such as background babble originating from talkers other than the one we are attending to. To perceive spoken language as intended, we rely on prior linguistic knowledge and on context. Prior knowledge includes all sounds and words that are familiar to a listener and depends on linguistic experience. For bilinguals, the phonetic and lexical repertoire encompasses two languages, and the degree of overlap between word forms across languages affects the degree to which the languages influence one another during auditory word recognition. To support spoken word recognition, listeners often rely on semantic information (i.e., the words we hear are usually related in a meaningful way). Although the number of multilinguals across the globe is increasing, little is known about how crosslinguistic effects (i.e., word-form overlap) interact with semantic context and affect the flexible neural systems that support accurate word recognition. The current multi-echo functional magnetic resonance imaging (fMRI) study addresses this question by examining how the semantic relationship between prime and target words interacts with the target word's form similarity (cognate status) to its translation equivalent in the dominant language (L1) during accurate word recognition in the non-dominant (L2) language. We tested 26 early-proficient Spanish-Basque (L1-L2) bilinguals. When L2 targets matching L1 translation-equivalent phonological word forms were preceded by unrelated semantic contexts that drive lexical competition, a flexible language control (fronto-parietal-subcortical) network was upregulated, whereas when they were preceded by related semantic contexts that reduce lexical competition, it was downregulated. We conclude that an interplay between semantic and crosslinguistic effects regulates flexible control mechanisms of speech processing to facilitate L2 word recognition in noise.
Affiliation(s)
- Sara Guediche, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain.
- Martijn Baart, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain; Department of Cognitive Neuropsychology, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, the Netherlands
- Arthur G Samuel, Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain; Stony Brook University, NY 11794-2500, United States; Ikerbasque Foundation, Spain
|