1
McFarlane KA, Sanchez JT. Effects of Temporal Processing on Speech-in-Noise Perception in Middle-Aged Adults. Biology 2024;13:371. PMID: 38927251; PMCID: PMC11200514; DOI: 10.3390/biology13060371.
Abstract
Auditory temporal processing is a vital component of auditory stream segregation, the process by which complex sounds are separated and organized into perceptually meaningful objects. Temporal processing can degrade before hearing loss appears and is thought to contribute to speech-in-noise difficulties in normal-hearing listeners. The current study tested this hypothesis in middle-aged adults, an under-investigated cohort despite being the age group in which speech-in-noise difficulties are first reported. In 76 participants, three mechanisms of temporal processing were measured: peripheral auditory nerve function using electrocochleography, subcortical encoding of periodic speech cues (i.e., fundamental frequency; F0) using the frequency-following response, and binaural sensitivity to temporal fine structure (TFS) using a dichotic frequency-modulation detection task. Two measures of speech-in-noise perception were administered to explore how the contributions of temporal processing may be mediated by the different sensory demands of each speech perception task. The results supported the hypothesis that temporal coding deficits contribute to speech-in-noise difficulties in middle-aged listeners: poorer speech-in-noise perception was associated with weaker subcortical F0 encoding and with poorer binaural TFS sensitivity, but in different contexts, highlighting that distinct aspects of temporal processing are differentially utilized depending on the characteristics of the speech-in-noise task.
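The subcortical F0-encoding measure described above is typically quantified as the spectral magnitude of the frequency-following response at the stimulus fundamental frequency. A minimal sketch of that computation, using a single-bin discrete Fourier transform on a synthetic response (all signal parameters here are illustrative, not taken from the study):

```python
import math, cmath

def spectral_magnitude(signal, fs, freq):
    """Single-bin DFT: amplitude of `signal` at `freq` Hz (sampling rate fs)."""
    n = len(signal)
    # Correlate the signal with a complex exponential at the target frequency.
    acc = sum(signal[k] * cmath.exp(-2j * math.pi * freq * k / fs)
              for k in range(n))
    return 2 * abs(acc) / n  # scale to per-component amplitude

# Illustrative "FFR": a 100 Hz component (the F0) plus a weaker 300 Hz harmonic.
fs = 10_000                      # Hz
t = [k / fs for k in range(fs)]  # 1 s of samples
ffr = [0.8 * math.sin(2 * math.pi * 100 * x)
       + 0.2 * math.sin(2 * math.pi * 300 * x) for x in t]

f0_mag = spectral_magnitude(ffr, fs, 100.0)
print(round(f0_mag, 2))  # prints 0.8, the amplitude of the F0 component
```

Weaker F0 encoding in this framework would simply show up as a smaller magnitude at the fundamental.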
Affiliation(s)
- Kailyn A. McFarlane
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Jason Tait Sanchez
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Knowles Hearing Center, Northwestern University, Evanston, IL 60208, USA
- Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
2
Shen J, Sun J, Zhang Z, Sun B, Li H, Liu Y. The Effect of Hearing Loss and Working Memory Capacity on Context Use and Reliance on Context in Older Adults. Ear Hear 2024;45:787-800. PMID: 38273447; DOI: 10.1097/aud.0000000000001470.
Abstract
OBJECTIVES Older adults often complain of difficulty in communicating in noisy environments. Contextual information is considered an important cue for identifying everyday speech. To date, it has not been clear exactly how context use (CU) and reliance on context in older adults are affected by hearing status and cognitive function. The present study examined the effects of semantic context on the performance of speech recognition, recall, perceived listening effort (LE), and noise tolerance, and further explored the impacts of hearing loss and working memory capacity on CU and reliance on context among older adults. DESIGN Fifty older adults with normal hearing and 56 older adults with mild-to-moderate hearing loss between the ages of 60 and 95 years participated in this study. A median split of the backward digit span further classified the participants into high working memory (HWM) and low working memory (LWM) capacity groups. Each participant performed high- and low-context Repeat and Recall tests, including a sentence repeat and delayed recall task, subjective assessments of LE, and tolerable time under seven signal to noise ratios (SNRs). CU was calculated as the difference between high- and low-context sentences for each outcome measure. The proportion of context use (PCU) in high-context performance was taken as the reliance on context to explain the degree to which participants relied on context when they repeated and recalled high-context sentences. RESULTS Semantic context helps improve the performance of speech recognition and delayed recall, reduces perceived LE, and prolongs noise tolerance in older adults with and without hearing loss. In addition, the adverse effects of hearing loss on the performance of repeat tasks were more pronounced in low context than in high context, whereas the effects on recall tasks and noise tolerance time were more significant in high context than in low context. 
Compared with other tasks, the CU and PCU in repeat tasks were more affected by hearing status and working memory capacity. In the repeat phase, hearing loss increased older adults' reliance on context in relatively challenging listening environments: at SNRs of 0 and -5 dB, the PCU (repeat) of the hearing loss group was significantly greater than that of the normal-hearing group, whereas the two hearing groups did not differ at the remaining SNRs. In addition, older adults with LWM had significantly greater CU and PCU in repeat tasks than those with HWM, especially at SNRs with moderate task demands. CONCLUSIONS Taken together, semantic context not only improved speech perception intelligibility but also released cognitive resources for memory encoding in older adults. Mild-to-moderate hearing loss and LWM capacity significantly increased older adults' use of and reliance on semantic context, and these effects were modulated by SNR.
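The context-use metrics in this study are simple arithmetic on paired scores: CU is the high-context minus low-context difference on a given outcome, and PCU normalizes that difference by the high-context score. A small sketch with made-up recognition accuracies (the numbers are illustrative, not the study's data, and the PCU definition is our reading of "proportion of context use in high-context performance"):

```python
def context_use(high, low):
    """CU: performance benefit from semantic context (high minus low context)."""
    return high - low

def proportion_context_use(high, low):
    """PCU: share of high-context performance attributable to context,
    taken here as CU divided by the high-context score."""
    return (high - low) / high

# Illustrative repeat-task accuracies (%) at one SNR.
high_ctx, low_ctx = 80.0, 60.0
cu = context_use(high_ctx, low_ctx)              # 20.0 percentage points
pcu = proportion_context_use(high_ctx, low_ctx)  # 0.25
print(cu, pcu)  # prints: 20.0 0.25
```

A listener who relies more heavily on context (e.g., under hearing loss at adverse SNRs) would show a larger PCU for the same high-context score.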
Affiliation(s)
- Jiayuan Shen
- School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Zhejiang, China
- Jiayu Sun
- Department of Otolaryngology, Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai JiaoTong University School of Medicine, Shanghai, China
- Zhikai Zhang
- Department of Otolaryngology, Head and Neck Surgery, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China
- Baoxuan Sun
- Training Department, Widex Hearing Aid (Shanghai) Co., Ltd, Shanghai, China
- Haitao Li
- Department of Neurology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
- These authors contributed equally to this work and are co-corresponding authors
- Yuhe Liu
- Department of Otolaryngology, Head and Neck Surgery, Beijing Friendship Hospital, Capital Medical University, Beijing, China
- These authors contributed equally to this work and are co-corresponding authors
3
Hussain RO, Kumar P, Singh NK. Subcortical and Cortical Electrophysiological Measures in Children With Speech-in-Noise Deficits Associated With Auditory Processing Disorders. J Speech Lang Hear Res 2022;65:4454-4468. PMID: 36279585; DOI: 10.1044/2022_jslhr-22-00094.
Abstract
PURPOSE The aim of this study was to analyze the subcortical and cortical auditory evoked potentials for speech stimuli in children with speech-in-noise (SIN) deficits associated with auditory processing disorder (APD) without any reading or language deficits. METHOD The study included 20 children in the age range of 9-13 years. Ten children were recruited to the APD group; they had below-normal scores on the speech-perception-in-noise test and were diagnosed as having APD. The remaining 10 were typically developing (TD) children and were recruited to the TD group. Speech-evoked subcortical (brainstem) and cortical (auditory late latency) responses were recorded and compared across both groups. RESULTS The results showed a statistically significant reduction in the amplitudes of the subcortical potentials (both for stimulus in quiet and in noise) and the magnitudes of the spectral components (fundamental frequency and the second formant) in children with SIN deficits in the APD group compared to the TD group. In addition, the APD group displayed enhanced amplitudes of the cortical potentials compared to the TD group. CONCLUSION Children with SIN deficits associated with APD exhibited impaired coding/processing of the auditory information at the level of the brainstem and the auditory cortex. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21357735.
Affiliation(s)
- Prawin Kumar
- Department of Audiology, All India Institute of Speech and Hearing, Mysore
- Niraj Kumar Singh
- Department of Audiology, All India Institute of Speech and Hearing, Mysore
4
Bsharat-Maalouf D, Karawani H. Bilinguals' speech perception in noise: Perceptual and neural associations. PLoS One 2022;17:e0264282. PMID: 35196339; PMCID: PMC8865662; DOI: 10.1371/journal.pone.0264282.
Abstract
The current study characterized subcortical speech sound processing among monolinguals and bilinguals in quiet and challenging listening conditions and examined the relation between subcortical neural processing and perceptual performance. A total of 59 normal-hearing adults, ages 19–35 years, participated in the study: 29 native Hebrew-speaking monolinguals and 30 Arabic-Hebrew-speaking bilinguals. Auditory brainstem responses to speech sounds were collected in a quiet condition and with background noise. The perception of words and sentences in quiet and background noise conditions was also examined to assess perceptual performance and to evaluate the perceptual-physiological relationship. Perceptual performance was tested among bilinguals in both languages (first language (L1-Arabic) and second language (L2-Hebrew)). The outcomes were similar between monolingual and bilingual groups in quiet. Noise, as expected, resulted in deterioration in perceptual and neural responses, which was reflected in lower accuracy in perceptual tasks compared to quiet, and in more prolonged latencies and diminished neural responses. However, a mixed picture was observed among bilinguals in perceptual and physiological outcomes in noise. In the perceptual measures, bilinguals were significantly less accurate than their monolingual counterparts. However, in neural responses, bilinguals demonstrated earlier peak latencies compared to monolinguals. Our results also showed that perceptual performance in noise was related to subcortical resilience to the disruption caused by background noise. Specifically, in noise, increased brainstem resistance (i.e., fewer changes in the fundamental frequency (F0) representations or fewer shifts in the neural timing) was related to better speech perception among bilinguals. Better perception in L1 in noise was correlated with fewer changes in F0 representations, and more accurate perception in L2 was related to minor shifts in auditory neural timing. 
This study delves into the importance of using neural brainstem responses to speech sounds to differentiate individuals with different language histories and to explain inter-subject variability in bilinguals’ perceptual abilities in daily life situations.
Affiliation(s)
- Dana Bsharat-Maalouf
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Hanin Karawani
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
5
Early auditory responses to speech sounds in Parkinson's disease: preliminary data. Sci Rep 2022;12:1019. PMID: 35046514; PMCID: PMC8770631; DOI: 10.1038/s41598-022-05128-8.
Abstract
Parkinson's disease (PD), as a manifestation of basal ganglia dysfunction, is associated with a number of speech deficits, including reduced voice modulation and vocal output. Interestingly, previous work has shown that participants with PD show an increased feedback-driven motor response to unexpected fundamental frequency perturbations during speech production, and a heightened ability to detect differences in vocal pitch relative to control participants. Here, we explored one possible contributor to these enhanced responses. We recorded the frequency-following auditory brainstem response (FFR) to repetitions of the speech syllable [da] in PD and control participants. Participants with PD displayed a larger-amplitude FFR at the fundamental frequency of the speech stimuli relative to the control group. These preliminary results suggest that basal ganglia dysfunction in PD affects early stages of auditory processing and may reflect one component of a broader sensorimotor processing impairment associated with the disease.
6
Sammeth CA, Greene NT, Brown AD, Tollin DJ. Normative Study of the Binaural Interaction Component of the Human Auditory Brainstem Response as a Function of Interaural Time Differences. Ear Hear 2021;42:629-643. PMID: 33141776; PMCID: PMC8085190; DOI: 10.1097/aud.0000000000000964.
Abstract
OBJECTIVES The binaural interaction component (BIC) of the auditory brainstem response (ABR) is obtained by subtracting the sum of the monaural right and left ear ABRs from the binaurally evoked ABR. The result is a small but prominent negative peak (herein called "DN1"), indicating a smaller binaural than summed ABR, which occurs around the latency of wave V or its roll-off slope. The BIC has been proposed to have diagnostic value as a biomarker of binaural processing abilities; however, there have been conflicting reports regarding the reliability of BIC measures in human subjects. The objectives of the current study were to: (1) examine prevalence of BIC across a large group of normal-hearing young adults; (2) determine effects of interaural time differences (ITDs) on BIC; and (3) examine any relationship between BIC and behavioral ITD discrimination acuity. DESIGN Subjects were 40 normal-hearing adults (20 males and 20 females), aged 21 to 48 years, with no history of otologic or neurologic disorders. Midline ABRs were recorded from electrodes at high forehead (Fz) referenced to the nape of the neck (near the seventh cervical vertebra), with Fpz (low forehead) as the ground. ABRs were also recorded with a conventional earlobe reference for comparison to midline results. Stimuli were 90 dB peSPL biphasic clicks. For BIC measurements, stimuli were presented in a block as interleaved right monaural, left monaural, and binaural stimuli with 2000+ presentations per condition. Four measurements were averaged for a total of 8000+ stimuli per analyzed waveform. BIC was measured for ITD = 0 (simultaneous bilateral) and for ITDs of ±500 and ±750 µs. Subjects separately performed a lateralization task, using the same stimuli, to determine ITD discrimination thresholds. 
RESULTS An identifiable BIC DN1 was obtained in 39 of 40 subjects at ITD = 0 µs in at least one of two measurement sessions, but in fewer subjects within a single session or as ITD increased. The BIC was most often seen when a subject was relaxed or sleeping, and less often when they fidgeted or reported neck tension, suggesting myogenic activity as a possible factor disrupting BIC measurements. Mean BIC latencies systematically increased with increasing ITD, and mean BIC amplitudes tended to decrease. However, across subjects, there was no significant relationship between the amplitude or latency of the BIC and behavioral ITD thresholds. CONCLUSIONS Consistent with previous studies, measurement of the BIC was time consuming, and a BIC was sometimes difficult to obtain in awake normal-hearing subjects. The BIC will thus remain of limited clinical utility unless stimulus parameters and measurement techniques can be identified that produce a more robust response. Nonetheless, the modulation of BIC characteristics by ITD supports the concept that the ABR BIC indexes aspects of binaural brainstem processing and may prove useful in selected research applications, e.g., in the examination of populations expected to have aberrant binaural signal processing.
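The BIC derivation stated in the objectives is a pointwise waveform subtraction: the binaurally evoked ABR minus the sum of the two monaural ABRs, with DN1 taken as the most negative deflection in a window around wave V. A minimal sketch on toy waveforms (amplitudes, sample counts, and the search window are illustrative, not the study's recordings):

```python
def binaural_interaction_component(binaural, left, right):
    """BIC(t) = binaural ABR - (left-ear ABR + right-ear ABR), pointwise."""
    return [b - (l + r) for b, l, r in zip(binaural, left, right)]

def dn1(bic, window):
    """DN1: most negative BIC sample within `window` = (start, end) indices."""
    start, end = window
    seg = bic[start:end]
    amp = min(seg)
    return amp, start + seg.index(amp)

# Toy waveforms: the summed monaural response exceeds the binaural one around
# sample 5, producing the negative DN1 peak.
left     = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0, 0.7, 0.4, 0.2, 0.1]
right    = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0, 0.7, 0.4, 0.2, 0.1]
binaural = [0.0, 0.2, 0.5, 1.0, 1.5, 1.6, 1.2, 0.7, 0.4, 0.2]

bic = binaural_interaction_component(binaural, left, right)
amp, idx = dn1(bic, (3, 8))
print(round(amp, 3), idx)  # prints: -0.4 5
```

The study's finding that the BIC is "smaller binaural than summed" corresponds to this difference trace going negative near wave V.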
Affiliation(s)
- Carol A. Sammeth
- Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, Colorado, USA
- Nathaniel T. Greene
- Department of Otolaryngology, University of Colorado School of Medicine, Aurora, Colorado, USA
- Andrew D. Brown
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, USA
- Daniel J. Tollin
- Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, Colorado, USA
- Department of Otolaryngology, University of Colorado School of Medicine, Aurora, Colorado, USA
7
Heidari A, Moossavi A, Yadegari F, Bakhshi E, Ahadi M. Effect of Vowel Auditory Training on the Speech-in-Noise Perception among Older Adults with Normal Hearing. Iran J Otorhinolaryngol 2020;32:229-236. PMID: 32850511; PMCID: PMC7423087; DOI: 10.22038/ijorl.2019.33433.2110.
Abstract
Introduction: Aging reduces the ability to understand speech in noise. Hearing rehabilitation is one way to help older people communicate effectively. This study aimed to investigate the effect of vowel auditory training on the improvement of speech-in-noise (SIN) perception among elderly listeners. Materials and Methods: This study was conducted on 36 elderly listeners (17 males and 15 females) with a mean ± SD age of 67.6 ± 6.33 years. They had normal peripheral auditory ability but had difficulties in SIN perception. The participants were randomly divided into intervention and control groups. The intervention group underwent vowel auditory training; the control group received no training. Results: After vowel auditory training, the intervention group showed significant improvements relative to the control group on the SIN test at signal-to-noise ratios of 0 and -10 dB and on the Iranian version of the Speech, Spatial, and Qualities of Hearing Scale (P<0.001). On the Speech Auditory Brainstem Response test, the F0 magnitude was higher in the intervention group (8.42±2.26) than in the control group (6.68±1.87) (P<0.011). Conclusion: Vowel auditory training improved SIN perception, probably through better F0 encoding. This enhancement made speech easier to perceive and to separate from background noise, which in turn improved the listeners' ability to follow the speech of a specific talker and track a discussion.
Affiliation(s)
- Atta Heidari
- Department of Audiology, Faculty of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
- Abdollah Moossavi
- Department of Otolaryngology and Head and Neck Surgery, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Fariba Yadegari
- Department of Speech Therapy, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Enayatollah Bakhshi
- Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Mohsen Ahadi
- Department of Audiology, Rehabilitation Research Center, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
8
Musical Experience Offsets Age-Related Decline in Understanding Speech-in-Noise: Type of Training Does Not Matter, Working Memory Is the Key. Ear Hear 2020;42:258-270. PMID: 32826504; DOI: 10.1097/aud.0000000000000921.
Abstract
OBJECTIVES Speech comprehension under "cocktail party" scenarios deteriorates with age even in the absence of measurable hearing loss. Musical training has been suggested to counteract the age-related decline in speech-in-noise (SIN) perception, yet which aspect of musical plasticity contributes to this compensation remains unclear. This study investigated the effects of musical experience and aging on SIN perception. We hypothesized that auditory working memory plays a key mediating role in how musical training ameliorates deficient SIN perception in older adults. DESIGN Forty-eight older musicians, 29 older nonmusicians, 48 young musicians, and 24 young nonmusicians, all with (near) normal peripheral hearing, were recruited. The SIN task required recognizing nonsense sentences that were either perceptually colocated with or separated from a noise masker (energetic masking) or a two-talker speech masker (informational masking). Auditory working memory was measured by auditory digit span. Path analysis was used to examine the direct and indirect effects of musical expertise and age on SIN perception. RESULTS Older musicians outperformed older nonmusicians in auditory working memory and in all SIN conditions (noise separation, noise colocation, speech separation, speech colocation), but these musician advantages were absent in young adults. Path analysis showed that age and musical training had opposite effects on auditory working memory, which played a significant mediating role in SIN perception. In addition, the type of musical training did not differentiate SIN perception regardless of age. CONCLUSIONS These results provide evidence that musical training offsets the age-related speech perception deficit under adverse listening conditions by preserving auditory working memory.
Our findings highlight auditory working memory in supporting speech perception amid competing noise in older adults, and underline musical training as a means of "cognitive reserve" against declines in speech comprehension and cognition in aging populations.
9
Sattari K, Rahbar N, Ahadi M, Haghani H. The effects of a temporal processing-based auditory training program on the auditory skills of elderly users of hearing aids: a study protocol for a randomized clinical trial. F1000Res 2020;9:425. PMID: 32595959; PMCID: PMC7308962; DOI: 10.12688/f1000research.22757.2.
Abstract
Background: One of the most important effects of age-related declines in neural processing speed is impaired temporal resolution, which leads to difficulty hearing in noisy environments. Since the central auditory system is highly plastic, a temporal processing-based auditory training program could help the elderly improve their listening skills and speech understanding in noisy environments. Methods: In the first phase of this research, an auditory training solution grounded in the theoretical framework of temporal processing was developed as a software program. In the second phase, described in the present protocol, the effects of the program on the listening skills of elderly users of hearing aids (age: 60-75 years) will be studied in control and intervention groups. In the intervention group, the auditory training program will be implemented for three months (36 sessions), and the results of central auditory tests (GIN, DPT, QuickSIN) and the electrophysiological speech-ABR test will be compared between the groups before, immediately after, and one month after the intervention. Discussion: Since temporal processing is not sufficiently addressed in existing auditory training programs for the elderly with hearing impairment, implementing a temporal processing-based program may reduce hearing problems in noisy environments among elderly users of hearing aids. Trial registration: This study was registered as a clinical trial in the Iranian Registry of Clinical Trials (IRCT20190921044838N1) on December 25, 2019.
Affiliation(s)
- Karim Sattari
- Department of Audiology, Rehabilitation Research Center, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
- Nariman Rahbar
- Department of Audiology, Rehabilitation Research Center, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
- Mohsen Ahadi
- Department of Audiology, Rehabilitation Research Center, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran
- Hamid Haghani
- Department of Biostatistics, School of Management and Information Technology, Iran University of Medical Sciences, Tehran, Iran
10
López-Caballero F, Martin-Trias P, Ribas-Prats T, Gorina-Careta N, Bartrés-Faz D, Escera C. Effects of cTBS on the Frequency-Following Response and Other Auditory Evoked Potentials. Front Hum Neurosci 2020;14:250. PMID: 32733220; PMCID: PMC7360924; DOI: 10.3389/fnhum.2020.00250.
Abstract
The frequency-following response (FFR) is an auditory evoked potential (AEP) that follows the periodic characteristics of a sound. Despite being a widely studied biosignal in auditory neuroscience, the neural underpinnings of the FFR are still unclear. Traditionally, the FFR was associated with subcortical activity, but recent evidence suggests cortical contributions that may depend on the stimulus frequency. We combined electroencephalography (EEG) with an inhibitory transcranial magnetic stimulation protocol, continuous theta burst stimulation (cTBS), to disentangle the cortical contribution to the FFR elicited by stimuli of high and low frequency. We recorded FFRs to the syllable /ba/ at two fundamental frequencies (low: 113 Hz; high: 317 Hz) in healthy participants. The FFR, cortical potentials, and auditory brainstem response (ABR) were recorded before and after real and sham cTBS over the right primary auditory cortex. Results showed that cTBS did not produce a significant change in the FFR at either frequency. No effect was observed on the ABR or the cortical potentials, despite the known contributions of the auditory cortex to the latter. Possible reasons for the negative results include compensatory mechanisms from non-targeted areas, intraindividual variability in cTBS effectiveness, and the particular location of our target area, the primary auditory cortex.
Affiliation(s)
- Fran López-Caballero
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Pablo Martin-Trias
- Medical Psychology Unit, Department of Medicine, Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Teresa Ribas-Prats
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain; Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Spain
- Natàlia Gorina-Careta
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain; Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Spain
- David Bartrés-Faz
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Medical Psychology Unit, Department of Medicine, Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain; Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
- Carles Escera
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain; Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Spain
11
Fu D, Weber C, Yang G, Kerzel M, Nan W, Barros P, Wu H, Liu X, Wermter S. What Can Computational Models Learn From Human Selective Attention? A Review From an Audiovisual Unimodal and Crossmodal Perspective. Front Integr Neurosci 2020;14:10. PMID: 32174816; PMCID: PMC7056875; DOI: 10.3389/fnint.2020.00010.
Abstract
Selective attention plays an essential role in information acquisition and utilization from the environment. In the past 50 years, research on selective attention has been a central topic in cognitive science. Compared with unimodal studies, crossmodal studies are more complex but necessary to solve real-world challenges in both human experiments and computational modeling. Although an increasing number of findings on crossmodal selective attention have shed light on humans' behavioral patterns and neural underpinnings, a much better understanding is still necessary to yield the same benefit for intelligent computational agents. This article reviews studies of selective attention in unimodal visual and auditory and crossmodal audiovisual setups from the multidisciplinary perspectives of psychology and cognitive neuroscience, and evaluates different ways to simulate analogous mechanisms in computational models and robotics. We discuss the gaps between these fields in this interdisciplinary review and provide insights about how to use psychological findings and theories in artificial intelligence from different perspectives.
Affiliation(s)
- Di Fu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Department of Informatics, University of Hamburg, Hamburg, Germany
| | - Cornelius Weber
- Department of Informatics, University of Hamburg, Hamburg, Germany
| | - Guochun Yang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Matthias Kerzel
- Department of Informatics, University of Hamburg, Hamburg, Germany
| | - Weizhi Nan
- Department of Psychology, Center for Brain and Cognitive Sciences, School of Education, Guangzhou University, Guangzhou, China
| | - Pablo Barros
- Department of Informatics, University of Hamburg, Hamburg, Germany
| | - Haiyan Wu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Xun Liu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Stefan Wermter
- Department of Informatics, University of Hamburg, Hamburg, Germany
| |
12
Luo L, Xu N, Wang Q, Li L. Disparity in interaural time difference improves the accuracy of neural representations of individual concurrent narrowband sounds in rat inferior colliculus and auditory cortex. J Neurophysiol 2020; 123:695-706. PMID: 31891521; DOI: 10.1152/jn.00284.2019.
Abstract
The central mechanisms underlying binaural unmasking for spectrally overlapping concurrent sounds, which are unresolved in the peripheral auditory system, remain largely unknown. In this study, frequency-following responses (FFRs) to two binaurally presented independent narrowband noises (NBNs) with overlapping spectra were recorded simultaneously in the inferior colliculus (IC) and auditory cortex (AC) in anesthetized rats. The results showed that for both IC FFRs and AC FFRs, introducing an interaural time difference (ITD) disparity between the two concurrent NBNs enhanced the representation fidelity, reflected by the increased coherence between the responses evoked by double-NBN stimulation and the responses evoked by single NBNs. The ITD disparity effect varied across frequency bands, being more marked for higher frequency bands in the IC and lower frequency bands in the AC. Moreover, the coherence between IC responses and AC responses was also enhanced by the ITD disparity, and the enhancement was most prominent for low-frequency bands and for the IC and AC on the same side. These results suggest a critical role of the ITD cue in the neural segregation of spectrotemporally overlapping sounds.
NEW & NOTEWORTHY: When two spectrally overlapping narrowband noises are presented at the same time with the same sound-pressure level, they mask each other. Introducing a disparity in interaural time difference between the two noises improves the accuracy of the neural representation of the individual sounds in both the inferior colliculus and the auditory cortex. The lower-frequency signal transformation from the inferior colliculus to the auditory cortex on the same side is also enhanced, showing the effect of binaural unmasking.
Affiliation(s)
- Lu Luo: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Na Xu: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Qian Wang: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Liang Li: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; Beijing Institute for Brain Disorders, Beijing, China
13
Du Y, Shen Y, Wu X, Chen J. The effect of speech material on the band importance function for Mandarin Chinese. J Acoust Soc Am 2019; 146:445. PMID: 31370645; PMCID: PMC7273514; DOI: 10.1121/1.5116691.
Abstract
Speech material influences the relative contributions of different frequency regions to intelligibility for English. The current study investigated whether a similar effect of speech material is present for Mandarin Chinese. Speech recognition was measured using three speech materials in Mandarin: disyllabic words, nonsense sentences, and meaningful sentences. These materials differed from one another in the amount of contextual information and in word frequency. The band importance function (BIF), as defined under the Speech Intelligibility Index (SII) framework, was used to quantify the contributions across frequency regions. The BIFs for the three speech materials were estimated from 16 adults who were native speakers of Mandarin. A Bayesian adaptive procedure was used to efficiently estimate the octave-frequency BIFs for the three materials for each listener. As the amount of contextual information increased, low-frequency bands (e.g., 250 and 500 Hz) became more important for speech recognition, consistent with English. The BIF was flatter for Mandarin than for comparable English speech materials. Introducing the language- and material-specific BIFs into the SII model led to improved predictions of Mandarin speech-recognition performance. These results suggest the necessity of developing material-specific BIFs for Mandarin.
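As an illustrative aside (not code from the cited study), the SII framework the abstract describes combines per-band audibility with a band importance function as a weighted sum. A minimal sketch with hypothetical band weights, omitting the level-distortion and masking corrections of the full ANSI S3.5 procedure:

```python
import numpy as np

def sii(band_audibility, band_importance):
    """Simplified Speech Intelligibility Index: an importance-weighted sum
    of per-band audibility values (each clipped to the range 0..1).

    Only the role of the band importance function (BIF) is illustrated;
    the full ANSI S3.5 procedure includes further corrections.
    """
    a = np.clip(np.asarray(band_audibility, dtype=float), 0.0, 1.0)
    w = np.asarray(band_importance, dtype=float)
    w = w / w.sum()  # normalize importance weights to sum to 1
    return float(np.sum(w * a))

# Hypothetical example weights: a flatter BIF (as reported for Mandarin)
# weights all bands evenly, so losing the low-frequency bands costs less
# than under a low-frequency-weighted BIF.
flat_bif = [1.0, 1.0, 1.0, 1.0]
low_weighted_bif = [2.0, 2.0, 1.0, 1.0]
audibility = [0.0, 0.0, 1.0, 1.0]  # low-frequency bands inaudible
```

For instance, `sii(audibility, flat_bif)` gives 0.5, while `sii(audibility, low_weighted_bif)` gives about 0.33, showing how the same audibility pattern predicts different intelligibility under different BIFs.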
Affiliation(s)
- Yufan Du: Department of Machine Intelligence, Peking University, Beijing, China
- Yi Shen: Department of Speech and Hearing Sciences, Indiana University Bloomington, 200 South Jordan Avenue, Bloomington, Indiana 47405, USA
- Xihong Wu: Department of Machine Intelligence, Peking University, Beijing, China
- Jing Chen: Department of Machine Intelligence, Peking University, Beijing, China
14
Takeuti AA, Fávero ML, Zaia EH, Ganança FF. Auditory brainstem function in women with vestibular migraine: a controlled study. BMC Neurol 2019; 19:144. PMID: 31248379; PMCID: PMC6595618; DOI: 10.1186/s12883-019-1368-5.
Abstract
BACKGROUND: Vestibular migraine (VM) has been recognized as a diagnostic entity over the past three decades. It affects up to 1% of the general population and 7% of patients seen in dizziness clinics. It is still underdiagnosed; consequently, it is important to conduct clinical studies that address diagnostic indicators of VM. The aim of this study was to assess auditory brainstem function in women with vestibular migraine using electrophysiological testing, the contralateral acoustic reflex, and the loudness discomfort level.
METHODS: The study group consisted of 29 women with vestibular migraine in the interictal period, and the control group comprised 25 healthy women. Auditory brainstem response, frequency following response, binaural interaction component, and assessment of contralateral efferent suppression were performed. The threshold of loudness discomfort and the contralateral acoustic reflex were also investigated. The results were compared between the groups.
RESULTS: There was a statistically significant difference between the groups in the frequency following response and the loudness discomfort level.
CONCLUSIONS: The current study suggested that temporal auditory processing and loudness discomfort levels are altered in VM patients during the interictal period, indicating that these measures may be useful as diagnostic criteria.
Affiliation(s)
- Alice A. Takeuti: Departamento de Otorrinolaringologia e Cirurgia de Cabeça e Pescoço, Universidade Federal de São Paulo, São Paulo, Brazil
- Mariana L. Fávero: Divisão de Educação e Reabilitação dos Distúrbios da Comunicação (DERDIC), Pontíficia Universidade Catolica de São Paulo, São Paulo, Brazil
- Fernando F. Ganança: Departamento de Otorrinolaringologia e Cirurgia de Cabeça e Pescoço, Universidade Federal de São Paulo, São Paulo, Brazil
15
Xu N, Luo L, Wang Q, Li L. Binaural unmasking of the accuracy of envelope-signal representation in rat auditory cortex but not auditory midbrain. Hear Res 2019; 377:224-233. PMID: 30991272; DOI: 10.1016/j.heares.2019.04.003.
Abstract
Accurate neural representations of acoustic signals under noisy conditions are critical for animals' survival. Detecting a signal against background noise can be improved by binaural hearing, particularly when an interaural-time-difference (ITD) disparity is introduced between the signal and the noise, a phenomenon known as binaural unmasking. Previous studies have mainly focused on the binaural unmasking effect on response magnitudes, and it is not clear whether binaural unmasking affects the accuracy of central representations of target acoustic signals and the relative contributions of different central auditory structures to this accuracy. Frequency following responses (FFRs), which are sustained phase-locked neural activities, can be used for measuring the accuracy of the representation of signals. Using intracranial recordings of local field potentials, this study aimed to assess whether the binaural unmasking effects include an improvement of the accuracy of neural representations of sound-envelope signals in the rat inferior colliculus (IC) and/or auditory cortex (AC). The results showed that (1) when a narrow-band noise was presented binaurally, the stimulus-response (S-R) coherence of the FFRs to the envelope (FFRenvelope) of the narrow-band noise recorded in the IC was higher than that recorded in the AC; (2) presenting a broad-band masking noise caused a larger reduction of the S-R coherence for FFRenvelope in the IC than in the AC; and (3) introducing an ITD disparity between the narrow-band signal noise and the broad-band masking noise did not affect the IC S-R coherence, but enhanced both the AC S-R coherence and the coherence between the IC FFRenvelope and AC FFRenvelope. Thus, although the accuracy of representing envelope signals in the AC is lower than that in the IC, it can be binaurally unmasked, indicating a binaural-unmasking mechanism that is formed during signal transmission from the IC to the AC.
Affiliation(s)
- Na Xu: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China
- Lu Luo: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China
- Qian Wang: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China; Beijing Key Laboratory of Epilepsy, Epilepsy Center, Department of Functional Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing, 100093, China
- Liang Li: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China; Beijing Institute for Brain Disorders, Beijing, 100096, China
16
Graydon K, Van Dun B, Dowell R, Rance G. The frequency-following response as an assessment of spatial processing. Int J Audiol 2019; 58:497-503. DOI: 10.1080/14992027.2019.1597285.
Affiliation(s)
- Kelley Graydon: The HEARing Cooperative Research Centre, Carlton, Australia; Department of Audiology and Speech Pathology, The University of Melbourne, Carlton, Victoria, Australia
- Bram Van Dun: The HEARing Cooperative Research Centre, Carlton, Australia; National Acoustic Laboratories, Macquarie Park, New South Wales, Australia
- Richard Dowell: The HEARing Cooperative Research Centre, Carlton, Australia
- Gary Rance: The HEARing Cooperative Research Centre, Carlton, Australia
17
Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions. Atten Percept Psychophys 2019; 80:871-883. PMID: 29473143; DOI: 10.3758/s13414-018-1489-8.
Abstract
Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, electrodermal (skin conductance) responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase in listening effort when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
18
Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. PMID: 30796895; DOI: 10.1016/j.brainres.2019.02.025.
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFR) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST), presented diotically in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Listeners also performed a speeded double-vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with F0 differences between vowels, and this perceptual F0 benefit was larger for clean compared to noise-degraded (+5 dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single compared to double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with a non-linear constructive interference at 0 ST and suppression effects at 4 ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predicts the success of identifying concurrent speech.
Affiliation(s)
- Anusha Yellamsetty: School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Department of Communication Sciences & Disorders, University of South Florida, USA
- Gavin M Bidelman: School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
19
Hao W, Wang Q, Li L, Qiao Y, Gao Z, Ni D, Shang Y. Effects of Phase-Locking Deficits on Speech Recognition in Older Adults With Presbycusis. Front Aging Neurosci 2018; 10:397. PMID: 30574084; PMCID: PMC6291518; DOI: 10.3389/fnagi.2018.00397.
Abstract
Objective: People with presbycusis (PC) often report difficulties in speech recognition, especially under noisy listening conditions. Investigating the PC-related changes in central representations of envelope signals and temporal fine structure (TFS) signals of speech sounds is critical for understanding the mechanism underlying the PC-related deficit in speech recognition. Frequency-following responses (FFRs) to speech stimulation can be used to examine the subcortical encoding of both envelope and TFS speech signals. This study compared FFRs to speech signals between listeners with PC and those with clinically normal hearing (NH) under either quiet or noise-masking conditions.
Methods: FFRs to a 170-ms speech syllable /da/ were recorded under either a quiet or noise-masking (with a signal-to-noise ratio (SNR) of 8 dB) condition in 14 older adults with PC and 13 age-matched adults with NH. The envelope (FFRENV) and TFS (FFRTFS) components of FFRs were analyzed separately by adding and subtracting the alternating-polarity responses, respectively. Speech recognition in noise was evaluated in each participant.
Results: In the quiet condition, compared with the NH group, the PC group exhibited smaller F0 and H3 amplitudes and decreased stimulus-response (S-R) correlation for FFRENV but not for FFRTFS. Both the H2 and H3 amplitudes and the S-R correlation of FFRENV significantly decreased in the noise condition compared with the quiet condition in the NH group but not in the PC group. Moreover, the degree of hearing loss was correlated with noise-induced changes in FFRTFS morphology. Furthermore, the speech-in-noise (SIN) threshold was negatively correlated with the noise-induced change in H2 (for FFRENV) and the S-R correlation for FFRENV in the quiet condition.
Conclusion: Audibility affects the subcortical encoding of both envelope and TFS in PC patients. The impaired ability to adjust the balance between the envelope and TFS in the noise condition may be part of the mechanism underlying PC-related deficits in speech recognition in noise. FFRs can predict SIN perception performance.
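As an illustrative aside (not code from the cited study), the add/subtract decomposition described in the Methods, which splits responses to opposite-polarity stimuli into envelope-following and TFS-following components, can be sketched as:

```python
import numpy as np

def decompose_ffr(resp_pos, resp_neg):
    """Split FFRs recorded to opposite-polarity stimuli into envelope and
    TFS components.

    Inverting stimulus polarity inverts the temporal fine structure (TFS)
    in the response but leaves the envelope unchanged, so averaging the
    two responses cancels the TFS (leaving FFR_ENV), while halving their
    difference cancels the envelope (leaving FFR_TFS).
    """
    p = np.asarray(resp_pos, dtype=float)
    n = np.asarray(resp_neg, dtype=float)
    ffr_env = (p + n) / 2.0  # envelope-following component
    ffr_tfs = (p - n) / 2.0  # TFS-following component
    return ffr_env, ffr_tfs
```

In an idealized case where the two recordings are `env + tfs` and `env - tfs`, the function recovers `env` and `tfs` exactly; real recordings add noise that the averaging only partially suppresses.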
Affiliation(s)
- Wenyang Hao: Department of Otorhinolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Qian Wang: Epilepsy Center, Department of Clinical Psychology, Sanbo Brain Hospital, Capital Medical University, Beijing, China
- Liang Li: School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Yufei Qiao: Department of Otorhinolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zhiqiang Gao: Department of Otorhinolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Daofeng Ni: Department of Otorhinolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yingying Shang: Department of Otorhinolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
20
Neural representation of interaural correlation in human auditory brainstem: Comparisons between temporal-fine structure and envelope. Hear Res 2018; 365:165-173. PMID: 29853322; DOI: 10.1016/j.heares.2018.05.015.
Abstract
Central processing of interaural correlation (IAC), which depends on the precise representation of acoustic signals from the two ears, is essential for both localization and recognition of auditory objects. A complex sound wave is initially filtered by the peripheral auditory system into multiple narrowband waves, which are further decomposed into two functionally distinctive components: the quickly varying temporal-fine structure (TFS) and the slowly varying envelope. In rats, a narrowband noise can evoke auditory-midbrain frequency-following responses (FFRs) that contain both the TFS component (FFRTFS) and the envelope component (FFREnv), which represent the TFS and envelope of the narrowband noise, respectively. These two components differ in sensitivity to the interaural time disparity. In human listeners, the present study investigated whether the FFRTFS and FFREnv components of brainstem FFRs to a narrowband noise differ in sensitivity to IAC and whether there are potential brainstem mechanisms underlying the integration of the two components. The results showed that although both the amplitude of FFRTFS and that of FFREnv were significantly affected by shifts of IAC between 1 and 0, the stimulus-to-response correlation for FFRTFS, but not that for FFREnv, was sensitive to the IAC shifts. Moreover, in addition to the correlation between the binaurally evoked FFRTFS and FFREnv, the correlation between the IAC-shift-induced change of FFRTFS and that of FFREnv was significant. Thus, TFS information is more precisely represented in the human auditory brainstem than envelope information, and the correlation between FFRTFS and FFREnv for the same narrowband noise suggests a brainstem binding mechanism underlying the perceptual integration of the TFS and envelope signals.
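As an illustrative aside (not code from the cited studies), the stimulus-to-response correlation metric that this and neighboring abstracts rely on is commonly computed as the maximum normalized correlation between the stimulus waveform and the response, searched over a range of plausible neural lags; the lag range `max_lag` here is a hypothetical parameter:

```python
import numpy as np

def sr_correlation(stimulus, response, max_lag):
    """Maximum Pearson correlation between a stimulus waveform and a
    same-length response, searched over lags of 0..max_lag samples to
    allow for neural transmission delay."""
    s = np.asarray(stimulus, dtype=float)
    r = np.asarray(response, dtype=float)
    n = len(s) - max_lag  # fixed-length comparison window
    ref = s[:n]
    best = -1.0
    for lag in range(max_lag + 1):
        seg = r[lag:lag + n]  # response shifted back by `lag` samples
        best = max(best, float(np.corrcoef(ref, seg)[0, 1]))
    return best
```

A response that is simply a delayed copy of the stimulus scores near 1.0 once the search window covers the delay, which is why this metric indexes representation fidelity rather than raw response amplitude.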
21
Musical training sharpens and bonds ears and tongue to hear speech better. Proc Natl Acad Sci U S A 2017; 114:13579-13584. PMID: 29203648; DOI: 10.1073/pnas.1712223114.
Abstract
The idea that musical training improves speech perception in challenging listening environments is appealing and of clinical importance, yet the mechanisms of any such musician advantage are not well specified. Here, using functional magnetic resonance imaging (fMRI), we found that musicians outperformed nonmusicians in identifying syllables at varying signal-to-noise ratios (SNRs), which was associated with stronger activation of the left inferior frontal and right auditory regions in musicians compared with nonmusicians. Moreover, musicians showed greater specificity of phoneme representations in bilateral auditory and speech motor regions (e.g., premotor cortex) at higher SNRs and in the left speech motor regions at lower SNRs, as determined by multivoxel pattern analysis. Musical training also enhanced the intrahemispheric and interhemispheric functional connectivity between auditory and speech motor regions. Our findings suggest that improved speech-in-noise perception in musicians relies on stronger recruitment of, finer phonological representations in, and stronger functional connectivity between auditory and frontal speech motor cortices in both hemispheres, regions involved in bottom-up spectrotemporal analyses and top-down articulatory prediction and sensorimotor integration, respectively.
22
Differences between auditory frequency-following responses and onset responses: Intracranial evidence from rat inferior colliculus. Hear Res 2017; 357:25-32. PMID: 29156225; DOI: 10.1016/j.heares.2017.10.014.
Abstract
A periodic sound, such as a pure tone, evokes both transient onset field-potential responses and sustained frequency-following responses (FFRs) in the auditory midbrain, the inferior colliculus (IC). It is not clear whether the two types of responses are based on the same or different neural substrates. Although it has been assumed that FFRs are based on phase locking to the periodic sound, evidence showing a direct relationship between the FFR amplitude and phase-locking strength is still lacking. Using intracranial recordings from the rat central nucleus of the inferior colliculus (ICC), this study examined whether FFRs and onset responses differ in sensitivity to pure-tone frequency and/or response-stimulus correlation when a tone stimulus is presented either monaurally or binaurally. In particular, it examined whether the FFR amplitude is correlated with the strength of phase locking. The results showed that as tone-stimulus frequency increased from 1 to 2 kHz, the FFR amplitude decreased but the onset-response amplitude increased. Moreover, the FFR amplitude, but not the onset-response amplitude, was significantly correlated with the phase coherence between tone-evoked potentials and the tone stimulus. Finally, the FFR amplitude was negatively correlated with the onset-response amplitude. These results indicate that periodic-sound-evoked FFRs are based on phase-locking activities of sustained-response neurons, whereas onset responses are based on transient activities of onset-response neurons, suggesting that FFRs and onset responses are associated with different functions.
23
Coffey EBJ, Chepesiuk AMP, Herholz SC, Baillet S, Zatorre RJ. Neural Correlates of Early Sound Encoding and their Relationship to Speech-in-Noise Perception. Front Neurosci 2017; 11:479. PMID: 28890684; PMCID: PMC5575455; DOI: 10.3389/fnins.2017.00479.
Abstract
Speech-in-noise (SIN) perception is a complex cognitive skill that affects social, vocational, and educational activities. Poor SIN ability particularly affects young and elderly populations, yet varies considerably even among healthy young adults with normal hearing. Although SIN skills are known to be influenced by top-down processes that can selectively enhance lower-level sound representations, the complementary role of feed-forward mechanisms and their relationship to musical training is poorly understood. Using a paradigm that minimizes the main top-down factors that have been implicated in SIN performance, such as working memory, we aimed to better understand how robust encoding of periodicity in the auditory system (as measured by the frequency-following response, FFR) contributes to SIN perception. Using magnetoencephalography, we found that the strength of encoding at the fundamental frequency in the brainstem, thalamus, and cortex is correlated with SIN accuracy. The amplitude of the slower cortical P2 wave was previously also shown to be related to SIN accuracy and FFR strength; we use MEG source localization to show that the P2 wave originates in a temporal region anterior to that of the cortical FFR. We also confirm that the observed enhancements were related to the extent and timing of musicianship. These results are consistent with the hypothesis that basic feed-forward sound encoding affects SIN perception by providing better information to later processing stages, and that modifying this process may be one mechanism through which musical training might enhance the auditory networks that subserve both musical and language functions.
Affiliation(s)
- Emily B J Coffey: Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, QC, Canada; Laboratory for Brain, Music and Sound Research, Montréal, QC, Canada; Centre for Research on Brain, Language and Music, Montréal, QC, Canada
- Alexander M P Chepesiuk: Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, QC, Canada
- Sibylle C Herholz: Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, QC, Canada; Laboratory for Brain, Music and Sound Research, Montréal, QC, Canada; Centre for Research on Brain, Language and Music, Montréal, QC, Canada; German Center for Neurodegenerative Diseases, Bonn, Germany
- Sylvain Baillet: Centre for Research on Brain, Language and Music, Montréal, QC, Canada; McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montréal, QC, Canada
- Robert J Zatorre: Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montréal, QC, Canada; Laboratory for Brain, Music and Sound Research, Montréal, QC, Canada; Centre for Research on Brain, Language and Music, Montréal, QC, Canada
24
Neural representations of concurrent sounds with overlapping spectra in rat inferior colliculus: Comparisons between temporal-fine structure and envelope. Hear Res 2017; 353:87-96. PMID: 28655419; DOI: 10.1016/j.heares.2017.06.005.
Abstract
Perceptual segregation of multiple sounds, which overlap in both time and spectra, into individual auditory streams is critical for hearing in natural environments. Some cues such as interaural time disparities (ITDs) play an important role in the segregation, especially when sounds are separated in space. In this study, we investigated the neural representation of two uncorrelated narrowband noises that shared the identical spectrum in the rat inferior colliculus (IC) using frequency-following response (FFR) recordings, when the ITD for each noise stimulus was manipulated. The results of this study showed that recorded FFRs exhibited two distinctive components: the fast-varying temporal fine structure (TFS) component (FFR_TFS) and the slow-varying envelope component (FFR_ENV). When a single narrowband noise was presented alone, the FFR_TFS, but not the FFR_ENV, was sensitive to ITDs. When two narrowband noises were presented simultaneously, the FFR_TFS took advantage of the ITD disparity that was associated with perceived spatial separation between the two concurrent sounds, and displayed a better linear synchronization to the sound with an ipsilateral-leading ITD. However, no effects of ITDs were found on the FFR_ENV. These results suggest that the FFR_TFS and FFR_ENV represent two distinct types of signal processing in the auditory brainstem and contribute differentially to sound segregation based on spatial cues: the FFR_TFS is more critical to spatial release from masking.
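The abstract does not state how the envelope and TFS components were extracted here, but a standard decomposition in the FFR literature averages responses to a stimulus and its polarity-inverted copy: addition cancels the polarity-following TFS activity, subtraction cancels the polarity-invariant envelope activity. A toy sketch of that generic technique:

```python
import numpy as np

def ffr_components(resp_pos, resp_neg):
    """Split an FFR into envelope- and TFS-dominated components from
    responses to original (pos) and inverted (neg) stimulus polarities.
    Adding cancels TFS (which flips with polarity); subtracting cancels
    the envelope (which does not)."""
    resp_pos = np.asarray(resp_pos, dtype=float)
    resp_neg = np.asarray(resp_neg, dtype=float)
    ffr_env = (resp_pos + resp_neg) / 2.0
    ffr_tfs = (resp_pos - resp_neg) / 2.0
    return ffr_env, ffr_tfs

# Toy check: a 100 Hz envelope response (polarity-invariant) plus a
# 500 Hz TFS response that flips sign with stimulus polarity.
fs = 8000
t = np.arange(fs) / fs
env, tfs = np.sin(2 * np.pi * 100 * t), np.sin(2 * np.pi * 500 * t)
e, f = ffr_components(env + tfs, env - tfs)
print(np.allclose(e, env), np.allclose(f, tfs))  # recovers both components
```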
25
Speech-in-noise perception in musicians: A review. Hear Res 2017; 352:49-69. [PMID: 28213134 DOI: 10.1016/j.heares.2017.02.006] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/26/2016] [Revised: 02/01/2017] [Accepted: 02/05/2017] [Indexed: 11/23/2022]
Abstract
The ability to understand speech in the presence of competing sound sources is an important neuroscience question in terms of how the nervous system solves this computational problem. It is also a critical clinical problem that disproportionately affects the elderly, children with language-related learning disorders, and those with hearing loss. Recent evidence that musicians have an advantage on this multifaceted skill has led to the suggestion that musical training might be used to improve or delay the decline of speech-in-noise (SIN) function. However, enhancements have not been universally reported, nor have the relative contributions of bottom-up versus top-down processes, and their relation to preexisting factors, been disentangled. This information would help establish whether there is a real effect of experience, what exactly its nature is, and how future training-based interventions might target the most relevant components of cognitive processing. These questions are complicated by important differences in study design and uneven coverage of neuroimaging modalities. In this review, we aim to systematize recent results from studies that have specifically looked at musician-related differences in SIN by their study design properties, to summarize the findings, and to identify knowledge gaps for future work.
26
27
Reichenbach CS, Braiman C, Schiff ND, Hudspeth AJ, Reichenbach T. The Auditory-Brainstem Response to Continuous, Non-repetitive Speech Is Modulated by the Speech Envelope and Reflects Speech Processing. Front Comput Neurosci 2016; 10:47. [PMID: 27303286 PMCID: PMC4880572 DOI: 10.3389/fncom.2016.00047] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2015] [Accepted: 04/29/2016] [Indexed: 11/13/2022] Open
Abstract
The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnostics of brainstem function.
Affiliation(s)
- Chagit S Reichenbach
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical College, New York, NY, USA; Howard Hughes Medical Institute and Laboratory of Sensory Neuroscience, The Rockefeller University, New York, NY, USA; Department of Neuroscience, Brain and Mind Research Institute, Weill Cornell Medical College, New York, NY, USA
- Chananel Braiman
- Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical College, New York, NY, USA
- Nicholas D Schiff
- Department of Neuroscience, Brain and Mind Research Institute, Weill Cornell Medical College, New York, NY, USA
- A J Hudspeth
- Howard Hughes Medical Institute and Laboratory of Sensory Neuroscience, The Rockefeller University, New York, NY, USA
- Tobias Reichenbach
- Department of Bioengineering, Imperial College London, South Kensington Campus, London, UK
28
Chen J, Jono T, Cui J, Yue X, Tang Y. The Acoustic Properties of Low Intensity Vocalizations Match Hearing Sensitivity in the Webbed-Toed Gecko, Gekko subpalmatus. PLoS One 2016; 11:e0146677. [PMID: 26752301 PMCID: PMC4709187 DOI: 10.1371/journal.pone.0146677] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2015] [Accepted: 12/21/2015] [Indexed: 11/26/2022] Open
Abstract
The design of acoustic signals and hearing sensitivity in socially communicating species would normally be expected to closely match in order to minimize signal degradation and attenuation during signal propagation. Nevertheless, other factors such as sensory biases as well as morphological and physiological constraints may affect strict correspondence between signal features and hearing sensitivity. Thus, studying the relationships between sender and receiver characteristics in species utilizing acoustic communication can provide information about how acoustic communication systems evolve. The genus Gekko includes species emitting high-amplitude vocalizations for long-range communication (loud callers) as well as species producing only low-amplitude vocalizations when in close contact with conspecifics (quiet callers), which have rarely been investigated. In order to investigate relationships between auditory physiology and the frequency characteristics of acoustic signals in a quiet caller, Gekko subpalmatus, we measured the subjects' vocal signal characteristics as well as auditory brainstem responses (ABRs) to assess auditory sensitivity. The results show that G. subpalmatus males emit low-amplitude calls when encountering females, ranging in dominant frequency from 2.47 to 4.17 kHz with an average of 3.35 kHz. The auditory range with highest sensitivity closely matches the dominant frequency of the vocalizations. This correspondence is consistent with the notion that quiet and loud calling species are under similar selection pressures for matching auditory sensitivity with spectral characteristics of vocalizations.
Affiliation(s)
- Jingfeng Chen
- Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- * E-mail: (JFC); (YZT)
- Teppei Jono
- Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Jianguo Cui
- Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Xizi Yue
- Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Yezhong Tang
- Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- * E-mail: (JFC); (YZT)
29
Wang Q, Li L. Auditory midbrain representation of a break in interaural correlation. J Neurophysiol 2015; 114:2258-64. [PMID: 26269559 DOI: 10.1152/jn.00645.2015] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2015] [Accepted: 08/10/2015] [Indexed: 11/22/2022] Open
Abstract
The auditory peripheral system filters broadband sounds into narrowband waves and decomposes narrowband waves into quickly varying temporal fine structures (TFSs) and slowly varying envelopes. When a noise is presented binaurally (with the interaural correlation being 1), human listeners can detect a transient break in interaural correlation (BIC), which does not alter monaural inputs substantially. The central correlates of BIC are unknown. This study examined whether phase-locking-based frequency-following responses (FFRs) of neuron populations in the rat auditory midbrain [inferior colliculus (IC)] to interaurally correlated steady-state narrowband noises are modulated by introduction of a BIC. The results showed that the noise-induced FFR exhibited both a TFS component (FFR_TFS) and an envelope component (FFR_Env), signaling the center frequency and bandwidth, respectively. Introduction of either a BIC or an interaurally correlated amplitude gap (which had the summated amplitude matched to the BIC) significantly reduced both FFR_TFS and FFR_Env. However, the BIC-induced FFR_TFS reduction and FFR_Env reduction were not correlated with the amplitude gap-induced FFR_TFS reduction and FFR_Env reduction, respectively. Thus, although introduction of a BIC does not affect monaural inputs, it causes a temporary reduction in sustained responses of IC neuron populations to the noise. This BIC-induced FFR reduction is not based on a simple linear summation of noise signals.
Affiliation(s)
- Qian Wang
- Department of Psychology and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China
- Liang Li
- Department of Psychology and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China; Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China; PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing, People's Republic of China; and Beijing Institute for Brain Disorders, Beijing, People's Republic of China
30
Park PKJ, Ryu H, Lee JH, Shin CW, Lee KB, Woo J, Kim JS, Kang BC, Liu SC, Delbruck T. Fast neuromorphic sound localization for binaural hearing aids. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2013:5275-8. [PMID: 24110926 DOI: 10.1109/embc.2013.6610739] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
We report on a neuromorphic sound-localization circuit that can enhance the perceptual sensation in a hearing-aid system. All elements are simple leaky integrate-and-fire neuron circuits with different parameters optimized to suppress the impact of synaptic circuit noise. The detection range and resolution of the proposed neuromorphic circuit are 500 μs and 5 μs, respectively. Our results show that the proposed technique can localize a sound pulse of extremely narrow duration (∼1 ms), resulting in a real-time response.
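The paper's circuit is analog hardware, but the building block it names, the leaky integrate-and-fire (LIF) neuron, is easy to sketch in software. The time constant, threshold, and input values below are arbitrary illustration choices, not the chip's parameters:

```python
import numpy as np

def lif_spikes(current, dt=1e-5, tau=1e-3, r=1.0, v_th=1.0):
    """Leaky integrate-and-fire neuron: integrate the input current with
    an exponential leak; emit a spike time and reset the membrane to 0
    whenever the potential crosses the threshold v_th."""
    v, spikes = 0.0, []
    for i, i_in in enumerate(current):
        v += dt / tau * (-v + r * i_in)  # Euler step of dv/dt = (-v + R*I)/tau
        if v >= v_th:
            spikes.append(i * dt)
            v = 0.0
    return spikes

# A strong step input drives periodic spiking; a weak one stays
# subthreshold (steady-state v = 0.5 < v_th), so no spikes occur.
strong = lif_spikes(np.full(2000, 2.0))  # 20 ms of suprathreshold drive
weak = lif_spikes(np.full(2000, 0.5))
print(len(strong) > 0, len(weak) == 0)
```

In an ITD detector, two such neurons driven by delayed left/right inputs feed a coincidence stage; the hardware details of that arrangement are in the paper itself.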
31
He W, Ding X, Zhang R, Chen J, Zhang D, Wu X. Electrically-evoked frequency-following response (EFFR) in the auditory brainstem of guinea pigs. PLoS One 2014; 9:e106719. [PMID: 25244253 PMCID: PMC4171095 DOI: 10.1371/journal.pone.0106719] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2014] [Accepted: 08/09/2014] [Indexed: 11/19/2022] Open
Abstract
It is still a difficult clinical issue to decide whether a patient is a suitable candidate for a cochlear implant and to plan postoperative rehabilitation, especially for some special cases, such as auditory neuropathy. A partial solution to these problems is to preoperatively evaluate the functional integrity of the auditory neural pathways. For evaluating the strength of phase-locking of auditory neurons, which was not reflected in previous methods using the electrically evoked auditory brainstem response (EABR), a new method for recording phase-locking-related auditory responses to electrical stimulation, called the electrically evoked frequency-following response (EFFR), was developed and evaluated using guinea pigs. The main objective was to assess the feasibility of the method by testing whether the recorded signals reflected auditory neural responses or artifacts. The results showed the following: 1) the recorded signals were evoked by neural responses rather than by artifacts; 2) responses evoked by periodic signals were significantly higher than those evoked by white noise; 3) the latency of the responses fell in the expected range; 4) the responses decreased significantly after death of the guinea pigs; and 5) the responses decreased significantly when the animal was replaced by an electrical resistance. All of these results suggest the method was valid. Recordings obtained using complex tones with a missing fundamental component and using pure tones with various frequencies were consistent with those obtained using acoustic stimulation in previous studies.
Affiliation(s)
- Wenxin He
- Speech and Hearing Research Center, and Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
- Xiuyong Ding
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Friendship Hospital, Capital Medical University, Beijing, People's Republic of China
- Ruxiang Zhang
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Friendship Hospital, Capital Medical University, Beijing, People's Republic of China
- Jing Chen
- Speech and Hearing Research Center, and Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
- * E-mail: (JC); (XW)
- Daoxing Zhang
- Department of Otorhinolaryngology Head and Neck Surgery, Beijing Friendship Hospital, Capital Medical University, Beijing, People's Republic of China
- Xihong Wu
- Speech and Hearing Research Center, and Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China
- * E-mail: (JC); (XW)
32
Zhang C, Lu L, Wu X, Li L. Attentional modulation of the early cortical representation of speech signals in informational or energetic masking. BRAIN AND LANGUAGE 2014; 135:85-95. [PMID: 24992572 DOI: 10.1016/j.bandl.2014.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/04/2013] [Revised: 06/04/2014] [Accepted: 06/05/2014] [Indexed: 06/03/2023]
Abstract
It is easier to recognize masked speech when the speech and its masker are perceived as spatially segregated. Using event-related potentials, this study examined how the early cortical representation of speech is affected by different masker types and perceptual locations, when the listener is either passively or actively listening to the target speech syllable. The results showed that the two-talker-speech masker induced a much larger masking effect on the N1/P2 complex than either the steady-state-noise masker or the amplitude-modulated speech-spectrum-noise masker did. Also, a switch from the passive- to active-listening condition enhanced the N1/P2 complex only when the masker was speech. Moreover, under the active-listening condition, perceived separation between target and masker enhanced the N1/P2 complex only when the masker was speech. Thus, when a masker is present, the effect of selective attention to the target-speech signal on the early cortical representation of the speech signal is masker-type dependent.
Affiliation(s)
- Changxin Zhang
- Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100871, China
- Lingxi Lu
- Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100871, China
- Xihong Wu
- Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100871, China
- Liang Li
- Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100871, China
33
Gao Y, Cao S, Qu T, Wu X, Li H, Zhang J, Li L. Voice-associated static face image releases speech from informational masking. Psych J 2014; 3:113-20. [PMID: 26271763 DOI: 10.1002/pchj.45] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2013] [Accepted: 11/07/2013] [Indexed: 11/08/2022]
Abstract
In noisy environments with multiple people talking, such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally pre-presented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face-image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally pre-presenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face image became associated, through learning, with the voice reciting the target speech, temporally pre-presenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated with that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream.
Affiliation(s)
- Yayue Gao
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Shuyang Cao
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China; State Administration of Press, Publication, Radio, Film and Television of the People's Republic of China, Beijing
- Tianshu Qu
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Xihong Wu
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
- Haifeng Li
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Jinsheng Zhang
- Department of Otolaryngology-Head and Neck Surgery, Wayne State University School of Medicine, Detroit, Michigan, USA
- Liang Li
- Department of Psychology, Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China
34
Joos K, Gilles A, Van de Heyning P, De Ridder D, Vanneste S. From sensation to percept: The neural signature of auditory event-related potentials. Neurosci Biobehav Rev 2014; 42:148-56. [DOI: 10.1016/j.neubiorev.2014.02.009] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2013] [Revised: 02/17/2014] [Accepted: 02/19/2014] [Indexed: 10/25/2022]
35
Richardson BD, Hancock KE, Caspary DM. Stimulus-specific adaptation in auditory thalamus of young and aged awake rats. J Neurophysiol 2013; 110:1892-902. [PMID: 23904489 DOI: 10.1152/jn.00403.2013] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Novel stimulus detection by single neurons in the auditory system, known as stimulus-specific adaptation (SSA), appears to function as a real-time filtering/gating mechanism in processing acoustic information. Particular stimulus paradigms allowing for quantification of a neuron's ability to detect novel or deviant stimuli have been used to examine SSA in the inferior colliculus, medial geniculate body (MGB), and auditory cortex of anesthetized rodents. However, the study of SSA in awake animals is limited to auditory cortex. The present study used individually advanceable tetrodes to record single-unit responses from auditory thalamus (MGB) of awake young adult and aged Fischer Brown Norway (FBN) rats to 1) examine the presence of SSA in the MGB of awake rats and 2) determine whether SSA is altered by aging in MGB. MGB single units in awake FBN rats displayed SSA in response to two stimulus paradigms: the oddball paradigm and a random blocked/interleaved presentation of a set of frequencies. SSA levels were modestly, but nonsignificantly, increased in the nonlemniscal regions of the MGB and at lower stimulus intensities, where 27 of 57 (47%) young adult MGB units displayed SSA. The present findings provide the initial description of SSA in the MGB of awake rats and support SSA as being qualitatively independent of arousal level or anesthetized state. Finally, contrary to previous studies in auditory cortex of anesthetized rats, MGB units in aged rats showed SSA levels indistinguishable from SSA levels in young adult rats, suggesting that SSA in MGB was not impacted by aging in an awake preparation.
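Quantifying SSA from the paradigms above typically reduces to a normalized index contrasting a neuron's firing rate to a tone when it is deviant versus when it is standard. The formula below is the generic index common in this literature; the exact variant used in the study is defined in its methods:

```python
def ssa_index(deviant_rate, standard_rate):
    """Common SSA index: normalized rate difference between responses to
    a tone presented as deviant vs. as standard. Ranges from -1 to 1;
    positive values indicate stronger responses to the deviant (SSA)."""
    total = deviant_rate + standard_rate
    return (deviant_rate - standard_rate) / total if total else 0.0

print(ssa_index(30.0, 10.0))  # 0.5: response adapted to the standard
print(ssa_index(10.0, 10.0))  # 0.0: no stimulus-specific adaptation
```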
Affiliation(s)
- Ben D Richardson
- Department of Pharmacology, Southern Illinois University School of Medicine, Springfield, Illinois
36
Zhu L, Bharadwaj H, Xia J, Shinn-Cunningham B. A comparison of spectral magnitude and phase-locking value analyses of the frequency-following response to complex tones. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:384-395. [PMID: 23862815 PMCID: PMC3724813 DOI: 10.1121/1.4807498] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/25/2012] [Revised: 04/07/2013] [Accepted: 04/23/2013] [Indexed: 05/31/2023]
Abstract
Two experiments, both presenting diotic, harmonic tone complexes (100 Hz fundamental), were conducted to explore the envelope-related component of the frequency-following response (FFR_ENV), a measure of synchronous, subcortical neural activity evoked by a periodic acoustic input. Experiment 1 directly compared two common analysis methods, computing the magnitude spectrum and the phase-locking value (PLV). Bootstrapping identified which FFR_ENV frequency components were statistically above the noise floor for each metric and quantified the statistical power of the approaches. Across listeners and conditions, the two methods produced highly correlated results. However, PLV analysis required fewer processing stages to produce readily interpretable results. Moreover, at the fundamental frequency of the input, PLVs were farther above the metric's noise floor than spectral magnitudes. Having established the advantages of PLV analysis, the efficacy of the approach was further demonstrated by investigating how different acoustic frequencies contribute to FFR_ENV, analyzing responses to complex tones composed of different acoustic harmonics of 100 Hz (Experiment 2). Results show that the FFR_ENV response is dominated by peripheral auditory channels responding to unresolved harmonics, although low-frequency channels driven by resolved harmonics also contribute. These results demonstrate the utility of the PLV for quantifying the strength of FFR_ENV across conditions.
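The PLV comparison above rests on a compact computation: take the phase of each FFT bin per trial, average the unit phasors across trials, and take the magnitude of that mean. A toy illustration on synthetic data (signal frequency, trial count, and noise level are arbitrary choices, not the study's parameters):

```python
import numpy as np

def plv(trials):
    """Phase-locking value across trials at each FFT bin: the magnitude
    of the mean unit phasor. 1 = perfectly consistent phase across
    trials; near 0 = random phase (no phase locking)."""
    spectra = np.fft.rfft(trials, axis=-1)
    phasors = spectra / np.abs(spectra)   # keep phase, discard magnitude
    return np.abs(phasors.mean(axis=0))

# 50 toy "trials": a phase-locked 100 Hz component plus random noise.
# With fs = n = 2000, FFT bin k corresponds to k Hz exactly.
fs = n = 2000
t = np.arange(n) / fs
rng = np.random.default_rng(1)
trials = np.sin(2 * np.pi * 100 * t) + rng.normal(0, 1, (50, n))
values = plv(trials)
print(values[100] > 0.9, values[150] < 0.5)  # locked at 100 Hz only
```

Because the magnitude is discarded before averaging, the PLV's noise floor depends only on trial count, which is one reason it can sit farther above the floor than a spectral-magnitude measure.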
Affiliation(s)
- Li Zhu
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, 100084, People's Republic of China
37
Primitive auditory memory is correlated with spatial unmasking that is based on direct-reflection integration. PLoS One 2013; 8:e63106. [PMID: 23658664 PMCID: PMC3639177 DOI: 10.1371/journal.pone.0063106] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2012] [Accepted: 03/28/2013] [Indexed: 11/19/2022] Open
Abstract
In reverberant rooms with multiple-people talking, spatial separation between speech sources improves recognition of attended speech, even though both the head-shadowing and interaural-interaction unmasking cues are limited by numerous reflections. It is the perceptual integration between the direct wave and its reflections that bridges the direct-reflection temporal gaps and results in the spatial unmasking under reverberant conditions. This study further investigated (1) the temporal dynamic of the direct-reflection-integration-based spatial unmasking as a function of the reflection delay, and (2) whether this temporal dynamic is correlated with the listeners’ auditory ability to temporally retain raw acoustic signals (i.e., the fast decaying primitive auditory memory, PAM). The results showed that recognition of the target speech against the speech-masker background is a descending exponential function of the delay of the simulated target reflection. In addition, the temporal extent of PAM is frequency dependent and markedly longer than that for perceptual fusion. More importantly, the temporal dynamic of the speech-recognition function is significantly correlated with the temporal extent of the PAM of low-frequency raw signals. Thus, we propose that a chain process, which links the earlier-stage PAM with the later-stage correlation computation, perceptual integration, and attention facilitation, plays a role in spatially unmasking target speech under reverberant conditions.
38
Wu C, Cao S, Wu X, Li L. Temporally pre-presented lipreading cues release speech from informational masking. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:EL281-EL285. [PMID: 23556692 DOI: 10.1121/1.4794933] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Listeners can use temporally pre-presented content cues and concurrently presented lipreading cues to improve speech recognition under masking conditions. This study investigated whether temporally pre-presented lipreading cues also unmask speech. In a test trial, before the target sentence was co-presented with the masker, either target-matched (priming) lipreading video or static face (priming-control) video was presented in quiet. Participants' target-recognition performance was improved by a shift from the priming-control condition to the priming condition when the masker was speech but not noise. This release from informational masking suggests a combined effect of working memory and cross-modal integration on selective attention to target speech.
Affiliation(s)
- Chao Wu
- Department of Psychology, Department of Machine Intelligence, Speech and Hearing Research Center, Key Laboratory on Machine Perception, Ministry of Education, Peking University, Beijing 100871, China
39
Perceived target–masker separation unmasks responses of lateral amygdala to the emotionally conditioned target sounds in awake rats. Neuroscience 2012; 225:249-57. [DOI: 10.1016/j.neuroscience.2012.08.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2012] [Revised: 08/10/2012] [Accepted: 08/14/2012] [Indexed: 11/20/2022]