1. MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term neuroplasticity interact during the perceptual learning of concurrent speech. Cereb Cortex 2024; 34:bhad543. [PMID: 38212291 PMCID: PMC10839853 DOI: 10.1093/cercor/bhad543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 12/20/2023] [Accepted: 12/21/2023] [Indexed: 01/13/2024] [Open Access]
Abstract
Plasticity from auditory experience shapes the brain's encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45 min training sessions recorded simultaneously with high-density electroencephalography (EEG). We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. Although both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings reinforce the domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity, which first emerge at a cortical level.
Affiliation(s)
- Jessica MacLean
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
  - Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Jack Stirn
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Alexandria Sisson
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Gavin M Bidelman
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
  - Program in Neuroscience, Indiana University, Bloomington, IN, USA
  - Cognitive Science Program, Indiana University, Bloomington, IN, USA
2. MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term experience-dependent neuroplasticity interact during the perceptual learning of concurrent speech. bioRxiv 2023:2023.09.26.559640. [PMID: 37808665 PMCID: PMC10557636 DOI: 10.1101/2023.09.26.559640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Plasticity from auditory experiences shapes brain encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ∼45 minute training sessions recorded simultaneously with high-density EEG. We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. While both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings confirm domain-general benefits for musicianship but reveal successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity that first emerge at a cortical level.
3. Gohari N, Hosseini Dastgerdi Z, Bernstein LJ, Alain C. Neural correlates of concurrent sound perception: A review and guidelines for future research. Brain Cogn 2022; 163:105914. [PMID: 36155348 DOI: 10.1016/j.bandc.2022.105914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 08/30/2022] [Accepted: 09/02/2022] [Indexed: 11/02/2022]
Abstract
The perception of concurrent sound sources depends on processes (i.e., auditory scene analysis) that fuse and segregate acoustic features according to harmonic relations, temporal coherence, and binaural cues (encompass dichotic pitch, location difference, simulated echo). The object-related negativity (ORN) and P400 are electrophysiological indices of concurrent sound perception. Here, we review the different paradigms used to study concurrent sound perception and the brain responses obtained from these paradigms. Recommendations regarding the design and recording parameters of the ORN and P400 are made, and their clinical applications in assessing central auditory processing ability in different populations are discussed.
Affiliation(s)
- Nasrin Gohari
  - Department of Audiology, School of Rehabilitation, Hamadan University of Medical Sciences, Hamadan, Iran
- Zahra Hosseini Dastgerdi
  - Department of Audiology, School of Rehabilitation, Isfahan University of Medical Sciences, Isfahan, Iran
- Lori J Bernstein
  - Department of Supportive Care, University Health Network, and Department of Psychiatry, University of Toronto, Toronto, Canada
- Claude Alain
  - Rotman Research Institute, Baycrest Centre for Geriatric Care & Department of Psychology, University of Toronto, Canada
4. Bsharat-Maalouf D, Karawani H. Bilinguals' speech perception in noise: Perceptual and neural associations. PLoS One 2022; 17:e0264282. [PMID: 35196339 PMCID: PMC8865662 DOI: 10.1371/journal.pone.0264282] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/05/2021] [Accepted: 02/07/2022] [Indexed: 01/26/2023] [Open Access]
Abstract
The current study characterized subcortical speech sound processing among monolinguals and bilinguals in quiet and challenging listening conditions and examined the relation between subcortical neural processing and perceptual performance. A total of 59 normal-hearing adults, ages 19–35 years, participated in the study: 29 native Hebrew-speaking monolinguals and 30 Arabic-Hebrew-speaking bilinguals. Auditory brainstem responses to speech sounds were collected in a quiet condition and with background noise. The perception of words and sentences in quiet and background noise conditions was also examined to assess perceptual performance and to evaluate the perceptual-physiological relationship. Perceptual performance was tested among bilinguals in both languages (first language (L1-Arabic) and second language (L2-Hebrew)). The outcomes were similar between monolingual and bilingual groups in quiet. Noise, as expected, resulted in deterioration in perceptual and neural responses, which was reflected in lower accuracy in perceptual tasks compared to quiet, and in more prolonged latencies and diminished neural responses. However, a mixed picture was observed among bilinguals in perceptual and physiological outcomes in noise. In the perceptual measures, bilinguals were significantly less accurate than their monolingual counterparts. However, in neural responses, bilinguals demonstrated earlier peak latencies compared to monolinguals. Our results also showed that perceptual performance in noise was related to subcortical resilience to the disruption caused by background noise. Specifically, in noise, increased brainstem resistance (i.e., fewer changes in the fundamental frequency (F0) representations or fewer shifts in the neural timing) was related to better speech perception among bilinguals. Better perception in L1 in noise was correlated with fewer changes in F0 representations, and more accurate perception in L2 was related to minor shifts in auditory neural timing. 
This study delves into the importance of using neural brainstem responses to speech sounds to differentiate individuals with different language histories and to explain inter-subject variability in bilinguals’ perceptual abilities in daily life situations.
Affiliation(s)
- Dana Bsharat-Maalouf
  - Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Hanin Karawani
  - Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
5. Defining the Role of Attention in Hierarchical Auditory Processing. Audiol Res 2021; 11:112-128. [PMID: 33805600 PMCID: PMC8006147 DOI: 10.3390/audiolres11010012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/05/2021] [Revised: 03/07/2021] [Accepted: 03/10/2021] [Indexed: 01/09/2023] [Open Access]
Abstract
Communication in noise is a complex process requiring efficient neural encoding throughout the entire auditory pathway as well as contributions from higher-order cognitive processes (i.e., attention) to extract speech cues for perception. Thus, identifying effective clinical interventions for individuals with speech-in-noise deficits relies on the disentanglement of bottom-up (sensory) and top-down (cognitive) factors to appropriately determine the area of deficit; yet, how attention may interact with early encoding of sensory inputs remains unclear. For decades, attentional theorists have attempted to address this question with cleverly designed behavioral studies, but the neural processes and interactions underlying attention's role in speech perception remain unresolved. While anatomical and electrophysiological studies have investigated the neurological structures contributing to attentional processes and revealed relevant brain-behavior relationships, recent electrophysiological techniques (i.e., simultaneous recording of brainstem and cortical responses) may provide novel insight regarding the relationship between early sensory processing and top-down attentional influences. In this article, we review relevant theories that guide our present understanding of attentional processes, discuss current electrophysiological evidence of attentional involvement in auditory processing across subcortical and cortical levels, and propose areas for future study that will inform the development of more targeted and effective clinical interventions for individuals with speech-in-noise deficits.
6. Mahmud MS, Yeasin M, Bidelman GM. Speech categorization is better described by induced rather than evoked neural activity. The Journal of the Acoustical Society of America 2021; 149:1644. [PMID: 33765780 PMCID: PMC8267855 DOI: 10.1121/10.0003572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 05/08/2023]
Abstract
Categorical perception (CP) describes how the human brain categorizes speech despite inherent acoustic variability. We examined neural correlates of CP in both evoked and induced electroencephalogram (EEG) activity to evaluate which mode best describes the process of speech categorization. Listeners labeled sounds from a vowel gradient while we recorded their EEGs. Using a source reconstructed EEG, we used band-specific evoked and induced neural activity to build parameter optimized support vector machine models to assess how well listeners' speech categorization could be decoded via whole-brain and hemisphere-specific responses. We found whole-brain evoked β-band activity decoded prototypical from ambiguous speech sounds with ∼70% accuracy. However, induced γ-band oscillations showed better decoding of speech categories with ∼95% accuracy compared to evoked β-band activity (∼70% accuracy). Induced high frequency (γ-band) oscillations dominated CP decoding in the left hemisphere, whereas lower frequencies (θ-band) dominated the decoding in the right hemisphere. Moreover, feature selection identified 14 brain regions carrying induced activity and 22 regions of evoked activity that were most salient in describing category-level speech representations. Among the areas and neural regimes explored, induced γ-band modulations were most strongly associated with listeners' behavioral CP. The data suggest that the category-level organization of speech is dominated by relatively high frequency induced brain rhythms.
Affiliation(s)
- Md Sultan Mahmud
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
- Mohammed Yeasin
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
- Gavin M Bidelman
  - School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
7. Bidelman GM, Bush LC, Boudreaux AM. Effects of Noise on the Behavioral and Neural Categorization of Speech. Front Neurosci 2020; 14:153. [PMID: 32180700 PMCID: PMC7057933 DOI: 10.3389/fnins.2020.00153] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/17/2019] [Accepted: 02/10/2020] [Indexed: 02/02/2023] [Open Access]
Abstract
We investigated whether the categorical perception (CP) of speech might also provide a mechanism that aids its perception in noise. We varied signal-to-noise ratio (SNR) [clear, 0 dB, -5 dB] while listeners classified an acoustic-phonetic continuum (/u/ to /a/). Noise-related changes in behavioral categorization were only observed at the lowest SNR. Event-related brain potentials (ERPs) differentiated category vs. category-ambiguous speech by the P2 wave (~180-320 ms). Paralleling behavior, neural responses to speech with clear phonetic status (i.e., continuum endpoints) were robust to noise down to -5 dB SNR, whereas responses to ambiguous tokens declined with decreasing SNR. Results demonstrate that phonetic speech representations are more resistant to degradation than corresponding acoustic representations. Findings suggest the mere process of binning speech sounds into categories provides a robust mechanism to aid figure-ground speech perception by fortifying abstract categories from the acoustic signal and making the speech code more resistant to external interferences.
Affiliation(s)
- Gavin M Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
  - Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, United States
- Lauren C Bush
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Alex M Boudreaux
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
8. Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. [PMID: 30796895 DOI: 10.1016/j.brainres.2019.02.025] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/15/2018] [Revised: 01/07/2019] [Accepted: 02/19/2019] [Indexed: 01/20/2023]
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFR) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST) presented diotically in either clean (no noise) or noise-degraded (+5dB SNR) conditions. Listeners also performed a speeded double vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with F0 differences between vowels, and this perceptual F0 benefit was larger for clean compared to noise degraded (+5dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single compared to double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with a non-linear constructive interference at 0ST, and suppression effects at 4ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predict the success of identifying concurrent speech.
Affiliation(s)
- Anusha Yellamsetty
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
  - Department of Communication Sciences & Disorders, University of South Florida, USA
- Gavin M Bidelman
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
9. Brainstem-cortical functional connectivity for speech is differentially challenged by noise and reverberation. Hear Res 2018; 367:149-160. [PMID: 29871826 DOI: 10.1016/j.heares.2018.05.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 01/25/2018] [Revised: 05/18/2018] [Accepted: 05/23/2018] [Indexed: 11/21/2022]
Abstract
Everyday speech perception is challenged by external acoustic interferences that hinder verbal communication. Here, we directly compared how different levels of the auditory system (brainstem vs. cortex) code speech and how their neural representations are affected by two acoustic stressors: noise and reverberation. We recorded multichannel (64 ch) brainstem frequency-following responses (FFRs) and cortical event-related potentials (ERPs) simultaneously in normal hearing individuals to speech sounds presented in mild and moderate levels of noise and reverb. We matched signal-to-noise and direct-to-reverberant ratios to equate the severity between classes of interference. Electrode recordings were parsed into source waveforms to assess the relative contribution of region-specific brain areas [i.e., brainstem (BS), primary auditory cortex (A1), inferior frontal gyrus (IFG)]. Results showed that reverberation was less detrimental to (and in some cases facilitated) the neural encoding of speech compared to additive noise. Inter-regional correlations revealed associations between BS and A1 responses, suggesting subcortical speech representations influence higher auditory-cortical areas. Functional connectivity analyses further showed that directed signaling toward A1 in both feedforward cortico-collicular (BS→A1) and feedback cortico-cortical (IFG→A1) pathways were strong predictors of degraded speech perception and differentiated "good" vs. "poor" perceivers. Our findings demonstrate a functional interplay within the brain's speech network that depends on the form and severity of acoustic interference. We infer that in addition to the quality of neural representations within individual brain regions, listeners' success at the "cocktail party" is modulated based on how information is transferred among subcortical and cortical hubs of the auditory-linguistic network.
10. Yellamsetty A, Bidelman GM. Low- and high-frequency cortical brain oscillations reflect dissociable mechanisms of concurrent speech segregation in noise. Hear Res 2018; 361:92-102. [PMID: 29398142 DOI: 10.1016/j.heares.2018.01.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/24/2017] [Revised: 12/09/2017] [Accepted: 01/12/2018] [Indexed: 10/18/2022]
Abstract
Parsing simultaneous speech requires listeners use pitch-guided segregation which can be affected by the signal-to-noise ratio (SNR) in the auditory scene. The interaction of these two cues may occur at multiple levels within the cortex. The aims of the current study were to assess the correspondence between oscillatory brain rhythms and determine how listeners exploit pitch and SNR cues to successfully segregate concurrent speech. We recorded electrical brain activity while participants heard double-vowel stimuli whose fundamental frequencies (F0s) differed by zero or four semitones (STs) presented in either clean or noise-degraded (+5 dB SNR) conditions. We found that behavioral identification was more accurate for vowel mixtures with larger pitch separations but F0 benefit interacted with noise. Time-frequency analysis decomposed the EEG into different spectrotemporal frequency bands. Low-frequency (θ, β) responses were elevated when speech did not contain pitch cues (0ST > 4ST) or was noisy, suggesting a correlate of increased listening effort and/or memory demands. Contrastively, γ power increments were observed for changes in both pitch (0ST > 4ST) and SNR (clean > noise), suggesting high-frequency bands carry information related to acoustic features and the quality of speech representations. Brain-behavior associations corroborated these effects; modulations in low-frequency rhythms predicted the speed of listeners' perceptual decisions with higher bands predicting identification accuracy. Results are consistent with the notion that neural oscillations reflect both automatic (pre-perceptual) and controlled (post-perceptual) mechanisms of speech processing that are largely divisible into high- and low-frequency bands of human brain rhythms.
Affiliation(s)
- Anusha Yellamsetty
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
11. Bidelman GM. Sonification of scalp-recorded frequency-following responses (FFRs) offers improved response detection over conventional statistical metrics. J Neurosci Methods 2018; 293:59-66. [DOI: 10.1016/j.jneumeth.2017.09.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/09/2017] [Revised: 08/15/2017] [Accepted: 09/12/2017] [Indexed: 11/30/2022]
12. Bidelman GM, Yellamsetty A. Noise and pitch interact during the cortical segregation of concurrent speech. Hear Res 2017; 351:34-44. [PMID: 28578876 DOI: 10.1016/j.heares.2017.05.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/24/2017] [Revised: 05/09/2017] [Accepted: 05/23/2017] [Indexed: 10/19/2022]
Abstract
Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds, the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 × SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms.
Affiliation(s)
- Gavin M Bidelman
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, 38152, USA
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, 38163, USA
- Anusha Yellamsetty
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, 38152, USA
13. Snyder JS, Elhilali M. Recent advances in exploring the neural underpinnings of auditory scene perception. Ann N Y Acad Sci 2017; 1396:39-55. [PMID: 28199022 PMCID: PMC5446279 DOI: 10.1111/nyas.13317] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/30/2016] [Revised: 12/21/2016] [Accepted: 01/08/2017] [Indexed: 11/29/2022]
Abstract
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds (and conventional behavioral techniques) to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the last few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field.
Affiliation(s)
- Joel S. Snyder
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas, Nevada
- Mounya Elhilali
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, Maryland
14. Bidelman GM, Walker BS. Attentional modulation and domain-specificity underlying the neural organization of auditory categorical perception. Eur J Neurosci 2017; 45:690-699. [DOI: 10.1111/ejn.13526] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 10/17/2016] [Revised: 01/13/2017] [Accepted: 01/13/2017] [Indexed: 11/29/2022]
Affiliation(s)
- Gavin M. Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, TN 38152, USA
  - Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, USA
- Breya S. Walker
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - Department of Psychology, University of Memphis, Memphis, TN, USA
15. Neural Correlates of Speech Segregation Based on Formant Frequencies of Adjacent Vowels. Sci Rep 2017; 7:40790. [PMID: 28102300 PMCID: PMC5244401 DOI: 10.1038/srep40790] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 07/25/2016] [Accepted: 12/09/2016] [Indexed: 11/25/2022] [Open Access]
Abstract
The neural substrates by which speech sounds are perceptually segregated into distinct streams are poorly understood. Here, we recorded high-density scalp event-related potentials (ERPs) while participants were presented with a cyclic pattern of three vowel sounds (/ee/-/ae/-/ee/). Each trial consisted of an adaptation sequence, which could have either a small, intermediate, or large difference in first formant (Δf1) as well as a test sequence, in which Δf1 was always intermediate. For the adaptation sequence, participants tended to hear two streams (“streaming”) when Δf1 was intermediate or large compared to when it was small. For the test sequence, in which Δf1 was always intermediate, the pattern was usually reversed, with participants hearing a single stream with increasing Δf1 in the adaptation sequences. During the adaptation sequence, Δf1-related brain activity was found between 100–250 ms after the /ae/ vowel over fronto-central and left temporal areas, consistent with generation in auditory cortex. For the test sequence, prior stimulus modulated ERP amplitude between 20–150 ms over left fronto-central scalp region. Our results demonstrate that the proximity of formants between adjacent vowels is an important factor in the perceptual organization of speech, and reveal a widely distributed neural network supporting perceptual grouping of speech sounds.
|
16
|
Communicating in Challenging Environments: Noise and Reverberation. The Frequency-Following Response 2017. [DOI: 10.1007/978-3-319-47944-6_8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/24/2022]
|
17
|
Bidelman GM. Relative contribution of envelope and fine structure to the subcortical encoding of noise-degraded speech. The Journal of the Acoustical Society of America 2016; 140:EL358. [PMID: 27794347 DOI: 10.1121/1.4965248] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/06/2023]
Abstract
Brainstem frequency-following responses (FFR) were elicited to the speech token /ama/ in noise containing only envelope (ENV) or fine structure (TFS) cues to assess the relative contribution of these temporal features to the neural encoding of degraded speech. Successive cue removal weakened FFRs with noise having the most deleterious effect on TFS coding. Neuro-acoustic and response-to-response correlations revealed speech-FFRs are dominated by stimulus ENV for clean speech, with TFS making a stronger contribution in moderate noise levels. Results suggest that the relative weighting of temporal ENV and TFS cues to the neural transcription of speech depends critically on the degree of noise in the soundscape.
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, Tennessee 38152, USA
|
18
|
Auditory perceptual restoration and illusory continuity correlates in the human brainstem. Brain Res 2016; 1646:84-90. [DOI: 10.1016/j.brainres.2016.05.050] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/11/2015] [Revised: 05/20/2016] [Accepted: 05/26/2016] [Indexed: 11/22/2022]
|
19
|
Right-ear advantage drives the link between olivocochlear efferent 'antimasking' and speech-in-noise listening benefits. Neuroreport 2016; 26:483-7. [PMID: 25919996 DOI: 10.1097/wnr.0000000000000376] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Indexed: 11/26/2022]
Abstract
The mammalian cochlea receives feedback from the brainstem medial olivocochlear (MOC) efferents, whose putative 'antimasking' function is to adjust cochlear amplification and enhance peripheral signal detection in adverse listening environments. Human studies have been inconsistent in demonstrating a clear connection between this corticofugal system and behavioral speech-in-noise (SIN) listening skills. To elucidate the role of brainstem efferent activity in SIN perception, we measured ear-specific contralateral suppression of transient-evoked otoacoustic emissions (OAEs), a proxy measure of MOC activation linked to auditory learning in noisy environments. We show that suppression of cochlear emissions is stronger with a more basal cochlear bias in the right ear compared with the left ear. Moreover, a strong negative correlation was observed between behavioral SIN performance and right-ear OAE suppression magnitudes, such that lower speech reception thresholds in noise were predicted by larger amounts of MOC-related activity. This brain-behavioral relation was not observed for left ear SIN perception. The rightward bias in contralateral MOC suppression of OAEs, coupled with the stronger association between physiological and perceptual measures, is consistent with left-hemisphere cerebral dominance for speech-language processing. We posit that corticofugal feedback from the left cerebral cortex through descending MOC projections sensitizes the right cochlea to signal-in-noise detection, facilitating figure-ground contrast and improving degraded speech analysis. Our findings demonstrate that SIN listening is at least partly driven by subcortical brain mechanisms; primitive stages of cochlear processing and brainstem MOC modulation of (right) inner ear mechanics play a critical role in dictating SIN understanding.
|
20
|
Bidelman GM, Howell M. Functional changes in inter- and intra-hemispheric cortical processing underlying degraded speech perception. Neuroimage 2015; 124:581-590. [PMID: 26386346 DOI: 10.1016/j.neuroimage.2015.09.020] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 03/06/2015] [Revised: 07/29/2015] [Accepted: 09/09/2015] [Indexed: 11/18/2022]
Abstract
Previous studies suggest that at poorer signal-to-noise ratios (SNRs), auditory cortical event-related potentials are weakened, prolonged, and show a shift in the functional lateralization of cerebral processing from left to right hemisphere. Increased right hemisphere involvement during speech-in-noise (SIN) processing may reflect the recruitment of additional brain resources to aid speech recognition or alternatively, the progressive loss of involvement from left linguistic brain areas as speech becomes more impoverished (i.e., nonspeech-like). To better elucidate the brain basis of SIN perception, we recorded neuroelectric activity in normal hearing listeners to speech sounds presented at various SNRs. Behaviorally, listeners obtained superior SIN performance for speech presented to the right compared to the left ear (i.e., right ear advantage). Source analysis of neural data assessed the relative contribution of region-specific neural generators (linguistic and auditory brain areas) to SIN processing. We found that left inferior frontal brain areas (e.g., Broca's areas) partially disengage at poorer SNRs but responses do not right lateralize with increasing noise. In contrast, auditory sources showed more resilience to noise in left compared to right primary auditory cortex but also a progressive shift in dominance from left to right hemisphere at lower SNRs. Region- and ear-specific correlations revealed that listeners' right ear SIN advantage was predicted by source activity emitted from inferior frontal gyrus (but not primary auditory cortex). Our findings demonstrate changes in the functional asymmetry of cortical speech processing during adverse acoustic conditions and suggest that "cocktail party" listening skills depend on the quality of speech representations in the left cerebral hemisphere rather than compensatory recruitment of right hemisphere mechanisms.
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA.
- Megan Howell
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
|
21
|
Deviance-Related Responses along the Auditory Hierarchy: Combined FFR, MLR and MMN Evidence. PLoS One 2015; 10:e0136794. [PMID: 26348628 PMCID: PMC4562708 DOI: 10.1371/journal.pone.0136794] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/02/2015] [Accepted: 08/08/2015] [Indexed: 11/19/2022]
Abstract
The mismatch negativity (MMN) provides a correlate of automatic auditory discrimination in human auditory cortex that is elicited in response to violation of any acoustic regularity. Recently, deviance-related responses were found at much earlier cortical processing stages as reflected by the middle latency response (MLR) of the auditory evoked potential, and even at the level of the auditory brainstem as reflected by the frequency following response (FFR). However, no study has reported deviance-related responses in the FFR, MLR and long latency response (LLR) concurrently in a single recording protocol. Amplitude-modulated (AM) sounds were presented to healthy human participants in a frequency oddball paradigm to investigate deviance-related responses along the auditory hierarchy in the ranges of FFR, MLR and LLR. AM frequency deviants modulated the FFR, the Na and Nb components of the MLR, and the LLR eliciting the MMN. These findings demonstrate that it is possible to elicit deviance-related responses at three different levels (FFR, MLR and LLR) in one single recording protocol, highlight the involvement of the whole auditory hierarchy in deviance detection and have implications for cognitive and clinical auditory neuroscience. Moreover, the present protocol provides a new research tool into clinical neuroscience so that the functional integrity of the auditory novelty system can now be tested as a whole in a range of clinical populations where the MMN was previously shown to be defective.
|