1. Carney LH. Neural Fluctuation Contrast as a Code for Complex Sounds: The Role and Control of Peripheral Nonlinearities. Hear Res 2024;443:108966. PMID: 38310710; PMCID: PMC10923127; DOI: 10.1016/j.heares.2024.108966.
Abstract
The nonlinearities of the inner ear are often considered to be obstacles that the central nervous system has to overcome to decode neural responses to sounds. This review describes how peripheral nonlinearities, such as saturation of the inner-hair-cell (IHC) response and of the IHC-auditory-nerve synapse, are instead beneficial to the neural encoding of complex sounds such as speech. These nonlinearities set up contrast in the depth of neural fluctuations in auditory-nerve responses along the tonotopic axis, referred to here as neural fluctuation contrast (NFC). Physiological support for the NFC coding hypothesis is reviewed, and predictions of several psychophysical phenomena, including masked detection and speech intelligibility, are presented. Lastly, a framework based on the NFC code for understanding how the medial olivocochlear (MOC) efferent system contributes to the coding of complex sounds is presented. By modulating cochlear gain control in response to both sound energy and fluctuations in neural responses, the MOC system is hypothesized to function not as a simple feedback gain-control device, but rather as a mechanism for enhancing NFC along the tonotopic axis, enabling robust encoding of complex sounds across a wide range of sound levels and in the presence of background noise. Effects of sensorineural hearing loss on the NFC code and on the MOC feedback system are also discussed.
Affiliation(s)
- Laurel H Carney
- Depts. of Biomedical Engineering, Neuroscience, and Electrical & Computer Engineering, University of Rochester, Rochester, NY, USA.
2. Heeringa AN, Jüchter C, Beutelmann R, Klump GM, Köppl C. Altered neural encoding of vowels in noise does not affect behavioral vowel discrimination in gerbils with age-related hearing loss. Front Neurosci 2023;17:1238941. PMID: 38033551; PMCID: PMC10682387; DOI: 10.3389/fnins.2023.1238941.
Abstract
Introduction: Understanding speech in a noisy environment, as opposed to speech in quiet, becomes increasingly difficult with increasing age. Using the quiet-aged gerbil, we studied the effects of aging on speech-in-noise processing. Specifically, behavioral vowel discrimination and the encoding of these vowels by single auditory-nerve fibers were compared, to elucidate some of the underlying mechanisms of age-related speech-in-noise perception deficits.
Methods: Young-adult and quiet-aged Mongolian gerbils, of either sex, were trained to discriminate a deviant naturally spoken vowel in a sequence of vowel standards against a speech-like background noise. In addition, we recorded responses from single auditory-nerve fibers of young-adult and quiet-aged gerbils while presenting the same speech stimuli.
Results: Behavioral vowel discrimination was not significantly affected by aging. For both young-adult and quiet-aged gerbils, the behavioral discrimination between /eː/ and /iː/ was more difficult than /eː/ vs. /aː/ or /iː/ vs. /aː/, as evidenced by longer response times and lower d' values. In young adults, spike-timing-based vowel discrimination agreed with the behavioral vowel discrimination, while in quiet-aged gerbils it did not. Paradoxically, discrimination between vowels based on temporal responses was enhanced in aged gerbils for all vowel comparisons. Representation schemes based on the spectrum of the inter-spike-interval histogram revealed stronger encoding of both the fundamental and the lower formant frequencies in fibers of quiet-aged gerbils, but no qualitative changes in vowel encoding. Elevated thresholds in combination with a fixed stimulus level, i.e., lower sensation levels of the stimuli for old individuals, can explain the enhanced temporal coding of the vowels in noise.
Discussion: These results suggest that the altered auditory-nerve discrimination metrics in old gerbils may mask age-related deterioration in the central (auditory) system to the extent that behavioral vowel discrimination matches that of the young adults.
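The inter-spike-interval (ISI) histogram spectrum used in this study can be sketched in a few lines. This is a generic illustration, not the authors' analysis code; the bin width, interval range, and the synthetic 200-Hz spike train are all assumptions chosen for the demo.

```python
import numpy as np

def isi_histogram_spectrum(spike_times, bin_width=1e-4, max_isi=0.02):
    """Magnitude spectrum of the first-order inter-spike-interval histogram.

    Peaks in this spectrum mark stimulus periodicities (e.g., the fundamental
    or formant frequencies) that dominate spike timing.
    """
    isis = np.diff(np.sort(spike_times))           # first-order ISIs (s)
    isis = isis[isis <= max_isi]
    n_bins = int(round(max_isi / bin_width))
    hist, _ = np.histogram(isis, bins=n_bins, range=(0.0, max_isi))
    hist = hist - hist.mean()                      # remove DC before the FFT
    freqs = np.fft.rfftfreq(n_bins, d=bin_width)   # Hz
    return freqs, np.abs(np.fft.rfft(hist))

# Synthetic fiber phase-locked to a 200-Hz fundamental: intervals fall at
# multiples of the 5-ms period (real fibers often skip cycles), plus jitter.
rng = np.random.default_rng(0)
period = 0.005
isis = rng.integers(1, 4, 2000) * period + rng.normal(0.0, 2e-4, 2000)
spikes = np.concatenate(([0.0], np.cumsum(isis)))
freqs, spec = isi_histogram_spectrum(spikes)
peak_hz = freqs[np.argmax(spec)]                   # near the 200-Hz fundamental
```

Because the histogram is a comb of bumps at multiples of the period, its spectrum peaks at the fundamental; stronger encoding of F0 or a formant shows up directly as a larger peak at that frequency.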
Affiliation(s)
- Amarins N. Heeringa
- Research Centre Neurosensory Science and Cluster of Excellence “Hearing4all”, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
3. Rizzi R, Bidelman GM. Duplex perception reveals brainstem auditory representations are modulated by listeners' ongoing percept for speech. Cereb Cortex 2023;33:10076-10086. PMID: 37522248; PMCID: PMC10502779; DOI: 10.1093/cercor/bhad266.
Abstract
So-called duplex speech stimuli present perceptually ambiguous spectral cues to one ear and an isolated low- versus high-frequency third-formant "chirp" to the opposite ear, yielding a coherent percept that supports phonetic categorization. Critically, such dichotic sounds are only perceived categorically upon binaural integration. Here, we used frequency-following responses (FFRs), scalp-recorded potentials reflecting phase-locked subcortical activity, to investigate brainstem responses to fused speech percepts and to determine whether FFRs reflect binaurally integrated category-level representations. We recorded FFRs to diotic and dichotic stop-consonants (/da/, /ga/) that either did or did not require binaural fusion to label properly, along with perceptually ambiguous sounds without clear phonetic identity. Behaviorally, listeners showed clear categorization of dichotic speech tokens, confirming they were heard with a fused, phonetic percept. Neurally, we found FFRs were stronger for categorically perceived speech relative to category-ambiguous tokens but also differentiated phonetic categories for both diotically and dichotically presented speech sounds. Correlations between neural and behavioral data further showed FFR latency predicted the degree to which listeners labeled tokens as "da" versus "ga." The presence of binaurally integrated, category-level information in FFRs suggests human brainstem processing reflects a surprisingly abstract level of the speech code typically circumscribed to much later cortical processing.
Affiliation(s)
- Rose Rizzi
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Gavin M Bidelman
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
4. Hamza Y, Farhadi A, Schwarz DM, McDonough JM, Carney LH. Representations of fricatives in subcortical model responses: Comparisons with human consonant perception. J Acoust Soc Am 2023;154:602-618. PMID: 37535429; PMCID: PMC10550336; DOI: 10.1121/10.0020536.
Abstract
Fricatives are obstruent consonants produced by airflow constrictions in the vocal tract that create turbulence at the constriction or at a site downstream from it. Fricatives exhibit significant intra- and intersubject and contextual variability, yet they are perceived with high accuracy. The current study investigated modeled neural responses to fricatives in the auditory nerve (AN) and inferior colliculus (IC) with the hypothesis that response profiles across populations of neurons provide robust correlates of consonant perception. Stimuli were 270 intervocalic fricatives (10 speakers × 9 fricatives × 3 utterances). Computational model response profiles had characteristic frequencies that were log-spaced from 125 Hz to either 8 or 20 kHz, to explore the impact of high-frequency responses. Confusion matrices were generated by k-nearest-neighbor subspace classifiers, using profiles of average rates across characteristic frequencies as feature vectors. Model confusion matrices were compared with published behavioral data. The modeled AN and IC neural responses provided better predictions of behavioral accuracy than the stimulus spectra, and the IC showed better accuracy than the AN. Behavioral fricative accuracy was explained by modeled neural response profiles, whereas confusions were only partially explained. Extended frequencies improved accuracy based on the model IC, corroborating the importance of extended high frequencies in speech perception.
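The classifier stage described above can be sketched as follows. This is a minimal, plain k-nearest-neighbor example operating on rate profiles, not the subspace-ensemble variant or the actual model responses used in the study; the two synthetic "fricative" classes and all parameter values are hypothetical.

```python
import numpy as np

def knn_confusion(train_x, train_y, test_x, test_y, k=3):
    """Confusion matrix for a k-nearest-neighbor classifier whose feature
    vectors are average-rate profiles (one rate per characteristic frequency)."""
    classes = sorted(set(train_y))
    index = {c: i for i, c in enumerate(classes)}
    conf = np.zeros((len(classes), len(classes)), dtype=int)
    train_x = np.asarray(train_x, float)
    train_y = np.asarray(train_y)
    for profile, true_class in zip(np.asarray(test_x, float), test_y):
        dists = np.linalg.norm(train_x - profile, axis=1)    # Euclidean
        nearest = train_y[np.argsort(dists)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        conf[index[true_class], index[values[np.argmax(counts)]]] += 1
    return classes, conf

# Two hypothetical consonant classes with well-separated 3-channel profiles
rng = np.random.default_rng(1)
a = rng.normal([50.0, 10.0, 10.0], 1.0, (20, 3))     # one "fricative" class
b = rng.normal([10.0, 50.0, 10.0], 1.0, (20, 3))     # another class
profiles = np.vstack([a, b])
labels = ["s"] * 20 + ["f"] * 20
classes, conf = knn_confusion(profiles, labels, profiles, labels, k=3)
accuracy = np.trace(conf) / conf.sum()
```

Diagonal entries of `conf` are correct identifications and off-diagonal entries are confusions, which is the form compared against behavioral confusion matrices in studies like this one.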
Affiliation(s)
- Yasmeen Hamza
- Department of Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Afagh Farhadi
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
- Douglas M Schwarz
- Depts. of Neuroscience and Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Joyce M McDonough
- Department of Linguistics, University of Rochester, Rochester, New York 14627, USA
- Laurel H Carney
- Depts. of Biomedical Engineering, Neuroscience, and Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
5. Sabesan S, Fragner A, Bench C, Drakopoulos F, Lesica NA. Large-scale electrophysiology and deep learning reveal distorted neural signal dynamics after hearing loss. eLife 2023;12:e85108. PMID: 37162188; PMCID: PMC10202456; DOI: 10.7554/elife.85108.
Abstract
Listeners with hearing loss often struggle to understand speech in noise, even with a hearing aid. To better understand the auditory processing deficits that underlie this problem, we made large-scale brain recordings from gerbils, a common animal model for human hearing, while presenting a large database of speech and noise sounds. We first used manifold learning to identify the neural subspace in which speech is encoded and found that it is low-dimensional and that the dynamics within it are profoundly distorted by hearing loss. We then trained a deep neural network (DNN) to replicate the neural coding of speech with and without hearing loss and analyzed the underlying network dynamics. We found that hearing loss primarily impacts spectral processing, creating nonlinear distortions in cross-frequency interactions that result in a hypersensitivity to background noise that persists even after amplification with a hearing aid. Our results identify a new focus for efforts to design improved hearing aids and demonstrate the power of DNNs as a tool for the study of central brain structures.
Affiliation(s)
- Ciaran Bench
- Ear Institute, University College London, London, United Kingdom
6. Rizzi R, Bidelman GM. Duplex perception reveals brainstem auditory representations are modulated by listeners' ongoing percept for speech. bioRxiv [Preprint] 2023:2023.05.09.540018. PMID: 37214801; PMCID: PMC10197666; DOI: 10.1101/2023.05.09.540018. Preprint of the Cereb Cortex article listed above.
Affiliation(s)
- Rose Rizzi
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
- Gavin M. Bidelman
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
7. Parida S, Heinz MG. Underlying neural mechanisms of degraded speech intelligibility following noise-induced hearing loss: The importance of distorted tonotopy. Hear Res 2022;426:108586. PMID: 35953357; PMCID: PMC11149709; DOI: 10.1016/j.heares.2022.108586.
Abstract
Listeners with sensorineural hearing loss (SNHL) have substantial perceptual deficits, especially in noisy environments. Unfortunately, speech-intelligibility models have limited success in predicting the performance of listeners with hearing loss. A better understanding of the various suprathreshold factors that contribute to neural-coding degradations of speech in noisy conditions will facilitate better modeling and clinical outcomes. Here, we highlight the importance of one physiological factor that has received minimal attention to date, termed distorted tonotopy, which refers to a disruption in the mapping between acoustic frequency and cochlear place that is a hallmark of normal hearing. More so than commonly assumed factors (e.g., threshold elevation, reduced frequency selectivity, diminished temporal coding), distorted tonotopy severely degrades the neural representations of speech (particularly in noise) in single- and across-fiber responses in the auditory nerve following noise-induced hearing loss. Key results include: 1) effects of distorted tonotopy depend on stimulus spectral bandwidth and timbre, 2) distorted tonotopy increases across-fiber correlation and thus reduces information capacity to the brain, and 3) its effects vary across etiologies, which may contribute to individual differences. These results motivate the development and testing of noninvasive measures that can assess the severity of distorted tonotopy in human listeners. The development of such noninvasive measures of distorted tonotopy would advance precision-audiological approaches to improving diagnostics and rehabilitation for listeners with SNHL.
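The tuning-curve tip-to-tail ratio used to quantify distorted tonotopy can be computed from a frequency tuning curve (FTC) in a few lines. A sketch under stated assumptions: placing the "tail" point two octaves below CF and the example threshold values are illustrative conventions, not this study's definitions.

```python
import numpy as np

def tip_to_tail_ratio_db(freqs_hz, thresholds_db, cf_hz, tail_hz=None):
    """Tip-to-tail ratio of an FTC: threshold at a low-frequency tail point
    minus the minimum (tip) threshold, in dB.  Smaller ratios indicate
    tail hypersensitivity and/or tip elevation, i.e., distorted tonotopy."""
    freqs = np.asarray(freqs_hz, float)
    thr = np.asarray(thresholds_db, float)
    tip_db = thr.min()
    if tail_hz is None:
        tail_hz = cf_hz / 4.0            # assumed convention: 2 octaves below CF
    tail_db = np.interp(tail_hz, freqs, thr)   # interpolate tail threshold
    return tail_db - tip_db

# Illustrative FTCs for a fiber with CF = 4 kHz (thresholds in dB SPL)
freqs = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0])
normal = np.array([75.0, 70.0, 60.0, 40.0, 10.0, 50.0])     # sharp, sensitive tip
impaired = np.array([60.0, 55.0, 50.0, 48.0, 45.0, 60.0])   # elevated tip, strong tail
r_normal = tip_to_tail_ratio_db(freqs, normal, cf_hz=4000.0)
r_impaired = tip_to_tail_ratio_db(freqs, impaired, cf_hz=4000.0)
```

The large drop in the ratio for the impaired fiber is the signature described above: low-frequency stimulus energy drives the fiber nearly as well as energy at its nominal CF.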
Affiliation(s)
- Satyabrata Parida
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907 USA; Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, 15261 USA.
- Michael G Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907 USA; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, 47907 USA
8. Settibhaktini H, Heinz MG, Chintanpalli A. Modeling the effects of age and hearing loss on concurrent vowel scores. J Acoust Soc Am 2021;150:3581. PMID: 34852572; PMCID: PMC8594952; DOI: 10.1121/10.0007046.
Abstract
A difference in fundamental frequency (F0) between two vowels is an important segregation cue prior to identifying concurrent vowels. To understand the effects of this cue on identification due to age and hearing loss, Chintanpalli, Ahlstrom, and Dubno [(2016). J. Acoust. Soc. Am. 140, 4142-4153] collected concurrent vowel scores across F0 differences for younger adults with normal hearing (YNH), older adults with normal hearing (ONH), and older adults with hearing loss (OHI). The current modeling study predicts these concurrent vowel scores to understand age and hearing loss effects. The YNH model cascaded the temporal responses of an auditory-nerve model from Bruce, Erfani, and Zilany [(2018). Hear. Res. 360, 40-45] with a modified F0-guided segregation algorithm from Meddis and Hewitt [(1992). J. Acoust. Soc. Am. 91, 233-245] to predict concurrent vowel scores. The ONH model included endocochlear-potential loss, while the OHI model also included hair cell damage; however, both models incorporated cochlear synaptopathy, with a larger effect for OHI. Compared with the YNH model, concurrent vowel scores were reduced across F0 differences for ONH and OHI models, with the lowest scores for OHI. These patterns successfully captured the age and hearing loss effects in the concurrent-vowel data. The predictions suggest that the inability to utilize an F0-guided segregation cue, resulting from peripheral changes, may reduce scores for ONH and OHI listeners.
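F0-guided segregation of concurrent vowels starts from a periodicity estimate. The sketch below shows the basic autocorrelation idea behind such algorithms; Meddis and Hewitt operated on summary autocorrelograms of simulated auditory-nerve channels, so this single-channel waveform version is a simplification, and the harmonic "vowel" and search range are illustrative.

```python
import numpy as np

def estimate_f0_autocorr(x, fs, fmin=80.0, fmax=400.0):
    """Estimate F0 from the autocorrelation peak within a plausible lag
    range -- the periodicity cue that F0-guided segregation exploits."""
    x = np.asarray(x, float) - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags >= 0
    lo = int(fs / fmax)                                 # shortest lag searched
    hi = int(fs / fmin)                                 # longest lag searched
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag

# Vowel-like harmonic complex at F0 = 125 Hz (harmonics roll off as 1/h)
fs = 16000
t = np.arange(int(0.1 * fs)) / fs
vowel = sum(np.sin(2 * np.pi * 125.0 * h * t) / h for h in range(1, 6))
f0 = estimate_f0_autocorr(vowel, fs)
```

With two concurrent vowels, a segregation algorithm would assign channels whose dominant periodicity matches one F0 estimate to that vowel before identification; an F0 difference of zero removes exactly this cue.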
Affiliation(s)
- Harshavardhan Settibhaktini
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan 333031, India
- Michael G Heinz
- Department of Speech, Language and Hearing Sciences, and Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907-2028, USA
- Ananthakrishna Chintanpalli
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan 333031, India
9. Leong UC, Schwarz DM, Henry KS, Carney LH. Sensorineural Hearing Loss Diminishes Use of Temporal Envelope Cues: Evidence From Roving-Level Tone-in-Noise Detection. Ear Hear 2021;41:1009-1019. PMID: 31985535; PMCID: PMC8221074; DOI: 10.1097/aud.0000000000000822.
Abstract
OBJECTIVES: The objective of our study was to understand how listeners with and without sensorineural hearing loss (SNHL) use energy and temporal envelope cues to detect tones in noise. Previous studies of low-frequency tone-in-noise detection have shown that when energy cues are made less reliable using a roving-level paradigm, thresholds of listeners with normal hearing (NH) are only slightly increased. This result is consistent with studies demonstrating the importance of temporal envelope cues for masked detection. In contrast, roving-level detection thresholds are more elevated in listeners with SNHL at the test frequency, suggesting stronger weighting of energy cues. The present study extended these tests to a wide range of frequencies and stimulus levels. The authors hypothesized that individual listeners with SNHL use energy and temporal envelope cues differently for masked detection at different frequencies and levels, depending on the degree of hearing loss.
DESIGN: Twelve listeners with mild to moderate SNHL and 12 NH listeners participated. Tone-in-noise detection thresholds at 0.5, 1, 2, and 4 kHz in 1/3-octave bands of simultaneously gated Gaussian noise were obtained using a novel, two-part tracking paradigm. A track refers to the sequence of trials in an adaptive test procedure; the signal-to-noise ratio was the tracked variable. Each part of the track consisted of a two-alternative, two-interval, forced-choice procedure. The initial portion of the track estimated detection threshold using a fixed masker level. When the track continued, stimulus levels were randomly varied over a 20-dB rove range (±10 dB with respect to mean masker level), and a second threshold was estimated. Rove effect (RE) was defined as the difference between thresholds for the fixed- and roving-level tests. The size of the RE indicated how strongly a listener weighted energy-based cues for masked detection. Participants were tested at one to three masker levels per frequency, depending on audibility.
RESULTS: Across all stimulus frequencies and levels, NH listeners had small REs (≈1 dB), whereas listeners with SNHL typically had larger REs. Some listeners with SNHL had larger REs at higher frequencies, where pure-tone audiometric thresholds were typically elevated. RE did not vary significantly with masker level for either group. Increased RE for the SNHL group was consistent with simulations in which energy cues were more heavily weighted than envelope cues.
CONCLUSIONS: Tone-in-noise detection thresholds in NH listeners were typically elevated only slightly by the roving-level paradigm at any frequency or level tested, consistent with the primary use of level-independent cues, such as temporal envelope cues that are conveyed by fluctuations in neural responses. In comparison, thresholds of listeners with SNHL were more affected by the roving-level paradigm, suggesting stronger weighting of energy cues. For listeners with SNHL, the largest RE was observed at 4000 Hz, for which pure-tone audiometric thresholds were most elevated. Specifically, RE size at 4000 Hz was significantly correlated with higher pure-tone audiometric thresholds at the same frequency, after controlling for the effect of age. Future studies will explore strategies for restoring or enhancing neural fluctuation cues that may lead to improved hearing in noise for listeners with SNHL.
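Why a 20-dB level rove selectively penalizes energy cues can be shown with a toy Monte Carlo of an energy-only observer in the two-interval task. This illustrates the logic, not the study's analysis; the sample counts, SNR, and noise statistics are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def energy_detector_pc(snr_db, rove_db=0.0, n_trials=5000, n_samples=100):
    """Proportion correct for a pure energy detector in a two-interval
    forced-choice tone-in-noise task: choose the interval with more energy.
    Independently roving each interval's overall level corrupts this cue."""
    tone_power = 10.0 ** (snr_db / 10.0)          # noise power normalized to 1
    # Noise energy varies trial to trial (chi-square-like energy statistic)
    energy = rng.chisquare(n_samples, (n_trials, 2)) / n_samples
    energy[:, 0] += tone_power                    # interval 0 contains the tone
    # Independent per-interval level rove, uniform over +/- rove_db/2 dB
    rove = 10.0 ** (rng.uniform(-rove_db / 2, rove_db / 2, (n_trials, 2)) / 10.0)
    return float(np.mean(energy[:, 0] * rove[:, 0] > energy[:, 1] * rove[:, 1]))

pc_fixed = energy_detector_pc(snr_db=-3.0, rove_db=0.0)    # energy cue reliable
pc_roved = energy_detector_pc(snr_db=-3.0, rove_db=20.0)   # energy cue degraded
```

An observer relying on level-independent cues (e.g., envelope fluctuations) would be unaffected by the rove, so the size of the RE separates the two listening strategies.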
Affiliation(s)
- U-Cheng Leong
- Department of Otolaryngology, University of Rochester, Rochester, New York, USA
- Douglas M. Schwarz
- Department of Neuroscience, University of Rochester, Rochester, New York, USA
- Kenneth S. Henry
- Departments of Otolaryngology and Neuroscience, University of Rochester, Rochester, New York, USA
- Laurel H. Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, New York, USA
10. Parida S, Bharadwaj H, Heinz MG. Spectrally specific temporal analyses of spike-train responses to complex sounds: A unifying framework. PLoS Comput Biol 2021;17:e1008155. PMID: 33617548; PMCID: PMC7932515; DOI: 10.1371/journal.pcbi.1008155.
Abstract
Significant scientific and translational questions remain in auditory neuroscience surrounding the neural correlates of perception. Relating perceptual and neural data collected from humans can be useful; however, human-based neural data are typically limited to evoked far-field responses, which lack anatomical and physiological specificity. Laboratory-controlled preclinical animal models offer the advantage of comparing single-unit and evoked responses from the same animals. This ability provides opportunities to develop invaluable insight into proper interpretations of evoked responses, which benefits both basic-science studies of neural mechanisms and translational applications, e.g., diagnostic development. However, these comparisons have been limited by a disconnect between the types of spectrotemporal analyses used with single-unit spike trains and evoked responses, which results because these response types are fundamentally different (point-process versus continuous-valued signals) even though the responses themselves are related. Here, we describe a unifying framework to study temporal coding of complex sounds that allows spike-train and evoked-response data to be analyzed and compared using the same advanced signal-processing techniques. The framework uses a set of peristimulus-time histograms computed from single-unit spike trains in response to polarity-alternating stimuli to allow advanced spectral analyses of both slow (envelope) and rapid (temporal fine structure) response components. 
Demonstrated benefits include: (1) novel spectrally specific temporal-coding measures that are less confounded by distortions due to hair-cell transduction, synaptic rectification, and neural stochasticity compared to previous metrics, e.g., the correlogram peak-height, (2) spectrally specific analyses of spike-train modulation coding (magnitude and phase), which can be directly compared to modern perceptually based models of speech intelligibility (e.g., that depend on modulation filter banks), and (3) superior spectral resolution in analyzing the neural representation of nonstationary sounds, such as speech and music. This unifying framework significantly expands the potential of preclinical animal models to advance our understanding of the physiological correlates of perceptual deficits in real-world listening following sensorineural hearing loss.
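The sum/difference idea at the heart of this framework can be sketched directly: responses to opposite-polarity stimuli are added to isolate the envelope-following component and subtracted to isolate the fine-structure component. The rectified-sinusoid "PSTHs" below are synthetic stand-ins, not model or recorded data.

```python
import numpy as np

def env_tfs_from_psths(psth_pos, psth_neg):
    """Split responses to opposite-polarity stimuli into envelope and
    temporal-fine-structure (TFS) components: the sum is polarity-invariant
    (envelope coding), the difference is polarity-sensitive (TFS coding)."""
    p, n = np.asarray(psth_pos, float), np.asarray(psth_neg, float)
    return 0.5 * (p + n), 0.5 * (p - n)

def dominant_freq(x, fs):
    """Frequency (Hz) of the largest non-DC spectral component."""
    spec = np.abs(np.fft.rfft(x - np.mean(x)))
    spec[0] = 0.0
    return np.fft.rfftfreq(len(x), 1.0 / fs)[np.argmax(spec)]

# Synthetic "PSTHs": half-wave-rectified 500-Hz carrier with a 40-Hz envelope
fs = 10000
t = np.arange(int(0.2 * fs)) / fs
am = 1.0 + np.sin(2 * np.pi * 40.0 * t)
carrier = np.sin(2 * np.pi * 500.0 * t)
r_pos = am * np.maximum(carrier, 0.0)      # response to one stimulus polarity
r_neg = am * np.maximum(-carrier, 0.0)     # response to the inverted stimulus
env, tfs = env_tfs_from_psths(r_pos, r_neg)
env_hz = dominant_freq(env, fs)            # tracks the 40-Hz envelope
tfs_hz = dominant_freq(tfs, fs)            # tracks the 500-Hz fine structure
```

Because the same sum/difference operation applies to continuous far-field recordings, spike-train and evoked-response data can be pushed through identical spectral analyses, which is the unification the abstract describes.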
Affiliation(s)
- Satyabrata Parida
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana, United States of America
- Hari Bharadwaj
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana, United States of America
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Michael G. Heinz
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana, United States of America
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, United States of America
11. Parida S, Heinz MG. Noninvasive Measures of Distorted Tonotopic Speech Coding Following Noise-Induced Hearing Loss. J Assoc Res Otolaryngol 2020;22:51-66. PMID: 33188506; DOI: 10.1007/s10162-020-00755-2.
Abstract
Animal models of noise-induced hearing loss (NIHL) show a dramatic mismatch between cochlear characteristic frequency (CF, based on place of innervation) and the dominant response frequency in single auditory-nerve-fiber responses to broadband sounds (i.e., distorted tonotopy, DT). This noise trauma effect is associated with decreased frequency-tuning-curve (FTC) tip-to-tail ratio, which results from decreased tip sensitivity and enhanced tail sensitivity. Notably, DT is more severe for noise trauma than for metabolic (e.g., age-related) losses of comparable degree, suggesting that individual differences in DT may contribute to speech intelligibility differences in patients with similar audiograms. Although DT has implications for many neural-coding theories for real-world sounds, it has primarily been explored in single-neuron studies that are not viable with humans. Thus, there are no noninvasive measures to detect DT. Here, frequency following responses (FFRs) to a conversational speech sentence were recorded in anesthetized male chinchillas with either normal hearing or NIHL. Tonotopic sources of FFR envelope and temporal fine structure (TFS) were evaluated in normal-hearing chinchillas. Results suggest that FFR envelope primarily reflects activity from high-frequency neurons, whereas FFR-TFS receives broad tonotopic contributions. Representation of low- and high-frequency speech power in FFRs was also assessed. FFRs in hearing-impaired animals were dominated by low-frequency stimulus power, consistent with oversensitivity of high-frequency neurons to low-frequency power. These results suggest that DT can be diagnosed noninvasively. A normalized DT metric computed from speech FFRs provides a potential diagnostic tool to test for DT in humans. A sensitive noninvasive DT metric could be used to evaluate perceptual consequences of DT and to optimize hearing-aid amplification strategies to improve tonotopic coding for hearing-impaired listeners.
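The idea of an FFR-based distorted-tonotopy index can be illustrated with a simple band-power ratio. Note that the published normalized metric is defined from specific speech-FFR measures; the band edges and synthetic waveforms here are hypothetical stand-ins showing only the direction of the effect (responses dominated by low-frequency power yield a larger value).

```python
import numpy as np

def low_high_power_ratio_db(x, fs, low=(80.0, 1000.0), high=(1000.0, 4000.0)):
    """Ratio (dB) of low- to high-frequency power in a far-field response.
    A larger value means the response is dominated by low-frequency power."""
    spec = np.abs(np.fft.rfft(x - np.mean(x))) ** 2
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    low_power = spec[(f >= low[0]) & (f < low[1])].sum()
    high_power = spec[(f >= high[0]) & (f < high[1])].sum()
    return 10.0 * np.log10(low_power / high_power)

# Synthetic "FFRs": a 300-Hz (F0-like) and a 2-kHz (formant-like) component;
# the impaired response retains far less high-frequency power.
fs = 8000
t = np.arange(fs) / fs                                    # 1-s window
normal = np.sin(2 * np.pi * 300.0 * t) + 0.8 * np.sin(2 * np.pi * 2000.0 * t)
impaired = np.sin(2 * np.pi * 300.0 * t) + 0.1 * np.sin(2 * np.pi * 2000.0 * t)
dt_normal = low_high_power_ratio_db(normal, fs)
dt_impaired = low_high_power_ratio_db(impaired, fs)
```

A sensitive version of such a metric, normalized against the stimulus spectrum, is what the study proposes as a noninvasive diagnostic for distorted tonotopy.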
Affiliation(s)
- Satyabrata Parida
- Weldon School of Biomedical Engineering, Purdue University, 206 South Martin Jischke Drive, West Lafayette, IN, 47907, USA
- Michael G Heinz
- Weldon School of Biomedical Engineering, Purdue University, 206 South Martin Jischke Drive, West Lafayette, IN, 47907, USA.
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, IN, 47907, USA.
12. Whiteford KL, Kreft HA, Oxenham AJ. The role of cochlear place coding in the perception of frequency modulation. eLife 2020;9:e58468. PMID: 32996463; PMCID: PMC7556860; DOI: 10.7554/elife.58468.
Abstract
Natural sounds convey information via frequency and amplitude modulations (FM and AM). Humans are acutely sensitive to the slow rates of FM that are crucial for speech and music. This sensitivity has long been thought to rely on precise stimulus-driven auditory-nerve spike timing (time code), whereas a coarser code, based on variations in the cochlear place of stimulation (place code), represents faster FM rates. We tested this theory in listeners with normal and impaired hearing, spanning a wide range of place-coding fidelity. Contrary to predictions, sensitivity to both slow and fast FM correlated with place-coding fidelity. We also used incoherent AM on two carriers to simulate place coding of FM and observed poorer sensitivity at high carrier frequencies and fast rates, two properties of FM detection previously ascribed to the limits of time coding. The results suggest a unitary place-based neural code for FM across all rates and carrier frequencies.
Affiliation(s)
- Kelly L Whiteford
- Department of Psychology, University of Minnesota, Minneapolis, United States
- Heather A Kreft
- Department of Psychology, University of Minnesota, Minneapolis, United States
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, United States

13
13
|
Maxwell BN, Richards VM, Carney LH. Neural fluctuation cues for simultaneous notched-noise masking and profile-analysis tasks: Insights from model midbrain responses. J Acoust Soc Am 2020; 147:3523. [PMID: 32486827 PMCID: PMC7229985 DOI: 10.1121/10.0001226]
Abstract
Results of simultaneous notched-noise masking are commonly interpreted as reflecting the bandwidth of underlying auditory filters. This interpretation assumes that listeners detect a tone added to notched-noise based on an increase in energy at the output of an auditory filter. Previous work challenged this assumption by showing that randomly and independently varying (roving) the levels of each stimulus interval does not substantially worsen listener thresholds [Lentz, Richards, and Matiasek (1999). J. Acoust. Soc. Am. 106, 2779-2792]. Lentz et al. further challenged this assumption by showing that filter bandwidths based on notched-noise results were different from those based on a profile-analysis task [Green (1983). Am. Psychol. 38, 133-142; (1988). (Oxford University Press, New York)], although these estimates were later reconciled by emphasizing spectral peaks of the profile-analysis stimulus [Lentz (2006). J. Acoust. Soc. Am. 120, 945-956]. Here, a single physiological model is shown to account for performance in fixed- and roving-level notched-noise tasks and the Lentz et al. profile-analysis task. This model depends on peripheral neural fluctuation cues that are transformed into the average rates of model inferior colliculus neurons. Neural fluctuations are influenced by peripheral filters, synaptic adaptation, cochlear amplification, and saturation of inner hair cells, an element not included in previous theories of envelope-based cues for these tasks. Results suggest reevaluation of the interpretation of performance in these paradigms.
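The roving-level challenge to energy-detector accounts can be sketched numerically: an ideal observer that picks the interval with more energy at one filter's output is perfect at a fixed level but degrades badly under a level rove, whereas human thresholds barely change. All levels and the single-filter simplification below are illustrative assumptions, not the study's model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 10000
noise_db = 60.0   # noise level at the filter output (hypothetical)
tone_db = 62.0    # added-tone level at the filter output (hypothetical)

def energy(db):
    return 10.0 ** (db / 10.0)

def trial_correct(rove_db):
    # Two-interval trial decided by an energy criterion at one filter's
    # output; the overall level of each interval is roved independently.
    r_sig, r_ref = rng.uniform(-rove_db, rove_db, size=2)
    e_sig = energy(noise_db + r_sig) + energy(tone_db + r_sig)
    e_ref = energy(noise_db + r_ref)
    return e_sig > e_ref

pc_fixed = np.mean([trial_correct(0.0) for _ in range(n_trials)])
pc_roved = np.mean([trial_correct(10.0) for _ in range(n_trials)])

# pc_fixed is 1.0, but a +/-10 dB rove pushes the energy observer toward
# chance -- which is why rove-tolerant listener thresholds argue for a
# different cue, such as the neural fluctuations examined in this paper.
```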
Affiliation(s)
- Braden N Maxwell
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, 601 Elmwood Avenue, Rochester, New York 14642, USA
- Virginia M Richards
- Department of Cognitive Sciences, University of California, 3151 Social Science Plaza, Irvine, California 92697-5100, USA
- Laurel H Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, 601 Elmwood Avenue, Rochester, New York 14642, USA

14
14
|
Deloche F. Fine-grained statistical structure of speech. PLoS One 2020; 15:e0230233. [PMID: 32196513 PMCID: PMC7083313 DOI: 10.1371/journal.pone.0230233]
Abstract
In spite of its acoustic diversity, the speech signal presents statistical regularities that can be exploited by biological or artificial systems for efficient coding. Independent Component Analysis (ICA) revealed that on small time scales (∼ 10 ms), the overall structure of speech is well captured by a time-frequency representation whose frequency selectivity follows the same power law in the high frequency range 1–8 kHz as cochlear frequency selectivity in mammals. Variations in the power-law exponent, i.e. different time-frequency trade-offs, have been shown to provide additional adaptation to phonetic categories. Here, we adopt a parametric approach to investigate the variations of the exponent at a finer level of speech. The estimation procedure is based on a measure that reflects the sparsity of decompositions in a set of Gabor dictionaries whose atoms are Gaussian-modulated sinusoids. We examine the variations of the exponent associated with the best decomposition, first at the level of phonemes, then at an intra-phonemic level. We show that this analysis offers a rich interpretation of the fine-grained statistical structure of speech, and that the exponent values can be related to key acoustic properties. Two main results are: i) for plosives, the exponent is lowered by the release bursts, concealing higher values during the opening phases; ii) for vowels, the exponent is bound to formant bandwidths and decreases with the degree of acoustic radiation at the lips. This work further suggests that an efficient coding strategy is to reduce frequency selectivity with sound intensity level, congruent with the nonlinear behavior of cochlear filtering.
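The sparsity-based exponent estimation can be sketched as follows: build Gabor dictionaries (Gaussian-modulated sinusoids) whose atom durations follow a power law of frequency, project a signal onto each dictionary, and score the decomposition by an L2/L1 sparsity ratio. The frequency grid, the 2-ms duration anchor, and the exponent values are illustrative assumptions, not the paper's estimation procedure.

```python
import numpy as np

fs = 16000
t = np.arange(-256, 256) / fs   # ~32-ms analysis frame centered at 0

def gabor_atom(f_hz, sigma_s):
    """Unit-energy Gaussian-modulated sinusoid (one Gabor atom)."""
    g = np.exp(-t**2 / (2 * sigma_s**2)) * np.cos(2 * np.pi * f_hz * t)
    return g / np.linalg.norm(g)

def sparsity(coeffs):
    """L2/L1 ratio of coefficient magnitudes: higher = sparser decomposition."""
    c = np.abs(coeffs)
    return np.linalg.norm(c) / np.sum(c)

freqs = np.linspace(500.0, 4000.0, 32)

def decomposition_sparsity(signal, alpha):
    # Power-law time-frequency trade-off: atom duration shrinks with
    # frequency as (f / 1 kHz)^-alpha (2-ms anchor is an assumed choice).
    coeffs = np.array([signal @ gabor_atom(f, 2e-3 * (f / 1000.0) ** -alpha)
                       for f in freqs])
    return sparsity(coeffs)

tone = np.cos(2 * np.pi * 3000.0 * t)          # steady 3-kHz partial
s_long = decomposition_sparsity(tone, alpha=0.0)   # long windows everywhere
s_short = decomposition_sparsity(tone, alpha=1.0)  # windows shorten with frequency

# A steady narrowband signal is captured most sparsely by the more
# frequency-selective (longer-window) dictionary -- the kind of contrast
# the exponent analysis exploits across phonemes.
assert s_long > s_short
```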
Affiliation(s)
- François Deloche
- Centre d’analyse et de mathématique sociales, CNRS, EHESS, Paris, France

15
15
|
Neuronal population model of globular bushy cells covering unit-to-unit variability. PLoS Comput Biol 2019; 15:e1007563. [PMID: 31881018 PMCID: PMC6934273 DOI: 10.1371/journal.pcbi.1007563]
Abstract
Computations of acoustic information along the central auditory pathways start in the cochlear nucleus. Bushy cells in the anteroventral cochlear nucleus, which innervate monaural and binaural stations in the superior olivary complex, process and transfer temporal cues relevant for sound localization. These cells are categorized into two groups: spherical and globular bushy cells (SBCs/GBCs). Spontaneous rates of GBCs innervated by multiple auditory nerve (AN) fibers are generally lower than those of SBCs that receive a small number of large AN synapses. In response to low-frequency tonal stimulation, both types of bushy cells show improved phase-locking and entrainment compared to AN fibers. When driven by high-frequency tones, GBCs show primary-like-with-notch or onset-L peristimulus time histograms and relatively irregular spiking. However, previous in vivo physiological studies of bushy cells also found considerable unit-to-unit variability in these response patterns. Here we present a population of models that can simulate the observed variation in GBCs. We used a simple coincidence detection model with an adaptive threshold and systematically varied its six parameters. Out of 567,000 parameter combinations tested, 7520 primary-like-with-notch models and 4094 onset-L models were selected that satisfied a set of physiological criteria for a GBC unit. Analyses of the model parameters and output measures revealed that the parameters of the accepted model population are weakly correlated with each other to retain major GBC properties, and that the output spiking patterns of the model are affected by a combination of multiple parameters. Simulations of frequency-dependent temporal properties of the model GBCs showed a reasonable fit to empirical data, supporting the validity of our population modeling. The computational simplicity and efficiency of the model structure make our approach suitable for future large-scale simulations of binaural information processing that may involve thousands of GBC units.

In the auditory system, specialized neuronal circuits process various types of acoustic information. A group of neurons, called globular bushy cells (GBCs), faithfully transfer timing information of acoustic signals to their downstream neurons responsible for the perception of sound location. Previous physiological studies found representative activity patterns of GBCs, but with substantial individual variations among them. In this study, we present a population of models, instead of creating one best model, to account for the observed variations of GBCs. We varied all six parameters of a simple auditory neuron model and selected the combinations of parameters that led to acceptable activity patterns of GBCs. In total, we tested more than half a million combinations and accepted ~11,600 GBC models. Temporal spiking patterns of real GBCs depend on the sound frequency, and our model population was able to replicate this trend. The model used here is computationally efficient and can thus serve as a building block for future large-scale simulations of auditory information processing.
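The population-modeling workflow above (full factorial parameter sweep, then physiological screening) can be sketched in a few lines. The parameter names and ranges, the `simulate()` proxy, and the acceptance criteria below are all illustrative stand-ins, not the published model or its criteria.

```python
import itertools

# Hypothetical six-parameter grid, swept factorially like the study's
# 567,000 combinations (ranges here are invented for illustration).
grid = {
    "n_inputs":       [4, 8, 16, 32],         # convergent AN inputs
    "coinc_window":   [0.1, 0.3, 0.5, 0.7],   # coincidence window (ms)
    "threshold":      [2, 3, 4, 5],           # inputs required within window
    "adapt_strength": [0.0, 0.5, 1.0],        # adaptive-threshold increment
    "adapt_tau":      [1.0, 5.0, 10.0],       # adaptation decay (ms)
    "refractory":     [0.5, 0.7, 1.0],        # absolute refractory (ms)
}

def simulate(p):
    # Placeholder for the spiking simulation: toy closed-form proxies for
    # spontaneous rate (sp/s) and phase-locking (sync), so the selection
    # logic can run end to end.
    spont = 80.0 * p["n_inputs"] / (32.0 * p["threshold"])
    sync = min(0.95, 0.4 + 0.1 * p["threshold"] - 0.3 * p["coinc_window"])
    return spont, sync

accepted, n_tested = [], 0
for combo in itertools.product(*grid.values()):
    p = dict(zip(grid, combo))
    n_tested += 1
    spont, sync = simulate(p)
    if spont < 20.0 and sync > 0.7:           # illustrative screening criteria
        accepted.append(p)

# The result is a *population* of acceptable models rather than one best fit.
```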
16
Halliday LF, Rosen S, Tuomainen O, Calcus A. Impaired frequency selectivity and sensitivity to temporal fine structure, but not envelope cues, in children with mild-to-moderate sensorineural hearing loss. J Acoust Soc Am 2019; 146:4299. [PMID: 31893709 DOI: 10.1121/1.5134059]
Abstract
Psychophysical thresholds were measured for 8-16 year-old children with mild-to-moderate sensorineural hearing loss (MMHL; N = 46) on a battery of auditory processing tasks that included measures designed to be dependent upon frequency selectivity and sensitivity to temporal fine structure (TFS) or envelope cues. Children with MMHL who wore hearing aids were tested in both unaided and aided conditions, and all were compared to a group of normally hearing (NH) age-matched controls. Children with MMHL performed more poorly than NH controls on tasks considered to be dependent upon frequency selectivity, sensitivity to TFS, and speech discrimination (/bɑ/-/dɑ/), but not on tasks measuring sensitivity to envelope cues. Auditory processing deficits remained regardless of age, were observed in both unaided and aided conditions, and could not be attributed to differences in nonverbal IQ or attention between groups. However, better auditory processing in children with MMHL was predicted by better audiometric thresholds and, for aided tasks only, higher levels of maternal education. These results suggest that, as for adults with MMHL, children with MMHL may show deficits in frequency selectivity and sensitivity to TFS, but sensitivity to the envelope may remain intact.
Affiliation(s)
- Lorna F Halliday
- Speech, Hearing, and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Stuart Rosen
- Speech, Hearing, and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Outi Tuomainen
- Speech, Hearing, and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Axelle Calcus
- Speech, Hearing, and Phonetic Sciences, University College London, Chandler House, 2 Wakefield Street, London WC1N 1PF, United Kingdom

17
Trevino M, Lobarinas E, Maulden AC, Heinz MG. The chinchilla animal model for hearing science and noise-induced hearing loss. J Acoust Soc Am 2019; 146:3710. [PMID: 31795699 PMCID: PMC6881193 DOI: 10.1121/1.5132950]
Abstract
The chinchilla animal model for noise-induced hearing loss has an extensive history spanning more than 50 years. Many behavioral, anatomical, and physiological characteristics of the chinchilla make it a valuable animal model for hearing science. These include similarities with human hearing frequency and intensity sensitivity, the ability to be trained behaviorally with acoustic stimuli relevant to human hearing, a docile nature that allows many physiological measures to be made in an awake state, physiological robustness that allows for data to be collected from all levels of the auditory system, and the ability to model various types of conductive and sensorineural hearing losses that mimic pathologies observed in humans. Given these attributes, chinchillas have been used repeatedly to study anatomical, physiological, and behavioral effects of continuous and impulse noise exposures that produce either temporary or permanent threshold shifts. Based on the mechanistic insights from noise-exposure studies, chinchillas have also been used in pre-clinical drug studies for the prevention and rescue of noise-induced hearing loss. This review paper highlights the role of the chinchilla model in hearing science, its important contributions, and its advantages and limitations.
Affiliation(s)
- Monica Trevino
- School of Behavioral and Brain Sciences, Callier Center, The University of Texas at Dallas, 1966 Inwood Road, Dallas, Texas 75235, USA
- Edward Lobarinas
- School of Behavioral and Brain Sciences, Callier Center, The University of Texas at Dallas, 1966 Inwood Road, Dallas, Texas 75235, USA
- Amanda C Maulden
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, Indiana 47907, USA
- Michael G Heinz
- Weldon School of Biomedical Engineering, Purdue University, 715 Clinic Drive, West Lafayette, Indiana 47907, USA

18
Abstract
Studies of vowel systems regularly appeal to the need to understand how the auditory system encodes and processes the information in the acoustic signal. The goal of this study is to present computational models to address this need, and to use the models to illustrate responses to vowels at two levels of the auditory pathway. Many of the models previously used to study auditory representations of speech are based on linear filter banks simulating the tuning of the inner ear. These models do not incorporate key nonlinear response properties of the inner ear that influence responses at conversational-speech sound levels. These nonlinear properties shape neural representations in ways that are important for understanding responses in the central nervous system. The model for auditory-nerve (AN) fibers used here incorporates realistic nonlinear properties associated with the basilar membrane, inner hair cells (IHCs), and the IHC-AN synapse. These nonlinearities set up profiles of f0-related fluctuations that vary in amplitude across the population of frequency-tuned AN fibers. Amplitude fluctuations in AN responses are smallest near formant peaks and largest at frequencies between formants. These f0-related fluctuations strongly excite or suppress neurons in the auditory midbrain, the first level of the auditory pathway where tuning for low-frequency fluctuations in sounds occurs. Formant-related amplitude fluctuations provide representations of the vowel spectrum in discharge rates of midbrain neurons. These representations in the midbrain are robust across a wide range of sound levels, including the entire range of conversational-speech levels, and in the presence of realistic background noise levels.
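The fluctuation-profile idea above can be sketched with two toy channels: near a formant peak one strong harmonic dominates the drive, while between formants neighboring harmonics beat at the f0 rate. A saturating nonlinearity flattens the dominated channel's envelope but leaves the beating intact. The component frequencies, saturation constant, and smoothing are illustrative assumptions, not the AN model itself.

```python
import numpy as np

fs = 16000
t = np.arange(int(fs * 0.2)) / fs   # 200 ms

# Toy drives to two frequency-tuned channels (values illustrative):
near_formant = np.sin(2 * np.pi * 800.0 * t)                  # one dominant harmonic
between_formants = 0.5 * (np.sin(2 * np.pi * 1200.0 * t)
                          + np.sin(2 * np.pi * 1300.0 * t))   # 100-Hz (f0-rate) beating

def envelope_after_saturation(drive, sat=0.3, smooth_s=0.002):
    # Memoryless saturating nonlinearity (IHC-like stand-in), half-wave
    # rectification, then a short moving average standing in for synaptic
    # low-pass filtering.
    out = np.maximum(np.tanh(drive / sat), 0.0)
    k = int(smooth_s * fs)
    env = np.convolve(out, np.ones(k) / k, mode="same")
    return env[k:-k]                 # trim smoothing edge artifacts

def fluctuation_depth(env):
    return (env.max() - env.min()) / (env.max() + env.min())

fd_peak = fluctuation_depth(envelope_after_saturation(near_formant))
fd_trough = fluctuation_depth(envelope_after_saturation(between_formants))

# Saturation captures the formant-dominated channel (shallow fluctuations)
# while the between-formant channel keeps deep f0-rate fluctuations: the
# across-CF fluctuation contrast described above.
assert fd_trough > fd_peak
```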
19
Encina-Llamas G, Harte JM, Dau T, Shinn-Cunningham B, Epp B. Investigating the Effect of Cochlear Synaptopathy on Envelope Following Responses Using a Model of the Auditory Nerve. J Assoc Res Otolaryngol 2019; 20:363-382. [PMID: 31102010 PMCID: PMC6646444 DOI: 10.1007/s10162-019-00721-7]
Abstract
The healthy auditory system enables communication in challenging situations with high levels of background noise. Yet, despite normal sensitivity to pure tones, many listeners complain about having difficulties in such situations. Recent animal studies demonstrated that noise overexposure that produces temporary threshold shifts can cause the loss of auditory nerve (AN) fiber synapses (i.e., cochlear synaptopathy, CS), which appears to predominantly affect medium- and low-spontaneous rate (SR) fibers. In the present study, envelope following response (EFR) magnitude-level functions were recorded in normal hearing (NH) threshold and mildly hearing-impaired (HI) listeners with thresholds elevated above 2 kHz. EFRs were elicited by sinusoidally amplitude modulated (SAM) tones presented in quiet with a carrier frequency of 2 kHz, modulated at 93 Hz, and modulation depths of 0.85 (deep) and 0.25 (shallow). While EFR magnitude-level functions for deeply modulated tones were similar for all listeners, EFR magnitudes for shallowly modulated tones were reduced at medium stimulation levels in some NH threshold listeners and saturated in all HI listeners for the whole level range. A phenomenological model of the AN was used to investigate the extent to which hair-cell dysfunction and/or CS could explain the trends observed in the EFR data. Hair-cell dysfunction alone, including postulated elevated hearing thresholds at extended high frequencies (EHF) beyond 8 kHz, could not account for the recorded EFR data. Postulated CS led to simulations generally consistent with the recorded data, but a loss of all types of AN fibers was required within the model framework. The effects of off-frequency contributions (i.e., away from the characteristic place of the stimulus) and the differential loss of different AN fiber types on EFR magnitude-level functions were analyzed. 
When using SAM tones in quiet as the stimulus, model simulations suggested that (1) EFRs are dominated by the activity of high-SR fibers at all stimulus intensities, and (2) EFRs at medium-to-high stimulus levels are dominated by off-frequency contributions.
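The SAM-tone stimulus and the depth dependence of its envelope component can be sketched directly; the rectify-and-read-the-fm-bin "pickup" below is a crude stand-in for the scalp-recorded EFR, and the sampling choices are illustrative (the carrier and modulation rate match the study's 2 kHz and 93 Hz).

```python
import numpy as np

fs = 16000
dur = 1.0                               # 1 s, so fm falls on an exact FFT bin
t = np.arange(int(fs * dur)) / fs
fc, fm = 2000.0, 93.0                   # carrier and modulation rate from the study

def sam_tone(depth):
    """Sinusoidally amplitude-modulated (SAM) tone."""
    return (1.0 + depth * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

def envelope_component(x):
    # Crude envelope follower standing in for the EFR pickup: half-wave
    # rectify, then read the spectral magnitude at the modulation frequency.
    rect = np.maximum(x, 0.0)
    spec = np.abs(np.fft.rfft(rect)) / rect.size
    return spec[int(round(fm * dur))]

deep = envelope_component(sam_tone(0.85))
shallow = envelope_component(sam_tone(0.25))

# The fm component scales with modulation depth, so shallow modulation probes
# the steep part of the response growth -- the regime where the EFR data
# separated the listener groups.
assert deep > shallow
```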
Affiliation(s)
- Gerard Encina-Llamas
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark (DTU), Kongens Lyngby, Denmark
- James M Harte
- Interacoustics Research Unit, Kongens Lyngby, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark (DTU), Kongens Lyngby, Denmark
- Barbara Shinn-Cunningham
- Carnegie Mellon Neuroscience Institute, Pittsburgh, PA, USA
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Bastian Epp
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark (DTU), Kongens Lyngby, Denmark

20
Human Frequency Following Responses to Vocoded Speech: Amplitude Modulation Versus Amplitude Plus Frequency Modulation. Ear Hear 2019; 41:300-311. [PMID: 31246660 DOI: 10.1097/aud.0000000000000756]
Abstract
OBJECTIVES: The most commonly employed speech processing strategies in cochlear implants (CIs) only extract and encode amplitude modulation (AM) in a limited number of frequency channels. A novel speech processing strategy has been proposed that encodes both frequency modulation (FM) and AM to improve CI performance; in behavioral tests, this strategy yielded better speech, speaker, and tone recognition than the AM-alone strategy. Here, we used the scalp-recorded human frequency following responses (FFRs) to examine the differences in the neural representation of vocoded speech sounds with AM alone and AM + FM as the spectral and temporal cues were varied. Specifically, we were interested in determining whether the addition of FM to AM improved the neural representation of envelope periodicity (FFRENV) and temporal fine structure (FFRTFS), as reflected in the temporal pattern of the phase-locked neural activity generating the FFR. DESIGN: FFRs were recorded from 13 normal-hearing, adult listeners in response to the original unprocessed stimulus (a synthetic diphthong /au/ with a 110-Hz fundamental frequency or F0 and a 250-msec duration) and the 2-, 4-, 8- and 16-channel sine vocoded versions of /au/ with AM alone and AM + FM. Temporal waveforms, autocorrelation analyses, fast Fourier Transform, and stimulus-response spectral correlations were used to analyze both the strength and fidelity of the neural representation of envelope periodicity (F0) and TFS (formant structure). RESULTS: The periodicity strength in the FFRENV decreased more for the AM stimuli than for the relatively resilient AM + FM stimuli as the number of channels was increased. Regardless of the number of channels, a clear spectral peak of FFRENV was consistently observed at the stimulus F0 for all the AM + FM stimuli but not for the AM stimuli. Neural representation as revealed by the spectral correlation of FFRTFS was better for the AM + FM stimuli when compared to the AM stimuli. Neural representation of the time-varying formant-related harmonics as revealed by the spectral correlation was also better for the AM + FM stimuli as compared to the AM stimuli. CONCLUSIONS: These results are consistent with previously reported behavioral results and suggest that the AM + FM processing strategy elicited brainstem neural activity that better preserved periodicity, temporal fine structure, and time-varying spectral information than the AM processing strategy. The relatively more robust neural representation of AM + FM stimuli observed here likely contributes to the superior performance on speech, speaker, and tone recognition with the AM + FM processing strategy. Taken together, these results suggest that neural information preserved in the FFR may be used to evaluate signal processing strategies considered for CIs.
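The autocorrelation analysis of envelope periodicity mentioned above can be sketched on a synthetic trace; the noisy sinusoid below is a toy stand-in for an FFR waveform, and the search band and SNR are illustrative assumptions.

```python
import numpy as np

fs = 16000
dur = 0.25                              # 250 ms, as for the /au/ token
t = np.arange(int(fs * dur)) / fs
f0 = 110.0                              # stimulus fundamental from the study

# Toy "response": F0-rate periodicity in noise, standing in for an FFR trace.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal(t.size)

def f0_from_autocorrelation(x, fs, fmin=80.0, fmax=300.0):
    # Best lag and periodicity strength from the normalized autocorrelation,
    # in the spirit of the paper's autocorrelation analyses of FFR_ENV.
    ac = np.correlate(x, x, mode="full")[x.size - 1:]
    ac = ac / ac[0]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag, ac[lag]

f0_est, strength = f0_from_autocorrelation(x, fs)
# strength (the normalized peak height) is one simple "periodicity strength"
# statistic; a weak or absent peak signals a degraded F0 representation.
```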
21
[Basic knowledge on the efficacy of hearing aids depending on the type of hearing impairment for Ear, Nose & Throat specialists]. HNO 2019; 66:122-127. [PMID: 29236127 DOI: 10.1007/s00106-017-0457-2]
Abstract
For Ear, Nose & Throat specialists, the physiological and psychoacoustical deficits related to hearing impairment and the compensatory capabilities of hearing aids are topics of prime importance. In conductive hearing loss, the foremost deficit is decreased audibility, for which hearing aids can compensate almost entirely through the use of level-independent gain. In sensorineural hearing loss, however, the irreversible loss of outer and inner hair cells causes a distorted sound perception, which is particularly troublesome when trying to understand speech in noisy environments. Unfortunately, this distortion cannot be compensated for by hearing aids. Nevertheless, in particular listening environments, its effects can be lessened by reducing background noise levels through the use of directional microphones and, to a lesser extent, digital noise reduction. In retrocochlear hearing loss as well, noise reduction is often the main means of improving speech discrimination.
22
Bianchi F, Carney LH, Dau T, Santurette S. Effects of Musical Training and Hearing Loss on Fundamental Frequency Discrimination and Temporal Fine Structure Processing: Psychophysics and Modeling. J Assoc Res Otolaryngol 2019; 20:263-277. [PMID: 30693416 PMCID: PMC6513935 DOI: 10.1007/s10162-018-00710-2]
Abstract
Several studies have shown that musical training leads to improved fundamental frequency (F0) discrimination for young listeners with normal hearing (NH). It is unclear whether a comparable effect of musical training occurs for listeners whose sensory encoding of F0 is degraded. To address this question, the effect of musical training was investigated for three groups of listeners (young NH, older NH, and older listeners with hearing impairment, HI). In a first experiment, F0 discrimination was investigated using complex tones that differed in harmonic content and phase configuration (sine, positive, or negative Schroeder). Musical training was associated with significantly better F0 discrimination of complex tones containing low-numbered harmonics for all groups of listeners. Part of this effect was caused by the fact that musicians were more robust than non-musicians to harmonic roving. Despite the benefit relative to their non-musician counterparts, the older musicians, with or without HI, performed worse than the young musicians. In a second experiment, binaural sensitivity to temporal fine structure (TFS) cues was assessed for the same listeners by estimating the highest frequency at which an interaural phase difference was perceived. Performance was better for musicians for all groups of listeners and the use of TFS cues was degraded for the two older groups of listeners. These findings suggest that musical training is associated with enhanced encoding of TFS cues and improved F0 discrimination in young and older listeners with or without HI, although the musicians' benefit decreased with increasing hearing loss. Additionally, models of the auditory periphery and midbrain were used to examine the effect of HI on F0 encoding. The model predictions reflected the worsening in F0 discrimination with increasing HI and accounted for up to 80% of the variance in the data.
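The sine- versus Schroeder-phase manipulation above can be sketched directly: the same harmonic amplitudes with different phase curvature give very different waveform envelopes. The harmonic count and F0 below are illustrative, and the phase formula is one common convention for positive-Schroeder complexes, not necessarily the study's exact parameters.

```python
import numpy as np

fs = 48000
f0 = 220.0
n_harm = 30                              # illustrative harmonic count
t = np.arange(int(fs / f0)) / fs         # one F0 period
harm = np.arange(1, n_harm + 1)

def complex_tone(phases):
    arg = 2 * np.pi * f0 * np.outer(harm, t) + phases[:, None]
    return np.sin(arg).sum(axis=0)

sine_phase = np.zeros(n_harm)
schroeder = np.pi * harm * (harm + 1) / n_harm   # one common positive-Schroeder convention

def crest_factor(x):
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

# Schroeder phase curvature spreads energy across the period (a flat,
# chirp-like waveform), while sine phase piles it into sharp pulses --
# identical power spectra, very different peripheral excitation.
assert crest_factor(complex_tone(schroeder)) < crest_factor(complex_tone(sine_phase))
```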
Affiliation(s)
- Federica Bianchi
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Lyngby, Denmark
- Current affiliation: Oticon Medical, Kongebakken 9, Smørum, Denmark
- Laurel H Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY, USA
- Torsten Dau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Lyngby, Denmark
- Sébastien Santurette
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Lyngby, Denmark
- Department of Otorhinolaryngology, Head and Neck Surgery & Audiology, Rigshospitalet, 2100 Copenhagen, Denmark

23
Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. [PMID: 30796895 DOI: 10.1016/j.brainres.2019.02.025]
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFR) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST) presented diotically in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Listeners also performed a speeded double vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with F0 differences between vowels, and this perceptual F0 benefit was larger for clean compared to noise-degraded (+5 dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single compared to double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with a non-linear constructive interference at 0 ST, and suppression effects at 4 ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predict the success of identifying concurrent speech.
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Department of Communication Sciences & Disorders, University of South Florida, USA
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, USA

24
Miller CW, Bernstein JGW, Zhang X, Wu YH, Bentler RA, Tremblay K. The Effects of Static and Moving Spectral Ripple Sensitivity on Unaided and Aided Speech Perception in Noise. J Speech Lang Hear Res 2018; 61:3113-3126. [PMID: 30515519 PMCID: PMC6440313 DOI: 10.1044/2018_jslhr-h-17-0373]
Abstract
PURPOSE This study evaluated whether certain spectral ripple conditions were more informative than others in predicting ecologically relevant unaided and aided speech outcomes. METHOD A quasi-experimental study design was used to evaluate 67 older adult hearing aid users with bilateral, symmetrical hearing loss. Speech perception in noise was tested under conditions of unaided and aided, auditory-only and auditory-visual, and 2 types of noise. Predictors included age, audiometric thresholds, audibility, hearing aid compression, and modulation depth detection thresholds for moving (4-Hz) or static (0-Hz) 2-cycle/octave spectral ripples applied to carriers of broadband noise or 2000-Hz low- or high-pass filtered noise. RESULTS A principal component analysis of the modulation detection data found that broadband and low-pass static and moving ripple detection thresholds loaded onto the first factor whereas high-pass static and moving ripple detection thresholds loaded onto a second factor. A linear mixed model revealed that audibility and the first factor (reflecting broadband and low-pass static and moving ripples) were significantly associated with speech perception performance. Similar results were found for unaided and aided speech scores. The interactions between speech conditions were not significant, suggesting that the relationship between ripples and speech perception was consistent regardless of visual cues or noise condition. High-pass ripple sensitivity was not correlated with speech understanding. CONCLUSIONS The results suggest that, for hearing aid users, poor speech understanding in noise and sensitivity to both static and slow-moving ripples may reflect deficits in the same underlying auditory processing mechanism. Significant factor loadings involving ripple stimuli with low-frequency content may suggest an impaired ability to use temporal fine structure information in the stimulus waveform. 
Support is provided for the use of spectral ripple testing to predict speech perception outcomes in clinical settings.
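The factor extraction described in this abstract (ripple-detection thresholds loading onto two components) can be illustrated with a minimal principal component analysis via SVD. This is a generic sketch on a hypothetical listener-by-condition threshold matrix, not the study's analysis or data:

```python
import numpy as np

def pca_loadings(X, n_components=2):
    """Return PCA component loadings (rows of V^T) from the SVD of the
    column-standardized data matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return vt[:n_components]

rng = np.random.default_rng(0)
# Hypothetical matrix: 40 listeners x 6 ripple conditions. The first four
# columns share a common factor (cf. broadband/low-pass ripples); the last
# two are independent (cf. high-pass ripples).
common = rng.normal(size=(40, 1))
X = np.hstack([common + 0.3 * rng.normal(size=(40, 1)) for _ in range(4)]
              + [rng.normal(size=(40, 1)) for _ in range(2)])
loadings = pca_loadings(X)
```

With this construction, the first component's loadings should concentrate on the four correlated columns, mirroring how broadband and low-pass ripple thresholds loaded onto one factor.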
Affiliation(s)
- Christi W. Miller
- Department of Speech and Hearing Sciences, University of Washington, Seattle
- Joshua G. W. Bernstein
- National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, MD
- Xuyang Zhang
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City
- Yu-Hsiang Wu
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City
- Ruth A. Bentler
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City
- Kelly Tremblay
- Department of Speech and Hearing Sciences, University of Washington, Seattle
25
Moncada-Torres A, Joshi SN, Prokopiou A, Wouters J, Epp B, Francart T. A framework for computational modelling of interaural time difference discrimination of normal and hearing-impaired listeners. J Acoust Soc Am 2018; 144:940. [PMID: 30180705 DOI: 10.1121/1.5051322]
Abstract
Different computational models have been developed to study interaural time difference (ITD) perception. However, only a few have used a physiologically inspired architecture to study ITD discrimination, and none include aspects of hearing impairment. In this work, a framework was developed to predict ITD thresholds in listeners with normal and impaired hearing. It combines the physiologically inspired model of the auditory periphery proposed by Zilany, Bruce, Nelson, and Carney [(2009). J. Acoust. Soc. Am. 126(5), 2390-2412] as a front end with a coincidence detection stage and a neurometric decision device as a back end. It was validated by comparing its predictions against behavioral data for narrowband stimuli from the literature. The framework is able to model ITD discrimination of normal-hearing and hearing-impaired listeners at a group level. Additionally, it was used to explore the effect of different proportions of outer- and inner-hair-cell impairment on ITD discrimination.
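The coincidence-detection back end can be caricatured as finding the interaural lag that maximizes the cross-correlation of the two ears' inputs. The sketch below operates on raw waveforms rather than on simulated auditory-nerve spike trains, so it is only a toy illustration of the decision variable, not the published model:

```python
import numpy as np

def estimate_itd(left, right, fs):
    """Return the delay of the right channel relative to the left (in s),
    taken as the lag maximizing the interaural cross-correlation -- a
    stand-in for a bank of coincidence detectors."""
    n = len(left)
    lags = np.arange(-(n - 1), n)
    xcorr = np.correlate(left, right, mode="full")
    return -lags[np.argmax(xcorr)] / fs

fs = 48000
t = np.arange(0, 0.05, 1 / fs)
tone = np.sin(2 * np.pi * 500 * t)            # narrowband stimulus
delay = 10                                     # samples (~208 microseconds)
left = tone
right = np.concatenate([np.zeros(delay), tone[:-delay]])
itd = estimate_itd(left, right, fs)
```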
Affiliation(s)
- Arturo Moncada-Torres
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Suyash N Joshi
- Department of Electrical Engineering, Hearing Systems, Technical University of Denmark, Ørsteds Plads, Building 352, DK-2800 Kongens Lyngby, Denmark
- Andreas Prokopiou
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Jan Wouters
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Bastian Epp
- Department of Electrical Engineering, Hearing Systems, Technical University of Denmark, Ørsteds Plads, Building 352, DK-2800 Kongens Lyngby, Denmark
- Tom Francart
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
26
Carney LH. Supra-Threshold Hearing and Fluctuation Profiles: Implications for Sensorineural and Hidden Hearing Loss. J Assoc Res Otolaryngol 2018; 19:331-352. [PMID: 29744729 PMCID: PMC6081887 DOI: 10.1007/s10162-018-0669-5]
Abstract
An important topic in contemporary auditory science is supra-threshold hearing. Difficulty hearing at conversational speech levels in background noise has long been recognized as a problem of sensorineural hearing loss, including that associated with aging (presbyacusis). Such difficulty in listeners with normal thresholds has received more attention recently, especially associated with descriptions of synaptopathy, the loss of auditory nerve (AN) fibers as a result of noise exposure or aging. Synaptopathy has been reported to cause a disproportionate loss of low- and medium-spontaneous rate (L/MSR) AN fibers. Several studies of synaptopathy have assumed that the wide dynamic ranges of L/MSR AN fiber rates are critical for coding supra-threshold sounds. First, this review will present data from the literature that argues against a direct role for average discharge rates of L/MSR AN fibers in coding sounds at moderate to high sound levels. Second, the encoding of sounds at supra-threshold levels is examined. A key assumption in many studies is that saturation of AN fiber discharge rates limits neural encoding, even though the majority of AN fibers, high-spontaneous rate (HSR) fibers, have saturated average rates at conversational sound levels. It is argued here that the cross-frequency profile of low-frequency neural fluctuation amplitudes, not average rates, encodes complex sounds. As described below, this fluctuation-profile coding mechanism benefits from both saturation of inner hair cell (IHC) transduction and average rate saturation associated with the IHC-AN synapse. Third, the role of the auditory efferent system, which receives inputs from L/MSR fibers, is revisited in the context of fluctuation-profile coding. The auditory efferent system is hypothesized to maintain and enhance neural fluctuation profiles. Lastly, central mechanisms sensitive to neural fluctuations are reviewed. 
Low-frequency fluctuations in AN responses are accentuated by cochlear nucleus neurons which, either directly or via other brainstem nuclei, relay fluctuation profiles to the inferior colliculus (IC). IC neurons are sensitive to the frequency and amplitude of low-frequency fluctuations and convert fluctuation profiles from the periphery into a phase-locked rate profile that is robust across a wide range of sound levels and in background noise. The descending projection from the midbrain (IC) to the efferent system completes a functional loop that, combined with inputs from the L/MSR pathway, is hypothesized to maintain "sharp" supra-threshold hearing, reminiscent of visual mechanisms that regulate optical accommodation. Examples from speech coding and detection in noise are reviewed. Implications for the effects of synaptopathy on control mechanisms hypothesized to influence supra-threshold hearing are discussed. This framework for understanding neural coding and control mechanisms for supra-threshold hearing suggests strategies for the design of novel hearing aid signal-processing and electrical stimulation patterns for cochlear implants.
Affiliation(s)
- Laurel H Carney
- Departments of Biomedical Engineering, Neuroscience, and Electrical & Computer Engineering, Del Monte Institute for Neuroscience, University of Rochester, 601 Elmwood Ave., Box 603, Rochester, NY, 14642, USA.
27
Abstract
OBJECTIVES Vocoders offer an effective platform to simulate the effects of cochlear implant speech processing strategies in normal-hearing listeners. Several behavioral studies have examined the effects of varying spectral and temporal cues on vocoded speech perception; however, little is known about the neural indices of vocoded speech perception. Here, the scalp-recorded frequency following response (FFR) was used to study the effects of varying spectral and temporal cues on brainstem neural representation of specific acoustic cues, the temporal envelope periodicity related to fundamental frequency (F0) and temporal fine structure (TFS) related to formant and formant-related frequencies, as reflected in the phase-locked neural activity in response to vocoded speech. DESIGN In experiment 1, FFRs were measured in 12 normal-hearing, adult listeners in response to a steady state English back vowel /u/ presented in an unaltered, unprocessed condition and six sine-vocoder conditions with varying numbers of channels (1, 2, 4, 8, 16, and 32), while the temporal envelope cutoff frequency was fixed at 500 Hz. In experiment 2, FFRs were obtained from 14 normal-hearing, adult listeners in response to the same English vowel /u/, presented in an unprocessed condition and four vocoded conditions where both the temporal envelope cutoff frequency (50 versus 500 Hz) and carrier type (sine wave versus noise band) were varied separately with the number of channels fixed at 8. Fast Fourier Transform was applied to the time waveforms of FFR to analyze the strength of brainstem neural representation of temporal envelope periodicity (F0) and TFS-related peaks (formant structure). RESULTS Brainstem neural representation of both temporal envelope and TFS cues improved when the number of channels increased from 1 to 4, followed by a plateau with 8 and 16 channels, and a reduction in phase-locking strength with 32 channels. 
For the sine vocoders, peaks in the FFR TFS spectra corresponded with the low-frequency sine-wave carriers and side band frequencies in the stimulus spectra. When the temporal envelope cutoff frequency increased from 50 to 500 Hz, an improvement was observed in brainstem F0 representation with no change in brainstem representation of spectral peaks proximal to the first formant frequency (F1). There was no significant effect of carrier type (sine- versus noise-vocoder) on brainstem neural representation of F0 cues when the temporal envelope cutoff frequency was 500 Hz. CONCLUSIONS While the improvement in neural representation of temporal envelope and TFS cues with up to 4 vocoder channels is consistent with the behavioral literature, the reduced neural phase-locking strength noted with even more channels may be because of the narrow bandwidth of each channel as the number of channels increases. Stronger neural representation of temporal envelope cues with higher temporal envelope cutoff frequencies is likely a reflection of brainstem neural phase-locking to F0-related periodicity fluctuations preserved in the 500-Hz temporal envelopes, which are unavailable in the 50-Hz temporal envelopes. No effect of temporal envelope cutoff frequency was seen for neural representation of TFS cues, suggesting that spectral side band frequencies created by the 500-Hz temporal envelopes did not improve neural representation of F1 cues over the 50-Hz temporal envelopes. Finally, brainstem F0 representation was not significantly affected by carrier type with a temporal envelope cutoff frequency of 500 Hz, which is inconsistent with previous results of behavioral studies examining pitch perception of vocoded stimuli.
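Sine vocoding of the kind described in this abstract can be sketched as band-pass analysis, envelope extraction, and re-synthesis on sine carriers. The parameter choices below (filter orders, logarithmically spaced band edges) are illustrative assumptions, not the study's processing chain:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def sine_vocode(x, fs, n_channels=8, env_cutoff=500.0, fmin=100.0, fmax=8000.0):
    """Toy sine vocoder: split x into log-spaced bands, extract each band's
    temporal envelope (rectify + low-pass at env_cutoff), and use it to
    modulate a sine carrier at the band's geometric center frequency."""
    edges = np.logspace(np.log10(fmin), np.log10(fmax), n_channels + 1)
    t = np.arange(len(x)) / fs
    env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)
        env = np.maximum(sosfiltfilt(env_sos, np.abs(band)), 0.0)
        out += env * np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
    return out

fs = 32000
rng = np.random.default_rng(0)
speech_like = rng.standard_normal(fs // 4)    # stand-in for a vowel token
vocoded = sine_vocode(speech_like, fs)
```

Lowering `env_cutoff` to 50 Hz removes the F0-related periodicity fluctuations from the channel envelopes, paralleling the 50- versus 500-Hz comparison above.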
28
Settibhaktini H, Chintanpalli A. Modeling the level-dependent changes of concurrent vowel scores. J Acoust Soc Am 2018; 143:440. [PMID: 29390795 PMCID: PMC6226212 DOI: 10.1121/1.5021330]
Abstract
The difference in fundamental frequency (F0) between talkers is an important cue for speaker segregation. To understand how this cue varies across sound level, Chintanpalli, Ahlstrom, and Dubno [(2014). J. Assoc. Res. Otolaryngol. 15, 823-837] collected level-dependent changes in concurrent-vowel identification scores for same- and different-F0 conditions in younger adults with normal hearing. Modeling suggested that level-dependent changes in phase locking of auditory-nerve (AN) fibers to formants and F0s may contribute to concurrent-vowel identification scores; however, identification scores were not predicted to test this suggestion directly. The current study predicts these identification scores using the temporal responses of a computational AN model and a modified version of Meddis and Hewitt's [(1992). J. Acoust. Soc. Am. 91, 233-245] F0-based segregation algorithm. The model successfully captured the level-dependent changes in identification scores of both vowels with and without F0 difference, as well as identification scores for one vowel correct. The model's F0-based vowel segregation was controlled using the actual F0-benefit across levels such that the predicted F0-benefit matched qualitatively with the actual F0-benefit as a function of level. The quantitative predictions from this F0-based segregation algorithm demonstrate that temporal responses of AN fibers to vowel formants and F0s can account for variations in identification scores across sound level and F0-difference conditions in a concurrent-vowel task.
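The periodicity cue underlying F0-based segregation can be illustrated with a crude autocorrelation F0 estimator. This is a generic sketch of the cue, not the Meddis and Hewitt segregation algorithm or the auditory-nerve model front end used in the study:

```python
import numpy as np

def estimate_f0(x, fs, fmin=80.0, fmax=400.0):
    """Estimate F0 (Hz) as the reciprocal of the autocorrelation lag with
    the largest peak inside the plausible pitch-period range."""
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
# harmonic complex with F0 = 100 Hz, a stand-in for a voiced vowel
vowel_like = sum(np.sin(2 * np.pi * 100 * k * t) for k in (1, 2, 3))
f0 = estimate_f0(vowel_like, fs)
```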
Affiliation(s)
- Harshavardhan Settibhaktini
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan, 333031, India
- Ananthakrishna Chintanpalli
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan, 333031, India
29
Hauser SN, Burton JA, Mercer ET, Ramachandran R. Effects of noise overexposure on tone detection in noise in nonhuman primates. Hear Res 2018; 357:33-45. [PMID: 29175767 PMCID: PMC5743633 DOI: 10.1016/j.heares.2017.11.004]
Abstract
This report explores the consequences of acoustic overexposures on hearing in noisy environments for two macaque monkeys trained to perform a reaction time detection task using a Go/No-Go lever release paradigm. Behavioral and non-invasive physiological assessments were obtained before and after narrowband noise exposure. Physiological measurements showed elevated auditory brainstem response (ABR) thresholds and absent distortion product otoacoustic emissions (DPOAEs) post-exposure relative to pre-exposure. Audiograms revealed frequency specific increases in tone detection thresholds, with the greatest increases at the exposure band frequency and higher. Masked detection was affected in a similar frequency specific manner: threshold shift rates (change of masked threshold per dB increase in noise level) were lower than pre-exposure values at frequencies higher than the exposure band. Detection thresholds in sinusoidally amplitude modulated (SAM) noise post-exposure showed no difference from those in unmodulated noise, whereas pre-exposure masked detection thresholds were lower in the presence of SAM noise compared to unmodulated noise. These frequency-dependent results were correlated with cochlear histopathological changes in monkeys that underwent similar noise exposure. These results reveal that behavioral and physiological effects of noise exposure in macaques are similar to those seen in humans and provide preliminary information on the relationship between noise exposure, cochlear pathology and perceptual changes in hearing within individual subjects.
Affiliation(s)
- Samantha N Hauser
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
- Jane A Burton
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
- Evan T Mercer
- Vanderbilt University Interdisciplinary Program in Neuroscience for Undergraduates, Vanderbilt University, Nashville, TN 37212, USA.
- Ramnarayan Ramachandran
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
30
Valero MD, Burton JA, Hauser SN, Hackett TA, Ramachandran R, Liberman MC. Noise-induced cochlear synaptopathy in rhesus monkeys (Macaca mulatta). Hear Res 2017; 353:213-223. [PMID: 28712672 PMCID: PMC5632522 DOI: 10.1016/j.heares.2017.07.003]
Abstract
Cochlear synaptopathy can result from various insults, including acoustic trauma, aging, ototoxicity, or chronic conductive hearing loss. For example, moderate noise exposure in mice can destroy up to ∼50% of synapses between auditory nerve fibers (ANFs) and inner hair cells (IHCs) without affecting outer hair cells (OHCs) or thresholds, because the synaptopathy occurs first in high-threshold ANFs. However, the fiber loss likely impairs temporal processing and hearing-in-noise, a classic complaint of those with sensorineural hearing loss. Non-human primates appear to be less vulnerable to noise-induced hair-cell loss than rodents, but their susceptibility to synaptopathy has not been studied. Because establishing a non-human primate model may be important in the development of diagnostics and therapeutics, we examined cochlear innervation and the damaging effects of acoustic overexposure in young adult rhesus macaques. Anesthetized animals were exposed bilaterally to narrow-band noise centered at 2 kHz at various sound-pressure levels for 4 h. Cochlear function was assayed for up to 8 weeks following exposure via auditory brainstem responses (ABRs) and otoacoustic emissions (OAEs). A moderate loss of synaptic connections (mean of 12-27% in the basal half of the cochlea) followed temporary threshold shifts (TTS), despite minimal hair-cell loss. A dramatic loss of synapses (mean of 50-75% in the basal half of the cochlea) was seen on IHCs surviving noise exposures that produced permanent threshold shifts (PTS) and widespread hair-cell loss. Higher noise levels were required to produce PTS in macaques compared to rodents, suggesting that primates are less vulnerable to hair-cell loss. However, the phenomenon of noise-induced cochlear synaptopathy in primates is similar to that seen in rodents.
Affiliation(s)
- M D Valero
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA; Department of Otolaryngology, Harvard Medical School, Boston, MA 02115, USA.
- J A Burton
- Vanderbilt University Medical Center, Dept. of Hearing and Speech Sciences, Nashville, TN 37232, USA
- S N Hauser
- Vanderbilt University Medical Center, Dept. of Hearing and Speech Sciences, Nashville, TN 37232, USA
- T A Hackett
- Vanderbilt University Medical Center, Dept. of Hearing and Speech Sciences, Nashville, TN 37232, USA
- R Ramachandran
- Vanderbilt University Medical Center, Dept. of Hearing and Speech Sciences, Nashville, TN 37232, USA
- M C Liberman
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA; Department of Otolaryngology, Harvard Medical School, Boston, MA 02115, USA
31
Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues. J Assoc Res Otolaryngol 2017; 18:687-710. [PMID: 28748487 DOI: 10.1007/s10162-017-0627-7]
Abstract
Perceptual studies of speech intelligibility have shown that slow variations of acoustic envelope (ENV) in a small set of frequency bands provide adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
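A speech chimera swaps the envelope of one signal onto the fine structure of another. Below is a minimal single-band version using the Hilbert transform; the actual chimera stimuli are built with a multi-band filterbank, so this one-band reduction is only for illustration:

```python
import numpy as np
from scipy.signal import hilbert

def chimera(env_source, tfs_source):
    """Impose the Hilbert envelope (ENV) of one signal on the unit-amplitude
    temporal fine structure (TFS) of another, within a single band."""
    env = np.abs(hilbert(env_source))
    tfs = np.cos(np.angle(hilbert(tfs_source)))
    return env * tfs

fs = 16000
t = np.arange(0, 0.2, 1 / fs)
a = 0.5 * np.sin(2 * np.pi * 300 * t)         # ENV donor
b = np.sin(2 * np.pi * 1000 * t)              # TFS donor
y = chimera(a, b)                              # envelope of a on fine structure of b
```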
32
Ananthakrishnan S, Krishnan A, Bartlett E. Human Frequency Following Response: Neural Representation of Envelope and Temporal Fine Structure in Listeners with Normal Hearing and Sensorineural Hearing Loss. Ear Hear 2016; 37:e91-e103. [PMID: 26583482 DOI: 10.1097/aud.0000000000000247]
Abstract
OBJECTIVE Listeners with sensorineural hearing loss (SNHL) typically experience reduced speech perception, which is not completely restored with amplification. This likely occurs because cochlear damage, in addition to elevating audiometric thresholds, alters the neural representation of speech transmitted to higher centers along the auditory neuroaxis. While the deleterious effects of SNHL on speech perception in humans have been well-documented using behavioral paradigms, our understanding of the neural correlates underlying these perceptual deficits remains limited. Using the scalp-recorded frequency following response (FFR), the authors examine the effects of SNHL and aging on subcortical neural representation of acoustic features important for pitch and speech perception, namely the periodicity envelope (F0) and temporal fine structure (TFS; formant structure), as reflected in the phase-locked neural activity generating the FFR. DESIGN FFRs were obtained from 10 listeners with normal hearing (NH) and 9 listeners with mild-moderate SNHL in response to a steady-state English back vowel /u/ presented at multiple intensity levels. Use of multiple presentation levels facilitated comparisons at equal sound pressure level (SPL) and equal sensation level. In a second follow-up experiment to address the effect of age on envelope and TFS representation, FFRs were obtained from 25 NH and 19 listeners with mild to moderately severe SNHL to the same vowel stimulus presented at 80 dB SPL. Temporal waveforms, Fast Fourier Transform and spectrograms were used to evaluate the magnitude of the phase-locked activity at F0 (periodicity envelope) and F1 (TFS). RESULTS Neural representation of both envelope (F0) and TFS (F1) at equal SPLs was stronger in NH listeners compared with listeners with SNHL. 
Also, comparison of neural representation of F0 and F1 across stimulus levels expressed in SPL and sensation level (accounting for audibility) revealed that level-related changes in F0 and F1 magnitude were different for listeners with SNHL compared with listeners with NH. Furthermore, the degradation in subcortical neural representation was observed to persist in listeners with SNHL even when the effects of age were controlled for. CONCLUSIONS Overall, our results suggest a relatively greater degradation in the neural representation of TFS compared with periodicity envelope in individuals with SNHL. This degraded neural representation of TFS in SNHL, as reflected in the brainstem FFR, may reflect a disruption in the temporal pattern of phase-locked neural activity arising from altered tonotopic maps and/or wider filters causing poor frequency selectivity in these listeners. Finally, while preliminary results indicate that the deleterious effects of SNHL may be greater than age-related degradation in subcortical neural representation, the lack of a balanced age-matched control group in this study does not permit us to completely rule out the effects of age on subcortical neural representation.
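The FFR analysis described here, quantifying phase-locked energy at F0 and F1, reduces to reading spectral magnitudes off an FFT of the response waveform. A sketch on a synthetic two-component waveform (the 100-Hz and 300-Hz values are illustrative, not recorded FFR data):

```python
import numpy as np

def component_magnitude(x, fs, freq, bw=10.0):
    """Peak FFT magnitude within +/- bw Hz of a target frequency,
    e.g. F0 or an F1-related spectral peak."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) / len(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    band = (f >= freq - bw) & (f <= freq + bw)
    return float(spec[band].max())

fs = 8000
t = np.arange(0, 0.5, 1 / fs)
# synthetic 'FFR': strong energy at F0 = 100 Hz, weaker energy near 300 Hz
ffr = np.sin(2 * np.pi * 100 * t) + 0.4 * np.sin(2 * np.pi * 300 * t)
f0_mag = component_magnitude(ffr, fs, 100.0)
f1_mag = component_magnitude(ffr, fs, 300.0)
```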
Affiliation(s)
- Saradha Ananthakrishnan
- Department of Speech Language Hearing Sciences, Purdue University, West Lafayette, Indiana, USA; Department of Audiology, Speech-Language Pathology and Deaf Studies, Towson University, Towson, Maryland, USA; Department of Biomedical Engineering, Purdue University, West Lafayette, Indiana, USA; Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
33
Chintanpalli A, Ahlstrom JB, Dubno JR. Effects of age and hearing loss on concurrent vowel identification. J Acoust Soc Am 2016; 140:4142. [PMID: 28040038 PMCID: PMC5848863 DOI: 10.1121/1.4968781]
Abstract
Differences in formant frequencies and fundamental frequencies (F0) are important cues for segregating and identifying two simultaneous vowels. This study assessed age- and hearing-loss-related changes in the use of these cues for recognition of one or both vowels in a pair and determined differences related to vowel identity and specific vowel pairings. Younger adults with normal hearing, older adults with normal hearing, and older adults with hearing loss listened to different-vowel and identical-vowel pairs that varied in F0 differences. Identification of both vowels as a function of F0 difference revealed that increased age affects the use of F0 and formant difference cues for different-vowel pairs. Hearing loss further reduced the use of these cues, which was not attributable to lower vowel sensation levels. High scores for one vowel in the pair and no effect of F0 differences suggested that F0 cues are important only for identifying both vowels. In contrast to mean scores, widely varying differences in effects of F0 cues, age, and hearing loss were observed for particular vowels and vowel pairings. These variations in identification of vowel pairs were not explained by acoustical models based on the location and level of formants within the two vowels.
Affiliation(s)
- Ananthakrishna Chintanpalli
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
- Jayne B Ahlstrom
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
- Judy R Dubno
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425-5500, USA
34
Guthrie OW. Noise Induced DNA Damage Within the Auditory Nerve. Anat Rec (Hoboken) 2016; 300:520-526. [DOI: 10.1002/ar.23494]
Affiliation(s)
- O'neil W. Guthrie
- Cell and Molecular Pathology Laboratory, Department of Communication Sciences and Disorders, Northern Arizona University, Flagstaff, Arizona
- Research Service-151, Loma Linda Veterans Affairs Medical Center, Loma Linda, California
- Department of Otolaryngology and Head & Neck Surgery, School of Medicine, Loma Linda University Medical Center, Loma Linda, California
35
Colin D, Micheyl C, Girod A, Truy E, Gallégo S. Binaural Diplacusis and Its Relationship with Hearing-Threshold Asymmetry. PLoS One 2016; 11:e0159975. [PMID: 27536884 PMCID: PMC4990190 DOI: 10.1371/journal.pone.0159975]
Abstract
Binaural pitch diplacusis refers to a perceptual anomaly whereby the same sound is perceived as having a different pitch depending on whether it is presented in the left or the right ear. Results in the literature suggest that this phenomenon is more prevalent, and larger, in individuals with asymmetric hearing loss than in individuals with symmetric hearing. However, because studies devoted to this effect have thus far involved small samples, the prevalence of the effect, and its relationship with interaural asymmetries in hearing thresholds, remain unclear. In this study, psychometric functions for interaural pitch comparisons were measured in 55 subjects, including 12 normal-hearing and 43 hearing-impaired participants. Statistically significant pitch differences between the left and right ears were observed in normal-hearing participants, but the effect was usually small (less than 1.5/16 octave, or about 7%). For the hearing-impaired participants, statistically significant interaural pitch differences were found in about three-quarters of the cases. Moreover, for about half of these participants, the difference exceeded 1.5/16 octaves and, in some participants, was as large as or larger than 1/4 octave. This was the case even for the lowest frequency tested, 500 Hz. The pitch differences were weakly, but significantly, correlated with the difference in hearing thresholds between the two ears, such that larger threshold asymmetries were statistically associated with larger pitch differences. For the vast majority of the hearing-impaired participants, the direction of the pitch differences was such that pitch was perceived as higher on the side with the higher (i.e., ‘worse’) hearing thresholds than on the opposite side. These findings are difficult to reconcile with purely temporal models of pitch perception, but may be accounted for by place-based or spectrotemporal models.
Affiliation(s)
- David Colin
- Lyon Neuroscience Research Center, IMPACT Team, CRNL, INSERM U1028, CNRS UMR5292, Lyon, France
- Institut des Sciences et Techniques de la Réadaptation, Lyon, France
- University Lyon 1, Lyon, France
- Anneline Girod
- Institut des Sciences et Techniques de la Réadaptation, Lyon, France
- Eric Truy
- Lyon Neuroscience Research Center, IMPACT Team, CRNL, INSERM U1028, CNRS UMR5292, Lyon, France
- Departement ORL, Hôpital Edouard Herriot, Centre Hospitalier et Universitaire, Lyon, France
- University Lyon 1, Lyon, France
- Stéphane Gallégo
- Institut des Sciences et Techniques de la Réadaptation, Lyon, France
- University Lyon 1, Lyon, France
36
Islam MA, Jassim WA, Cheok NS, Zilany MSA. A Robust Speaker Identification System Using the Responses from a Model of the Auditory Periphery. PLoS One 2016; 11:e0158520. [PMID: 27392046 PMCID: PMC4938550 DOI: 10.1371/journal.pone.0158520]
Abstract
Speaker identification under noisy conditions is one of the challenging topics in the field of speech processing. Motivated by the fact that neural responses are robust against noise, this paper proposes a new speaker identification system using 2-D neurograms constructed from the responses of a physiologically based computational model of the auditory periphery. The responses of auditory-nerve fibers with a wide range of characteristic frequencies were simulated for speech signals to construct neurograms. The neurogram coefficients were trained using the well-known Gaussian mixture model-universal background model (GMM-UBM) classification technique to generate an identity model for each speaker. In this study, three text-independent and one text-dependent speaker databases were employed to test the identification performance of the proposed method. The robustness of the proposed method was also investigated using speech signals distorted by three types of noise (white Gaussian, pink, and street noise) at different signal-to-noise ratios. The identification results of the proposed neural-response-based method were compared to the performance of traditional speaker identification methods using features such as Mel-frequency cepstral coefficients, Gammatone-frequency cepstral coefficients, and frequency-domain linear prediction. Although the classification accuracy achieved by the proposed method was only comparable to that of the traditional techniques in quiet, the new feature provided lower classification error rates in noisy environments.
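The GMM-UBM classifier named above scores test features against per-speaker likelihood models. As a minimal, hedged sketch of that scoring idea, the toy below uses a single diagonal Gaussian per speaker instead of a full GMM-UBM, with invented 2-D "neurogram" features; the function names and data are illustrative, not from the paper:

```python
import math

def fit_diag_gauss(frames):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to feature frames."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    var = [max(sum((f[j] - mean[j]) ** 2 for f in frames) / n, 1e-6) for j in range(d)]
    return mean, var

def diag_gauss_loglik(frames, mean, var):
    """Total log-likelihood of the frames under a diagonal Gaussian."""
    ll = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return ll

def identify(test_frames, models):
    """Return the speaker whose model gives the highest likelihood."""
    return max(models, key=lambda s: diag_gauss_loglik(test_frames, *models[s]))

# Invented 2-D "neurogram" features for two speakers:
models = {
    "spk_a": fit_diag_gauss([[0.0, 1.0], [0.2, 0.9], [-0.1, 1.1]]),
    "spk_b": fit_diag_gauss([[3.0, -1.0], [2.8, -0.9], [3.2, -1.2]]),
}
print(identify([[0.1, 1.05], [0.0, 0.95]], models))  # → spk_a
```

A real GMM-UBM system would mix many Gaussians, adapt them from a universal background model, and use far higher-dimensional features; only the maximum-likelihood decision rule carries over from this sketch.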
Affiliation(s)
- Md. Atiqul Islam
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, 50603, Malaysia
- Wissam A. Jassim
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, 50603, Malaysia
- Ng Siew Cheok
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, 50603, Malaysia

37
Distorted Tonotopic Coding of Temporal Envelope and Fine Structure with Noise-Induced Hearing Loss. J Neurosci 2016; 36:2227-37. [PMID: 26888932 DOI: 10.1523/jneurosci.3944-15.2016]
Abstract
People with cochlear hearing loss have substantial difficulty understanding speech in real-world listening environments (e.g., restaurants), even with amplification from a modern digital hearing aid. Unfortunately, a disconnect remains between human perceptual studies implicating diminished sensitivity to fast acoustic temporal fine structure (TFS) and animal studies showing minimal changes in neural coding of TFS or slower envelope (ENV) structure. Here, we used general system-identification (Wiener kernel) analyses of chinchilla auditory nerve fiber responses to Gaussian noise to reveal pronounced distortions in tonotopic coding of TFS and ENV following permanent, noise-induced hearing loss. In basal fibers with characteristic frequencies (CFs) >1.5 kHz, hearing loss introduced robust nontonotopic coding (i.e., at the wrong cochlear place) of low-frequency TFS, while ENV responses typically remained at CF. As a consequence, the highest dominant frequency of TFS coding in response to Gaussian noise was 2.4 kHz in noise-overexposed fibers compared with 4.5 kHz in control fibers. Coding of ENV also became nontonotopic in more pronounced cases of cochlear damage. In apical fibers, more classical hearing-loss effects were observed, i.e., broadened tuning without a significant shift in best frequency. Because these distortions and dissociations of TFS/ENV disrupt tonotopicity, a fundamental principle of auditory processing necessary for robust signal coding in background noise, these results have important implications for understanding communication difficulties faced by people with hearing loss. Further, hearing aids may benefit from distinct amplification strategies for apical and basal cochlear regions to address fundamentally different coding deficits. SIGNIFICANCE STATEMENT Speech-perception problems associated with noise overexposure are pervasive in today's society, even with modern digital hearing aids.
Unfortunately, the underlying physiological deficits in neural coding remain unclear. Here, we used innovative system-identification analyses of auditory nerve fiber responses to Gaussian noise to uncover pronounced distortions in coding of rapidly varying acoustic temporal fine structure and slower envelope cues following noise trauma. Because these distortions degrade and diminish the tonotopic representation of temporal acoustic features, a fundamental principle of auditory processing, the results represent a critical advancement in our understanding of the physiological bases of communication disorders. The detailed knowledge provided by this work will help guide the design of signal-processing strategies aimed at alleviating everyday communication problems for people with hearing loss.
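For spike trains driven by Gaussian noise, the first-order Wiener kernel used in these analyses is proportional to the spike-triggered average, and the dominant TFS frequency is read off the kernel's spectrum. The pure-Python sketch below is a toy: a deterministic tone stands in for the noise stimulus, and idealized "spikes" fire at stimulus peaks, purely to make the computation checkable:

```python
import math

def spike_triggered_average(stimulus, spike_times, win):
    """First-order Wiener-kernel estimate: mean stimulus segment preceding each spike."""
    sta, n = [0.0] * win, 0
    for t in spike_times:
        if t >= win:
            for i in range(win):
                sta[i] += stimulus[t - win + i]
            n += 1
    return [s / n for s in sta]

def dominant_frequency(kernel, fs):
    """Frequency (Hz) of the largest DFT component of the kernel."""
    n = len(kernel)
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2):
        re = sum(kernel[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(kernel[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        if re * re + im * im > best_mag:
            best_k, best_mag = k, re * re + im * im
    return best_k * fs / n

fs, win = 16000, 64
stim = [math.sin(2 * math.pi * 2000 * t / fs) for t in range(4000)]
spikes = [t for t in range(4000) if stim[t] > 0.999]  # idealized: fire at stimulus peaks
sta = spike_triggered_average(stim, spikes, win)
print(dominant_frequency(sta, fs))  # → 2000.0
```

In the actual study the stimulus is broadband Gaussian noise and the kernel reflects the fiber's tuning; the 2.4 kHz vs. 4.5 kHz comparison above is this dominant-frequency readout applied across fiber populations.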
38
Hossain ME, Jassim WA, Zilany MSA. Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram. PLoS One 2016; 11:e0150415. [PMID: 26967160 PMCID: PMC4788356 DOI: 10.1371/journal.pone.0150415]
Abstract
Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can reduce audibility, dynamic range, and the frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms, and features were extracted from the neurograms using third-order statistics referred to as the bispectrum. Phase coupling in the neurogram bispectrum provides unique insight into the presence (or deficit) of supra-threshold nonlinearities, beyond audibility, for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to behavioral scores for listeners with normal hearing and hearing loss, both in quiet and in noisy background conditions, and also to the performance of several existing methods. The predictions fit the behavioral data well, with small error, suggesting that subjective scores can be estimated reliably using the proposed neural-response-based metric. The metric also had a wide dynamic range, and the predicted scores were well separated as a function of hearing loss. Because the proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility, it could be applied to evaluate the performance of speech-processing algorithms designed for hearing aids and cochlear implants.
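The bispectrum named above is the third-order spectrum B(k1, k2) = X(k1)·X(k2)·X*(k1+k2), which is sensitive to phase relations among frequency triples. A minimal sketch (direct DFT, single segment; detecting genuine phase coupling requires averaging this product over many segments, which is omitted here):

```python
import cmath, math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def bispectrum(x, k1, k2):
    """Single-segment bispectrum estimate B(k1, k2) = X(k1) * X(k2) * conj(X(k1 + k2))."""
    X = dft(x)
    return X[k1] * X[k2] * X[(k1 + k2) % len(x)].conjugate()

# Components at bins 3, 5, and 8 form an interacting triple (3 + 5 = 8),
# so B(3, 5) is large while B(3, 6) is essentially zero (no energy at bin 6):
N = 64
x = [math.cos(2 * math.pi * 3 * t / N) + math.cos(2 * math.pi * 5 * t / N)
     + math.cos(2 * math.pi * 8 * t / N) for t in range(N)]
print(abs(bispectrum(x, 3, 5)) > 1e6 * abs(bispectrum(x, 3, 6)))  # → True
```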
Affiliation(s)
- Mohammad E. Hossain
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia
- Wissam A. Jassim
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia
- Muhammad S. A. Zilany
- Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia

39
Henry KS, Neilans EG, Abrams KS, Idrobo F, Carney LH. Neural correlates of behavioral amplitude modulation sensitivity in the budgerigar midbrain. J Neurophysiol 2016; 115:1905-16. [PMID: 26843608 DOI: 10.1152/jn.01003.2015]
Abstract
Amplitude modulation (AM) is a crucial feature of many communication signals, including speech. Whereas average discharge rates in the auditory midbrain correlate with behavioral AM sensitivity in rabbits, the neural bases of AM sensitivity in species with human-like behavioral acuity are unexplored. Here, we used parallel behavioral and neurophysiological experiments to explore the neural (midbrain) bases of AM perception in an avian speech mimic, the budgerigar (Melopsittacus undulatus). Behavioral AM sensitivity was quantified using operant conditioning procedures. Neural AM sensitivity was studied using chronically implanted microelectrodes in awake, unrestrained birds. Average discharge rates of multiunit recording sites in the budgerigar midbrain were insufficient to explain behavioral sensitivity to modulation frequencies <100 Hz for both tone- and noise-carrier stimuli, even with optimal pooling of information across recording sites. Neural envelope synchrony, in contrast, could explain behavioral performance for both carrier types across the full range of modulation frequencies studied (16-512 Hz). The results suggest that envelope synchrony in the budgerigar midbrain may underlie behavioral sensitivity to AM. Behavioral AM sensitivity based on synchrony in the budgerigar, which contrasts with rate-correlated behavioral performance in rabbits, raises the possibility that envelope synchrony, rather than average discharge rate, might also underlie AM perception in other species with sensitive AM detection abilities, including humans. These results highlight the importance of synchrony coding of envelope structure in the inferior colliculus. Furthermore, they underscore potential benefits of devices (e.g., midbrain implants) that evoke robust neural synchrony.
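Neural envelope synchrony of the kind measured above is conventionally quantified by vector strength: the resultant length of spike phases relative to the modulation cycle. A minimal sketch with fabricated spike times:

```python
import math

def vector_strength(spike_times, mod_freq):
    """Resultant length of spike phases re: the modulation cycle (1 = perfect locking, 0 = none)."""
    if not spike_times:
        return 0.0
    c = sum(math.cos(2 * math.pi * mod_freq * t) for t in spike_times) / len(spike_times)
    s = sum(math.sin(2 * math.pi * mod_freq * t) for t in spike_times) / len(spike_times)
    return math.hypot(c, s)

# Fabricated spikes locked to a 100-Hz envelope (one spike per 10-ms cycle)
# versus spikes spread across the modulation cycle:
locked = [i * 0.010 for i in range(50)]
spread = [i * 0.0013 for i in range(50)]
print(round(vector_strength(locked, 100.0), 3))  # → 1.0
print(vector_strength(locked, 100.0) > vector_strength(spread, 100.0))  # → True
```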
Affiliation(s)
- Kenneth S Henry
- Department of Biomedical Engineering, University of Rochester, Rochester, New York
- Kristina S Abrams
- Department of Neuroscience, University of Rochester, Rochester, New York
- Fabio Idrobo
- Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts; and Universidad de Los Andes, Bogotá, Colombia
- Laurel H Carney
- Department of Biomedical Engineering, University of Rochester, Rochester, New York; Department of Neuroscience, University of Rochester, Rochester, New York

40
Carney LH, Kim DO, Kuwada S. Speech Coding in the Midbrain: Effects of Sensorineural Hearing Loss. Adv Exp Med Biol 2016; 894:427-435. [PMID: 27080684 DOI: 10.1007/978-3-319-25474-6_45]
Abstract
In response to voiced speech sounds, auditory-nerve (AN) fibres phase-lock to harmonics near their best frequency (BF) and to the fundamental frequency (F0). Due to nonlinearities in the healthy ear, phase-locking in each frequency channel is dominated either by a single harmonic, for channels tuned near formants, or by F0, for channels between formants. The alternating dominance of these factors sets up a robust pattern of F0-synchronized rate across BF. This profile of a temporally coded measure is transformed into a mean-rate profile in the midbrain (inferior colliculus, IC), where neurons are sensitive to low-frequency fluctuations. In the impaired ear, the F0-synchronized rate profile is affected by several factors: reduced synchrony capture decreases the dominance of a single harmonic near BF; elevated thresholds reduce the effect of rate saturation, resulting in increased F0 synchrony; and wider peripheral tuning results in a wider-band envelope with reduced F0 amplitude. In general, sensorineural hearing loss reduces the contrast in AN F0-synchronized rates across BF. Computational models of AN and IC neurons illustrate how hearing loss would affect the F0-synchronized rate profiles set up in response to voiced speech sounds.
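The F0-synchronized rate for a channel can be computed as the magnitude of the Fourier component of its PSTH at F0. The sketch below uses two fabricated PSTHs to illustrate the contrast described above: a channel captured by a formant harmonic (flat PSTH, low F0 synchrony) versus a between-formant channel (PSTH fluctuating at F0):

```python
import math

def f0_sync_rate(psth, fs, f0):
    """Magnitude (spikes/s) of the PSTH's Fourier component at f0."""
    n = len(psth)
    re = sum(psth[t] * math.cos(2 * math.pi * f0 * t / fs) for t in range(n))
    im = sum(psth[t] * math.sin(2 * math.pi * f0 * t / fs) for t in range(n))
    return 2.0 * math.hypot(re, im) / n

fs, f0, n = 10000, 100.0, 1000
# Channel near a formant: synchrony capture by one harmonic -> nearly flat PSTH.
formant_ch = [200.0] * n
# Channel between formants: beating harmonics -> PSTH fluctuates at F0.
between_ch = [200.0 + 150.0 * math.cos(2 * math.pi * f0 * t / fs) for t in range(n)]
print(round(f0_sync_rate(formant_ch, fs, f0)), round(f0_sync_rate(between_ch, fs, f0)))  # → 0 150
```

In the NFC framework, hearing loss compresses the difference between these two channel types, flattening the profile across BF.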
Affiliation(s)
- Laurel H Carney
- Departments of Biomedical Engineering, Neurobiology & Anatomy, Electrical & Computer Engineering, University of Rochester, Rochester, NY, USA
- Duck O Kim
- Department of Neuroscience, University of Connecticut Health Center, Farmington, CT, USA
- Shigeyuki Kuwada
- Department of Neuroscience, University of Connecticut Health Center, Farmington, CT, USA

41
Suppression Measured from Chinchilla Auditory-Nerve-Fiber Responses Following Noise-Induced Hearing Loss: Adaptive-Tracking and Systems-Identification Approaches. Adv Exp Med Biol 2016; 894:285-295. [PMID: 27080669 PMCID: PMC5069700 DOI: 10.1007/978-3-319-25474-6_30]
Abstract
The compressive nonlinearity of cochlear signal transduction, reflecting outer-hair-cell function, manifests as suppressive spectral interactions, e.g., two-tone suppression. Moreover, for broadband sounds, there are multiple interactions between frequency components. These frequency-dependent nonlinearities are important for neural coding of complex sounds, such as speech. Acoustic-trauma-induced outer-hair-cell damage is associated with loss of nonlinearity, which auditory prostheses attempt to restore with, e.g., "multi-channel dynamic compression" algorithms. Neurophysiological data on suppression in hearing-impaired (HI) mammals are limited. We present data on firing-rate suppression measured in auditory-nerve-fiber responses in a chinchilla model of noise-induced hearing loss, and in normal-hearing (NH) controls at equal sensation level. HI animals had elevated single-fiber excitatory thresholds (by ~20-40 dB), broadened frequency tuning, and reduced-magnitude distortion-product otoacoustic emissions, consistent with mixed inner- and outer-hair-cell pathology. We characterized suppression using two approaches: adaptive tracking of two-tone-suppression threshold (62 NH and 35 HI fibers), and Wiener-kernel analyses of responses to broadband noise (91 NH and 148 HI fibers). Suppression-threshold tuning curves showed sensitive low-side suppression for NH and HI animals. High-side suppression thresholds were elevated in HI animals, to the same extent as excitatory thresholds. We factored second-order Wiener kernels into excitatory and suppressive sub-kernels to quantify the relative strength of suppression, and found a small decrease in suppression in HI fibers that correlated with broadened tuning. These data will help guide novel amplification strategies, particularly for complex listening situations (e.g., speech in noise), in which current hearing aids struggle to restore intelligibility.
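The adaptive-tracking approach mentioned above typically uses a staircase rule; a 2-down/1-up track, for example, converges on the 70.7%-correct point. The sketch below is generic (a deterministic toy observer, not fiber data), and the reversal-averaging convention shown is one of several in use:

```python
def two_down_one_up(responds, start_level, step, n_reversals=6):
    """2-down/1-up staircase (converges near the 70.7%-correct point).
    `responds(level)` is the observer; the threshold estimate is the mean of the
    levels at which the track reversed direction."""
    level, streak, direction, reversals = start_level, 0, -1, []
    while len(reversals) < n_reversals:
        if responds(level):
            streak += 1
            if streak == 2:          # two correct in a row -> make the task harder
                streak = 0
                if direction == +1:  # direction changed: record a reversal
                    reversals.append(level)
                direction = -1
                level -= step
        else:                        # one miss -> make the task easier
            streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals) / len(reversals)

# Deterministic toy observer whose true threshold is 43 dB:
print(two_down_one_up(lambda lv: lv >= 43, start_level=60, step=2))  # → 43.0
```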
42
Li Y, Ropp TJF, May BJ, Young ED. Dorsal Cochlear Nucleus of the Rat: Representation of Complex Sounds in Ears Damaged by Acoustic Trauma. J Assoc Res Otolaryngol 2015; 16:487-505. [PMID: 25967754 PMCID: PMC4488165 DOI: 10.1007/s10162-015-0522-z]
Abstract
Acoustic trauma damages the cochlea but secondarily modifies circuits of the central auditory system. Changes include decreases in inhibitory neurotransmitter systems, degeneration and rewiring of synaptic circuits, and changes in neural activity. Little is known about the consequences of these changes for the representation of complex sounds. Here, we show data from the dorsal cochlear nucleus (DCN) of rats with a moderate high-frequency hearing loss following acoustic trauma. Single-neuron recording was used to estimate the organization of neurons' receptive fields, the balance of inhibition and excitation, and the representation of the spectra of complex broadband stimuli. The complex stimuli had random spectral shapes (RSSs), and the responses were fit with a model that allows the quality of the representation and its degree of linearity to be estimated. Tone response maps of DCN neurons in rat are like those in other species investigated previously, suggesting the same general organization of this nucleus. Following acoustic trauma, abnormal response types appeared. These can be interpreted as reflecting degraded tuning in auditory nerve fibers plus loss of inhibitory inputs in DCN. Abnormal types are somewhat more prevalent at later times (103-376 days) following the exposure, but not significantly so. Inhibition became weaker in post-trauma neurons that retained inhibitory responses but also disappeared in many neurons. The quality of the representation of spectral shape, measured by sensitivity to the spectral shapes of RSS stimuli, was decreased following trauma; in fact, neurons with abnormal response types responded mainly to overall stimulus level, and not spectral shape.
Affiliation(s)
- Yang Li
- Department of Biomedical Engineering, Center for Hearing and Balance, Johns Hopkins University, 505 Traylor Bldg., 720 Rutland Ave., Baltimore, MD 21205 USA
- Tessa-Jonne F. Ropp
- Department of Biomedical Engineering, Center for Hearing and Balance, Johns Hopkins University, 505 Traylor Bldg., 720 Rutland Ave., Baltimore, MD 21205 USA
- Bradford J. May
- Department of Otolaryngology-HNS, Center for Hearing and Balance, Johns Hopkins University, Baltimore, MD 21205 USA
- Eric D. Young
- Department of Biomedical Engineering, Center for Hearing and Balance, Johns Hopkins University, 505 Traylor Bldg., 720 Rutland Ave., Baltimore, MD 21205 USA

43
Speech Coding in the Brain: Representation of Vowel Formants by Midbrain Neurons Tuned to Sound Fluctuations. eNeuro 2015; 2:eN-TNC-0004-15. [PMID: 26464993 PMCID: PMC4596011 DOI: 10.1523/eneuro.0004-15.2015]
Abstract
Current models for neural coding of vowels are typically based on linear descriptions of the auditory periphery, and fail at high sound levels and in background noise. These models rely on either auditory nerve discharge rates or phase locking to temporal fine structure. However, both discharge rates and phase locking saturate at moderate to high sound levels, and phase locking is degraded in the CNS at middle to high frequencies. The fact that speech intelligibility is robust over a wide range of sound levels is problematic for codes that deteriorate as the sound level increases. Additionally, a successful neural code must function for speech in background noise at levels that are tolerated by listeners. The model presented here resolves these problems, and incorporates several key response properties of the nonlinear auditory periphery, including saturation, synchrony capture, and phase locking to both fine structure and envelope temporal features. The model also includes the properties of the auditory midbrain, where discharge rates are tuned to amplitude fluctuation rates. The nonlinear peripheral response features create contrasts in the amplitudes of low-frequency neural rate fluctuations across the population. These patterns of fluctuations result in a response profile in the midbrain that encodes vowel formants over a wide range of levels and in background noise. The hypothesized code is supported by electrophysiological recordings from the inferior colliculus of awake rabbits. This model provides information for understanding the structure of cross-linguistic vowel spaces, and suggests strategies for automatic formant detection and speech enhancement for listeners with hearing loss.
44
Behavioral and neural discrimination of speech sounds after moderate or intense noise exposure in rats. Ear Hear 2015; 35:e248-61. [PMID: 25072238 DOI: 10.1097/aud.0000000000000062]
Abstract
OBJECTIVES Hearing loss is a commonly experienced disability in a variety of populations, including veterans and the elderly, and can often cause significant impairment in the ability to understand spoken language. In this study, we tested the hypothesis that neural and behavioral responses to speech would be differentially impaired in an animal model after two forms of hearing loss. DESIGN Sixteen female Sprague-Dawley rats were exposed to broadband noise that was either moderate or intense. In nine of these rats, auditory cortex recordings were taken 4 weeks after noise exposure (NE). The other seven were pretrained on a speech-sound discrimination task prior to NE and were then tested on the same task after hearing loss. RESULTS Following intense NE, rats had few neural responses to speech stimuli; they were able to detect speech sounds but were no longer able to discriminate between them. Following moderate NE, rats had reorganized cortical maps and altered neural responses to speech stimuli but were still able to accurately discriminate between similar speech sounds during behavioral testing. CONCLUSIONS These results suggest that rats are able to adjust to the neural changes after moderate NE and discriminate speech sounds, but they are not able to recover behavioral abilities after intense NE. Animal models could help clarify the adaptive and pathological neural changes that contribute to speech processing in hearing-impaired populations and could be used to test potential behavioral and pharmacological therapies.
45
The Role of Temporal Envelope and Fine Structure in Mandarin Lexical Tone Perception in Auditory Neuropathy Spectrum Disorder. PLoS One 2015; 10:e0129710. [PMID: 26052707 PMCID: PMC4459992 DOI: 10.1371/journal.pone.0129710]
Abstract
Temporal information in a signal can be partitioned into the temporal envelope (E) and the temporal fine structure (FS). Fine structure is important for lexical tone perception in normal-hearing (NH) listeners, whereas listeners with sensorineural hearing loss (SNHL) have an impaired ability to use FS for lexical tone perception due to reduced frequency resolution. The present study aimed to assess which acoustic aspect (E or FS) plays the more important role in lexical tone perception for subjects with auditory neuropathy spectrum disorder (ANSD), and to determine whether a deficit in temporal resolution or in frequency resolution leads to the more detrimental effects on FS processing in pitch perception. Fifty-eight native Mandarin Chinese-speaking subjects (27 with ANSD, 16 with SNHL, and 15 with NH) were assessed for (1) their ability to recognize lexical tones using acoustic E or FS cues with the "auditory chimera" technique, (2) temporal resolution as measured by the temporal gap detection (TGD) threshold, and (3) frequency resolution as measured by the Q10dB values of psychophysical tuning curves. Overall, 26.5%, 60.2%, and 92.1% of lexical tone responses were consistent with the FS cues for listeners with ANSD, SNHL, and NH, respectively. The mean TGD threshold was significantly higher for listeners with ANSD (11.9 ms) than for SNHL (4.0 ms; p < 0.001) and NH (3.9 ms; p < 0.001) listeners, with no significant difference between SNHL and NH listeners. In contrast, the mean Q10dB for listeners with SNHL (1.8±0.4) was significantly lower than that for ANSD (3.5±1.0; p < 0.001) and NH (3.4±0.9; p < 0.001) listeners, with no significant difference between ANSD and NH listeners. These results suggest that reduced temporal resolution, as opposed to reduced frequency selectivity, leads to greater degradation of FS processing for pitch perception in ANSD subjects.
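The "auditory chimera" technique pairs the Hilbert envelope of one signal with the fine structure (cosine of the instantaneous Hilbert phase) of another, normally within each of several analysis bands. The single-band, pure-Python sketch below (DFT-based Hilbert transform, fabricated test tones) illustrates the construction:

```python
import cmath, math

def analytic(x):
    """Analytic signal via the DFT: zero negative frequencies, double positive ones."""
    n = len(x)
    X = [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) for k in range(n)]
    for k in range(1, n):
        if k < n // 2:
            X[k] *= 2
        elif k > n // 2:
            X[k] = 0
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def chimera(env_source, tfs_source):
    """Single-band chimera: Hilbert envelope of one signal imposed on the
    fine structure (cosine of instantaneous phase) of the other."""
    a, b = analytic(env_source), analytic(tfs_source)
    return [abs(ai) * math.cos(cmath.phase(bi)) for ai, bi in zip(a, b)]

# Toy check: combining an AM tone's envelope with its own carrier's fine
# structure reconstructs the AM tone.
N = 64
am_tone = [(1 + 0.5 * math.cos(2 * math.pi * 4 * t / N)) * math.cos(2 * math.pi * 16 * t / N)
           for t in range(N)]
carrier = [math.cos(2 * math.pi * 16 * t / N) for t in range(N)]
out = chimera(am_tone, carrier)
print(max(abs(o - s) for o, s in zip(out, am_tone)) < 1e-8)  # → True
```

A full chimera stimulus repeats this swap in each band of a filterbank and sums the bands; lexical tone responses then reveal whether listeners follow the E or the FS source.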
46
Occelli F, Suied C, Pressnitzer D, Edeline JM, Gourévitch B. A Neural Substrate for Rapid Timbre Recognition? Neural and Behavioral Discrimination of Very Brief Acoustic Vowels. Cereb Cortex 2015; 26:2483-2496. [PMID: 25947234 DOI: 10.1093/cercor/bhv071]
Abstract
The timbre of a sound plays an important role in our ability to discriminate between behaviorally relevant auditory categories, such as different vowels in speech. Here, we investigated, in the primary auditory cortex (A1) of anesthetized guinea pigs, the neural representation of vowels with impoverished timbre cues. Five different vowels were presented with durations ranging from 2 to 128 ms. A psychophysical experiment involving human listeners showed that identification performance was near ceiling for the longer durations and degraded close to chance level for the shortest durations. This was likely due to spectral splatter, which reduced the contrast between the spectral profiles of the vowels at short durations. Effects of vowel duration on cortical responses were well predicted by the linear frequency responses of A1 neurons. Using mutual information, we found that auditory cortical neurons in the guinea pig could be used to reliably identify several vowels for all durations. Information carried by each cortical site was low on average, but the population code was accurate even for durations where human behavioral performance was poor. These results suggest that a place population code is available at the level of A1 to encode spectral profile cues for even very short sounds.
Affiliation(s)
- F Occelli
- UMR CNRS 9197, Institut de NeuroScience Paris-Saclay (NeuroPSI)
- Université Paris-Sud, Institut de NeuroScience Paris-Saclay (NeuroPSI) 91405 Orsay Cedex, France
- C Suied
- Département Action et Cognition en Situation Opérationnelle, Institut de Recherche Biomédicale des Armées, 91223 Brétigny sur Orge, France
- D Pressnitzer
- UMR CNRS 8248, LSP
- DEC, LSP Ecole Normale Supérieure, 29 rue d'Ulm, 75005 Paris, France
- J-M Edeline
- UMR CNRS 9197, Institut de NeuroScience Paris-Saclay (NeuroPSI)
- Université Paris-Sud, Institut de NeuroScience Paris-Saclay (NeuroPSI) 91405 Orsay Cedex, France
- B Gourévitch
- UMR CNRS 9197, Institut de NeuroScience Paris-Saclay (NeuroPSI)
- Université Paris-Sud, Institut de NeuroScience Paris-Saclay (NeuroPSI) 91405 Orsay Cedex, France

47
Heil P, Peterson AJ. Basic response properties of auditory nerve fibers: a review. Cell Tissue Res 2015; 361:129-58. [PMID: 25920587 DOI: 10.1007/s00441-015-2177-9]
Abstract
All acoustic information from the periphery is encoded in the timing and rates of spikes in the population of spiral ganglion neurons projecting to the central auditory system. Considerable progress has been made in characterizing the physiological properties of type-I and type-II primary auditory afferents and understanding the basic properties of type-I afferents in response to sounds. Here, we review some of these properties, with emphasis placed on issues such as the stochastic nature of spike timing during spontaneous and driven activity, frequency tuning curves, spike-rate-versus-level functions, dynamic-range and spike-rate adaptation, and phase locking to stimulus fine structure and temporal envelope. We also review effects of acoustic trauma on some of these response properties.
Affiliation(s)
- Peter Heil
- Leibniz Institute for Neurobiology, Brenneckestrasse 6, 39118 Magdeburg, Germany

48
Bidelman GM, Alain C. Hierarchical neurocomputations underlying concurrent sound segregation: Connecting periphery to percept. Neuropsychologia 2015; 68:38-50. [DOI: 10.1016/j.neuropsychologia.2014.12.020]
49
Moore BC. Dead regions in the cochlea: diagnosis, perceptual consequences, and implications for the fitting of hearing aids. Trends Amplif 2001; 5:1-34. [PMID: 25425895 DOI: 10.1177/108471380100500102]
Abstract
Hearing impairment is often associated with damage to the hair cells in the cochlea. Sometimes there may be complete loss of function of inner hair cells (IHCs) over a certain region of the cochlea; this is called a "dead region". The region can be defined in terms of the range of characteristic frequencies (CFs) of the IHCs and/or neurons immediately adjacent to the dead region. This paper reviews the following topics: the effect of dead regions on the audiogram; methods for the detection and delineation of dead regions based on psychophysical tuning curves (PTCs) and on the measurement of thresholds for pure tones in "threshold equalizing noise" (TEN); effects of dead regions on speech perception; effects of dead regions on the perception of tones; implications of dead regions for fitting hearing aids. The main conclusions are: (1) Dead regions may be relatively common in people with moderate-to-severe sensorineural hearing loss; (2) Dead regions cannot be reliably diagnosed from the audiogram; (3) PTCs provide a useful way of detecting dead regions and defining their boundaries. However, the determination of PTCs is probably too time-consuming to be used for routine diagnosis of dead regions in clinical practice; (4) The measurement of detection thresholds for pure tones in TEN provides a simple method for clinical diagnosis of dead regions; (5) Pure tones with frequencies falling in a dead region do not evoke clear pitch sensations (pitch matching is highly variable) and the perceived pitch is sometimes, but not always, different from "normal". However, ratings of pitch clarity cannot be used as a reliable indicator of a dead region; (6) Amplification of frequencies well inside a high-frequency dead region usually does not improve speech intelligibility, and may sometimes impair it. 
However, there may be some benefit in amplifying frequencies up to 50 to 100% above the estimated low-frequency edge of a high-frequency dead region; (7) The optimal form of amplification for people with low-frequency dead regions remains somewhat unclear. There may be some benefit from avoiding the amplification of frequencies well inside a dead region; (8) Patients with extensive dead regions are likely to get less benefit from hearing aids than patients without dead regions; (9) For patients with diagnosed dead regions at high frequencies, consideration should be given to use of a hearing aid incorporating frequency transposition and/or compression.
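The TEN-test diagnosis in point (4) rests on a simple decision rule, commonly stated as two 10-dB margins; the sketch below encodes that rule, with the exact criteria treated as indicative rather than definitive:

```python
def flags_dead_region(abs_threshold_db, ten_level_db, masked_threshold_db):
    """TEN-test screening rule (criteria as commonly stated; indicative only):
    flag a dead region when the masked threshold in TEN is at least 10 dB above
    the TEN level (per ERB) and at least 10 dB above the absolute threshold."""
    return (masked_threshold_db >= ten_level_db + 10
            and masked_threshold_db >= abs_threshold_db + 10)

print(flags_dead_region(70, 70, 85))  # → True  (masked threshold 15 dB above the TEN level)
print(flags_dead_region(40, 70, 75))  # → False (only 5 dB above the TEN level)
```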
Affiliation(s)
- B C Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK
|
50
|
Optimal combination of neural temporal envelope and fine structure cues to explain speech identification in background noise. J Neurosci 2014; 34:12145-54. [PMID: 25186758 DOI: 10.1523/jneurosci.1025-14.2014] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
The dichotomy between acoustic temporal envelope (ENV) and fine structure (TFS) cues has stimulated numerous studies over the past decade to understand the relative role of acoustic ENV and TFS in human speech perception. Such acoustic temporal speech cues produce distinct neural discharge patterns at the level of the auditory nerve, yet little is known about the central neural mechanisms underlying the dichotomy in speech perception between neural ENV and TFS cues. We explored the question of how the peripheral auditory system encodes neural ENV and TFS cues in steady or fluctuating background noise, and how the central auditory system combines these forms of neural information for speech identification. We sought to address this question by (1) measuring sentence identification in background noise for human subjects as a function of the degree of available acoustic TFS information and (2) examining the optimal combination of neural ENV and TFS cues to explain human speech perception performance using computational models of the peripheral auditory system and central neural observers. Speech-identification performance by human subjects decreased as the acoustic TFS information was degraded in the speech signals. The model predictions best matched human performance when a greater emphasis was placed on neural ENV coding rather than neural TFS. However, neural TFS cues were necessary to account for the full effect of background-noise modulations on human speech-identification performance.
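The "optimal combination" of neural ENV and TFS cues examined in this modeling work can be illustrated generically: for statistically independent cues, an ideal observer's combined sensitivity is the root-sum-square of the individual d' values, and a weight on the ENV term can capture the reported emphasis on envelope coding. The weighting scheme below is a hypothetical sketch for illustration, not the paper's actual central-observer model.

```python
import math

def combined_dprime(d_env: float, d_tfs: float, w_env: float = 0.8) -> float:
    """Weighted root-sum-square combination of two independent cues.

    With w_env = 0.5 this reduces (up to a constant factor) to the
    classic optimal combination d' = sqrt(d_env**2 + d_tfs**2);
    w_env > 0.5 emphasizes the envelope cue, in the spirit of the
    finding that model predictions matched human performance best
    with greater weight on neural ENV coding.
    """
    return math.sqrt(w_env * d_env ** 2 + (1.0 - w_env) * d_tfs ** 2)
```

With w_env = 0.8, a given amount of envelope sensitivity contributes more to combined performance than the same amount of TFS sensitivity, while TFS still contributes, consistent with the abstract's conclusion that TFS cues remain necessary to account for noise-modulation effects.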
|