1. Saddler MR, McDermott JH. Models optimized for real-world tasks reveal the task-dependent necessity of precise temporal coding in hearing. Nat Commun 2024;15:10590. PMID: 39632854; PMCID: PMC11618365; DOI: 10.1038/s41467-024-54700-5.

Abstract
Neurons encode information in the timing of their spikes in addition to their firing rates. Spike timing is particularly precise in the auditory nerve, where action potentials phase lock to sound with sub-millisecond precision, but its behavioral relevance remains uncertain. We optimized machine learning models to perform real-world hearing tasks with simulated cochlear input, assessing the precision of auditory nerve spike timing needed to reproduce human behavior. Models with high-fidelity phase locking exhibited more human-like sound localization and speech perception than models without, consistent with an essential role in human hearing. However, the temporal precision needed to reproduce human-like behavior varied across tasks, as did the precision that benefited real-world task performance. These effects suggest that perceptual domains incorporate phase locking to different extents depending on the demands of real-world hearing. The results illustrate how optimizing models for realistic tasks can clarify the role of candidate neural codes in perception.
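The phase locking central to this study is conventionally quantified by vector strength: spikes are projected onto the unit circle at their stimulus phase and the resultant is averaged. The sketch below is illustrative only (the function name and spike trains are invented for the example, not the paper's analysis code):

```python
import math

def vector_strength(spike_times, freq):
    """Vector strength: 1 = perfect phase locking to `freq`, ~0 = none."""
    if not spike_times:
        return 0.0
    # Project each spike onto the unit circle at its stimulus phase.
    x = sum(math.cos(2 * math.pi * freq * t) for t in spike_times)
    y = sum(math.sin(2 * math.pi * freq * t) for t in spike_times)
    return math.hypot(x, y) / len(spike_times)

# Spikes locked to one phase of a 500 Hz tone give vector strength 1.0;
# spikes spread uniformly over the cycle give ~0.
locked = [k / 500.0 for k in range(100)]
uniform = [k / 50000.0 for k in range(100)]
```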
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA, USA.
- Program in Speech and Hearing Biosciences and Technology, Harvard, Cambridge, MA, USA.
2. Farhadi A, Jennings SG, Strickland EA, Carney LH. Subcortical auditory model including efferent dynamic gain control with inputs from cochlear nucleus and inferior colliculus. J Acoust Soc Am 2023;154:3644-3659. PMID: 38051523; PMCID: PMC10836963; DOI: 10.1121/10.0022578.

Abstract
An auditory model has been developed with a time-varying, gain-control signal based on the physiology of the efferent system and subcortical neural pathways. The medial olivocochlear (MOC) efferent stage of the model receives excitatory projections from fluctuation-sensitive model neurons of the inferior colliculus (IC) and wide-dynamic-range model neurons of the cochlear nucleus. The response of the model MOC stage dynamically controls cochlear gain via simulated outer hair cells. In response to amplitude-modulated (AM) noise, firing rates of most IC neurons with band-enhanced modulation transfer functions in awake rabbits increase over a time course consistent with the dynamics of the MOC efferent feedback. These changes in the rates of IC neurons in awake rabbits were employed to adjust the parameters of the efferent stage of the proposed model. Responses of the proposed model to AM noise were able to simulate the increasing IC rate over time, whereas the model without the efferent system did not show this trend. The proposed model with efferent gain control provides a powerful tool for testing hypotheses, shedding insight on mechanisms in hearing, specifically those involving the efferent system.
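The dynamic gain control described above can be caricatured as a first-order feedback loop: a rate-driven target pulls the cochlear gain down over time. All constants below (tau, strength) are made up purely to illustrate the dynamics and are not the fitted parameters of the Farhadi et al. model:

```python
# Toy first-order efferent gain dynamics: gain g relaxes toward a target
# set by the driving rate, so sustained input slowly reduces cochlear gain.
def simulate_gain(rate, dt=0.001, tau=0.2, g0=1.0, strength=0.8):
    g, out = g0, []
    for r in rate:
        target = g0 * (1.0 - strength * r)  # stronger drive -> lower gain
        g += (target - g) * (dt / tau)      # exponential approach
        out.append(g)
    return out

# Constant suprathreshold drive for 2 s: gain decays from 1.0 toward 0.2,
# mirroring the slow MOC-driven rate changes described in the abstract.
gains = simulate_gain([1.0] * 2000)
```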
Affiliation(s)
- Afagh Farhadi
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14642, USA
- Skyler G Jennings
- Department of Communication Sciences and Disorders, University of Utah, Salt Lake City, Utah 84112, USA
- Elizabeth A Strickland
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Laurel H Carney
- Department of Biomedical Engineering, University of Rochester, Rochester, New York 14642, USA
3. Shofner WP. Cochlear tuning and the peripheral representation of harmonic sounds in mammals. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2023;209:145-161. PMID: 35867137; DOI: 10.1007/s00359-022-01560-3.

Abstract
Albert Feng was a prominent comparative neurophysiologist whose research provided numerous contributions towards understanding how the spectral and temporal characteristics of vocalizations underlie sound communication in frogs and bats. The present study is dedicated to Al's memory and compares the spectral and temporal representations of stochastic, complex sounds which underlie the perception of pitch strength in humans and chinchillas. Specifically, the pitch strengths of these stochastic sounds differ between humans and chinchillas, suggesting that humans and chinchillas may be using different cues. Outputs of auditory filterbank models based on human and chinchilla cochlear tuning were examined. Excitation patterns of harmonics are enhanced in humans as compared with chinchillas. In contrast, summary correlograms are degraded in humans as compared with chinchillas. Comparing summary correlograms and excitation patterns with corresponding behavioral data on pitch strength suggests that the dominant cue for pitch strength in humans is spectral (i.e., harmonic) structure, whereas the dominant cue for chinchillas is temporal (i.e., envelope) structure. The results support arguments that the broader cochlear tuning in non-human mammals emphasizes temporal cues for pitch perception, whereas the sharper cochlear tuning in humans emphasizes spectral cues.
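The summary-correlogram cue compared above reduces, in its simplest form, to the normalized autocorrelation at the lag of the candidate pitch period. This toy version (names and parameters are illustrative; the study used filterbank-based correlograms, not a single-channel autocorrelation) shows the idea:

```python
import math

def autocorr_at_lag(x, lag):
    """Normalized autocorrelation at one lag: ~1 for a signal periodic at `lag`."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    den = sum(v * v for v in x[:n]) or 1.0
    return num / den

fs, f0 = 16000, 200.0
periodic = [math.sin(2 * math.pi * f0 * i / fs) for i in range(1600)]
lag = round(fs / f0)  # 80 samples = one period of 200 Hz
strength = autocorr_at_lag(periodic, lag)  # near 1 for a fully periodic sound
```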
Affiliation(s)
- William P Shofner
- Department of Speech, Language and Hearing Sciences, Indiana University, 2631 East Discovery Parkway, Bloomington, IN, 47408, USA.
4. Schilling A, Gerum R, Metzner C, Maier A, Krauss P. Intrinsic noise improves speech recognition in a computational model of the auditory pathway. Front Neurosci 2022;16:908330. PMID: 35757533; PMCID: PMC9215117; DOI: 10.3389/fnins.2022.908330.

Abstract
Noise is generally considered to harm information processing performance. However, in the context of stochastic resonance, noise has been shown to improve the detection of weak subthreshold signals, and it has been proposed that the brain might actively exploit this phenomenon. Especially within the auditory system, recent studies suggest that intrinsic noise plays a key role in signal processing and might even correspond to increased spontaneous neuronal firing rates observed in early processing stages of the auditory brain stem and cortex after hearing loss. Here we present a computational model of the auditory pathway based on a deep neural network, trained on speech recognition. We simulate different levels of hearing loss and investigate the effect of intrinsic noise. Remarkably, speech recognition after hearing loss actually improves with additional intrinsic noise. This surprising result indicates that intrinsic noise might not only play a crucial role in human auditory processing, but might even be beneficial for contemporary machine learning approaches.
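Stochastic resonance as described above can be demonstrated in a few lines: a subthreshold sine never crosses a detection threshold on its own, but moderate added noise pushes it over. This is a minimal illustration with arbitrary parameters, not the paper's DNN model:

```python
import math, random

def threshold_crossings(signal_amp, noise_sd, threshold=1.0, n=20000, seed=1):
    """Count upward threshold crossings of a subthreshold sine plus noise."""
    random.seed(seed)
    crossings, prev = 0, 0.0
    for i in range(n):
        x = signal_amp * math.sin(2 * math.pi * i / 100) + random.gauss(0, noise_sd)
        if prev <= threshold < x:
            crossings += 1
        prev = x
    return crossings

quiet = threshold_crossings(0.5, 0.0)  # subthreshold signal alone: never detected
noisy = threshold_crossings(0.5, 0.3)  # moderate noise lets the signal through
```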
Affiliation(s)
- Achim Schilling
- Laboratory of Sensory and Cognitive Neuroscience, Aix-Marseille University, Marseille, France
- Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
- Cognitive Computational Neuroscience Group, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
- Richard Gerum
- Department of Physics and Center for Vision Research, York University, Toronto, ON, Canada
- Claus Metzner
- Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
- Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
- Andreas Maier
- Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
- Patrick Krauss
- Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
- Cognitive Computational Neuroscience Group, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
- Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
- Linguistics Lab, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen, Germany
5. Osses Vecchi A, Varnet L, Carney LH, Dau T, Bruce IC, Verhulst S, Majdak P. A comparative study of eight human auditory models of monaural processing. Acta Acust 2022;6:17. PMID: 36325461; PMCID: PMC9625898; DOI: 10.1051/aacus/2022008.

Abstract
A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.
Affiliation(s)
- Alejandro Osses Vecchi
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Léo Varnet
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Laurel H. Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY 14642, USA
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Ian C. Bruce
- Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada
- Sarah Verhulst
- Hearing Technology group, WAVES, Department of Information Technology, Ghent University, 9000 Ghent, Belgium
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
6. Rutherford MA, von Gersdorff H, Goutman JD. Encoding sound in the cochlea: from receptor potential to afferent discharge. J Physiol 2021;599:2527-2557. PMID: 33644871; PMCID: PMC8127127; DOI: 10.1113/jp279189.

Abstract
Ribbon-class synapses in the ear achieve analog to digital transformation of a continuously graded membrane potential to all-or-none spikes. In mammals, several auditory nerve fibres (ANFs) carry information from each inner hair cell (IHC) to the brain in parallel. Heterogeneity of transmission among synapses contributes to the diversity of ANF sound-response properties. In addition to the place code for sound frequency and the rate code for sound level, there is also a temporal code. In series with cochlear amplification and frequency tuning, neural representation of temporal cues over a broad range of sound levels enables auditory comprehension in noisy multi-speaker settings. The IHC membrane time constant introduces a low-pass filter that attenuates fluctuations of the receptor potential above 1-2 kHz. The ANF spike generator adds a high-pass filter via its depolarization-rate threshold that rejects slow changes in the postsynaptic potential and its phasic response property that ensures one spike per depolarization. Synaptic transmission involves several stochastic subcellular processes between IHC depolarization and ANF spike generation, introducing delay and jitter that limits the speed and precision of spike timing. ANFs spike at a preferred phase of periodic sounds in a process called phase-locking that is limited to frequencies below a few kilohertz by both the IHC receptor potential and the jitter in synaptic transmission. During phase-locking to periodic sounds of increasing intensity, faster and facilitated activation of synaptic transmission and spike generation may be offset by presynaptic depletion of synaptic vesicles, resulting in relatively small changes in response phase. Here we review encoding of spike-timing at cochlear ribbon synapses.
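The low-pass filtering attributed above to the IHC membrane time constant behaves, to first order, like an RC filter, whose magnitude response makes the 1-2 kHz attenuation concrete. A hedged sketch (the corner frequency is chosen for illustration within the range quoted in the abstract):

```python
import math

def lowpass_gain(f_hz, fc_hz=2000.0):
    """Magnitude response of a first-order low-pass (RC) filter."""
    return 1.0 / math.sqrt(1.0 + (f_hz / fc_hz) ** 2)

# Receptor-potential fine structure above the corner is progressively lost:
print(round(lowpass_gain(2000.0), 3))  # -> 0.707 (the -3 dB point)
print(round(lowpass_gain(8000.0), 3))  # -> 0.243 (strongly attenuated)
```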
Affiliation(s)
- Mark A. Rutherford
- Department of Otolaryngology, Washington University School of Medicine, St. Louis, Missouri 63110
- Henrique von Gersdorff
- Vollum Institute, Oregon Hearing Research Center, Oregon Health and Sciences University, Portland, Oregon 97239
7. Haro S, Smalt CJ, Ciccarelli GA, Quatieri TF. Deep neural network model of hearing-impaired speech-in-noise perception. Front Neurosci 2020;14:588448. PMID: 33384579; PMCID: PMC7770113; DOI: 10.3389/fnins.2020.588448.

Abstract
Many individuals struggle to understand speech in listening scenarios that include reverberation and background noise. An individual's ability to understand speech arises from a combination of peripheral auditory function, central auditory function, and general cognitive abilities. The interaction of these factors complicates the prescription of treatment or therapy to improve hearing function. Damage to the auditory periphery can be studied in animals; however, this method alone is not enough to understand the impact of hearing loss on speech perception. Computational auditory models bridge the gap between animal studies and human speech perception. Perturbations to the modeled auditory systems can permit mechanism-based investigations into observed human behavior. In this study, we propose a computational model that accounts for the complex interactions between different hearing damage mechanisms and simulates human speech-in-noise perception. The model performs a digit classification task as a human would, with only acoustic sound pressure as input. Thus, we can use the model's performance as a proxy for human performance. This two-stage model consists of a biophysical cochlear-nerve spike generator followed by a deep neural network (DNN) classifier. We hypothesize that sudden damage to the periphery affects speech perception and that central nervous system adaptation over time may compensate for peripheral hearing damage. Our model achieved human-like performance across signal-to-noise ratios (SNRs) under normal-hearing (NH) cochlear settings, achieving 50% digit recognition accuracy at -20.7 dB SNR. Results were comparable to eight NH participants on the same task who achieved 50% behavioral performance at -22 dB SNR. We also simulated medial olivocochlear reflex (MOCR) and auditory nerve fiber (ANF) loss, which worsened digit-recognition accuracy at lower SNRs compared to higher SNRs. Our simulated performance following ANF loss is consistent with the hypothesis that cochlear synaptopathy impacts communication in background noise more so than in quiet. Following the insult of various cochlear degradations, we implemented extreme and conservative adaptation through the DNN. At the lowest SNRs (<0 dB), both adapted models were unable to fully recover NH performance, even with hundreds of thousands of training samples. This implies a limit on performance recovery following peripheral damage in our human-inspired DNN architecture.
Affiliation(s)
- Stephanie Haro
- Human Health and Performance Systems, Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, United States
- Speech and Hearing Biosciences and Technology, Harvard Medical School, Boston, MA, United States
- Christopher J. Smalt
- Human Health and Performance Systems, Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, United States
- Gregory A. Ciccarelli
- Human Health and Performance Systems, Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, United States
- Thomas F. Quatieri
- Human Health and Performance Systems, Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, United States
- Speech and Hearing Biosciences and Technology, Harvard Medical School, Boston, MA, United States
8. Chambers JD, Elgueda D, Fritz JB, Shamma SA, Burkitt AN, Grayden DB. Computational neural modeling of auditory cortical receptive fields. Front Comput Neurosci 2019;13:28. PMID: 31178710; PMCID: PMC6543553; DOI: 10.3389/fncom.2019.00028.

Abstract
Previous studies have shown that the auditory cortex can enhance the perception of behaviorally important sounds in the presence of background noise, but the mechanisms by which it does this are not yet elucidated. Rapid plasticity of spectrotemporal receptive fields (STRFs) in the primary (A1) cortical neurons is observed during behavioral tasks that require discrimination of particular sounds. This rapid task-related change is believed to be one of the processing strategies utilized by the auditory cortex to selectively attend to one stream of sound in the presence of mixed sounds. However, the mechanism by which the brain evokes this rapid plasticity in the auditory cortex remains unclear. This paper uses a neural network model to investigate how synaptic transmission within the cortical neuron network can change the receptive fields of individual neurons. A sound signal was used as input to a model of the cochlea and auditory periphery, which activated or inhibited integrate-and-fire neuron models to represent networks in the primary auditory cortex. Each neuron in the network was tuned to a different frequency. All neurons were interconnected with excitatory or inhibitory synapses of varying strengths. Action potentials in one of the model neurons were used to calculate the receptive field using reverse correlation. The results were directly compared to previously recorded electrophysiological data from ferrets performing behavioral tasks that require discrimination of particular sounds. The neural network model could reproduce complex STRFs observed experimentally through optimizing the synaptic weights in the model. The model predicts that altering synaptic drive between cortical neurons and/or bottom-up synaptic drive from the cochlear model to the cortical neurons can account for rapid task-related changes observed experimentally in A1 neurons. By identifying changes in the synaptic drive during behavioral tasks, the model provides insights into the neural mechanisms utilized by the auditory cortex to enhance the perception of behaviorally salient sounds.
Affiliation(s)
- Jordan D Chambers
- NeuroEngineering Laboratory, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
- Diego Elgueda
- Departamento de Patología Animal, Facultad de Ciencias Veterinarias y Pecuarias, Universidad de Chile, Santiago, Chile
- Institute for Systems Research, University of Maryland, College Park, MD, United States
- Jonathan B Fritz
- Institute for Systems Research, University of Maryland, College Park, MD, United States
- Shihab A Shamma
- Institute for Systems Research, University of Maryland, College Park, MD, United States
- Laboratoire des Systèmes Perceptifs, École Normale Supérieure, Paris, France
- Anthony N Burkitt
- NeuroEngineering Laboratory, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
- David B Grayden
- NeuroEngineering Laboratory, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
9. Fischer BJ, Wydick JL, Köppl C, Peña JL. Multidimensional stimulus encoding in the auditory nerve of the barn owl. J Acoust Soc Am 2018;144:2116. PMID: 30404459; PMCID: PMC6185867; DOI: 10.1121/1.5056171.

Abstract
Auditory perception depends on multi-dimensional information in acoustic signals that must be encoded by auditory nerve fibers (ANF). These dimensions are represented by filters with different frequency selectivities. Multiple models have been suggested; however, the identification of relevant filters and type of interactions has been elusive, limiting progress in modeling the cochlear output. Spike-triggered covariance analysis of barn owl ANF responses was used to determine the number of relevant stimulus filters and estimate the nonlinearity that produces responses from filter outputs. This confirmed that ANF responses depend on multiple filters. The first, most dominant filter was the spike-triggered average, which was excitatory for all neurons. The second and third filters could be either suppressive or excitatory with center frequencies above or below that of the first filter. The nonlinear function mapping the first two filter outputs to the spiking probability ranged from restricted to nearly circular-symmetric, reflecting different modes of interaction between stimulus dimensions across the sample. This shows that stimulus encoding in ANFs of the barn owl is multidimensional and exhibits diversity over the population, suggesting that models must allow for variable numbers of filters and types of interactions between filters to describe how sound is encoded in ANFs.
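The paper's analysis rests on spike-triggered covariance; the simpler statistic it generalizes, the spike-triggered average (the "first, most dominant filter" above), is just the mean stimulus around each spike. The toy neuron and all names here are invented for illustration:

```python
import random

random.seed(0)
n, win = 5000, 5
stimulus = [random.uniform(-1.0, 1.0) for _ in range(n)]
# Toy neuron: spikes whenever the current stimulus sample is strongly positive.
spikes = [t for t in range(win, n) if stimulus[t] > 0.8]

# Spike-triggered average over the `win` samples up to each spike.
sta = [sum(stimulus[t - lag] for t in spikes) / len(spikes) for lag in range(win)]
# sta[0] (the spike time itself) is large and positive by construction,
# while earlier lags average out near zero: the recovered "filter".
```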
Affiliation(s)
- Brian J Fischer
- Department of Mathematics, Seattle University, Seattle, Washington 98122, USA
- Jacob L Wydick
- Department of Mathematics, Seattle University, Seattle, Washington 98122, USA
- Christine Köppl
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University, Oldenburg, Germany
- José L Peña
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, New York, New York 10461, USA
10. Bruce IC, Erfani Y, Zilany MS. A phenomenological model of the synapse between the inner hair cell and auditory nerve: implications of limited neurotransmitter release sites. Hear Res 2018;360:40-54. DOI: 10.1016/j.heares.2017.12.016.
11. Settibhaktini H, Chintanpalli A. Modeling the level-dependent changes of concurrent vowel scores. J Acoust Soc Am 2018;143:440. PMID: 29390795; PMCID: PMC6226212; DOI: 10.1121/1.5021330.

Abstract
The difference in fundamental frequency (F0) between talkers is an important cue for speaker segregation. To understand how this cue varies across sound level, Chintanpalli, Ahlstrom, and Dubno [(2014). J. Assoc. Res. Otolaryngol. 15, 823-837] collected level-dependent changes in concurrent-vowel identification scores for same- and different-F0 conditions in younger adults with normal hearing. Modeling suggested that level-dependent changes in phase locking of auditory-nerve (AN) fibers to formants and F0s may contribute to concurrent-vowel identification scores; however, identification scores were not predicted to test this suggestion directly. The current study predicts these identification scores using the temporal responses of a computational AN model and a modified version of Meddis and Hewitt's [(1992). J. Acoust. Soc. Am. 91, 233-245] F0-based segregation algorithm. The model successfully captured the level-dependent changes in identification scores of both vowels with and without F0 difference, as well as identification scores for one vowel correct. The model's F0-based vowel segregation was controlled using the actual F0-benefit across levels such that the predicted F0-benefit matched qualitatively with the actual F0-benefit as a function of level. The quantitative predictions from this F0-based segregation algorithm demonstrate that temporal responses of AN fibers to vowel formants and F0s can account for variations in identification scores across sound level and F0-difference conditions in a concurrent-vowel task.
Affiliation(s)
- Harshavardhan Settibhaktini
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan, 333031, India
- Ananthakrishna Chintanpalli
- Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science, Pilani Campus, Vidya Vihar, Pilani, Rajasthan, 333031, India
12. A probabilistic Poisson-based model accounts for an extensive set of absolute auditory threshold measurements. Hear Res 2017;353:135-161. DOI: 10.1016/j.heares.2017.06.011.
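Although no abstract is shown for this entry, the core of any Poisson-based detection model is the probability of observing at least one event in the observation window. A minimal sketch, assuming a homogeneous Poisson process (the actual model is considerably richer):

```python
import math

def p_detect(rate_hz, duration_s):
    """Probability of at least one Poisson event in the observation window."""
    return 1.0 - math.exp(-rate_hz * duration_s)

# The mean count giving 50% detection is ln 2, so halving the rate while
# doubling the duration reaches the same criterion: a time-intensity trade
# that such threshold models naturally produce.
print(round(p_detect(math.log(2), 1.0), 3))      # -> 0.5
print(round(p_detect(math.log(2) / 2, 2.0), 3))  # -> 0.5
```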
13. A test of the stereausis hypothesis for sound localization in mammals. J Neurosci 2017;37:7278-7289. PMID: 28659280; DOI: 10.1523/jneurosci.0233-17.2017.

Abstract
The relative arrival times of sounds at both ears constitute an important cue for localization of low-frequency sounds in the horizontal plane. The binaural neurons of the medial superior olive (MSO) act as coincidence detectors that fire when inputs from both ears arrive near simultaneously. Each principal neuron in the MSO is tuned to its own best interaural time difference (ITD), indicating the presence of an internal delay, a difference in the travel times from either ear to the MSO. According to the stereausis hypothesis, differences in wave propagation along the cochlea could provide the delays necessary for coincidence detection if the ipsilateral and contralateral inputs originated from different cochlear positions, with different frequency tuning. We therefore investigated the relation between interaural mismatches in frequency tuning and ITD tuning during in vivo loose-patch (juxtacellular) recordings from principal neurons of the MSO of anesthetized female gerbils. Cochlear delays can be bypassed by directly stimulating the auditory nerve; in agreement with the stereausis hypothesis, tuning for timing differences during bilateral electrical stimulation of the round windows differed markedly from ITD tuning in the same cells. Moreover, some neurons showed a frequency tuning mismatch that was sufficiently large to have a potential impact on ITD tuning. However, we did not find a correlation between frequency tuning mismatches and best ITDs. Our data thus suggest that axonal delays dominate ITD tuning.

Significance Statement: Neurons in the medial superior olive (MSO) play a unique role in sound localization because of their ability to compare the relative arrival time of low-frequency sounds at both ears. They fire maximally when the difference in sound arrival time exactly compensates for the internal delay: the difference in travel time from either ear to the MSO neuron. We tested whether differences in cochlear delay systematically contribute to the total travel time by comparing for individual MSO neurons the best difference in arrival times, as predicted from the frequency tuning for either ear, and the actual best difference. No systematic relation was observed, emphasizing the dominant contribution of axonal delays to the internal delay.
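Coincidence detection of the kind MSO neurons perform is often modeled, Jeffress-style, as cross-correlation of the two ear signals, with the best ITD read off as the lag that maximizes the correlation. A self-contained sketch (the sampling rate, tone frequency, and delay are arbitrary illustration values, not the study's recording analysis):

```python
import math

def best_lag(left, right, max_lag):
    """Lag (in samples) maximizing the cross-correlation of two ear signals."""
    def corr(lag):
        return sum(left[i] * right[i + lag]
                   for i in range(max_lag, len(left) - max_lag))
    return max(range(-max_lag, max_lag + 1), key=corr)

fs = 48000
left = [math.sin(2 * math.pi * 500 * i / fs) for i in range(960)]
delay = 10  # samples, ~208 microseconds at 48 kHz
right = [0.0] * delay + left[:-delay]  # same sound, delayed at the right ear
itd = best_lag(left, right, 20)        # recovers the 10-sample delay
```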
14. Tabuchi H, Laback B. Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences. J Acoust Soc Am 2017;141:4314. PMID: 28618834; PMCID: PMC5734621; DOI: 10.1121/1.4984031.

Abstract
The cochlear phase response is often estimated by measuring masking of a tonal target by harmonic complexes with various phase curvatures. Maskers yielding most modulated internal envelope representations after passing the cochlear filter are thought to produce minimum masking, with fast-acting cochlear compression as the main contributor to that effect. Thus, in hearing-impaired (HI) listeners, reduced cochlear compression hampers estimation of the phase response using the masking method. This study proposes an alternative approach, based on the effect of the envelope modulation strength on the sensitivity to interaural time differences (ITDs). To evaluate the general approach, ITD thresholds were measured in seven normal-hearing listeners using 300-ms Schroeder-phase harmonic complexes with nine different phase curvatures. ITD thresholds tended to be lowest for phase curvatures roughly similar to those previously shown to produce minimum masking. However, an unexpected ITD threshold peak was consistently observed for a particular negative phase curvature. An auditory-nerve based ITD model predicted the general pattern of ITD thresholds except for the threshold peak, as well as published envelope ITD data. Model predictions simulating outer hair cell loss support the feasibility of the ITD-based approach to estimate the phase response in HI listeners.
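Schroeder-phase harmonic complexes like those used above set the phase of harmonic n to a quadratic function of n, which flattens the temporal envelope relative to sine phase. One common variant is sketched below (the exact scalar convention and sign differ across studies; the crest-factor comparison illustrates the envelope effect):

```python
import math

def schroeder_complex(n_harm, f0, fs, c=1.0, dur=0.1):
    """Harmonic complex with quadratic (Schroeder) phases c*pi*n*(n+1)/N."""
    out = []
    for i in range(int(dur * fs)):
        t = i / fs
        out.append(sum(math.cos(2 * math.pi * n * f0 * t
                                + c * math.pi * n * (n + 1) / n_harm)
                       for n in range(1, n_harm + 1)))
    return out

def crest(x):
    """Crest factor: peak over RMS; low values mean a flat envelope."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms

flat = schroeder_complex(31, 100.0, 16000, c=1.0)   # Schroeder phase: flat
peaky = schroeder_complex(31, 100.0, 16000, c=0.0)  # cosine phase: peaky
```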
15
Heil P, Peterson AJ. Spike timing in auditory-nerve fibers during spontaneous activity and phase locking. Synapse 2016; 71:5-36. [DOI: 10.1002/syn.21925] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2015] [Revised: 07/20/2016] [Accepted: 07/24/2016] [Indexed: 12/22/2022]
Affiliation(s)
- Peter Heil
- Department of Systems Physiology of Learning; Leibniz Institute for Neurobiology; Magdeburg 39118 Germany
- Center for Behavioral Brain Sciences; Magdeburg Germany
- Adam J. Peterson
- Department of Systems Physiology of Learning; Leibniz Institute for Neurobiology; Magdeburg 39118 Germany
16
Hedrick MS, Moon IJ, Woo J, Won JH. Effects of Physiological Internal Noise on Model Predictions of Concurrent Vowel Identification for Normal-Hearing Listeners. PLoS One 2016; 11:e0149128. [PMID: 26866811 PMCID: PMC4750862 DOI: 10.1371/journal.pone.0149128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2015] [Accepted: 01/27/2016] [Indexed: 11/18/2022] Open
Abstract
Previous studies have shown that concurrent vowel identification improves with increasing temporal onset asynchrony of the vowels, even if the vowels have the same fundamental frequency. The current study investigated the possible underlying neural processing involved in concurrent vowel perception. The individual vowel stimuli from a previously published study were used as inputs for a phenomenological auditory-nerve (AN) model. Spectrotemporal representations of simulated neural excitation patterns were constructed (i.e., neurograms) and then matched quantitatively with the neurograms of the single vowels using the Neurogram Similarity Index Measure (NSIM). A novel computational decision model was used to predict concurrent vowel identification. To facilitate optimum matches between the model predictions and the behavioral human data, internal noise was added at either neurogram generation or neurogram matching using the NSIM procedure. The best fit to the behavioral data was achieved with a signal-to-noise ratio (SNR) of 8 dB for internal noise added at the neurogram but with a much smaller amount of internal noise (SNR of 60 dB) for internal noise added at the level of the NSIM computations. The results suggest that accurate modeling of concurrent vowel data from listeners with normal hearing may partly depend on internal noise and where internal noise is hypothesized to occur during the concurrent vowel identification process.
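The NSIM used above is an SSIM-style index applied to neurograms. A single-window simplification can be sketched as follows (the published metric uses small local windows and weighted luminance and structure terms; the stabilizing constants c1 and c2 here are arbitrary):

```python
import numpy as np

def nsim_global(a, b, c1=0.01, c2=0.03):
    """Single-window SSIM-style similarity between two neurograms
    (2-D arrays of CF x time bins): a luminance term times a
    structure (normalized covariance) term."""
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    lum = (2 * mu_a * mu_b + c1) / (mu_a ** 2 + mu_b ** 2 + c1)
    struct = (cov + c2) / (np.sqrt(a.var() * b.var()) + c2)
    return lum * struct

rng = np.random.default_rng(0)
ng = rng.random((64, 200))            # toy neurogram: 64 CFs x 200 bins
same = nsim_global(ng, ng)            # identical neurograms score 1
degraded = nsim_global(ng, ng + 0.5 * rng.random((64, 200)))
```

Adding internal noise before the comparison lowers the similarity score, which is the knob the decision model manipulates.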
Affiliation(s)
- Mark S. Hedrick
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States of America
- Il Joon Moon
- Department of Otorhinolaryngology-Head and Neck Surgery, Samsung Medical Center, Sungkyunkwan University, School of Medicine, Seoul, Korea
- Jihwan Woo
- Department of Biomedical Engineering, University of Ulsan, Ulsan, Korea
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States of America
17
Implications of within-fiber temporal coding for perceptual studies of F0 discrimination and discrimination of harmonic and inharmonic tone complexes. J Assoc Res Otolaryngol 2015; 15:465-82. [PMID: 24658856 DOI: 10.1007/s10162-014-0451-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2013] [Accepted: 02/17/2014] [Indexed: 10/25/2022] Open
Abstract
Recent psychophysical studies suggest that normal-hearing (NH) listeners can use acoustic temporal-fine-structure (TFS) cues for accurately discriminating shifts in the fundamental frequency (F0) of complex tones, or equal shifts in all component frequencies, even when the components are peripherally unresolved. The present study quantified both envelope (ENV) and TFS cues in single auditory-nerve (AN) fiber responses (henceforth referred to as neural ENV and TFS cues) from NH chinchillas in response to harmonic and inharmonic complex tones similar to those used in recent psychophysical studies. The lowest component in the tone complex (i.e., harmonic rank N) was systematically varied from 2 to 20 to produce various resolvability conditions in chinchillas (partially resolved to completely unresolved). Neural responses to different pairs of TEST (F0 or frequency shifted) and standard or reference (REF) stimuli were used to compute shuffled cross-correlograms, from which cross-correlation coefficients representing the degree of similarity between responses were derived separately for TFS and ENV. For a given F0 shift, the dissimilarity (TEST vs. REF) was greater for neural TFS than ENV. However, this difference was stimulus-based; the sensitivities of the neural TFS and ENV metrics were equivalent for equal absolute shifts of their relevant frequencies (center component and F0, respectively). For the F0-discrimination task, both ENV and TFS cues were available and could in principle be used for task performance. However, in contrast to human performance, neural TFS cues quantified with our cross-correlation coefficients were unaffected by phase randomization, suggesting that F0 discrimination for unresolved harmonics does not depend solely on TFS cues. For the frequency-shift (harmonic-versus-inharmonic) discrimination task, neural ENV cues were not available. Neural TFS cues were available and could in principle support performance in this task; however, in contrast to human listeners' performance, these TFS cues showed no dependence on N. We conclude that while AN-fiber responses contain TFS-related cues, which can in principle be used to discriminate changes in F0 or equal shifts in component frequencies of peripherally unresolved harmonics, performance in these two psychophysical tasks appears to be limited by other factors (e.g., central processing noise).
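The ENV/TFS distinction underlying these neural metrics can be illustrated acoustically with a Hilbert decomposition (an analogy only; the study's neural ENV and TFS measures are derived from shuffled cross-correlograms of spike trains, not from the waveform):

```python
import numpy as np

def env_tfs(x):
    """Split a signal into Hilbert envelope (ENV) and temporal fine
    structure (TFS) via the FFT analytic signal."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    analytic = np.fft.ifft(spec * h)
    return np.abs(analytic), np.cos(np.angle(analytic))

fs = 16000
t = np.arange(fs) / fs                        # 1 s
modulator = 1.0 + 0.8 * np.sin(2 * np.pi * 40 * t)
x = modulator * np.sin(2 * np.pi * 1000 * t)  # SAM tone
env, tfs = env_tfs(x)                         # env recovers the modulator
```

For this sinusoidally amplitude-modulated tone, the envelope recovers the 40-Hz modulator while the TFS carries the 1-kHz carrier's fine structure at unit amplitude.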
18
Optimal combination of neural temporal envelope and fine structure cues to explain speech identification in background noise. J Neurosci 2014; 34:12145-54. [PMID: 25186758 DOI: 10.1523/jneurosci.1025-14.2014] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
The dichotomy between acoustic temporal envelope (ENV) and fine structure (TFS) cues has stimulated numerous studies over the past decade to understand the relative role of acoustic ENV and TFS in human speech perception. Such acoustic temporal speech cues produce distinct neural discharge patterns at the level of the auditory nerve, yet little is known about the central neural mechanisms underlying the dichotomy in speech perception between neural ENV and TFS cues. We explored the question of how the peripheral auditory system encodes neural ENV and TFS cues in steady or fluctuating background noise, and how the central auditory system combines these forms of neural information for speech identification. We sought to address this question by (1) measuring sentence identification in background noise for human subjects as a function of the degree of available acoustic TFS information and (2) examining the optimal combination of neural ENV and TFS cues to explain human speech perception performance using computational models of the peripheral auditory system and central neural observers. Speech-identification performance by human subjects decreased as the acoustic TFS information was degraded in the speech signals. The model predictions best matched human performance when a greater emphasis was placed on neural ENV coding rather than neural TFS. However, neural TFS cues were necessary to account for the full effect of background-noise modulations on human speech-identification performance.
19
Reverse correlation analysis of auditory-nerve fiber responses to broadband noise in a bird, the barn owl. J Assoc Res Otolaryngol 2014; 16:101-19. [PMID: 25315358 DOI: 10.1007/s10162-014-0494-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2014] [Accepted: 09/24/2014] [Indexed: 10/24/2022] Open
Abstract
While the barn owl has been extensively used as a model for sound localization and temporal coding, less is known about the mechanisms at its sensory organ, the basilar papilla (homologous to the mammalian cochlea). In this paper, we characterize, for the first time in the avian system, the auditory nerve fiber responses to broadband noise using reverse correlation. We use the derived impulse responses to study the processing of sounds in the cochlea of the barn owl. We characterize the frequency tuning, phase, instantaneous frequency, and relationship to input level of impulse responses. We show that even features as complex as the phase dependence on input level can still be consistent with simple linear filtering. Where possible, we compare our results with mammalian data. We identify salient differences between the barn owl and mammals, e.g., a much smaller frequency glide slope and a bimodal impulse response for the barn owl, and discuss what they might indicate about cochlear mechanics. While important for research on the avian auditory system, the results from this paper also allow us to examine hypotheses put forward for the mammalian cochlea.
20
Chintanpalli A, Ahlstrom JB, Dubno JR. Computational model predictions of cues for concurrent vowel identification. J Assoc Res Otolaryngol 2014; 15:823-37. [PMID: 25002128 DOI: 10.1007/s10162-014-0475-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2013] [Accepted: 06/03/2014] [Indexed: 11/28/2022] Open
Abstract
Although differences in fundamental frequencies (F0s) between vowels are beneficial for their segregation and identification, listeners can still segregate and identify simultaneous vowels that have identical F0s, suggesting that additional cues are contributing, including formant frequency differences. The current perception and computational modeling study was designed to assess the contribution of F0 and formant difference cues for concurrent vowel identification. Younger adults with normal hearing listened to concurrent vowels over a wide range of levels (25-85 dB SPL) for conditions in which F0 was the same or different between vowel pairs. Vowel identification scores were poorer at the lowest and highest levels for each F0 condition, and F0 benefit was reduced at the lowest level as compared to higher levels. To understand the neural correlates underlying level-dependent changes in vowel identification, a computational auditory-nerve model was used to estimate formant and F0 difference cues under the same listening conditions. Template contrast and average localized synchronized rate predicted level-dependent changes in the strength of phase locking to F0s and formants of concurrent vowels, respectively. At lower levels, poorer F0 benefit may be attributed to poorer phase locking to both F0s, which resulted from lower firing rates of auditory-nerve fibers. At higher levels, poorer identification scores may relate to poorer phase locking to the second formant, due to synchrony capture by lower formants. These findings suggest that concurrent vowel identification may be partly influenced by level-dependent changes in phase locking of auditory-nerve fibers to F0s and formants of both vowels.
Affiliation(s)
- Ananthakrishna Chintanpalli
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425-5500, USA,
21
Tateno T, Nishikawa J, Tsuchioka N, Shintaku H, Kawano S. A hardware model of the auditory periphery to transduce acoustic signals into neural activity. Front Neuroeng 2013; 6:12. [PMID: 24324432 PMCID: PMC3840400 DOI: 10.3389/fneng.2013.00012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/31/2013] [Accepted: 10/28/2013] [Indexed: 11/13/2022]
Abstract
To improve the performance of cochlear implants, we have integrated a microdevice into a model of the auditory periphery with the goal of creating a microprocessor. We constructed an artificial peripheral auditory system using a hybrid model in which polyvinylidene difluoride was used as a piezoelectric sensor to convert mechanical stimuli into electric signals. To produce frequency selectivity, the slit on a stainless steel base plate was designed such that the local resonance frequency of the membrane over the slit reflected the transfer function. In the acoustic sensor, electric signals were generated based on the piezoelectric effect from local stress in the membrane. The electrodes on the resonating plate produced relatively large electric output signals. The signals were fed into a computer model that mimicked some functions of inner hair cells, inner hair cell–auditory nerve synapses, and auditory nerve fibers. In general, the responses of the model to pure-tone burst and complex stimuli accurately represented the discharge rates of high-spontaneous-rate auditory nerve fibers across a range of frequencies greater than 1 kHz and middle to high sound pressure levels. Thus, the model provides a tool to understand information processing in the peripheral auditory system and a basic design for connecting artificial acoustic sensors to the peripheral auditory nervous system. Finally, we discuss the need for stimulus control with an appropriate model of the auditory periphery based on auditory brainstem responses that were electrically evoked by different temporal pulse patterns with the same pulse number.
Affiliation(s)
- Takashi Tateno
- Special Research Promotion Group, Graduate School of Frontier Biosciences, Osaka University Osaka, Japan ; Biomedical Systems Engineering, Bioengineering and Bioinformatics, Graduate School of Information Science and Technology, Hokkaido University Sapporo, Japan
22
Modeling the time-varying and level-dependent effects of the medial olivocochlear reflex in auditory nerve responses. J Assoc Res Otolaryngol 2013; 15:159-73. [PMID: 24306278 DOI: 10.1007/s10162-013-0430-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2013] [Accepted: 11/17/2013] [Indexed: 10/25/2022] Open
Abstract
The medial olivocochlear reflex (MOCR) has been hypothesized to provide benefit for listening in noisy environments. This advantage can be attributed to a feedback mechanism that suppresses auditory nerve (AN) firing in continuous background noise, resulting in increased sensitivity to a tone or speech. MOC neurons synapse on outer hair cells (OHCs), and their activity effectively reduces cochlear gain. The computational model developed in this study implements the time-varying, characteristic frequency (CF) and level-dependent effects of the MOCR within the framework of a well-established model for normal and hearing-impaired AN responses. A second-order linear system was used to model the time-course of the MOCR using physiological data in humans. The stimulus-level-dependent parameters of the efferent pathway were estimated by fitting AN sensitivity derived from responses in decerebrate cats using a tone-in-noise paradigm. The resulting model uses a binaural, time-varying, CF-dependent, level-dependent OHC gain reduction for both ipsilateral and contralateral stimuli that improves detection of a tone in noise, similarly to recorded AN responses. The MOCR may be important for speech recognition in continuous background noise as well as for protection from acoustic trauma. Further study of this model and its efferent feedback loop may improve our understanding of the effects of sensorineural hearing loss in noisy situations, a condition in which hearing aids currently struggle to restore normal speech perception.
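The ingredients of such an efferent model can be caricatured in a few lines: a level-dependent steady-state gain reduction multiplied by a second-order (critically damped) onset time course. All constants below are illustrative placeholders, not the paper's fitted values:

```python
import numpy as np

def mocr_gain_reduction(level_db, t, tau=0.28, g_max=20.0,
                        thr_db=40.0, slope=0.5):
    """OHC gain reduction (dB) at time t (s) after masker onset: a
    level-dependent steady-state value (clipped linear growth above
    thr_db, saturating at g_max) times the step response of a
    critically damped second-order system with time constant tau."""
    steady = np.clip(slope * (level_db - thr_db), 0.0, g_max)
    onset = 1.0 - np.exp(-t / tau) * (1.0 + t / tau)
    return steady * onset

t = np.linspace(0.0, 2.0, 1001)
g70 = mocr_gain_reduction(70.0, t)   # higher-level masker
g50 = mocr_gain_reduction(50.0, t)   # lower-level masker
```

The gain reduction builds gradually after masker onset and asymptotes at a level-dependent value, which is the qualitative behavior the model imposes on the OHC stage.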
23
Chintanpalli A, Heinz MG. The use of confusion patterns to evaluate the neural basis for concurrent vowel identification. J Acoust Soc Am 2013; 134:2988-3000. [PMID: 24116434 PMCID: PMC3799688 DOI: 10.1121/1.4820888] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2012] [Revised: 05/31/2013] [Accepted: 08/26/2013] [Indexed: 06/02/2023]
Abstract
Normal-hearing listeners take advantage of differences in fundamental frequency (F0) to segregate competing talkers. Computational modeling using an F0-based segregation algorithm and auditory-nerve temporal responses captures the gradual improvement in concurrent-vowel identification with increasing F0 difference. This result has been taken to suggest that F0-based segregation is the basis for this improvement; however, evidence suggests that other factors may also contribute. The present study further tested models of concurrent-vowel identification by evaluating their ability to predict the specific confusions made by listeners. Measured human confusions consisted of at most one to three confusions per vowel pair, typically from an error in only one of the two vowels. An improvement due to F0 difference was correlated with spectral differences between vowels; however, simple models based on acoustic and cochlear spectral patterns predicted some confusions not made by human listeners. In contrast, a neural temporal model was better at predicting listener confusion patterns. However, the full F0-based segregation algorithm using these neural temporal analyses was inconsistent across F0 difference in capturing listener confusions, being worse for smaller differences. The inability of this commonly accepted model to fully account for listener confusions suggests that other factors besides F0 segregation are likely to contribute.
24
Fontaine B, Benichoux V, Joris PX, Brette R. Predicting spike timing in highly synchronous auditory neurons at different sound levels. J Neurophysiol 2013; 110:1672-88. [PMID: 23864375 PMCID: PMC4042421 DOI: 10.1152/jn.00051.2013] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2013] [Accepted: 07/15/2013] [Indexed: 11/22/2022] Open
Abstract
A challenge for sensory systems is to encode natural signals that vary in amplitude by orders of magnitude. The spike trains of neurons in the auditory system must represent the fine temporal structure of sounds despite a tremendous variation in sound level in natural environments. It has been shown in vitro that the transformation from dynamic signals into precise spike trains can be accurately captured by simple integrate-and-fire models. In this work, we show that the in vivo responses of cochlear nucleus bushy cells to sounds across a wide range of levels can be precisely predicted by deterministic integrate-and-fire models with adaptive spike threshold. Our model can predict both the spike timings and the firing rate in response to novel sounds, across a large input level range. A noisy version of the model accounts for the statistical structure of spike trains, including the reliability and temporal precision of responses. Spike threshold adaptation was critical to ensure that predictions remain accurate at different levels. These results confirm that simple integrate-and-fire models provide an accurate phenomenological account of spike train statistics and emphasize the functional relevance of spike threshold adaptation.
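An integrate-and-fire neuron with an adaptive spike threshold, the model class used here, is compact to implement. The sketch below (Euler integration, arbitrary constants rather than the paper's fitted bushy-cell parameters) shows the threshold jumping at each spike and decaying back, which yields rate adaptation across input levels:

```python
import numpy as np

def lif_adaptive_threshold(current, dt=1e-4, tau_m=5e-3, tau_th=20e-3,
                           th0=1.0, dth=0.5):
    """Deterministic leaky integrate-and-fire neuron whose spike
    threshold jumps by dth at each spike and relaxes back to th0 with
    time constant tau_th. Returns spike times in seconds."""
    v, th = 0.0, th0
    spike_times = []
    for i, i_in in enumerate(current):
        v += dt * (-v / tau_m + i_in)    # membrane integration
        th += dt * (th0 - th) / tau_th   # threshold relaxation
        if v >= th:
            spike_times.append(i * dt)
            v = 0.0                      # reset
            th += dth                    # threshold adaptation
    return spike_times

# Constant-current steps (0.5 s each): the adapting threshold makes the
# rate grow sublinearly with input, and interspike intervals lengthen.
low = lif_adaptive_threshold(np.full(5000, 300.0))
high = lif_adaptive_threshold(np.full(5000, 600.0))
isis = np.diff(high)
```

Without the threshold adaptation term, the same neuron fires at a fixed rate for a constant input; the adaptation is what keeps spike-timing predictions stable across sound levels in the paper's account.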
Affiliation(s)
- Bertrand Fontaine
- Laboratoire Psychologie de la Perception, CNRS, Université Paris Descartes, Paris, France
25
A sound processor for cochlear implant using a simple dual path nonlinear model of basilar membrane. Comput Math Methods Med 2013; 2013:153039. [PMID: 23690872 PMCID: PMC3652108 DOI: 10.1155/2013/153039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/21/2013] [Accepted: 03/26/2013] [Indexed: 11/17/2022]
Abstract
We propose a new active nonlinear model of the frequency response of the basilar membrane in biological cochlea called the simple dual path nonlinear (SDPN) model and a novel sound processing strategy for cochlear implants (CIs) based upon this model. The SDPN model was developed to utilize the advantages of the level-dependent frequency response characteristics of the basilar membrane for robust formant representation under noisy conditions. In comparison to the dual resonance nonlinear model (DRNL) which was previously proposed as an active nonlinear model of the basilar membrane, the SDPN model can reproduce similar level-dependent frequency responses with a much simpler structure and is thus better suited for incorporation into CI sound processors. By the analysis of dominant frequency component, it was confirmed that the formants of speech are more robustly represented after frequency decomposition by the nonlinear filterbank using SDPN, compared to a linear bandpass filter array which is used in conventional strategies. Acoustic simulation and hearing experiments in subjects with normal hearing showed that the proposed strategy results in better syllable recognition under speech-shaped noise compared to the conventional strategy based on fixed linear bandpass filters.
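A dual-path structure of this general kind (a linear band-pass path summed with a band-pass, broken-stick compression, band-pass path, as in DRNL-style models) can be sketched as follows. The SDPN model's actual filters and constants differ; this only illustrates the level-dependent gain such a structure produces:

```python
import numpy as np

def biquad_bandpass(x, fs, cf, q=2.0):
    """RBJ-cookbook band-pass biquad with unit gain at cf."""
    w0 = 2 * np.pi * cf / fs
    alpha = np.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2 * np.cos(w0) / a0, (1 - alpha) / a0
    y = np.zeros_like(x)
    x1 = x2 = y1 = y2 = 0.0
    for i, xi in enumerate(x):
        yi = b0 * xi + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xi
        y2, y1 = y1, yi
        y[i] = yi
    return y

def dual_path(x, fs=16000, cf=1000.0, lin_gain=0.5,
              comp_a=1.0, comp_b=0.25, comp_c=0.3):
    """Linear band-pass path plus a band-pass -> broken-stick
    compression -> band-pass path, summed (illustrative constants)."""
    linear = lin_gain * biquad_bandpass(x, fs, cf)
    y = biquad_bandpass(x, fs, cf)
    y = np.sign(y) * np.minimum(comp_a * np.abs(y),
                                comp_b * np.abs(y) ** comp_c)
    return linear + biquad_bandpass(y, fs, cf)

fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
quiet = dual_path(1e-3 * tone)   # low level: nonlinear path ~linear
loud = dual_path(tone)           # high level: nonlinear path compressed
gain_quiet = np.sqrt(2 * np.mean(quiet[2000:] ** 2)) / 1e-3
gain_loud = np.sqrt(2 * np.mean(loud[2000:] ** 2))
```

The effective gain is larger for the quiet tone than for the loud one, reproducing the compressive, level-dependent response that makes formants more robust after such a filterbank.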
26
Gai Y, Ruhland JL, Yin TCT, Tollin DJ. Behavioral and modeling studies of sound localization in cats: effects of stimulus level and duration. J Neurophysiol 2013; 110:607-20. [PMID: 23657278 DOI: 10.1152/jn.01019.2012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Sound localization accuracy in elevation can be affected by sound spectrum alteration. Correspondingly, any stimulus manipulation that causes a change in the peripheral representation of the spectrum may degrade localization ability in elevation. The present study examined the influence of sound duration and level on localization performance in cats with the head unrestrained. Two cats were trained using operant conditioning to indicate the apparent location of a sound via gaze shift, which was measured with a search-coil technique. Overall, neither sound level nor duration had a notable effect on localization accuracy in azimuth, except at near-threshold levels. In contrast, localization accuracy in elevation improved as sound duration increased, and sound level also had a large effect on localization in elevation. For short-duration noise, the performance peaked at intermediate levels and deteriorated at low and high levels; for long-duration noise, this "negative level effect" at high levels was not observed. Simulations based on an auditory nerve model were used to explain the above observations and to test several hypotheses. Our results indicated that neither the flatness of sound spectrum (before the sound reaches the inner ear) nor the peripheral adaptation influences spectral coding at the periphery for localization in elevation, whereas neural computation that relies on "multiple looks" of the spectral analysis is critical in explaining the effect of sound duration, but not level. The release of negative level effect observed for long-duration sound could not be explained at the periphery and, therefore, is likely a result of processing at higher centers.
Affiliation(s)
- Yan Gai
- Department of Neuroscience, University of Wisconsin, Madison, WI 53706, USA.
27
Chintanpalli A, Jennings SG, Heinz MG, Strickland EA. Modeling the anti-masking effects of the olivocochlear reflex in auditory nerve responses to tones in sustained noise. J Assoc Res Otolaryngol 2012; 13:219-35. [PMID: 22286536 DOI: 10.1007/s10162-011-0310-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2011] [Accepted: 12/21/2011] [Indexed: 10/14/2022] Open
Abstract
The medial olivocochlear reflex (MOCR) has been hypothesized to provide benefit for listening in noise. Strong physiological support for an anti-masking role for the MOCR has come from the observation that auditory nerve (AN) fibers exhibit reduced firing to sustained noise and increased sensitivity to tones when the MOCR is elicited. The present study extended a well-established computational model for normal-hearing and hearing-impaired AN responses to demonstrate that these anti-masking effects can be accounted for by reducing outer hair cell (OHC) gain, which is a primary effect of the MOCR. Tone responses in noise were examined systematically as a function of tone level, noise level, and OHC gain. Signal detection theory was used to predict detection and discrimination for different spontaneous rate fiber groups. Decreasing OHC gain decreased the sustained noise response and increased maximum discharge rate to the tone, thus modeling the ability of the MOCR to decompress AN fiber rate-level functions. Comparing the present modeling results with previous data from AN fibers in decerebrate cats suggests that the ipsilateral masking noise used in the physiological study may have elicited up to 20 dB of OHC gain reduction in addition to that inferred from the contralateral noise effects. Reducing OHC gain in the model also extended the dynamic range for discrimination over a wide range of background noise levels. For each masker level, an optimal OHC gain reduction was predicted (i.e., where maximum discrimination was achieved without increased detection threshold). These optimal gain reductions increased with masker level and were physiologically realistic. Thus, reducing OHC gain can improve tone-in-noise discrimination even though it may produce a "hearing loss" in quiet. Combining MOCR effects with the sensorineural hearing loss effects already captured by this computational AN model will be beneficial for exploring the implications of their interaction for the difficulties hearing-impaired listeners have in noisy situations.
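The signal-detection-theory step can be illustrated with a one-line d′ computation under a Gaussian approximation with Poisson-like (variance ≈ mean) rate statistics; the numbers below are hypothetical, not outputs of the paper's AN model:

```python
import math

def d_prime(mean_sig, mean_ref, var_sig, var_ref):
    """Discriminability of two Gaussian-approximated rate
    distributions (pooled-variance form)."""
    return (mean_sig - mean_ref) / math.sqrt(0.5 * (var_sig + var_ref))

# Poisson-like spike counts (variance ~ mean); values are hypothetical:
dp_full_gain = d_prime(120.0, 100.0, 120.0, 100.0)
# OHC gain reduction suppresses the noise-alone response more than the
# tone-plus-noise response, widening the mean separation:
dp_reduced_gain = d_prime(105.0, 70.0, 105.0, 70.0)
```

Even with a smaller tone-driven rate, the larger tone-versus-noise separation after gain reduction yields a higher d′, which is the anti-masking logic in miniature.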
28
Lyon RF. Cascades of two-pole-two-zero asymmetric resonators are good models of peripheral auditory function. J Acoust Soc Am 2011; 130:3893-3904. [PMID: 22225045 DOI: 10.1121/1.3658470] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
A cascade of two-pole-two-zero filter stages is a good model of the auditory periphery in two distinct ways. First, in the form of the pole-zero filter cascade, it acts as an auditory filter model that provides an excellent fit to data on human detection of tones in masking noise, with fewer fitting parameters than previously reported filter models such as the roex and gammachirp models. Second, when extended to the form of the cascade of asymmetric resonators with fast-acting compression, it serves as an efficient front-end filterbank for machine-hearing applications, including dynamic nonlinear effects such as fast wide-dynamic-range compression. In their underlying linear approximations, these filters are described by their poles and zeros, that is, by rational transfer functions, which makes them simple to implement in analog or digital domains. Other advantages in these models derive from the close connection of the filter-cascade architecture to wave propagation in the cochlea. These models also reflect the automatic-gain-control function of the auditory system and can maintain approximately constant impulse-response zero-crossing times as the level-dependent parameters change.
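The filter-cascade idea is easy to prototype: each stage is a two-pole-two-zero section with its zeros placed above its pole frequency, stages are chained with pole frequency decreasing along the cascade, and each stage's output is one channel. The coefficients below are illustrative placeholders, not the fitted pole-zero filter cascade parameters:

```python
import numpy as np

def apply_biquad(b, a, x):
    """Direct-form-I second-order filter (a[0] assumed to be 1)."""
    y = np.zeros_like(x)
    x1 = x2 = y1 = y2 = 0.0
    for i, xi in enumerate(x):
        yi = b[0] * xi + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xi
        y2, y1 = y1, yi
        y[i] = yi
    return y

def resonator_cascade(x, fs=16000, f_top=4000.0, n_stages=8, step=0.7,
                      zeta_p=0.1, zeta_z=0.3, zero_ratio=1.4):
    """Cascade of two-pole-two-zero stages, pole frequency decreasing
    stage by stage, zeros placed above each pole so the cascade
    attenuates above each channel's best frequency. Every stage is
    normalized to unit DC gain; every stage output is a channel."""
    channels, y, f = [], x, f_top
    for _ in range(n_stages):
        thp, rp = 2 * np.pi * f / fs, np.exp(-np.pi * f * zeta_p / fs)
        fz = zero_ratio * f
        thz, rz = 2 * np.pi * fz / fs, np.exp(-np.pi * fz * zeta_z / fs)
        b = np.array([1.0, -2 * rz * np.cos(thz), rz * rz])
        a = np.array([1.0, -2 * rp * np.cos(thp), rp * rp])
        b *= a.sum() / b.sum()          # unit gain at DC
        y = apply_biquad(b, a, y)
        channels.append(y)
        f *= step
    return channels

imp = np.zeros(2048)
imp[0] = 1.0
chans = resonator_cascade(imp)
freqs = np.fft.rfftfreq(2048, d=1 / 16000)
peak = [freqs[np.argmax(np.abs(np.fft.rfft(c)))] for c in chans]
```

Because each channel is the product of all preceding stages, best frequencies fall along the cascade, mirroring the base-to-apex wave propagation the paper highlights as an advantage of this architecture.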
Affiliation(s)
- Richard F Lyon
- Google Inc., 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.
29
Day ML, Semple MN. Frequency-dependent interaural delays in the medial superior olive: implications for interaural cochlear delays. J Neurophysiol 2011; 106:1985-99. [DOI: 10.1152/jn.00131.2011] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Neurons in the medial superior olive (MSO) are tuned to the interaural time difference (ITD) of sound arriving at the two ears. MSO neurons respond most strongly at their best delay (BD), at which the internal delay between bilateral inputs to MSO matches the external ITD. We performed extracellular recordings in the superior olivary complex of the anesthetized gerbil and found a majority of single units localized to the MSO to exhibit BDs that shifted with tone frequency. The relation of best interaural phase difference to tone frequency revealed nonlinearities in some MSO units and others with linear relations with characteristic phase between 0.4 and 0.6 cycles. The latter is usually associated with the interaction of ipsilateral excitation and contralateral inhibition, as in the lateral superior olive, yet all MSO units exhibited evidence of bilateral excitation. Interaural cochlear delays and phase-locked contralateral inhibition are two mechanisms of internal delay that have been suggested to create frequency-dependent delays. Best interaural phase-frequency relations were compared with a cross-correlation model of MSO that incorporated interaural cochlear delays and an additional frequency-independent delay component. The model with interaural cochlear delay fit phase-frequency relations exhibiting frequency-dependent delays with precision. Another model of MSO incorporating inhibition based on realistic biophysical parameters could not reproduce observed frequency-dependent delays.
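The coincidence computation at the core of such cross-correlation models of MSO can be sketched for a pure tone (a toy example without cochlear filtering or the paper's internal-delay components):

```python
import numpy as np

def best_delay(left, right, fs, max_itd=1e-3):
    """Internal delay (s) maximizing the interaural cross-correlation
    over +/- max_itd, a coincidence-detector-style readout."""
    max_lag = int(max_itd * fs)
    core = slice(max_lag, -max_lag)        # avoid wrap-around edges
    best_c, best_lag = -np.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        c = float(np.dot(left[core], np.roll(right, -lag)[core]))
        if c > best_c:
            best_c, best_lag = c, lag
    return best_lag / fs

fs = 96000
t = np.arange(int(0.1 * fs)) / fs
sig = np.sin(2 * np.pi * 500 * t)   # whole number of cycles in 0.1 s
itd_samples = 24                    # 250 microseconds at 96 kHz
left, right = sig, np.roll(sig, itd_samples)
est = best_delay(left, right, fs)   # recovers the imposed ITD
```

In the paper's model the inputs to this correlation are cochlea-filtered, so frequency-dependent cochlear delays shift the best delay with tone frequency; here the inputs are unfiltered and the readout simply recovers the imposed ITD.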
Affiliation(s)
- Mitchell L. Day
- Center for Neural Science, New York University, New York, New York
30
Swaminathan J, Heinz MG. Predicted effects of sensorineural hearing loss on across-fiber envelope coding in the auditory nerve. J Acoust Soc Am 2011; 129:4001-13. [PMID: 21682421 PMCID: PMC3135152 DOI: 10.1121/1.3583502] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/27/2010] [Revised: 12/23/2010] [Accepted: 03/30/2011] [Indexed: 05/19/2023]
Abstract
Cross-channel envelope correlations are hypothesized to influence speech intelligibility, particularly in adverse conditions. Acoustic analyses suggest speech envelope correlations differ for syllabic and phonemic ranges of modulation frequency. The influence of cochlear filtering was examined here by predicting cross-channel envelope correlations in different speech modulation ranges for normal and impaired auditory-nerve (AN) responses. Neural cross-correlation coefficients quantified across-fiber envelope coding in syllabic (0-5 Hz), phonemic (5-64 Hz), and periodicity (64-300 Hz) modulation ranges. Spike trains were generated from a physiologically based AN model. Correlations were also computed using the model with selective hair-cell damage. Neural predictions revealed that envelope cross-correlation decreased with increased characteristic-frequency separation for all modulation ranges (with greater syllabic-envelope correlation than phonemic or periodicity). Syllabic envelope was highly correlated across many spectral channels, whereas phonemic and periodicity envelopes were correlated mainly between adjacent channels. Outer-hair-cell impairment increased the degree of cross-channel correlation for phonemic and periodicity ranges for speech in quiet and in noise, thereby reducing the number of independent neural information channels for envelope coding. In contrast, outer-hair-cell impairment was predicted to decrease cross-channel correlation for syllabic envelopes in noise, which may partially account for the reduced ability of hearing-impaired listeners to segregate speech in complex backgrounds.
Affiliation(s)
- Jayaganesh Swaminathan
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2038, USA.
31
Jennings SG, Heinz MG, Strickland EA. Evaluating adaptation and olivocochlear efferent feedback as potential explanations of psychophysical overshoot. J Assoc Res Otolaryngol 2011; 12:345-60. [PMID: 21267622 DOI: 10.1007/s10162-011-0256-5] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2010] [Accepted: 01/10/2011] [Indexed: 11/24/2022] Open
Abstract
Masked detection threshold for a short tone in noise improves as the tone's onset is delayed from the masker's onset. This improvement, known as "overshoot," is maximal at mid-masker levels and is reduced by temporary and permanent cochlear hearing loss. Computational modeling was used in the present study to evaluate proposed physiological mechanisms of overshoot, including classic firing rate adaptation and medial olivocochlear (MOC) feedback, for both normal hearing and cochlear hearing loss conditions. These theories were tested using an established model of the auditory periphery and signal detection theory techniques. The influence of several analysis variables on predicted tone-pip detection in broadband noise was evaluated, including: auditory nerve fiber spontaneous-rate (SR) pooling, range of characteristic frequencies, number of synapses per characteristic frequency, analysis window duration, and detection rule. The results revealed that overshoot similar to perceptual data in terms of both magnitude and level dependence could be predicted when the effects of MOC efferent feedback were included in the auditory nerve model. Conversely, simulations without MOC feedback effects never produced overshoot despite the model's ability to account for classic firing rate adaptation and dynamic range adaptation in auditory nerve responses. Cochlear hearing loss was predicted to reduce the size of overshoot only for model versions that included the effects of MOC efferent feedback. These findings suggest that overshoot in normal and hearing-impaired listeners is mediated by some form of dynamic range adaptation other than what is observed in the auditory nerve of anesthetized animals. Mechanisms for this adaptation may occur at several levels along the auditory pathway. Among these mechanisms, the MOC reflex may play a leading role.
Affiliation(s)
- Skyler G Jennings
- Department of Speech, Language, and Hearing Sciences, Purdue University, 500 Oval Drive, West Lafayette, IN 47907, USA.
32
Temchin AN, Recio-Spinoso A, Ruggero MA. Timing of cochlear responses inferred from frequency-threshold tuning curves of auditory-nerve fibers. Hear Res 2010; 272:178-86. [PMID: 20951191 DOI: 10.1016/j.heares.2010.10.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/07/2010] [Revised: 10/01/2010] [Accepted: 10/06/2010] [Indexed: 12/01/2022]
Abstract
Links between frequency tuning and timing were explored in the responses to sound of auditory-nerve fibers. Synthetic transfer functions were constructed by combining filter functions, derived via minimum-phase computations from average frequency-threshold tuning curves of chinchilla auditory-nerve fibers with high spontaneous activity (Temchin et al., 2008), and signal-front delays specified by the latencies of basilar-membrane and auditory-nerve fiber responses to intense clicks (Temchin et al., 2005). The transfer functions predict several features of the phase-frequency curves of cochlear responses to tones, including their shape transitions in the regions with characteristic frequencies of 1 kHz and 3-4 kHz (Temchin and Ruggero, 2010). The transfer functions also predict the shapes of cochlear impulse responses, including the polarities of their frequency sweeps and their transition at characteristic frequencies around 1 kHz. Predictions are especially accurate for characteristic frequencies <1 kHz.
Affiliation(s)
- Andrei N Temchin
- Hugh Knowles Center (Dept. of Communication Sciences and Disorders), Northwestern University, 2240 Campus Drive, Evanston, IL 60208-3550, United States
33
Petoe MA, Bradley AP, Wilson WJ. Spectral and synchrony differences in auditory brainstem responses evoked by chirps of varying durations. J Acoust Soc Am 2010; 128:1896-907. [PMID: 20968361 DOI: 10.1121/1.3483738] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
The chirp-evoked ABR has been termed a more synchronous response, referring to the fact that rising-frequency chirp stimuli theoretically compensate for temporal dispersion along the basilar membrane. This compensation is made possible by delaying the higher-frequency content of the stimulus until the lower-frequency traveling waves are closer to the cochlear apex. However, it is not yet clear how sensitive this temporal compensation is to variation in the delay interval. This study analyzed chirp- and click-evoked ABRs at low intensity, using a variety of tools in the time, frequency, and phase domains, to measure synchrony in the response. It also examined the relationship between chirp sweep rate and response synchrony by varying the delay between the high- and low-frequency portions of the chirp stimuli. The results suggest that the chirp-evoked ABRs in this study exhibited more synchrony than the click-evoked ABRs and that slight gender-based differences exist in the synchrony of chirp-evoked ABRs. The study concludes that tailoring chirp parameters to gender may be beneficial in pathologies that severely affect neural synchrony, but that such customization may not be necessary in routine clinical applications.
Affiliation(s)
- Matthew A Petoe
- School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Queensland 4072, Australia
34
Siveke I, Leibold C, Kaiser K, Grothe B, Wiegrebe L. Level-dependent latency shifts quantified through binaural processing. J Neurophysiol 2010; 104:2224-35. [PMID: 20702738 DOI: 10.1152/jn.00392.2010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
The mammalian binaural system compares the timing of monaural inputs with microsecond precision. This temporal precision is required for localizing sounds in azimuth. However, temporal features of the monaural inputs, in particular their latencies, highly depend on the overall sound level. In a combined psychophysical, electrophysiological, and modeling approach, we investigate how level-dependent latency shifts of the monaural responses are reflected in the perception and neural representation of interaural time differences. We exploit the sensitivity of the binaural system to the timing of high-frequency stimuli with binaurally incongruent envelopes. Using these novel stimuli, both the perceptually adjusted interaural time differences and the time differences extracted from electrophysiological recordings systematically depend on overall sound pressure level. The perceptual and electrophysiological time differences of the envelopes can be explained in an existing model of temporal integration only if a level-dependent firing threshold is added. Such an adjustment of firing threshold provides a temporally accurate neural code of the temporal structure of a stimulus and its binaural disparities independent of overall sound level.
Affiliation(s)
- Ida Siveke
- Division of Neurobiology, Department Biologie II, Ludwig-Maximilians-Universität München, Germany
35
Petoe MA, Bradley AP, Wilson WJ. On chirp stimuli and neural synchrony in the suprathreshold auditory brainstem response. J Acoust Soc Am 2010; 128:235-46. [PMID: 20649219 DOI: 10.1121/1.3436527] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
The chirp-evoked ABR has been regarded as a more synchronous response than the click-evoked ABR, reflecting the belief that the chirp stimulates low-, mid-, and high-frequency regions of the cochlea simultaneously. In this study a variety of tools were used to analyze the synchronicity of ABRs evoked by chirp and click stimuli at 40 dB HL in 32 normal-hearing subjects aged 18 to 55 years (mean=24.8 years, SD=7.1 years). Compared to the click-evoked ABRs, the chirp-evoked ABRs showed larger wave V amplitudes, but an absence of earlier waves in the grand averages, larger wave V latency variance, smaller FFT magnitudes at the higher component frequencies, and larger phase variance at the higher component frequencies. These results strongly suggest that the chirp-evoked ABRs exhibited less synchrony than the click-evoked ABRs in this study. It is proposed that the temporal compensation offered by chirp stimuli is sufficient to increase neural recruitment (as measured by wave V amplitude), but that destructive phase interactions still exist along the cochlear partition, particularly in the low-frequency portions of the cochlea, where more latency jitter is expected. The clinical implications of these findings are discussed.
Affiliation(s)
- Matthew A Petoe
- School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Queensland 4072, Australia
36
Wagner H, Brill S, Kempter R, Carr CE. Auditory responses in the barn owl's nucleus laminaris to clicks: impulse response and signal analysis of neurophonic potential. J Neurophysiol 2009; 102:1227-40. [PMID: 19535487 DOI: 10.1152/jn.00092.2009] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
We used acoustic clicks to study the impulse response of the neurophonic potential in the barn owl's nucleus laminaris. Clicks evoked a complex oscillatory neural response with a component that reflected the best frequency measured with tonal stimuli. The envelope of this component was obtained from the analytic signal created using the Hilbert transform. The time courses of the envelope and carrier waveforms were characterized by fitting them with filters. The envelope was better fitted with a Gaussian than with the envelope of a gamma-tone function. The carrier was better fitted with a frequency glide than with a constant instantaneous frequency. The change of the instantaneous frequency with time was better fitted with a linear fit than with a saturating nonlinearity. Frequency glides had not been observed in the bird's auditory system before. The glides were similar to those observed in the mammalian auditory nerve. Response amplitude, group delay, frequency, and phase depended in a systematic way on click level. In most cases, response amplitude decreased linearly as stimulus level decreased, while group delay, phase, and frequency increased linearly as level decreased. Thus the impulse response of the neurophonic potential in the nucleus laminaris of barn owls reflects many characteristics also observed in responses of the basilar membrane and auditory nerve in mammals.
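The envelope-extraction step described above (magnitude of the analytic signal obtained via the Hilbert transform) can be sketched as follows; the carrier frequency and Gaussian envelope here are illustrative stand-ins for a recorded neurophonic, not the authors' data.

```python
# Sketch: envelope of an oscillatory response via the analytic signal
# (Hilbert transform). Signal parameters are illustrative.
import numpy as np
from scipy.signal import hilbert

fs = 50_000.0                                          # sample rate, Hz
t = np.arange(0, 0.01, 1.0 / fs)                       # 10 ms of signal
env_true = np.exp(-0.5 * ((t - 0.005) / 0.001) ** 2)   # Gaussian envelope
response = env_true * np.cos(2 * np.pi * 5000.0 * t)   # 5-kHz "carrier"

env_est = np.abs(hilbert(response))                    # analytic-signal magnitude
```

Because the envelope varies slowly relative to the carrier, the analytic-signal magnitude tracks the true envelope closely; the paper then fits such envelopes with Gaussian and gamma-tone shapes.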
Affiliation(s)
- Hermann Wagner
- Institute for Biology II, RWTH Aachen, D-52074 Aachen, Germany.
37
Recio-Spinoso A, Narayan SS, Ruggero MA. Basilar membrane responses to noise at a basal site of the chinchilla cochlea: quasi-linear filtering. J Assoc Res Otolaryngol 2009; 10:471-84. [PMID: 19495878 DOI: 10.1007/s10162-009-0172-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2009] [Accepted: 04/28/2009] [Indexed: 11/30/2022] Open
Abstract
Basilar membrane responses to clicks and to white noise were recorded using laser velocimetry at basal sites of the chinchilla cochlea with characteristic frequencies near 10 kHz. Responses to noise grew at compressive rates and their instantaneous frequencies decreased with increasing stimulus level. First-order Wiener kernels were computed by cross-correlation of the noise stimuli and the responses. For linear systems, first-order Wiener kernels are identical to unit impulse responses. In the case of basilar membrane responses, first-order Wiener kernels and responses to clicks measured at the same sites were similar but not identical. Both consisted of transient oscillations with onset frequencies which increased rapidly, over about 0.5 ms, from 4-5 kHz to the characteristic frequency. With increasing stimulus level, both first-order Wiener kernels and responses to clicks became more highly damped, exhibited slower frequency modulation, and grew at compressive rates. Responses to clicks had longer durations than the Wiener kernels. The statistical distribution of basilar membrane responses to Gaussian white noise is also Gaussian, and the envelopes of the responses are Rayleigh distributed, as they should be for Gaussian noise passing through a linear band-pass filter. Accordingly, basilar membrane responses were accurately predicted by linear filters specified by the first-order Wiener kernels of responses to noise presented at the same level. Overall, the results indicate that cochlear nonlinearity is not instantaneous and resembles automatic gain control.
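For a linear system driven by Gaussian white noise, the first-order Wiener kernel mentioned above reduces to the input-output cross-correlation scaled by the noise power, and it recovers the impulse response. A minimal sketch with a made-up impulse response and record length:

```python
# Sketch: estimate a linear system's impulse response as the first-order
# Wiener kernel (input-output cross-correlation / noise power).
# The impulse response and record length are illustrative.
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.0, 0.5, 1.0, 0.5, 0.0, -0.3])    # "unknown" impulse response
x = rng.standard_normal(200_000)                 # Gaussian white-noise input
y = np.convolve(x, h)[: x.size]                  # linear system output

sigma2 = x.var()
kernel = np.array([np.dot(x[: x.size - k], y[k:]) / ((x.size - k) * sigma2)
                   for k in range(h.size)])      # ≈ h for a linear system
```

The interesting point in the paper is the departure from this identity: on the basilar membrane the kernel and the click response are similar but not equal, which is the signature of a non-instantaneous, gain-control-like nonlinearity.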
Affiliation(s)
- Alberto Recio-Spinoso
- ENT Department, Leiden University Medical Center, Postbus 9600, 2300 RC, Leiden, The Netherlands
38
Quantifying envelope and fine-structure coding in auditory nerve responses to chimaeric speech. J Assoc Res Otolaryngol 2009; 10:407-23. [PMID: 19365691 DOI: 10.1007/s10162-009-0169-8] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2008] [Accepted: 03/13/2009] [Indexed: 10/20/2022] Open
Abstract
Any sound can be separated mathematically into a slowly varying envelope and a rapidly varying fine-structure component. This property has motivated numerous perceptual studies to understand the relative importance of each component for speech and music perception. Specialized acoustic stimuli, such as auditory chimaeras with the envelope of one sound and the fine structure of another, have been used to separate the perceptual roles of envelope and fine structure. Cochlear narrowband filtering limits the ability to isolate fine structure from envelope; however, envelope recovery from fine structure has been difficult to evaluate physiologically. To evaluate envelope recovery at the output of the cochlea, neural cross-correlation coefficients were developed that quantify the similarity between two sets of spike-train responses. Shuffled auto- and cross-correlogram analyses were used to compute separate correlations for responses to envelope and fine structure based on both model and recorded spike trains from auditory nerve fibers. Previous correlogram analyses were extended to isolate envelope coding more effectively in auditory nerve fibers with low center frequencies, which are particularly important for speech coding. Recovered speech envelopes were present in both model and recorded responses to one- and 16-band speech fine-structure chimaeras and were significantly greater for the one-band case, consistent with perceptual studies. Model predictions suggest that cochlear recovered envelopes are reduced following sensorineural hearing loss due to the broadened tuning associated with outer-hair-cell dysfunction. In addition to the within-fiber cross-stimulus cases considered here, these neural cross-correlation coefficients can also be used to evaluate spatiotemporal coding by applying them to cross-fiber within-stimulus conditions. Thus, these neural metrics can be used to quantitatively evaluate a wide range of perceptually significant temporal coding issues relevant to normal and impaired hearing.
39
Kim KH, Choi SJ, Kim JH, Kim DH. An improved speech processing strategy for cochlear implants based on an active nonlinear filterbank model of the biological cochlea. IEEE Trans Biomed Eng 2009; 56:828-36. [PMID: 19272890 DOI: 10.1109/tbme.2008.2007850] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The purpose of this study was to improve the speech processing strategy for cochlear implants (CIs) based on a nonlinear time-varying filter model of a biological cochlea. The level-dependent frequency response characteristic of the basilar membrane is known to produce robust formant representation and speech perception in noise. A dual resonance nonlinear (DRNL) model was adopted because it is simpler than other adaptive nonlinear models of the basilar membrane and can be readily incorporated into the CI speech processor. Spectral analysis showed that formant information is more saliently represented at the output of the proposed CI speech processor compared to the conventional strategy in noisy conditions. Acoustic simulation and hearing experiments showed that the DRNL-based nonlinear strategy improves speech performance in a speech-spectrum-shaped noise.
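The compressive element at the heart of the DRNL model's nonlinear path is a "broken-stick" input-output function: linear at low levels and strongly compressive at high levels. A minimal sketch with illustrative parameter values (not the published DRNL fits):

```python
# Sketch of a DRNL-style broken-stick compression: the output follows the
# smaller of a linear branch (a*x) and a compressive branch (b*|x|**c).
# Parameter values are illustrative, not the published DRNL fits.
import numpy as np

def broken_stick(x, a=1000.0, b=30.0, c=0.25):
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.minimum(a * np.abs(x), b * np.abs(x) ** c)

low = broken_stick(1e-6)    # linear branch dominates: ~a * x
high = broken_stick(1.0)    # compressive branch dominates: ~b * x**c
```

On the compressive branch, doubling the input grows the output by only 2**0.25 (about 1.19x); this shallow, level-dependent growth is what keeps formant peaks salient relative to noise across input levels.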
Affiliation(s)
- Kyung Hwan Kim
- Department of Biomedical Engineering, College of Health Science, Yonsei University, Wonju 220-710, Korea.
40
Variation in the phase of response to low-frequency pure tones in the guinea pig auditory nerve as functions of stimulus level and frequency. J Assoc Res Otolaryngol 2008; 10:233-50. [PMID: 19093151 PMCID: PMC2674197 DOI: 10.1007/s10162-008-0151-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2008] [Accepted: 11/14/2008] [Indexed: 11/02/2022] Open
Abstract
The directionality of hair cell stimulation, combined with the vibration of the basilar membrane, causes auditory nerve fiber action potentials in response to low-frequency stimuli to occur at a particular phase of the stimulus waveform. Because direct mechanical measurements at the cochlear apex are difficult, such phase locking has often been used to infer basilar membrane motion indirectly. Here, we confirm and extend earlier data from mammals using sine wave stimulation over a wide range of sound levels (up to 90 dB sound pressure level). We recorded phase-locked responses to pure tones over a wide range of frequencies and sound levels in a large population of auditory nerve fibers in the anesthetized guinea pig. The results indicate that, for a constant frequency of stimulation, the phase lag decreases with increases in the characteristic frequency (CF) of the nerve fiber, up to a CF above the stimulation frequency, beyond which it decreases at a much slower rate. Such phase changes are consistent with known basal cochlear mechanics. Measurements from individual fibers showed smaller but systematic variations in phase with sound level, confirming previous reports. We found a "null" stimulation frequency at which little variation in phase occurred with sound level; this null frequency was often not at the CF. At stimulation frequencies below the null there was a progressive phase lag with increasing sound level, and at frequencies above the null a progressive lead, maximally 0.2 cycles.
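Phase locking of the kind measured here is conventionally quantified by vector strength and the mean phase of spikes relative to the stimulus. A minimal sketch with synthetic spike times (not the guinea-pig data):

```python
# Sketch: vector strength (0 = no locking, 1 = perfect locking) and mean
# phase of spike times relative to a tone. Spike times are synthetic.
import numpy as np

def vector_strength(spike_times, freq_hz):
    """Return (vector strength, mean phase in cycles) relative to a tone."""
    z = np.exp(2j * np.pi * freq_hz * np.asarray(spike_times)).mean()
    return np.abs(z), (np.angle(z) / (2 * np.pi)) % 1.0

f = 500.0
spikes = np.arange(50) / f + 0.0004   # one spike per cycle, 0.2-cycle lag
vs, phase = vector_strength(spikes, f)
```

With perfectly periodic spikes at a fixed lag, vector strength is 1 and the mean phase equals the lag in cycles; the level-dependent phase shifts reported above would appear as shifts in this mean phase at constant vector strength.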
41
Day ML, Doiron B, Rinzel J. Subthreshold K+ channel dynamics interact with stimulus spectrum to influence temporal coding in an auditory brain stem model. J Neurophysiol 2007; 99:534-44. [PMID: 18057115 DOI: 10.1152/jn.00326.2007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Neurons in the auditory brain stem encode signals with exceptional temporal precision. A low-threshold potassium current, IKLT, present in many auditory brain stem structures and thought to enhance temporal encoding, facilitates spike selection of rapid input current transients through an associated dynamic gate. Whether the dynamic nature of IKLT interacts with the timescales in spectrally rich input to influence spike encoding remains unclear. We examine the general influence of IKLT on spike encoding of stochastic stimuli using a pattern classification analysis between spike responses from a ventral cochlear nucleus (VCN) model containing IKLT, and the same model with the IKLT dynamics removed. The influence of IKLT on spike encoding depended on the spectral content of the current stimulus such that maximal IKLT influence occurred for stimuli with power concentrated at frequencies low enough (<500 Hz) to allow IKLT activation. Further, broadband stimuli significantly decreased the influence of IKLT on spike encoding, suggesting that broadband stimuli are not well suited for investigating the influence of some dynamic membrane nonlinearities. Finally, pattern classification on spike responses was performed for physiologically realistic conductance stimuli created from various sounds filtered through an auditory nerve (AN) model. Regardless of the sound, the synaptic input arriving at VCN had similar low-pass power spectra, which led to a large influence of IKLT on spike encoding, suggesting that the subthreshold dynamics of IKLT plays a significant role in shaping the response of real auditory brain stem neurons.
Affiliation(s)
- Mitchell L Day
- Center for Neural Science, New York University, New York, NY, USA.
42
Gai Y, Carney LH, Abrams KS, Idrobo F, Harrison JM, Gilkey RH. Detection of tones in reproducible noise maskers by rabbits and comparison to detection by humans. J Assoc Res Otolaryngol 2007; 8:522-38. [PMID: 17899269 PMCID: PMC2538343 DOI: 10.1007/s10162-007-0096-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2007] [Accepted: 08/10/2007] [Indexed: 10/22/2022] Open
Abstract
Processing mechanisms used for detection of tones in noise can be revealed by using reproducible noise maskers and analyzing the pattern of results across masker waveforms. This study reports detection of a 500-Hz tone in broadband reproducible noise by rabbits using a set of masker waveforms for which human results are available. An appetitive-reinforcement, operant-conditioning procedure with bias control was used. Both fixed-level and roving-level noises were used to explore the utility of energy-related cues for detection. An energy-based detection model was able to partially explain the fixed-level results across reproducible noise waveforms for both rabbit and human. A multiple-channel energy model was able to explain fixed-level results, as well as the robust performance observed with roving-level noises. Further analysis using the energy model indicated a difference between species: human detection was influenced most by the noise spectrum surrounding the tone frequency, whereas rabbit detection was influenced most by the noise spectrum at frequencies above that of the tone. In addition, a temporal envelope-based model predicted detection by humans as well as the single-channel energy model did, but the envelope-based model failed to predict detection by rabbits. This result indicates that the contributions of energy and temporal cues to auditory processing differ across species. Overall, these findings suggest that caution must be used when evaluating neural encoding mechanisms in one species on the basis of behavioral results in another.
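The single-channel energy detector evaluated in this study can be sketched as follows; the stimuli here are synthetic Gaussian noises with or without a 500-Hz tone, not the study's reproducible masker set.

```python
# Sketch of a single-channel energy detector: respond "tone present" when
# stimulus energy exceeds a criterion. Stimulus parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
fs, dur, f_tone = 10_000.0, 0.1, 500.0
t = np.arange(0, dur, 1.0 / fs)
tone = 0.1 * np.sin(2 * np.pi * f_tone * t)

def energy(x):
    return float(np.sum(x ** 2))

noise_alone = [energy(0.1 * rng.standard_normal(t.size)) for _ in range(200)]
tone_in_noise = [energy(0.1 * rng.standard_normal(t.size) + tone) for _ in range(200)]

criterion = np.median(noise_alone + tone_in_noise)      # unbiased criterion
hits = np.mean([e > criterion for e in tone_in_noise])  # should be high
false_alarms = np.mean([e > criterion for e in noise_alone])  # should be low
```

Such a detector succeeds when level is fixed but fails when level roves from trial to trial, which is why the study also tests a multiple-channel energy model and a temporal envelope-based model.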
Affiliation(s)
- Yan Gai
- Department Of Biomedical and Chemical Engineering, Syracuse University, Syracuse, NY 13244 USA
- Institute for Sensory Research, Syracuse University, Syracuse, NY 13244 USA
- Laurel H. Carney
- Department of Biomedical and Chemical Engineering, Syracuse University, Syracuse, NY 13244, USA
- Institute for Sensory Research, Syracuse University, Syracuse, NY 13244, USA
- Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244, USA
- Kristina S. Abrams
- Institute for Sensory Research, Syracuse University, Syracuse, NY 13244, USA
- Fabio Idrobo
- Department of Psychology, Boston University, Boston, MA 02215, USA
- Robert H. Gilkey
- Department of Psychology, Wright State University, Dayton, OH 45435, USA
- Human Effectiveness Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Dayton, OH 45433, USA
43
Siveke I, Leibold C, Grothe B. Spectral composition of concurrent noise affects neuronal sensitivity to interaural time differences of tones in the dorsal nucleus of the lateral lemniscus. J Neurophysiol 2007; 98:2705-15. [PMID: 17699697 DOI: 10.1152/jn.00275.2007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
We are regularly exposed to several concurrent sounds, producing a mixture of binaural cues. The neuronal mechanisms underlying the localization of concurrent sounds are not well understood. The major binaural cues for localizing low-frequency sounds in the horizontal plane are interaural time differences (ITDs). Auditory brain stem neurons encode ITDs by firing maximally in response to "favorable" ITDs and weakly or not at all in response to "unfavorable" ITDs. We recorded from ITD-sensitive neurons in the dorsal nucleus of the lateral lemniscus (DNLL) while presenting pure tones at different ITDs embedded in noise. We found that increasing levels of concurrent white noise suppressed the maximal response rate to tones with favorable ITDs and slightly enhanced the response rate to tones with unfavorable ITDs. Nevertheless, most of the neurons maintained ITD sensitivity to tones even for noise intensities equal to that of the tone. Using concurrent noise with a spectral composition in which the neuron's excitatory frequencies are omitted reduced the maximal response similar to that obtained with concurrent white noise. This finding indicates that the decrease of the maximal rate is mediated by suppressive cross-frequency interactions, which we also observed during monaural stimulation with additional white noise. In contrast, the enhancement of the firing rate to tones at unfavorable ITD might be due to early binaural interactions (e.g., at the level of the superior olive). A simple simulation corroborates this interpretation. Taken together, these findings suggest that the spectral composition of a concurrent sound strongly influences the spatial processing of ITD-sensitive DNLL neurons.
Affiliation(s)
- Ida Siveke
- Division of Neurobiology, Department Biology II, Ludwig-Maximilians-Universität München, Germany
44
Zilany MSA, Bruce IC. Representation of the vowel /epsilon/ in normal and impaired auditory nerve fibers: model predictions of responses in cats. J Acoust Soc Am 2007; 122:402-17. [PMID: 17614499 DOI: 10.1121/1.2735117] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
The temporal response of auditory-nerve (AN) fibers to a steady-state vowel is investigated using a computational auditory-periphery model. The model predictions are validated against a wide range of physiological data for both normal and impaired fibers in cats. The model incorporates two parallel filter paths, component 1 (C1) and component 2 (C2), which correspond to the active and passive modes of basilar membrane vibration, respectively, in the cochlea. The outputs of the two filters are subsequently transduced by two separate functions, added together, and then low-pass filtered by the inner hair cell (IHC) membrane, which is followed by the IHC-AN synapse and discharge generator. The C1 response dominates at low and moderate levels and is responsible for synchrony capture and multiformant responses seen in the vowel responses. The C2 response dominates at high levels and contributes to the loss of synchrony capture observed in normal and impaired fibers. The interaction between C1 and C2 responses explains the behavior of AN fibers in the transition region, which is characterized by two important observations in the vowel responses: First, all components of the vowel undergo the C1/C2 transition simultaneously, and second, the responses to the nonformant components of the vowel become substantial.
Affiliation(s)
- Muhammad S A Zilany
- Department of Electrical and Computer Engineering, McMaster University, Hamilton, Ontario, Canada
45
Kim KH, Kim JH, Kim DH. An improved speech processor for cochlear implant based on active nonlinear model of biological cochlea. Conf Proc IEEE Eng Med Biol Soc 2007; 2007:6352-6355. [PMID: 18003474 DOI: 10.1109/iembs.2007.4353808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
The purpose of this study was to improve the speech perception performance of cochlear implants (CIs) in noise using a speech processing strategy based on a nonlinear time-varying filter model of the biological cochlea, which is beneficial for preserving spectral cues for speech perception. A dual resonance nonlinear model was applied to implement this feature. Time-frequency analysis indicated that formant information was more clearly represented at the output of the CI speech processor, especially in noise. Acoustic simulation and a hearing experiment also showed the superiority of the proposed strategy in that vowel perception scores were notably enhanced. Auditory-nerve responses to the stimulation pulses produced by the proposed strategy were also observed to encode formant information faithfully. Since the proposed strategy can be employed in CI devices without hardware modification, a significant improvement in the speech perception of CI users is expected.
46
Tan Q, Carney LH. Predictions of formant-frequency discrimination in noise based on model auditory-nerve responses. J Acoust Soc Am 2006; 120:1435-45. [PMID: 17004467 PMCID: PMC2572872 DOI: 10.1121/1.2225858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
To better understand how the auditory system extracts speech signals in the presence of noise, discrimination thresholds for the second formant frequency were predicted with simulations of auditory-nerve responses. These predictions employed either average-rate information or combined rate and timing information, and either populations of model fibers tuned across a wide range of frequencies or a subset of fibers tuned to a restricted frequency range. In general, combined temporal and rate information for a small population of model fibers tuned near the formant frequency was most successful in replicating the trends reported in behavioral data for formant-frequency discrimination. To explore the nature of the temporal information that contributed to these results, predictions based on model auditory-nerve responses were compared to predictions based on the average rates of a population of cross-frequency coincidence detectors. These comparisons suggested that average response rate (count) of cross-frequency coincidence detectors did not effectively extract important temporal information from the auditory-nerve population response. Thus, the relative timing of action potentials across auditory-nerve fibers tuned to different frequencies was not the aspect of the temporal information that produced the trends in formant-frequency discrimination thresholds.
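A cross-frequency coincidence detector of the kind tested here fires when spikes from fibers tuned to different frequencies arrive nearly simultaneously, and its average count serves as the decision variable. A toy sketch of the counting step (function name and window size are illustrative, not the authors' implementation):

```python
import numpy as np

def coincidence_count(spikes_a, spikes_b, window=50e-6):
    """Count spikes in train A (times in s) that have a spike in B within +/- window.

    A stand-in for a cross-frequency coincidence detector: trains A and B
    would come from model auditory-nerve fibers with different CFs.
    """
    a = np.sort(np.asarray(spikes_a, dtype=float))
    b = np.sort(np.asarray(spikes_b, dtype=float))
    idx = np.searchsorted(b, a)  # nearest B spikes bracket each A spike
    count = 0
    for t, i in zip(a, idx):
        gaps = []
        if i < len(b):
            gaps.append(abs(b[i] - t))
        if i > 0:
            gaps.append(abs(b[i - 1] - t))
        if gaps and min(gaps) <= window:
            count += 1
    return count
```

With a 50 µs window, only sub-window timing differences register as coincidences, which is why the detector's average rate can, in principle, carry cross-fiber timing information.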
Collapse
Affiliation(s)
- Qing Tan
- Boston University Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, Massachusetts 02215, USA
| | | |
Collapse
|
47
|
Zilany MSA, Bruce IC. Modeling auditory-nerve responses for high sound pressure levels in the normal and impaired auditory periphery. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:1446-66. [PMID: 17004468 DOI: 10.1121/1.2225512] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
This paper presents a computational model to simulate normal and impaired auditory-nerve (AN) fiber responses in cats. The model responses match physiological data over a wider dynamic range than previous auditory models. This is achieved by providing two modes of basilar membrane excitation to the inner hair cell (IHC) rather than one. The two modes are generated by two parallel filters, component 1 (C1) and component 2 (C2), and the outputs are subsequently transduced by two separate functions. The responses are then added and passed through the IHC low-pass filter followed by the IHC-AN synapse model and discharge generator. The C1 filter is a narrow-band, chirp filter with the gain and bandwidth controlled by a nonlinear feed-forward control path. This filter is responsible for low and moderate level responses. A linear, static, and broadly tuned C2 filter followed by a nonlinear, inverted and nonrectifying C2 transduction function is critical for producing transition region and high-level effects. Consistent with Kiang's two-factor cancellation hypothesis, the interaction between the two paths produces effects such as the C1/C2 transition and peak splitting in the period histogram. The model responses are consistent with a wide range of physiological data from both normal and impaired ears for stimuli presented at levels spanning the dynamic range of hearing.
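The two-path signal flow described above (parallel C1/C2 filters, separate transduction functions, summation, IHC lowpass) can be caricatured as follows. This sketch only mirrors the architecture; the toy Butterworth filters, tanh nonlinearities, and all constants are illustrative stand-ins for the published components:

```python
import numpy as np
from scipy.signal import butter, lfilter

def toy_two_path_ihc(x, cf, fs):
    """Toy sketch of the C1/C2 two-path IHC architecture (illustrative parameters)."""
    # C1: narrowband filter around CF, saturating transduction;
    # dominates at low and moderate levels
    b1, a1 = butter(2, [0.9 * cf, 1.1 * cf], btype="band", fs=fs)
    c1_out = np.tanh(3.0 * lfilter(b1, a1, x))
    # C2: broad, linear, static filter; inverted, nonrectifying transduction
    # that is negligible at low levels and grows expansively with level
    b2, a2 = butter(1, [0.5 * cf, 2.0 * cf], btype="band", fs=fs)
    c2_out = -np.tanh((0.1 * lfilter(b2, a2, x)) ** 3)
    # The two outputs interact (cf. Kiang's two-factor cancellation),
    # then pass through the IHC lowpass filter
    bl, al = butter(1, 3000, btype="low", fs=fs)
    return lfilter(bl, al, c1_out + c2_out)
```

The opposite polarity and level dependence of the two paths are what let a model of this shape produce C1/C2 transition and peak-splitting effects; the real model adds the level-dependent chirp tuning of C1 and the synapse/discharge-generator stages.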
Collapse
Affiliation(s)
- Muhammad S A Zilany
- Department of Electrical and Computer Engineering, McMaster University, Hamilton, Ontario L8S 4K1, Canada
| | | |
Collapse
|
48
|
Wagner H, Brill S, Kempter R, Carr CE. Microsecond precision of phase delay in the auditory system of the barn owl. J Neurophysiol 2005; 94:1655-8. [PMID: 15843477 PMCID: PMC3268176 DOI: 10.1152/jn.01226.2004] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
The auditory system encodes time with sub-millisecond accuracy. To shed new light on the basic mechanism underlying this precise temporal neuronal coding, we analyzed the neurophonic potential, a characteristic multiunit response, in the barn owl's nucleus laminaris. We report here that the relative time measure of phase delay is robust against changes in sound level, with a precision sharper than 20 μs. Absolute measures of delay, such as group delay or signal-front delay, exhibited much greater temporal jitter, in part because of their strong dependence on sound level. Our findings support the hypothesis that phase delay underlies the sub-millisecond precision of the representation of interaural time difference needed for sound localization.
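The distinction the abstract draws can be made concrete: for a response with unwrapped phase φ(ω), the phase delay is −φ(ω)/ω while the group delay is −dφ/dω. A short sketch of both measures (illustrative, not the authors' analysis code):

```python
import numpy as np

def phase_and_group_delay(phase, freqs):
    """Phase delay -phi/omega and group delay -dphi/domega.

    `phase` is the response phase in radians at the frequencies `freqs` (Hz);
    it is unwrapped before differentiation.
    """
    omega = 2 * np.pi * np.asarray(freqs, dtype=float)
    phi = np.unwrap(np.asarray(phase, dtype=float))
    phase_delay = -phi / omega                # relative measure
    group_delay = -np.gradient(phi, omega)    # absolute measure
    return phase_delay, group_delay
```

For a pure time delay the two measures coincide; they diverge when the phase is not proportional to frequency, which is one reason they can behave differently under changes in sound level.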
Collapse
Affiliation(s)
- Hermann Wagner
- Institute for Biology II, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany
| | | | | | | |
Collapse
|
49
|
Abstract
Humans normally listen in mixed environments, in which sounds originating from more than one source overlap in time and in frequency. The auditory system is able to extract information specific to the individual sources that contribute to the composite signal and process the information for each source separately; this is called “auditory scene analysis” or “sound-source determination.” Sounds that are simultaneously present but generated independently tend to differ along relatively simple acoustic dimensions. These dimensions may be temporal, as when sounds begin or end asynchronously, or spectral, as when the sounds have different fundamental frequencies. Psychophysical experiments have identified some of the ways in which human listeners use these dimensions to isolate sources of sound. A simple but useful stimulus, a harmonic complex tone with or without a mistuned component, can be used for parametric investigation of the processing of spectral structure. This “mistuned tone” stimulus has been used in several psychophysical experiments, and more recently in studies that specifically address the neural mechanisms that underlie segregation based on harmonicity. Studies of the responses of single neurons in the chinchilla auditory system to mistuned tones are reviewed here in detail. The results of those experiments support the view that neurons in the inferior colliculus (IC) exhibit responses to mistuned tones that are larger and temporally more complex than the same neurons’ responses to harmonic tones. Mistuning does not produce comparable changes in the discharge patterns of auditory nerve (AN) fibers, indicating that a major transformation in the neural representation of harmonic structure occurs in the auditory brainstem. The brainstem processing that accomplishes this transformation may contribute to the segregation of competing sounds and ultimately to the identification of sound sources.
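The "mistuned tone" stimulus described above lends itself to a compact implementation: an equal-amplitude harmonic complex in which one component is shifted by a fixed percentage. A minimal sketch (parameter choices are illustrative):

```python
import numpy as np

def harmonic_complex(f0, n_harmonics, fs, dur, mistuned=None, mistune_pct=0.0):
    """Equal-amplitude harmonic complex; optionally mistune one component.

    `mistuned` is the 1-based index of the harmonic to shift, and
    `mistune_pct` is the shift as a percentage of that harmonic's frequency.
    """
    t = np.arange(int(round(fs * dur))) / fs
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        f = k * f0
        if k == mistuned:
            f *= 1.0 + mistune_pct / 100.0
        x += np.sin(2 * np.pi * f * t)
    return x / n_harmonics  # normalize so |x| <= 1
```

Because all parameters except the single mistuned component are held fixed, stimuli like this support the parametric comparisons of harmonic versus mistuned responses reviewed in the abstract.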
Collapse
Affiliation(s)
- Donal G Sinex
- Department of Psychology, Utah State University, Logan, Utah 84322, USA
| |
Collapse
|
50
|
Lopez-Poveda EA. Spectral processing by the peripheral auditory system: facts and models. INTERNATIONAL REVIEW OF NEUROBIOLOGY 2005; 70:7-48. [PMID: 16472630 DOI: 10.1016/s0074-7742(05)70001-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]
Affiliation(s)
- Enrique A Lopez-Poveda
- Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca 37007, Spain
| |
Collapse
|