1
Jannesar N, Akbarzadeh-Sherbaf K, Safari S, Vahabie AH. SSTE: Syllable-Specific Temporal Encoding to FORCE-learn audio sequences with an associative memory approach. Neural Netw 2024; 177:106368. [PMID: 38761415] [DOI: 10.1016/j.neunet.2024.106368]
Abstract
The circuitry and pathways in the brains of humans and other species have long inspired researchers and system designers to develop accurate and efficient systems capable of solving real-world problems and responding in real time. We propose Syllable-Specific Temporal Encoding (SSTE) to learn vocal sequences in a reservoir of Izhikevich neurons by forming associations between exclusive input activities and their corresponding syllables in the sequence. Our model converts audio signals to cochleograms using the CAR-FAC model to simulate a brain-like auditory learning and memorization process. The reservoir is trained using a hardware-friendly approach to FORCE learning. Reservoir computing can yield associative-memory dynamics with far less computational complexity than conventional RNN training. SSTE-based learning achieves competent accuracy and stable recall of spatiotemporal sequences with fewer reservoir inputs than existing encodings for similar purposes, offering resource savings. Because the encoding marks syllable onsets, recall can start from any desired point in the sequence, making the approach particularly suitable for recalling subsets of long vocal sequences. SSTE can learn new signals without forgetting previously memorized sequences and is robust to occasional noise, a characteristic of real-world scenarios. The components of the model are configured to reduce resource consumption and computational intensity, addressing cost-efficiency issues that might arise in future implementations aiming for compact, real-time, low-power operation. Overall, the model proposes a brain-inspired pattern-generation network for vocal sequences that can be extended with other bio-inspired computations to explore their potential for brain-like auditory perception.
Future designs could draw on this model to implement embedded devices that learn vocal sequences and recall them as needed in real time. Such systems could acquire language and speech, operate as artificial assistants, and convert text to speech in the presence of natural noise and corruption in audio data.
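The abstract does not spell out the training rule; as a point of reference, the standard FORCE procedure it builds on trains a linear readout with recursive least squares (RLS) while feeding the readout back into the network. A minimal rate-based sketch (the sizes, toy sine target, and simple Euler dynamics are assumptions, not the paper's spiking Izhikevich setup):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 80, 400, 0.1
target = np.sin(np.linspace(0, 8 * np.pi, T))        # toy "syllable" waveform

J = rng.normal(0.0, 1.5 / np.sqrt(N), (N, N))        # fixed recurrent weights
w = np.zeros(N)                                      # trained linear readout
P = np.eye(N)                                        # running inverse correlation
x = rng.normal(0.0, 0.5, N)                          # reservoir state
errs = []

for step in range(T):
    r = np.tanh(x)
    z = w @ r                                        # readout before update
    errs.append(abs(z - target[step]))
    x += dt * (-x + J @ r + z)                       # readout fed back (FORCE)
    # recursive least squares update of the readout weights
    Pr = P @ r
    k = Pr / (1.0 + r @ Pr)
    P -= np.outer(k, Pr)
    w -= (z - target[step]) * k
```

Hardware-friendly variants typically approximate the full P-matrix update, since the O(N²) RLS step dominates the cost.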
Affiliation(s)
- Nastaran Jannesar
- High Performance Embedded Architecture Lab., School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
- Saeed Safari
- High Performance Embedded Architecture Lab., School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
- Abdol-Hossein Vahabie
- Department of Psychology, Faculty of Psychology and Education, University of Tehran, Tehran, Iran; Cognitive Systems Laboratory, Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.
2
Murakami Y. Fast time-domain solution of the cochlear transmission line model in real-time applications. JASA Express Lett 2024; 4:084402. [PMID: 39158407] [DOI: 10.1121/10.0028278]
Abstract
A fast numerical time-domain solution for a one-dimensional cochlear transmission-line model is proposed for real-time applications. In this approach, the three-dimensional solver developed by Murakami [J. Acoust. Soc. Am. 150(4), 2589-2599 (2021)] was modified into a solution for the one-dimensional model, allowing cochlear responses to be calculated accurately and quickly. The present solution can solve the model in real time under coarse-grid conditions. Under fine-grid conditions the computation time is still significantly longer than the duration of the signal; nevertheless, calculations that previously required substantial computation time can now be performed, which is essential for practical applications.
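For orientation, a one-dimensional transmission-line model can be stepped in the time domain with a simple explicit finite-difference scheme. The sketch below is a generic lossless wave-equation stand-in with an assumed exponential place-frequency map, not Murakami's solver:

```python
import numpy as np

Nx, Nt = 200, 500
dx, dt = 1.0 / Nx, 1e-5                    # grid spacing (normalized), time step
x = np.linspace(0.0, 1.0, Nx)
c = 50.0 * np.exp(-2.0 * x)                # wave speed falling toward the apex

p_prev = np.zeros(Nx)                      # pressure at step n-1
p = np.zeros(Nx)                           # pressure at step n

for n in range(Nt):
    lap = np.zeros(Nx)
    lap[1:-1] = (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2
    p_next = 2.0 * p - p_prev + (c * dt) ** 2 * lap    # leapfrog wave update
    p_next[0] = np.sin(2.0 * np.pi * 1000.0 * n * dt)  # stapes drive at the base
    p_next[-1] = 0.0                                   # helicotrema boundary
    p_prev, p = p, p_next
```

The explicit scheme is only stable when c·dt/dx ≤ 1, which is one reason fine grids force small time steps and long computation times.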
Affiliation(s)
- Yasuki Murakami
- Faculty of Design, Kyushu University, 4-9-1 Shiobaru, Minamiku, Fukuoka 815-8540,
3
Thoret E, Ystad S, Kronland-Martinet R. Hearing as adaptive cascaded envelope interpolation. Commun Biol 2023; 6:671. [PMID: 37355702] [PMCID: PMC10290642] [DOI: 10.1038/s42003-023-05040-5]
Abstract
The human auditory system is designed to capture and encode sounds from our surroundings and conspecifics. However, the precise mechanisms by which it adaptively extracts the most important spectro-temporal information from sounds are still not fully understood. Previous auditory models have explained sound encoding at the cochlear level using static filter banks, but this view is incompatible with the nonlinear and adaptive properties of the auditory system. Here we propose an approach that treats cochlear processing as envelope interpolation inspired by cochlear physiology. It unifies linear and nonlinear adaptive behaviors into a single comprehensive framework that provides a data-driven understanding of auditory coding. It can simulate a broad range of psychophysical phenomena, from virtual pitches and combination tones to consonance and dissonance of harmonic sounds, and further predicts properties of the cochlear filters such as frequency selectivity. We also propose a possible link between the parameters of the model and the density of hair cells on the basilar membrane. Cascaded Envelope Interpolation may lead to improvements in sound processing for hearing aids by providing a nonlinear, data-driven way to preprocess acoustic signals consistent with peripheral processes.
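One way to picture envelope interpolation is as repeated interpolation between local maxima of the signal magnitude, each stage extracting a slower envelope. The sketch below is an illustration of that general idea only; the peak-picking rule and the AM test signal are assumptions, not the authors' algorithm:

```python
import numpy as np

def envelope_interp(sig, t):
    """One stage: interpolate between local maxima of |sig|."""
    mag = np.abs(sig)
    peaks = np.where((mag[1:-1] >= mag[:-2]) & (mag[1:-1] >= mag[2:]))[0] + 1
    if peaks.size < 2:
        return mag
    return np.interp(t, t[peaks], mag[peaks])

fs = 8000
t = np.arange(0.0, 0.05, 1.0 / fs)
carrier = np.sin(2 * np.pi * 1000 * t)
am = (1.0 + 0.8 * np.sin(2 * np.pi * 50 * t)) * carrier    # 50 Hz AM on 1 kHz

env1 = envelope_interp(am, t)                  # stage 1: tracks the AM envelope
env2 = envelope_interp(env1 - env1.mean(), t)  # stage 2: envelope of the envelope
```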
Affiliation(s)
- Etienne Thoret
- Aix Marseille Univ, CNRS, UMR7061 PRISM, UMR7020 LIS, Marseille, France.
- Institute of Language, Communication, and the Brain (ILCB), Marseille, France.
- Sølvi Ystad
- CNRS, Aix Marseille Univ, UMR 7061 PRISM, Marseille, France.
4
Sabesan S, Fragner A, Bench C, Drakopoulos F, Lesica NA. Large-scale electrophysiology and deep learning reveal distorted neural signal dynamics after hearing loss. eLife 2023; 12:e85108. [PMID: 37162188] [PMCID: PMC10202456] [DOI: 10.7554/elife.85108]
Abstract
Listeners with hearing loss often struggle to understand speech in noise, even with a hearing aid. To better understand the auditory processing deficits that underlie this problem, we made large-scale brain recordings from gerbils, a common animal model for human hearing, while presenting a large database of speech and noise sounds. We first used manifold learning to identify the neural subspace in which speech is encoded and found that it is low-dimensional and that the dynamics within it are profoundly distorted by hearing loss. We then trained a deep neural network (DNN) to replicate the neural coding of speech with and without hearing loss and analyzed the underlying network dynamics. We found that hearing loss primarily impacts spectral processing, creating nonlinear distortions in cross-frequency interactions that result in a hypersensitivity to background noise that persists even after amplification with a hearing aid. Our results identify a new focus for efforts to design improved hearing aids and demonstrate the power of DNNs as a tool for the study of central brain structures.
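The manifold-learning step can be illustrated with plain PCA: if population activity lies in a low-dimensional subspace, a few principal components capture most of the variance. A toy sketch with synthetic responses (all dimensions and the 90% variance criterion are assumptions, not the study's actual analysis):

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_time, latent_dim = 100, 500, 3

latents = rng.normal(size=(n_time, latent_dim))      # hidden low-dim dynamics
mixing = rng.normal(size=(latent_dim, n_neurons))
responses = latents @ mixing + 0.05 * rng.normal(size=(n_time, n_neurons))

centered = responses - responses.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
var_ratio = s**2 / np.sum(s**2)
# number of components needed to reach 90% of the variance
dim = int(np.searchsorted(np.cumsum(var_ratio), 0.90) + 1)
```

With the small noise level above, the recovered dimensionality matches the planted subspace rather than the full neuron count.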
Affiliation(s)
- Ciaran Bench
- Ear Institute, University College London, London, United Kingdom
5
Sadagopan S, Kar M, Parida S. Quantitative models of auditory cortical processing. Hear Res 2023; 429:108697. [PMID: 36696724] [PMCID: PMC9928778] [DOI: 10.1016/j.heares.2023.108697]
Abstract
To generate insight from experimental data, it is critical to understand the inter-relationships between individual data points and place them in context within a structured framework. Quantitative modeling can provide the scaffolding for such an endeavor. Our main objective in this review is to provide a primer on the range of quantitative tools available to experimental auditory neuroscientists. Quantitative modeling is advantageous because it can provide a compact summary of observed data, make underlying assumptions explicit, and generate predictions for future experiments. Quantitative models may be developed to characterize or fit observed data, to test theories of how a task may be solved by neural circuits, to determine how observed biophysical details might contribute to measured activity patterns, or to predict how an experimental manipulation would affect neural activity. In complexity, quantitative models can range from those that are highly biophysically realistic and that include detailed simulations at the level of individual synapses, to those that use abstract and simplified neuron models to simulate entire networks. Here, we survey the landscape of recently developed models of auditory cortical processing, highlighting a small selection of models to demonstrate how they help generate insight into the mechanisms of auditory processing. We discuss examples ranging from models that use details of synaptic properties to explain the temporal pattern of cortical responses to those that use modern deep neural networks to gain insight into human fMRI data. We conclude by discussing a biologically realistic and interpretable model that our laboratory has developed to explore aspects of vocalization categorization in the auditory pathway.
Affiliation(s)
- Srivatsun Sadagopan
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA; Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA.
- Manaswini Kar
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, PA, USA
- Satyabrata Parida
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
6
Osses Vecchi A, Varnet L, Carney LH, Dau T, Bruce IC, Verhulst S, Majdak P. A comparative study of eight human auditory models of monaural processing. Acta Acust 2022; 6:17. [PMID: 36325461] [PMCID: PMC9625898] [DOI: 10.1051/aacus/2022008]
Abstract
A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.
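The shared processing stages the comparison is organized around can be mocked up as a simple chain: an auditory filter followed by an inner-hair-cell stage of rectification and low-pass filtering. The sketch below uses a textbook gammatone filter and a one-pole IHC stand-in; the parameter values are assumptions, not those of any of the eight models:

```python
import numpy as np

fs = 16000
t = np.arange(0.0, 0.05, 1.0 / fs)
signal = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 4000 * t)

def gammatone(fc, fs, dur=0.025, order=4):
    """Gammatone impulse response with an ERB-based bandwidth."""
    tt = np.arange(0.0, dur, 1.0 / fs)
    erb = 24.7 + fc / 9.265
    g = tt ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb * tt) \
        * np.cos(2 * np.pi * fc * tt)
    return g / np.max(np.abs(g))

def ihc_stage(x, fs, cutoff=1000.0):
    """Half-wave rectification plus a one-pole low-pass: a crude IHC model."""
    y = np.maximum(x, 0.0)
    a = np.exp(-2 * np.pi * cutoff / fs)
    out = np.zeros_like(y)
    acc = 0.0
    for i, v in enumerate(y):
        acc = a * acc + (1.0 - a) * v
        out[i] = acc
    return out

bm_1k = np.convolve(signal, gammatone(1000.0, fs), mode="same")  # cochlear channel
ihc_1k = ihc_stage(bm_1k, fs)                                    # receptor potential
```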
Affiliation(s)
- Alejandro Osses Vecchi
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Léo Varnet
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Laurel H. Carney
- Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY 14642, USA
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Ian C. Bruce
- Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada
- Sarah Verhulst
- Hearing Technology group, WAVES, Department of Information Technology, Ghent University, 9000 Ghent, Belgium
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
7
Saremi A, Stenfelt S. The effects of noise-induced hair cell lesions on cochlear electromechanical responses: A computational approach using a biophysical model. Int J Numer Method Biomed Eng 2022; 38:e3582. [PMID: 35150464] [PMCID: PMC9286811] [DOI: 10.1002/cnm.3582]
Abstract
A biophysically inspired signal-processing model of the human cochlea is deployed to simulate the effects of specific noise-induced inner hair cell (IHC) and outer hair cell (OHC) lesions on hearing thresholds, cochlear compression, and the spectral and temporal features of auditory nerve (AN) coding. The model predictions were evaluated by comparison with corresponding data from animal studies as well as human clinical observations. Hearing thresholds were simulated for specific OHC and IHC damage, and cochlear nonlinearity was assessed at 0.5 and 4 kHz. Tuning curves were estimated at 1 kHz, and the model distinguished the contributions of OHC and IHC pathologies to the tuning curve. Furthermore, the phase locking of AN spikes was simulated in quiet and in the presence of noise. The model predicts that phase locking deteriorates drastically in noise, indicating the disruptive effect of background noise on temporal coding in cases of hearing impairment. Moreover, the paper presents an example in which the model is inversely configured for diagnostic purposes using an optimization technique (the Nelder-Mead method): the model finds a specific pattern of OHC lesions that reproduces the audiometric hearing loss measured in a group of humans with noise-induced hearing impairment.
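The inverse-configuration idea is: pick lesion parameters, run the forward model, and minimize the mismatch with the measured audiogram. The sketch below uses a deliberately toy two-parameter forward model and a simple derivative-free coordinate search standing in for the Nelder-Mead optimizer; nothing here reproduces the paper's actual cochlear model:

```python
import numpy as np

freqs = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0])

def forward_model(params):
    """Toy forward model: OHC damage grows toward high frequencies and
    produces a sloping threshold shift (illustrative only)."""
    base, slope = params
    damage = np.clip(base + slope * np.log2(freqs / 250.0), 0.0, 1.0)
    return 55.0 * damage                        # up to ~55 dB OHC-related loss

# "measured" audiogram: true parameters plus measurement noise
measured = forward_model((0.1, 0.12)) + np.array([1.0, -1.0, 2.0, 0.0, -2.0, 1.0])

def loss(params):
    return float(np.mean((forward_model(params) - measured) ** 2))

# derivative-free coordinate search standing in for Nelder-Mead
p = np.array([0.5, 0.0])
step = np.array([0.25, 0.25])
for _ in range(200):
    for i in range(2):
        for delta in (step[i], -step[i]):
            trial = p.copy()
            trial[i] += delta
            if loss(trial) < loss(p):
                p = trial
    step *= 0.9
```

The fitted parameters land near the noise floor of the synthetic audiogram, which is the same success criterion an inverse diagnostic fit would use.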
Affiliation(s)
- Amin Saremi
- Department of Applied Physics and Electronics, Umeå University, Umeå, Sweden
- Stefan Stenfelt
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
8
Islam MA, Xu Y, Monk T, Afshar S, van Schaik A. Noise-robust text-dependent speaker identification using cochlear models. J Acoust Soc Am 2022; 151:500. [PMID: 35105043] [DOI: 10.1121/10.0009314]
Abstract
One challenging issue in speaker identification (SID) is achieving noise-robust performance. Humans can accurately identify speakers even in noisy environments, and we can leverage knowledge of the function and anatomy of the human auditory pathway to design SID systems with better noise robustness than conventional approaches. We propose a text-dependent SID system based on a real-time cochlear model, the cascade of asymmetric resonators with fast-acting compression (CARFAC). We investigate the SID performance of CARFAC on signals corrupted by noise of various types and levels, and compare it with conventional auditory feature generators, including mel-frequency cepstral coefficients and frequency-domain linear prediction, as well as another biologically inspired model, the auditory nerve model. CARFAC outperforms the other approaches when signals are corrupted by noise, and the results are consistent across datasets, noise types and levels, speaking speeds, and back-end classifiers. We show that the noise-robust SID performance of CARFAC is largely due to its nonlinear processing of auditory input signals. Presumably, the human auditory system achieves noise-robust performance via inherent nonlinearities as well.
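A text-dependent SID back end can be as simple as matching utterance-level features against enrolled templates. The sketch below uses crude log-mel-style features and correlation-based matching in place of the CARFAC/MFCC front ends and classifiers actually compared; the signals, speakers, and noise level are all invented:

```python
import numpy as np

def logmel_features(sig, n_fft=256, n_mel=20):
    """Crude utterance-level log-mel-style features (binned FFT power)."""
    frames = sig[: len(sig) // n_fft * n_fft].reshape(-1, n_fft)
    power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    edges = np.linspace(0, power.shape[1], n_mel + 1).astype(int)
    bands = [power[:, a:b].mean(axis=1) for a, b in zip(edges[:-1], edges[1:])]
    return np.log(np.stack(bands, axis=1) + 1e-9).mean(axis=0)

rng = np.random.default_rng(2)
fs = 8000
t = np.arange(0.0, 0.5, 1.0 / fs)
speaker_a = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 800 * t)
speaker_b = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

templates = {"A": logmel_features(speaker_a), "B": logmel_features(speaker_b)}
noisy_a = speaker_a + 0.3 * rng.normal(size=t.size)        # corrupted test input
feat = logmel_features(noisy_a)
pred = max(templates, key=lambda k: np.corrcoef(feat, templates[k])[0, 1])
```

Correlation-based matching is used here because it ignores the broadband noise-floor offset that additive noise puts on the log spectrum.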
Affiliation(s)
- Md Atiqul Islam
- International Centre for Neuromorphic Systems in the MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, New South Wales, 2751, Australia
- Ying Xu
- International Centre for Neuromorphic Systems in the MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, New South Wales, 2751, Australia
- Travis Monk
- International Centre for Neuromorphic Systems in the MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, New South Wales, 2751, Australia
- Saeed Afshar
- International Centre for Neuromorphic Systems in the MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, New South Wales, 2751, Australia
- André van Schaik
- International Centre for Neuromorphic Systems in the MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, New South Wales, 2751, Australia
9
Yamazaki H, Tsuji T, Doi K, Kawano S. Mathematical model of the auditory nerve response to stimulation by a micro-machined cochlea. Int J Numer Method Biomed Eng 2021; 37:e3430. [PMID: 33336933] [DOI: 10.1002/cnm.3430]
Abstract
We report a novel mathematical model of an artificial auditory system consisting of a micro-machined cochlea and the auditory nerve response it evokes. The modeled micro-machined cochlea is one previously realized experimentally by mimicking functions of the cochlea [Shintaku et al, Sens. Actuat. 158 (2010) 183-192; Inaoka et al, Proc. Natl. Acad. Sci. USA 108 (2011) 18390-18395]. First, from the viewpoint of mechanical engineering, the frequency characteristics of a model device were experimentally investigated to develop an artificial basilar membrane based on a spring-mass-damper system. A nonlinear feedback controller mimicking the function of the outer hair cells was incorporated in this experimental system: the device reproduces the proportional relationship between the oscillation amplitude of the basilar membrane and the cube root of the sound pressure observed in the mammalian auditory system, which is what gives it a wide dynamic range, and the control performance was evaluated numerically and experimentally. Furthermore, the stimulation of the auditory nerve by the micro-machined cochlea was investigated using the present mathematical model, and the simulation results were compared with our previous experimental results from animal testing [Shintaku et al, J. Biomech. Sci. Eng. 8 (2013) 198-208]. The simulation results were in reasonably good agreement with the animal test: there is a threshold at which excitation of the nerve starts and a saturation value for the firing rate under large inputs. The proposed numerical model qualitatively reproduces the animal-test results and is thus expected to guide the evaluation of micro-machined cochleae in future animal experiments.
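The cube-root relationship can be illustrated analytically: for a driven spring-mass-damper segment, the linear steady-state amplitude is proportional to pressure, so an OHC-like controller abstracted as a static cube-root map yields amplitude ∝ P^(1/3). The sketch below uses assumed parameter values, not the device's:

```python
import numpy as np

def linear_amplitude(pressure, f_drive, f0=1000.0, q=10.0):
    """Steady-state amplitude of a driven spring-mass-damper BM segment."""
    m = 1e-6
    k = (2 * np.pi * f0) ** 2 * m              # stiffness sets resonance at f0
    c = np.sqrt(k * m) / q                     # light damping
    w = 2 * np.pi * f_drive
    return pressure / np.sqrt((k - m * w**2) ** 2 + (c * w) ** 2)

def controlled_amplitude(pressure, f_drive):
    """OHC-like active control abstracted as a static cube-root compression."""
    return linear_amplitude(pressure, f_drive) ** (1.0 / 3.0)

# doubling the pressure scales the controlled amplitude by 2**(1/3)
ratio = controlled_amplitude(2.0, 1000.0) / controlled_amplitude(1.0, 1000.0)
```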
Affiliation(s)
- Hiroki Yamazaki
- Graduate School of Engineering Science, Osaka University, Osaka, Japan
- Tetsuro Tsuji
- Graduate School of Engineering Science, Osaka University, Osaka, Japan
- Kentaro Doi
- Graduate School of Engineering Science, Osaka University, Osaka, Japan
- Satoyuki Kawano
- Graduate School of Engineering Science, Osaka University, Osaka, Japan
10
Harnessing the power of artificial intelligence to transform hearing healthcare and research. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00394-z]
11
Baby D, Van Den Broucke A, Verhulst S. A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications. Nat Mach Intell 2021; 3:134-143. [PMID: 33629031] [PMCID: PMC7116797] [DOI: 10.1038/s42256-020-00286-8]
Abstract
Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications.
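Architecturally, CoNNear-style models pair a strided convolutional encoder with an upsampling decoder so the whole pipeline stays differentiable and fast. A bare-bones numpy sketch of that shape, with random untrained weights and a nearest-neighbour upsampler standing in for a true transposed convolution:

```python
import numpy as np

rng = np.random.default_rng(3)

def conv1d_strided(x, w, stride=2):
    """Valid 1-D convolution with stride (the encoder's downsampling step)."""
    K = w.size
    out_len = (x.size - K) // stride + 1
    return np.array([x[i * stride: i * stride + K] @ w for i in range(out_len)])

def upsample_conv(x, w):
    """Nearest-neighbour upsampling by 2 plus convolution (decoder stand-in)."""
    return np.convolve(np.repeat(x, 2), w, mode="same")

audio = np.sin(2 * np.pi * 0.03 * np.arange(512))    # toy input segment
w_enc = rng.normal(0.0, 0.3, 8)                      # untrained encoder kernel
w_dec = rng.normal(0.0, 0.3, 8)                      # untrained decoder kernel

latent = np.tanh(conv1d_strided(audio, w_enc))       # encoder halves the rate
recon = upsample_conv(latent, w_dec)                 # decoder restores the rate
```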
Affiliation(s)
- Deepak Baby
- Hearing Technology @ WAVES, Dept. of Information Technology, Ghent University, 9000 Ghent, Belgium
- Arthur Van Den Broucke
- Hearing Technology @ WAVES, Dept. of Information Technology, Ghent University, 9000 Ghent, Belgium
- Sarah Verhulst
- Hearing Technology @ WAVES, Dept. of Information Technology, Ghent University, 9000 Ghent, Belgium
12
Yamazaki H, Yamanaka D, Kawano S. A Preliminary Prototype High-Speed Feedback Control of an Artificial Cochlear Sensory Epithelium Mimicking Function of Outer Hair Cells. Micromachines 2020; 11:644. [PMID: 32610696] [PMCID: PMC7407979] [DOI: 10.3390/mi11070644]
Abstract
A novel feedback control technique for the local oscillation amplitude in an artificial cochlear sensory epithelium, mimicking the function of the outer hair cells in the cochlea, is developed and implemented with a control time on the order of hundreds of milliseconds. The prototype artificial cochlear sensory epithelium improves on the device from our previous study, enabling instantaneous determination of the local resonance position from the electrical output of a bimorph piezoelectric membrane. The device contains local patterned electrodes, deposited with micro-electro-mechanical-systems (MEMS) technology, that detect the electrical output and oscillate the device by applying local electrical stimuli. The main feature of the present feedback control system is that the resonance position is recognized by simultaneously measuring the local electrical outputs of all electrodes and comparing their magnitudes, which drastically reduces the feedback control time. Controlling the local oscillation of the device takes 0.8 s, roughly one hundred times faster than in our previous study, which used a mechanical automatic stage to scan the oscillation amplitude at each electrode. Intrinsic experimental difficulties, such as electrical measurement against electromagnetic noise, adhesion of materials, and the fatigue-failure mechanism of the oscillation system, are also discussed in detail. This basic knowledge of MEMS fabrication and experimental measurement should provide useful suggestions for future research, and the proposed preliminary prototype of high-speed feedback control can aid the future development of fully implantable cochlear implants with a wider dynamic range.
13
Abstract
This study presents a computational model that reproduces the biological dynamics of "listening to music." A biologically plausible model of periodicity pitch detection is proposed and simulated. Periodicity pitch is computed across a range of the auditory spectrum and detected from subsets of activated auditory nerve fibers (ANFs). These activate connected model octopus cells, which trigger model neurons detecting onsets and offsets; model interval-tuned neurons are then innervated at the right interval times; and finally, a set of common interval-detecting neurons indicates pitch. Octopus cells spike rhythmically with the pitch periodicity of the sound. Batteries of interval-tuned neurons measure, stopwatch-like, the inter-spike intervals of the octopus cells by coding interval durations as first-spike latencies (FSLs). The FSL-triggered spikes coincide synchronously, through a monolayer spiking neural network, at the corresponding receiver pitch neurons.
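The final step, reading pitch out of period-locked firing, can be illustrated with an inter-spike interval histogram whose global maximum marks the period. A toy sketch with synthetic period-locked spikes (the jitter level and bin width are assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
f0 = 200.0                                   # pitch of the toy stimulus, Hz
periods = np.arange(0.0, 0.5, 1.0 / f0)      # one locked spike per period
spikes = periods + rng.normal(0.0, 0.0002, periods.size)   # timing jitter, s

isis = np.diff(spikes)                       # inter-spike intervals
hist, edges = np.histogram(isis, bins=np.arange(0.0, 0.02, 1e-4))
best = int(np.argmax(hist))
period_est = 0.5 * (edges[best] + edges[best + 1])   # bin-centre period
pitch_est = 1.0 / period_est                 # recovered pitch, near 200 Hz
```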
Affiliation(s)
- Frank Klefenz
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
- Tamas Harczos
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Göttingen, Germany
- audifon GmbH & Co. KG, Kölleda, Germany
14
Parthasarathy A, Bartlett EL, Kujawa SG. Age-related Changes in Neural Coding of Envelope Cues: Peripheral Declines and Central Compensation. Neuroscience 2019; 407:21-31. [DOI: 10.1016/j.neuroscience.2018.12.007]
15
Alkhairy SA, Shera CA. An analytic physically motivated model of the mammalian cochlea. J Acoust Soc Am 2019; 145:45. [PMID: 30710944] [PMCID: PMC6320697] [DOI: 10.1121/1.5084042]
Abstract
In this paper, an analytic model of the mammalian cochlea is developed using a mixed physical-phenomenological approach that draws on existing work on the physics of classical box representations of the cochlea and on the behavior of recent data-derived wavenumber estimates. Spatial variation is incorporated through a single independent variable that combines space and frequency. The paper arrives at closed-form expressions for the organ of Corti velocity, its impedance, the pressure difference across the organ of Corti, and its wavenumber. The model is tested against the real and imaginary parts of chinchilla data from multiple locations and for multiple variables, and it predicts impedances that are qualitatively consistent with the current literature. For implementation, the model can leverage existing filter-bank or filter-cascade efforts that target improved algorithmic or analog-circuit efficiency. The model's simplicity, small number of constants, ability to capture the variation of tuning, and closed-form expressions for physically interrelated variables, in a form that allows one variable to be easily determined from another, make it appropriate for analytic and digital auditory filter implementations as discussed here, as well as for extracting macromechanical insights into how the cochlea works.
Affiliation(s)
- Samiya A Alkhairy
- Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
16
Harczos T, Klefenz FM. Modeling Pitch Perception With an Active Auditory Model Extended by Octopus Cells. Front Neurosci 2018; 12:660. [PMID: 30319340] [PMCID: PMC6167605] [DOI: 10.3389/fnins.2018.00660]
Abstract
Pitch is an essential category for musical sensations, and models of pitch perception remain actively discussed. Most rely on mathematical methods defined in the spectral or temporal domain. Our proposed pitch perception model is composed of an active auditory model extended by octopus cells. The active auditory model is the same as that used in Stimulation based on Auditory Modeling (SAM), a successful cochlear implant sound-processing strategy; here it is extended by modeling the functional behavior of the octopus cells in the ventral cochlear nucleus and their connections to the auditory nerve fibers (ANFs). The neurophysiological parameterization of the extended model is described fully in the time domain. The model is based on latency-phase encoding and decoding, as octopus cells act as latency-phase rectifiers in their local receptive fields. Pitch is ubiquitously represented by cascaded firing sweeps of octopus cells. From the firing patterns of octopus cells, inter-spike interval histograms can be aggregated, in which the location of the global maximum is assumed to encode the pitch.
Affiliation(s)
- Tamas Harczos
- Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Goettingen, Germany
- Institut für Mikroelektronik- und Mechatronik-Systeme gGmbH, Ilmenau, Germany
17
Moncada-Torres A, Joshi SN, Prokopiou A, Wouters J, Epp B, Francart T. A framework for computational modelling of interaural time difference discrimination of normal and hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:940. [PMID: 30180705 DOI: 10.1121/1.5051322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/19/2017] [Accepted: 08/03/2018] [Indexed: 06/08/2023]
Abstract
Different computational models have been developed to study interaural time difference (ITD) perception, but only a few have used a physiologically inspired architecture to study ITD discrimination, and they do not include aspects of hearing impairment. In this work, a framework was developed to predict ITD thresholds in listeners with normal and impaired hearing. It combines the physiologically inspired model of the auditory periphery proposed by Zilany, Bruce, Nelson, and Carney [(2009). J. Acoust. Soc. Am. 126(5), 2390-2412] as a front end with a coincidence-detection stage and a neurometric decision device as a back end. It was validated by comparing its predictions against behavioral data for narrowband stimuli from the literature. The framework is able to model ITD discrimination of normal-hearing and hearing-impaired listeners at a group level. Additionally, it was used to explore the effect of different proportions of outer- and inner-hair-cell impairment on ITD discrimination.
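The coincidence-detection idea behind such back ends can be illustrated in a few lines: the ITD estimate is read out as the internal delay at which the two ears' signals agree best. The sketch below is a generic cross-correlation readout with hypothetical parameters, not this framework's actual implementation.

```python
import numpy as np

# Generic coincidence-detector readout (illustrative only): the ITD is taken
# as the internal delay that maximizes the correlation between the two ears.
fs = 44100                        # sampling rate, Hz
t = np.arange(2205) / fs          # 50 ms stimulus
itd_true = 300e-6                 # simulated ITD of 300 microseconds
left = np.sin(2 * np.pi * 500 * t)
right = np.sin(2 * np.pi * 500 * (t - itd_true))  # right ear lags the left

max_lag = int(1e-3 * fs)          # internal delays up to +/- 1 ms
lags = np.arange(-max_lag, max_lag + 1)
corr = [np.dot(left[max_lag:-max_lag], np.roll(right, k)[max_lag:-max_lag])
        for k in lags]

# np.roll(right, k) delays `right` by k samples, so the peak sits at k = -d
# when `right` already lags by d samples; negate to recover the ITD.
itd_est = -lags[np.argmax(corr)] / fs
```

A threshold model would then ask how small `itd_true` can get before the readout becomes unreliable in noise.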
Affiliation(s)
- Arturo Moncada-Torres
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Suyash N Joshi
- Department of Electrical Engineering, Hearing Systems, Technical University of Denmark, Ørsteds Plads, Building 352, DK-2800 Kongens Lyngby, Denmark
- Andreas Prokopiou
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Jan Wouters
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
- Bastian Epp
- Department of Electrical Engineering, Hearing Systems, Technical University of Denmark, Ørsteds Plads, Building 352, DK-2800 Kongens Lyngby, Denmark
- Tom Francart
- KU Leuven - University of Leuven, Department of Neurosciences, ExpORL, Herestraat 49, Bus 721, 3000 Leuven, Belgium
18
Saremi A, Lyon RF. Quadratic distortion in a nonlinear cascade model of the human cochlea. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL418. [PMID: 29857771 DOI: 10.1121/1.5038595] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The cascade of asymmetric resonators with fast-acting compression (CARFAC) is a cascade filterbank model that performed well in a comparative study of cochlear models, but exhibited two anomalies in its frequency response and excitation pattern. It is shown here that the underlying reason is CARFAC's inclusion of quadratic distortion, which generates DC and low-frequency components that in a real cochlea would be canceled by reflections at the helicotrema, but since cascade filterbanks lack the reflection mechanism, these low-frequency components cause the observed anomalies. The simulations demonstrate that the anomalies disappear when the model's quadratic distortion parameter is zeroed, while other successful features of the model remain intact.
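The mechanism the abstract describes is easy to reproduce in isolation: a memoryless quadratic term turns a two-tone input into a DC offset plus a difference tone. The sketch below is a bare illustration of that effect with arbitrary parameters, not CAR-FAC's actual nonlinearity.

```python
import numpy as np

# Quadratic distortion in isolation: y = x + c*x^2 applied to a two-tone
# signal creates a DC component and a difference tone at f2 - f1 -- the
# low-frequency components identified as the source of the anomalies.
fs = 16000
t = np.arange(8000) / fs          # 0.5 s, so 1000/1200/200 Hz hit exact bins
f1, f2 = 1000.0, 1200.0
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
y = x + 0.1 * x ** 2              # weak quadratic nonlinearity

spec = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1 / fs)

dc = spec[0]                                            # DC offset
diff_tone = spec[np.argmin(np.abs(freqs - (f2 - f1)))]  # 200 Hz component
```

Zeroing the quadratic coefficient (`0.1` above) removes both components, which mirrors the paper's finding that the anomalies disappear when the model's quadratic distortion parameter is zeroed.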
Affiliation(s)
- Amin Saremi
- Computational Neuroscience and Cluster of Excellence "Hearing4all," Department for Neuroscience, University of Oldenburg, Oldenburg, Germany
- Richard F Lyon
- Google Inc., 1600 Amphitheatre Parkway, Mountain View, California 94043, USA
19
Xu Y, Thakur CS, Singh RK, Hamilton TJ, Wang RM, van Schaik A. A FPGA Implementation of the CAR-FAC Cochlear Model. Front Neurosci 2018; 12:198. [PMID: 29692700 PMCID: PMC5902704 DOI: 10.3389/fnins.2018.00198] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2017] [Accepted: 03/12/2018] [Indexed: 11/19/2022] Open
Abstract
This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on.
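The cascade structure that makes such designs hardware-friendly, with each resonator section feeding the next so energy travels from high-frequency "base" sections toward low-frequency "apex" ones, can be sketched in a few lines. The code below is a toy cascade of two-pole resonators with hypothetical coefficients; it is not the published CAR-FAC design and omits the OHC/IHC and feedback stages entirely.

```python
import numpy as np

# Toy cascade filterbank (illustrative only, not CAR-FAC): each two-pole
# section filters the previous section's output, mimicking the travelling
# wave along the basilar membrane from base (high CF) to apex (low CF).
fs = 44100.0
n_sections = 70
cfs = np.geomspace(8000.0, 100.0, n_sections)  # base-to-apex pole frequencies
damping = 0.2                                  # hypothetical damping ratio

def section_coeffs(cf):
    """Two-pole resonator with unity DC gain for one cascade section."""
    theta = 2 * np.pi * cf / fs
    r = 1 - damping * theta          # pole radius (small-angle approximation)
    a1, a2 = -2 * r * np.cos(theta), r * r
    b0 = 1 + a1 + a2                 # normalize so DC passes with gain 1
    return b0, a1, a2

def cascade(x):
    """Return the output of every section for input signal x."""
    outputs = np.empty((n_sections, len(x)))
    sig = np.asarray(x, dtype=float)
    for i, cf in enumerate(cfs):
        b0, a1, a2 = section_coeffs(cf)
        y = np.zeros_like(sig)
        for n in range(len(sig)):
            y[n] = b0 * sig[n]
            if n >= 1:
                y[n] -= a1 * y[n - 1]
            if n >= 2:
                y[n] -= a2 * y[n - 2]
        outputs[i] = y
        sig = y                      # feed this section's output to the next
    return outputs
```

As in the basilar membrane, a low-frequency tone survives the full cascade while a high-frequency tone is absorbed by the early, high-CF sections; time-multiplexing one physical section across all 70 channels is what keeps the FPGA resource usage low.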
Affiliation(s)
- André van Schaik
- MARCS Institute, Western Sydney University, Sydney, NSW, Australia
20
Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss. Hear Res 2018; 360:55-75. [DOI: 10.1016/j.heares.2017.12.018] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/17/2017] [Revised: 12/17/2017] [Accepted: 12/23/2017] [Indexed: 11/21/2022]
21
Ashida G, Tollin DJ, Kretzberg J. Physiological models of the lateral superior olive. PLoS Comput Biol 2017; 13:e1005903. [PMID: 29281618 PMCID: PMC5744914 DOI: 10.1371/journal.pcbi.1005903] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2017] [Accepted: 11/28/2017] [Indexed: 01/09/2023] Open
Abstract
In computational biology, modeling is a fundamental tool for formulating, analyzing and predicting complex phenomena. Most neuron models, however, are designed to reproduce certain small sets of empirical data. Hence their outcome is usually not compatible or comparable with other models or datasets, making it unclear how widely applicable such models are. In this study, we investigate these aspects of modeling, namely credibility and generalizability, with a specific focus on auditory neurons involved in the localization of sound sources. The primary cues for binaural sound localization are comprised of interaural time and level differences (ITD/ILD), which are the timing and intensity differences of the sound waves arriving at the two ears. The lateral superior olive (LSO) in the auditory brainstem is one of the locations where such acoustic information is first computed. An LSO neuron receives temporally structured excitatory and inhibitory synaptic inputs that are driven by ipsi- and contralateral sound stimuli, respectively, and changes its spike rate according to binaural acoustic differences. Here we examine seven contemporary models of LSO neurons with different levels of biophysical complexity, from predominantly functional ones (‘shot-noise’ models) to those with more detailed physiological components (variations of integrate-and-fire and Hodgkin-Huxley-type). These models, calibrated to reproduce known monaural and binaural characteristics of LSO, generate largely similar results to each other in simulating ITD and ILD coding. 
Our comparisons of physiological detail, computational efficiency, predictive performance, and further expandability of the models demonstrate (1) that the simplistic, functional LSO models are suitable for applications where low computational costs and mathematical transparency are needed, (2) that more complex models with detailed membrane potential dynamics are necessary for simulation studies where sub-neuronal nonlinear processes play important roles, and (3) that, for general purposes, intermediate models might be a reasonable compromise between simplicity and biological plausibility.

Computational models help our understanding of complex biological systems by identifying their key elements and revealing their operational principles. Close comparisons between model predictions and empirical observations ensure our confidence in a model as a building block for further applications. Most current neuronal models, however, are constructed to replicate only a small, specific set of experimental data. Thus, it is usually unclear how these models generalize to different datasets and how they compare with each other. In this paper, seven neuronal models are examined that are designed to reproduce known physiological characteristics of auditory neurons involved in the detection of sound-source location. Despite their different levels of complexity, the models generate largely similar results when their parameters are tuned with common criteria. Comparisons show that simple models are computationally more efficient and theoretically transparent, and are therefore suitable for rigorous mathematical analyses and engineering applications including real-time simulations. In contrast, complex models are necessary for investigating the relationship between underlying biophysical processes and sub- and suprathreshold spiking properties, although they have a large number of unconstrained, unverified parameters.
Having identified their advantages and drawbacks, these auditory neuron models may readily be used for future studies and applications.
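To give a flavor of the intermediate complexity level discussed above, the toy sketch below drives a leaky integrate-and-fire unit with excitatory ipsilateral and inhibitory contralateral input trains; its output rate falls as the contralateral level rises, the signature ILD dependence of LSO neurons. All parameters are hypothetical, and the sketch corresponds to none of the seven published models.

```python
import numpy as np

# Toy LIF "LSO" neuron (illustrative only): excitatory ipsilateral and
# inhibitory contralateral Poisson-like inputs; hypothetical parameters.
rng = np.random.default_rng(0)

def lso_spike_count(ipsi_rate, contra_rate, dur=1.0, dt=1e-4,
                    tau=5e-3, w_exc=0.8, w_inh=0.8, v_th=1.0):
    """Spike count over `dur` seconds for the given input rates (spikes/s)."""
    n = int(dur / dt)
    exc = rng.random(n) < ipsi_rate * dt    # Bernoulli approximation of
    inh = rng.random(n) < contra_rate * dt  # Poisson input trains
    v, spikes = 0.0, 0
    for e, i in zip(exc, inh):
        v += -(v / tau) * dt + w_exc * e - w_inh * i   # leaky integration
        if v >= v_th:
            spikes += 1
            v = 0.0                                     # reset after a spike
    return spikes

# Fixed ipsilateral drive, increasing contralateral (inhibitory) drive:
# the spike count should drop as the contralateral level rises.
counts = [lso_spike_count(300.0, c) for c in (0.0, 300.0, 900.0)]
```

Even this caricature exposes the trade-off the comparison paper quantifies: the model is transparent and cheap to run, but it cannot speak to the sub-neuronal conductance dynamics that the Hodgkin-Huxley-type variants capture.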
Affiliation(s)
- Go Ashida
- Cluster of Excellence "Hearing4all", Department of Neuroscience, University of Oldenburg, Oldenburg, Germany
- Daniel J Tollin
- Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Jutta Kretzberg
- Cluster of Excellence "Hearing4all", Department of Neuroscience, University of Oldenburg, Oldenburg, Germany
22
Dietz M, Lestang JH, Majdak P, Stern RM, Marquardt T, Ewert SD, Hartmann WM, Goodman DFM. A framework for testing and comparing binaural models. Hear Res 2017; 360:92-106. [PMID: 29208336 DOI: 10.1016/j.heares.2017.11.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/30/2017] [Revised: 11/03/2017] [Accepted: 11/24/2017] [Indexed: 11/19/2022]
Abstract
Auditory research has a rich history of combining experimental evidence with computational simulations of auditory processing in order to deepen our theoretical understanding of how sound is processed in the ears and in the brain. Despite significant progress in the amount of detail and breadth covered by auditory models, for many components of the auditory pathway there are still different model approaches that are often not equivalent but rather in conflict with each other. Similarly, some experimental studies yield conflicting results, which has led to controversies that are best resolved by systematic comparison of multiple experimental data sets and model approaches. Binaural processing is a prominent example of how the development of quantitative theories can advance our understanding of the phenomena, but several unresolved questions remain for which competing model approaches exist. This article discusses a number of currently unresolved or disputed issues in binaural modelling, as well as some of the significant challenges in comparing binaural models with each other and with the experimental data. We introduce an auditory model framework which we believe can become a useful infrastructure for resolving some of the current controversies: it runs models through the same paradigms that are used experimentally. The core of the proposed framework is an interface that connects three components irrespective of their underlying programming language: the experiment software, an auditory pathway model, and task-dependent decision stages called artificial observers that provide the same output format as a test subject.
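The three-component separation the authors describe (experiment software, pathway model, artificial observer) can be caricatured in a few lines. Everything below is a hypothetical sketch of that structure, not the framework's actual interface or API; the pathway model is a deliberately trivial stand-in.

```python
import numpy as np

# Caricature of the three components: experiment software generates stimuli,
# an auditory pathway model reduces them to a decision variable, and an
# artificial observer answers in the same format a test subject would.
fs = 44100

def experiment_trial(itd):
    """Experiment software: a 500 Hz binaural tone with the requested ITD."""
    t = np.arange(4410) / fs      # 100 ms
    return np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 500 * (t - itd))

def pathway_model(left, right):
    """Pathway model: decision variable = best-matching internal delay."""
    lags = np.arange(-40, 41)
    corr = [np.dot(left[40:-40], np.roll(right, k)[40:-40]) for k in lags]
    return -lags[np.argmax(corr)] / fs

def artificial_observer(dv):
    """Artificial observer: subject-style response to the decision variable."""
    return "left" if dv > 0 else "right"

# With a positive ITD the right ear lags, so the stimulus lateralizes left.
response = artificial_observer(pathway_model(*experiment_trial(500e-6)))
```

Because each stage only sees the previous stage's output, any of the three can be swapped out, which is exactly what makes such an interface useful for comparing competing pathway models on identical paradigms.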
Affiliation(s)
- Mathias Dietz
- National Centre for Audiology, Western University, London, ON, Canada.
- Jean-Hugues Lestang
- Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom
- Piotr Majdak
- Institut für Schallforschung, Österreichische Akademie der Wissenschaften, Wien, Austria
- Stephan D Ewert
- Medizinische Physik, Universität Oldenburg, Oldenburg, Germany
- Dan F M Goodman
- Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom