1
Smith SS, Sollini J, Akeroyd MA. Inferring the basis of binaural detection with a modified autoencoder. Front Neurosci 2023; 17:1000079. [PMID: 36777633 PMCID: PMC9909603 DOI: 10.3389/fnins.2023.1000079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 01/02/2023] [Indexed: 01/28/2023]
Abstract
The binaural system utilizes interaural timing cues to improve the detection of auditory signals presented in noise. In humans, the binaural mechanisms underlying this phenomenon cannot be directly measured and hence remain contentious. As an alternative, we trained modified autoencoder networks to mimic human-like behavior in a binaural detection task. The autoencoder architecture emphasizes interpretability and, hence, we "opened it up" to see if it could infer latent mechanisms underlying binaural detection. We found that the optimal networks automatically developed artificial neurons with sensitivity to timing cues and with dynamics consistent with a cross-correlation mechanism. These computations were similar to neural dynamics reported in animal models. That these computations emerged to account for human hearing attests to their generality as a solution for binaural signal detection. This study examines the utility of explanatory-driven neural network models and how they may be used to infer mechanisms of audition.
Affiliation(s)
- Samuel S Smith
  - Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, University of Nottingham, Nottingham, United Kingdom
- Joseph Sollini
  - Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, University of Nottingham, Nottingham, United Kingdom
- Michael A Akeroyd
  - Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, University of Nottingham, Nottingham, United Kingdom
2
Eurich B, Encke J, Ewert SD, Dietz M. Lower interaural coherence in off-signal bands impairs binaural detection. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:3927. [PMID: 35778173 DOI: 10.1121/10.0011673] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2021] [Accepted: 05/24/2022] [Indexed: 06/15/2023]
Abstract
Differences in interaural phase configuration between a target and a masker can lead to substantial binaural unmasking. This effect is decreased for masking noises with an interaural time difference (ITD). Adding a second noise with an opposing ITD in most cases further reduces binaural unmasking. Thus far, modeling of these detection thresholds required both a mechanism for internal ITD compensation and an increased filter bandwidth. An alternative explanation for the reduction is that unmasking is impaired by the lower interaural coherence in off-frequency regions caused by the second masker [Marquardt and McAlpine (2009). J. Acoust. Soc. Am. 126(6), EL177-EL182]. Based on this hypothesis, the current work proposes a quantitative multi-channel model using monaurally derived peripheral filter bandwidths and an across-channel incoherence interference mechanism. This mechanism differs from wider filters since it has no effect when the masker coherence is constant across frequency bands. Combined with a monaural energy discrimination pathway, the model predicts the differences between a single delayed noise and two opposingly delayed noises as well as four other data sets. It helps resolve the inconsistency that simulating some data requires wide filters while others require narrow filters.
Affiliation(s)
- Bernhard Eurich
  - Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Jörg Encke
  - Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Stephan D Ewert
  - Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
- Mathias Dietz
  - Department für Medizinische Physik und Akustik, Universität Oldenburg, 26111 Oldenburg, Germany
3
Osses Vecchi A, Varnet L, Carney LH, Dau T, Bruce IC, Verhulst S, Majdak P. A comparative study of eight human auditory models of monaural processing. ACTA ACUSTICA. EUROPEAN ACOUSTICS ASSOCIATION 2022; 6:17. [PMID: 36325461 PMCID: PMC9625898 DOI: 10.1051/aacus/2022008] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 05/19/2023]
Abstract
A number of auditory models have been developed using diverging approaches, either physiological or perceptual, but they share comparable stages of signal processing, as they are inspired by the same constitutive parts of the auditory system. We compare eight monaural models that are openly accessible in the Auditory Modelling Toolbox. We discuss the considerations required to make the model outputs comparable to each other, as well as the results for the following model processing stages or their equivalents: Outer and middle ear, cochlear filter bank, inner hair cell, auditory nerve synapse, cochlear nucleus, and inferior colliculus. The discussion includes a list of recommendations for future applications of auditory models.
Affiliation(s)
- Alejandro Osses Vecchi
  - Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Léo Varnet
  - Laboratoire des systèmes perceptifs, Département d’études cognitives, École Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Laurel H. Carney
  - Departments of Biomedical Engineering and Neuroscience, University of Rochester, Rochester, NY 14642, USA
- Torsten Dau
  - Hearing Systems Section, Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Ian C. Bruce
  - Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada
- Sarah Verhulst
  - Hearing Technology group, WAVES, Department of Information Technology, Ghent University, 9000 Ghent, Belgium
- Piotr Majdak
  - Acoustics Research Institute, Austrian Academy of Sciences, 1040 Vienna, Austria
4
Osses Vecchi A, Kohlrausch A. Perceptual similarity between piano notes: Simulations with a template-based perception model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:3534. [PMID: 34241098 DOI: 10.1121/10.0004818] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/31/2020] [Accepted: 04/11/2021] [Indexed: 06/13/2023]
Abstract
In this paper, the auditory model developed by Dau, Kollmeier, and Kohlrausch [(1997). J. Acoust. Soc. Am. 102, 2892-2905] was used to simulate the perceptual similarity between complex sounds. As complex sounds, a set of piano recordings was used, whose perceptual similarity has recently been measured by Osses, Kohlrausch, and Chaigne [(2019). J. Acoust. Soc. Am. 146, 1024-1035] using a three-alternative forced-choice discrimination task in noise. To simulate this discrimination task, the auditory model required a new back-end stage, the central processor, which is preceded by several processing stages that are to a greater or lesser extent inspired by physiological aspects of the normal-hearing system. Therefore, a comprehensive review of the model parameters as used in the literature is given, indicating the fixed set of parameter values that is used in all simulations. Due to the perceptual relevance of the piano note onsets, this review includes an in-depth description of the auditory adaptation stage, the adaptation loops. A moderate to high correlation was found between the simulation results and existing experimental data.
Affiliation(s)
- Alejandro Osses Vecchi
  - Human-Technology Interaction Group, Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5600MB Eindhoven, The Netherlands
- Armin Kohlrausch
  - Human-Technology Interaction Group, Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5600MB Eindhoven, The Netherlands
5
Dietz M, Lestang JH, Majdak P, Stern RM, Marquardt T, Ewert SD, Hartmann WM, Goodman DFM. A framework for testing and comparing binaural models. Hear Res 2017; 360:92-106. [PMID: 29208336 DOI: 10.1016/j.heares.2017.11.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/30/2017] [Revised: 11/03/2017] [Accepted: 11/24/2017] [Indexed: 11/19/2022]
Abstract
Auditory research has a rich history of combining experimental evidence with computational simulations of auditory processing in order to deepen our theoretical understanding of how sound is processed in the ears and in the brain. Despite significant progress in the amount of detail and breadth covered by auditory models, for many components of the auditory pathway there are still different model approaches that are often not equivalent but rather in conflict with each other. Similarly, some experimental studies yield conflicting results which has led to controversies. This can be best resolved by a systematic comparison of multiple experimental data sets and model approaches. Binaural processing is a prominent example of how the development of quantitative theories can advance our understanding of the phenomena, but there remain several unresolved questions for which competing model approaches exist. This article discusses a number of current unresolved or disputed issues in binaural modelling, as well as some of the significant challenges in comparing binaural models with each other and with the experimental data. We introduce an auditory model framework, which we believe can become a useful infrastructure for resolving some of the current controversies. It operates models over the same paradigms that are used experimentally. The core of the proposed framework is an interface that connects three components irrespective of their underlying programming language: The experiment software, an auditory pathway model, and task-dependent decision stages called artificial observers that provide the same output format as the test subject.
Affiliation(s)
- Mathias Dietz
  - National Centre for Audiology, Western University, London, ON, Canada
- Jean-Hugues Lestang
  - Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom
- Piotr Majdak
  - Institut für Schallforschung, Österreichische Akademie der Wissenschaften, Wien, Austria
- Stephan D Ewert
  - Medizinische Physik, Universität Oldenburg, Oldenburg, Germany
- Dan F M Goodman
  - Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom
6
Bernstein LR, Trahiotis C. An interaural-correlation-based approach that accounts for a wide variety of binaural detection data. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1150. [PMID: 28253652 DOI: 10.1121/1.4976098] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 06/06/2023]
Abstract
Interaural cross-correlation-based models of binaural processing have accounted successfully for a wide variety of binaural phenomena, including binaural detection, binaural discrimination, and measures of extents of laterality based on interaural temporal disparities, interaural intensitive disparities, and their combination. This report focuses on quantitative accounts of data obtained from binaural detection experiments published over five decades. Particular emphasis is placed on stimulus contexts for which commonly used correlation-based approaches fail to provide adequate explanations of the data. One such context concerns binaural detection of signals masked by certain noises that are narrow-band and/or interaurally partially correlated. It is shown that a cross-correlation-based model that includes stages of peripheral auditory processing can, when coupled with an appropriate decision variable, account well for a wide variety of classic and recently published binaural detection data including those that have, heretofore, proven to be problematic.
Affiliation(s)
- Leslie R Bernstein
  - Departments of Neuroscience and Surgery (Otolaryngology), University of Connecticut Health Center, Farmington, Connecticut 06030, USA
- Constantine Trahiotis
  - Departments of Neuroscience and Surgery (Otolaryngology), University of Connecticut Health Center, Farmington, Connecticut 06030, USA
7
Camalier CR, Grantham DW, Bernstein LR. Binaural interference: effects of temporal interferer fringe and interstimulus interval. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:789-795. [PMID: 25234887 DOI: 10.1121/1.4861351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Binaural interference refers to the phenomenon in which the potency of binaural cues conveyed by a "target" stimulus occupying one spectral region is degraded by the presence of an "interferer" stimulus occupying a spectral region remote from the target. It is typified by conditions in which thresholds for detection of interaural temporal difference conveyed by a high-frequency target are elevated when the target is accompanied by a spectrally remote low-frequency interferer. This study explored effects of temporal relations between targets and interferers on binaural interference. In the first experiment, duration by which the interferer preceded and/or trailed the target (onset and offset "fringes") was varied. Results indicated binaural interference decreased with total duration of the temporal fringe, but did not depend on whether that duration was composed of onset, offset, or onset + offset fringes. In the second experiment, binaural interference was measured as a function of the interstimulus interval (ISI) between the two presentations of the target. Results indicated that shorter ISIs increased thresholds in both the interferer and no-interferer conditions, but did not affect binaural interference. These results suggest that the mechanisms underlying the effects of manipulations of the interferer temporal fringe and manipulation of the ISI are essentially independent.
Affiliation(s)
- Corrie R Camalier
  - Department of Psychology, Vanderbilt University, Nashville, Tennessee 37240
- D Wesley Grantham
  - Vanderbilt Bill Wilkerson Center, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Leslie R Bernstein
  - Departments of Neuroscience and Surgery (Otolaryngology), University of Connecticut Health Center, Farmington, Connecticut 06030
8
Binaural release from masking in forward-masked intensity discrimination: evidence for effects of selective attention. Hear Res 2012; 294:1-9. [PMID: 23010335 DOI: 10.1016/j.heares.2012.09.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/08/2012] [Revised: 09/07/2012] [Accepted: 09/12/2012] [Indexed: 11/20/2022]
Abstract
In a forward-masked intensity discrimination task, we manipulated the perceived lateralization of the masker via variation of the interaural time difference (ITD). The maskers and targets were 500 Hz pure tones with a duration of 30 ms. Standards of 30 and 60 dB SPL were combined with 60 or 90 dB SPL maskers. As expected, the presentation of a forward masker perceived as lateralized to the other side of the head as the target resulted in a significantly smaller elevation of the intensity difference limen than a masker lateralized ipsilaterally. This binaural release from masking in forward-masked intensity discrimination cannot be explained by peripheral mechanisms because varying the ITD leaves the neural representation in the monaural channels (i.e., in the auditory nerve) unaltered. Instead, our results are compatible with the assumption that lateralization differences between masker and target promote object segregation and therefore facilitate object-based selective attention to the target.
9
Klein-Hennig M, Dietz M, Hohmann V, Ewert SD. The influence of different segments of the ongoing envelope on sensitivity to interaural time delays. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 129:3856-72. [PMID: 21682409 DOI: 10.1121/1.3585847] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 05/16/2023]
Abstract
The auditory system is sensitive to interaural timing disparities in the fine structure and the envelope of sounds, each contributing important cues for lateralization. In this study, psychophysical measurements were conducted with customized envelope waveforms in order to investigate the isolated effect of different segments of a periodic, ongoing envelope on lateralization. One envelope cycle was composed of the four segments attack flank, hold duration, decay flank, and pause duration, which were independently varied to customize the envelope waveform. The envelope waveforms were applied to a 4-kHz sinusoidal carrier, and just noticeable envelope interaural time differences were measured in six normal hearing subjects. The results indicate that attack durations and pause durations prior to the attack are the most important stimulus characteristics for processing envelope timing disparities. The results were compared to predictions of three binaural lateralization models based on the normalized cross correlation coefficient. Two of the models included an additional stage to mimic neural adaptation prior to binaural interaction, involving either a single short time constant (5 ms) or a combination of five time constants up to 500 ms. It was shown that the model with the single short time constant accounted best for the data.
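As a rough illustration of the decision statistic named in this abstract, the following Python sketch computes a normalized cross-correlation coefficient at zero lag for two periodic envelopes that differ only by an interaural delay. The raised-cosine envelope shape, the 100-Hz envelope rate, the sampling rate, and all function names are assumptions made for this example. The models compared in the study operate on the output of peripheral processing (and, in two cases, an adaptation stage), none of which is reproduced here.

```python
import math


def normalized_cross_correlation(left, right):
    """Zero-lag normalized cross-correlation coefficient of two equal-length signals."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if energy == 0.0:
        raise ValueError("both signals must have non-zero energy")
    return cross / energy


def envelope_pair(itd_s, fs=48000, rate_hz=100.0, dur_s=0.1):
    """Hypothetical raised-cosine envelopes; the right-ear envelope lags by itd_s seconds."""
    n = int(round(fs * dur_s))
    left = [0.5 * (1.0 + math.cos(2.0 * math.pi * rate_hz * (i / fs)))
            for i in range(n)]
    right = [0.5 * (1.0 + math.cos(2.0 * math.pi * rate_hz * (i / fs - itd_s)))
             for i in range(n)]
    return left, right


# Identical envelopes give a coefficient of 1; a 1-ms envelope ITD lowers it.
rho_zero = normalized_cross_correlation(*envelope_pair(0.0))
rho_delayed = normalized_cross_correlation(*envelope_pair(0.001))
print(rho_zero, rho_delayed)
```

In this simplified setting, a just noticeable envelope ITD would correspond to the smallest delay that moves the coefficient away from 1 by some criterion amount. The criterion itself is not given in the abstract and is left out of the sketch.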
10
Buss E, Hall JW III. Effects of non-simultaneous masking on the binaural masking level difference. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 129:907-919. [PMID: 21361448 PMCID: PMC3070997 DOI: 10.1121/1.3514528] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/16/2010] [Revised: 10/08/2010] [Accepted: 10/17/2010] [Indexed: 05/30/2023]
Abstract
The present study sought to clarify the role of non-simultaneous masking in the binaural masking level difference for maskers that fluctuate in level. In the first experiment the signal was a brief 500-Hz tone, and the masker was a bandpass noise (100-2000 Hz), with the initial and final 200-ms bursts presented at 40-dB spectrum level and the inter-burst gap presented at 20-dB spectrum level. Temporal windows were fitted to thresholds measured for a range of gap durations and signal positions within the gap. In the second experiment, individual differences in out of phase (NoSπ) thresholds were compared for a brief signal in a gapped bandpass masker, a brief signal in a steady bandpass masker, and a long signal in a narrowband (50-Hz-wide) noise masker. The third experiment measured brief tone detection thresholds in forward, simultaneous, and backward masking conditions for a 50- and for a 1900-Hz-wide noise masker centered on the 500-Hz signal frequency. Results are consistent with comparable temporal resolution in the in phase (NoSo) and NoSπ conditions and no effect of temporal resolution on individual observers' ability to utilize binaural cues in narrowband noise. The large masking release observed for a narrowband noise masker may be due to binaural masking release from non-simultaneous, informational masking.
Affiliation(s)
- Emily Buss
  - Department of Otolaryngology/Head and Neck Surgery, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, USA
11
Goupell MJ. Interaural fluctuations and the detection of interaural incoherence. IV. The effect of compression on stimulus statistics. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:3691-702. [PMID: 21218901 PMCID: PMC3037772 DOI: 10.1121/1.3505104] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/04/2009] [Revised: 09/22/2010] [Accepted: 09/29/2010] [Indexed: 05/25/2023]
Abstract
The purpose of this experiment was to determine whether the normalized interaural cross-correlation (CC) model or a model based on interaural phase and level differences can better describe incoherence detection data. The ability to detect interaural incoherence in three sets of reproducible dichotic noises was tested in six listeners. The first set contained noises with a constrained value of the CC and the CC including signal compression. The second set contained noises with a constrained value of the CC including signal compression. The third set contained noises with constrained values in the fluctuations in the interaural differences. Modeling showed that neither the CC model nor the model using the interaural differences could account for the data in any set. Examination of the statistical properties of the stimuli showed that including compression before the calculation of the interaural CC causes a substantial correlation of this metric to the fluctuations in the interaural phase difference. This finding implies that it may be more difficult to discriminate between the common types of binaural models than previously thought.
Affiliation(s)
- Matthew J Goupell
  - Binaural Hearing and Speech Laboratory, Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
12
Zilany MSA, Bruce IC, Nelson PC, Carney LH. A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:2390-412. [PMID: 19894822 PMCID: PMC2787068 DOI: 10.1121/1.3238250] [Citation(s) in RCA: 194] [Impact Index Per Article: 12.9] [Indexed: 05/09/2023]
Abstract
There is growing evidence that the dynamics of biological systems that appear to be exponential over short time courses are in some cases better described over the long-term by power-law dynamics. A model of rate adaptation at the synapse between inner hair cells and auditory-nerve (AN) fibers that includes both exponential and power-law dynamics is presented here. Exponentially adapting components with rapid and short-term time constants, which are mainly responsible for shaping onset responses, are followed by two parallel paths with power-law adaptation that provide slowly and rapidly adapting responses. The slowly adapting power-law component significantly improves predictions of the recovery of the AN response after stimulus offset. The faster power-law adaptation is necessary to account for the "additivity" of rate in response to stimuli with amplitude increments. The proposed model is capable of accurately predicting several sets of AN data, including amplitude-modulation transfer functions, long-term adaptation, forward masking, and adaptation to increments and decrements in the amplitude of an ongoing stimulus.
Affiliation(s)
- Muhammad S A Zilany
  - Department of Biomedical Engineering, University of Rochester, NY 14642, USA
13
Davidson SA, Gilkey RH, Colburn HS, Carney LH. An evaluation of models for diotic and dichotic detection in reproducible noises. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:1906-25. [PMID: 19813804 PMCID: PMC2771055 DOI: 10.1121/1.3206583] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 08/22/2008] [Revised: 05/29/2009] [Accepted: 07/27/2009] [Indexed: 05/24/2023]
Abstract
Several psychophysical models for masked detection were evaluated using reproducible noises. The data were hit and false-alarm rates from three psychophysical studies of detection of 500-Hz tones in reproducible noise under diotic (N0S0) and dichotic (N0Sπ) conditions with four stimulus bandwidths (50, 100, 115, and 2900 Hz). Diotic data were best predicted by an energy-based multiple-detector model that linearly combined stimulus energies at the outputs of several critical-band filters. The tone-plus-noise trials in the dichotic data were best predicted by models that linearly combined either the average values or the standard deviations of interaural time and level differences; however, these models offered no predictions for noise-alone responses. The decision variables of more complicated temporal models, including the models of Dau et al. [(1996a). J. Acoust. Soc. Am. 99, 3615-3622] and Breebaart et al. [(2001a). J. Acoust. Soc. Am. 110, 1074-1088], were weakly correlated with subjects' responses. Comparisons of the dependencies of each model on envelope and fine-structure cues to those in the data suggested that dependence upon both envelope and fine structure, as well as an interaction between them, is required to predict the detection results.
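To make the two stimulus configurations concrete, here is a minimal Python sketch that builds one diotic and one dichotic tone-plus-noise trial and compares their interaural correlation. It assumes broadband Gaussian noise, a fixed seed standing in for a "reproducible" noise token, and hypothetical parameter values and function names (the ASCII label `N0Spi` denotes the dichotic condition with the tone inverted in one ear). The studies cited in the abstract used band-limited noises at specific levels, which this sketch does not reproduce.

```python
import math
import random


def make_trial(condition, fs=16000, dur_s=0.1, tone_hz=500.0, tone_amp=0.5, seed=0):
    """Return (left, right) waveforms for one tone-plus-noise trial.

    The same noise token goes to both ears (N0). The tone is either identical
    at both ears ("N0S0") or inverted in the right ear ("N0Spi").
    """
    signs = {"N0S0": 1.0, "N0Spi": -1.0}
    if condition not in signs:
        raise ValueError("condition must be 'N0S0' or 'N0Spi'")
    rng = random.Random(seed)  # fixed seed: the same noise token on every call
    n = int(round(fs * dur_s))
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    tone = [tone_amp * math.sin(2.0 * math.pi * tone_hz * i / fs) for i in range(n)]
    left = [nz + s for nz, s in zip(noise, tone)]
    right = [nz + signs[condition] * s for nz, s in zip(noise, tone)]
    return left, right


def interaural_correlation(left, right):
    """Zero-lag normalized correlation between the two ear signals."""
    cross = sum(l * r for l, r in zip(left, right))
    return cross / math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))


rho_diotic = interaural_correlation(*make_trial("N0S0"))
rho_dichotic = interaural_correlation(*make_trial("N0Spi"))
print(rho_diotic, rho_dichotic)
```

The diotic trial leaves the two ear signals identical, so only monaural cues such as energy can signal the tone, whereas the dichotic trial introduces interaural differences that lower the correlation. This contrast is the reason the abstract reports different best-fitting model families for the two conditions.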
Affiliation(s)
- Sean A Davidson
  - Department of Biomedical and Chemical Engineering, Institute for Sensory Research, Syracuse University, 621 Skytop Road, Syracuse, New York 13244, USA
14
Beutelmann R, Brand T, Kollmeier B. Prediction of binaural speech intelligibility with frequency-dependent interaural phase differences. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:1359-1368. [PMID: 19739750 DOI: 10.1121/1.3177266] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 05/28/2023]
Abstract
The aim of this study was to test the hypothesis of independent processing strategies in adjacent binaural frequency bands underlying current models for binaural speech intelligibility in complex configurations and to investigate the effective binaural auditory bandwidth in broad-band signals. Speech reception thresholds (SRTs) were measured for binaural conditions with frequency-dependent interaural phase differences (IPDs) of speech and noise. SRT predictions with the binaural speech intelligibility model by Beutelmann and Brand (2006, J. Acoust. Soc. Am. 120, 331-342) were compared with the observed data. The IPDs of speech and noise had a sinusoidal shape on a logarithmic frequency scale. The bandwidth between zeros of the IPD function was varied from 1/8 to 4 octaves. Speech and noise had either the same IPD function (reference condition) or opposite signs of the IPD function (binaural condition). Each condition had two subconditions with alternating and non-alternating signs, respectively, of the IPD function. The binaural unmasking with respect to the reference condition decreased from 6 dB to zero with decreasing IPD bandwidth for the alternating condition while it stayed significantly larger than zero for the non-alternating condition. The observed results were well predicted by the model with an analysis filter bandwidth of 2.3 equivalent rectangular bandwidths (ERBs).
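To give a feel for what an analysis bandwidth of 2.3 ERBs means in hertz, the short Python sketch below evaluates the widely used Glasberg and Moore (1990) approximation, ERB(f) = 24.7 (4.37 f / 1000 + 1) Hz. Whether the study used exactly this ERB definition is an assumption, as the abstract does not say; the function names and the chosen center frequencies are likewise illustrative only.

```python
def erb_hz(center_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990 approximation)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def analysis_bandwidth_hz(center_hz, n_erb=2.3):
    """Bandwidth of a hypothetical analysis filter that is n_erb ERBs wide."""
    return n_erb * erb_hz(center_hz)


# Compare the monaural ERB with the wider 2.3-ERB bandwidth at a few frequencies.
for fc in (250.0, 500.0, 1000.0, 2000.0):
    print(f"{fc:6.0f} Hz: 1 ERB = {erb_hz(fc):6.1f} Hz, "
          f"2.3 ERB = {analysis_bandwidth_hz(fc):6.1f} Hz")
```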
Affiliation(s)
- Rainer Beutelmann
  - Medizinische Physik, Carl-von-Ossietzky-Universität Oldenburg, Oldenburg, Germany
15
Dietz M, Ewert SD, Hohmann V. Lateralization of stimuli with independent fine-structure and envelope-based temporal disparities. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 125:1622-1635. [PMID: 19275320 DOI: 10.1121/1.3076045] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 05/27/2023]
Abstract
Psychoacoustic experiments were conducted to investigate the role and interaction of fine-structure and envelope-based interaural temporal disparities. A computational model for the lateralization of binaural stimuli, motivated by recent physiological findings, is suggested and evaluated against the psychoacoustic data. The model is based on the independent extraction of the interaural phase difference (IPD) from the stimulus fine-structure and envelope. Sinusoidally amplitude-modulated 1-kHz tones were used in the experiments. The lateralization from either carrier (fine-structure) or modulator (envelope) IPD was matched with an interaural level difference, revealing a nearly linear dependence for both IPD types up to 135°, independent of the modulation frequency. However, if a carrier IPD was traded with an opposed modulator IPD to produce a centered sound image, a carrier IPD of 45° required the largest opposed modulator IPD. The data could be modeled assuming a population of binaural neurons with a physiological distribution of the best IPDs clustered around 45°-50°. The model was also used to predict the perceived lateralization of previously published data. Subject-dependent differences in the perceptual salience of fine-structure and envelope cues, also reported previously, could be modeled by individual weighting coefficients for the two cues.
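The stimulus class described in this abstract can be sketched in a few lines of Python: a sinusoidally amplitude-modulated tone pair in which the carrier IPD and the modulator IPD are set independently. Only the 1-kHz carrier comes from the abstract. The 50-Hz modulation rate, full modulation depth, sampling rate, duration, sign convention (phase added to the right ear), and function name are all assumptions for illustration.

```python
import math


def sam_tone_pair(carrier_ipd_deg, modulator_ipd_deg, fc_hz=1000.0, fm_hz=50.0,
                  depth=1.0, fs=48000, dur_s=0.05):
    """Return (left, right) SAM tones with independent carrier and modulator IPDs."""
    phi_c = math.radians(carrier_ipd_deg)    # fine-structure (carrier) IPD
    phi_m = math.radians(modulator_ipd_deg)  # envelope (modulator) IPD
    left, right = [], []
    for i in range(int(round(fs * dur_s))):
        t = i / fs
        left.append((1.0 + depth * math.cos(2.0 * math.pi * fm_hz * t))
                    * math.sin(2.0 * math.pi * fc_hz * t))
        right.append((1.0 + depth * math.cos(2.0 * math.pi * fm_hz * t + phi_m))
                     * math.sin(2.0 * math.pi * fc_hz * t + phi_c))
    return left, right


# A trading stimulus: 45 degrees of carrier IPD opposed by a modulator IPD of opposite sign.
left, right = sam_tone_pair(carrier_ipd_deg=45.0, modulator_ipd_deg=-90.0)
print(len(left), len(right))
```

Because the two phase offsets enter different factors of the product, either cue can be varied while the other is held at zero, which is what allows the carrier and envelope contributions to be measured separately and then traded against each other.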
Affiliation(s)
- Mathias Dietz
  - Medizinische Physik, Universität Oldenburg, Oldenburg, Germany
16
Jepsen ML, Ewert SD, Dau T. A computational model of human auditory signal processing and perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:422-438. [PMID: 18646987 DOI: 10.1121/1.2924135] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Indexed: 05/26/2023]
Abstract
A model of computational auditory signal-processing and perception that accounts for various aspects of simultaneous and nonsimultaneous masking in human listeners is presented. The model is based on the modulation filterbank model described by Dau et al. [J. Acoust. Soc. Am. 102, 2892 (1997)] but includes major changes at the peripheral and more central stages of processing. The model contains outer- and middle-ear transformations, a nonlinear basilar-membrane processing stage, a hair-cell transduction stage, a squaring expansion, an adaptation stage, a 150-Hz lowpass modulation filter, a bandpass modulation filterbank, a constant-variance internal noise, and an optimal detector stage. The model was evaluated in experimental conditions that reflect, to a different degree, effects of compression as well as spectral and temporal resolution in auditory processing. The experiments include intensity discrimination with pure tones and broadband noise, tone-in-noise detection, spectral masking with narrow-band signals and maskers, forward masking with tone signals and tone or noise maskers, and amplitude-modulation detection with narrow- and wideband noise carriers. The model can account for most of the key properties of the data and is more powerful than the original model. The model might be useful as a front end in technical applications.
Affiliation(s)
- Morten L Jepsen
- Centre for Applied Hearing Research, Acoustic Technology, Department of Electrical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
17
Thompson ER, Dau T. Binaural processing of modulated interaural level differences. The Journal of the Acoustical Society of America 2008; 123:1017-1029. [PMID: 18247904 DOI: 10.1121/1.2821800] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 05/25/2023]
Abstract
Two experiments are presented that measure the acuity of binaural processing of modulated interaural level differences (ILDs) using psychoacoustic methods. In both experiments, dynamic ILDs were created by imposing an interaurally antiphasic sinusoidal amplitude modulation (AM) signal on high-frequency carriers, which were presented over headphones. In the first experiment, the sensitivity to dynamic ILDs was measured as a function of the modulation frequency using pure-tone carriers and interaurally correlated and uncorrelated narrow-band noise carriers. The intrinsic interaural level fluctuations of the uncorrelated noise carriers raised the ILD modulation detection thresholds with respect to the pure-tone carriers. The diotic fluctuations of the correlated noise carriers also caused a small increase in the thresholds over the pure-tone carriers, particularly with low ILD modulation frequencies. The second experiment investigated the modulation frequency selectivity in dynamic ILD processing by imposing an interaurally uncorrelated bandpass noise AM masker in series with the interaurally antiphasic AM signal on a pure-tone carrier. By varying the masker center frequencies relative to the signal modulation frequency, broadly tuned, bandpass-shaped patterns were obtained. Simulations with an existing binaural model show that a low-pass filter to limit the binaural temporal resolution is not sufficient to predict the results of the experiments.
Affiliation(s)
- Eric R Thompson
- Centre for Applied Hearing Research, Acoustic Technology, Ørsted.DTU, Technical University of Denmark, Building 352, Ørsteds Plads, 2800 Kgs. Lyngby, Denmark.
18
Culling JF. Evidence specifically favoring the equalization-cancellation theory of binaural unmasking. The Journal of the Acoustical Society of America 2007; 122:2803-2813. [PMID: 18189570 DOI: 10.1121/1.2785035] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 05/25/2023]
Abstract
Three experiments investigated the roles of interaural correlation (rho) and of the monaural power spectrum in the detection and discrimination of narrow-band-noise signals (462-539 Hz) in broadband maskers (0-3 kHz). The power and rho of the target band were independently controlled, while the flanking noise was fixed and diotic. Experiments 1 and 2 involved rho and power values that would be produced by specific values of signal-to-noise ratio (SNR) in the NoSpi binaural configuration. Listeners were required to discriminate different SNRs via a 2I-FC loudness-discrimination task. At low reference SNRs, changes in rho fully accounted for listeners' performance, but as reference SNR increased, additional energy in the target band played an increasing role. Experiment 2 showed that at these higher SNRs the combination of information from the power spectrum and rho was superadditive and could not be explained by simple signal-detection models. The equalization-cancellation (EC) theory would explain these data using the output from interaural cancellation, Y, rather than rho. Experiment 3 attempted to foil binaural processing, by fixing either rho or Y across intervals. Consistent with EC theory, when Y was fixed, the contribution of the binaural system appeared negligible, while fixing rho did not have this effect.
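As a rough illustration of the distinction drawn in this abstract, the sketch below computes the interaural correlation and the power remaining after subtracting one ear's signal from the other for an idealized NoSpi stimulus. It assumes a diotic noise of power N and an uncorrelated signal of power S added in antiphase, with no internal noise and no equalization errors. The function name `n0spi_statistics` and the variable `residue` are hypothetical; `residue` is only a simplified stand-in for the cancellation output Y and is not taken from the paper.

```python
def n0spi_statistics(snr_db, noise_power=1.0):
    """Return (rho, residue) for an idealized NoSpi stimulus.

    Assumes left = n + s and right = n - s, with n and s uncorrelated.
    """
    signal_power = noise_power * 10 ** (snr_db / 10)
    # E[left * right] = N - S, and each ear has power N + S.
    rho = (noise_power - signal_power) / (noise_power + signal_power)
    # left - right = 2s, so the power remaining after subtraction is 4S.
    residue = 4 * signal_power
    return rho, residue


for snr_db in (-30, -20, -10, 0):
    rho, residue = n0spi_statistics(snr_db)
    print(f"SNR {snr_db:>4} dB: rho = {rho:.4f}, residue = {residue:.4f}")
```

Under these assumptions the two quantities are both functions of SNR alone, which is why the experiments described above had to control them independently in order to tell them apart.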
Affiliation(s)
- John F Culling
- School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff, CF10 3AT, United Kingdom.
19
Dietz M, Ewert SD, Hohmann V, Kollmeier B. Coding of temporally fluctuating interaural timing disparities in a binaural processing model based on phase differences. Brain Res 2007; 1220:234-45. [PMID: 17949695 DOI: 10.1016/j.brainres.2007.09.026] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 05/21/2007] [Revised: 09/07/2007] [Accepted: 09/13/2007] [Indexed: 11/17/2022]
Abstract
A model of the effective processing of interaural timing disparities in the human auditory system is presented which provides modifications and extensions to existing models motivated by recent physiological findings. In particular, an established model of excitatory-inhibitory (EI) neuronal connectivity is complemented by a model that is based on a rate code derived from the interaural phase difference (IPD). The IPD model is shown to successfully simulate literature data on fine structure and envelope-based binaural detection and lateralization experiments. In order to investigate the processing of temporal fluctuations of interaural timing disparities, detection thresholds of broadband binaural-beat stimuli were measured in six normal-hearing listeners and were compared with model simulations. In a first experiment, the highest detectable beat frequency was found to be 96 Hz for a noise bandwidth of 550 Hz and 219 Hz for a bandwidth of 1100 Hz. Both models predicted lower thresholds, but performed increasingly better when the integration time constants of the binaural processors were reduced. In a second experiment, the signal-to-noise ratio at the detection threshold of binaural-beat stimuli mixed with interaurally uncorrelated noise was measured as a function of the beat frequency. The threshold increased about 1.7 dB per octave which was simulated similarly by both models. The results indicate that the primary temporal resolution of the binaural system for detecting interaural timing disparities is much higher than the temporal resolution found in higher auditory processes as supposedly involved in, e.g., masking.
Affiliation(s)
- Mathias Dietz
- Medizinische Physik, Universität Oldenburg, 26111 Oldenburg, Germany.
20
Goupell MJ, Hartmann WM. Interaural fluctuations and the detection of interaural incoherence. III. Narrowband experiments and binaural models. The Journal of the Acoustical Society of America 2007; 122:1029-45. [PMID: 17672651 DOI: 10.1121/1.2734489] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 05/16/2023]
Abstract
In the first two articles of this series, reproducible noises with a fixed value of interaural coherence (0.992) were used to study the human ability to detect interaural incoherence. It was found that incoherence detection is strongly correlated with fluctuations in interaural differences, especially for narrow noise bandwidths, but it remained unclear what function of the fluctuations best agrees with detection data. In the present article, ten different binaural models were tested against detection data for 14- and 108-Hz bandwidths. These models included different types of binaural processing: independent-interaural-phase-difference/interaural-level-difference, lateral-position, and short-term cross-correlation. Several preprocessing transformations of the interaural differences were incorporated: compression of binaural cues, temporal averaging, and envelope weighting. For the 14-Hz bandwidth data, the most successful model postulated that incoherence is detected via fluctuations of interaural phase and interaural level processed by independent centers. That model correlated with detectability at r=0.87. That model proved to be more successful than short-term cross-correlation models incorporating standard physiologically-based model features (r=0.78). For the 108-Hz bandwidth data, detection performance varied much less among different waveforms, and the data were less able to distinguish between models.
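For readers unfamiliar with the quantity held fixed in these experiments, interaural coherence is conventionally taken as the peak of the normalized cross-correlation between the two ear signals over a small range of lags. The sketch below is a minimal illustration of that definition, not the authors' analysis code. The helper name `interaural_coherence`, the 16-sample lag range, and the use of broadband Gaussian samples (rather than the narrowband reproducible noises of the study) are all assumptions made for brevity.

```python
import math
import random


def interaural_coherence(left, right, max_lag):
    """Peak of the normalized cross-correlation over lags in [-max_lag, max_lag]."""
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    n = len(left)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, acc / norm)
    return best


# Mix a common noise with an independent one to approximate a target coherence.
random.seed(0)
n = 2000
target = 0.992
common = [random.gauss(0, 1) for _ in range(n)]
independent = [random.gauss(0, 1) for _ in range(n)]
left = common
right = [target * c + math.sqrt(1 - target ** 2) * e
         for c, e in zip(common, independent)]

print(round(interaural_coherence(left, right, max_lag=16), 3))
```

Because the mixture is built from a finite number of random samples, the measured value only approximates the target. Individual waveforms with the same nominal coherence can still differ in how their interaural phase and level fluctuate over time, which is the property the models in this article were tested on.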
Affiliation(s)
- Matthew J Goupell
- Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan 48824, USA.
21
Bernstein LR, Trahiotis C, Freyman RL. Binaural detection of 500-Hz tones in broadband and in narrowband masking noise: effects of signal/masker duration and forward masking fringes. The Journal of the Acoustical Society of America 2006; 119:2981-93. [PMID: 16708954 DOI: 10.1121/1.2188373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/09/2023]
Abstract
NoSpi thresholds for a 500-Hz tonal signal were measured with broadband and with narrowband maskers using a single-interval adaptive matrix procedure [C. Kaernbach, J. Acoust. Soc. Am. 88, 2645-2655 (1990)]. The purpose of the study was to investigate and to account for the effects on thresholds of varying the durations of the signals and maskers and the durations of forward masking fringes that preceded the occurrence of signal-plus-noise. For detection in both broadband and narrowband noise, the addition of brief forward fringes of masking noise resulted in elevations in threshold for the shortest signal durations. Longer forward fringes led to larger decreases in threshold when the masker was broadband as compared to when the masker was narrowband. The complex patterning of the data was explained by the operation of: (1) "predetection" temporal integration associated with peripheral auditory filtering; (2) duration-dependent, across-frequency influences that differentially affect broadband and narrowband NoSpi thresholds; (3) "post-detection" temporal integration associated with the central binaural mechanism; and (4) consideration of the detection thresholds in terms of changes in interaural correlation rather than in terms of signal level or signal-to-noise ratio, per se.
Affiliation(s)
- Leslie R Bernstein
- Department of Neuroscience, University of Connecticut Health Center, Farmington, Connecticut 06030, USA
22
Hawley ML, Litovsky RY, Culling JF. The benefit of binaural hearing in a cocktail party: effect of location and type of interferer. The Journal of the Acoustical Society of America 2004; 115:833-43. [PMID: 15000195 DOI: 10.1121/1.1639908] [Citation(s) in RCA: 292] [Impact Index Per Article: 14.6] [Indexed: 05/22/2023]
Abstract
The "cocktail party problem" was studied using virtual stimuli whose spatial locations were generated using anechoic head-related impulse responses from the AUDIS database [Blauert et al., J. Acoust. Soc. Am. 103, 3082 (1998)]. Speech reception thresholds (SRTs) were measured for Harvard IEEE sentences presented from the front in the presence of one, two, or three interfering sources. Four types of interferer were used: (1) other sentences spoken by the same talker, (2) time-reversed sentences of the same talker, (3) speech-spectrum shaped noise, and (4) speech-spectrum shaped noise, modulated by the temporal envelope of the sentences. Each interferer was matched to the spectrum of the target talker. Interferers were placed in several spatial configurations, either coincident with or separated from the target. Binaural advantage was derived by subtracting SRTs from listening with the "better monaural ear" from those for binaural listening. For a single interferer, there was a binaural advantage of 2-4 dB for all interferer types. For two or three interferers, the advantage was 2-4 dB for noise and speech-modulated noise, and 6-7 dB for speech and time-reversed speech. These data suggest that the benefit of binaural hearing for speech intelligibility is especially pronounced when there are multiple voiced interferers at different locations from the target, regardless of spatial configuration; measurements with fewer or with other types of interferers can underestimate this benefit.
Affiliation(s)
- Monica L Hawley
- Hearing Research Center and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
23
Fitzpatrick DC, Kuwada S, Batra R. Transformations in processing interaural time differences between the superior olivary complex and inferior colliculus: beyond the Jeffress model. Hear Res 2002; 168:79-89. [PMID: 12117511 DOI: 10.1016/s0378-5955(02)00359-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.8] [Indexed: 10/27/2022]
Abstract
Interaural time differences (ITDs) are used to localize sounds and improve signal detection in noise. Encoding ITDs in neurons depends on specialized mechanisms for comparing inputs from the two ears. Most studies have emphasized how the responses of ITD-sensitive neurons are consistent with the tenets of the Jeffress model. The Jeffress model uses neuronal coincidence detectors that compare inputs from both sides and delay lines so that different neurons achieve coincidence at different ITDs. Although Jeffress-type models are successful at predicting sensitivity to ITDs in humans, in many respects they are a limited representation of the responses seen in neurons. In the superior olivary complex (SOC), ITD-sensitive neurons are distributed across both the medial (MSO) and lateral (LSO) superior olives. Similar response types are found in neurons sensitive to ITDs in two signal types: low-frequency sounds and envelopes of high-frequency sounds. Excitatory-excitatory interactions in the MSO are associated with peak-type responses, and excitatory-inhibitory interactions in the LSO are associated with trough-type responses. There are also neurons with responses intermediate between peak- and trough-type. In the inferior colliculus (IC), the same basic types remain, presumably due to inputs arising from the MSO and LSO. Using recordings from the SOC and IC, we describe how the response types can be described within a continuum that extends to very large values of ITD, and compare the functional organization at the two levels.
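The Jeffress arrangement summarized in this abstract (coincidence detectors fed through delay lines) can be sketched as a bank of detectors, each multiplying the right-ear input by a differently delayed copy of the left-ear input and summing over time. The code below is a textbook idealization under assumed parameters (a 500-Hz tone, 48-kHz sampling, a 12-sample ITD). It is not a model of the peak-type, trough-type, and intermediate responses the article reports, and the function name `coincidence_outputs` is hypothetical.

```python
import math


def coincidence_outputs(left, right, max_delay):
    """Summed product of the left input, delayed by k samples, with the right input."""
    n = len(left)
    outputs = {}
    for k in range(-max_delay, max_delay + 1):
        # Restrict the sum to samples where the delayed index stays in range.
        outputs[k] = sum(left[i - k] * right[i]
                         for i in range(max_delay, n - max_delay))
    return outputs


fs = 48000          # sampling rate in Hz (assumed)
freq = 500.0        # tone frequency in Hz (assumed)
true_delay = 12     # right ear lags the left by 12 samples, i.e. 0.25 ms
n = 4800

left = [math.sin(2 * math.pi * freq * i / fs) for i in range(n)]
right = [math.sin(2 * math.pi * freq * (i - true_delay) / fs) for i in range(n)]

outputs = coincidence_outputs(left, right, max_delay=24)
best = max(outputs, key=outputs.get)
print(f"best internal delay: {best} samples ({best / fs * 1e3:.2f} ms)")
```

The detector whose internal delay matches the stimulus ITD responds most strongly. The delay range is limited to a quarter period on either side so that the peak is unambiguous for a pure tone.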
Affiliation(s)
- Douglas C Fitzpatrick
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina, Chapel Hill, NC 27599-7070, USA.
24
Breebaart J, van de Par S, Kohlrausch A. Binaural processing model based on contralateral inhibition. II. Dependence on spectral parameters. The Journal of the Acoustical Society of America 2001; 110:1089-1104. [PMID: 11519577 DOI: 10.1121/1.1383298] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 05/23/2023]
Abstract
This and two accompanying articles [Breebaart et al., J. Acoust. Soc. Am. 110, 1074-1088 (2001); 110, 1105-1117 (2001)] describe a computational model for the signal processing in the binaural auditory system. The model consists of several stages of monaural and binaural preprocessing combined with an optimal detector. In the present article the model is tested and validated by comparing its predictions with experimental data for binaural discrimination and masking conditions as a function of the spectral parameters of both masker and signal. For this purpose, the model is used as an artificial observer in a three-interval, forced-choice adaptive procedure. All model parameters were kept constant for all simulations described in this and the subsequent article. The effects of the following experimental parameters were investigated: center frequency of both masker and target, bandwidth of masker and target, the interaural phase relations of masker and target, and the level of the masker. Several phenomena that occur in binaural listening conditions can be accounted for. These include the wider effective binaural critical bandwidth observed in band-widening NoS(pi) conditions, the different masker-level dependence of binaural detection thresholds for narrow- and for wide-band maskers, the unification of IID and ITD sensitivity with binaural detection data, and the dependence of binaural thresholds on frequency.
Affiliation(s)
- J Breebaart
- IPO, Center for User-System Interaction, Eindhoven, The Netherlands.
25
Breebaart J, van de Par S, Kohlrausch A. Binaural processing model based on contralateral inhibition. I. Model structure. The Journal of the Acoustical Society of America 2001; 110:1074-1088. [PMID: 11519576 DOI: 10.1121/1.1383297] [Citation(s) in RCA: 114] [Impact Index Per Article: 5.0] [Indexed: 05/23/2023]
Abstract
This article presents a quantitative binaural signal detection model which extends the monaural model described by Dau et al. [J. Acoust. Soc. Am. 99, 3615-3622 (1996)]. The model is divided into three stages. The first stage comprises peripheral preprocessing in the right and left monaural channels. The second stage is a binaural processor which produces a time-dependent internal representation of the binaurally presented stimuli. This stage is based on the Jeffress delay line extended with tapped attenuator lines. Through this extension, the internal representation codes both interaural time and intensity differences. In contrast to most present-day models, which are based on excitatory-excitatory interaction, the binaural interaction in the present model is based on contralateral inhibition of ipsilateral signals. The last stage, a central processor, extracts a decision variable that can be used to detect the presence of a signal in a detection task, but could also derive information about the position and the compactness of a sound source. In two accompanying articles, the model predictions are compared with data obtained with human observers in a great variety of experimental conditions.
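The idea of a delay line combined with attenuator taps and a subtractive (inhibitory) interaction can be illustrated with a toy grid of elements, each of which delays and scales the two inputs and then takes the squared difference. The sketch below is loosely inspired by the description in this abstract and is not the published model, which also includes peripheral preprocessing and a central processor. The function name `ei_residual`, the sign conventions, the 1-dB and 1-sample grid steps, and the stimulus values are all assumptions.

```python
import random


def ei_residual(left, right, tau, alpha_db, margin):
    """Mean squared difference after delaying and scaling the two inputs."""
    gain = 10 ** (alpha_db / 40)   # half of the level offset applied to each side
    n = len(left)
    total = 0.0
    for t in range(margin, n - margin):
        diff = gain * left[t + tau] - right[t] / gain
        total += diff * diff
    return total / (n - 2 * margin)


# Toy stimulus: the right ear lags by 3 samples and is 4 dB more intense.
random.seed(1)
n, true_delay, true_ild_db = 1000, 3, 4
source = [random.gauss(0, 1) for _ in range(n)]
left = source
scale = 10 ** (true_ild_db / 20)
right = [scale * source[i - true_delay] if i >= true_delay else 0.0
         for i in range(n)]

max_tau = 8
grid = [(tau, alpha)
        for tau in range(-max_tau, max_tau + 1)
        for alpha in range(-10, 11)]
best = min(grid, key=lambda p: ei_residual(left, right, p[0], p[1], max_tau))
print(best)  # (tau in samples, alpha in dB) of the least active element
```

In this toy version the element whose delay and attenuation cancel the stimulus most completely has the lowest output, so the location of the minimum in the grid encodes both the interaural time difference and the interaural level difference.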
Affiliation(s)
- J Breebaart
- IPO, Center for User-System Interaction, Eindhoven, The Netherlands.