1
Shi K, Quass GL, Rogalla MM, Ford AN, Czarny JE, Apostolides PF. Population coding of time-varying sounds in the nonlemniscal inferior colliculus. J Neurophysiol 2024; 131:842-864. [PMID: 38505907 DOI: 10.1152/jn.00013.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 02/29/2024] [Accepted: 03/15/2024] [Indexed: 03/21/2024]
Abstract
The inferior colliculus (IC) of the midbrain is important for complex sound processing, such as discriminating conspecific vocalizations and human speech. The IC's nonlemniscal, dorsal "shell" region is likely important for this process, as neurons in these layers project to higher-order thalamic nuclei that subsequently funnel acoustic signals to the amygdala and nonprimary auditory cortices, forebrain circuits important for vocalization coding in a variety of mammals, including humans. However, the extent to which shell IC neurons transmit acoustic features necessary to discern vocalizations is less clear, owing to the technical difficulty of recording from neurons in the IC's superficial layers via traditional approaches. Here, we use two-photon Ca2+ imaging in mice of either sex to test how shell IC neuron populations encode the rate and depth of amplitude modulation, important sound cues for speech perception. Most shell IC neurons were broadly tuned, with a low neurometric discrimination of amplitude modulation rate; only a subset was highly selective to specific modulation rates. Nevertheless, neural network classifiers trained on fluorescence data from shell IC neuron populations accurately classified amplitude modulation rate, and decoding accuracy was only marginally reduced when highly tuned neurons were omitted from training data. Rather, classifier accuracy increased monotonically with the modulation depth of the training data, such that classifiers trained on full-depth modulated sounds had median decoding errors of ~0.2 octaves. Thus, shell IC neurons may transmit time-varying signals via a population code, with perhaps limited reliance on the discriminative capacity of any individual neuron.
NEW & NOTEWORTHY: The IC's shell layers originate a "nonlemniscal" pathway important for perceiving vocalization sounds. However, prior studies suggest that individual shell IC neurons are broadly tuned and have high response thresholds, implying a limited reliability of efferent signals. Using Ca2+ imaging, we show that amplitude modulation is accurately represented in the population activity of shell IC neurons. Thus, downstream targets can read out sounds' temporal envelopes from distributed rate codes transmitted by populations of broadly tuned neurons.
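For readers unfamiliar with decoding errors expressed in octaves: such an error is the absolute base-2 logarithm of the ratio between the decoded and the true modulation rate. The sketch below only illustrates that unit; the function name and the example rates are hypothetical and are not taken from the study, whose classifier and data are not reproduced here.

```python
import math
from statistics import median


def octave_error(decoded_hz: float, true_hz: float) -> float:
    """Absolute decoding error in octaves between two modulation rates."""
    return abs(math.log2(decoded_hz / true_hz))


# Hypothetical (decoded, true) AM rates in Hz, for illustration only.
pairs = [(16.0, 16.0), (32.0, 16.0), (8.0, 16.0), (18.4, 16.0)]
errors = [octave_error(decoded, true) for decoded, true in pairs]
median_error = median(errors)  # summary statistic of the kind quoted in the abstract
```

On this scale, an error of 0.2 octaves corresponds to a rate ratio of 2**0.2, roughly 1.15, so the decoded rate is off by about 15% from the true rate.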
Affiliation(s)
- Kaiwen Shi
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
- Gunnar L Quass
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
- Meike M Rogalla
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
- Alexander N Ford
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
- Jordyn E Czarny
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
- Pierre F Apostolides
  - Department of Otolaryngology-Head & Neck Surgery, Kresge Hearing Research Institute, University of Michigan Medical School, Ann Arbor, Michigan, United States
  - Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, United States
2
Shi K, Quass GL, Rogalla MM, Ford AN, Czarny JE, Apostolides PF. Population coding of time-varying sounds in the non-lemniscal Inferior Colliculus. bioRxiv 2023:2023.08.14.553263. [PMID: 37645904 PMCID: PMC10461978 DOI: 10.1101/2023.08.14.553263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
The inferior colliculus (IC) of the midbrain is important for complex sound processing, such as discriminating conspecific vocalizations and human speech. The IC's non-lemniscal, dorsal "shell" region is likely important for this process, as neurons in these layers project to higher-order thalamic nuclei that subsequently funnel acoustic signals to the amygdala and non-primary auditory cortices, forebrain circuits important for vocalization coding in a variety of mammals, including humans. However, the extent to which shell IC neurons transmit acoustic features necessary to discern vocalizations is less clear, owing to the technical difficulty of recording from neurons in the IC's superficial layers via traditional approaches. Here we use 2-photon Ca2+ imaging in mice of either sex to test how shell IC neuron populations encode the rate and depth of amplitude modulation, important sound cues for speech perception. Most shell IC neurons were broadly tuned, with a low neurometric discrimination of amplitude modulation rate; only a subset were highly selective to specific modulation rates. Nevertheless, neural network classifiers trained on fluorescence data from shell IC neuron populations accurately classified amplitude modulation rate, and decoding accuracy was only marginally reduced when highly tuned neurons were omitted from training data. Rather, classifier accuracy increased monotonically with the modulation depth of the training data, such that classifiers trained on full-depth modulated sounds had median decoding errors of ~0.2 octaves. Thus, shell IC neurons may transmit time-varying signals via a population code, with perhaps limited reliance on the discriminative capacity of any individual neuron.
Affiliation(s)
- Kaiwen Shi
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
- Gunnar L. Quass
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
- Meike M. Rogalla
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
- Alexander N. Ford
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
- Jordyn E. Czarny
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
- Pierre F. Apostolides
  - Kresge Hearing Research Institute, Department of Otolaryngology — Head & Neck Surgery, University of Michigan Medical School, Ann Arbor, MI, 48109
  - Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI, 48109
3
Matz AF, Nie Y, Wheeler HJ. Auditory stream segregation of amplitude-modulated narrowband noise in cochlear implant users and individuals with normal hearing. Front Psychol 2022; 13:927854. [PMID: 36118488 PMCID: PMC9479457 DOI: 10.3389/fpsyg.2022.927854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022]
Abstract
Voluntary stream segregation was investigated in cochlear implant (CI) users and normal-hearing (NH) listeners using a segregation-promoting objective approach that evaluated the role of spectral and amplitude-modulation (AM) rate separations on stream segregation and its build-up. Sequences of 9 or 3 pairs of A and B narrowband noise (NBN) bursts were presented, which differed in the center frequency of the noise band, the AM rate, or both. In some sequences (delayed sequences), the last B burst was delayed by 35 ms from its otherwise-steady temporal position. In the other sequences (no-delay sequences), the last B bursts were temporally advanced from 0 to 10 ms. A single-interval yes/no procedure was used to measure participants' sensitivity (d′) in identifying delayed vs. no-delay sequences. A higher d′ value indicated a greater ability to segregate the A and B subsequences. For NH listeners, performance improved with each spectral separation. However, for CI users, performance was only significantly better for the condition with the largest spectral separation. Additionally, performance was significantly poorer for the largest AM-rate separation than for the condition with no AM-rate separation for both groups. The significant effect of sequence duration in both groups indicated that listeners' performance improved as the stimulus sequences were lengthened, supporting the build-up effect. The results of this study suggest that CI users are less able than NH listeners to segregate NBN bursts into different auditory streams when they are moderately separated in the spectral domain. Contrary to our hypothesis, our results indicate that AM-rate separation may interfere with the segregation of streams of NBN. Additionally, our results add evidence to the literature that CI users build up stream segregation at a rate comparable to NH listeners when the inter-stream spectral separations are adequately large.
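The sensitivity index d′ from a single-interval yes/no task is conventionally computed as the difference between the z-transformed hit rate and false-alarm rate. The sketch below shows that textbook formula using Python's standard library; the function names, the log-linear correction, and the trial counts are illustrative assumptions, not details reported in the abstract.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Yes/no sensitivity: d' = z(H) - z(F), with z the inverse normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def corrected_rate(count: int, trials: int) -> float:
    """Log-linear correction; keeps rates away from 0 and 1, where z is undefined."""
    return (count + 0.5) / (trials + 1)


# Hypothetical counts: "delayed" responses on delayed trials (hits)
# and on no-delay trials (false alarms), 40 trials each.
hit_rate = corrected_rate(32, 40)
false_alarm_rate = corrected_rate(10, 40)
sensitivity = d_prime(hit_rate, false_alarm_rate)
```

A d′ of 0 means delayed and no-delay sequences were indistinguishable to the listener; larger values mean the delay was easier to detect, which the study interprets as better segregation of the A and B subsequences.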
Affiliation(s)
- Alexandria F. Matz
  - Department of Otolaryngology, Eastern Virginia Medical School, Norfolk, VA, United States
- Yingjiu Nie
  - Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, United States
  - *Correspondence: Yingjiu Nie,
- Harley J. Wheeler
  - Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis, MN, United States
4
Monaghan JJM, Carlyon RP, Deeks JM. Modulation Depth Discrimination by Cochlear Implant Users. J Assoc Res Otolaryngol 2022; 23:285-299. [PMID: 35080684 PMCID: PMC8964891 DOI: 10.1007/s10162-022-00834-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2021] [Accepted: 12/30/2021] [Indexed: 11/29/2022]
Abstract
Cochlear implants (CIs) convey the amplitude envelope of speech by modulating high-rate pulse trains. However, not all of the envelope may be necessary to perceive amplitude modulations (AMs); the effective envelope depth may be limited by forward and backward masking from the envelope peaks. Three experiments used modulated pulse trains to measure which portions of the envelope can be effectively processed by CI users as a function of AM frequency. Experiment 1 used a three-interval forced-choice task to test the ability of CI users to discriminate less-modulated pulse trains from a fully modulated standard, without controlling for loudness. The stimuli in experiment 2 were identical, but a two-interval task was used in which participants were required to choose the less-modulated interval, ignoring loudness. Catch trials, in which judgements based on level or modulation depth would give opposing answers, were included. Experiment 3 employed novel stimuli whose modulation envelope could be modified below a variable point in the dynamic range, without changing the loudness of the stimulus. Overall, results showed that substantial portions of the envelope are not accurately encoded by CI users. In experiment 1, where loudness cues were available, participants on average were insensitive to changes in the bottom 30% of their dynamic range. In experiment 2, where loudness was controlled, participants appeared insensitive to changes in the bottom 50% of the dynamic range. In experiment 3, participants were insensitive to changes in the bottom 80% of the dynamic range. We discuss potential reasons for this insensitivity and implications for CI speech-processing strategies.
Affiliation(s)
- Jessica J M Monaghan
  - Macquarie University, The Australian Hearing Hub, NSW, 2109, Sydney, Australia.
  - National Acoustic Laboratories, The Australian Hearing Hub, Sydney, NSW, 2109, Australia.
- Robert P Carlyon
  - Cambridge Hearing Group, Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK
- John M Deeks
  - Cambridge Hearing Group, Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK
5
Cabrera L, Lorenzini I, Rosen S, Varnet L, Lorenzi C. Temporal integration for amplitude modulation in childhood: Interaction between internal noise and memory. Hear Res 2021; 415:108403. [PMID: 34879987 DOI: 10.1016/j.heares.2021.108403] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/27/2021] [Revised: 11/17/2021] [Accepted: 11/25/2021] [Indexed: 11/25/2022]
Abstract
It is still unclear whether the gradual improvement in amplitude-modulation (AM) sensitivity typically found in children up to 10 years of age reflects an improvement in "processing efficiency" (the central ability to use information extracted by sensory mechanisms). This hypothesis was tested by evaluating temporal integration for AM, a capacity relying on memory and decision factors. This was achieved by measuring the effect of increasing the number of AM cycles (2 vs 8) on AM-detection thresholds for three groups of children aged from 5 to 11 years and a group of young adults. AM-detection thresholds were measured using a forced-choice procedure and sinusoidal AM (4 or 32 Hz rate) applied to a 1024-Hz pure-tone carrier. All age groups demonstrated temporal integration for AM at both rates; that is, significant improvements in AM sensitivity with a higher number of AM cycles. However, an effect of age was observed, as both 5-6 year olds and adults exhibited more temporal integration than 7-8 and 10-11 year olds at both rates. This difference is due to (i) the 5-6 year olds displaying the worst thresholds with 2 AM cycles but similar thresholds with 8 cycles compared to the 7-8 and 10-11 year olds, and (ii) adults showing the best thresholds with 8 AM cycles but similar thresholds with 2 cycles compared to the 7-8 and 10-11 year olds. Computational modelling indicated that higher levels of internal noise combined with poorer short-term memory capacities in children accounted for the developmental trends. Improvement in processing efficiency may therefore account for the development of AM detection in childhood. This article is part of the Special Issue "Outer hair cell", edited by Joseph Santos-Sacchi and Kumar Navaratnam.
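As an illustration of the stimulus class described above, the sketch below synthesizes a sinusoidally amplitude-modulated tone whose duration is set by a whole number of AM cycles, so that 2 versus 8 cycles at a fixed rate differ only in length. The carrier frequency, AM rates, and cycle counts come from the abstract; the sample rate, modulation depth, starting phases, absence of onset ramps, and all function names are assumptions made for this example.

```python
import math


def sam_tone(carrier_hz, mod_rate_hz, n_cycles, depth=1.0, sample_rate=44100):
    """Sinusoidally amplitude-modulated tone lasting a whole number of AM cycles."""
    duration_s = n_cycles / mod_rate_hz
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        # Envelope swings between (1 - depth) and (1 + depth).
        envelope = 1.0 + depth * math.sin(2 * math.pi * mod_rate_hz * t)
        samples.append(envelope * math.sin(2 * math.pi * carrier_hz * t))
    return samples


def depth_in_db(depth):
    """Modulation depth m expressed as 20*log10(m), a common unit for AM thresholds."""
    return 20 * math.log10(depth)


short_tone = sam_tone(1024, 4, n_cycles=2)  # 2 cycles at 4 Hz: 0.5 s of signal
long_tone = sam_tone(1024, 4, n_cycles=8)   # 8 cycles at 4 Hz: 2.0 s of signal
```

Fixing the number of cycles rather than the duration means that the 32-Hz stimuli are eight times shorter than the 4-Hz stimuli for the same cycle count, which is one reason cycle-based and time-based accounts of temporal integration can be compared.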
Affiliation(s)
- Laurianne Cabrera
  - Université de Paris, CNRS, Integrative Neuroscience and Cognition Center, F-75006 Paris, France; Speech, Hearing and Phonetic Sciences, UCL, United Kingdom.
- Irene Lorenzini
  - Université de Paris, CNRS, Integrative Neuroscience and Cognition Center, F-75006 Paris, France
- Stuart Rosen
  - Speech, Hearing and Phonetic Sciences, UCL, United Kingdom
- Léo Varnet
  - Laboratoire des Systèmes Perceptifs (UMR 8248), CNRS, Ecole normale supérieure, Université Paris Sciences & Lettres (PSL), Paris, France
- Christian Lorenzi
  - Laboratoire des Systèmes Perceptifs (UMR 8248), CNRS, Ecole normale supérieure, Université Paris Sciences & Lettres (PSL), Paris, France
6
Palandrani KN, Hoover EC, Stavropoulos T, Seitz AR, Isarangura S, Gallun FJ, Eddins DA. Temporal integration of monaural and dichotic frequency modulation. J Acoust Soc Am 2021; 150:745. [PMID: 34470296 PMCID: PMC8337085 DOI: 10.1121/10.0005729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/16/2020] [Revised: 06/17/2021] [Accepted: 07/02/2021] [Indexed: 05/06/2023]
Abstract
Frequency modulation (FM) detection at low modulation frequencies is commonly used as an index of temporal fine-structure processing. The present study evaluated the rate of improvement in monaural and dichotic FM across a range of test parameters. In experiment I, dichotic and monaural FM detection was measured as a function of duration and modulator starting phase. Dichotic FM thresholds were lower than monaural FM thresholds and the modulator starting phase had no effect on detection. Experiment II measured monaural FM detection for signals that differed in modulation rate and duration such that the improvement with duration in seconds (carrier) or cycles (modulator) was compared. Monaural FM detection improved monotonically with the number of modulation cycles, suggesting that the modulator is extracted prior to detection. Experiment III measured dichotic FM detection for shorter signal durations to test the hypothesis that dichotic FM relies primarily on the signal onset. The rate of improvement decreased as duration increased, which is consistent with the use of primarily onset cues for the detection of dichotic FM. These results establish that improvement with duration occurs as a function of the modulation cycles at a rate consistent with the independent-samples model for monaural FM, but later cycles contribute less to detection in dichotic FM.
Affiliation(s)
- Katherine N Palandrani
  - Department of Communication Sciences and Disorders, University of Maryland, College Park, Maryland 20742, USA
- Eric C Hoover
  - Department of Communication Sciences and Disorders, University of Maryland, College Park, Maryland 20742, USA
- Trevor Stavropoulos
  - Brain Game Center, University of California Riverside, Riverside, California 92521, USA
- Aaron R Seitz
  - Department of Psychology, University of California Riverside, Riverside, California 92521, USA
- Sittiprapa Isarangura
  - Department of Communication Sciences and Disorders, Mahidol University, Phaya Thai, Bangkok 10400, Thailand
- Frederick J Gallun
  - Oregon Hearing Research Center, Oregon Health and Science University, Portland, Oregon 97239, USA
- David A Eddins
  - Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
7
Füllgrabe C, Sęk A, Moore BCJ. Forward masking of amplitude modulation across ears and its tuning in the modulation domain. J Acoust Soc Am 2021; 149:1764. [PMID: 33765781 DOI: 10.1121/10.0003598] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/07/2020] [Accepted: 02/08/2021] [Indexed: 06/12/2023]
Abstract
Frequency selectivity in the amplitude modulation (AM) domain has been demonstrated using both simultaneous AM masking and forward AM masking. This has been explained using the concept of a modulation filter bank (MFB). Here, we assessed whether the MFB occurs before or after the point of binaural interaction in the auditory pathway by using forward masking in the AM domain in an ipsilateral condition (masker AM and signal AM applied to the left ear with an unmodulated carrier in the right ear) and a contralateral condition (masker AM applied to the right ear and signal AM applied to the left ear). The carrier frequency was 8 kHz, the signal AM frequency, fs, was 40 or 80 Hz, and the masker AM frequency ranged from 0.25 to 4 times fs. Contralateral forward AM masking did occur, but it was smaller than ipsilateral AM masking. Tuning in the AM domain was slightly sharper for ipsilateral than for contralateral masking, perhaps reflecting confusion of the signal and masker AM in the ipsilateral condition when their AM frequencies were the same. The results suggest that there might be an MFB both before and after the point in the auditory pathway where binaural interaction occurs.
Affiliation(s)
- Christian Füllgrabe
  - School of Sport, Exercise and Health Sciences, Loughborough University, Ashby Road, Loughborough LE11 3TU, United Kingdom
- Aleksander Sęk
  - Cambridge Hearing Group, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom
- Brian C J Moore
  - Cambridge Hearing Group, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom
8
Yao JD, Gimoto J, Constantinople CM, Sanes DH. Parietal Cortex Is Required for the Integration of Acoustic Evidence. Curr Biol 2020; 30:3293-3303.e4. [PMID: 32619478 DOI: 10.1016/j.cub.2020.06.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/07/2020] [Revised: 05/12/2020] [Accepted: 06/04/2020] [Indexed: 01/31/2023]
Abstract
Sensory-driven decisions are formed by accumulating information over time. Although parietal cortex activity is thought to represent accumulated evidence for sensory-based decisions, recent perturbation studies in rodents and non-human primates have challenged the hypothesis that these representations actually influence behavior. Here, we asked whether the parietal cortex integrates acoustic features from auditory cortical inputs during a perceptual decision-making task. If so, we predicted that selective inactivation of this projection should impair subjects' ability to accumulate sensory evidence. We trained gerbils to perform an auditory discrimination task and obtained measures of integration time as a readout of evidence accumulation capability. Minimum integration time was calculated behaviorally as the shortest stimulus duration for which subjects could discriminate the acoustic signals. Direct pharmacological inactivation of parietal cortex increased minimum integration times, suggesting its role in the behavior. To determine the specific impact of sensory evidence, we chemogenetically inactivated the excitatory projections from auditory cortex to parietal cortex and found this was sufficient to increase minimum behavioral integration times. Our signal-detection-theory-based model accurately replicated behavioral outcomes and indicated that the deficits in task performance were plausibly explained by elevated sensory noise. Together, our findings provide causal evidence that parietal cortex plays a role in the network that integrates auditory features for perceptual judgments.
Affiliation(s)
- Justin D Yao
  - Center for Neural Science, New York University, New York, NY 10003, USA.
- Justin Gimoto
  - Center for Neural Science, New York University, New York, NY 10003, USA
- Christine M Constantinople
  - Center for Neural Science, New York University, New York, NY 10003, USA; Neuroscience Institute, NYU Langone Medical Center, New York University, New York, NY 10016, USA
- Dan H Sanes
  - Center for Neural Science, New York University, New York, NY 10003, USA; Department of Psychology, New York University, New York, NY 10003, USA; Department of Biology, New York University, New York, NY 10003, USA; Neuroscience Institute, NYU Langone Medical Center, New York University, New York, NY 10016, USA
9
Cai H, Dent ML. Best sensitivity of temporal modulation transfer functions in laboratory mice matches the amplitude modulation embedded in vocalizations. J Acoust Soc Am 2020; 147:337. [PMID: 32006990 PMCID: PMC7043865 DOI: 10.1121/10.0000583] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/30/2019] [Revised: 12/18/2019] [Accepted: 12/22/2019] [Indexed: 06/10/2023]
Abstract
The perception of spectrotemporal changes is crucial for distinguishing between acoustic signals, including vocalizations. Temporal modulation transfer functions (TMTFs) have been measured in many species and reveal that the discrimination of amplitude modulation suffers at rapid modulation frequencies. TMTFs were measured in six CBA/CaJ mice in an operant conditioning procedure, where mice were trained to discriminate an 800 ms amplitude modulated white noise target from a continuous noise background. TMTFs of mice show a bandpass characteristic, with an upper limit cutoff frequency of around 567 Hz. Within the measured modulation frequencies ranging from 5 Hz to 1280 Hz, the mice show a best sensitivity for amplitude modulation at around 160 Hz. To look for a possible parallel evolution between sound perception and production in living organisms, we also analyzed the components of amplitude modulations embedded in natural ultrasonic vocalizations (USVs) emitted by this strain. We found that the cutoff frequency of amplitude modulation in most of the individual USVs is around their most sensitive range obtained from the psychoacoustic experiments. Further analyses of the duration and modulation frequency ranges of USVs indicated that the broader the frequency ranges of amplitude modulation in natural USVs, the shorter the durations of the USVs.
Affiliation(s)
- Huaizhen Cai
  - Department of Psychology, University at Buffalo-SUNY, Buffalo, New York 14260, USA
- Micheal L Dent
  - Department of Psychology, University at Buffalo-SUNY, Buffalo, New York 14260, USA
10
Effects of Hearing Loss and Fast-Acting Compression on Amplitude Modulation Perception and Speech Intelligibility. Ear Hear 2019; 40:45-54. [DOI: 10.1097/aud.0000000000000589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/17/2022]
11
Almishaal A, Bidelman GM, Jennings SG. Notched-noise precursors improve detection of low-frequency amplitude modulation. J Acoust Soc Am 2017; 141:324. [PMID: 28147582 PMCID: PMC5392086 DOI: 10.1121/1.4973912] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/07/2016] [Revised: 12/21/2016] [Accepted: 12/22/2016] [Indexed: 05/19/2023]
Abstract
Amplitude modulation (AM) detection was measured with a short (50 ms), high-frequency carrier as a function of carrier level (Experiment I) and modulation frequency (Experiment II) for conditions with or without a notched-noise precursor. A longer carrier (500 ms) was also included in Experiment I. When the carrier was preceded by silence (no-precursor condition), AM detection thresholds worsened for moderate-level carriers compared to lower- or higher-level carriers, resulting in a "mid-level hump." AM detection thresholds with a precursor were better than those without a precursor, primarily for moderate-to-high level carriers, thus eliminating the mid-level hump in AM detection. When the carrier was 500 ms, AM thresholds improved by a constant (across all levels) relative to AM thresholds with a precursor, consistent with the longer carrier providing more "looks" to detect the AM signal. Experiment II revealed that the improvement in AM detection with a precursor, relative to without one, is limited to low modulation frequencies (<60 Hz). These results are consistent with (1) a reduction in cochlear gain over the course of the precursor, perhaps via the medial olivocochlear reflex, or (2) a form of perceptual enhancement that may be mediated by adaptation of inhibition.
Affiliation(s)
- Ali Almishaal
  - Department of Communication Sciences and Disorders, The University of Utah, 390 South, 1530 East, Behavioral Sciences Building 1201, Salt Lake City, Utah 84112, USA
- Gavin M Bidelman
  - School of Communication Sciences and Disorders and Institute for Intelligent Systems, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
- Skyler G Jennings
  - Department of Communication Sciences and Disorders, The University of Utah, 390 South, 1530 East, BEHS 1201, Salt Lake City, Utah 84112, USA
12
Shen Y. The effect of frequency cueing on the perceptual segregation of simultaneous tones: Bottom-up and top-down contributions. J Acoust Soc Am 2016; 140:3496. [PMID: 27908095 PMCID: PMC5848834 DOI: 10.1121/1.4965969] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/01/2016] [Revised: 09/30/2016] [Accepted: 10/08/2016] [Indexed: 06/06/2023]
Abstract
Listeners were presented with two simultaneous tones of different frequencies (more than one octave apart) and asked to identify the tone that was amplitude-modulated while a tonal precursor was presented to cue the frequency of the lower frequency tone. Performance thresholds were estimated based on the duration of the tone-pair. In Exp. I the duration of the precursor varied from 100 to 400 ms and the inter-stimulus interval (ISI) between the precursor and the tone-pair varied from 0 to 1 s. The presence of the precursor facilitated segregation. As the ISI increased, the facilitation effect of the precursor increased for the precursor durations of 100 and 200 ms, but not for the 400-ms precursor duration. When the precursor was presented to the contralateral ear relative to the tone-pair in Exp. II, no significant change to the precursor effect was observed. These observations contradict the predictions of the model based solely on bottom-up processing, suggesting the likely involvement of top-down processes.
Affiliation(s)
- Yi Shen
  - Department of Speech and Hearing Sciences, Indiana University Bloomington, Bloomington, Indiana 47405, USA
13
Sek A, Baer T, Crinnion W, Springgay A, Moore BCJ. Modulation masking within and across carriers for subjects with normal and impaired hearing. J Acoust Soc Am 2015; 138:1143-1153. [PMID: 26328728 DOI: 10.1121/1.4928135] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]
Abstract
The detection of amplitude modulation (AM) of a carrier can be impaired by additional (masker) AM applied to the same carrier (within-carrier modulation masking, MM) or to a different carrier (across-carrier MM). These two types of MM were compared for young normal-hearing and older hearing-impaired subjects. The signal was 4- or 16-Hz sinusoidal AM of a 4000-Hz carrier. Masker AM with depth 0.4 was applied either to the same carrier or to a carrier at 3179 or 2518 Hz. The masker AM rate was 0.25, 0.5, 1, 2, or 4 times the signal rate. The signal AM depth was varied adaptively to determine the threshold. Both within-carrier and across-carrier MM patterns were similar for the two groups, suggesting that the hypothetical modulation filters are not affected by hearing loss or age. The signal AM detection thresholds were also similar for the two groups. Thresholds in the absence of masker AM were lower (better) for the older hearing-impaired than for the young normal-hearing subjects. Since the masked modulation thresholds were similar for the two groups, it seems unlikely that abnormal MM contributes to the difficulties experienced by older hearing-impaired people in understanding speech in background sounds.
Affiliation(s)
- Aleksander Sek
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
- Thomas Baer
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
- William Crinnion
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
- Alastair Springgay
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
14
Sabin AT, Gallun FJ, Souza PE. Acoustical correlates of performance on a dynamic range compression discrimination task. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:2136-47. [PMID: 23967944 PMCID: PMC3765331 DOI: 10.1121/1.4816410] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 05/16/2023]
Abstract
Dynamic range compression is widely used to reduce the difference between the most and least intense portions of a signal. Such compression distorts the shape of the amplitude envelope of a signal, but it is unclear to what extent such distortions are actually perceivable by listeners. Here, the ability to distinguish between compressed and uncompressed versions of a noise vocoded sentence was initially measured in listeners with normal hearing while varying the threshold, ratio, attack, and release parameters. This narrow condition was selected in order to characterize perception under the most favorable listening conditions. The average behavioral sensitivity to compression was highly correlated to several acoustical indices of modulation depth. In particular, performance was highly correlated to the Euclidean distance between the modulation spectra of the uncompressed and compressed signals. Suggesting that this relationship is not restricted to the initial test conditions, the correlation remained largely unchanged both (1) when listeners with normal hearing were tested using a time-compressed version of the original signal, and (2) when listeners with impaired hearing were tested using the original signal. If this relationship generalizes to more ecologically valid conditions, it will provide a straightforward method for predicting the detectability of compression-induced distortions.
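The distance measure named in this abstract can be illustrated with a toy example. The sketch below is not the authors' analysis: it assumes the modulation spectrum is the normalised Fourier magnitude of an already-extracted envelope at a few modulation frequencies, and it uses a hypothetical static compressor (`compress`) with made-up threshold and ratio values.

```python
import math

def modulation_spectrum(envelope, sample_rate, mod_freqs):
    """Fourier magnitude of the envelope at each modulation frequency,
    normalised by the mean (DC) level so a sinusoidal AM of depth m gives m."""
    n = len(envelope)
    dc = sum(envelope) / n
    spectrum = []
    for f in mod_freqs:
        re = sum(e * math.cos(2 * math.pi * f * k / sample_rate)
                 for k, e in enumerate(envelope))
        im = sum(e * math.sin(2 * math.pi * f * k / sample_rate)
                 for k, e in enumerate(envelope))
        spectrum.append(2 * math.hypot(re, im) / (n * dc))
    return spectrum

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compress(envelope, threshold=1.0, ratio=4.0):
    """Toy static compressor: the level above threshold is divided by `ratio` in dB."""
    return [threshold * (e / threshold) ** (1.0 / ratio) if e > threshold else e
            for e in envelope]

sample_rate = 1000  # Hz, assumed envelope sampling rate
envelope = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * k / sample_rate)
            for k in range(sample_rate)]  # 1 s of 4-Hz AM, depth 0.5
mod_freqs = [2, 4, 8, 16]

uncompressed = modulation_spectrum(envelope, sample_rate, mod_freqs)
compressed = modulation_spectrum(compress(envelope), sample_rate, mod_freqs)
distance = euclidean_distance(uncompressed, compressed)
```

A larger `distance` indicates that compression changed the envelope's modulation content more, which is the quantity the abstract reports as correlating with discrimination performance.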
Affiliation(s)
- Andrew T Sabin
- Department of Communication Sciences and Disorders, Northwestern University, 2240 Campus Drive, Evanston, Illinois 60201, USA.
15
Keitel A, Prinz W, Friederici AD, von Hofsten C, Daum MM. Perception of conversations: the importance of semantics and intonation in children's development. J Exp Child Psychol 2013; 116:264-77. [PMID: 23876388 DOI: 10.1016/j.jecp.2013.06.005] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 01/31/2013] [Revised: 05/30/2013] [Accepted: 06/12/2013] [Indexed: 11/16/2022]
Abstract
In conversations, adults readily detect and anticipate the end of a speaker's turn. However, little is known about the development of this ability. We addressed two important aspects involved in the perception of conversational turn taking: semantic content and intonational form. The influence of semantics was investigated by testing prelinguistic and linguistic children. The influence of intonation was tested by presenting participants with videos of two dyadic conversations: one with normal intonation and one with flattened (removed) intonation. Children of four different age groups--two prelinguistic groups (6- and 12-month-olds) and two linguistic groups (24- and 36-month-olds)--and an adult group participated. Their eye movements were recorded, and the frequency of anticipated turns was analyzed. Our results show that (a) the anticipation of turns was reliable only in 3-year-olds and adults, with younger children shifting their gaze between speakers regardless of the turn taking, and (b) only 3-year-olds anticipated turns better if intonation was normal. These results indicate that children anticipate turns in conversations in a manner comparable (but not identical) to adults only after they have developed a sophisticated understanding of language. In contrast to adults, 3-year-olds rely more strongly on prosodic information during the perception of conversational turn taking.
Affiliation(s)
- Anne Keitel
- Research Group Infant Cognition and Action, Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany.
16
Sayles M, Füllgrabe C, Winter IM. Neurometric amplitude-modulation detection threshold in the guinea-pig ventral cochlear nucleus. J Physiol 2013; 591:3401-19. [PMID: 23629508 DOI: 10.1113/jphysiol.2013.253062] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022] Open
Abstract
Amplitude modulation (AM) is a pervasive feature of natural sounds. Neural detection and processing of modulation cues is behaviourally important across species. Although most ecologically relevant sounds are not fully modulated, physiological studies have usually concentrated on fully modulated (100% modulation depth) signals. Psychoacoustic experiments mainly operate at low modulation depths, around detection threshold (∼5% AM). We presented sinusoidal amplitude-modulated tones, systematically varying modulation depth between zero and 100%, at a range of modulation frequencies, to anaesthetised guinea-pigs while recording spikes from neurons in the ventral cochlear nucleus (VCN). The cochlear nucleus is the site of the first synapse in the central auditory system. At this locus significant signal processing occurs with respect to representation of AM signals. Spike trains were analysed in terms of the vector strength of spike synchrony to the amplitude envelope. Neurons showed either low-pass or band-pass temporal modulation transfer functions, with the proportion of band-pass responses increasing with increasing sound level. The proportion of units showing a band-pass response varies with unit type: sustained chopper (CS) > transient chopper (CT) > primary-like (PL). Spike synchrony increased with increasing modulation depth. At the lowest modulation depth (6%), significant spike synchrony was only observed near to the unit's best modulation frequency for all unit types tested. Modulation tuning therefore became sharper with decreasing modulation depth. AM detection threshold was calculated for each individual unit as a function of modulation frequency. Chopper units have significantly better AM detection thresholds than do primary-like units. AM detection threshold is significantly worse at 40 dB vs. 10 dB above pure-tone spike rate threshold. 
Mean modulation detection thresholds for sounds 10 dB above pure-tone spike rate threshold at best modulation frequency are (95% CI) 11.6% (10.0-13.1) for PL units, 9.8% (8.2-11.5) for CT units, and 10.8% (8.4-13.2) for CS units. The most sensitive guinea-pig VCN single unit AM detection thresholds are similar to human psychophysical performance (∼3% AM), while the mean neurometric thresholds approach whole animal behavioural performance (∼10% AM).
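The vector strength measure mentioned in this abstract is conventionally defined as the length of the mean resultant of spike phases relative to the modulation cycle. The sketch below assumes that standard definition; the function name and the example spike trains are hypothetical and are not data from the study.

```python
import cmath
import math

def vector_strength(spike_times, mod_freq):
    """Vector strength of spike times (s) relative to a modulation frequency (Hz).

    Each spike is mapped to a unit vector at its phase within the modulation
    cycle; the result is the length of the average vector (0 = no synchrony,
    1 = every spike at the same phase).
    """
    if not spike_times:
        return 0.0
    total = sum(cmath.exp(2j * math.pi * mod_freq * t) for t in spike_times)
    return abs(total) / len(spike_times)

mod_freq = 100.0  # Hz, assumed modulation frequency

# One spike per cycle, always at the same phase: perfect synchrony.
locked = [k / mod_freq for k in range(50)]

# Eight spikes spread evenly over the modulation cycle: no synchrony.
spread = [k / (8 * mod_freq) for k in range(8)]

vs_locked = vector_strength(locked, mod_freq)
vs_spread = vector_strength(spread, mod_freq)
```

Here `vs_locked` is close to 1 and `vs_spread` is close to 0, matching the two limiting cases of the definition.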
Affiliation(s)
- Mark Sayles
- Department of Otolaryngology - Head and Neck Surgery, Queen's Medical Centre, Nottingham, NG7 2UH, UK.
17
Dubno JR, Ahlstrom JB, Wang X, Horwitz AR. Level-dependent changes in perception of speech envelope cues. J Assoc Res Otolaryngol 2012; 13:835-52. [PMID: 22872414 DOI: 10.1007/s10162-012-0343-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/17/2012] [Accepted: 07/16/2012] [Indexed: 11/28/2022] Open
Abstract
Level-dependent changes in temporal envelope fluctuations in speech and related changes in speech recognition may reveal effects of basilar-membrane nonlinearities. As a result of compression in the basilar-membrane response, the "effective" magnitude of envelope fluctuations may be reduced as speech level increases from lower level (more linear) to mid-level (more compressive) regions. With further increases to a more linear region, speech envelope fluctuations may become more pronounced. To assess these effects, recognition of consonants and key words in sentences was measured as a function of speech level for younger adults with normal hearing. Consonant-vowel syllables and sentences were spectrally degraded using "noise vocoder" processing to maximize perceptual effects of changes to the speech envelope. Broadband noise at a fixed signal-to-noise ratio maintained constant audibility as speech level increased. Results revealed significant increases in scores and envelope-dependent feature transmission from 45 to 60 dB SPL and decreasing scores and feature transmission from 60 to 85 dB SPL. This quadratic pattern, with speech recognition maximized at mid levels and poorer at lower and higher levels, is consistent with a role of cochlear nonlinearities in perception of speech envelope cues.
Affiliation(s)
- Judy R Dubno
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC 29425-5500, USA.
18
O'Connor KN, Johnson JS, Niwa M, Noriega NC, Marshall EA, Sutter ML. Amplitude modulation detection as a function of modulation frequency and stimulus duration: comparisons between macaques and humans. Hear Res 2011; 277:37-43. [PMID: 21457768 DOI: 10.1016/j.heares.2011.03.014] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 01/14/2011] [Revised: 03/16/2011] [Accepted: 03/22/2011] [Indexed: 11/26/2022]
Abstract
Previous observations show that humans outperform non-human primates on some temporally-based auditory discrimination tasks, suggesting there are species differences in the proficiency of auditory temporal processing among primates. To further resolve these differences we compared the abilities of rhesus macaques and humans to detect sine-amplitude modulation (AM) of a broad-band noise carrier as a function of both AM frequency (2.5 Hz-2 kHz) and signal duration (50-800 ms), under similar testing conditions. Using a go/no-go AM detection task, we found that macaques were less sensitive than humans at the lower frequencies and shorter durations tested but were as, or slightly more, sensitive at higher frequencies and longer durations. Humans had broader AM tuning functions, with lower frequency regions of peak sensitivity (10-60 Hz) than macaques (30-120 Hz). These results support the notion that there are species differences in temporal processing among primates, and underscore the importance of stimulus duration when making cross-species comparisons for temporally-based tasks.
Affiliation(s)
- Kevin N O'Connor
- Center for Neuroscience, UC Davis, 1544 Newton Ct. Davis, CA 95616, USA.
19
Hotehama T, Nakagawa S. Modulation detection for amplitude-modulated bone-conducted sounds with sinusoidal carriers in the high- and ultrasonic-frequency range. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:3011-3018. [PMID: 21110596 DOI: 10.1121/1.3493421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]
Abstract
Ultrasonic vibration generates a sensation of sound via bone-conduction. This phenomenon is called bone-conducted ultrasonic (BCU) hearing. Complex sounds can also be perceived by amplitude-modulating a BCU stimulus (AM-BCU). The influence of the modulation frequency on the sensitivity to detecting amplitude modulation of sinusoidal carriers of 10, 20, and 30 kHz was examined to clarify the characteristics of the perception of amplitude modulation over the sonic or audio-frequency range and the ultrasonic range. In addition, the detection sensitivity for single-sideband modulation for a 20 kHz carrier was measured. Temporal modulation transfer functions (TMTFs) obtained at each carrier frequency suggest that the auditory system has the ability to process timing information in the envelopes of AM-BCUs at lower modulation frequencies, as is the case with audio-frequency sounds. The possible influence of peripheral filtering on the shape of the TMTF at higher frequencies was examined.
Affiliation(s)
- Takuya Hotehama
- Japan Society for the Promotion of Science, National Institute of Advanced Industrial Science and Technology, 1-8-31 Midorigaoka, Ikeda, Osaka 563-8577, Japan
20
Malone BJ, Scott BH, Semple MN. Temporal codes for amplitude contrast in auditory cortex. J Neurosci 2010; 30:767-84. [PMID: 20071542 PMCID: PMC3551278 DOI: 10.1523/jneurosci.4170-09.2010] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 08/24/2009] [Revised: 10/16/2009] [Accepted: 11/11/2009] [Indexed: 11/21/2022] Open
Abstract
The encoding of sound level is fundamental to auditory signal processing, and the temporal information present in amplitude modulation is crucial to the complex signals used for communication sounds, including human speech. The modulation transfer function, which measures the minimum detectable modulation depth across modulation frequency, has been shown to predict speech intelligibility performance in a range of adverse listening conditions and hearing impairments, and even for users of cochlear implants. We presented sinusoidal amplitude modulation (SAM) tones of varying modulation depths to awake macaque monkeys while measuring the responses of neurons in the auditory core. Using spike train classification methods, we found that thresholds for modulation depth detection and discrimination in the most sensitive units are comparable to psychophysical thresholds when precise temporal discharge patterns rather than average firing rates are considered. Moreover, spike timing information was also superior to average rate information when discriminating static pure tones varying in level but with similar envelopes. The limited utility of average firing rate information in many units also limited the utility of standard measures of sound level tuning, such as the rate level function (RLF), in predicting cortical responses to dynamic signals like SAM. Response modulation typically exceeded that predicted by the slope of the RLF by large factors. The decoupling of the cortical encoding of SAM and static tones indicates that enhancing the representation of acoustic contrast is a cardinal feature of the ascending auditory pathway.
Affiliation(s)
- Brian J Malone
- Center for Neural Science at New York University, New York, New York 10003, USA.
21
Heise SJ, Mauermann M, Verhey JL. Investigating possible mechanisms behind the effect of threshold fine structure on amplitude modulation perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:2490-2500. [PMID: 19894829 DOI: 10.1121/1.3224731] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]
Abstract
Detection thresholds for sinusoidal amplitude modulation at low levels are higher (worse) when the carrier of the signal falls in a region of high pure-tone sensitivity (a minimum of the fine structure of the threshold in quiet) than when it falls at a fine-structure maximum. This study explores possible mechanisms behind this phenomenon by measuring modulation detection thresholds as a function of modulation frequency (experiment 1) and of carrier level for tonal carriers (experiment 2) and for 32-Hz wide noise carriers (experiment 3). The carriers could either fall at a fine-structure minimum, a fine-structure maximum, or in a region without fine structure. Modulation frequencies varied between 8 Hz and one fine-structure cycle, and carrier levels varied between 7.5 and 37.5 dB sensation levels. A large part of the results can be explained by assuming a reduction in effective modulation depth by spontaneous otoacoustic emissions (or, more generally, cochlear resonances) that synchronize to the carrier at fine-structure minima. Beating between cochlear resonances and the stimulus ("monaural diplacusis") may hamper the detection task, but this cannot account for the whole effect.
Affiliation(s)
- Stephan J Heise
- Institut für Physik, Universität Oldenburg, D-26111 Oldenburg, Germany.
22
He NJ, Mills JH, Ahlstrom JB, Dubno JR. Age-related differences in the temporal modulation transfer function with pure-tone carriers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:3841-9. [PMID: 19206810 PMCID: PMC2676625 DOI: 10.1121/1.2998779] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.0] [Indexed: 05/09/2023]
Abstract
Detection of amplitude modulation (AM) in 500 and 4000 Hz tonal carriers was measured as a function of modulation frequency from younger and older adults with normal hearing through 4000 Hz. The modulation frequency above which sensitivity to AM increased ("transition frequency") was similar for both groups. Temporal modulation transfer function shapes showed significant age-related differences. For younger subjects, AM detection thresholds were generally constant for low modulation frequencies. For a higher carrier frequency, AM detection thresholds then increased as modulation frequency further increased until the transition frequency. In contrast, AM detection for older subjects continuously increased with increasing modulation frequency, indicating an age-related decline in temporal resolution for faster envelope fluctuations. Significant age-related differences were observed whenever AM detection was dependent on temporal cues. For modulation frequencies above the transition frequency, age-related differences were larger for the lower frequency carrier (where both temporal and spectral cues were available) than for the higher frequency carrier (where AM detection was primarily dependent on spectral cues). These results are consistent with a general age-related decline in the synchronization of neural responses to both the carrier waveform and envelope fluctuation.
Affiliation(s)
- Ning-ji He
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, MSC 550, Charleston, South Carolina 29425-5500, USA
23
Koopman J, Houtgast T, Dreschler WA. Modulation detection interference for asynchronous presentation of masker and target in listeners with normal and impaired hearing. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2008; 51:1588-98. [PMID: 18695020 DOI: 10.1044/1092-4388(2008/07-0075)] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/24/2023]
Abstract
PURPOSE: The sensitivity to sinusoidal amplitude modulations (SAMs) is reduced when other modulated maskers are presented simultaneously at a distant frequency (also referred to as modulation detection interference [MDI]). This article describes results obtained with the onset difference between masker and target as a parameter. METHOD: Carrier frequencies were 1 kHz (target: 625 ms, 8 Hz SAM) and 2 kHz (masker: 625 ms, 8 Hz SAM; modulation depth = 1) presented at 25 dB SL for listeners with impaired hearing (n = 8) and at 25 dB SL and 50 dB SL for listeners with normal hearing (n = 6). The masker was delayed by 0, 125, 250, 500, 625, or 750 ms relative to the target. RESULTS: Sensitivity to SAMs was reduced in both groups by simultaneous presentation of a modulated masker. Reducing the temporal overlap (i.e., increasing the onset delay between masker and target) increased the sensitivity to SAMs in the presence of modulated maskers. CONCLUSION: The gradual reduction in MDI with increasing asynchrony between masker and target suggests that MDI is not solely related to perceptual grouping. Reduced sensitivity to SAMs due to prior stimulation with SAM stimuli (forward masking), and deficits in across-channel integration, are other factors that may play a role.
Affiliation(s)
- Jan Koopman
- Department of ENT, Erasmus Medical Center, Audiological Center - D1.26, Antwoordnummer 55, 3070 WB Rotterdam, The Netherlands.
24
Jepsen ML, Ewert SD, Dau T. A computational model of human auditory signal processing and perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:422-438. [PMID: 18646987 DOI: 10.1121/1.2924135] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Indexed: 05/26/2023]
Abstract
A model of computational auditory signal-processing and perception that accounts for various aspects of simultaneous and nonsimultaneous masking in human listeners is presented. The model is based on the modulation filterbank model described by Dau et al. [J. Acoust. Soc. Am. 102, 2892 (1997)] but includes major changes at the peripheral and more central stages of processing. The model contains outer- and middle-ear transformations, a nonlinear basilar-membrane processing stage, a hair-cell transduction stage, a squaring expansion, an adaptation stage, a 150-Hz lowpass modulation filter, a bandpass modulation filterbank, a constant-variance internal noise, and an optimal detector stage. The model was evaluated in experimental conditions that reflect, to a different degree, effects of compression as well as spectral and temporal resolution in auditory processing. The experiments include intensity discrimination with pure tones and broadband noise, tone-in-noise detection, spectral masking with narrow-band signals and maskers, forward masking with tone signals and tone or noise maskers, and amplitude-modulation detection with narrow- and wideband noise carriers. The model can account for most of the key properties of the data and is more powerful than the original model. The model might be useful as a front end in technical applications.
Affiliation(s)
- Morten L Jepsen
- Centre for Applied Hearing Research, Acoustic Technology, Department of Electrical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
25
Edwards DR, Lee J, Andrews J, Wong A. Contribution of onset/offset information of modulation to amplitude modulation depth discrimination. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:EL111-EL115. [PMID: 18529084 PMCID: PMC2811550 DOI: 10.1121/1.2895758] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/27/2007] [Accepted: 02/07/2008] [Indexed: 05/26/2023]
Abstract
A previous study by [J. Lee, G. Long, and C. Jeung, J. Acoust. Soc. Am. 119, S3332 (2006)] found that information at the onset or offset of modulation could be utilized for improved amplitude modulation (AM) depth discrimination in a continuous carrier condition (carrier presented 250 ms earlier and later than the modulator). In this study, the relative contribution of information at the onset or offset of the modulation was examined with an onset-fringe carrier condition (carrier begins 250 ms earlier than the modulator) and an offset-fringe condition (carrier ends 250 ms later than the modulator). The results suggest that modulation information at the onset might be utilized more than at the offset.
Affiliation(s)
- Derek R Edwards
- Department of Speech, Language, and Hearing Sciences, University of Arizona, 1131 E. 2nd Street, Tucson, Arizona 85721-0071, USA.
26
Piechowiak T, Ewert SD, Dau T. Modeling comodulation masking release using an equalization-cancellation mechanism. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:2111-26. [PMID: 17471726 DOI: 10.1121/1.2534227] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 05/15/2023]
Abstract
This study presents an auditory processing model that accounts for the perceptual phenomenon of comodulation masking release (CMR). The model includes an equalization-cancellation (EC) stage for the processing of activity across the audio-frequency axis. The EC process across frequency takes place at the output of a modulation filterbank assumed for each audio-frequency channel. The model was evaluated in three experimental conditions: (i) CMR with four widely spaced flanking bands in order to study pure across-channel processing, (ii) CMR with one flanking band varying in frequency in order to study the transition between conditions dominated by within-channel processing and those dominated by across-channel processing, and (iii) CMR obtained in the "classical" band-widening paradigm in order to study the role of across-channel processing in a condition which always includes within-channel processing. The simulations support the hypothesis that within-channel contributions to CMR can be as large as 15 dB. The across-channel process is robust but small (about 2-4 dB) and only observable at small masker bandwidths. Overall, the proposed model might provide an interesting framework for the analysis of fluctuating sounds in the auditory system.
Affiliation(s)
- Tobias Piechowiak
- Centre for Applied Hearing Research, Acoustic Technology, Ørsted DTU, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
27
Nelson PC, Ewert SD, Carney LH, Dau T. Comparison of level discrimination, increment detection, and comodulation masking release in the audio- and envelope-frequency domains. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:2168-81. [PMID: 17471731 PMCID: PMC2572867 DOI: 10.1121/1.2535868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/15/2023]
Abstract
In general, the temporal structure of stimuli must be considered to account for certain observations made in detection and masking experiments in the audio-frequency domain. Two such phenomena are (1) a heightened sensitivity to amplitude increments with a temporal fringe compared to gated level discrimination performance and (2) lower tone-in-noise detection thresholds using a modulated masker compared to those using an unmodulated masker. In the current study, translations of these two experiments were carried out to test the hypothesis that analogous cues might be used in the envelope-frequency domain. Pure-tone carrier amplitude-modulation (AM) depth-discrimination thresholds were found to be similar using both traditional gated stimuli and using a temporally modulated fringe for a fixed standard depth (m_s = 0.25) and a range of AM frequencies (4-64 Hz). In a second experiment, masked sinusoidal AM detection thresholds were compared in conditions with and without slow and regular fluctuations imposed on the instantaneous masker AM depth. Release from masking was obtained only for very slow masker fluctuations (less than 2 Hz). A physiologically motivated model that effectively acts as a first-order envelope change detector accounted for several, but not all, of the key aspects of the data.
Affiliation(s)
- Paul C Nelson
- Department of Biomedical and Chemical Engineering and Institute for Sensory Research, Syracuse University, New York 13244, USA
28
Pincas J, Jackson PJB. Amplitude modulation of turbulence noise by voicing in fricatives. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:3966-77. [PMID: 17225423 DOI: 10.1121/1.2358004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/13/2023]
Abstract
The two principal sources of sound in speech, voicing and frication, occur simultaneously in voiced fricatives as well as at the vowel-fricative boundary in phonologically voiceless fricatives. Instead of simply overlapping, the two sources interact. This paper is an acoustic study of one such interaction effect: the amplitude modulation of the frication component when voicing is present. Corpora of sustained and fluent-speech English fricatives were recorded and analyzed using a signal-processing technique designed to extract estimates of modulation depth. Results reveal a pattern, consistent across speaking style, speaker, and place of articulation, for modulation at f0 to rise at low voicing strengths and subsequently saturate. Voicing strength needed to produce saturation varied 60-66 dB across subjects and experimental conditions. Modulation depths at saturation varied little across speakers but significantly for place of articulation (with [z] showing particularly strong modulation) clustering at approximately 0.4-0.5 (a 40%-50% fluctuation above and below unmodulated amplitude); spectral analysis of modulating signals revealed weak but detectable modulation at the second and third harmonics (i.e., 2f0 and 3f0).
Affiliation(s)
- Jonathan Pincas
- Centre for Vision, Speech & Signal Processing, University of Surrey, Guildford, GU2 7XH, United Kingdom.
29
Lyzenga J, Carlyon RP. Detection, direction discrimination, and off-frequency interference of center-frequency modulations and glides for vowel formants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 117:3042-53. [PMID: 15957773 DOI: 10.1121/1.1882943] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/03/2023]
Abstract
Vowels are mainly classified by the positions of peaks in their frequency spectra, the formants. For normal-hearing subjects, change detection and direction discrimination were measured for linear glides in the center frequency (CF) of formantlike sounds. A CF rove was used to prevent subjects from using either the start or end points of the glides as cues. In addition, change detection and starting-phase (start-direction) discrimination were measured for similar stimuli with a sinusoidal 5-Hz formant-frequency modulation. The stimuli consisted of single formants generated using a number of different stimulus parameters including fundamental frequency, spectral slope, frequency region, and position of the formant relative to the harmonic spectrum. The change detection thresholds were in good agreement with the predictions of a model which analyzed and combined the effects of place-of-excitation and temporal cues. For most stimuli, thresholds were approximately equal for change detection and start-direction discrimination. Exceptions were found for stimuli that consisted of only one or two harmonics. In a separate experiment, it was shown that change detection and start-direction discrimination of linear and sinusoidal formant-frequency modulations were impaired by off-frequency frequency-modulated interferers. This frequency modulation detection interference was larger for formants with shallow than for those with steep spectral slopes.
Affiliation(s)
- J Lyzenga
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 2EF, United Kingdom.
|
30
|
Ewert SD, Dau T. External and internal limitations in amplitude-modulation processing. J Acoust Soc Am 2004; 116:478-490. [PMID: 15296007 DOI: 10.1121/1.1737399] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 05/24/2023]
Abstract
Three experiments are presented to explore the relative role of "external" signal variability and "internal" resolution limitations of the auditory system in the detection and discrimination of amplitude modulations (AM). In the first experiment, AM-depth discrimination performance was determined using sinusoidally modulated broadband-noise and pure-tone carriers. The AM index, m, of the standard ranged from -28 to -3 dB (expressed as 20 log m). AM-depth discrimination thresholds were found to be a fraction of the AM depth of the standard for standards down to -18 dB, in the case of the pure-tone carrier, and down to -8 dB, in the case of the broadband-noise carrier. For smaller standards, AM-depth discrimination required a fixed increase in AM depth, independent of the AM depth of the standard. In the second experiment, AM-detection thresholds were obtained for signal-modulation frequencies of 4, 16, 64, and 256 Hz, applied to either a band-limited random-noise carrier or a deterministic ("frozen") noise carrier, as a function of carrier bandwidth (8 to 2048 Hz). In general, detection thresholds were higher for the random- than for the frozen-noise carriers. For both carrier types, thresholds followed the pattern expected from frequency-selective processing of the stimulus envelope. The third experiment investigated AM masking at 4, 16, and 64 Hz in the presence of a narrow-band masker modulation. The variability of the masker was changed from entirely frozen to entirely random, while the long-term average envelope power spectrum was held constant. The experiment examined the validity of a long-term average quantity as the decision variable, and the role of memory in experiments with frozen-noise maskers. The empirical results were compared to predictions obtained with two modulation-filterbank models. 
The predictions revealed that AM-depth discrimination and AM detection are limited by a combination of the external signal variability and an internal "Weber-fraction" noise process.
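The abstract above expresses modulation depth as 20 log m, where m is the AM index. As a reading aid, the following minimal Python sketch converts between the two scales. The function names are illustrative assumptions and do not come from the paper.

```python
import math


def depth_db_to_index(depth_db: float) -> float:
    """Convert an AM depth in dB (20 log10 m) to the linear AM index m."""
    return 10.0 ** (depth_db / 20.0)


def index_to_depth_db(m: float) -> float:
    """Convert a linear AM index m (m > 0) to depth in dB (20 log10 m)."""
    if m <= 0.0:
        raise ValueError("AM index must be positive to express it in dB")
    return 20.0 * math.log10(m)


# Standards quoted in the abstract, shown here as linear AM indices.
for db in (-28.0, -18.0, -8.0, -3.0):
    print(f"{db:6.1f} dB -> m = {depth_db_to_index(db):.4f}")
```

On this scale, 0 dB corresponds to full modulation (m = 1), and every 20 dB step down divides m by ten.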
Affiliation(s)
- Stephan D Ewert
- Carl von Ossietzky Universität Oldenburg, Medizinische Physik, D-26111 Oldenburg, Germany.
|
31
|
Ross B, Picton TW, Pantev C. Temporal integration in the human auditory cortex as represented by the development of the steady-state magnetic field. Hear Res 2002; 165:68-84. [PMID: 12031517 DOI: 10.1016/s0378-5955(02)00285-x] [Citation(s) in RCA: 135] [Impact Index Per Article: 6.1] [Indexed: 11/17/2022]
Abstract
The threshold for detecting amplitude modulation (AM) decreases with increasing duration of the AM sound up to several hundred milliseconds. If the auditory evoked steady-state response (SSR) to AM sound is an electrophysiological correlate of AM processing in the human brain, the development of the SSR should follow this course of temporal integration. Magnetoencephalographic recordings of SSR to 40 Hz AM tone-bursts were compared with responses to non-modulated tone-bursts at inter-stimulus intervals (ISIs) of 3, 1, and 0.5 s. Both types of stimuli elicited a transient gamma-band response (GBR), an N1 wave, and a sustained field (SF) during stimulus presentation. The AM stimulus evoked an additional 40 Hz SSR. The N1 amplitude was strongly reduced with shortened ISI, whereas the amplitudes of SSR, GBR, and SF were little affected by the ISI. Magnetic source-localization procedures estimated the generators of the early GBR, the SSR, and the SF to be anterior and medial to the sources of the N1. The sources of the SSR were in primary auditory cortex and separate from GBR sources. The SSR amplitude increased monotonically over a 200 ms period beginning about 40 ms after stimulus onset. The time course of the SSR phase reliably measured the duration of this transition to the steady state. At stimulus offset the SSR ceased within 50 ms. These results indicate that the primary auditory cortex responds immediately to stimulus changes and integrates stimulus features over a period of about 200 ms.
Affiliation(s)
- Bernhard Ross
- Institute of Experimental Audiology, Münster University Hospital, Germany.
|
32
|
Lorenzi C, Simpson MI, Millman RE, Griffiths TD, Woods WP, Rees A, Green GG. Second-order modulation detection thresholds for pure-tone and narrow-band noise carriers. J Acoust Soc Am 2001; 110:2470-2478. [PMID: 11757936 DOI: 10.1121/1.1406160] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 05/23/2023]
Abstract
Modulation perception has typically been characterized by measuring detection thresholds for sinusoidally amplitude-modulated (SAM) signals. This study uses multicomponent modulations. "Second-order" temporal modulation transfer functions (TMTFs) measure detection thresholds for a sinusoidal modulation of the modulation waveform of a SAM signal [Lorenzi et al., J. Acoust. Soc. Am. 110, 1030-1038 (2001)]. The SAM signal therefore acts as a "carrier" stimulus of frequency fm, and sinusoidal modulation of the SAM signal's modulation depth (at rate f'm) generates two additional components in the modulation spectrum at fm - f'm and fm + f'm. There is no spectral energy at the envelope beat frequency f'm in the modulation spectrum of the "physical" stimulus. In the present study, second-order TMTFs were measured for three listeners when fm was 16, 64, and 256 Hz. The carrier was either a 5-kHz pure tone or a narrow-band noise with center frequency and bandwidth of 5 kHz and 2 Hz, respectively. The narrow-band noise carrier was used to prevent listeners from detecting spectral energy at the beat frequency f'm in the "internal" stimuli's modulation spectrum. The results show that, for the 5-kHz pure-tone carrier, second-order TMTFs are nearly low pass in shape; the overall sensitivity and cutoff frequency measured on these second-order TMTFs increase when fm increases from 16 to 256 Hz. For the 2-Hz-wide narrow-band noise carrier, second-order TMTFs are nearly flat in shape for fm = 16 and 64 Hz, and they show a high-pass segment for fm = 256 Hz. These results suggest that detection of spectral energy at the envelope beat frequency contributes in part to the detection of second-order modulation. This is consistent with the idea that nonlinear mechanisms in the auditory pathway produce an audible distortion component at the envelope beat frequency in the internal modulation spectrum of the sounds.
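The sideband structure described in the abstract above can be checked numerically. The sketch below assumes an envelope of the form 1 + m[1 + m' sin(2π f'm t)] sin(2π fm t); this exact form, the parameter values, and all function names are illustrative assumptions rather than the authors' stimulus code. It measures the envelope's amplitude at f'm, fm - f'm, fm, and fm + f'm with a single-frequency discrete Fourier sum.

```python
import math


def envelope(t, fm, fpm, m=0.5, mp=1.0):
    """Assumed second-order SAM envelope: the depth of a SAM envelope at
    rate fm is itself sinusoidally modulated at rate fpm (f'm)."""
    depth = m * (1.0 + mp * math.sin(2.0 * math.pi * fpm * t))
    return 1.0 + depth * math.sin(2.0 * math.pi * fm * t)


def component_amplitude(samples, fs, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz), computed with
    a single-frequency discrete Fourier sum over the whole record."""
    n = len(samples)
    re = sum(x * math.cos(2.0 * math.pi * freq * k / fs)
             for k, x in enumerate(samples))
    im = sum(x * math.sin(2.0 * math.pi * freq * k / fs)
             for k, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


fs = 1024            # sampling rate in Hz (hypothetical)
fm, fpm = 64.0, 8.0  # first- and second-order modulation rates in Hz
env = [envelope(k / fs, fm, fpm) for k in range(fs)]  # 1 s of envelope

spectrum = {f: component_amplitude(env, fs, f)
            for f in (fpm, fm - fpm, fm, fm + fpm)}
for f, amp in sorted(spectrum.items()):
    print(f"{f:5.1f} Hz: amplitude {amp:.3f}")
```

Under these assumptions the envelope has energy at fm and at the two sidebands fm ± f'm, each sideband with amplitude m·m'/2, and essentially none at f'm itself. That matches the abstract's statement that the "physical" stimulus contains no modulation energy at the beat frequency.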
Affiliation(s)
- C Lorenzi
- Laboratoire de Psychologie Expérimentale, UMR CNRS 8581, Institut de Psychologie, Université René Descartes Paris V, Boulogne-Billancourt, France
|
33
|
Krishna BS, Semple MN. Auditory temporal processing: responses to sinusoidally amplitude-modulated tones in the inferior colliculus. J Neurophysiol 2000; 84:255-73. [PMID: 10899201 DOI: 10.1152/jn.2000.84.1.255] [Citation(s) in RCA: 235] [Impact Index Per Article: 9.8] [Indexed: 11/22/2022] Open
Abstract
Time-varying envelopes are a common feature of acoustic communication signals like human speech and induce a variety of percepts in human listeners. We studied the responses of 109 single neurons in the inferior colliculus (IC) of the anesthetized Mongolian gerbil to contralaterally presented sinusoidally amplitude-modulated (SAM) tones with a wide range of parameters. Modulation transfer functions (MTFs) based on average spike rate (rMTFs) showed regions of enhancement and suppression, where spike rates increased or decreased respectively as stimulus modulation depth increased. Specifically, almost all IC rMTFs could be described by some combination of a primary and a secondary region of enhancement and an intervening region of suppression, with these regions present to varying degrees in individual rMTFs. rMTF characteristics of most neurons were dependent on sound pressure level (SPL). rMTFs in most neurons with "onset" or "onset-sustained" peri-stimulus time histograms (PSTHs) in response to brief pure tones showed only a peaked primary region of enhancement. The region of suppression tended to occur in neurons with "sustained" or "pauser" PSTHs, and usually emerged at higher SPLs. The secondary region of enhancement was only found in eight neurons. The lowest modulation frequency at which the spike rate reached a clear peak ("best modulation frequency" or BMF) was measured. All but two mean BMFs lay between 0 and 100 Hz. Fifty percent of the 49 neurons tested over at least a 20-dB range of SPLs showed a BMF variation larger than 66% of their mean BMF. MTFs based on vector strength (tMTFs) showed a variety of patterns; although mostly similar to those reported from the cochlear nucleus, tMTFs of IC neurons showed higher maximum values, smaller dynamic range with depth, and a lower high-frequency limit for significant phase locking. Systematic and large increases in phase-lead commonly occurred as SPL increased. 
rMTFs measured at multiple carrier frequencies (F(c)s) showed that the suppressive region was not the result of sideband inhibition. There was no systematic relationship between BMF and F(c) of stimulation in the cells studied, even at low carrier frequencies. The results suggest various possible mechanisms that could create IC MTFs, and strongly support the idea that inhibitory inputs shape the rMTF by sharpening regions of enhancement and creating a suppressive region. The paucity of BMFs above 100 Hz argues against simple rate-coding schemes for pitch. Finally, any labeled line or topographic representation of modulation frequency is unlikely to be independent of SPL.
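The tMTFs in the abstract above are based on vector strength, which is conventionally computed as the length of the mean resultant of spike phases relative to the modulation cycle. The following Python sketch illustrates that standard definition. It assumes spike times in seconds, and the function name and example spike trains are hypothetical, not the authors' analysis code.

```python
import cmath
import math


def vector_strength(spike_times, mod_freq):
    """Vector strength of spike times (s) relative to a modulation
    frequency (Hz): 1 means perfect phase locking, 0 means no net
    phase preference."""
    if not spike_times:
        raise ValueError("at least one spike is required")
    # Map each spike to a unit vector at its phase within the cycle.
    resultant = sum(cmath.exp(1j * 2.0 * math.pi * mod_freq * t)
                    for t in spike_times)
    return abs(resultant) / len(spike_times)


mod_freq = 20.0  # Hz, hypothetical modulation frequency

# One spike per cycle, always at the same phase: perfectly locked.
locked = [k / mod_freq for k in range(10)]

# Spikes spread evenly over four phases of the cycle: no net preference.
spread = [(k + j / 4) / mod_freq for k in range(5) for j in range(4)]

vs_locked = vector_strength(locked, mod_freq)
vs_spread = vector_strength(spread, mod_freq)
print(f"locked: {vs_locked:.3f}, spread: {vs_spread:.3f}")
```

Evaluating this quantity at each modulation frequency yields a temporal MTF, whereas the rMTFs described in the abstract use average spike rate instead.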
Affiliation(s)
- B S Krishna
- Center for Neural Science, New York University, New York, New York 10003, USA
|
34
|
He NJ, Horwitz AR, Dubno JR, Mills JH. Psychometric functions for gap detection in noise measured from young and aged subjects. J Acoust Soc Am 1999; 106:966-978. [PMID: 10462802 DOI: 10.1121/1.427109] [Citation(s) in RCA: 70] [Impact Index Per Article: 2.8] [Indexed: 05/23/2023]
Abstract
Psychometric functions for detection of temporal gaps in wideband noise were measured in a "yes/no" paradigm from normal-hearing young and aged subjects with closely matched audiograms. The effects of noise-burst duration, gap location, and uncertainty of gap location were tested. A typical psychometric function obtained in this study featured a steep slope, which was independent of most experimental conditions as well as age. However, gap thresholds generally improved with increasing duration of the noise burst for both young and aged subjects. Gap location and uncertainty had no significant effects on the thresholds for the young subjects. For the aged subjects, whenever the gap was sufficiently far from the onset or offset of the noise burst, detectability was robust despite uncertainty about the gap location. Significant differences between young and aged subjects could be observed only when the gap was very close to the signal onset and offset.
Affiliation(s)
- N J He
- Medical University of South Carolina, Charleston 29425-2242, USA.
|
35
|
Lyzenga J, Carlyon RP. Center frequency modulation detection for harmonic complexes resembling vowel formants and its interference by off-frequency maskers. J Acoust Soc Am 1999; 105:2792-2806. [PMID: 10335631 DOI: 10.1121/1.426896] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 05/23/2023]
Abstract
Vowels are characterized by peaks in their spectral envelopes: the formants. To gain insight into the perception of speech as well as into the basic abilities of the ear, sensitivity to modulations in the positions of these formants is investigated. Frequency modulation detection thresholds (FMTs) were measured for the center frequency of formantlike harmonic complexes in the absence and in the presence of simultaneous off-frequency formants (maskers). Both the signals and the maskers were harmonic complexes which were band-pass filtered with a triangular spectral envelope, on a log-log scale, into either a LOW (near 500 Hz), a MID (near 1500 Hz), or a HIGH region (near 3000 Hz). They had a duration of 250 ms, and either an 80- or a 240-Hz fundamental. The modulation rate was 5 Hz for the signals and 10 Hz for the maskers. A pink noise background was presented continuously. In a first experiment no maskers were used. The measured FMTs were roughly two times larger than previously reported just-noticeable differences for formant frequency. In a second experiment, no significant differences were found between the FMTs in the absence of maskers and those in the presence of stationary (i.e., nonfrequency modulated) maskers. However, under many conditions the FMTs were increased by the presence of simultaneous modulated maskers. These results indicate that frequency modulation detection interference (FMDI) can exist for formantlike complex tones. The FMDI data could be divided into two groups. For stimuli characterized by a steep (200-dB/oct) slope, it was found that the size of the FMDI depended on which cues were used for detecting the signal and masker modulations. For stimuli with shallow (50-dB/oct) slopes, the FMDI was reduced when the signal and the masker had widely differing fundamentals, implying that the fundamental information is extracted before the interference occurs.
Affiliation(s)
- J Lyzenga
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom.
|