1. Viswanathan V, Heinz MG, Shinn-Cunningham BG. Impact of Reduced Spectral Resolution on Temporal-Coherence-Based Source Segregation. bioRxiv 2024:2024.03.11.584489. [PMID: 38586037 PMCID: PMC10998286 DOI: 10.1101/2024.03.11.584489]
Abstract
Hearing-impaired listeners struggle to understand speech in noise, even when using cochlear implants (CIs) or hearing aids. Successful listening in noisy environments depends on the brain's ability to organize a mixture of sound sources into distinct perceptual streams (i.e., source segregation). In normal-hearing listeners, temporal coherence of sound fluctuations across frequency channels supports this process by promoting grouping of elements belonging to a single acoustic source. We hypothesized that reduced spectral resolution, a hallmark of both electric/CI hearing (from current spread) and acoustic hearing with sensorineural hearing loss (from broadened tuning), degrades segregation based on temporal coherence. This is because reduced frequency resolution decreases the likelihood that a single sound source dominates the activity driving any specific channel; concomitantly, it increases the correlation in activity across channels. Consistent with our hypothesis, predictions from a physiologically plausible model of temporal-coherence-based segregation suggest that CI current spread reduces comodulation masking release (CMR; a correlate of temporal-coherence processing) and speech intelligibility in noise. These predictions are consistent with our behavioral data with simulated CI listening. Our model also predicts smaller CMR with increasing levels of outer-hair-cell damage. These results suggest that reduced spectral resolution relative to normal hearing impairs temporal-coherence-based segregation and speech-in-noise outcomes.
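The hypothesized mechanism, that poorer spectral resolution both blurs which source dominates a channel and raises across-channel correlation, can be illustrated numerically. Below is a minimal sketch (not the authors' model) in which a `spread` parameter stands in for CI current spread; the two source envelopes and their fluctuation rates are illustrative assumptions.

```python
import numpy as np

fs, dur = 1000, 1.0                            # envelope sampling rate (Hz), duration (s)
t = np.arange(int(fs * dur)) / fs

# Two independent sources, each dominating one narrow channel; their
# envelopes fluctuate at different rates, so the channels start uncorrelated.
env_a = 1 + 0.9 * np.sin(2 * np.pi * 4 * t)    # source A: 4-Hz fluctuations
env_b = 1 + 0.9 * np.sin(2 * np.pi * 7 * t)    # source B: 7-Hz fluctuations

def channel_corr(spread):
    """Across-channel envelope correlation after cross-channel leakage.

    `spread` (0 to 0.5) mimics current spread: each channel picks up a
    fraction of the neighboring channel's drive.
    """
    ch1 = (1 - spread) * env_a + spread * env_b
    ch2 = (1 - spread) * env_b + spread * env_a
    return np.corrcoef(ch1, ch2)[0, 1]

for spread in (0.0, 0.1, 0.3, 0.5):
    print(f"spread={spread:.1f}  across-channel r={channel_corr(spread):+.2f}")
```

As `spread` grows, the two channel envelopes converge (r approaches 1), which is precisely the condition under which temporal coherence can no longer distinguish the sources.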
Affiliation(s)
- Vibha Viswanathan
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907
2. Alamri Y, Jennings SG. Computational modeling of the human compound action potential. J Acoust Soc Am 2023;153:2376. [PMID: 37092943 PMCID: PMC10119875 DOI: 10.1121/10.0017863]
Abstract
The auditory nerve (AN) compound action potential (CAP) is an important tool for assessing auditory disorders and monitoring the health of the auditory periphery during surgical procedures. The CAP has been mathematically conceptualized as the convolution of a unit response (UR) waveform with the firing rate of a population of AN fibers. Here, an approach for predicting experimentally recorded CAPs in humans is proposed, which involves the use of human-based computational models to simulate AN activity. CAPs elicited by clicks, chirps, and amplitude-modulated carriers were simulated and compared with empirically recorded CAPs from human subjects. In addition, narrowband CAPs derived from noise-masked clicks and tone bursts were simulated. Many morphological, temporal, and spectral aspects of human CAPs were captured by the simulations for all stimuli tested. These findings support the use of model simulations of the human CAP to refine existing human-based models of the auditory periphery, aid in the design and analysis of auditory experiments, and predict the effects of hearing loss, synaptopathy, and other auditory disorders on the human CAP.
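The convolution formulation mentioned above is CAP(t) = (r * UR)(t), the population firing rate filtered by a single-fiber unit response. A minimal sketch of that computation follows; the UR shape and firing-rate profile below are illustrative placeholders, not the paper's fitted human values.

```python
import numpy as np

fs = 100_000                       # sampling rate (Hz)
t = np.arange(0, 0.010, 1 / fs)    # 10-ms analysis window (s)

# Hypothetical unit response: damped sinusoid standing in for one fiber's
# far-field contribution at the recording electrode.
ur = np.exp(-t / 0.0005) * np.sin(2 * np.pi * 1000 * t)

# Hypothetical population firing rate to a click: gamma-shaped onset burst.
r = (t / 0.001) ** 2 * np.exp(-t / 0.001)

# Discrete approximation of the convolution integral CAP(t) = (r * UR)(t).
cap = np.convolve(r, ur)[: t.size] / fs
print(f"simulated CAP peak near {t[np.argmax(np.abs(cap))] * 1000:.2f} ms")
```

In the paper's approach, `r` would instead come from a human-based auditory-nerve model driven by the click, chirp, or modulated carrier.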
Affiliation(s)
- Yousef Alamri
- Department of Biomedical Engineering, The University of Utah, 390 South, 1530 East, BEHS 1201, Salt Lake City, Utah 84112, USA
- Skyler G Jennings
- Department of Communication Sciences and Disorders, The University of Utah, 390 South, 1530 East, BEHS 1201, Salt Lake City, Utah 84112, USA
3. Parida S, Heinz MG. Underlying neural mechanisms of degraded speech intelligibility following noise-induced hearing loss: The importance of distorted tonotopy. Hear Res 2022;426:108586. [PMID: 35953357 PMCID: PMC11149709 DOI: 10.1016/j.heares.2022.108586]
Abstract
Listeners with sensorineural hearing loss (SNHL) have substantial perceptual deficits, especially in noisy environments. Unfortunately, speech-intelligibility models have limited success in predicting the performance of listeners with hearing loss. A better understanding of the various suprathreshold factors that contribute to neural-coding degradations of speech in noisy conditions will facilitate better modeling and clinical outcomes. Here, we highlight the importance of one physiological factor that has received minimal attention to date, termed distorted tonotopy: a disruption of the precise mapping between acoustic frequency and cochlear place that is a hallmark of normal hearing. More so than commonly assumed factors (e.g., threshold elevation, reduced frequency selectivity, diminished temporal coding), distorted tonotopy severely degrades the neural representations of speech (particularly in noise) in single- and across-fiber responses in the auditory nerve following noise-induced hearing loss. Key results include: 1) effects of distorted tonotopy depend on stimulus spectral bandwidth and timbre, 2) distorted tonotopy increases across-fiber correlation and thus reduces information capacity to the brain, and 3) its effects vary across etiologies, which may contribute to individual differences. These results motivate the development and testing of noninvasive measures that can assess the severity of distorted tonotopy in human listeners. Such noninvasive measures would advance precision-audiology approaches to improving diagnostics and rehabilitation for listeners with SNHL.
Affiliation(s)
- Satyabrata Parida
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907 USA; Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, 15261 USA.
- Michael G Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907 USA; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, 47907 USA
4. Viswanathan V, Shinn-Cunningham BG, Heinz MG. Speech Categorization Reveals the Role of Early-Stage Temporal-Coherence Processing in Auditory Scene Analysis. J Neurosci 2022;42:240-254. [PMID: 34764159 PMCID: PMC8802934 DOI: 10.1523/jneurosci.1610-21.2021]
Abstract
Temporal coherence of sound fluctuations across spectral channels is thought to aid auditory grouping and scene segregation. Although prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, neurophysiological evidence suggests that temporal-coherence-based scene analysis may start as early as the cochlear nucleus (i.e., the first auditory region supporting cross-channel processing over a wide frequency range). Accordingly, we hypothesized that aspects of temporal-coherence processing that could be realized in early auditory areas may shape speech understanding in noise. We then explored whether physiologically plausible computational models could account for results from a behavioral experiment that measured consonant categorization in different masking conditions. We tested whether within-channel masking of target-speech modulations predicted consonant confusions across the different conditions and whether predictions were improved by adding across-channel temporal-coherence processing mirroring the computations known to exist in the cochlear nucleus. Consonant confusions provide a rich characterization of error patterns in speech categorization, and are thus crucial for rigorously testing models of speech perception; however, to the best of our knowledge, they have not been used in prior studies of scene analysis. We find that within-channel modulation masking can reasonably account for category confusions, but that it fails when temporal fine structure cues are unavailable. However, the addition of across-channel temporal-coherence processing significantly improves confusion predictions across all tested conditions. Our results suggest that temporal-coherence processing strongly shapes speech understanding in noise and that physiological computations that exist early along the auditory pathway may contribute to this process. SIGNIFICANCE STATEMENT: Temporal coherence of sound fluctuations across distinct frequency channels is thought to be important for auditory scene analysis. Prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, and it was unknown whether speech understanding in noise may be shaped by across-channel processing that exists in earlier auditory areas. Using physiologically plausible computational modeling to predict consonant confusions across different listening conditions, we find that across-channel temporal coherence contributes significantly to scene analysis and speech perception and that such processing may arise in the auditory pathway as early as the brainstem. By virtue of providing a richer characterization of error patterns not obtainable with just intelligibility scores, consonant confusions yield unique insight into scene analysis mechanisms.
Affiliation(s)
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907
- Michael G Heinz
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907
5. Viswanathan V, Bharadwaj HM, Shinn-Cunningham BG, Heinz MG. Modulation masking and fine structure shape neural envelope coding to predict speech intelligibility across diverse listening conditions. J Acoust Soc Am 2021;150:2230. [PMID: 34598642 PMCID: PMC8483789 DOI: 10.1121/10.0006385]
Abstract
A fundamental question in the neuroscience of everyday communication is how scene acoustics shape the neural processing of attended speech sounds and in turn impact speech intelligibility. While it is well known that the temporal envelopes in target speech are important for intelligibility, how the neural encoding of target-speech envelopes is influenced by background sounds or other acoustic features of the scene is unknown. Here, we combine human electroencephalography with simultaneous intelligibility measurements to address this key gap. We find that the neural envelope-domain signal-to-noise ratio in target-speech encoding, which is shaped by masker modulations, predicts intelligibility over a range of strategically chosen realistic listening conditions unseen by the predictive model. This provides neurophysiological evidence for modulation masking. Moreover, using high-resolution vocoding to carefully control peripheral envelopes, we show that target-envelope coding fidelity in the brain depends not only on envelopes conveyed by the cochlea, but also on the temporal fine structure (TFS), which supports scene segregation. Our results are consistent with the notion that temporal coherence of sound elements across envelopes and/or TFS influences scene analysis and attentive selection of a target sound. Our findings also inform speech-intelligibility models and technologies attempting to improve real-world speech communication.
Affiliation(s)
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907, USA
- Hari M Bharadwaj
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Michael G Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
6. The Use of Static and Dynamic Cues for Vowel Identification by Children Wearing Hearing Aids or Cochlear Implants. Ear Hear 2019;41:72-81. [PMID: 30998549 DOI: 10.1097/aud.0000000000000735]
Abstract
OBJECTIVE: To examine vowel perception based on dynamic formant transition and/or static formant pattern cues in children with hearing loss while using their hearing aids or cochlear implants. We predicted that the sensorineural hearing loss would degrade formant transitions more than static formant patterns, and that shortening the duration of cues would cause more difficulty for vowel identification for these children than for their normal-hearing peers.
DESIGN: A repeated-measures, between-group design was used. Children 4 to 9 years of age from a university hearing services clinic who were fit with hearing aids (13 children) or who wore cochlear implants (10 children) participated. Chronologically age-matched children with normal hearing served as controls (23 children). Stimuli included three naturally produced syllables (/ba/, /bi/, and /bu/), which were presented either in their entirety or segmented to isolate the formant transition or the static formant center of the vowel. The stimuli were presented to listeners via loudspeaker in the sound field. Aided participants wore their own devices and listened with their everyday settings. Participants chose the vowel presented by selecting from corresponding pictures on a computer screen.
RESULTS: Children with hearing loss were less able than their normal-hearing peers to use shortened transitions or shortened vowel centers to identify vowels. Whole syllables and initial transitions yielded better identification performance than the vowel center for /ɑ/, but not for /i/ or /u/.
CONCLUSIONS: The children with hearing loss may require a longer time window than children with normal hearing to integrate vowel cues over time because of altered peripheral encoding in spectrotemporal domains. Clinical implications include cognizance of the importance of vowel perception when developing habilitative programs for children with hearing loss.
7. Paraouty N, Stasiak A, Lorenzi C, Varnet L, Winter IM. Dual Coding of Frequency Modulation in the Ventral Cochlear Nucleus. J Neurosci 2018;38:4123-4137. [PMID: 29599389 PMCID: PMC6596033 DOI: 10.1523/jneurosci.2107-17.2018]
Abstract
Frequency modulation (FM) is a common acoustic feature of natural sounds and is known to play a role in robust sound source recognition. Auditory neurons show precise stimulus-synchronized discharge patterns that may be used for the representation of low-rate FM. However, it remains unclear whether this representation is based on synchronization to slow temporal envelope (ENV) cues resulting from cochlear filtering or phase locking to faster temporal fine structure (TFS) cues. To investigate the plausibility of those encoding schemes, single units of the ventral cochlear nucleus of guinea pigs of either sex were recorded in response to sine FM tones centered at the unit's best frequency (BF). The results show that, in contrast to high-BF units, for modulation depths within the receptive field, low-BF units (<4 kHz) demonstrate good phase locking to TFS. For modulation depths extending beyond the receptive field, the discharge patterns follow the ENV and fluctuate at the modulation rate. The receptive field proved to be a good predictor of the ENV responses for most primary-like and chopper units. The current in vivo data also reveal a high level of diversity in responses across unit types. TFS cues are mainly conveyed by low-frequency and primary-like units and ENV cues by chopper and onset units. The diversity of responses exhibited by cochlear nucleus neurons provides a neural basis for a dual-coding scheme of FM in the brainstem based on both ENV and TFS cues. SIGNIFICANCE STATEMENT: Natural sounds, including speech, convey informative temporal modulations in frequency. Understanding how the auditory system represents those frequency modulations (FM) has important implications as robust sound source recognition depends crucially on the reception of low-rate FM cues. Here, we recorded 115 single-unit responses from the ventral cochlear nucleus in response to FM and provide the first physiological evidence of a dual-coding mechanism of FM via synchronization to temporal envelope cues and phase locking to temporal fine structure cues. We also demonstrate a diversity of neural responses with different coding specializations. These results support the dual-coding scheme proposed by psychophysicists to account for FM sensitivity in humans and provide new insights on how this might be implemented in the early stages of the auditory pathway.
Affiliation(s)
- Nihaad Paraouty
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, United Kingdom
- Laboratoire des Systèmes Perceptifs CNRS UMR 8248, École Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
- Arkadiusz Stasiak
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, United Kingdom
- Christian Lorenzi
- Laboratoire des Systèmes Perceptifs CNRS UMR 8248, École Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
- Léo Varnet
- Laboratoire des Systèmes Perceptifs CNRS UMR 8248, École Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
- Ian M Winter
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, United Kingdom
8. Rallapalli VH, Heinz MG. Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model: Application to Predicting Individual Differences With Sensorineural Hearing Loss. Trends Hear 2016;20.
Abstract
Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio (SNRENV) from a modulation filter bank provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to: (a) reduce S + N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRENV has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding to speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRENV. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating feasibility of neural SNRENV computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRENV in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
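To make the SNRENV idea concrete, here is a minimal sketch of the envelope-power comparison at the heart of the sEPSM: envelope power of speech-plus-noise minus that of noise alone, relative to noise alone, in each modulation band. The synthetic envelopes, band edges, and normalization are illustrative assumptions, not the published parameterization.

```python
import numpy as np

fs = 1000                            # envelope sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)
rng = np.random.default_rng(1)

# Stand-ins for envelopes at one cochlear channel's output.
env_sn = 1 + 0.5 * np.sin(2 * np.pi * 4 * t) + 0.2 * rng.standard_normal(t.size)
env_n = 1 + 0.2 * rng.standard_normal(t.size)          # noise alone

def band_env_power(env, f_lo, f_hi):
    """Normalized AC envelope power in one modulation band via the DFT."""
    spec = np.fft.rfft(env - env.mean())
    freqs = np.fft.rfftfreq(env.size, 1 / fs)
    sel = (freqs >= f_lo) & (freqs < f_hi)
    return np.sum(np.abs(spec[sel]) ** 2) / env.size ** 2 / env.mean() ** 2

for f_lo, f_hi in [(1, 2), (2, 4), (4, 8), (8, 16)]:
    p_sn = band_env_power(env_sn, f_lo, f_hi)
    p_n = band_env_power(env_n, f_lo, f_hi)
    snr_env = max(p_sn - p_n, 0.0) / p_n                # floor at zero
    print(f"{f_lo:>2}-{f_hi:<2} Hz band: SNR_ENV = {snr_env:.2f}")
```

The neural version described above replaces these acoustic envelopes with modulation-domain estimates derived from shuffled correlograms of model spike trains.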
Affiliation(s)
- Varsha H. Rallapalli
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Weldon School of Biomedical Engineering, Purdue University, IN, USA
9. Vowel perception in listeners with normal hearing and in listeners with hearing loss: a preliminary study. Clin Exp Otorhinolaryngol 2015;8:26-33. [PMID: 25729492 PMCID: PMC4338088 DOI: 10.3342/ceo.2015.8.1.26]
Abstract
OBJECTIVES: To determine the influence of hearing loss on perception of vowel slices.
METHODS: Fourteen listeners aged 20-27 participated; ten (6 males) had hearing within normal limits and four (3 males) had moderate-severe sensorineural hearing loss (SNHL). Stimuli were six naturally-produced words consisting of the vowels /i a u æ ɛ ʌ/ in a /b V b/ context. Each word was presented as a whole and in eight slices: the initial transition, one half and one fourth of the initial transition, the full central vowel, one half of the central vowel, the ending transition, and one half and one fourth of the ending transition. Each of the 54 stimuli was presented 10 times at 70 dB SPL (sound pressure level); listeners were asked to identify the word. For the listeners with SNHL, stimuli were shaped using signal-processing software to mimic the gain provided by an appropriately fitting hearing aid.
RESULTS: Listeners with SNHL showed a steeper decline in vowel identification with decreasing slice duration than listeners with normal hearing, and they showed different patterns of vowel identification across vowels.
CONCLUSION: Abnormal temporal integration is likely affecting vowel identification for listeners with SNHL, which in turn affects the internal representation of vowels at different levels of the auditory system.
10. Churchill TH, Kan A, Goupell MJ, Ihlefeld A, Litovsky RY. Speech perception in noise with a harmonic complex excited vocoder. J Assoc Res Otolaryngol 2014;15:265-78. [PMID: 24448721 DOI: 10.1007/s10162-013-0435-7]
Abstract
A cochlear implant (CI) presents band-pass-filtered acoustic envelope information by modulating current pulse train levels. Similarly, a vocoder presents envelope information by modulating an acoustic carrier. By studying how normal hearing (NH) listeners are able to understand degraded speech signals with a vocoder, the parameters that best simulate electric hearing and factors that might contribute to the NH-CI performance difference may be better understood. A vocoder with harmonic complex carriers (fundamental frequency, f0 = 100 Hz) was used to study the effect of carrier phase dispersion on speech envelopes and intelligibility. The starting phases of the harmonic components were randomly dispersed to varying degrees prior to carrier filtering and modulation. NH listeners were tested on recognition of a closed set of vocoded words in background noise. Two sets of synthesis filters simulated different amounts of current spread in CIs. Results showed that the speech vocoded with carriers whose starting phases were maximally dispersed was the most intelligible. Superior speech understanding may have been a result of the flattening of the dispersed-phase carrier's intrinsic temporal envelopes produced by the large number of interacting components in the high-frequency channels. Cross-correlogram analyses of auditory nerve model simulations confirmed that randomly dispersing the carrier's component starting phases resulted in better neural envelope representation. However, neural metrics extracted from these analyses were not found to accurately predict speech recognition scores for all vocoded speech conditions. It is possible that central speech understanding mechanisms are insensitive to the envelope-fine structure dichotomy exploited by vocoders.
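The carrier manipulation is easy to sketch: a harmonic complex (f0 = 100 Hz) whose component starting phases are dispersed by a controllable amount before filtering and modulation. The dispersion rule and the crest-factor metric below are illustrative choices, not the paper's exact scheme.

```python
import numpy as np

fs, dur, f0 = 22050, 0.5, 100
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(2)

def harmonic_carrier(dispersion):
    """Sum of harmonics of f0 up to 8 kHz; `dispersion` in [0, 1] scales how
    far each component's starting phase is randomized away from sine phase."""
    carrier = np.zeros_like(t)
    for h in range(1, int(8000 / f0) + 1):
        phase = dispersion * rng.uniform(0, 2 * np.pi)
        carrier += np.sin(2 * np.pi * h * f0 * t + phase)
    return carrier / np.max(np.abs(carrier))

aligned = harmonic_carrier(0.0)     # sine phase: strong intrinsic envelope peaks
dispersed = harmonic_carrier(1.0)   # maximally dispersed: flatter envelope

# Crest factor (peak/RMS) as a crude index of intrinsic envelope peakiness.
for name, x in [("aligned", aligned), ("dispersed", dispersed)]:
    print(f"{name:9s} crest factor = {np.max(np.abs(x)) / np.std(x):.1f}")
```

The flatter intrinsic envelope of the dispersed-phase carrier is what, per the abstract, leaves more room for the speech envelope to be represented faithfully.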
Affiliation(s)
- Tyler H Churchill
- Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue #521, Madison, WI, 53705, USA
11. Lorenzi C, Wallaert N, Gnansia D, Leger AC, Ives DT, Chays A, Garnier S, Cazals Y. Temporal-envelope reconstruction for hearing-impaired listeners. J Assoc Res Otolaryngol 2012;13:853-65. [PMID: 23007719 PMCID: PMC3505588 DOI: 10.1007/s10162-012-0350-3]
Abstract
Recent studies suggest that normal-hearing listeners maintain robust speech intelligibility despite severe degradations of amplitude-modulation (AM) cues, by using temporal-envelope information recovered from broadband frequency-modulation (FM) speech cues at the output of cochlear filters. This study aimed to assess whether cochlear damage affects this capacity to reconstruct temporal-envelope information from FM. This was achieved by measuring the ability of 40 normal-hearing listeners and 41 listeners with mild-to-moderate hearing loss to identify syllables processed to degrade AM cues while leaving FM cues intact within three broad frequency bands spanning the range 65-3,645 Hz. Stimuli were presented at 65 dB SPL for both normal-hearing listeners and hearing-impaired listeners. They were presented as such or amplified using a modified half-gain rule for hearing-impaired listeners. Hearing-impaired listeners showed significantly poorer identification scores than normal-hearing listeners at both presentation levels. However, the deficit shown by hearing-impaired listeners for amplified stimuli was relatively modest. Overall, hearing-impaired data and the results of a simulation study were consistent with a poorer-than-normal ability to reconstruct temporal-envelope information resulting from a broadening of cochlear filters by a factor ranging from 2 to 4. These results suggest that mild-to-moderate cochlear hearing loss has only a modest detrimental effect on peripheral, temporal-envelope reconstruction mechanisms.
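The peripheral mechanism at issue, envelope information recovered from FM at the output of a cochlear filter, can be demonstrated in a few lines: an FM tone with a flat envelope acquires amplitude modulation after narrowband filtering. The Butterworth filter and signal parameters below are illustrative stand-ins for a cochlear filter; broadening the passband (as cochlear damage would) shrinks the recovered modulation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
fc, fm, df = 1000, 4, 400            # carrier (Hz), FM rate (Hz), excursion (Hz)

# FM tone with a flat envelope: no AM at the input.
x = np.sin(2 * np.pi * fc * t + (df / fm) * np.sin(2 * np.pi * fm * t))

# Narrow "auditory-like" filter sitting on the upper edge of the FM sweep.
sos = butter(4, [1150, 1450], btype="bandpass", fs=fs, output="sos")
y = sosfiltfilt(sos, x)

env = np.abs(hilbert(y))[fs // 10 : -fs // 10]   # envelope, edges trimmed
depth = (env.max() - env.min()) / (env.max() + env.min())
print(f"modulation depth recovered from FM alone: {depth:.2f}")
```

Because the tone sweeps in and out of the passband at the FM rate, the filter output fluctuates at that rate; this fluctuation is the reconstructed temporal envelope the study probes.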
Affiliation(s)
- Christian Lorenzi
- Équipe Audition (CNRS, Université Paris Descartes, École normale supérieure), Institut d'Étude de la Cognition, École normale supérieure, Paris Sciences et Lettres, 29 rue d'Ulm, 75005 Paris, France.
12. Swaminathan J, Heinz MG. Psychophysiological analyses demonstrate the importance of neural envelope coding for speech perception in noise. J Neurosci 2012;32:1747-56. [PMID: 22302814 DOI: 10.1523/jneurosci.4493-11.2012]
Abstract
Understanding speech in noisy environments is often taken for granted; however, this task is particularly challenging for people with cochlear hearing loss, even with hearing aids or cochlear implants. A significant limitation to improving auditory prostheses is our lack of understanding of the neural basis for robust speech perception in noise. Perceptual studies suggest the slowly varying component of the acoustic waveform (envelope, ENV) is sufficient for understanding speech in quiet, but the rapidly varying temporal fine structure (TFS) is important in noise. These perceptual findings have important implications for cochlear implants, which currently only provide ENV; however, neural correlates have been difficult to evaluate due to cochlear transformations between acoustic TFS and recovered neural ENV. Here, we demonstrate the relative contributions of neural ENV and TFS by quantitatively linking neural coding, predicted from a computational auditory nerve model, with perception of vocoded speech in noise measured from normal hearing human listeners. Regression models with ENV and TFS coding as independent variables predicted speech identification and phonetic feature reception at both positive and negative signal-to-noise ratios. We found that: (1) neural ENV coding was a primary contributor to speech perception, even in noise; and (2) neural TFS contributed in noise mainly in the presence of neural ENV, but rarely as the primary cue itself. These results suggest that neural TFS has less perceptual salience than previously thought due to cochlear signal processing transformations between TFS and ENV. Because these transformations differ between normal and impaired ears, these findings have important translational implications for auditory prostheses.
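The regression framing described above can be sketched as follows; the data are synthetic placeholders, not the study's measurements, and the interaction term encodes the finding that TFS mattered mainly in the presence of ENV coding.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
env = rng.uniform(0, 1, n)           # neural ENV coding metric per condition
tfs = rng.uniform(0, 1, n)           # neural TFS coding metric per condition

# Synthetic "perception" scores: ENV dominates; TFS helps only alongside ENV.
score = 0.7 * env + 0.2 * env * tfs + 0.05 * rng.standard_normal(n)

# Design matrix with ENV, TFS, and their interaction.
X = np.column_stack([np.ones(n), env, tfs, env * tfs])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
print("fitted weights [const, ENV, TFS, ENVxTFS]:", np.round(beta, 2))
```

A fit like this recovers a large ENV weight and a small main TFS weight, mirroring the abstract's conclusion that neural TFS is rarely the primary cue by itself.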