1
Warnecke M, Litovsky RY. Signal envelope and speech intelligibility differentially impact auditory motion perception. Sci Rep 2021; 11:15117. PMID: 34302032; PMCID: PMC8302594; DOI: 10.1038/s41598-021-94662-y.
Abstract
Our acoustic environment contains a plethora of complex sounds that are often in motion. To gauge approaching danger and communicate effectively, listeners need to localize and identify sounds, which includes determining sound motion. This study addresses which acoustic cues affect listeners' ability to determine sound motion. Signal envelope (ENV) cues are implicated in both sound motion tracking and stimulus intelligibility, suggesting that these processes could compete for sound processing resources. We created auditory chimaeras from speech and noise stimuli and varied the number of frequency bands, effectively manipulating speech intelligibility. Normal-hearing adults were presented with stationary or moving chimaeras and reported perceived sound motion and content. Results show that sensitivity to sound motion is not affected by speech intelligibility but differs clearly between the original noise and speech stimuli. Further, chimaeras with speech-like ENVs whose content was intelligible induced a strong bias in listeners to report the sounds as stationary. Increasing stimulus intelligibility systematically increased that bias, and removing intelligible content reduced it, suggesting that sound content may be prioritized over sound motion. These findings suggest that sound motion processing in the auditory system can be biased by acoustic parameters related to speech intelligibility.
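For readers who want to reproduce the chimaera manipulation, a minimal Hilbert-based sketch in the style of Smith, Delgutte and Oxenham (2002) follows. The band count, filter order, frequency range, and function name are illustrative assumptions, not the study's implementation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def make_chimaera(env_src, tfs_src, fs, n_bands=8, f_lo=80.0, f_hi=8000.0):
    """Combine the envelope of env_src with the fine structure of tfs_src."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)  # log-spaced band edges
    out = np.zeros(len(env_src))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, env_src)))            # ENV of one signal
        tfs = np.cos(np.angle(hilbert(sosfiltfilt(sos, tfs_src))))  # TFS of the other
        out += env * tfs  # one signal's ENV riding on the other's TFS
    return out
```

Varying `n_bands` here plays the same role as the band-number manipulation in the study: more bands give the ENV-carrying signal finer spectral detail and hence higher intelligibility.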
Affiliation(s)
- Michaela Warnecke
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Ave, Madison, WI 53705, USA
- Ruth Y Litovsky
- University of Wisconsin-Madison, Waisman Center, 1500 Highland Ave, Madison, WI 53705, USA
2
Baltzell LS, Swaminathan J, Cho AY, Lavandier M, Best V. Binaural sensitivity and release from speech-on-speech masking in listeners with and without hearing loss. J Acoust Soc Am 2020; 147:1546. PMID: 32237845; PMCID: PMC7060089; DOI: 10.1121/10.0000812.
Abstract
Listeners with sensorineural hearing loss routinely experience less spatial release from masking (SRM) in speech mixtures than listeners with normal hearing. Hearing-impaired listeners have also been shown to have degraded temporal fine structure (TFS) sensitivity, a consequence of which is degraded access to interaural time differences (ITDs) contained in the TFS. Since these "binaural TFS" cues are critical for spatial hearing, it has been hypothesized that degraded binaural TFS sensitivity accounts for the limited SRM experienced by hearing-impaired listeners. In this study, speech stimuli were noise-vocoded using carriers that were systematically decorrelated across the left and right ears, thus simulating degraded binaural TFS sensitivity. Both (1) ITD sensitivity in quiet and (2) SRM in speech mixtures spatialized using ITDs (binaural release from masking; BRM) were measured as a function of TFS interaural decorrelation in young normal-hearing and hearing-impaired listeners, allowing the relationship between ITD sensitivity and BRM to be examined over a wide range of ITD thresholds. The results showed that, for a given ITD sensitivity, hearing-impaired listeners experienced less BRM than normal-hearing listeners, suggesting that binaural TFS sensitivity can account for only a modest portion of the BRM deficit in hearing-impaired listeners. However, substantial individual variability was observed.
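The carrier decorrelation at the heart of this design can be sketched with the standard common-plus-independent mixing identity; the function name and parameters below are illustrative assumptions.

```python
import numpy as np

def decorrelated_noise_pair(n_samples, rho, seed=0):
    """Two Gaussian noise carriers whose interaural correlation is rho.
    Mixing a common and an independent noise with weights rho and
    sqrt(1 - rho^2) yields an expected normalized correlation of rho."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal(n_samples)
    indep = rng.standard_normal(n_samples)
    left = common
    right = rho * common + np.sqrt(1.0 - rho**2) * indep
    return left, right
```

Sweeping `rho` from 1 toward 0 mimics progressively degraded binaural TFS sensitivity, the independent variable in the experiment.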
Affiliation(s)
- Lucas S Baltzell
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Adrian Y Cho
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Mathieu Lavandier
- University of Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue Maurice Audin, F-69518 Vaulx-en-Velin Cedex, France
- Virginia Best
- Department of Speech, Language, and Hearing Sciences, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
3
Hu G, Determan SC, Dong Y, Beeve AT, Collins JE, Gai Y. Spectral and Temporal Envelope Cues for Human and Automatic Speech Recognition in Noise. J Assoc Res Otolaryngol 2019; 21:73-87. PMID: 31758279; DOI: 10.1007/s10162-019-00737-z.
Abstract
Acoustic features of speech include various spectral and temporal cues. Temporal envelope is known to play a critical role in speech recognition by human listeners, whereas automatic speech recognition (ASR) relies heavily on spectral analysis. This study compared sentence-recognition scores of humans and of a commercial ASR program (Dragon) when spectral and temporal-envelope cues were manipulated in background noise. The temporal fine structure of meaningful sentences was degraded with noise or tone vocoders. Three types of background noise were used: white noise, time-reversed multi-talker noise, and fake-formant noise. Spectral information was manipulated by changing the number of frequency channels. With a 20-dB signal-to-noise ratio (SNR) and four vocoding channels, white noise had a stronger disruptive effect on human listeners than fake-formant noise; the same held with 22 channels when the SNR was lowered to 0 dB. In contrast, the ASR was unable to function with four vocoding channels even at a 20-dB SNR. Its performance was least affected by white noise and most affected by fake-formant noise. Increasing the number of channels, which improved spectral resolution, produced non-monotonic ASR performance with white noise but not with the colored noises. The ASR also performed markedly better with tone vocoders. It is possible that fake-formant noise affected the software's performance by disrupting spectral cues, whereas white noise affected performance by compromising speech segmentation. Overall, these results suggest that human listeners and ASR use different listening strategies in noise.
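A tone vocoder of the kind used here replaces each band's fine structure with a sine carrier at the band centre. A minimal sketch follows; the channel count, filter order, and geometric-mean centre-frequency rule are assumptions rather than the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def tone_vocode(x, fs, n_ch=4, f_lo=100.0, f_hi=8000.0):
    """Replace each band's TFS with a sine carrier at the band centre."""
    edges = np.geomspace(f_lo, f_hi, n_ch + 1)
    t = np.arange(len(x)) / fs
    y = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))  # band envelope
        fc = np.sqrt(lo * hi)                       # geometric band centre
        y += env * np.sin(2 * np.pi * fc * t)       # sine carrier replaces TFS
    return y
```

A noise vocoder is identical in structure except that the sine carrier is swapped for band-limited noise.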
Affiliation(s)
- Guangxin Hu
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Sarah C Determan
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO 63103, USA
- Yue Dong
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO 63103, USA
- Alec T Beeve
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO 63103, USA
- Joshua E Collins
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO 63103, USA
- Yan Gai
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO 63103, USA
4
Manno FAM, Cruces RR, Lau C, Barrios FA. Uncertain Emotion Discrimination Differences Between Musicians and Non-musicians Is Determined by Fine Structure Association: Hilbert Transform Psychophysics. Front Neurosci 2019; 13:902. PMID: 31619943; PMCID: PMC6759500; DOI: 10.3389/fnins.2019.00902.
Abstract
Humans perceive musical sound as a complex phenomenon that is known to induce an emotional response, yet the cues used to perceive emotion in music have not been unequivocally elucidated. Here, we sought to identify the attributes of sound that confer emotion on music and to determine whether professional musicians perceive musical emotion differently from non-musicians. The objective was to determine which sound cues are used to resolve emotional signals. Happy or sad classical music excerpts, modified in fine structure or envelope to convey different degrees of emotional certainty, were presented. Certainty was determined by identification of the emotional characteristic presented during a forced-choice discrimination task. Participants were categorized as good or poor performers (n = 32, age 21.16 ± 2.59 SD) and, in a separate group, as musicians in the first or last year of music education at a conservatory (n = 32, age 21.97 ± 2.42). We found that temporal fine structure information is essential for correct emotional identification. Non-musicians used less fine structure information than musicians to discriminate emotion in music. The present psychophysical experiments reveal which cues are used to resolve emotional signals and how they differ between non-musicians and musically educated individuals.
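The Hilbert decomposition named in the title is the atomic operation behind the envelope/fine-structure manipulations in this and the surrounding studies; a minimal sketch:

```python
import numpy as np
from scipy.signal import hilbert

def env_tfs(x):
    """Split a narrowband signal into Hilbert envelope and fine structure.
    Because env * tfs reconstructs the band signal, either factor can be
    degraded or swapped independently before resynthesis."""
    z = hilbert(x)              # analytic signal
    env = np.abs(z)             # slow amplitude modulation (envelope)
    tfs = np.cos(np.angle(z))   # unit-amplitude fast oscillation (TFS)
    return env, tfs
```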
Affiliation(s)
- Francis A. M. Manno
- School of Biomedical Engineering, Faculty of Engineering, University of Sydney, Sydney, NSW, Australia
- Department of Physics, City University of Hong Kong, Hong Kong, China
- Raul R. Cruces
- Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
- Condon Lau
- Department of Physics, City University of Hong Kong, Hong Kong, China
- Fernando A. Barrios
- Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
5
Hou L, Xu L. Role of short-time acoustic temporal fine structure cues in sentence recognition for normal-hearing listeners. J Acoust Soc Am 2018; 143:EL127. PMID: 29495716; PMCID: PMC5820060; DOI: 10.1121/1.5024817.
Abstract
Short-time processing was employed to manipulate the amplitude, bandwidth, and temporal fine structure (TFS) in sentences. Fifty-two native-English-speaking, normal-hearing listeners participated in four sentence-recognition experiments. Results showed that the recovered envelope (E) played an important role in speech recognition when the bandwidth was greater than one equivalent rectangular bandwidth. Removing the TFS drastically reduced sentence recognition, whereas preserving it greatly improved recognition when amplitude information was available at a rate of ≥ 10 Hz (i.e., time segments ≤ 100 ms). Therefore, short-time TFS facilitates speech perception together with the recovered E and works with coarse amplitude cues to provide useful information for speech recognition.
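One plausible reading of the short-time amplitude manipulation, a single amplitude value refreshed once per segment on top of continuous TFS, can be sketched as below. This is a single-band simplification with assumed names and segment handling, not the paper's processing chain, which operated on band-limited signals.

```python
import numpy as np
from scipy.signal import hilbert

def coarse_amplitude_tfs(x, fs, seg_ms=100.0):
    """TFS speech whose amplitude is refreshed once per short-time segment
    (100-ms segments correspond to the 10-Hz amplitude rate above)."""
    tfs = np.cos(np.angle(hilbert(x)))
    seg = max(1, int(fs * seg_ms / 1000.0))
    out = np.empty(len(x))
    for i in range(0, len(x), seg):
        rms = np.sqrt(np.mean(x[i:i + seg] ** 2))  # one amplitude per segment
        out[i:i + seg] = rms * tfs[i:i + seg]
    return out
```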
Affiliation(s)
- Limin Hou
- Communication and Information Engineering, Shanghai University, Shanghai, China
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, Ohio 45701, USA
6
Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues. J Assoc Res Otolaryngol 2017; 18:687-710. PMID: 28748487; DOI: 10.1007/s10162-017-0627-7.
Abstract
Perceptual studies of speech intelligibility have shown that slow variations of the acoustic envelope (ENV) in a small set of frequency bands provide adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain ENV cues recovered from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of ENV recovered from cochlear filtering of TFS speech; (c) examined and quantified the contribution of spike-timing cues to the recovered ENV using a lateral inhibition network (LIN); and (d) constructed linear regression models relating objective measures of mean-rate and spike-timing neural cues to subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the ENV recovered from the LIN-processed TFS speech. The best predictions of chimaeric speech intelligibility were obtained when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
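The regression step in (d) is an ordinary least-squares fit of perceptual scores on the two neural measures. A toy sketch follows; the numbers are illustrative placeholders, not the study's data.

```python
import numpy as np

# Illustrative placeholders, one value per chimaera condition (not real data).
mean_rate  = np.array([0.42, 0.55, 0.63, 0.71])  # mean-rate neural similarity
spike_time = np.array([0.30, 0.48, 0.52, 0.66])  # spike-timing similarity
scores     = np.array([35.0, 52.0, 61.0, 74.0])  # phoneme scores (%)

# Fit: score ~ b0 + b1 * mean_rate + b2 * spike_time
X = np.column_stack([np.ones_like(mean_rate), mean_rate, spike_time])
b, *_ = np.linalg.lstsq(X, scores, rcond=None)
predicted = X @ b  # model predictions, compared against observed scores
```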
7
Qi B, Mao Y, Liu J, Liu B, Xu L. Relative contributions of acoustic temporal fine structure and envelope cues for lexical tone perception in noise. J Acoust Soc Am 2017; 141:3022. PMID: 28599529; PMCID: PMC5415402; DOI: 10.1121/1.4982247.
Abstract
Previous studies have shown that lexical tone perception in quiet relies on the acoustic temporal fine structure (TFS) but not on the envelope (E) cues, while the contribution of TFS to speech recognition in noise is under debate. In the present study, Mandarin tone tokens were mixed with speech-shaped noise (SSN) or two-talker babble (TTB) at five signal-to-noise ratios (SNRs; -18 to +6 dB). The TFS and E were then extracted from each of 30 bands using the Hilbert transform, and 25 combinations of TFS and E from the sound mixtures of the same tone tokens at various SNRs were created. Twenty normal-hearing, native-Mandarin-speaking listeners participated in the tone-recognition test. Results showed that tone-recognition performance improved as the SNR in either the TFS or the E increased. The masking effects on tone perception were weaker for the TTB than for the SSN. For both types of masker, the perceptual weights of TFS and E in tone perception in noise were nearly equivalent, with E playing a slightly greater role than TFS. Thus, the relative contributions of TFS and E cues to lexical tone perception in noise or in competing-talker maskers differ from those in quiet and from their contributions to speech perception in non-tonal languages.
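Mixing tokens with a masker at a prescribed SNR reduces to RMS scaling; a minimal sketch (the function name is an assumption):

```python
import numpy as np

def mix_at_snr(speech, masker, snr_db):
    """Scale the masker so the mixture has the requested SNR (RMS power)."""
    rms = lambda s: np.sqrt(np.mean(s ** 2))
    masker = masker[:len(speech)]                          # match lengths
    gain = (rms(speech) / rms(masker)) * 10.0 ** (-snr_db / 20.0)
    return speech + gain * masker
```

Applying the Hilbert decomposition per band to mixtures made at different SNRs, then recombining one mixture's TFS with another's E, yields the 25 TFS/E combinations described above.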
Affiliation(s)
- Beier Qi
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Yitao Mao
- Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Jiaxing Liu
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Bo Liu
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, Ohio 45701, USA
8
Nambi PMA, Mahajan Y, Francis N, Bhat JS. Temporal fine structure mediated recognition of speech in the presence of multitalker babble. J Acoust Soc Am 2016; 140:EL296. PMID: 27794309; DOI: 10.1121/1.4964416.
Abstract
This experiment investigated the mechanisms of temporal fine structure (TFS) mediated speech recognition in multi-talker babble. The signal-to-noise ratio yielding 50% recognition (SNR-50) was measured for naive listeners when the TFS was retained in its original form (ORIG-TFS), time reversed (REV-TFS), or replaced by noise (NO-TFS); the original envelope was unchanged. In the REV-TFS condition, periodicity cues for stream segregation were preserved but envelope recovery was compromised. Both mechanisms were compromised in the NO-TFS condition. The SNR-50 was lowest for ORIG-TFS, followed by REV-TFS, which in turn was lower than NO-TFS. Results suggest that both stream segregation and envelope recovery aid TFS-mediated speech recognition.
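The REV-TFS condition, the original envelope riding on time-reversed fine structure, can be sketched for one analysis band (filterbank handling omitted; names are assumptions):

```python
import numpy as np
from scipy.signal import hilbert

def reversed_tfs_band(x_band):
    """Original Hilbert envelope on time-reversed fine structure (one band)."""
    z = hilbert(x_band)
    env = np.abs(z)                       # keep the original envelope
    tfs_rev = np.cos(np.angle(z))[::-1]   # time-reverse only the TFS factor
    return env * tfs_rev
```

Replacing `tfs_rev` with the fine structure of a noise band gives the NO-TFS condition.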
Affiliation(s)
- Pitchai Muthu Arivudai Nambi
- Department of Audiology and Speech Language Pathology, Kasturba Medical College (Manipal University), Mangalore, India
- Yatin Mahajan
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, Australia
- Nikita Francis
- Department of Audiology and Speech Language Pathology, Kasturba Medical College (Manipal University), Mangalore, India
- Jayashree S Bhat
- Department of Audiology and Speech Language Pathology, Kasturba Medical College (Manipal University), Mangalore, India
9
Reed CM, Desloge JG, Braida LD, Perez ZD, Léger AC. Level variations in speech: Effect on masking release in hearing-impaired listeners. J Acoust Soc Am 2016; 140:102. PMID: 27475136; PMCID: PMC6910012; DOI: 10.1121/1.4954746.
Abstract
Acoustic speech is marked by time-varying changes in the amplitude envelope that may pose difficulties for hearing-impaired listeners. Removal of these variations (e.g., by the Hilbert transform) could improve speech reception for such listeners, particularly in fluctuating interference. Léger, Reed, Desloge, Swaminathan, and Braida [(2015b). J. Acoust. Soc. Am. 138, 389-403] observed that a normalized measure of masking release obtained for hearing-impaired listeners using speech processed to preserve temporal fine-structure (TFS) cues was larger than that for unprocessed or envelope-based speech. This study measured masking release for two other speech signals in which level variations were minimal: peak clipping and TFS processing of an envelope signal. Consonant identification was measured for hearing-impaired listeners in backgrounds of continuous and fluctuating speech-shaped noise. The normalized masking release obtained using speech with normal variations in overall level was substantially less than that observed using speech processed to achieve highly restricted level variations. These results suggest that the performance of hearing-impaired listeners in fluctuating noise may be improved by signal processing that leads to a decrease in stimulus level variations.
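Peak clipping is the classic way to strip level variation while keeping the zero-crossing pattern. The limiting ("infinite clipping") case can be sketched in one line; this is an illustration of the idea, not the paper's exact processing.

```python
import numpy as np

def infinite_peak_clip(x):
    """Discard all amplitude variation; keep only the sign (zero crossings)."""
    return np.where(x >= 0.0, 1.0, -1.0)
```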
Affiliation(s)
- Charlotte M Reed
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA
- Joseph G Desloge
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA
- Louis D Braida
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA
- Zachary D Perez
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, USA
- Agnès C Léger
- School of Psychological Sciences, University of Manchester, Manchester, M13 9PL, United Kingdom
10
Apoux F, Youngdahl CL, Yoho SE, Healy EW. Dual-carrier processing to convey temporal fine structure cues: Implications for cochlear implants. J Acoust Soc Am 2015; 138:1469-80. PMID: 26428784; PMCID: PMC4575322; DOI: 10.1121/1.4928136.
Abstract
Speech intelligibility in noise can be degraded by using vocoder processing to alter the temporal fine structure (TFS). Here it is argued that this degradation is not attributable to the loss of speech information potentially present in the TFS. Instead it is proposed that the degradation results from the loss of sound-source segregation information when two or more carriers (i.e., TFS) are substituted with only one as a consequence of vocoder processing. To demonstrate this segregation role, vocoder processing involving two carriers, one for the target and one for the background, was implemented. Because this approach does not preserve the speech TFS, it may be assumed that any improvement in intelligibility can only be a consequence of the preserved carrier duality and associated segregation cues. Three experiments were conducted using this "dual-carrier" approach. All experiments showed substantial sentence intelligibility in noise improvements compared to traditional single-carrier conditions. In several conditions, the improvement was so substantial that intelligibility approximated that for unprocessed speech in noise. A foreseeable and potentially promising implication for the dual-carrier approach involves implementation into cochlear implant speech processors, where it may provide the TFS cues necessary to segregate speech from noise.
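The contrast between single-carrier and dual-carrier processing can be sketched schematically for one analysis band. Variable names are assumptions, env/tfs denote the Hilbert envelope and fine structure as in the entries above, and this is a conceptual sketch rather than the authors' processor.

```python
import numpy as np
from scipy.signal import hilbert

def env_and_tfs(x):
    z = hilbert(x)
    return np.abs(z), np.cos(np.angle(z))

def single_vs_dual_carrier(target_band, masker_band):
    """One band: single-carrier collapses the mixture onto one carrier,
    discarding segregation cues; dual-carrier keeps one carrier per source."""
    env_mix, tfs_mix = env_and_tfs(target_band + masker_band)
    single = env_mix * tfs_mix            # one carrier for the whole mixture
    env_t, tfs_t = env_and_tfs(target_band)
    env_m, tfs_m = env_and_tfs(masker_band)
    dual = env_t * tfs_t + env_m * tfs_m  # each source keeps its own carrier
    return single, dual
```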
Affiliation(s)
- Frédéric Apoux
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Carla L Youngdahl
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Sarah E Yoho
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Eric W Healy
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
11
Léger AC, Reed CM, Desloge JG, Swaminathan J, Braida LD. Consonant identification in noise using Hilbert-transform temporal fine-structure speech and recovered-envelope speech for listeners with normal and impaired hearing. J Acoust Soc Am 2015; 138:389-403. PMID: 26233038; PMCID: PMC4514718; DOI: 10.1121/1.4922949.
Abstract
Consonant-identification ability was examined in normal-hearing (NH) and hearing-impaired (HI) listeners in the presence of steady-state and 10-Hz square-wave interrupted speech-shaped noise. The Hilbert transform was used to process speech stimuli (16 consonants in a-C-a syllables) to present envelope cues, temporal fine-structure (TFS) cues, or envelope cues recovered from TFS speech. HI listeners performed worse than NH listeners, both in the baseline condition and in requiring a higher signal-to-noise ratio to reach a given level of performance. For NH listeners, scores were higher in interrupted noise than in steady-state noise for all speech types, indicating substantial masking release. For HI listeners, masking release was typically observed for TFS and recovered-envelope speech but not for unprocessed and envelope speech. For both groups of listeners, TFS and recovered-envelope speech yielded similar levels of performance and consonant confusion patterns. The masking release observed for TFS and recovered-envelope speech may be related to level effects associated with the manner in which the TFS processing interacts with the interrupted noise signal, rather than to the contributions of TFS cues per se.
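The 10-Hz square-wave interruption of the masker is straightforward to generate; a minimal sketch (the function name and 50% duty cycle are assumptions):

```python
import numpy as np
from scipy.signal import square

def interrupt(noise, fs, rate_hz=10.0, duty=0.5):
    """Gate a noise masker on and off with a square wave."""
    t = np.arange(len(noise)) / fs
    gate = (square(2 * np.pi * rate_hz * t, duty=duty) + 1.0) / 2.0  # 0/1 gate
    return noise * gate
```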
Affiliation(s)
- Agnès C Léger
- School of Psychological Sciences, University of Manchester, Manchester, M13 9PL, United Kingdom
- Charlotte M Reed
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Joseph G Desloge
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Jayaganesh Swaminathan
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Louis D Braida
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
12
Léger AC, Desloge JG, Braida LD, Swaminathan J. The role of recovered envelope cues in the identification of temporal-fine-structure speech for hearing-impaired listeners. J Acoust Soc Am 2015; 137:505-508. PMID: 25618081; PMCID: PMC4304958; DOI: 10.1121/1.4904540.
Abstract
Narrowband speech can be separated into fast temporal cues [temporal fine structure (TFS)] and slow amplitude modulations (envelope). Speech processed to contain only TFS leads to envelope recovery through cochlear filtering, which has been suggested to account for TFS-speech intelligibility in normal-hearing listeners. Hearing-impaired listeners have deficits in TFS-speech identification, but the contribution of recovered-envelope cues to these deficits is unknown. This was assessed for hearing-impaired listeners by measuring identification of disyllables processed to contain TFS or recovered-envelope cues. Hearing-impaired listeners performed worse than normal-hearing listeners, but for both groups TFS-speech intelligibility was accounted for by recovered-envelope cues.
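Envelope recovery through cochlea-like filtering can be sketched in one processing step: pass TFS-only speech through a narrow bandpass filter and take the Hilbert envelope of the output. In the sketch below a Butterworth band stands in for a gammatone (cochlear) filter, and the parameters are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def recovered_envelope(tfs_speech, fs, lo, hi):
    """Envelope reintroduced by narrowband filtering of TFS-only speech."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, tfs_speech)))
```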
Affiliation(s)
- Agnès C Léger
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 36-757, Cambridge, Massachusetts 02139
- Joseph G Desloge
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 36-757, Cambridge, Massachusetts 02139
- Louis D Braida
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 36-757, Cambridge, Massachusetts 02139
- Jayaganesh Swaminathan
- Research Laboratory of Electronics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 36-757, Cambridge, Massachusetts 02139