1. Luo C, Ding N. Cortical encoding of hierarchical linguistic information when syllabic rhythms are obscured by echoes. Neuroimage 2024:120875. [PMID: 39341475; DOI: 10.1016/j.neuroimage.2024.120875]
Abstract
In speech perception, low-frequency cortical activity tracks hierarchical linguistic units (e.g., syllables, phrases, and sentences) on top of acoustic features (e.g., the speech envelope). Since fluctuations of the speech envelope typically correspond to syllabic boundaries, one common interpretation is that the acoustic envelope underlies the extraction of discrete syllables from continuous speech for subsequent linguistic processing. However, it remains unclear whether and how cortical activity encodes linguistic information when the speech envelope does not provide acoustic correlates of syllables. To address this issue, we introduced a frequency-tagging speech stream in which the syllabic rhythm was obscured by echoic envelopes and investigated neural encoding of hierarchical linguistic information using electroencephalography (EEG). When listeners attended to the echoic speech, cortical activity reliably tracked the syllable, phrase, and sentence levels, with the higher-level linguistic units eliciting more robust neural responses. When attention was diverted from the echoic speech, reliable neural tracking of the syllable level was still observed, in contrast to deteriorated neural tracking of the phrase and sentence levels. Further analyses revealed that an envelope aligned with the syllabic rhythm could be recovered from the echoic speech through a neural adaptation model, and this reconstructed envelope yielded higher predictive power for the neural tracking responses than either the original echoic envelope or the anechoic envelope. Taken together, these results suggest that neural adaptation and attentional modulation jointly contribute to the neural encoding of linguistic information in distorted speech in which the syllabic rhythm is obscured by echoes.
Affiliation(s)
- Cheng Luo
- Zhejiang Lab, Hangzhou 311121, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310027, China
2. Patro C, Monfiletto A, Singer A, Srinivasan NK, Mishra SK. Midlife Speech Perception Deficits: Impact of Extended High-Frequency Hearing, Peripheral Neural Function, and Cognitive Abilities. Ear Hear 2024; 45:1149-1164. [PMID: 38556645; DOI: 10.1097/aud.0000000000001504]
Abstract
OBJECTIVES The objectives of the present study were to investigate the effects of age-related changes in extended high-frequency (EHF) hearing, peripheral neural function, working memory, and executive function on speech perception deficits in middle-aged individuals with clinically normal hearing. DESIGN We administered a comprehensive assessment battery to 37 participants spanning the age range of 20 to 56 years. This battery encompassed various evaluations, including standard and EHF pure-tone audiometry, ranging from 0.25 to 16 kHz. In addition, we conducted auditory brainstem response assessments with varying stimulation rates and levels, a spatial release from masking (SRM) task, and cognitive evaluations that involved the Trail Making test (TMT) for assessing executive function and the Abbreviated Reading Span test (ARST) for measuring working memory. RESULTS The results indicated a decline in hearing sensitivities at EHFs and an increase in completion times for the TMT with age. In addition, as age increased, there was a corresponding decrease in the amount of SRM. The declines in SRM were associated with age-related declines in hearing sensitivity at EHFs and TMT performance. While we observed an age-related decline in wave I responses, this decline was primarily driven by age-related reductions in EHF thresholds. In addition, the results obtained using the ARST did not show an age-related decline. Neither the auditory brainstem response results nor ARST scores were correlated with the amount of SRM. CONCLUSIONS These findings suggest that speech perception deficits in middle age are primarily linked to declines in EHF hearing and executive function, rather than cochlear synaptopathy or working memory.
Affiliation(s)
- Chhayakanta Patro
- Department of Speech Language Pathology & Audiology, Towson University, Towson, Maryland, USA
- Angela Monfiletto
- Department of Speech Language Pathology & Audiology, Towson University, Towson, Maryland, USA
- Aviya Singer
- Department of Speech Language Pathology & Audiology, Towson University, Towson, Maryland, USA
- Nirmal Kumar Srinivasan
- Department of Speech Language Pathology & Audiology, Towson University, Towson, Maryland, USA
- Srikanta Kumar Mishra
- Department of Speech, Language and Hearing Sciences, The University of Texas at Austin, Austin, Texas, USA
3. Lalonde K, Peng ZE, Halverson DM, Dwyer GA. Children's use of spatial and visual cues for release from perceptual masking. J Acoust Soc Am 2024; 155:1559-1569. [PMID: 38393738; PMCID: PMC10890829; DOI: 10.1121/10.0024766]
Abstract
This study examined the role of visual speech in providing release from perceptual masking in children by comparing visual speech benefit across conditions with and without a spatial separation cue. Auditory-only and audiovisual speech recognition thresholds in a two-talker speech masker were obtained from 21 children with typical hearing (7-9 years of age) using a color-number identification task. The target was presented from a loudspeaker at 0° azimuth. Masker source location varied across conditions. In the spatially collocated condition, the masker was also presented from the loudspeaker at 0° azimuth. In the spatially separated condition, the masker was presented from the loudspeaker at 0° azimuth and a loudspeaker at -90° azimuth, with the signal from the -90° loudspeaker leading the signal from the 0° loudspeaker by 4 ms. The visual stimulus (static image or video of the target talker) was presented at 0° azimuth. Children achieved better thresholds when the spatial cue was provided and when the visual cue was provided. Visual and spatial cue benefit did not differ significantly depending on the presence of the other cue. Additional studies are needed to characterize how children's preferential use of visual and spatial cues varies depending on the strength of each cue.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Z Ellen Peng
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Destinee M Halverson
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Grace A Dwyer
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
4. Oberfeld D, Staab K, Kattner F, Ellermeier W. Is Recognition of Speech in Noise Related to Memory Disruption Caused by Irrelevant Sound? Trends Hear 2024; 28:23312165241262517. [PMID: 39051688; PMCID: PMC11273587; DOI: 10.1177/23312165241262517]
Abstract
Listeners with normal audiometric thresholds show substantial variability in their ability to understand speech in noise (SiN). These individual differences have been reported to be associated with a range of auditory and cognitive abilities. The present study addresses the association between SiN processing and the individual susceptibility of short-term memory to auditory distraction (i.e., the irrelevant sound effect [ISE]). In a sample of 67 young adult participants with normal audiometric thresholds, we measured speech recognition performance in a spatial listening task with two interfering talkers (speech-in-speech identification), audiometric thresholds, binaural sensitivity to the temporal fine structure (interaural phase differences [IPD]), serial memory with and without interfering talkers, and self-reported noise sensitivity. Speech-in-speech processing was not significantly associated with the ISE. The most important predictors of high speech-in-speech recognition performance were a large short-term memory span, low IPD thresholds, bilaterally symmetrical audiometric thresholds, and low individual noise sensitivity. Surprisingly, the susceptibility of short-term memory to irrelevant sound accounted for a substantially smaller amount of variance in speech-in-speech processing than the nondisrupted short-term memory capacity. The data confirm the role of binaural sensitivity to the temporal fine structure, although its association to SiN recognition was weaker than in some previous studies. The inverse association between self-reported noise sensitivity and SiN processing deserves further investigation.
Affiliation(s)
- Daniel Oberfeld
- Institute of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität Mainz, Germany
- Katharina Staab
- Department of Marketing and Human Resource Management, Technische Universität Darmstadt, Darmstadt, Germany
- Florian Kattner
- Institut für Psychologie, Technische Universität Darmstadt, Darmstadt, Germany
- Wolfgang Ellermeier
- Institut für Psychologie, Technische Universität Darmstadt, Darmstadt, Germany
5. Benoit C, Carlson RJ, King MC, Horn DL, Rubinstein JT. Behavioral characterization of the cochlear amplifier lesion due to loss of function of stereocilin (STRC) in human subjects. Hear Res 2023; 439:108898. [PMID: 37890241; PMCID: PMC10756798; DOI: 10.1016/j.heares.2023.108898]
Abstract
Loss of function of stereocilin (STRC) is the second most common cause of inherited hearing loss. Loss of the stereocilin protein, encoded by the STRC gene, disrupts the connection between the outer hair cells and the tectorial membrane. This affects only outer hair cell (OHC) function, producing deficits in active cochlear frequency selectivity and amplification despite preservation of normal inner hair cells. Better understanding of the cochlear features associated with mutation of STRC will improve our knowledge of normal cochlear function and the pathophysiology of hearing impairment, and may enhance hearing aid and cochlear implant signal processing. Nine subjects with homozygous or compound heterozygous loss-of-function mutations in STRC, aged 7-24 years, were included. Temporal and spectral modulation perception were measured and characterized by spectral and temporal modulation transfer functions. Speech-in-noise perception was studied with spondee identification in adaptive steady-state noise and with AzBio sentences at 0 and -5 dB SNR in multitalker babble. Results were compared with normal-hearing (NH) and cochlear implant (CI) listeners to place STRC-/- listeners' hearing capacity in context. Spectral ripple discrimination thresholds in the STRC-/- subjects were poorer than in NH listeners (p < 0.0001) but remained better than in CI listeners (p < 0.0001). Frequency resolution appeared impaired in the STRC-/- group compared to NH listeners, but the difference did not reach statistical significance (p = 0.06). Amplitude modulation detection thresholds in the STRC-/- group did not differ significantly from NH listeners (p = 0.06) but were better than in CI subjects (p < 0.0001). Temporal resolution in STRC-/- subjects was similar to NH (p = 0.98) but better than in CI listeners (p = 0.04). The spondee reception threshold in the STRC-/- group was worse than in NH listeners (p = 0.0008) but better than in CI listeners (p = 0.0001).
For AzBio sentences, performance at 0 dB SNR was similar between the STRC-/- group and the NH group (88% and 97%, respectively). At -5 dB SNR, STRC-/- performance was significantly poorer than NH (40% and 85%, respectively), yet much better than that of CI listeners, who performed at 54% at +5 dB SNR in children and 53% at +10 dB SNR in adults. To our knowledge, this is the first study of the psychoacoustic performance of human subjects lacking cochlear amplification but with normal inner hair cell function. Our data demonstrate preserved temporal resolution and a trend toward impaired frequency resolution in this group that did not reach statistical significance. Speech-in-noise perception was also impaired relative to NH listeners. All measures were better than those in CI listeners. It remains to be seen whether hearing aid modifications customized for the spectral deficits of STRC-/- listeners can improve speech understanding in noise. Since cochlear implants are also limited by deficient spectral selectivity, STRC-/- hearing may provide an upper bound on what could be obtained with better temporal coding in electrical stimulation.
Affiliation(s)
- Charlotte Benoit
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA
- Ryan J Carlson
- Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA, USA
- Mary-Claire King
- Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA, USA
- David L Horn
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA; Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA; Division of Pediatric Otolaryngology, Department of Surgery, Seattle Children's Hospital, Seattle, WA, USA
- Jay T Rubinstein
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA
6. Lutfi RA, Zandona M, Lee J. Simultaneous relative cue reliance in speech-on-speech masking. J Acoust Soc Am 2023; 154:2530-2538. [PMID: 37870932; DOI: 10.1121/10.0021874]
Abstract
Modern hearing research has shown that listeners segregate simultaneous speech streams by relying on three major voice cues: fundamental frequency, level, and location. Few of these studies, however, evaluated reliance on these cues when presented simultaneously, as occurs in nature, and fewer still considered listeners' relative reliance across cues, owing to the cues' different units of measure. In the present study, trial-by-trial analyses were used to isolate the listener's simultaneous reliance on the three voice cues, with the behavior of an ideal observer [Green and Swets (1966), Wiley, New York, pp. 151-178] serving as a comparison standard for evaluating relative reliance. Listeners heard on each trial a pair of randomly selected, simultaneous recordings of naturally spoken sentences. One of the recordings was always from the same talker, a distracter, and the other, with equal probability, was from one of two target talkers differing in the three voice cues. The listener's task was to identify the target talker. Among 33 clinically normal-hearing adults, only one relied predominantly on voice level; the remainder were split between voice fundamental frequency and/or location. The results are discussed with regard to their implications for the common practice of using target-distracter level as a dependent measure of speech-on-speech masking.
Affiliation(s)
- R A Lutfi
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- M Zandona
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- J Lee
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
7. Lutfi RA, Pastore T, Rodriguez B, Yost WA, Lee J. Molecular analysis of individual differences in talker search at the cocktail-party. J Acoust Soc Am 2022; 152:1804. [PMID: 36182280; PMCID: PMC9507302; DOI: 10.1121/10.0014116]
Abstract
A molecular (trial-by-trial) analysis of data from a cocktail-party, target-talker search task was used to test two general classes of explanations for individual differences in listener performance: cue-weighting models, for which errors are tied to the speech features talkers have in common with the target, and internal noise models, for which errors are largely independent of these features. The speech of eight different talkers was played simultaneously over eight different loudspeakers surrounding the listener. The locations of the eight talkers varied at random from trial to trial. The listener's task was to identify the location of a target talker with which they had previously been familiarized. An analysis of the response counts to individual talkers showed predominant confusion with one talker sharing the same fundamental frequency and timbre as the target and, secondarily, with other talkers sharing the same timbre. These confusions occurred on a roughly constant 31% of trials for all listeners. The remaining errors were uniformly distributed across the remaining talkers and were responsible for the large individual differences in performance observed. The results are consistent with a model in which largely stimulus-independent factors (internal noise) are responsible for the wide variation in performance across listeners.
Affiliation(s)
- Robert A Lutfi
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Torben Pastore
- Spatial Hearing Laboratory, Department of Speech and Hearing, Arizona State University, Tempe, Arizona 85281, USA
- Briana Rodriguez
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- William A Yost
- Spatial Hearing Laboratory, Department of Speech and Hearing, Arizona State University, Tempe, Arizona 85281, USA
- Jungmee Lee
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
8. Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. [PMID: 34492294; DOI: 10.1016/j.neuroimage.2021.118544]
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
Affiliation(s)
- Ashley E Symons
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Fred Dick
- Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
9. Peng ZE, Pausch F, Fels J. Spatial release from masking in reverberation for school-age children. J Acoust Soc Am 2021; 150:3263. [PMID: 34852617; PMCID: PMC8730369; DOI: 10.1121/10.0006752]
Abstract
Understanding speech in noisy environments, such as classrooms, is a challenge for children. When a spatial separation is introduced between the target and masker, as compared to when both are co-located, children demonstrate improved intelligibility of the target speech. Such intelligibility improvement is known as spatial release from masking (SRM). In most reverberant environments, the binaural cues associated with the spatial separation are distorted; the extent to which such distortion affects children's SRM is unknown. Two virtual acoustic environments with reverberation times between 0.4 s and 1.1 s were compared. SRM was measured using a spatial separation with symmetrically displaced maskers to maximize access to binaural cues. The role of informational masking in modulating SRM was investigated through voice similarity between the target and masker. Results showed that, contrary to previous developmental findings on free-field SRM, children's SRM in reverberation has not yet reached maturity in the 7-12 year age range. Reducing reverberation improved SRM in adults but not in children. Our findings suggest that, even though school-age children have access to binaural cues that are distorted in reverberation, they demonstrate immature use of such cues for speech-in-noise perception, even in mild reverberation.
Affiliation(s)
- Z Ellen Peng
- Institute for Hearing Technology and Acoustics, RWTH Aachen University, Kopernikusstrasse 5, 52074 Aachen, Germany
- Florian Pausch
- Institute for Hearing Technology and Acoustics, RWTH Aachen University, Kopernikusstrasse 5, 52074 Aachen, Germany
- Janina Fels
- Institute for Hearing Technology and Acoustics, RWTH Aachen University, Kopernikusstrasse 5, 52074 Aachen, Germany
10. Vlahou E, Ueno K, Shinn-Cunningham BG, Kopčo N. Calibration of Consonant Perception to Room Reverberation. J Speech Lang Hear Res 2021; 64:2956-2976. [PMID: 34297606; DOI: 10.1044/2021_jslhr-20-00396]
Abstract
Purpose We examined how consonant perception is affected by a preceding speech carrier simulated in the same or a different room, for different classes of consonants. Carrier room, carrier length, and carrier length/target room uncertainty were manipulated. A phonetic feature analysis tested which phonetic categories are influenced by the manipulations in the acoustic context of the carrier. Method Two experiments were performed, each with nine participants. Targets consisted of 10 or 16 vowel-consonant (VC) syllables presented in one of two strongly reverberant rooms, preceded by a multiple-VC carrier presented in either the same room, a different reverberant room, or an anechoic room. In Experiment 1, the carrier length and the target room randomly varied from trial to trial, whereas in Experiment 2, they were fixed within a block of trials. Results Overall, a consistent carrier provided an advantage for consonant perception compared to inconsistent carriers, whether in anechoic or differently reverberant rooms. Phonetic analysis showed that carrier inconsistency significantly degraded identification of the manner of articulation, especially for stop consonants and, in one of the rooms, also of voicing. Carrier length and carrier/target uncertainty did not affect adaptation to reverberation for individual phonetic features. The detrimental effects of anechoic and different reverberant carriers on target perception were similar. Conclusions The strength of calibration varies across different phonetic features, as well as across rooms with different levels of reverberation. Even though place of articulation is the feature that is affected by reverberation the most, it is the manner of articulation and, partially, voicing for which room adaptation is observed.
Affiliation(s)
- Eleni Vlahou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Volos, Greece
- Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University, Košice, Slovakia
- Hearing Research Center and Department of Biomedical Engineering, Boston University, MA
- Kanako Ueno
- School of Science and Technology, Meiji University, Chiyoda, Japan
- Norbert Kopčo
- Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University, Košice, Slovakia
- Hearing Research Center and Department of Biomedical Engineering, Boston University, MA
11. Recall of Reverberant Speech in Quiet and Four-Talker Babble Noise. Brain Sci 2021; 11(7):891. [PMID: 34356126; PMCID: PMC8301929; DOI: 10.3390/brainsci11070891]
Abstract
Using behavioral evaluation of free-recall performance, we investigated whether reverberation and/or noise affected memory performance in normal-hearing adults. Thirty-four participants performed a free-recall task in which they were instructed to repeat the initial word after each sentence and to remember the target words after each list of seven sentences, in a 2 (reverberation) × 2 (noise) factorial design. Pupil dilation responses (baseline and peak pupil dilation) were also recorded sentence by sentence while the participants were trying to remember the target words. In noise, speech was presented at an easily audible level using an individualized signal-to-noise ratio (95% speech intelligibility). As expected, recall performance was significantly lower in the noisy environment than in the quiet condition. Regardless of noise interference or reverberation, sentence-by-sentence baseline values gradually increased with the number of words to be remembered for the subsequent free-recall task. Long reverberation time had no significant effect on memory retrieval of verbal stimuli or on pupillary responses during encoding.
12. Lutfi RA, Rodriguez B, Lee J. The Listener Effect in Multitalker Speech Segregation and Talker Identification. Trends Hear 2021; 25:23312165211051886. [PMID: 34693853; PMCID: PMC8544763; DOI: 10.1177/23312165211051886]
Abstract
Over six decades ago, Cherry (1953) drew attention to what he called the "cocktail-party problem": the challenge of segregating the speech of one talker from others speaking at the same time. The problem has been actively researched ever since, but for all this time one observation has eluded explanation: the wide variation in performance of individual listeners. That variation was replicated here for four major experimental factors known to impact performance: differences in task (talker segregation vs. identification), differences in the voice features of talkers (pitch vs. location), differences in the voice similarity and uncertainty of talkers (informational masking), and the presence or absence of linguistic cues. The effect of these factors on the segregation of naturally spoken sentences and synthesized vowels was largely eliminated in psychometric functions relating the performance of individual listeners to that of an ideal observer, d'_ideal. The effect of listeners remained as differences in the slopes of the functions (fixed effect) with little within-listener variability in the estimates of slope (random effect). The results make a case for considering the listener a factor in multitalker segregation and identification equal in status to any major experimental variable.
Affiliation(s)
- Robert A. Lutfi
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Briana Rodriguez
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Jungmee Lee
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
Collapse
|
13
|
Lutfi RA, Rodriguez B, Lee J, Pastore T. A test of model classes accounting for individual differences in the cocktail-party effect. J Acoust Soc Am 2020; 148:4014. [PMID: 33379927; PMCID: PMC7775115; DOI: 10.1121/10.0002961]
Abstract
Listeners differ widely in the ability to follow the speech of a single talker in a noisy crowd, the so-called cocktail-party effect. Differences may arise from any one or a combination of factors associated with auditory sensitivity, selective attention, working memory, and decision making required for effective listening. The present study attempts to narrow the possibilities by grouping explanations into model classes based on model predictions for the types of errors that distinguish better from poorer performing listeners in a vowel segregation and talker identification task. Two model classes are considered: those for which the errors are predictably tied to the voice variation of talkers (decision weight models) and those for which the errors occur largely independently of this variation (internal noise models). Regression analyses of trial-by-trial responses, for different tasks and task demands, show overwhelmingly that the latter type of error is responsible for the performance differences among listeners. The results are inconsistent with models that attribute the performance differences to differences in the reliance listeners place on relevant voice features in this decision. The results are consistent instead with models for which largely stimulus-independent, stochastic processes cause information loss at different stages of auditory processing.
Affiliation(s)
- Robert A Lutfi
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Briana Rodriguez
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Jungmee Lee
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Torben Pastore
- Spatial Hearing Lab, College of Health Solutions, Arizona State University, Tempe, Arizona 85281, USA

14
Wang L, Best V, Shinn-Cunningham BG. Benefits of Beamforming With Local Spatial-Cue Preservation for Speech Localization and Segregation. Trends Hear 2020; 24:2331216519896908. [PMID: 31931677; PMCID: PMC6961143; DOI: 10.1177/2331216519896908]
Abstract
A study was conducted to examine the benefits afforded by a signal-processing strategy that imposes the binaural cues present in a natural signal, calculated locally in time and frequency, on the output of a beamforming microphone array. Such a strategy has the potential to combine the signal-to-noise ratio advantage of beamforming with the perceptual benefit of spatialization to enhance performance in multitalker mixtures. Participants with normal hearing and with hearing loss were tested on both speech localization and speech-on-speech masking tasks. Performance for the spatialized beamformer was compared with that for three other conditions: a reference condition with no processing, a beamformer with no spatialization, and a hybrid beamformer that operates only in the high frequencies to preserve natural binaural cues in the low frequencies. Beamforming with full-bandwidth spatialization supported speech localization and produced better speech reception thresholds than the other conditions.
Affiliation(s)
- Le Wang
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, MA, USA

15
Lad M, Holmes E, Chu A, Griffiths TD. Speech-in-noise detection is related to auditory working memory precision for frequency. Sci Rep 2020; 10:13997. [PMID: 32814792; PMCID: PMC7438331; DOI: 10.1038/s41598-020-70952-9]
Abstract
Speech-in-noise (SiN) perception is a critical aspect of natural listening, deficits in which are a major contributor to the hearing handicap in cochlear hearing loss. Studies suggest that SiN perception correlates with cognitive skills, particularly phonological working memory: the ability to hold and manipulate phonemes or words in mind. We consider here the idea that SiN perception is linked to a more general ability to hold sound objects in mind, auditory working memory, irrespective of whether the objects are speech sounds. This process might help combine foreground elements, like speech, over seconds to aid their separation from the background of an auditory scene. We investigated the relationship between auditory working memory precision and SiN thresholds in listeners with normal hearing. We used a novel paradigm that tests auditory working memory for non-speech sounds that vary in frequency and amplitude modulation (AM) rate. The paradigm yields measures of precision in frequency and AM domains, based on the distribution of participants’ estimates of the target. Across participants, frequency precision correlated significantly with SiN thresholds. Frequency precision also correlated with the number of years of musical training. Measures of phonological working memory did not correlate with SiN detection ability. Our results demonstrate a specific relationship between working memory for frequency and SiN. We suggest that working memory for frequency facilitates the identification and tracking of foreground objects like speech during natural listening. Working memory performance for frequency also correlated with years of musical instrument experience suggesting that the former is potentially modifiable.
Affiliation(s)
- Meher Lad
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Emma Holmes
- Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Agatha Chu
- Newcastle University Medical School, Newcastle upon Tyne, UK
- Timothy D Griffiths
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK; Wellcome Centre for Human Neuroimaging, University College London, London, UK

16
Bidelman GM, Yoo J. Musicians Show Improved Speech Segregation in Competitive, Multi-Talker Cocktail Party Scenarios. Front Psychol 2020; 11:1927. [PMID: 32973610; PMCID: PMC7461890; DOI: 10.3389/fpsyg.2020.01927]
Abstract
Studies suggest that long-term music experience enhances the brain’s ability to segregate speech from noise. Musicians’ “speech-in-noise (SIN) benefit,” however, is based largely on evidence from simple figure-ground tasks rather than competitive, multi-talker scenarios that offer realistic spatial cues for segregation and engage binaural processing. We aimed to investigate whether musicians show perceptual advantages in cocktail party speech segregation in a competitive, multi-talker environment. We used the coordinate response measure (CRM) paradigm to measure speech recognition and localization performance in musicians vs. non-musicians in a simulated 3D cocktail party environment conducted in an anechoic chamber. Speech was delivered through a 16-channel speaker array distributed around the horizontal soundfield surrounding the listener. Participants recalled the color, number, and perceived location of target callsign sentences. We manipulated task difficulty by varying the number of additional maskers presented at other spatial locations in the horizontal soundfield (0–1–2–3–4–6–8 multi-talkers). Musicians obtained faster and better speech recognition with up to eight simultaneous talkers and showed less noise-related decline in performance with increasing interferers than their non-musician peers. Correlations revealed associations between listeners’ years of musical training and both CRM recognition and working memory, and better working memory correlated with better speech streaming. Basic (QuickSIN) but not more complex (speech streaming) SIN processing was still predicted by music training after controlling for working memory. Our findings confirm a relationship between musicianship and naturalistic cocktail party speech streaming but also suggest that cognitive factors at least partially drive musicians’ SIN advantage.
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States; Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, United States
- Jessica Yoo
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States

17
Jaeger M, Mirkovic B, Bleichner MG, Debener S. Decoding the Attended Speaker From EEG Using Adaptive Evaluation Intervals Captures Fluctuations in Attentional Listening. Front Neurosci 2020; 14:603. [PMID: 32612507; PMCID: PMC7308709; DOI: 10.3389/fnins.2020.00603]
Abstract
Listeners differ in their ability to attend to a speech stream in the presence of a competing sound. Differences in speech intelligibility in noise cannot be fully explained by hearing ability, which suggests the involvement of additional cognitive factors. A better understanding of the temporal fluctuations in the ability to pay selective auditory attention to a desired speech stream may help in explaining this variability. In order to better understand the temporal dynamics of selective auditory attention, we developed an online auditory attention decoding (AAD) processing pipeline based on speech envelope tracking in the electroencephalogram (EEG). Participants had to attend to one audiobook story while a second one had to be ignored. Online AAD was applied to track the attention toward the target speech signal. Individual temporal attention profiles were computed by combining an established AAD method with an adaptive staircase procedure. The individual decoding performance over time was analyzed and linked to behavioral performance as well as subjective ratings of listening effort, motivation, and fatigue. The grand average attended speaker decoding profile derived in the online experiment indicated performance above chance level. Parameters describing the individual AAD performance in each testing block indicated that significant differences in decoding performance over time were closely related to behavioral performance in the selective listening task. Further, an exploratory analysis indicated that subjects with poor decoding performance reported higher listening effort and fatigue compared to good performers. Taken together, our results show that online EEG-based AAD in a complex listening situation is feasible. Adaptive attended speaker decoding profiles over time could be used as an objective measure of behavioral performance and listening effort. The developed online processing pipeline could also serve as a basis for future EEG-based near real-time auditory neurofeedback systems.
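The decision rule at the heart of envelope-based AAD is compact enough to sketch. The snippet below is a generic, simplified illustration of correlation-based attended-speaker selection, not the authors' actual pipeline (their online method additionally uses an established decoder plus an adaptive staircase); all names and the toy data are invented for illustration:

```python
import numpy as np

def decode_attended(reconstructed_env, env_a, env_b):
    """Generic correlation-based AAD decision: pick the speech stream whose
    acoustic envelope correlates best with the envelope reconstructed
    from the listener's EEG."""
    r_a = np.corrcoef(reconstructed_env, env_a)[0, 1]
    r_b = np.corrcoef(reconstructed_env, env_b)[0, 1]
    return "A" if r_a > r_b else "B"

# Toy check: pretend the EEG reconstruction is a noisy copy of stream A's
# envelope, as would happen when the listener attends to stream A.
rng = np.random.default_rng(0)
env_a = rng.random(2000)                          # envelope of stream A
env_b = rng.random(2000)                          # envelope of stream B
recon = env_a + 0.5 * rng.standard_normal(2000)   # noisy EEG reconstruction
```

Decoding accuracy computed over successive evaluation intervals, rather than as a single global score, is what yields the temporal attention profiles described above.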
Affiliation(s)
- Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Fraunhofer Institute for Digital Media Technology IDMT, Division Hearing, Speech and Audio Technology, Oldenburg, Germany
- Bojana Mirkovic
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany
- Martin G Bleichner
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Neurophysiology of Everyday Life Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany; Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany

18
Jain C, Dwarakanath VM, G A. Suprathreshold Processing and Cocktail Party Listening in Younger and Older Adults with Normal Hearing. Ageing International 2019. [DOI: 10.1007/s12126-019-09356-8]
19
'Normal' hearing thresholds and fundamental auditory grouping processes predict difficulties with speech-in-noise perception. Sci Rep 2019; 9:16771. [PMID: 31728002; PMCID: PMC6856372; DOI: 10.1038/s41598-019-53353-5]
Abstract
Understanding speech when background noise is present is a critical everyday task that varies widely among people. A key challenge is to understand why some people struggle with speech-in-noise perception, despite having clinically normal hearing. Here, we developed new figure-ground tests that require participants to extract a coherent tone pattern from a stochastic background of tones. These tests dissociated variability in speech-in-noise perception related to mechanisms for detecting static (same-frequency) patterns and those for tracking patterns that change frequency over time. In addition, elevated hearing thresholds that are widely considered to be ‘normal’ explained significant variance in speech-in-noise perception, independent of figure-ground perception. Overall, our results demonstrate that successful speech-in-noise perception is related to audiometric thresholds, fundamental grouping of static acoustic patterns, and tracking of acoustic sources that change in frequency. Crucially, speech-in-noise deficits are better assessed by measuring central (grouping) processes alongside audiometric thresholds.
20
Aroudi A, Mirkovic B, De Vos M, Doclo S. Impact of Different Acoustic Components on EEG-Based Auditory Attention Decoding in Noisy and Reverberant Conditions. IEEE Trans Neural Syst Rehabil Eng 2019; 27:652-663. [DOI: 10.1109/tnsre.2019.2903404]
21
Bharadwaj HM, Mai AR, Simpson JM, Choi I, Heinz MG, Shinn-Cunningham BG. Non-Invasive Assays of Cochlear Synaptopathy - Candidates and Considerations. Neuroscience 2019; 407:53-66. [PMID: 30853540; DOI: 10.1016/j.neuroscience.2019.02.031]
Abstract
Studies in multiple species, including in post-mortem human tissue, have shown that normal aging and/or acoustic overexposure can lead to a significant loss of afferent synapses innervating the cochlea. Hypothetically, this cochlear synaptopathy can lead to perceptual deficits in challenging environments and can contribute to central neural effects such as tinnitus. However, because cochlear synaptopathy can occur without any measurable changes in audiometric thresholds, synaptopathy can remain hidden from standard clinical diagnostics. To understand the perceptual sequelae of synaptopathy and to evaluate the efficacy of emerging therapies, sensitive and specific non-invasive measures at the individual patient level need to be established. Pioneering experiments in specific mouse strains have helped identify many candidate assays. These include auditory brainstem responses, the middle-ear muscle reflex, envelope-following responses, and extended high-frequency audiograms. Unfortunately, because these non-invasive measures can also be affected by extraneous factors other than synaptopathy, their application and interpretation in humans is not straightforward. Here, we systematically examine six extraneous factors through a series of interrelated human experiments aimed at understanding their effects. Using strategies that may help mitigate the effects of such extraneous factors, we then show that these suprathreshold physiological assays exhibit across-individual correlations with each other, indicative of contributions from a common physiological source consistent with cochlear synaptopathy. Finally, we discuss the application of these assays to two key outstanding questions, and discuss some barriers that still remain. This article is part of a Special Issue entitled: Hearing Loss, Tinnitus, Hyperacusis, Central Gain.
Affiliation(s)
- Hari M Bharadwaj
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN
- Alexandra R Mai
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN
- Jennifer M Simpson
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN
- Inyong Choi
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA
- Michael G Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN; Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN

22
Jakien KM, Gallun FJ. Normative Data for a Rapid, Automated Test of Spatial Release From Masking. Am J Audiol 2018; 27:529-538. [PMID: 30458523; PMCID: PMC6436452; DOI: 10.1044/2018_aja-17-0069]
Abstract
Purpose The purpose of this study is to report normative data and predict thresholds for a rapid test of spatial release from masking for speech perception. The test is easily administered and has good repeatability, with the potential to be used in clinics and laboratories. Normative functions were generated for adults varying in age and amount of hearing loss. Method The test of spatial release presents a virtual auditory scene over headphones with 2 conditions: colocated (with target and maskers at 0°) and spatially separated (with target at 0° and maskers at ±45°). Listener thresholds are determined as target-to-masker ratios, and spatial release from masking (SRM) is determined as the difference between the colocated condition and the spatially separated condition. Multiple linear regression was used to fit the data from 82 adults 18–80 years of age with normal to moderate hearing loss (0–40 dB HL pure-tone average [PTA]). The regression equations were then used to generate normative functions that relate age (in years) and hearing thresholds (as PTA) to target-to-masker ratios and SRM. Results Normative functions were able to predict thresholds with an error of less than 3.5 dB in all conditions. In the colocated condition, the function included only age as a predictive parameter, whereas in the spatially separated condition, both age and PTA were included as parameters. For SRM, PTA was the only significant predictor. Different functions were generated for the 1st run, the 2nd run, and the average of the 2 runs. All 3 functions were largely similar in form, with the smallest error being associated with the function based on the average of the 2 runs. Conclusion With the normative functions generated from this data set, it would be possible for a researcher or clinician to interpret data from a small number of participants or even a single patient without having to first collect data from a control group, substantially reducing the time and resources needed. Supplemental Material: https://doi.org/10.23641/asha.7080878
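The threshold arithmetic behind this test is simple: SRM is the colocated target-to-masker threshold minus the spatially separated one, and the normative functions are linear fits in age and PTA. A minimal sketch of such a lookup follows; the intercepts and slopes are invented placeholders for illustration, not the fitted regression values reported in the article:

```python
# Sketch of an SRM normative lookup. Coefficients below are hypothetical
# placeholders, NOT the regression coefficients from the study.

def predict_colocated_tmr(age_years):
    # Colocated threshold: modeled on age alone (target-to-masker ratio, dB).
    return -8.0 + 0.05 * age_years

def predict_separated_tmr(age_years, pta_db_hl):
    # Separated threshold: modeled on both age and pure-tone average (dB HL).
    return -18.0 + 0.04 * age_years + 0.15 * pta_db_hl

def spatial_release(age_years, pta_db_hl):
    # SRM = colocated threshold minus spatially separated threshold.
    return predict_colocated_tmr(age_years) - predict_separated_tmr(age_years, pta_db_hl)

# Example: predicted SRM for a 60-year-old with a 20 dB HL PTA.
srm = spatial_release(60, 20)  # 7.6 dB with these placeholder coefficients
```

In use, a measured threshold deviating from the normative prediction by more than the reported error (under 3.5 dB in this study) would flag a patient for further testing.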
Affiliation(s)
- Kasey M. Jakien
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Department of Veterans Affairs, OR
- Department of Otolaryngology–Head & Neck Surgery, Oregon Health and Science University, Portland
- Frederick J. Gallun
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Department of Veterans Affairs, OR
- Department of Otolaryngology–Head & Neck Surgery, Oregon Health and Science University, Portland

23
Lotfi Y, Ahmadi T, Moossavi A, Bakhshi E. Binaural sensitivity to temporal fine structure and lateralization ability in children with suspected (central) auditory processing disorder. Auris Nasus Larynx 2018; 46:64-69. [PMID: 29954636; DOI: 10.1016/j.anl.2018.06.005]
Abstract
OBJECTIVE Previous studies have shown that a subgroup of children with suspected (central) auditory processing disorder (SusCAPD) have insufficient ability to use binaural cues to benefit from spatial processing. Thus, they experience considerable listening difficulties in challenging auditory environments, such as classrooms. Some researchers have also indicated a probable role of binaural temporal fine structure (TFS) in the perceptual segregation of a target signal from noise, and hence in speech perception in noise. Therefore, in the present study, in order to further investigate the underlying reason for listening problems against background noise in this group of children, their performance was measured using a binaural TFS sensitivity test (TFS-LF) as well as a behavioral auditory lateralization-in-noise test, both of which are based on binaural temporal cue processing. METHODS Participants in this analytical study included 91 children with normal hearing and no listening problems and 41 children (9-12 years old) with SusCAPD who found it challenging to understand speech in noise. Initially, the ability to use binaural TFS was measured at three frequencies (250, 500 and 750 Hz) in both groups, and the results of these preliminary evaluations were compared between the normal children and those with SusCAPD. Thereafter, the binaural performance of the 16 children with SusCAPD who had higher thresholds than the normal group at all three frequencies tested in the TFS-LF test was examined using the lateralization test in 7 spatial locations. RESULTS In total, 16 of the 41 children with SusCAPD who participated in this study (39%) showed poor performance on the TFS-LF test at all three frequencies, compared to both normal children and the other children in the APD group (p<0.05). Furthermore, children in the APD group with binaural TFS coding deficits at all three frequencies showed significant differences in the lateralization test results compared to normal children (p<0.05). CONCLUSION Findings of the current study demonstrated that one underlying cause of the difficulty understanding speech in noisy environments experienced by a subgroup of children with SusCAPD may be a reduced ability to benefit from binaural TFS information. This study also showed that the reduced ability to use binaural TFS cues in this group was accompanied by reduced binaural processing abilities in the lateralization test, which further indicates the presence of binaural temporal processing deficits in these children.
Affiliation(s)
- Yones Lotfi
- Department of Audiology, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Tayebeh Ahmadi
- Department of Audiology, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Abdollah Moossavi
- Department of Otolaryngology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Enayatollah Bakhshi
- Department of Statistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

24
Sensorineural hearing loss degrades behavioral and physiological measures of human spatial selective auditory attention. Proc Natl Acad Sci U S A 2018; 115:E3286-E3295. [PMID: 29555752; PMCID: PMC5889663; DOI: 10.1073/pnas.1721226115]
Abstract
Listeners with sensorineural hearing loss often have trouble understanding speech amid other voices. While poor spatial hearing is often implicated, direct evidence is weak; moreover, studies suggest that reduced audibility and degraded spectrotemporal coding may explain such problems. We hypothesized that poor spatial acuity leads to difficulty deploying selective attention, which normally filters out distracting sounds. In listeners with normal hearing, selective attention causes changes in the neural responses evoked by competing sounds, which can be used to quantify the effectiveness of attentional control. Here, we used behavior and electroencephalography to explore whether control of selective auditory attention is degraded in hearing-impaired (HI) listeners. Normal-hearing (NH) and HI listeners identified a simple melody presented simultaneously with two competing melodies, each simulated from different lateral angles. We quantified performance and attentional modulation of cortical responses evoked by these competing streams. Compared with NH listeners, HI listeners had poorer sensitivity to spatial cues, performed more poorly on the selective attention task, and showed less robust attentional modulation of cortical responses. Moreover, across NH and HI individuals, these measures were correlated. While both groups showed cortical suppression of distracting streams, this modulation was weaker in HI listeners, especially when attending to a target at midline, surrounded by competing streams. These findings suggest that hearing loss interferes with the ability to filter out sound sources based on location, contributing to communication difficulties in social situations. These findings also have implications for technologies aiming to use neural signals to guide hearing aid processing.
25
Jaeger M, Bleichner MG, Bauer AKR, Mirkovic B, Debener S. Did You Listen to the Beat? Auditory Steady-State Responses in the Human Electroencephalogram at 4 and 7 Hz Modulation Rates Reflect Selective Attention. Brain Topogr 2018; 31:811-826. [DOI: 10.1007/s10548-018-0637-8]
26
Abstract
Many people with difficulties following conversations in noisy settings have “clinically normal” audiograms, that is, tone thresholds better than 20 dB HL from 0.1 to 8 kHz. This review summarizes the possible causes of such difficulties, and examines established as well as promising new psychoacoustic and electrophysiologic approaches to differentiate between them. Deficits at the level of the auditory periphery are possible even if thresholds remain around 0 dB HL, and become probable when they reach 10 to 20 dB HL. Extending the audiogram beyond 8 kHz can identify early signs of noise-induced trauma to the vulnerable basal turn of the cochlea, and might point to “hidden” losses at lower frequencies that could compromise speech reception in noise. Listening difficulties can also be a consequence of impaired central auditory processing, resulting from lesions affecting the auditory brainstem or cortex, or from abnormal patterns of sound input during developmental sensitive periods and even in adulthood. Such auditory processing disorders should be distinguished from (cognitive) linguistic deficits, and from problems with attention or working memory that may not be specific to the auditory modality. Improved diagnosis of the causes of listening difficulties in noise should lead to better treatment outcomes, by optimizing auditory training procedures to the specific deficits of individual patients, for example.
27
Oberem J, Seibold J, Koch I, Fels J. Intentional switching in auditory selective attention: Exploring attention shifts with different reverberation times. Hear Res 2017; 359:32-39. [PMID: 29305038; DOI: 10.1016/j.heares.2017.12.013]
Abstract
Using a well-established binaural-listening paradigm, the ability to intentionally switch auditory selective attention was examined under anechoic, low reverberation (0.8 s), and high reverberation (1.75 s) conditions. Twenty-three young, normal-hearing subjects were tested in a within-subject design to analyze the influence of reverberation time. Word pairs spoken by two speakers were presented simultaneously to subjects from two of eight azimuth positions. The stimuli consisted of a single number word (i.e., 1 to 9), followed by either the direction "UP" or "DOWN" in German. Guided by a visual cue prior to auditory stimulus onset indicating the position of the target speaker, subjects were asked to identify whether the target number was numerically smaller or greater than five and to categorize the direction of the second word. Switch costs (i.e., reaction time differences between a position switch of the target relative to a position repetition) were larger under the high reverberation condition. Furthermore, the error rates were highly dependent on reverberant energy, and reverberation interacted with the congruence effect (i.e., stimuli spoken by target and distractor may evoke the same answer (congruent) or different answers (incongruent)), indicating larger congruence effects under higher reverberation times.
Affiliation(s)
- Josefa Oberem
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany.
- Julia Seibold
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany.
- Iring Koch
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany.
- Janina Fels
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany.
28
Shinn-Cunningham B. Cortical and Sensory Causes of Individual Differences in Selective Attention Ability Among Listeners With Normal Hearing Thresholds. J Speech Lang Hear Res 2017; 60:2976-2988. [PMID: 29049598 PMCID: PMC5945067 DOI: 10.1044/2017_jslhr-h-17-0080] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2017] [Revised: 06/23/2017] [Accepted: 07/05/2017] [Indexed: 05/28/2023]
Abstract
PURPOSE This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings. METHOD The results from neuroscience and psychoacoustics are reviewed. RESULTS In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing." CONCLUSIONS How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise. PRESENTATION VIDEO http://cred.pubs.asha.org/article.aspx?articleid=2601617.
Affiliation(s)
- Barbara Shinn-Cunningham
- Center for Research in Sensory Communication and Emerging Neural Technology, Boston University, MA
29
Fuglsang SA, Dau T, Hjortkjær J. Noise-robust cortical tracking of attended speech in real-world acoustic scenes. Neuroimage 2017; 156:435-444. [PMID: 28412441 DOI: 10.1016/j.neuroimage.2017.04.026] [Citation(s) in RCA: 97] [Impact Index Per Article: 13.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2016] [Revised: 04/07/2017] [Accepted: 04/10/2017] [Indexed: 11/30/2022] Open
Abstract
Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by noise-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and numbers of interfering talkers, listeners selectively attended to the speech stream of a particular talker. Across the different listening environments, we found that the attended talker could be accurately decoded from single-trial EEG data irrespective of the different distortions in the acoustic input. For highly reverberant environments, speech envelopes reconstructed from neural responses to the distorted stimuli resembled the original clean signal more than the distorted input. With reverberant speech, we observed a late cortical response to the attended speech stream that encoded temporal modulations in the speech signal without its reverberant distortion. Single-trial attention decoding accuracies based on 40-50 s blocks of data from 64 scalp electrodes were equally high (80-90% correct) in all considered listening environments, and remained statistically significant using as few as 10 scalp electrodes and short (<30 s) unaveraged EEG segments. In contrast to the robust decoding of the attended talker, we found that decoding of the unattended talker deteriorated with the acoustic distortions. These results suggest that cortical activity tracks an attended speech signal in a way that is invariant to acoustic distortions encountered in real-life sound environments. Noise-robust attention decoding additionally suggests a potential utility of stimulus reconstruction techniques in attention-controlled brain-computer interfaces.
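The stimulus-reconstruction approach described here can be illustrated with a minimal backward model: a ridge-regression decoder maps time-lagged EEG onto a speech envelope, and the talker whose envelope correlates best with the reconstruction is labeled attended. The sketch below uses simulated data and made-up dimensions and lags; it is not the authors' pipeline:

```python
import numpy as np

def lagged(eeg, lags):
    """Time-lagged design matrix from EEG of shape (samples, channels)."""
    T, C = eeg.shape
    X = np.zeros((T, C * len(lags)))
    for i, L in enumerate(lags):
        shifted = np.roll(eeg, L, axis=0)
        if L > 0:
            shifted[:L] = 0.0   # zero the wrapped-around samples
        elif L < 0:
            shifted[L:] = 0.0
        X[:, i * C:(i + 1) * C] = shifted
    return X

def train_decoder(eeg, envelope, lags, lam=1.0):
    """Ridge regression: weights w such that lagged(eeg) @ w approximates envelope."""
    X = lagged(eeg, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def decode_attended(eeg, env_a, env_b, w, lags):
    """Label the talker whose envelope best matches the reconstruction."""
    rec = lagged(eeg, lags) @ w
    r_a = np.corrcoef(rec, env_a)[0, 1]
    r_b = np.corrcoef(rec, env_b)[0, 1]
    return "A" if r_a > r_b else "B"

# Simulated session: EEG channels carry delayed copies of the attended
# envelope (talker A) buried in noise; talker B is ignored. For brevity the
# decoder is trained and tested on the same simulated trial.
rng = np.random.default_rng(0)
T = 2000
env_a = np.abs(rng.standard_normal(T))
env_b = np.abs(rng.standard_normal(T))
eeg = np.stack([np.roll(env_a, d) for d in (2, 3, 4)], axis=1)
eeg += 0.5 * rng.standard_normal((T, 3))
lags = range(-6, 2)  # negative lags undo the simulated neural response delay
w = train_decoder(eeg, env_a, lags)
print(decode_attended(eeg, env_a, env_b, w, lags))
```

In practice the decoder would be trained on separate data and evaluated on held-out trials, as in the cross-validated decoding reported in this study.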
Affiliation(s)
- Søren Asp Fuglsang
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark.
- Torsten Dau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark
- Jens Hjortkjær
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Kettegaard Allé 30, 2650 Hvidovre, Denmark.
30
Dimitrijevic A, Smith ML, Kadis DS, Moore DR. Cortical Alpha Oscillations Predict Speech Intelligibility. Front Hum Neurosci 2017; 11:88. [PMID: 28286478 PMCID: PMC5323373 DOI: 10.3389/fnhum.2017.00088] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2016] [Accepted: 02/13/2017] [Indexed: 12/21/2022] Open
Abstract
Understanding speech in noise (SiN) is a complex task involving sensory encoding and cognitive resources including working memory and attention. Previous work has shown that brain oscillations, particularly alpha rhythms (8–12 Hz), play important roles in sensory processes involving working memory and attention. However, no previous study has examined brain oscillations during performance of a continuous speech perception test. The aim of this study was to measure cortical alpha during attentive listening in a commonly used SiN task (digits-in-noise, DiN) to better understand the neural processes associated with "top-down" cognitive processing in adverse listening environments. We recruited 14 normal-hearing (NH) young adults. The DiN speech reception threshold (SRT) was measured in an initial behavioral experiment. EEG activity was then collected: (i) while performing the DiN near the SRT; and (ii) while attending to a silent, close-captioned video during presentation of identical digit stimuli that the participant was instructed to ignore. Three main results were obtained: (1) during attentive ("active") listening to the DiN, a number of distinct neural oscillations were observed (mainly alpha, with some beta, 15–30 Hz), whereas no oscillations were observed during attention to the video ("passive" listening); (2) overall, alpha event-related synchronization (ERS) from central/parietal sources was observed during active listening when data were grand averaged across all participants, and in some participants a smaller-magnitude alpha event-related desynchronization (ERD), originating in temporal regions, was observed; and (3) when individual EEG trials were sorted according to correct and incorrect digit identification, the temporal alpha ERD was consistently greater on correctly identified trials. No such consistency was observed for the central/parietal alpha ERS. These data demonstrate that changes in alpha activity are specific to listening conditions. To our knowledge, this is the first report of almost no oscillatory brain changes during a passive task compared to an active task in any sensory modality. Temporal alpha ERD was related to correct digit identification.
Affiliation(s)
- Andrew Dimitrijevic
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, ON, Canada; Hurvitz Brain Sciences, Evaluative Clinical Sciences, Sunnybrook Research Institute, Toronto, ON, Canada; Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, ON, Canada
- Michael L Smith
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Darren S Kadis
- Pediatric Neuroimaging Research Consortium, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- David R Moore
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; Department of Otolaryngology, University of Cincinnati, Cincinnati, OH, USA
31
Bressler S, Goldberg H, Shinn-Cunningham B. Sensory coding and cognitive processing of sound in Veterans with blast exposure. Hear Res 2016; 349:98-110. [PMID: 27815131 DOI: 10.1016/j.heares.2016.10.018] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/09/2016] [Revised: 10/07/2016] [Accepted: 10/26/2016] [Indexed: 11/17/2022]
Abstract
Recent anecdotal reports from VA audiology clinics as well as a few published studies have identified a sub-population of Service Members seeking treatment for problems communicating in everyday, noisy listening environments despite having normal to near-normal hearing thresholds. Because of their increased risk of exposure to dangerous levels of prolonged noise and transient explosive blast events, communication problems in these soldiers could be due to either hearing loss (traditional or "hidden") in the auditory sensory periphery or to blast-induced injury to cortical networks associated with attention. We found that out of the 14 blast-exposed Service Members recruited for this study, 12 had hearing thresholds in the normal to near-normal range. A majority of these participants reported having problems specifically related to failures of selective attention. Envelope following responses (EFRs) measuring the neural coding fidelity of the auditory brainstem to suprathreshold sounds were similar between blast-exposed and non-blast controls. Blast-exposed subjects performed substantially worse than non-blast controls in an auditory selective attention task in which listeners classified the melodic contour (rising, falling, or "zig-zagging") of one of three simultaneous, competing tone sequences. Salient pitch and spatial differences made for easy segregation of the three concurrent melodies. Poor performance in the blast-exposed subjects was associated with weaker evoked response potentials (ERPs) in frontal EEG channels, as well as a failure of attention to enhance the neural responses evoked by a sequence when it was the target compared to when it was a distractor. These results suggest that communication problems in these listeners cannot be explained by compromised sensory representations in the auditory periphery, but rather point to lingering blast-induced damage to cortical networks implicated in the control of attention. Because all study participants also suffered from post-traumatic stress disorder (PTSD), follow-up studies are required to tease apart the contributions of PTSD and blast-induced injury to cognitive performance.
Affiliation(s)
- Scott Bressler
- Center for Computational Neuroscience and Neural Technologies (CompNet), Boston University, Boston, MA 02215, USA
- Hannah Goldberg
- Center for Computational Neuroscience and Neural Technologies (CompNet), Boston University, Boston, MA 02215, USA
- Barbara Shinn-Cunningham
- Center for Computational Neuroscience and Neural Technologies (CompNet), Boston University, Boston, MA 02215, USA; Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA.
32
Dai L, Shinn-Cunningham BG. Contributions of Sensory Coding and Attentional Control to Individual Differences in Performance in Spatial Auditory Selective Attention Tasks. Front Hum Neurosci 2016; 10:530. [PMID: 27812330 PMCID: PMC5071360 DOI: 10.3389/fnhum.2016.00530] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2016] [Accepted: 10/05/2016] [Indexed: 11/13/2022] Open
Abstract
Listeners with normal hearing thresholds (NHTs) differ in their ability to steer attention to whatever sound source is important. This ability depends on top-down executive control, which modulates the sensory representation of sound in the cortex. Yet, this sensory representation also depends on the coding fidelity of the peripheral auditory system. Both of these factors may thus contribute to the individual differences in performance. We designed a selective auditory attention paradigm in which we could simultaneously measure envelope following responses (EFRs, reflecting peripheral coding), onset event-related potentials (ERPs) from the scalp (reflecting cortical responses to sound) and behavioral scores. We performed two experiments that varied stimulus conditions to alter the degree to which performance might be limited due to fine stimulus details vs. due to control of attentional focus. Consistent with past work, in both experiments we find that attention strongly modulates cortical ERPs. Importantly, in Experiment I, where coding fidelity limits the task, individual behavioral performance correlates with subcortical coding strength (derived by computing how the EFR is degraded for fully masked tones compared to partially masked tones); however, in this experiment, the effects of attention on cortical ERPs were unrelated to individual subject performance. In contrast, in Experiment II, where sensory cues for segregation are robust (and thus less of a limiting factor on task performance), inter-subject behavioral differences correlate with subcortical coding strength. In addition, after factoring out the influence of subcortical coding strength, behavioral differences are also correlated with the strength of attentional modulation of ERPs. These results support the hypothesis that behavioral abilities amongst listeners with NHTs can arise due to both subcortical coding differences and differences in attentional control, depending on stimulus characteristics and task demands.
Affiliation(s)
- Lengshi Dai
- Department of Biomedical Engineering, Boston University, Boston, MA, USA
34
Lőcsei G, Pedersen JH, Laugesen S, Santurette S, Dau T, MacDonald EN. Temporal Fine-Structure Coding and Lateralized Speech Perception in Normal-Hearing and Hearing-Impaired Listeners. Trends Hear 2016; 20:2331216516660962. [PMID: 27601071 PMCID: PMC5014088 DOI: 10.1177/2331216516660962] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2015] [Accepted: 07/01/2016] [Indexed: 11/16/2022] Open
Abstract
This study investigated the relationship between speech perception performance in spatially complex, lateralized listening scenarios and temporal fine-structure (TFS) coding at low frequencies. Young normal-hearing (NH) listeners and two groups of elderly hearing-impaired (HI) listeners with mild or moderate hearing loss above 1.5 kHz participated in the study. Speech reception thresholds (SRTs) were estimated in the presence of either speech-shaped noise, two-, four-, or eight-talker babble played reversed, or a nonreversed two-talker masker. Target audibility was ensured by applying individualized linear gains to the stimuli, which were presented over headphones. The target and masker streams were lateralized to the same or to opposite sides of the head by introducing 0.7-ms interaural time differences between the ears. TFS coding was assessed by measuring frequency discrimination thresholds and interaural phase difference thresholds at 250 Hz. NH listeners had clearly better SRTs than the HI listeners. However, when maskers were spatially separated from the target, the amount of SRT benefit due to binaural unmasking differed only slightly between the groups. Neither the frequency discrimination thresholds nor the interaural phase difference thresholds correlated with the SRTs or with the amount of masking release due to binaural unmasking. The results suggest that, although HI listeners with normal hearing thresholds below 1.5 kHz experienced difficulties with speech understanding in spatially complex environments, these limitations were unrelated to TFS coding abilities and were only weakly associated with a reduction in binaural-unmasking benefit for spatially separated competing sources.
Affiliation(s)
- Gusztáv Lőcsei
- Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Søren Laugesen
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Sébastien Santurette
- Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Torsten Dau
- Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Ewen N MacDonald
- Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
35
Oberfeld D, Klöckner-Nowotny F. Individual differences in selective attention predict speech identification at a cocktail party. eLife 2016; 5:e16747. [PMID: 27580272 PMCID: PMC5441891 DOI: 10.7554/elife.16747] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2016] [Accepted: 08/08/2016] [Indexed: 11/13/2022] Open
Abstract
Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance is individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a proportion of variance similar to that explained by binaural sensitivity to the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise.
Affiliation(s)
- Daniel Oberfeld
- Department of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität, Mainz, Germany
- Felicitas Klöckner-Nowotny
- Department of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität, Mainz, Germany
36
Extended High-Frequency Bandwidth Improves Speech Reception in the Presence of Spatially Separated Masking Speech. Ear Hear 2016; 36:e214-24. [PMID: 25856543 DOI: 10.1097/aud.0000000000000161] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The hypothesis that extending the audible frequency bandwidth beyond the range currently implemented in most hearing aids can improve speech understanding was tested for normal-hearing and hearing-impaired participants using target sentences and spatially separated masking speech. DESIGN The Hearing In Speech Test (HIST) speech corpus was re-recorded, and four masking talkers were recorded at a sample rate of 44.1 kHz. All talkers were male native speakers of American English. For each subject, the reception threshold for sentences (RTS) was measured in two spatial configurations. In the asymmetric configuration, the target was presented from -45° azimuth and two colocated masking talkers were presented from +45° azimuth. In the diffuse configuration, the target was presented from 0° azimuth and four masking talkers were each presented from a different azimuth: +45°, +135°, -135°, and -45°. The new speech sentences, masking materials, and configurations were presented using low-pass filter cutoff frequencies of 4, 6, 8, and 10 kHz. For the normal-hearing participants, stimuli were presented in the sound field using loudspeakers. For the hearing-impaired participants, the spatial configurations were simulated using earphones, and a multiband wide-dynamic-range compressor with a modified CAM2 fitting algorithm was used to compensate for each participant's hearing loss. RESULTS For the normal-hearing participants (N = 24, mean age 40 years), the RTS improved significantly by 3.0 dB when the bandwidth was increased from 4 to 10 kHz, and a significant improvement of 1.3 dB was obtained from extending the bandwidth from 6 to 10 kHz, in both spatial configurations. Hearing-impaired participants (N = 25, mean age 71 years) also showed a significant improvement in RTS with extended bandwidth, but the effect was smaller than for the normal-hearing participants. The mean decrease in RTS when the bandwidth was increased from 4 to 10 kHz was 1.3 dB for the asymmetric condition and 0.5 dB for the diffuse condition. CONCLUSIONS Extending bandwidth from 4 to 10 kHz can improve the ability of normal-hearing and hearing-impaired participants to understand target speech in the presence of spatially separated masking speech. Future studies of the benefits of extended high-frequency amplification should investigate other realistic listening situations, masker types, spatial configurations, and room reverberation conditions, to determine added value in overcoming the technical challenges associated with implementing a device capable of providing extended high-frequency amplification.
37
Dietz M, Wang L, Greenberg D, McAlpine D. Sensitivity to Interaural Time Differences Conveyed in the Stimulus Envelope: Estimating Inputs of Binaural Neurons Through the Temporal Analysis of Spike Trains. J Assoc Res Otolaryngol 2016; 17:313-30. [PMID: 27294694 DOI: 10.1007/s10162-016-0573-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2015] [Accepted: 05/30/2016] [Indexed: 01/03/2023] Open
Abstract
Sound-source localization in the horizontal plane relies on detecting small differences in the timing and level of the sound at the two ears, including differences in the timing of the modulated envelopes of high-frequency sounds (envelope interaural time differences (ITDs)). We investigated responses of single neurons in the inferior colliculus (IC) to a wide range of envelope ITDs and stimulus envelope shapes. By a novel means of visualizing neural activity relative to different portions of the periodic stimulus envelope at each ear, we demonstrate the role of neuron-specific excitatory and inhibitory inputs in creating ITD sensitivity (or the lack of it) depending on the specific shape of the stimulus envelope. The underlying binaural brain circuitry and synaptic parameters were modeled individually for each neuron to account for neuron-specific activity patterns. The model explains the effects of envelope shapes on sensitivity to envelope ITDs observed both in normal-hearing listeners and in neural data, and has consequences for understanding how ITD information in stimulus envelopes might be maximized in users of bilateral cochlear implants, for whom ITDs conveyed in the stimulus envelope are the only ITD cues available.
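As an illustration of the quantity under study (not of the authors' neuron model), an envelope ITD can be estimated by extracting the Hilbert envelope at each ear and locating the peak of the envelopes' cross-correlation; the stimulus and parameters below are made up:

```python
import numpy as np

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed via the FFT."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    h[1:(N + 1) // 2] = 2.0
    if N % 2 == 0:
        h[N // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

def envelope_itd(left, right, fs):
    """Envelope ITD in seconds; positive when the left-ear signal lags."""
    e_l = hilbert_envelope(left) - hilbert_envelope(left).mean()
    e_r = hilbert_envelope(right) - hilbert_envelope(right).mean()
    xc = np.correlate(e_l, e_r, mode="full")
    lag = np.argmax(xc) - (len(e_r) - 1)
    return lag / fs

# Amplitude-modulated tone (3 kHz carrier, 40 Hz envelope) delayed by
# 0.5 ms at the left ear -- a pure envelope ITD.
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
right = (1 + 0.8 * np.sin(2 * np.pi * 40 * t)) * np.sin(2 * np.pi * 3000 * t)
left = np.roll(right, 8)  # 8 samples = 0.5 ms
print(envelope_itd(left, right, fs) * 1000)  # ~0.5 ms
```

The cross-correlation peak picks the modulation-rate-periodic lag with maximal overlap, which here is the imposed 0.5-ms envelope delay.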
Affiliation(s)
- Mathias Dietz
- Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, 26111, Oldenburg, Germany; UCL Ear Institute, 332 Gray's Inn Road, London, WC1X 8EE, UK; National Centre for Audiology, Faculty of Health Sciences, Western University, London, N6G 1H1, Ontario, Canada
- Le Wang
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, 02215, USA
- David Greenberg
- UCL Ear Institute, 332 Gray's Inn Road, London, WC1X 8EE, UK
- David McAlpine
- UCL Ear Institute, 332 Gray's Inn Road, London, WC1X 8EE, UK; Department of Linguistics, Australian Hearing Hub, Macquarie University, Sydney, NSW, 2109, Australia
38
Kidd G, Mason CR, Best V, Swaminathan J. Benefits of Acoustic Beamforming for Solving the Cocktail Party Problem. Trends Hear 2015; 19:2331216515593385. [PMID: 26126896 PMCID: PMC4509760 DOI: 10.1177/2331216515593385] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open
Abstract
The benefit provided to listeners with sensorineural hearing loss (SNHL) by an acoustic beamforming microphone array was determined in a speech-on-speech masking experiment. Normal-hearing controls were tested as well. For the SNHL listeners, prescription-determined gain was applied to the stimuli, and performance using the beamformer was compared with that obtained using bilateral amplification. The listener identified speech from a target talker located straight ahead (0° azimuth) in the presence of four competing talkers that were either colocated with, or spatially separated from, the target. The stimuli were spatialized using measured impulse responses and presented via earphones. In the spatially separated masker conditions, the four maskers were arranged symmetrically around the target at ±15° and ±30° or at ±45° and ±90°. Results revealed that masked speech reception thresholds for spatially separated maskers were higher (poorer) on average for the SNHL than for the normal-hearing listeners. For most SNHL listeners in the wider masker separation condition, lower thresholds were obtained through the microphone array than through bilateral amplification. Large intersubject differences were found in both listener groups. The best masked speech reception thresholds overall were found for a hybrid condition that combined natural and beamforming listening in order to preserve localization for broadband sources.
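The acoustic-beamforming idea behind the array can be reduced to a delay-and-sum sketch: align the microphone channels on the target direction and average them, so that off-axis energy adds incoherently. This toy version (integer-sample delays and simulated white signals, not the study's array or stimuli) is:

```python
import numpy as np

def delay_and_sum(mics, steer_delays):
    """Average microphone channels after removing the steering delays.

    mics: array of shape (n_mics, n_samples); steer_delays: integer sample
    delays of the target at each microphone (0 for a broadside target).
    """
    out = np.zeros(mics.shape[1])
    for channel, d in zip(mics, steer_delays):
        out += np.roll(channel, -d)  # advance each channel by its delay
    return out / len(mics)

# Target from broadside (zero delay at every mic); the interferer arrives
# with a different delay at each microphone, so it averages incoherently.
rng = np.random.default_rng(1)
T = 4000
target = rng.standard_normal(T)
interferer = rng.standard_normal(T)
mics = np.stack([target + np.roll(interferer, 5 * m) for m in range(8)])
out = delay_and_sum(mics, [0] * 8)
residual = out - target  # what is left of the interferer after beamforming
print(np.mean(residual**2) / np.mean(interferer**2))  # well below 1
```

Averaging eight misaligned copies of the interferer reduces its power roughly by the number of microphones, while the aligned target passes through unchanged.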
39
Carlile S, Corkhill C. Selective spatial attention modulates bottom-up informational masking of speech. Sci Rep 2015; 5:8662. [PMID: 25727100 PMCID: PMC4345314 DOI: 10.1038/srep08662] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2014] [Accepted: 01/28/2015] [Indexed: 11/09/2022] Open
Abstract
To hear out a conversation against other talkers, listeners must overcome energetic and informational masking. Although largely attributed to top-down processes, informational masking has also been demonstrated with unintelligible speech and amplitude-modulated maskers, suggesting bottom-up processes. We examined the role of speech-like amplitude modulations in informational masking using a spatial masking release paradigm. Separating a target talker from two masker talkers produced a 20 dB improvement in speech reception threshold, 40% of which was attributed to a release from informational masking. When the across-frequency temporal modulations of the masker talkers are decorrelated, the speech becomes unintelligible, although the within-frequency modulation characteristics remain identical. Used as a masker as above, informational masking accounted for 37% of the spatial unmasking seen with this masker. This unintelligible and highly differentiable masker is unlikely to engage top-down processes. These data provide strong evidence of bottom-up masking involving speech-like, within-frequency modulations, and show that this presumably low-level process can be modulated by selective spatial attention.
Affiliation(s)
- Simon Carlile
- School of Medical Sciences and The Bosch Institute, University of Sydney, Sydney, NSW 2006, Australia
- Caitlin Corkhill
- School of Medical Sciences, University of Sydney, Sydney, NSW 2006, Australia
40
Getzmann S, Wascher E, Falkenstein M. What does successful speech-in-noise perception in aging depend on? Electrophysiological correlates of high and low performance in older adults. Neuropsychologia 2015; 70:43-57. [PMID: 25681737 DOI: 10.1016/j.neuropsychologia.2015.02.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2014] [Revised: 02/04/2015] [Accepted: 02/06/2015] [Indexed: 11/30/2022]
Abstract
Aging usually decreases the ability to understand language under difficult listening conditions. However, aging is also associated with increased between-subject variability. Here, we studied potential sources of inter-individual differences and investigated spoken language understanding of younger and older adults (age ranges 21-35 and 57-74 years, respectively) in a simulated "cocktail-party" scenario. A naturalistic "stock-price monitoring" task was employed in which prices of listed companies were simultaneously recited by four speakers at different locations in space. The participants responded when prices of a target company exceeded specific values, while ignoring all other companies. Based on individual performance levels, three subgroups of participants were formed: 12 high-performing older adults, 12 low-performing older adults, and 12 young adults matched to the high-performing older group. The analysis of the event-related brain potentials indicated that all older adults showed delayed attentional control (indicated by a later P2) and reduced speech processing (indicated by a reduced N400), relative to the younger adults. High-performing older adults differed from their low-performing counterparts in an increased allocation of attention and inhibitory control (indicated by a stronger P2-N2 complex). The results are consistent with the idea of an adjustment of mental resources that could help compensate for potential deficiencies in peripheral and central auditory processing.
Affiliation(s)
- Stephan Getzmann
- Leibniz Research Centre for Working Environment and Human Factors, Ardeystraße 67, D-44139 Dortmund, Germany
- Edmund Wascher
- Leibniz Research Centre for Working Environment and Human Factors, Ardeystraße 67, D-44139 Dortmund, Germany
- Michael Falkenstein
- Leibniz Research Centre for Working Environment and Human Factors, Ardeystraße 67, D-44139 Dortmund, Germany
41
Sayles M, Stasiak A, Winter IM. Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging "periodicity-tagged" segregation of competing speech in rooms. Front Syst Neurosci 2015; 8:248. [PMID: 25628545 PMCID: PMC4290552 DOI: 10.3389/fnsys.2014.00248] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2014] [Accepted: 12/18/2014] [Indexed: 11/26/2022] Open
Abstract
The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into “auditory objects.” Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus (VCN) of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation, specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging listening conditions.
Affiliation(s)
- Mark Sayles
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Arkadiusz Stasiak
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Ian M Winter
- Centre for the Neural Basis of Hearing, The Physiological Laboratory, Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
42
Choi I, Wang L, Bharadwaj H, Shinn-Cunningham B. Individual differences in attentional modulation of cortical responses correlate with selective attention performance. Hear Res 2014; 314:10-9. [PMID: 24821552 DOI: 10.1016/j.heares.2014.04.008] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/11/2013] [Revised: 04/18/2014] [Accepted: 04/23/2014] [Indexed: 11/29/2022]
Abstract
Many studies have shown that attention modulates the cortical representation of an auditory scene, emphasizing an attended source while suppressing competing sources. Yet, individual differences in the strength of this attentional modulation and their relationship with selective attention ability are poorly understood. Here, we ask whether differences in how strongly attention modulates cortical responses reflect differences in normal-hearing listeners' selective auditory attention ability. We asked listeners to attend to one of three competing melodies and identify its pitch contour while we measured cortical electroencephalographic responses. The three melodies were either from widely separated pitch ranges ("easy trials"), or from a narrow, overlapping pitch range ("hard trials"). The melodies started at slightly different times; listeners attended either the leading or the lagging melody. Because of the timing of the onsets, the leading melody drew attention exogenously. In contrast, attending to the lagging melody required listeners to direct top-down attention volitionally. We quantified how attention amplified the auditory N1 response to the attended melody and found large individual differences in N1 amplification, even though only correctly answered trials were used to quantify the ERP gain. Importantly, listeners with the strongest amplification of the N1 response to the lagging melody in the easy trials were the best performers across other types of trials. Our results raise the possibility that individual differences in the strength of top-down gain control reflect inherent differences in the ability to control top-down attention.
Affiliation(s)
- Inyong Choi
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA
- Le Wang
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA
- Hari Bharadwaj
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA; Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Barbara Shinn-Cunningham
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA; Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
43
Bones O, Hopkins K, Krishnan A, Plack CJ. Phase locked neural activity in the human brainstem predicts preference for musical consonance. Neuropsychologia 2014; 58:23-32. [PMID: 24690415 PMCID: PMC4040538 DOI: 10.1016/j.neuropsychologia.2014.03.011] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2013] [Revised: 03/20/2014] [Accepted: 03/21/2014] [Indexed: 11/03/2022]
Abstract
When musical notes are combined to make a chord, the closeness of fit of the combined spectrum to a single harmonic series (the 'harmonicity' of the chord) predicts the perceived consonance (how pleasant and stable the chord sounds; McDermott, Lehr, & Oxenham, 2010). The distinction between consonance and dissonance is central to Western musical form. Harmonicity is represented in the temporal firing patterns of populations of brainstem neurons. The current study investigates the role of brainstem temporal coding of harmonicity in the perception of consonance. Individual preference for consonant over dissonant chords was measured using a rating scale for pairs of simultaneous notes. In order to investigate the effects of cochlear interactions, notes were presented in two ways: both notes to both ears, or each note to a different ear. The electrophysiological frequency following response (FFR), reflecting sustained neural activity in the brainstem synchronised to the stimulus, was also measured. When both notes were presented to both ears, the perceptual distinction between consonant and dissonant chords was stronger than when the notes were presented to different ears. In the condition in which both notes were presented to both ears, additional low-frequency components, corresponding to difference tones resulting from nonlinear cochlear processing, were observable in the FFR, effectively enhancing the neural harmonicity of consonant chords but not dissonant chords. Suppressing the cochlear envelope component of the FFR also suppressed these additional frequency components. This suggests that, in the case of consonant chords, difference tones generated by interactions between notes in the cochlea enhance the perception of consonance. Furthermore, individuals with a greater distinction between consonant and dissonant chords in the FFR to individual harmonics had a stronger preference for consonant over dissonant chords. Overall, the results provide compelling evidence for the role of neural temporal coding in the perception of consonance, and suggest that the representation of harmonicity in phase-locked neural firing drives the perception of consonance.
Affiliation(s)
- Oliver Bones
- School of Psychological Sciences, The University of Manchester, Manchester M13 9PL, UK
- Kathryn Hopkins
- School of Psychological Sciences, The University of Manchester, Manchester M13 9PL, UK
- Ananthanarayan Krishnan
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA
- Christopher J Plack
- School of Psychological Sciences, The University of Manchester, Manchester M13 9PL, UK
44
Ihlefeld A, Kan A, Litovsky RY. Across-frequency combination of interaural time difference in bilateral cochlear implant listeners. Front Syst Neurosci 2014; 8:22. [PMID: 24653681 PMCID: PMC3949319 DOI: 10.3389/fnsys.2014.00022] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 01/29/2014] [Indexed: 11/13/2022] Open
Abstract
The current study examined how cochlear implant (CI) listeners combine temporally interleaved envelope-ITD information across two sites of stimulation. When two cochlear sites jointly transmit ITD information, one possibility is that CI listeners can extract the most reliable ITD cues available. As a result, ITD sensitivity would be sustained or enhanced compared to single-site stimulation. Alternatively, mutual interference across multiple sites of ITD stimulation could worsen dual-site performance compared to listening to the better of two electrode pairs. Two experiments used direct stimulation to examine how CI users can integrate ITDs across two pairs of electrodes. Experiment 1 tested ITD discrimination for two stimulation sites using 100-Hz sinusoidally modulated 1000-pps-carrier pulse trains. Experiment 2 used the same stimuli ramped with 100 ms windows, as a control condition with minimized onset cues. For all stimuli, performance improved monotonically with increasing modulation depth. Results show that when CI listeners are stimulated with electrode pairs at two cochlear sites, sensitivity to ITDs was similar to that seen when only the electrode pair with better sensitivity was activated. None of the listeners showed a decrement in performance from the worse electrode pair. This could be achieved either by listening to the better electrode pair or by truly integrating the information across cochlear sites.
Affiliation(s)
- Antje Ihlefeld
- Waisman Center, University of Wisconsin-Madison, Madison, WI, USA; Center for Neural Science, New York University, New York, NY, USA
- Alan Kan
- Waisman Center, University of Wisconsin-Madison, Madison, WI, USA
45
Bharadwaj HM, Verhulst S, Shaheen L, Liberman MC, Shinn-Cunningham BG. Cochlear neuropathy and the coding of supra-threshold sound. Front Syst Neurosci 2014; 8:26. [PMID: 24600357 PMCID: PMC3930880 DOI: 10.3389/fnsys.2014.00026] [Citation(s) in RCA: 185] [Impact Index Per Article: 18.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2013] [Accepted: 02/05/2014] [Indexed: 11/13/2022] Open
Abstract
Many listeners with hearing thresholds within the clinically normal range nonetheless complain of difficulty hearing in everyday settings and understanding speech in noise. Converging evidence from human and animal studies points to one potential source of such difficulties: differences in the fidelity with which supra-threshold sound is encoded in the early portions of the auditory pathway. Measures of auditory subcortical steady-state responses (SSSRs) in humans and animals support the idea that the temporal precision of the early auditory representation can be poor even when hearing thresholds are normal. In humans with normal hearing thresholds (NHTs), paradigms that require listeners to make use of the detailed spectro-temporal structure of supra-threshold sound, such as selective attention and discrimination of frequency modulation (FM), reveal individual differences that correlate with subcortical temporal coding precision. Animal studies show that noise exposure and aging can cause a loss of a large percentage of auditory nerve fibers (ANFs) without any significant change in measured audiograms. Here, we argue that cochlear neuropathy may reduce encoding precision of supra-threshold sound, and that this manifests both behaviorally and in SSSRs in humans. Furthermore, recent studies suggest that noise-induced neuropathy may be selective for higher-threshold, lower-spontaneous-rate nerve fibers. Based on our hypothesis, we suggest some approaches that may yield particularly sensitive, objective measures of supra-threshold coding deficits that arise due to neuropathy. Finally, we comment on the potential clinical significance of these ideas and identify areas for future investigation.
Affiliation(s)
- Hari M Bharadwaj
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA; Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Sarah Verhulst
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA; Department of Otology and Laryngology, Harvard Medical School, Boston, MA, USA
- Luke Shaheen
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA, USA; Harvard-MIT Division of Health Sciences and Technology, Speech and Hearing Bioscience and Technology Program, Cambridge, MA, USA
- M Charles Liberman
- Department of Otology and Laryngology, Harvard Medical School, Boston, MA, USA; Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA, USA; Harvard-MIT Division of Health Sciences and Technology, Speech and Hearing Bioscience and Technology Program, Cambridge, MA, USA
- Barbara G Shinn-Cunningham
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA, USA; Department of Biomedical Engineering, Boston University, Boston, MA, USA
46
Kidd G, Mason CR, Best V. The role of syntax in maintaining the integrity of streams of speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:766-77. [PMID: 25234885 PMCID: PMC3986016 DOI: 10.1121/1.4861354] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2013] [Revised: 12/13/2013] [Accepted: 12/23/2013] [Indexed: 05/21/2023]
Abstract
This study examined the ability of listeners to utilize syntactic structure to extract a target stream of speech from among competing sounds. Target talkers were identified by voice or location, which was held constant throughout a test utterance, and paired with correct or incorrect (random word order) target sentence syntax. Both voice and location provided reliable cues for identifying target speech even when other features varied unpredictably. The target sentences were masked either by predominantly energetic maskers (noise bursts) or by predominantly informational maskers (similar speech in random word order). When the maskers were noise bursts, target sentence syntax had relatively minor effects on identification performance. However, when the maskers were other talkers, correct target sentence syntax resulted in significantly better speech identification performance than incorrect syntax. Furthermore, conformance to correct syntax alone was sufficient to accurately identify the target speech. The results were interpreted as supporting the idea that the predictability of the elements comprising streams of speech, as manifested by syntactic structure, is an important factor in binding words together into coherent streams. Furthermore, these findings suggest that predictability is particularly important for maintaining the coherence of an auditory stream over time under conditions high in informational masking.
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
- Christine R Mason
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
- Virginia Best
- National Acoustic Laboratories, Macquarie University, New South Wales 2109, Australia
47
Schwartz AH, Shinn-Cunningham BG. Effects of dynamic range compression on spatial selective auditory attention in normal-hearing listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:2329-2339. [PMID: 23556599 PMCID: PMC3631248 DOI: 10.1121/1.4794386] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/05/2012] [Revised: 02/13/2013] [Accepted: 02/14/2013] [Indexed: 05/31/2023]
Abstract
Many hearing aids introduce compressive gain to accommodate the reduced dynamic range that often accompanies hearing loss. However, natural sounds produce complicated temporal dynamics in hearing aid compression, as gain is driven by whichever source dominates at a given moment. Moreover, independent compression at the two ears can introduce fluctuations in interaural level differences (ILDs) important for spatial perception. While independent compression can interfere with spatial perception of sound, it does not always interfere with localization accuracy or speech identification. Here, normal-hearing listeners reported a target message played simultaneously with two spatially separated masker messages. We measured the amount of spatial separation required between the target and maskers for subjects to perform at threshold in this task. Fast, syllabic compression that was independent at the two ears increased the required spatial separation, but linking the compressors to provide identical gain to both ears (preserving ILDs) restored much of the deficit caused by fast, independent compression. Effects were less clear for slower compression. Percent-correct performance was lower with independent compression, but only for small spatial separations. These results may help explain differences in previous reports of the effect of compression on spatial perception of sound.
Affiliation(s)
- Andrew H Schwartz
- Harvard/Massachusetts Institute of Technology, Speech and Hearing Bioscience and Technology Program, Cambridge, Massachusetts 02139, USA
48
Shinn-Cunningham B, Ruggles DR, Bharadwaj H. How early aging and environment interact in everyday listening: from brainstem to behavior through modeling. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2013; 787:501-10. [PMID: 23716257 PMCID: PMC4629495 DOI: 10.1007/978-1-4614-1590-9_55] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]
Abstract
We recently showed that listeners with normal hearing thresholds vary in their ability to direct spatial attention and that ability is related to the fidelity of temporal coding in the brainstem. Here, we recruited additional middle-aged listeners and extended our analysis of the brainstem response, measured using the frequency-following response (FFR). We found that even though age does not predict overall selective attention ability, middle-aged listeners are more susceptible to the detrimental effects of reverberant energy than young adults. We separated the overall FFR into orthogonal envelope and carrier components and used an existing model to predict which auditory channels drive each component. We find that responses in mid- to high-frequency auditory channels dominate envelope FFR, while lower-frequency channels dominate the carrier FFR. Importantly, we find that which component of the FFR predicts selective attention performance changes with age. We suggest that early aging degrades peripheral temporal coding in mid-to-high frequencies, interfering with the coding of envelope interaural time differences. We argue that, compared to young adults, middle-aged listeners, who do not have strong temporal envelope coding, have more trouble following a conversation in a reverberant room because they are forced to rely on fragile carrier ITDs that are susceptible to the degrading effects of reverberation.
Affiliation(s)
- Barbara Shinn-Cunningham
- Department of Biomedical Engineering, Boston University Center for Computational Neuroscience and Neural Technology, Boston, MA 02215, USA
49
Ruggles D, Bharadwaj H, Shinn-Cunningham BG. Why middle-aged listeners have trouble hearing in everyday settings. Curr Biol 2012; 22:1417-22. [PMID: 22727697 DOI: 10.1016/j.cub.2012.05.025] [Citation(s) in RCA: 101] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2012] [Revised: 04/20/2012] [Accepted: 05/11/2012] [Indexed: 10/28/2022]
Abstract
Anecdotally, middle-aged listeners report difficulty conversing in social settings, even when they have normal audiometric thresholds [1-3]. Moreover, young adult listeners with "normal" hearing vary in their ability to selectively attend to speech amid similar streams of speech. Ignoring age, these individual differences correlate with physiological differences in temporal coding precision present in the auditory brainstem, suggesting that the fidelity of encoding of suprathreshold sound helps explain individual differences [4]. Here, we revisit the conundrum of whether early aging influences an individual's ability to communicate in everyday settings. Although absolute selective attention ability is not predicted by age, reverberant energy interferes more with selective attention as age increases. Breaking the brainstem response down into components corresponding to coding of stimulus fine structure and envelope, we find that age alters which brainstem component predicts performance. Specifically, middle-aged listeners appear to rely heavily on temporal fine structure, which is more disrupted by reverberant energy than temporal envelope structure is. In contrast, the fidelity of envelope cues predicts performance in younger adults. These results hint that temporal envelope cues influence spatial hearing in reverberant settings more than is commonly appreciated and help explain why middle-aged listeners have particular difficulty communicating in daily life.
Affiliation(s)
- Dorea Ruggles
- Department of Biomedical Engineering, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, MA 02215, USA
50
Maddox RK, Billimoria CP, Perrone BP, Shinn-Cunningham BG, Sen K. Competing sound sources reveal spatial effects in cortical processing. PLoS Biol 2012; 10:e1001319. [PMID: 22563301 PMCID: PMC3341327 DOI: 10.1371/journal.pbio.1001319] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2011] [Accepted: 03/20/2012] [Indexed: 11/18/2022] Open
Abstract
Why is spatial tuning in auditory cortex weak, even though location is important to object recognition in natural settings? This question continues to vex neuroscientists focused on linking physiological results to auditory perception. Here we show that the spatial locations of simultaneous, competing sound sources dramatically influence how well neural spike trains recorded from the zebra finch field L (an analog of mammalian primary auditory cortex) encode source identity. We find that the location of a birdsong played in quiet has little effect on the fidelity of the neural encoding of the song. However, when the song is presented along with a masker, spatial effects are pronounced. For each spatial configuration, a subset of neurons encodes song identity more robustly than others. As a result, competing sources from different locations dominate responses of different neural subpopulations, helping to separate neural responses into independent representations. These results help elucidate how cortical processing exploits spatial information to provide a substrate for selective spatial auditory attention.
Affiliation(s)
- Ross K. Maddox
- Hearing Research Center, Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Center for Biodynamics, Boston University, Boston, Massachusetts, United States of America
- Cyrus P. Billimoria
- Hearing Research Center, Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Center for Biodynamics, Boston University, Boston, Massachusetts, United States of America
- Ben P. Perrone
- Hearing Research Center, Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Center for Biodynamics, Boston University, Boston, Massachusetts, United States of America
- Barbara G. Shinn-Cunningham
- Hearing Research Center, Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Center for Computational Neuroscience and Neural Technology, Boston University, Boston, Massachusetts, United States of America
- Kamal Sen
- Hearing Research Center, Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Center for Biodynamics, Boston University, Boston, Massachusetts, United States of America