51
Feng T, Chen Q, Xiao Z. Age-Related Differences in the Effects of Masker Cuing on Releasing Chinese Speech From Informational Masking. Front Psychol 2018; 9:1922. [PMID: 30356784 PMCID: PMC6189421 DOI: 10.3389/fpsyg.2018.01922]
Abstract
The aims of the present study were to examine whether familiarity with a masker improves word recognition in speech-masking situations and whether there are age-related differences in the effects of masker cuing. Thirty-two older listeners (range = 59–74; mean age = 66.41 years) with high-frequency hearing loss and 32 younger normal-hearing listeners (range = 21–28; mean age = 23.73 years) participated in this study, all of whom spoke Chinese as their first language. Two experiments were conducted, with 16 younger and 16 older listeners in each. The masker, which differed in content from the target speech, was a continuous recording of syntactically correct but semantically meaningless Chinese sentences spoken by two talkers. The masker level was adjusted to produce signal-to-masker ratios of -12, -8, -4, and 0 dB for the younger participants and -8, -4, 0, and 4 dB for the older participants. Under masker-priming conditions, a priming sentence spoken by the masker talkers was presented in quiet three times before a target sentence was presented together with a masker sentence 4 s later. In Experiment 1, which used same-sentence masker-priming (a prime identical to the masker sentence), priming improved the identification of the target sentence for both age groups compared to when no priming was provided. However, the amount of masking release was smaller in the older adults than in the younger adults. In Experiment 2, two kinds of primes were considered: same-sentence masker-priming and different-sentence masker-priming (a prime differing from the masker sentence in content for each keyword). The results of Experiment 2 showed that both kinds of primes improved the identification of the targets for both age groups. However, the release from speech masking in both priming conditions was smaller in the older adults than in the younger adults, and the release from speech masking in both age groups was greater with same-sentence masker-priming than with different-sentence masker-priming. These results suggest that both the voice and content cues of a masker can be used to release target speech from maskers in noisy listening conditions. Furthermore, there was an age-related decline in masker-priming-induced release from speech masking.
Affiliation(s)
- Tianquan Feng
- College of Teacher Education, Nanjing Normal University, Nanjing, China; State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Qingrong Chen
- School of Psychology, Nanjing Normal University, Nanjing, China
- Zhongdang Xiao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China

52
Kreitewolf J, Mathias SR, Trapeau R, Obleser J, Schönwiesner M. Perceptual grouping in the cocktail party: Contributions of voice-feature continuity. J Acoust Soc Am 2018; 144:2178. [PMID: 30404485 DOI: 10.1121/1.5058684]
Abstract
Cocktail parties pose a difficult yet solvable problem for the auditory system. Previous work has shown that the cocktail-party problem is considerably easier when all sounds in the target stream are spoken by the same talker (the voice-continuity benefit). The present study investigated the contributions of two of the most salient voice features-glottal-pulse rate (GPR) and vocal-tract length (VTL)-to the voice-continuity benefit. Twenty young, normal-hearing listeners participated in two experiments. On each trial, listeners heard concurrent sequences of spoken digits from three different spatial locations and reported the digits coming from a target location. Critically, across conditions, GPR and VTL either remained constant or varied across target digits. Additionally, across experiments, the target location either remained constant (Experiment 1) or varied (Experiment 2) within a trial. In Experiment 1, listeners benefited from continuity in either voice feature, but VTL continuity was more helpful than GPR continuity. In Experiment 2, spatial discontinuity greatly hindered listeners' abilities to exploit continuity in GPR and VTL. The present results suggest that selective attention benefits from continuity in target voice features and that VTL and GPR play different roles for perceptual grouping and stream segregation in the cocktail party.
Affiliation(s)
- Jens Kreitewolf
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Samuel R Mathias
- Neurocognition, Neurocomputation and Neurogenetics (n3) Division, Yale University School of Medicine, 40 Temple Street, New Haven, Connecticut 06511, USA
- Régis Trapeau
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Jonas Obleser
- Department of Psychology, University of Lübeck, Maria-Goeppert-Straße 9a, D-23562 Lübeck, Germany
- Marc Schönwiesner
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada

53
Auditory cortex responses to interaural time differences in the envelope of low-frequency sound, recorded with MEG in young and older listeners. Hear Res 2018; 370:22-39. [PMID: 30265860 DOI: 10.1016/j.heares.2018.09.001]
Abstract
Interaural time and intensity differences (ITD and IID) are important cues in binaural hearing: they allow for sound localization, improve speech understanding in noise and reverberation, and support the integration of sound sources in the auditory scene. Whereas previous research showed that the upper frequency limit for ITD detection in the fine structure of sound declines with aging, the processing of envelope ITD in low-frequency amplitude-modulated (AM) sound and the related brain responses are less well understood. This study investigated the cortical processing of envelope ITD and compared the results with previous findings on fine-structure ITD. In two experiments, participants listened to 40-Hz AM tones containing sudden changes in the envelope ITD. Multiple MEG responses were analyzed, including the auditory evoked N1 responses elicited by sound onsets and by ITD changes, and the 40-Hz responses elicited by the AM. The first experiment, with healthy young adults, revealed a substantial decline in the magnitudes of the ITD-change N1 response and of the 40-Hz phase reset at higher carrier frequencies, suggesting a frequency characteristic similar to that observed for fine-structure ITD. The amplitude of the 40-Hz auditory steady-state response (ASSR) declined only gradually with increasing carrier frequency and was excluded as a confounding factor in the decline of the ITD response. Larger responses to outward than to inward ITD changes, reported here for the first time for envelope ITD, were another characteristic shared with fine-structure ITD. A second experiment, with groups of young and older listeners, examined the effects of aging and concurrent noise on the cortical envelope ITD responses. One important research question was whether binaural cues remain accessible in noise. Behavioural tests showed an age-related hearing loss in the older group and decreased performance in envelope ITD detection and speech-in-noise (SIN) understanding. Binaural hearing and SIN performance were correlated with one another, but not with hearing loss. The frequency limit for envelope ITD was reduced in older listeners, similar to what has previously been found for fine-structure ITD, and older listeners were more susceptible to concurrent multi-talker noise. The similarities between responses to envelope ITD and to fine-structure ITD suggest that a common cortical code exists for envelope and fine-structure ITD. The dependency on the carrier frequency suggests that envelope ITD processing at the subcortical level requires stimulus phase locking, which might be reduced in aging.
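
The stimulus design lends itself to a compact illustration. The following is a minimal sketch (Python/NumPy; the sample rate, carrier frequency, duration, and 0.5-ms ITD value are illustrative assumptions, not the study's exact parameters) of a 40-Hz AM tone in which the envelope alone carries an abrupt ITD change:

```python
import numpy as np

fs = 48000        # sample rate in Hz (assumption)
fc = 250.0        # carrier frequency in Hz (assumption)
fm = 40.0         # AM rate, as in the study
dur = 2.0         # duration in s (assumption)
itd = 0.5e-3      # envelope ITD after the change, in s (assumption)

t = np.arange(int(fs * dur)) / fs
carrier = np.sin(2 * np.pi * fc * t)

def am_envelope(delay):
    """100%-modulated raised-cosine 40-Hz envelope, shifted by `delay` s."""
    return 0.5 * (1.0 - np.cos(2 * np.pi * fm * (t - delay)))

# First half: zero envelope ITD; second half: the right-ear envelope lags.
half = len(t) // 2
left_env = am_envelope(0.0)
right_env = np.concatenate([am_envelope(0.0)[:half], am_envelope(itd)[half:]])

left = left_env * carrier    # the carrier is identical at the two ears,
right = right_env * carrier  # so only the envelope conveys the ITD
stereo = np.stack([left, right], axis=1)
```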

54
Schwartz ZP, David SV. Focal Suppression of Distractor Sounds by Selective Attention in Auditory Cortex. Cereb Cortex 2018; 28:323-339. [PMID: 29136104 PMCID: PMC6057511 DOI: 10.1093/cercor/bhx288]
Abstract
Auditory selective attention is required for parsing crowded acoustic environments, but cortical systems mediating the influence of behavioral state on auditory perception are not well characterized. Previous neurophysiological studies suggest that attention produces a general enhancement of neural responses to important target sounds versus irrelevant distractors. However, behavioral studies suggest that in the presence of masking noise, attention provides a focal suppression of distractors that compete with targets. Here, we compared effects of attention on cortical responses to masking versus non-masking distractors, controlling for effects of listening effort and general task engagement. We recorded single-unit activity from primary auditory cortex (A1) of ferrets during behavior and found that selective attention decreased responses to distractors masking targets in the same spectral band, compared with spectrally distinct distractors. This suppression enhanced neural target detection thresholds, suggesting that limited attention resources serve to focally suppress responses to distractors that interfere with target detection. Changing effort by manipulating target salience consistently modulated spontaneous but not evoked activity. Task engagement and changing effort tended to affect the same neurons, while attention affected an independent population, suggesting that distinct feedback circuits mediate effects of attention and effort in A1.
Affiliation(s)
- Zachary P Schwartz
- Neuroscience Graduate Program, Oregon Health and Science University, OR, USA
- Stephen V David
- Oregon Hearing Research Center, Oregon Health and Science University, OR, USA
- Address correspondence to Stephen V. David, Oregon Hearing Research Center, Oregon Health and Science University, 3181 SW Sam Jackson Park Road, MC L335A, Portland, OR 97239, USA.

55
Bologna WJ, Vaden KI, Ahlstrom JB, Dubno JR. Age effects on perceptual organization of speech: Contributions of glimpsing, phonemic restoration, and speech segregation. J Acoust Soc Am 2018; 144:267. [PMID: 30075693 PMCID: PMC6047943 DOI: 10.1121/1.5044397]
Abstract
In realistic listening environments, speech perception requires grouping together audible fragments of speech, filling in missing information, and segregating the glimpsed target from the background. The purpose of this study was to determine the extent to which age-related difficulties with these tasks can be explained by declines in glimpsing, phonemic restoration, and/or speech segregation. Younger and older adults with normal hearing listened to sentences interrupted with silence or envelope-modulated noise, presented either in quiet or with a competing talker. Older adults were poorer than younger adults at recognizing keywords based on short glimpses but benefited more when envelope-modulated noise filled silent intervals. Recognition declined with a competing talker but this effect did not interact with age. Results of cognitive tasks indicated that faster processing speed and better visual-linguistic closure were predictive of better speech understanding. Taken together, these results suggest that age-related declines in speech recognition may be partially explained by difficulty grouping short glimpses of speech into a coherent message.
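
As a rough illustration of the interruption paradigm described above, the sketch below (Python/NumPy; the 2.5-Hz interruption rate, 50% duty cycle, and 10-ms envelope window are assumptions, not the study's parameters) gates a signal periodically and optionally fills the silent intervals with noise modulated by the signal's own broadband envelope:

```python
import numpy as np

def interrupt(signal, fs, rate_hz=2.5, duty=0.5, fill_with_noise=False):
    """Periodically gate `signal`; optionally fill the gaps with
    envelope-modulated noise."""
    n = len(signal)
    t = np.arange(n) / fs
    gate = ((t * rate_hz) % 1.0) < duty   # True = speech on, False = gap
    out = signal * gate
    if fill_with_noise:
        # Crude broadband envelope: rectify, then smooth with a 10-ms window.
        m = int(0.01 * fs)
        env = np.convolve(np.abs(signal), np.ones(m) / m, mode="same")
        noise = np.random.default_rng(0).standard_normal(n)
        out = out + ~gate * env * noise   # noise appears only in the gaps
    return out
```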
Affiliation(s)
- William J Bologna
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425, USA
- Kenneth I Vaden
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425, USA
- Jayne B Ahlstrom
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425, USA
- Judy R Dubno
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, South Carolina 29425, USA

57
Popham S, Boebinger D, Ellis DPW, Kawahara H, McDermott JH. Inharmonic speech reveals the role of harmonicity in the cocktail party problem. Nat Commun 2018; 9:2122. [PMID: 29844313 PMCID: PMC5974276 DOI: 10.1038/s41467-018-04551-8]
Abstract
The "cocktail party problem" requires us to discern individual sound sources from mixtures of sources. The brain must use knowledge of natural sound regularities for this purpose. One much-discussed regularity is the tendency for frequencies to be harmonically related (integer multiples of a fundamental frequency). To test the role of harmonicity in real-world sound segregation, we developed speech analysis/synthesis tools to perturb the carrier frequencies of speech, disrupting harmonic frequency relations while maintaining the spectrotemporal envelope that determines phonemic content. We find that violations of harmonicity cause individual frequencies of speech to segregate from each other, impair the intelligibility of concurrent utterances despite leaving intelligibility of single utterances intact, and cause listeners to lose track of target talkers. However, additional segregation deficits result from replacing harmonic frequencies with noise (simulating whispering), suggesting additional grouping cues enabled by voiced speech excitation. Our results demonstrate acoustic grouping cues in real-world sound segregation.
Affiliation(s)
- Sara Popham
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, 02139, USA
- Helen Wills Neuroscience Institute, UC Berkeley, Berkeley, CA, 94720, USA
- Dana Boebinger
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, 02139, USA
- Program in Speech and Hearing Sciences, Harvard University, Cambridge, MA, 02138, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, 02139, USA
- Program in Speech and Hearing Sciences, Harvard University, Cambridge, MA, 02138, USA

58
Cueing listeners to attend to a target talker progressively improves word report as the duration of the cue-target interval lengthens to 2,000 ms. Atten Percept Psychophys 2018; 80:1520-1538. [PMID: 29696570 DOI: 10.3758/s13414-018-1531-x]
Abstract
Endogenous attention is typically studied by presenting instructive cues in advance of a target stimulus array. For endogenous visual attention, task performance improves as the duration of the cue-target interval increases up to 800 ms. Less is known about how endogenous auditory attention unfolds over time or the mechanisms by which an instructive cue presented in advance of an auditory array improves performance. The current experiment used five cue-target intervals (0, 250, 500, 1,000, and 2,000 ms) to compare four hypotheses for how preparatory attention develops over time in a multi-talker listening task. Young adults were cued to attend to a target talker who spoke in a mixture of three talkers. Visual cues indicated the target talker's spatial location or their gender. Participants directed attention to location and gender simultaneously ("objects") at all cue-target intervals. Participants were consistently faster and more accurate at reporting words spoken by the target talker when the cue-target interval was 2,000 ms than 0 ms. In addition, the latency of correct responses progressively shortened as the duration of the cue-target interval increased from 0 to 2,000 ms. These findings suggest that the mechanisms involved in preparatory auditory attention develop gradually over time, taking at least 2,000 ms to reach optimal configuration, yet providing cumulative improvements in speech intelligibility as the duration of the cue-target interval increases from 0 to 2,000 ms. These results demonstrate an improvement in performance for cue-target intervals longer than those that have been reported previously in the visual or auditory modalities.

59
Senior B, Babel M. The role of unfamiliar accents in competing speech. J Acoust Soc Am 2018; 143:931. [PMID: 29495681 DOI: 10.1121/1.5023681]
Abstract
A listener's ability to comprehend one speaker against a background of other speech (a phenomenon dubbed the cocktail party problem) varies according to the properties of the speech streams and the listener. Although a number of factors that contribute to a listener's ability to successfully segregate two simultaneous speech signals have been identified, comparatively little work has focused on the role accents may play in this process. To this end, familiar Canadian-accented voices and unfamiliar British-accented voices were used in a competing-talker task. Native speakers of Canadian English heard two different talkers simultaneously read sentences in the form of "[command] [colour] [preposition] [letter] [number] [adverb]" (e.g., "Lay blue at C4 now") and reported the coordinate from a target talker. Results indicate that on all but the most challenging trials, listeners did best when attending to an unfamiliar-accented target against a familiar-accented masker and performed worse when forced to ignore the unfamiliar accent. These results suggest listeners can easily tune out a familiar accent but are unable to do the same with an unfamiliar accent, indicating that unfamiliar accents are more effective maskers.
Affiliation(s)
- Brianne Senior
- School of Audiology and Speech Science, University of British Columbia, 2177 Wesbrook Mall, Vancouver, BC V6T 1Z4, Canada
- Molly Babel
- Department of Linguistics, University of British Columbia, 2613 West Mall, Vancouver, BC V6T 1Z4, Canada

60
Rowland SC, Hartley DEH, Wiggins IM. Listening in Naturalistic Scenes: What Can Functional Near-Infrared Spectroscopy and Intersubject Correlation Analysis Tell Us About the Underlying Brain Activity? Trends Hear 2018; 22:2331216518804116. [PMID: 30345888 PMCID: PMC6198387 DOI: 10.1177/2331216518804116]
Abstract
Listening to speech in the noisy conditions of everyday life can be effortful, reflecting the increased cognitive workload involved in extracting meaning from a degraded acoustic signal. Studying the underlying neural processes has the potential to provide mechanistic insight into why listening is effortful under certain conditions. In a move toward studying listening effort under ecologically relevant conditions, we used the silent and flexible neuroimaging technique functional near-infrared spectroscopy (fNIRS) to examine brain activity during attentive listening to speech in naturalistic scenes. Thirty normally hearing participants listened to a series of narratives continuously varying in acoustic difficulty while undergoing fNIRS imaging. Participants then listened to another set of closely matched narratives and rated perceived effort and intelligibility for each scene. As expected, self-reported effort generally increased with worsening signal-to-noise ratio. After controlling for better-ear signal-to-noise ratio, perceived effort was greater in scenes that contained competing speech than in those that did not, potentially reflecting an additional cognitive cost of overcoming informational masking. We analyzed the fNIRS data using intersubject correlation, a data-driven approach suitable for analyzing data collected under naturalistic conditions. Significant intersubject correlation was seen in the bilateral auditory cortices and in a range of channels across the prefrontal cortex. The involvement of prefrontal regions is consistent with the notion that higher order cognitive processes are engaged during attentive listening to speech in complex real-world conditions. However, further research is needed to elucidate the relationship between perceived listening effort and activity in these extended cortical networks.
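
The intersubject correlation measure itself is simple to compute. The sketch below (Python/NumPy) uses a leave-one-out formulation, which is one common way this analysis is set up; the array shape and simulated data are assumptions, not the authors' exact pipeline:

```python
import numpy as np

def intersubject_correlation(data):
    """data: array of shape (n_subjects, n_channels, n_samples).
    Returns an (n_subjects, n_channels) array of correlations between each
    subject's time course and the mean time course of all other subjects."""
    n_subj, n_chan, _ = data.shape
    isc = np.zeros((n_subj, n_chan))
    for s in range(n_subj):
        others = np.mean(np.delete(data, s, axis=0), axis=0)  # leave-one-out mean
        for c in range(n_chan):
            isc[s, c] = np.corrcoef(data[s, c], others[c])[0, 1]
    return isc

# Example with simulated data: 30 subjects, 44 channels, 1000 samples.
demo = np.random.default_rng(2).standard_normal((30, 44, 1000))
mean_isc_per_channel = intersubject_correlation(demo).mean(axis=0)
```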
Affiliation(s)
- Stephen C. Rowland
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Douglas E. H. Hartley
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK
- Nottingham University Hospitals NHS Trust, Queens Medical Centre, UK
- Ian M. Wiggins
- National Institute for Health Research Nottingham Biomedical Research Centre, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- Medical Research Council Institute of Hearing Research, School of Medicine, University of Nottingham, UK

61
Oberem J, Seibold J, Koch I, Fels J. Intentional switching in auditory selective attention: Exploring attention shifts with different reverberation times. Hear Res 2017; 359:32-39. [PMID: 29305038 DOI: 10.1016/j.heares.2017.12.013]
Abstract
Using a well-established binaural-listening paradigm, the ability to intentionally switch auditory selective attention was examined under anechoic, low-reverberation (0.8 s), and high-reverberation (1.75 s) conditions. Twenty-three young, normal-hearing subjects were tested in a within-subject design to analyze the influence of reverberation time. Spoken word pairs from two speakers were presented simultaneously to subjects from two of eight azimuth positions. The stimuli consisted of a single number word (i.e., 1 to 9) followed by either the direction "UP" or "DOWN" in German. Guided by a visual cue presented prior to auditory stimulus onset and indicating the position of the target speaker, subjects were asked to identify whether the target number was numerically smaller or greater than five and to categorize the direction of the second word. Switch costs (i.e., reaction-time differences between a position switch of the target and a position repetition) were larger under the high-reverberation condition. Furthermore, error rates were highly dependent on reverberant energy, and reverberation interacted with the congruence effect (i.e., stimuli spoken by the target and the distractor may evoke the same answer (congruent) or different answers (incongruent)), with larger congruence effects at longer reverberation times.
Affiliation(s)
- Josefa Oberem
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany
- Julia Seibold
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany
- Iring Koch
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany
- Janina Fels
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany

62
Jakien KM, Kampel SD, Stansell MM, Gallun FJ. Validating a Rapid, Automated Test of Spatial Release From Masking. Am J Audiol 2017; 26:507-518. [PMID: 28973106 PMCID: PMC5968328 DOI: 10.1044/2017_aja-17-0013]
Abstract
PURPOSE To evaluate the test-retest reliability of a headphone-based spatial-release-from-masking task with two maskers (referred to here as the SR2) and to describe its relationship to the same test done over loudspeakers in an anechoic chamber (the SR2A). We explore what thresholds tell us about certain populations (such as older individuals or individuals with hearing impairment) and discuss how the SR2 might be useful in the clinic. METHOD Fifty-four participants completed speech intelligibility tests in which a target phrase and two masking phrases from the Coordinate Response Measure corpus (Bolia, Nelson, Ericson, & Simpson, 2000) were presented either via earphones using a virtual spatial array or via loudspeakers in an anechoic chamber. For the SR2, the target sentence was always at 0° azimuth, and the maskers were either colocated at 0° or positioned at ±45°. For the SR2A, the target was located at 0°, and the maskers were colocated or located at ±15°, ±30°, ±45°, ±90°, or ±135°. Spatial release from masking was determined as the difference between thresholds in the colocated condition and each spatially separated condition. All participants completed the SR2 at least twice, and 29 of the individuals who completed the SR2 at least twice also participated in the SR2A. In a second experiment, 40 participants completed the SR2 8 times, and changes in performance were evaluated as a function of test repetition. RESULTS Mean thresholds were slightly better on the SR2 after the first repetition but were consistent across 8 subsequent testing sessions. Performance was consistent for the SR2A, regardless of the number of times testing was repeated. The SR2, which simulates 45° separations of target and maskers, produced spatially separated thresholds that were similar to thresholds obtained with 30° of separation in the anechoic chamber. Over headphones and in the anechoic chamber, pure-tone average was a strong predictor of spatial release, whereas age only reached significance for colocated conditions. CONCLUSIONS The SR2 is a reliable and effective method of testing spatial release from masking, suitable for screening abnormal listening abilities and for tracking rehabilitation over time. Future work should focus on developing and validating rapid, automated testing to identify the ability of listeners to benefit from high-frequency amplification, smaller spatial separations, and larger spectral differences among talkers.
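
The outcome measure reduces to a simple difference score: spatial release from masking is the colocated threshold minus the spatially separated threshold. A minimal sketch, with invented threshold values for illustration:

```python
# Target-to-masker ratios at threshold, in dB (values invented for illustration).
colocated_threshold = 2.5     # maskers colocated with the target at 0 degrees
separated_threshold = -5.0    # maskers at +/-45 degrees

srm_db = colocated_threshold - separated_threshold  # positive = separation benefit
print(f"Spatial release from masking: {srm_db:.1f} dB")
```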
Affiliation(s)
- Kasey M. Jakien
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, U.S. Department of Veterans Affairs, OR
- Department of Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland
- Sean D. Kampel
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, U.S. Department of Veterans Affairs, OR
- Department of Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland
- Meghan M. Stansell
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, U.S. Department of Veterans Affairs, OR
- Frederick J. Gallun
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, U.S. Department of Veterans Affairs, OR
- Department of Otolaryngology/Head & Neck Surgery, Oregon Health & Science University, Portland

63
Shinn-Cunningham B. Cortical and Sensory Causes of Individual Differences in Selective Attention Ability Among Listeners With Normal Hearing Thresholds. J Speech Lang Hear Res 2017; 60:2976-2988. [PMID: 29049598 PMCID: PMC5945067 DOI: 10.1044/2017_jslhr-h-17-0080]
Abstract
PURPOSE This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings. METHOD The results from neuroscience and psychoacoustics are reviewed. RESULTS In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing." CONCLUSIONS How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise. PRESENTATION VIDEO http://cred.pubs.asha.org/article.aspx?articleid=2601617.
Affiliation(s)
- Barbara Shinn-Cunningham
- Center for Research in Sensory Communication and Emerging Neural Technology, Boston University, MA

64
David M, Lavandier M, Grimault N, Oxenham AJ. Discrimination and streaming of speech sounds based on differences in interaural and spectral cues. J Acoust Soc Am 2017; 142:1674. [PMID: 28964066 PMCID: PMC5617732 DOI: 10.1121/1.5003809]
Abstract
Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.
Affiliation(s)
- Marion David
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Mathieu Lavandier
- Univ Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue Maurice Audin, 69518 Vaulx-en-Velin Cedex, France
- Nicolas Grimault
- Centre de Recherche en Neurosciences de Lyon, Université Lyon 1, Cognition Auditive et Psychoacoustique, Avenue Tony Garnier, 69366 Lyon Cedex 07, France
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA

65
Koelewijn T, Versfeld NJ, Kramer SE. Effects of attention on the speech reception threshold and pupil response of people with impaired and normal hearing. Hear Res 2017; 354:56-63. [PMID: 28869841 DOI: 10.1016/j.heares.2017.08.006]
Abstract
For people with hearing difficulties, following a conversation in a noisy environment requires substantial cognitive processing, which is often perceived as effortful. Recent studies with normal hearing (NH) listeners showed that the pupil dilation response, a measure of cognitive processing load, is affected by 'attention related' processes. How these processes affect the pupil dilation response for hearing impaired (HI) listeners remains unknown. Therefore, the current study investigated the effect of auditory attention on various pupil response parameters for 15 NH adults (median age 51 yrs.) and 15 adults with mild to moderate sensorineural hearing loss (median age 52 yrs.). Both groups listened to two different sentences presented simultaneously, one to each ear and partially masked by stationary noise. Participants had to repeat either both sentences or only one, for which they had to divide or focus attention, respectively. When repeating one sentence, the target sentence location (left or right) was either randomized or blocked across trials, which in the latter case allowed for a better spatial focus of attention. The speech-to-noise ratio was adjusted to yield about 50% sentences correct for each task and condition. NH participants had lower ('better') speech reception thresholds (SRT) than HI participants. The pupil measures showed no between-group effects, with the exception of a shorter peak latency for HI participants, which indicated a shorter processing time. Both groups showed higher SRTs and a larger pupil dilation response when two sentences were processed instead of one. Additionally, SRTs were higher and dilation responses were larger for both groups when the target location was randomized instead of fixed. We conclude that although HI participants could cope with less noise than the NH group, their ability to focus attention on a single talker, thereby improving SRTs and lowering cognitive processing load, was preserved. Shorter peak latencies could indicate that HI listeners adapt their listening strategy by not processing some information, which reduces processing time and thereby listening effort.
Affiliation(s)
- Thomas Koelewijn
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health Research Institute, VU University Medical Center, Amsterdam, The Netherlands
- Niek J Versfeld
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health Research Institute, VU University Medical Center, Amsterdam, The Netherlands
- Sophia E Kramer
- Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and Amsterdam Public Health Research Institute, VU University Medical Center, Amsterdam, The Netherlands

66
Peripheral hearing loss reduces the ability of children to direct selective attention during multi-talker listening. Hear Res 2017; 350:160-172. [PMID: 28505526 DOI: 10.1016/j.heares.2017.05.005]
Abstract
Restoring normal hearing requires knowledge of how peripheral and central auditory processes are affected by hearing loss. Previous research has focussed primarily on peripheral changes following sensorineural hearing loss, whereas consequences for central auditory processing have received less attention. We examined the ability of hearing-impaired children to direct auditory attention to a voice of interest (based on the talker's spatial location or gender) in the presence of a common form of background noise: the voices of competing talkers (i.e. during multi-talker, or "Cocktail Party" listening). We measured brain activity using electro-encephalography (EEG) when children prepared to direct attention to the spatial location or gender of an upcoming target talker who spoke in a mixture of three talkers. Compared to normally-hearing children, hearing-impaired children showed significantly less evidence of preparatory brain activity when required to direct spatial attention. This finding is consistent with the idea that hearing-impaired children have a reduced ability to prepare spatial attention for an upcoming talker. Moreover, preparatory brain activity was not restored when hearing-impaired children listened with their acoustic hearing aids. An implication of these findings is that steps to improve auditory attention alongside acoustic hearing aids may be required to improve the ability of hearing-impaired children to understand speech in the presence of competing talkers.

67
Getzmann S, Wascher E. Visually guided auditory attention in a dynamic "cocktail-party" speech perception task: ERP evidence for age-related differences. Hear Res 2017; 344:98-108. [DOI: 10.1016/j.heares.2016.11.001]

68
Kidd G, Colburn HS. Informational Masking in Speech Recognition. Springer Handbook of Auditory Research 2017. [DOI: 10.1007/978-3-319-51662-2_4]

69
Shinn-Cunningham B, Best V, Lee AKC. Auditory Object Formation and Selection. Springer Handbook of Auditory Research 2017. [DOI: 10.1007/978-3-319-51662-2_2]

70
Best V, Streeter T, Roverud E, Mason CR, Kidd G. A Flexible Question-and-Answer Task for Measuring Speech Understanding. Trends Hear 2016; 20:2331216516678706. [PMID: 27888257 PMCID: PMC5131808 DOI: 10.1177/2331216516678706]
Abstract
This report introduces a new speech task based on simple questions and answers. The task differs from a traditional sentence recall task in that it involves an element of comprehension and can be implemented in an ongoing fashion. It also contains two target items (the question and the answer) that may be associated with different voices and locations to create dynamic listening scenarios. A set of 227 questions was created, covering six broad categories (days of the week, months of the year, numbers, colors, opposites, and sizes). All questions and their one-word answers were spoken by 11 female and 11 male talkers. In this study, listeners were presented with question-answer pairs and asked to indicate whether the answer was true or false. Responses were given as simple button or key presses, which are quick to make and easy to score. Two preliminary experiments are presented that illustrate different ways of implementing the basic task. In the first experiment, question-answer pairs were presented in speech-shaped noise, and performance was compared across subjects, question categories, and time, to examine the different sources of variability. In the second experiment, sequences of question-answer pairs were presented amidst competing conversations in an ongoing, spatially dynamic listening scenario. Overall, the question-and-answer task appears to be feasible and could be implemented flexibly in a number of different ways.
Affiliation(s)
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Timothy Streeter
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Christine R Mason
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences, Boston University, MA, USA

71
Switching of auditory attention in "cocktail-party" listening: ERP evidence of cueing effects in younger and older adults. Brain Cogn 2016; 111:1-12. [PMID: 27814564 DOI: 10.1016/j.bandc.2016.09.006]
Abstract
Verbal communication in a "cocktail-party situation" is a major challenge for the auditory system. In particular, changes in target speaker usually result in declined speech perception. Here, we investigated whether speech cues indicating a subsequent change in target speaker reduce the costs of switching in younger and older adults. We employed event-related potential (ERP) measures and a speech perception task, in which sequences of short words were simultaneously presented by four speakers. Changes in target speaker were either unpredictable or semantically cued by a word within the target stream. Cued changes resulted in a less decreased performance than uncued changes in both age groups. The ERP analysis revealed shorter latencies in the change-related N400 and late positive complex (LPC) after cued changes, suggesting an acceleration in context updating and attention switching. Thus, both younger and older listeners used semantic cues to prepare changes in speaker setting.

72
Mi J, Colburn HS. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments. Trends Hear 2016; 20:2331216516669919. [PMID: 27698261 PMCID: PMC5051670 DOI: 10.1177/2331216516669919]
Abstract
Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model.
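
The model's central idea can be sketched compactly (Python/NumPy; the STFT parameters, the 3-dB decision threshold, and the pure-delay equalization below are illustrative assumptions, not the authors' implementation). The EC stage is steered at the target direction so that cancellation removes target energy; time-frequency units where cancellation removes a large fraction of the input energy are marked as target-dominated:

```python
import numpy as np

def ec_binary_mask(left, right, fs, target_itd=0.0, nfft=512, hop=256,
                   threshold_db=3.0):
    """Return a binary time-frequency mask: 1 where the target dominates."""
    win = np.hanning(nfft)
    freqs = np.fft.rfftfreq(nfft, 1 / fs)
    n_frames = 1 + (len(left) - nfft) // hop
    mask = np.zeros((n_frames, nfft // 2 + 1))
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + nfft)
        L = np.fft.rfft(win * left[seg])
        R = np.fft.rfft(win * right[seg])
        # Equalize: time-align the right ear to the target ITD, then cancel.
        residual = L - R * np.exp(2j * np.pi * freqs * target_itd)
        in_energy = np.abs(L) ** 2 + np.abs(R) ** 2
        # A large energy drop after cancellation means the unit was
        # dominated by a source from the target direction.
        drop_db = 10 * np.log10((in_energy + 1e-12)
                                / (np.abs(residual) ** 2 + 1e-12))
        mask[i] = drop_db > threshold_db
    return mask
```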
Affiliation(s)
- Jing Mi
- Boston University, Boston, MA, USA

73
Auditory distance perception in humans: a review of cues, development, neuronal bases, and effects of sensory loss. Atten Percept Psychophys 2016; 78:373-395. [PMID: 26590050 PMCID: PMC4744263 DOI: 10.3758/s13414-015-1015-1]
Abstract
Auditory distance perception plays a major role in spatial awareness, enabling location of objects and avoidance of obstacles in the environment. However, it remains under-researched relative to studies of the directional aspect of sound localization. This review focuses on the following four aspects of auditory distance perception: cue processing, development, consequences of visual and auditory loss, and neurological bases. The several auditory distance cues vary in their effective ranges in peripersonal and extrapersonal space. The primary cues are sound level, reverberation, and frequency. Nonperceptual factors, including the importance of the auditory event to the listener, also can affect perceived distance. Basic internal representations of auditory distance emerge at approximately 6 months of age in humans. Although visual information plays an important role in calibrating auditory space, sensorimotor contingencies can be used for calibration when vision is unavailable. Blind individuals often manifest supranormal abilities to judge relative distance but show a deficit in absolute distance judgments. Following hearing loss, the use of auditory level as a distance cue remains robust, while the reverberation cue becomes less effective. Previous studies have not found evidence that hearing-aid processing affects perceived auditory distance. Studies investigating the brain areas involved in processing different acoustic distance cues are described. Finally, suggestions are given for further research on auditory distance perception, including broader investigation of how background noise and multiple sound sources affect perceived auditory distance for those with sensory loss.

74
Oberfeld D, Klöckner-Nowotny F. Individual differences in selective attention predict speech identification at a cocktail party. eLife 2016; 5:e16747. [PMID: 27580272 PMCID: PMC5441891 DOI: 10.7554/eLife.16747]
Abstract
Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance is individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a proportion of variance similar to that explained by binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise.
Affiliation(s)
- Daniel Oberfeld
- Department of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität, Mainz, Germany
- Felicitas Klöckner-Nowotny
- Department of Psychology, Section Experimental Psychology, Johannes Gutenberg-Universität, Mainz, Germany

75
Kidd G, Mason CR, Swaminathan J, Roverud E, Clayton KK, Best V. Determining the energetic and informational components of speech-on-speech masking. J Acoust Soc Am 2016; 140:132. [PMID: 27475139 PMCID: PMC5392100 DOI: 10.1121/1.4954748]
Abstract
Identification of target speech was studied under masked conditions consisting of two or four independent speech maskers. In the reference conditions, the maskers were colocated with the target, the masker talkers were the same sex as the target, and the masker speech was intelligible. The comparison conditions, intended to provide release from masking, included different-sex target and masker talkers, time-reversal of the masker speech, and spatial separation of the maskers from the target. Significant release from masking was found for all comparison conditions. To determine whether these reductions in masking could be attributed to differences in energetic masking, ideal time-frequency segregation (ITFS) processing was applied so that the time-frequency units where the masker energy dominated the target energy were removed. The remaining target-dominated "glimpses" were reassembled as the stimulus. Speech reception thresholds measured using these resynthesized ITFS-processed stimuli were the same for the reference and comparison conditions supporting the conclusion that the amount of energetic masking across conditions was the same. These results indicated that the large release from masking found under all comparison conditions was due primarily to a reduction in informational masking. Furthermore, the large individual differences observed generally were correlated across the three masking release conditions.
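
Ideal time-frequency segregation is easy to state in code when, as in such experiments, the target and masker signals are available separately before mixing. The sketch below (Python/NumPy; the STFT parameters and 0-dB local criterion are illustrative assumptions) retains only the mixture's target-dominated "glimpses":

```python
import numpy as np

def itfs_glimpses(target, masker, nfft=512, hop=256, criterion_db=0.0):
    """Return STFT frames of the mixture, masked to the time-frequency
    units where the target's local energy exceeds the masker's."""
    win = np.hanning(nfft)
    n_frames = 1 + (min(len(target), len(masker)) - nfft) // hop
    frames = []
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + nfft)
        T = np.fft.rfft(win * target[seg])
        M = np.fft.rfft(win * masker[seg])
        local_snr_db = 10 * np.log10((np.abs(T) ** 2 + 1e-12)
                                     / (np.abs(M) ** 2 + 1e-12))
        keep = local_snr_db > criterion_db   # target-dominated units only
        frames.append((T + M) * keep)        # mixture, reduced to glimpses
    # Resynthesis would inverse-FFT and overlap-add these masked frames.
    return np.array(frames)
```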
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Christine R Mason
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Jayaganesh Swaminathan
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Elin Roverud
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Kameron K Clayton
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Virginia Best
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
Collapse
|
76
|
Lewald J, Hanenberg C, Getzmann S. Brain correlates of the orientation of auditory spatial attention onto speaker location in a “cocktail-party” situation. Psychophysiology 2016; 53:1484-95. [DOI: 10.1111/psyp.12692] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2015] [Accepted: 05/24/2016] [Indexed: 11/29/2022]
Affiliation(s)
- Jörg Lewald, Department of Cognitive Psychology, Faculty of Psychology, Ruhr University Bochum, Bochum, Germany; Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany
- Christina Hanenberg, Department of Cognitive Psychology, Faculty of Psychology, Ruhr University Bochum, Bochum, Germany; Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany
- Stephan Getzmann, Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany

77
Davis TJ, Grantham DW, Gifford RH. Effect of motion on speech recognition. Hear Res 2016; 337:80-8. [PMID: 27240478] [DOI: 10.1016/j.heares.2016.05.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6]
Abstract
The benefit of spatial separation for talkers in a multi-talker environment is well documented. However, few studies have examined the effect of talker motion on speech recognition. In the current study, we evaluated the effects of (1) motion of the target or distracters, (2) a priori information about the target and distracter spatial configurations, and (3) target and distracter location. In total, seventeen young adults with normal hearing were tested in a large anechoic chamber in two experiments. In Experiment 1, seven stimulus conditions were tested using the Coordinate Response Measure (Bolia et al., 2000) speech corpus, in which subjects were required to report the key words in a target sentence presented simultaneously with two distracter sentences. As in previous studies, there was a significant improvement in key word identification for conditions in which the target and distracters were spatially separated as compared to the co-located conditions. In addition, (1) motion of either talker or distracter resulted in improved performance compared to stationary presentation (talker motion yielded significantly better performance than distracter motion); (2) a priori information regarding stimulus configuration was not beneficial; and (3) performance was significantly better with key words at 0° azimuth as compared to -60° (on the listener's left). Experiment 2 included two additional conditions designed to assess whether the benefit of motion observed in Experiment 1 was due to the motion itself or to the fact that the motion conditions introduced small spatial separations in the target and distracter key words. Results showed that small spatial separations (on the order of 5-8°) resulted in improved performance (relative to co-located key words) whether the sentences were moving or stationary. These results suggest that in the presence of distracting messages, motion of either target or distracters and/or small spatial separation of the key words may be beneficial for sound source segregation and thus for improved speech recognition.
Affiliation(s)
- Timothy J Davis, D Wesley Grantham, and René H Gifford: Vanderbilt University, Department of Hearing and Speech Sciences, Nashville, TN, USA

78
Holmes E, Kitterick PT, Summerfield AQ. EEG activity evoked in preparation for multi-talker listening by adults and children. Hear Res 2016; 336:83-100. [PMID: 27178442] [DOI: 10.1016/j.heares.2016.04.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6]
Abstract
Selective attention is critical for successful speech perception because speech is often encountered in the presence of other sounds, including the voices of competing talkers. Faced with the need to attend selectively, listeners perceive speech more accurately when they know characteristics of upcoming talkers before they begin to speak. However, the neural processes that underlie the preparation of selective attention for voices are not fully understood. The current experiments used electroencephalography (EEG) to investigate the time course of brain activity during preparation for an upcoming talker in young adults aged 18-27 years with normal hearing (Experiments 1 and 2) and in typically-developing children aged 7-13 years (Experiment 3). Participants reported key words spoken by a target talker when an opposite-gender distractor talker spoke simultaneously. The two talkers were presented from different spatial locations (±30° azimuth). Before the talkers began to speak, a visual cue indicated either the location (left/right) or the gender (male/female) of the target talker. Adults evoked preparatory EEG activity that started shortly after (<50 ms) the visual cue was presented and was sustained until the talkers began to speak. The location cue evoked similar preparatory activity in Experiments 1 and 2 with different samples of participants. The gender cue did not evoke preparatory activity when it predicted gender only (Experiment 1) but did evoke preparatory activity when it predicted the identity of a specific talker with greater certainty (Experiment 2). Location cues evoked significant preparatory EEG activity in children but gender cues did not. The results provide converging evidence that listeners evoke consistent preparatory brain activity for selecting a talker by their location (regardless of their gender or identity), but not by their gender alone.
Affiliation(s)
- Emma Holmes, Department of Psychology, University of York, UK
- Padraig T Kitterick, NIHR Nottingham Hearing Biomedical Research Unit, UK; Division of Clinical Neuroscience, School of Medicine, University of Nottingham, UK
- A Quentin Summerfield, Department of Psychology, University of York, UK; Hull York Medical School, University of York, UK

79
Schoenmaker E, Brand T, van de Par S. The multiple contributions of interaural differences to improved speech intelligibility in multitalker scenarios. J Acoust Soc Am 2016; 139:2589. [PMID: 27250153] [DOI: 10.1121/1.4948568] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9]
Abstract
Spatial separation of talkers is known to improve speech intelligibility in a multitalker scenario. A contribution of binaural unmasking, in addition to a better-ear effect, is usually considered to account for this advantage. Binaural unmasking is assumed to result from the spectro-temporally simultaneous presence of target and masker energy with different interaural properties. However, in the case of speech targets and speech interference, the spectro-temporal signal-to-noise ratio (SNR) fluctuates strongly, resulting in audible and localizable glimpses of target speech even at adverse global SNRs. The disparate interaural properties of target and masker may thus lead to improved segregation without requiring simultaneity. This study addresses the binaural contribution to spatial release from masking due to simultaneous disparities in interaural cues between target and interferers. For that purpose, stimuli were designed that lacked simultaneously occurring disparities but yielded a percept of spatially separated speech nearly indistinguishable from that of unmodified stimuli. A phoneme recognition experiment with either three collocated or spatially separated talkers showed a substantial spatial release from masking for the modified stimuli. The results suggest that binaural unmasking made only a minor contribution to spatial release from masking, and that the interaural cues carried by dominant speech components were instead essential.
Affiliation(s)
- Esther Schoenmaker and Steven van de Par: Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, 26111 Oldenburg, Germany
- Thomas Brand, Medizinische Physik, Cluster of Excellence Hearing4all, Carl von Ossietzky University, 26111 Oldenburg, Germany

80
Attentional modulation of informational masking on early cortical representations of speech signals. Hear Res 2016; 331:119-30. [DOI: 10.1016/j.heares.2015.11.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9]
81
Reed DK, Dietz M, Josupeit A, van de Par S. Lateralization of stimuli with alternating interaural time differences: The role of monaural envelope cues. J Acoust Soc Am 2016; 139:30-40. [PMID: 26827002] [DOI: 10.1121/1.4938018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]
Abstract
A temporally acute binaural system can help to resolve inherent fluctuations in binaural information that are often present in complex auditory scenes. Using a broadband noise stimulus that rapidly alternates between two different values of interaural time difference (ITD), the ability of the binaural system to hear the lateral position resulting from one of the ITD values was investigated. Results show that listeners are able to accurately lateralize brief noise tokens of only 3-7 ms in duration. In two subsequent experiments, the role of an amplitude modulation (AM) imposed on the ITD-switching stimulus used in the first experiment was tested. For wideband stimuli, the temporal position of the ITD target relative to the phase of the AM did not influence absolute lateralization or detection performance. When the stimuli were narrowband, however, detection of the ITD target was best when temporally positioned in the rising portion of the AM. These experiments illustrate that the auditory system is capable of making accurate lateral estimates of very brief moments of ITD information. Furthermore, for these instantaneous changes in ITD information, the stimulus bandwidth can influence the role of envelope cues for the readout of binaural information.
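A stimulus of this kind can be sketched digitally as follows; the sampling rate, token duration, and ITD values below are illustrative assumptions rather than the exact parameters of the study.

```python
import numpy as np

fs = 48000                       # sampling rate (assumed)
token_dur = 0.005                # 5 ms tokens, within the 3-7 ms range studied
itd_a, itd_b = 300e-6, -300e-6   # the two alternating ITDs, in seconds (assumed)

rng = np.random.default_rng(1)
noise = rng.standard_normal(int(0.5 * fs))   # 0.5 s broadband noise
n_tok = int(token_dur * fs)

left = noise.copy()
right = np.zeros_like(noise)
for i, start in enumerate(range(0, len(noise) - n_tok, n_tok)):
    itd = itd_a if i % 2 == 0 else itd_b
    shift = int(round(itd * fs))             # integer-sample delay approximates the ITD
    # token boundaries are left unsmoothed in this sketch
    right[start:start + n_tok] = np.roll(noise, shift)[start:start + n_tok]
stereo = np.stack([left, right], axis=-1)    # left/right channels for earphone playback
```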
Affiliation(s)
- Darrin K Reed and Steven van de Par: Acoustics Group, Forschungszentrum Neurosensorik, Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany
- Mathias Dietz and Angela Josupeit: Medizinische Physik, Cluster of Excellence Hearing4all, Universität Oldenburg, 26111 Oldenburg, Germany

82
The cocktail-party problem revisited: early processing and selection of multi-talker speech. Atten Percept Psychophys 2015; 77:1465-87. [PMID: 25828463] [PMCID: PMC4469089] [DOI: 10.3758/s13414-015-0882-9] [Citation(s) in RCA: 212] [Impact Index Per Article: 23.6]
Abstract
How do we recognize what one person is saying when others are speaking at the same time? This review summarizes widespread research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there also is evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on other than spatial or voice-related cues, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
83
Getzmann S, Hanenberg C, Lewald J, Falkenstein M, Wascher E. Effects of age on electrophysiological correlates of speech processing in a dynamic "cocktail-party" situation. Front Neurosci 2015; 9:341. [PMID: 26483623] [PMCID: PMC4586946] [DOI: 10.3389/fnins.2015.00341] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4]
Abstract
Successful speech perception in multi-speaker environments depends on auditory scene analysis, comprising auditory object segregation and grouping, and on focusing attention toward the speaker of interest. Changes in speaker settings (e.g., in speaker position) require object re-selection and attention re-focusing. Here, we tested the processing of changes in a realistic multi-speaker scenario in younger and older adults, employing a speech-perception task, and event-related potential (ERP) measures. Sequences of short words (combinations of company names and values) were simultaneously presented via four loudspeakers at different locations, and the participants responded to the value of a target company. Voice and position of the speaker of the target information were kept constant for a variable number of trials and then changed. Relative to the pre-change level, changes caused higher error rates, and more so in older than younger adults. The ERP analysis revealed stronger fronto-central N2 and N400 components in younger adults, suggesting a more effective inhibition of concurrent speech stimuli and enhanced language processing. The difference ERPs (post-change minus pre-change) indicated a change-related N400 and late positive complex (LPC) over parietal areas in both groups. Only the older adults showed an additional frontal LPC, suggesting increased allocation of attentional resources after changes in speaker settings. In sum, changes in speaker settings are critical events for speech perception in multi-speaker environments. Especially older persons show deficits that could be based on less flexible inhibitory control and increased distraction.
Affiliation(s)
- Stephan Getzmann, Christina Hanenberg, Jörg Lewald, Michael Falkenstein, and Edmund Wascher: Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany

84
Kong YY, Somarowthu A, Ding N. Effects of Spectral Degradation on Attentional Modulation of Cortical Auditory Responses to Continuous Speech. J Assoc Res Otolaryngol 2015; 16:783-96. [PMID: 26362546] [DOI: 10.1007/s10162-015-0540-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6]
Abstract
This study investigates the effect of spectral degradation on cortical speech encoding in complex auditory scenes. Young normal-hearing listeners were simultaneously presented with two speech streams and were instructed to attend to only one of them. The speech mixtures were subjected to noise-channel vocoding to preserve the temporal envelope and degrade the spectral information of speech. Each subject was tested with five spectral resolution conditions (unprocessed speech, 64-, 32-, 16-, and 8-channel vocoder conditions) and two target-to-masker ratio (TMR) conditions (3 and 0 dB). Ongoing electroencephalographic (EEG) responses and speech comprehension were measured in each spectral and TMR condition for each subject. Neural tracking of each speech stream was characterized by cross-correlating the EEG responses with the envelope of each of the simultaneous speech streams at different time lags. Results showed that spectral degradation and TMR both significantly influenced how top-down attention modulated the EEG responses to the attended and unattended speech. That is, the EEG responses to the attended and unattended speech streams differed more for the higher (unprocessed, 64 ch, and 32 ch) than the lower (16 and 8 ch) spectral resolution conditions, as well as for the higher (3 dB) than the lower TMR (0 dB) condition. The magnitude of differential neural modulation responses to the attended and unattended speech streams significantly correlated with speech comprehension scores. These results suggest that severe spectral degradation and low TMR hinder speech stream segregation, making it difficult to employ top-down attention to differentially process different speech streams.
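The neural-tracking measure described above amounts to a lagged cross-correlation between the EEG and each speech envelope. Here is a minimal sketch under assumed array shapes, sampling rate, and lag range; it is not the authors' analysis pipeline.

```python
import numpy as np

def envelope_tracking(eeg, env, fs, max_lag_s=0.5):
    """Normalized cross-correlation of an EEG channel with a speech
    envelope, for lags where the EEG follows the speech (0..max_lag_s)."""
    eeg = (eeg - eeg.mean()) / eeg.std()
    env = (env - env.mean()) / env.std()
    lags = np.arange(1, int(max_lag_s * fs))
    r = np.array([np.mean(env[:-lag] * eeg[lag:]) for lag in lags])
    return lags / fs, r

# Toy data: an EEG trace that weakly follows the attended envelope.
fs = 128
rng = np.random.default_rng(2)
env_attended = np.abs(rng.standard_normal(60 * fs))
eeg = 0.3 * np.roll(env_attended, int(0.1 * fs)) + rng.standard_normal(60 * fs)
lag_s, r = envelope_tracking(eeg, env_attended, fs)
print(lag_s[np.argmax(r)])   # peak near the imposed ~0.1 s lag
```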
Affiliation(s)
- Ying-Yee Kong, Department of Communication Sciences and Disorders, Northeastern University, Boston, MA 02115, USA; Department of Bioengineering, Northeastern University, Boston, MA 02115, USA
- Ala Somarowthu, Department of Bioengineering, Northeastern University, Boston, MA 02115, USA
- Nai Ding, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Zhejiang, China

85
Cai Y, Zheng Y, Liang M, Zhao F, Yu G, Liu Y, Chen Y, Chen G. Auditory Spatial Discrimination and the Mismatch Negativity Response in Hearing-Impaired Individuals. PLoS One 2015; 10:e0136299. [PMID: 26305694] [PMCID: PMC4549058] [DOI: 10.1371/journal.pone.0136299] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7]
Abstract
The aims of the present study were to investigate the ability of hearing-impaired (HI) individuals with different binaural hearing conditions to discriminate spatial auditory sources at the midline and lateral positions, and to explore the possible central processing mechanisms by measuring the minimal audible angle (MAA) and mismatch negativity (MMN) response. To measure MAA at the left/right 0°, 45° and 90° positions, 12 normal-hearing (NH) participants and 36 patients with sensorineural hearing loss were recruited; the patients comprised 12 with symmetrical hearing loss (SHL) and 24 with asymmetrical hearing loss (AHL) [12 with unilateral hearing loss on the left (UHLL) and 12 with unilateral hearing loss on the right (UHLR)]. In addition, 128-electrode electroencephalography was used to record the MMN response in a separate group of 60 patients (20 UHLL, 20 UHLR and 20 SHL patients) and 20 NH participants. The results showed MAA thresholds of the NH participants to be significantly lower than those of the HI participants. Also, a significantly smaller MAA threshold was obtained at the midline position than at the lateral position in both the NH and SHL groups. However, in the AHL group, the MAA threshold for the 90° position on the affected side was significantly smaller than the MAA thresholds obtained at other positions. Significantly reduced amplitudes and prolonged latencies of the MMN were found in the HI groups compared to the NH group. In addition, contralateral activation was found in the UHL group for sounds emanating from the 90° position on the affected side and in the NH group. These findings suggest that the abilities of spatial discrimination at the midline and lateral positions vary significantly in different hearing conditions. A reduced MMN amplitude and prolonged latency together with bilaterally symmetrical cortical activations over the auditory hemispheres indicate possible cortical compensatory changes associated with poor behavioral spatial discrimination in individuals with HI.
Affiliation(s)
- Yuexin Cai, Yiqing Zheng, Maojin Liang, Yuebo Chen, and Guisheng Chen: Department of Otolaryngology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China; Institute of Hearing and Speech-Language Science, Sun Yat-sen University, Guangzhou, China
- Fei Zhao, Department of Speech Language Therapy and Hearing Science, Cardiff Metropolitan University, Cardiff, Wales; Department of Hearing and Speech Sciences, Xinhua College, Sun Yat-sen University, Guangzhou, China
- Guangzheng Yu and Yu Liu: Acoustic Lab, Physics Department, South China University of Technology, Guangzhou 510641, China

86
Woods KJP, McDermott JH. Attentive Tracking of Sound Sources. Curr Biol 2015; 25:2238-46. [PMID: 26279234] [DOI: 10.1016/j.cub.2015.07.043] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0]
Abstract
Auditory scenes often contain concurrent sound sources, but listeners are typically interested in just one of these and must somehow select it for further processing. One challenge is that real-world sounds such as speech vary over time and as a consequence often cannot be separated or selected based on particular values of their features (e.g., high pitch). Here we show that human listeners can circumvent this challenge by tracking sounds with a movable focus of attention. We synthesized pairs of voices that changed in pitch and timbre over random, intertwined trajectories, lacking distinguishing features or linguistic information. Listeners were cued beforehand to attend to one of the voices. We measured their ability to extract this cued voice from the mixture by subsequently presenting the ending portion of one voice and asking whether it came from the cued voice. We found that listeners could perform this task but that performance was mediated by attention: listeners who performed best were also more sensitive to perturbations in the cued voice than in the uncued voice. Moreover, the task was impossible if the source trajectories did not maintain sufficient separation in feature space. The results suggest a locus of attention that can follow a sound's trajectory through a feature space, likely aiding selection and segregation amid similar distractors.
Affiliation(s)
- Kevin J P Woods and Josh H McDermott: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, USA

87
Limitations on Monaural and Binaural Temporal Processing in Bilateral Cochlear Implant Listeners. J Assoc Res Otolaryngol 2015; 16:641-52. [PMID: 26105749] [PMCID: PMC4569611] [DOI: 10.1007/s10162-015-0527-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4]
Abstract
Monaural rate discrimination and binaural interaural time difference (ITD) discrimination were studied as functions of pulse rate in a group of bilaterally implanted cochlear implant users. Stimuli for the rate discrimination task were pulse trains presented to one electrode, which could be in the apical, middle, or basal part of the array, and in either the left or the right ear. In each two-interval trial, the standard stimulus had a rate of 100, 200, 300, or 500 pulses per second and the signal stimulus had a rate 35% higher. ITD discrimination between pitch-matched electrode pairs was measured for the same standard rates as in the rate discrimination task and with an ITD of ±500 μs. Sensitivity (d′) on both tasks decreased with increasing rate, as has been reported previously. This study tested the hypothesis that deterioration in performance at high rates occurs for the two tasks due to a common neural basis, specific to the stimulation of each electrode. Results show that ITD scores for different pairs of electrodes correlated with the lower rate discrimination scores for those two electrodes. Statistical analysis, which partialed out overall differences between listeners, electrodes, and rates, supports the hypothesis that monaural and binaural temporal processing limitations are at least partly due to a common mechanism.
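The sensitivity index d′ reported above can be computed from hit and false-alarm rates via the inverse normal CDF; the rates below are made-up values for illustration.

```python
from scipy.stats import norm

def dprime(hit_rate, fa_rate):
    # d' = z(hit rate) - z(false-alarm rate)
    # (two-interval designs sometimes divide by sqrt(2); omitted here)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(dprime(0.85, 0.20))   # about 1.88
```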
88
Martin K, Johnstone P, Hedrick M. Auditory and visual localization accuracy in young children and adults. Int J Pediatr Otorhinolaryngol 2015; 79:844-851. [PMID: 25841637] [DOI: 10.1016/j.ijporl.2015.03.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3]
Abstract
OBJECTIVE: This study aimed to measure and compare sound and light source localization ability in young children and adults who have normal hearing and normal/corrected vision, in order to determine the extent to which age, type of stimuli, and stimulus order affect sound localization accuracy.
METHODS: Two experiments were conducted. The first involved a group of adults only; the second involved a group of 30 children aged 3 to 5 years. Testing occurred in a sound-treated booth containing a semi-circular array of 15 loudspeakers set at 10° intervals from -70° to 70° azimuth. Each loudspeaker had a tiny light bulb and a small picture fastened underneath. Seven of the loudspeakers were used to randomly test sound and light source identification. The sound stimulus was the word "baseball". The light stimulus was a flashing of a light bulb triggered by the digital signal of the word "baseball". Each participant was asked to face 0° azimuth and identify the location of the test stimulus upon presentation. Adults used a computer mouse to click on an icon; children responded by verbally naming or walking toward the picture underneath the corresponding loudspeaker or light. A mixed experimental design using repeated measures was used to determine the effect of age and stimulus type on localization accuracy, and to compare the effect of stimulus order (light first/last) and varying or fixed intensity sound on localization accuracy, in children and adults.
RESULTS: Localization accuracy was significantly better for light stimuli than sound stimuli for both children and adults. Children, compared to adults, showed significantly greater localization errors for audition. Three-year-old children had significantly greater sound localization errors compared to 4- and 5-year-olds. Adults performed better on the sound localization task when the light localization task occurred first.
CONCLUSIONS: Young children can understand and attend to localization tasks, but show poorer localization accuracy than adults in sound localization. This may be a reflection of differences in sensory modality development and/or central processes in young children, compared to adults.
Affiliation(s)
- Karen Martin, Patti Johnstone, and Mark Hedrick: University of Tennessee Health Science Center, Department of Audiology and Speech Pathology, 578 South Stadium Hall, Knoxville, TN 37996, United States

89
Lin G, Carlile S. Costs of switching auditory spatial attention in following conversational turn-taking. Front Neurosci 2015; 9:124. [PMID: 25941466] [PMCID: PMC4403343] [DOI: 10.3389/fnins.2015.00124] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0]
Abstract
Following a multi-talker conversation relies on the ability to rapidly and efficiently shift the focus of spatial attention from one talker to another. The current study investigated the listening costs associated with shifts in spatial attention during conversational turn-taking in 16 normally-hearing listeners using a novel sentence recall task. Three pairs of syntactically fixed but semantically unpredictable matrix sentences, recorded from a single male talker, were presented concurrently through an array of three loudspeakers (directly ahead and ±30° azimuth). Subjects attended to one spatial location, cued by a tone, and followed the target conversation from one sentence to the next using the call-sign at the beginning of each sentence. Subjects were required to report the last three words of each sentence (speech recall task) or answer multiple choice questions related to the target material (speech comprehension task). The reading span test, attention network test, and trail making test were also administered to assess working memory, attentional control, and executive function. There was a 10.7 ± 1.3% decrease in word recall, a pronounced primacy effect, and a rise in masker confusion errors and word omissions when the target switched location between sentences. Switching costs were independent of the location, direction, and angular size of the spatial shift but did appear to be load dependent, and were only significant for complex questions requiring multiple cognitive operations. Reading span scores were positively correlated with total words recalled, and negatively correlated with switching costs and word omissions. Task switching speed (Trail-B time) was also significantly correlated with recall accuracy. Overall, this study highlights (i) the listening costs associated with shifts in spatial attention and (ii) the important role of working memory in maintaining goal-relevant information and extracting meaning from dynamic multi-talker conversations.
Affiliation(s)
- Gaven Lin and Simon Carlile: Auditory Neuroscience Laboratory, Department of Physiology, School of Medical Sciences, University of Sydney, Sydney, NSW, Australia

90
Carlile S, Corkhill C. Selective spatial attention modulates bottom-up informational masking of speech. Sci Rep 2015; 5:8662. [PMID: 25727100] [PMCID: PMC4345314] [DOI: 10.1038/srep08662] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9]
Abstract
To hear out a conversation against other talkers, listeners must overcome energetic and informational masking. Although largely attributed to top-down processes, informational masking has also been demonstrated using unintelligible speech and amplitude-modulated maskers, suggesting a bottom-up contribution. We examined the role of speech-like amplitude modulations in informational masking using a spatial masking release paradigm. Separating a target talker from two masker talkers produced a 20 dB improvement in speech reception threshold; 40% of this was attributed to a release from informational masking. When the across-frequency temporal modulations in the masker talkers are decorrelated, the speech becomes unintelligible, although the within-frequency modulation characteristics remain identical. Used as a masker as above, this masker produced informational masking that accounted for 37% of the spatial unmasking. Such an unintelligible and highly differentiable masker is unlikely to engage top-down processes. These data provide strong evidence of bottom-up masking involving speech-like, within-frequency modulations, and show that this presumably low-level process can be modulated by selective spatial attention.
Affiliation(s)
- Simon Carlile, School of Medical Sciences and The Bosch Institute, University of Sydney, Sydney, NSW 2006, Australia
- Caitlin Corkhill, School of Medical Sciences, University of Sydney, Sydney, NSW 2006, Australia

91
Koelewijn T, de Kluiver H, Shinn-Cunningham BG, Zekveld AA, Kramer SE. The pupil response reveals increased listening effort when it is difficult to focus attention. Hear Res 2015; 323:81-90. [PMID: 25732724] [PMCID: PMC4632994] [DOI: 10.1016/j.heares.2015.02.004] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1]
Abstract
Recent studies have shown that prior knowledge about where, when, and who is going to talk improves speech intelligibility. How related attentional processes affect cognitive processing load has not been investigated yet. In the current study, three experiments investigated how the pupil dilation response is affected by prior knowledge of target speech location, target speech onset, and who is going to talk. A total of 56 young adults with normal hearing participated. They had to reproduce a target sentence presented to one ear while ignoring a distracting sentence simultaneously presented to the other ear. The two sentences were independently masked by fluctuating noise. Target location (left or right ear), speech onset, and talker variability were manipulated in separate experiments by keeping these features either fixed during an entire block or randomized over trials. Pupil responses were recorded during listening and performance was scored after recall. The results showed an improvement in performance when the location of the target speech was fixed instead of randomized. Additionally, location uncertainty increased the pupil dilation response, which suggests that prior knowledge of location reduces cognitive load. Interestingly, the observed pupil responses for each condition were consistent with subjective reports of listening effort. We conclude that communicating in a dynamic environment like a cocktail party (where participants in competing conversations move unpredictably) requires substantial listening effort because of the demands placed on attentional processes.
Affiliation(s)
- Thomas Koelewijn, Hilde de Kluiver, and Sophia E Kramer: Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, The Netherlands
- Barbara G Shinn-Cunningham, Department of Biomedical Engineering, Center for Computational Neuroscience and Neural Technology, Boston University, Boston, USA
- Adriana A Zekveld, Section Ear & Hearing, Department of Otolaryngology-Head and Neck Surgery and EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, The Netherlands; Linnaeus Centre HEAD, Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden

92
Getzmann S, Lewald J, Falkenstein M. Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences. Front Neurosci 2014; 8:413. [PMID: 25540608] [PMCID: PMC4261705] [DOI: 10.3389/fnins.2014.00413] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8]
Abstract
Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called “cocktail-party” problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments.
Affiliation(s)
- Stephan Getzmann and Michael Falkenstein: Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Technical University of Dortmund (IfADo), Dortmund, Germany
- Jörg Lewald, Aging Research Group, Leibniz Research Centre for Working Environment and Human Factors, Technical University of Dortmund (IfADo), Dortmund, Germany; Faculty of Psychology, Ruhr-University Bochum, Bochum, Germany

93
Koch I, Lawo V. The flip side of the auditory spatial selection benefit: larger attentional mixing costs for target selection by ear than by gender in auditory task switching. Exp Psychol 2014; 62:66-74. [PMID: 25384645] [DOI: 10.1027/1618-3169/a000274] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2]
Abstract
In cued auditory task switching, one of two dichotically presented number words, spoken by a female and a male, had to be judged according to its numerical magnitude. One experimental group selected targets by speaker gender and another group by ear of presentation. In mixed-task blocks, the target-defining feature (male/female vs. left/right) was cued prior to each trial, but in pure blocks it remained constant. Compared to selection by gender, selection by ear led to better performance in pure blocks than in mixed blocks, resulting in larger "global" mixing costs for ear-based selection. Selection by ear also led to larger "local" switch costs in mixed blocks, but this finding was partially mediated by differential cue-repetition benefits. Together, the data suggest that requirements of attention shifting diminish the auditory spatial selection benefit.
Affiliation(s)
- Iring Koch and Vera Lawo: Institute of Psychology, RWTH Aachen University, Aachen, Germany

94
Lawo V, Fels J, Oberem J, Koch I. Intentional attention switching in dichotic listening: Exploring the efficiency of nonspatial and spatial selection. Q J Exp Psychol (Hove) 2014; 67:2010-24. [DOI: 10.1080/17470218.2014.898079] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7]
Abstract
Using an auditory variant of task switching, we examined the ability to intentionally switch attention in a dichotic-listening task. In our study, participants responded selectively to one of two simultaneously presented auditory number words (spoken by a female and a male, one for each ear) by categorizing its numerical magnitude. The mapping of gender (female vs. male) and ear (left vs. right) was unpredictable. The to-be-attended feature for gender or ear, respectively, was indicated by a visual selection cue prior to auditory stimulus onset. In Experiment 1, explicitly cued switches of the relevant feature dimension (e.g., from gender to ear) and switches of the relevant feature within a dimension (e.g., from male to female) occurred in an unpredictable manner. We found large performance costs when the relevant feature switched, but switches of the relevant feature dimension incurred only small additional costs. The feature-switch costs were larger in ear-relevant than in gender-relevant trials. In Experiment 2, we replicated these findings using a simplified design (i.e., only within-dimension switches with blocked dimensions). In Experiment 3, we examined preparation effects by manipulating the cueing interval and found a preparation benefit only when ear was cued. Together, our data suggest that a large part of attentional switch costs arises from reconfiguration at the level of relevant auditory features (e.g., left vs. right) rather than feature dimensions (ear vs. gender). Additionally, our findings suggest that ear-based target selection benefits more from preparation time (i.e., time to direct attention to one ear) than gender-based target selection.
Affiliation(s)
- Vera Lawo and Iring Koch: Institute of Psychology, RWTH Aachen University, Aachen, Germany
- Janina Fels and Josefa Oberem: Institute of Technical Acoustics, RWTH Aachen University, Aachen, Germany

95
Brungart DS, Cohen J, Cord M, Zion D, Kalluri S. Assessment of auditory spatial awareness in complex listening environments. J Acoust Soc Am 2014; 136:1808-1820. [PMID: 25324082] [DOI: 10.1121/1.4893932] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9]
Abstract
In the real world, listeners often need to track multiple simultaneous sources in order to maintain awareness of the relevant sounds in their environments. Thus, there is reason to believe that simple single-source sound localization tasks may not accurately capture the impact that a listening device such as a hearing aid might have on a listener's level of auditory awareness. In this experiment, 10 normal-hearing listeners and 20 hearing-impaired listeners were tested in a task that required them to identify and localize sound sources in three different listening tasks of increasing complexity: a single-source localization task, where listeners identified and localized a single sound source presented in isolation; an added-source task, where listeners identified and localized a source that was added to an existing auditory scene; and a remove-source task, where listeners identified and localized a source that was removed from an existing auditory scene. Hearing-impaired listeners completed these tasks with and without the use of their previously fit hearing aids. As expected, the results show that performance decreased both with increasing task complexity and with the number of competing sound sources in the acoustic scene. The results also show that the added-source task was as sensitive to differences in performance across listening conditions as the standard localization task, but that it correlated with a different pattern of subjective and objective performance measures across listeners. This result suggests that a measure of complex auditory situation awareness such as the one tested here may be a useful tool for evaluating differences in performance across different types of listening devices, such as hearing aids or hearing protection devices.
Affiliation(s)
- Douglas S Brungart, Julie Cohen, Mary Cord, and Danielle Zion: Walter Reed National Military Medical Center, 4954 North Palmer Road, Bethesda, Maryland 20889
- Sridhar Kalluri, Starkey Hearing Research Center, 2150 Shattuck Avenue, Berkeley, California 94704

96
Zhang C, Lu L, Wu X, Li L. Attentional modulation of the early cortical representation of speech signals in informational or energetic masking. Brain Lang 2014; 135:85-95. [PMID: 24992572] [DOI: 10.1016/j.bandl.2014.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2]
Abstract
It is easier to recognize masked speech when the speech and its masker are perceived as spatially segregated. Using event-related potentials, this study examined how the early cortical representation of speech is affected by different masker types and perceptual locations, when the listener is either passively or actively listening to the target speech syllable. The results showed that the two-talker-speech masker induced a much larger masking effect on the N1/P2 complex than either the steady-state-noise masker or the amplitude-modulated speech-spectrum-noise masker did. Also, a switch from the passive- to the active-listening condition enhanced the N1/P2 complex only when the masker was speech. Moreover, under the active-listening condition, perceived separation between target and masker enhanced the N1/P2 complex only when the masker was speech. Thus, when a masker is present, the effect of selective attention to the target-speech signal on the early cortical representation of the speech signal is masker-type dependent.
Affiliation(s)
- Changxin Zhang, Lingxi Lu, Xihong Wu, and Liang Li: Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing 100871, China

97
Bishop CW, Yadav D, London S, Miller LM. The effects of preceding lead-alone and lag-alone click trains on the buildup of echo suppression. J Acoust Soc Am 2014; 136:803-817. [PMID: 25096114] [PMCID: PMC4144256] [DOI: 10.1121/1.4874622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Spatial perception in echoic environments is influenced by recent acoustic history. For instance, echo suppression becomes more effective or "builds up" with repeated exposure to echoes having a consistent acoustic relationship to a temporally leading sound. Four experiments were conducted to investigate how buildup is affected by prior exposure to unpaired lead-alone or lag-alone click trains. Unpaired trains preceded lead-lag click trains designed to evoke and assay buildup. Listeners reported how many sounds they heard from the echo hemifield during the lead-lag trains. Stimuli were presented in free field (experiments 1 and 4) or dichotically through earphones (experiments 2 and 3). In experiment 1, listeners reported more echoes following a lead-alone train compared to a period of silence. In contrast, listeners reported fewer echoes following a lag-alone train; similar results were observed with earphones. Interestingly, the effects of lag-alone click trains on buildup were qualitatively different when compared to a no-conditioner trial type in experiment 4. Finally, experiment 3 demonstrated that the effects of preceding click trains on buildup cannot be explained by a change in counting strategy or perceived click salience. Together, these findings demonstrate that echo suppression is affected by prior exposure to unpaired stimuli.
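As a rough illustration of lead-lag click-train stimuli of this kind, here is a Python sketch; the click rate, lead-lag delay, and train length are assumptions, and the two channels stand in for the dichotic case rather than the free-field setup.

```python
import numpy as np

fs = 48000
n_pairs, click_rate = 16, 5      # 16 lead-lag pairs at 5 clicks per second (assumed)
lag_delay = 0.004                # 4 ms delay of the simulated echo (assumed)

n_samples = int(n_pairs / click_rate * fs) + int(lag_delay * fs) + 1
train = np.zeros((n_samples, 2))              # columns: lead channel, lag channel
for k in range(n_pairs):
    t0 = int(k / click_rate * fs)
    train[t0, 0] = 1.0                         # leading click
    train[t0 + int(lag_delay * fs), 1] = 1.0   # lagging click ("echo")

# A lead-alone conditioner train would set only column 0; lag-alone only column 1.
```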
Affiliation(s)
- Christopher W Bishop, Deepak Yadav, Sam London, and Lee M Miller: Center for Mind and Brain, University of California, Davis, 267 Cousteau Place, Davis, California 95618

98
Willmore BDB, Cooke JE, King AJ. Hearing in noisy environments: noise invariance and contrast gain control. J Physiol 2014; 592:3371-81. [PMID: 24907308] [DOI: 10.1113/jphysiol.2014.274886] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8]
Abstract
Contrast gain control has recently been identified as a fundamental property of the auditory system. Electrophysiological recordings in ferrets have shown that neurons continuously adjust their gain (their sensitivity to change in sound level) in response to the contrast of sounds that are heard. At the level of the auditory cortex, these gain changes partly compensate for changes in sound contrast. This means that sounds which are structurally similar, but have different contrasts, have similar neuronal representations in the auditory cortex. As a result, the cortical representation is relatively invariant to stimulus contrast and robust to the presence of noise in the stimulus. In the inferior colliculus (an important subcortical auditory structure), gain changes are less reliably compensatory, suggesting that contrast- and noise-invariant representations are constructed gradually as one ascends the auditory pathway. In addition to noise invariance, contrast gain control provides a variety of computational advantages over static neuronal representations; it makes efficient use of neuronal dynamic range, may contribute to redundancy-reducing, sparse codes for sound and allows for simpler decoding of population responses. The circuits underlying auditory contrast gain control are still under investigation. As in the visual system, these circuits may be modulated by factors other than stimulus contrast, forming a potential neural substrate for mediating the effects of attention as well as interactions between the senses.
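A toy sketch of the gain-control idea follows; the divisive form and constants are illustrative assumptions, not the models fitted to these recordings. Scaling a neuron's gain down as stimulus contrast rises makes its responses to high- and low-contrast versions of a sound more alike.

```python
import numpy as np

def neural_response(level_db, contrast, g0=1.0, c_half=0.1):
    gain = g0 / (1.0 + contrast / c_half)   # gain shrinks as contrast grows
    return gain * level_db                   # linear level coding, for simplicity

levels = np.linspace(0, 60, 4)
print(neural_response(levels, contrast=0.05))  # low contrast: steep level tuning
print(neural_response(levels, contrast=0.40))  # high contrast: compressed tuning
```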
Affiliation(s)
- Ben D B Willmore, James E Cooke, and Andrew J King: Department of Physiology, Anatomy and Genetics, University of Oxford, Sherrington Building, Parks Road, Oxford, OX1 3PT, UK

99
Bressler S, Masud S, Bharadwaj H, Shinn-Cunningham B. Bottom-up influences of voice continuity in focusing selective auditory attention. Psychol Res 2014; 78:349-60. [PMID: 24633644] [DOI: 10.1007/s00426-014-0555-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7]
Abstract
Selective auditory attention causes a relative enhancement of the neural representation of important information and suppression of the neural representation of distracting sound, which enables a listener to analyze and interpret information of interest. Some studies suggest that in both vision and audition, the "unit" on which attention operates is an object: an estimate of the information coming from a particular external source out in the world. In this view, which object ends up in the attentional foreground depends on the interplay of top-down, volitional attention and stimulus-driven, involuntary attention. Here, we test the idea that auditory attention is object based by exploring whether continuity of a non-spatial feature (talker identity, a feature that helps acoustic elements bind into one perceptual object) also influences selective attention performance. In Experiment 1, we show that perceptual continuity of the target talker's voice helps listeners report a sequence of spoken target digits embedded in competing reversed digits spoken by different talkers. In Experiment 2, we provide evidence that this benefit of voice continuity is obligatory and automatic, as if voice continuity biases listeners by making it easier to focus on a subsequent target digit when it is perceptually linked to what was already in the attentional foreground. Our results support the idea that feature continuity enhances streaming automatically, thereby influencing the dynamic processes that allow listeners to successfully attend to objects through time in the cacophony that assails our ears in many everyday settings.
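As a rough illustration of the paradigm, the Python sketch below assembles trials in which a target digit stream either keeps one talker's voice throughout (continuous) or switches talkers between digits, while the remaining talkers contribute time-reversed masker digits. Talker labels, digit counts, and the "r" reversal tag are hypothetical stand-ins for the actual recorded stimuli.

```python
import random

TALKERS = ["T1", "T2", "T3"]  # hypothetical talker labels
DIGITS = list("0123456789")

def make_trial(n_digits=4, continuous_voice=True, seed=None):
    """Assemble one selective-attention trial as (target, maskers) tuples.

    Each position holds one spoken target digit; maskers are time-reversed
    digits from the non-target talkers (reversal marked with an 'r' prefix).
    """
    rng = random.Random(seed)
    target_voice = rng.choice(TALKERS)
    trial = []
    for _ in range(n_digits):
        # In the switching condition, the target voice is re-drawn each digit.
        voice = target_voice if continuous_voice else rng.choice(TALKERS)
        target = (voice, rng.choice(DIGITS))
        maskers = [(t, "r" + rng.choice(DIGITS))
                   for t in TALKERS if t != voice]
        trial.append((target, maskers))
    return trial
```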
Affiliation(s)
- Scott Bressler, Center for Computational Neuroscience and Neural Technology, Boston University, 677 Beacon St., Boston, MA, 02421, USA
|
100
|
Ihlefeld A, Kan A, Litovsky RY. Across-frequency combination of interaural time difference in bilateral cochlear implant listeners. Front Syst Neurosci 2014; 8:22. [PMID: 24653681 PMCID: PMC3949319 DOI: 10.3389/fnsys.2014.00022] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 01/29/2014] [Indexed: 11/13/2022] Open
Abstract
The current study examined how cochlear implant (CI) listeners combine temporally interleaved envelope-ITD information across two sites of stimulation. When two cochlear sites jointly transmit ITD information, one possibility is that CI listeners can extract the most reliable ITD cues available. As a result, ITD sensitivity would be sustained or enhanced compared to single-site stimulation. Alternatively, mutual interference across multiple sites of ITD stimulation could worsen dual-site performance compared to listening to the better of two electrode pairs. Two experiments used direct stimulation to examine how CI users can integrate ITDs across two pairs of electrodes. Experiment 1 tested ITD discrimination for two stimulation sites using 100-Hz sinusoidally modulated 1000-pps-carrier pulse trains. Experiment 2 used the same stimuli ramped with 100 ms windows, as a control condition with minimized onset cues. For all stimuli, performance improved monotonically with increasing modulation depth. Results show that when CI listeners are stimulated with electrode pairs at two cochlear sites, sensitivity to ITDs was similar to that seen when only the electrode pair with better sensitivity was activated. None of the listeners showed a decrement in performance from the worse electrode pair. This could be achieved either by listening to the better electrode pair or by truly integrating the information across cochlear sites.
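The stimulus class is easy to sketch: a pulse train at the carrier rate, sinusoidally amplitude modulated, duplicated into two channels with one channel delayed by the ITD, and optionally ramped to minimize onset cues. The Python sketch below renders such a stimulus at an assumed sample rate, with one-sample pulses standing in for biphasic electrical pulses; it approximates the abstract's description rather than reproducing the study's direct-stimulation interface.

```python
import numpy as np

FS = 100_000  # rendering sample rate in Hz (an assumption)

def sam_pulse_train(dur_s=0.3, carrier_pps=1000, mod_hz=100,
                    mod_depth=1.0, itd_us=0, ramp_ms=0, fs=FS):
    """Sinusoidally amplitude-modulated pulse train with a whole-waveform ITD.

    Mirrors the stimulus class in the abstract (100-Hz modulation on a
    1000-pps carrier); pulse shape and rendering details are assumptions.
    """
    n = int(fs * dur_s)
    t = np.arange(n) / fs
    # one-sample pulses at the carrier rate (placeholder for biphasic pulses)
    pulses = (np.arange(n) % int(fs / carrier_pps) == 0).astype(float)
    # modulation envelope: peak fixed at 1, trough at 1 - mod_depth
    env = 1 - mod_depth / 2 + (mod_depth / 2) * np.sin(2 * np.pi * mod_hz * t)
    left = pulses * env
    if ramp_ms:  # raised-cosine on/off ramps to minimize onset ITD cues
        r = int(fs * ramp_ms / 1000)
        win = 0.5 * (1 - np.cos(np.pi * np.arange(r) / r))
        left[:r] *= win
        left[-r:] *= win[::-1]
    shift = int(round(fs * itd_us / 1e6))
    right = np.roll(left, shift)  # delay the right channel by the ITD
    return np.stack([left, right])
```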
Affiliation(s)
- Antje Ihlefeld, Waisman Center, University of Wisconsin, Madison, WI, USA; Center for Neural Science, New York University, New York, NY, USA
- Alan Kan, Waisman Center, University of Wisconsin, Madison, WI, USA
|