1
Yao D, Zhao J, Wang L, Shang Z, Gu J, Wang Y, Jia M, Li J. Effects of spatial configuration and fundamental frequency on speech intelligibility in multiple-talker conditions in the ipsilateral horizontal plane and median plane. J Acoust Soc Am 2024; 155:2934-2947. [PMID: 38717201] [DOI: 10.1121/10.0025857]
Abstract
Spatial separation and fundamental frequency (F0) separation are effective cues for improving the intelligibility of target speech in multi-talker scenarios. Previous studies predominantly focused on spatial configurations within the frontal hemifield, overlooking the ipsilateral side and the entire median plane, where localization confusion often occurs. This study investigated the impact of spatial and F0 separation on intelligibility under these underexplored spatial configurations. Speech reception thresholds were measured in three experiments for scenarios involving two to four talkers, either in the ipsilateral horizontal plane or in the entire median plane, using monotonized speech with varying F0s as stimuli. The results revealed that spatial separation in symmetrical positions (front-back symmetry in the ipsilateral horizontal plane, or front-back and up-down symmetry in the median plane) contributes positively to intelligibility. Both target direction and relative target-masker separation influence the masking release attributed to spatial separation. As the number of talkers exceeds two, the masking release from spatial separation diminishes. Nevertheless, F0 separation remains a remarkably effective cue and can even enhance the benefit of spatial separation for intelligibility. Further analysis indicated that current intelligibility models have difficulty accurately predicting intelligibility in the scenarios explored in this study.
Affiliation(s)
- Dingding Yao
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jiale Zhao
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Linyi Wang
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zengqiang Shang
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jianjun Gu
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yunan Wang
- Department of Electronic and Information Engineering, Beihang University, Beijing 100191, China
- Maoshen Jia
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Junfeng Li
- Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
2
Farahbod H, Rogalsky C, Keator LM, Cai J, Pillay SB, Turner K, LaCroix A, Fridriksson J, Binder JR, Middlebrooks JC, Hickok G, Saberi K. Informational Masking in Aging and Brain-lesioned Individuals. J Assoc Res Otolaryngol 2023; 24:67-79. [PMID: 36471207] [PMCID: PMC9971540] [DOI: 10.1007/s10162-022-00877-9]
Abstract
Auditory stream segregation and informational masking were investigated in brain-lesioned individuals, age-matched controls with no neurological disease, and young college-age students. A psychophysical paradigm known as rhythmic masking release (RMR) was used to examine the ability of participants to identify a change in the rhythmic sequence of 20-ms Gaussian noise bursts presented through headphones and filtered through generalized head-related transfer functions to produce the percept of an externalized auditory image (i.e., a 3D virtual reality sound). The target rhythm was temporally interleaved with a masker sequence comprising similar noise bursts in a manner that resulted in a uniform sequence with no information remaining about the target rhythm when the target and masker were presented from the same location (an impossible task). Spatially separating the target and masker sequences allowed participants to determine whether there was a change in the target rhythm midway through its presentation. RMR thresholds were defined as the minimum spatial separation between target and masker sequences that resulted in a 70.7%-correct performance level in a single-interval, 2-alternative forced-choice adaptive tracking procedure. The main findings were (1) significantly higher RMR thresholds for individuals with brain lesions (especially those with damage to parietal areas) and (2) a left-right spatial asymmetry in performance for lesion (but not control) participants. These findings contribute to a better understanding of spatiotemporal relations in informational masking and the neural bases of auditory scene analysis.
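For context, the 70.7%-correct point referenced above is the convergence level of a two-down/one-up transformed up-down staircase (Levitt, 1971). The Python sketch below illustrates such an adaptive track applied to target-masker separation; it is not the authors' code, and the starting separation, step sizes, step-halving rule, and number of reversals are assumptions made purely for illustration.

```python
import random

def two_down_one_up_track(run_trial, start_sep=90.0, step=8.0,
                          min_step=2.0, n_reversals=8):
    """Adaptively track the target-masker separation (degrees) yielding
    ~70.7% correct with Levitt's two-down/one-up rule.
    run_trial(separation) must return True for a correct response."""
    sep = start_sep
    n_correct = 0
    last_dir = None            # direction of the previous change: -1 down, +1 up
    reversals = []

    def move(new_dir):
        nonlocal sep, step, last_dir
        if last_dir is not None and new_dir != last_dir:
            reversals.append(sep)
            step = max(min_step, step / 2.0)    # assumed step-halving at reversals
        last_dir = new_dir
        sep = max(0.0, sep + new_dir * step)

    while len(reversals) < n_reversals:
        if run_trial(sep):
            n_correct += 1
            if n_correct == 2:                  # two correct in a row -> make harder
                n_correct = 0
                move(-1)
        else:                                   # any error -> make easier
            n_correct = 0
            move(+1)
    tail = reversals[-6:]                       # average the last reversals
    return sum(tail) / len(tail)

# Toy listener whose percent correct grows with separation (true threshold near 20 deg):
listener = lambda sep: random.random() < 0.5 + 0.5 * min(1.0, sep / 40.0)
print(round(two_down_one_up_track(listener), 1))
```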
Affiliation(s)
- Haleh Farahbod
- Department of Cognitive Sciences, University of California, Irvine, USA
- Corianne Rogalsky
- College of Health Solutions, Arizona State University, Tempe, USA
- Lynsey M. Keator
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, USA
- Julia Cai
- College of Health Solutions, Arizona State University, Tempe, USA
- Sara B. Pillay
- Department of Neurology, Medical College of Wisconsin, Milwaukee, USA
- Katie Turner
- Department of Cognitive Sciences, University of California, Irvine, USA
- Arianna LaCroix
- College of Health Sciences, Midwestern University, Glendale, USA
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, USA
- Jeffrey R. Binder
- Department of Neurology, Medical College of Wisconsin, Milwaukee, USA
- John C. Middlebrooks
- Department of Cognitive Sciences, University of California, Irvine, USA; Department of Otolaryngology, University of California, Irvine, USA; Department of Language Science, University of California, Irvine, USA
- Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, USA; Department of Language Science, University of California, Irvine, USA
- Kourosh Saberi
- Department of Cognitive Sciences, University of California, Irvine, USA.
3
Iva P, Martin R, Fielding J, Clough M, White O, Godic B, van der Walt A, Rajan R. Discriminating spatialised speech in complex environments in multiple sclerosis. Cortex 2023; 159:217-232. [PMID: 36640621] [DOI: 10.1016/j.cortex.2022.11.014]
Abstract
People with multiple sclerosis (pwMS) frequently present with deficits in binaural processing used for sound localization. This study examined spatial release from speech-on-speech masking in pwMS, which involves binaural processing and additional higher-level mechanisms underlying streaming, such as spatial attention. Twenty-six pwMS with mild disease severity (Expanded Disability Status Scale score <3) and 20 age-matched controls listened via headphones to pre-recorded sentences from a standard list presented simultaneously with eight-talker babble. Virtual acoustic techniques were used to simulate sentences originating from 0°, 20°, or 50° on the interaural horizontal plane around the listener whilst babble was presented continuously at 0° azimuth, and participants verbally repeated the target sentence. In a separate task, two simultaneous sentences, both containing a colour and a number, were presented, and participants were required to report the target colour and number. Both competing sentences could originate from 0°, 20°, or 50° on the azimuthal plane. Participants also completed a series of neuropsychological assessments, an auditory questionnaire, and a three-alternative forced-choice task that involved the detection of interaural time differences (ITDs) in noise bursts. Spatial release from masking was observed in both pwMS and controls, as response accuracy in the two speech discrimination tasks improved in the spatially separated conditions (20° and 50°) compared with the co-localised condition. However, pwMS demonstrated significantly less spatial release (18%) than controls (28%) when discriminating colour/number coordinates. At 50° separation, pwMS discriminated significantly fewer coordinates (77%) than controls (89%). In contrast, pwMS performed similarly to controls when sentences were presented in babble and on the basic ITD discrimination task. Significant correlations between speech discrimination performance and standardized neuropsychological scores were observed across all spatial conditions. Our findings suggest that spatial hearing is likely to be compromised in pwMS, thereby affecting the perception of competing speech originating from various locations.
Affiliation(s)
- Pippa Iva
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia.
- Russell Martin
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Joanne Fielding
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Meaghan Clough
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Owen White
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Branislava Godic
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
- Anneke van der Walt
- Department of Neurosciences, Central Clinical School, Alfred Hospital, Monash University, Melbourne, VIC, Australia
- Ramesh Rajan
- Department of Physiology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia
4
Luke R, Innes-Brown H, Undurraga JA, McAlpine D. Human cortical processing of interaural coherence. iScience 2022; 25:104181. [PMID: 35494228] [PMCID: PMC9051632] [DOI: 10.1016/j.isci.2022.104181]
Abstract
Sounds reach the ears as a mixture of energy generated by different sources. Listeners extract cues that distinguish different sources from one another, including how similar the sounds arriving at the two ears are, the interaural coherence (IAC). Here, we find that listeners cannot reliably distinguish two completely interaurally coherent sounds from a single sound with reduced IAC. Pairs of sounds heard toward the front were readily confused with single sounds with high IAC, whereas those heard to the sides were confused with single sounds with low IAC. Sounds carrying supra-ethological spatial cues are perceived as more diffuse than their IAC alone would predict, and this is captured by a computational model comprising a restricted, sound-frequency-dependent distribution of auditory-spatial detectors. We observed elevated cortical hemodynamic responses for sounds with low IAC, suggesting that the ambiguity elicited by sounds with low interaural similarity imposes an elevated cortical load.
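For context, interaural coherence is commonly quantified as the maximum of the normalized cross-correlation between the left- and right-ear signals over physiologically plausible lags (roughly ±1 ms). The snippet below is a generic illustration of that textbook definition, not the authors' analysis code; the sampling rate and lag range are assumptions.

```python
import numpy as np

def interaural_coherence(left, right, fs, max_lag_ms=1.0):
    """Maximum of the normalized interaural cross-correlation within +/- max_lag_ms."""
    left = left - np.mean(left)
    right = right - np.mean(right)
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    if norm == 0.0:
        return 0.0
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    full = np.correlate(left, right, mode="full")     # lags -(N-1) .. +(N-1)
    zero = len(right) - 1                             # index of zero lag
    window = full[zero - max_lag: zero + max_lag + 1]
    return float(np.max(np.abs(window)) / norm)

# Identical noise at both ears -> IAC near 1; independent noise -> IAC near 0.
fs = 16000
noise = np.random.randn(fs // 4)
print(interaural_coherence(noise, noise, fs))                     # ~1.0
print(interaural_coherence(noise, np.random.randn(fs // 4), fs))  # ~0.0
```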
Affiliation(s)
- Robert Luke
- Macquarie University, Sydney, NSW, Australia
- The Bionics Institute, Melbourne, VIC, Australia
5
Shared cognitive resources between memory and attention during sound-sequence encoding. Atten Percept Psychophys 2022; 84:739-759. [PMID: 35106682] [DOI: 10.3758/s13414-021-02390-2]
Abstract
You are on the phone, walking down a street. This daily situation calls for selective attention, allowing you to ignore surrounding irrelevant sounds while trying to encode in memory the relevant information from the phone. Attention and memory are indeed two cognitive functions that interact constantly. However, their interaction is not yet well characterized during sound-sequence encoding. We independently manipulated both selective attention and working memory in a delayed matching-to-sample task with two tone series played successively in one ear. During the first melody presentation (memory encoding), weakly or highly distracting melodies were played in the other ear. Detection of the difference between the two comparison melodies could be easy or difficult, requiring low- or high-precision encoding, i.e., low or high memory load. Sixteen non-musicians and 16 musicians performed this new task. As expected, both groups of participants were less accurate in the difficult memory task and in difficult-to-ignore distractor conditions. Importantly, an interaction between memory-task difficulty and distractor difficulty was found in both groups. Non-musicians showed less difference between easy and difficult-to-ignore distractors in the difficult than in the easy memory task. In contrast, musicians, who performed better than non-musicians, showed a greater difference between easy and difficult-to-ignore distractors in the difficult than in the easy memory task. In a second experiment including trials without a distractor, we could show that these effects are in line with cognitive load theory. Taken together, these results speak for shared cognitive resources between working memory and attention during sound-sequence encoding.
6
Cortical Processing of Binaural Cues as Shown by EEG Responses to Random-Chord Stereograms. J Assoc Res Otolaryngol 2021; 23:75-94. [PMID: 34904205] [PMCID: PMC8783002] [DOI: 10.1007/s10162-021-00820-4]
Abstract
Spatial hearing facilitates the perceptual organization of complex soundscapes into accurate mental representations of sound sources in the environment. Yet, the role of binaural cues in auditory scene analysis (ASA) has received relatively little attention in recent neuroscientific studies employing novel, spectro-temporally complex stimuli. This may be because a stimulation paradigm that provides binaurally derived grouping cues of sufficient spectro-temporal complexity has not yet been established for neuroscientific ASA experiments. Random-chord stereograms (RCS) are a class of auditory stimuli that exploit spectro-temporal variations in the interaural envelope correlation of noise-like sounds with interaurally coherent fine structure; they evoke salient auditory percepts that emerge only under binaural listening. Here, our aim was to assess the usability of the RCS paradigm for indexing binaural processing in the human brain. To this end, we recorded EEG responses to RCS stimuli from 12 normal-hearing subjects. The stimuli consisted of an initial 3-s noise segment with interaurally uncorrelated envelopes, followed by another 3-s segment, where envelope correlation was modulated periodically according to the RCS paradigm. Modulations were applied either across the entire stimulus bandwidth (wideband stimuli) or in temporally shifting frequency bands (ripple stimulus). Event-related potentials and inter-trial phase coherence analyses of the EEG responses showed that the introduction of the 3- or 5-Hz wideband modulations produced a prominent change-onset complex and ongoing synchronized responses to the RCS modulations. In contrast, the ripple stimulus elicited a change-onset response but no response to ongoing RCS modulation. Frequency-domain analyses revealed increased spectral power at the fundamental frequency and the first harmonic of wideband RCS modulations. RCS stimulation yields robust EEG measures of binaurally driven auditory reorganization and has potential to provide a flexible stimulation paradigm suitable for isolating binaural effects in ASA experiments.
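For context, the inter-trial phase coherence used above is conventionally the magnitude of the mean unit phase vector across trials, ITPC(f, t) = |(1/N) Σ_n exp(i·φ_n(f, t))|. The sketch below illustrates that standard computation on band-limited single-trial data; the filter design and example parameters are assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def itpc(trials, fs, f_lo, f_hi):
    """Inter-trial phase coherence for an (n_trials, n_samples) array,
    computed from the analytic phase in the band [f_lo, f_hi] Hz."""
    b, a = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, trials, axis=1)
    phase = np.angle(hilbert(filtered, axis=1))
    return np.abs(np.mean(np.exp(1j * phase), axis=0))   # one value per time sample

# Example: 40 trials containing a phase-locked 5-Hz component plus noise.
fs = 250
t = np.arange(0, 3, 1 / fs)
trials = np.sin(2 * np.pi * 5 * t) + np.random.randn(40, t.size)
print(itpc(trials, fs, 4, 6).mean())   # well above the ~1/sqrt(40) chance floor
```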
7
Corbin NE, Buss E, Leibold LJ. Spatial Hearing and Functional Auditory Skills in Children With Unilateral Hearing Loss. J Speech Lang Hear Res 2021; 64:4495-4512. [PMID: 34609204] [PMCID: PMC9132156] [DOI: 10.1044/2021_jslhr-20-00081]
Abstract
Purpose: The purpose of this study was to characterize spatial hearing abilities of children with longstanding unilateral hearing loss (UHL). UHL was expected to negatively impact children's sound source localization and masked speech recognition, particularly when the target and masker were separated in space. Spatial release from masking (SRM) in the presence of a two-talker speech masker was expected to predict functional auditory performance as assessed by parent report.
Method: Participants were 5- to 14-year-olds with sensorineural or mixed UHL, age-matched children with normal hearing (NH), and adults with NH. Sound source localization was assessed on the horizontal plane (-90° to 90°), with noise that was either all-pass, low-pass, high-pass, or an unpredictable mixture. Speech recognition thresholds were measured in the sound field for sentences presented in two-talker speech or speech-shaped noise. Target speech was always presented from 0°; the masker was either colocated with the target or spatially separated at ±90°. Parents of children with UHL rated their children's functional auditory performance in everyday environments via questionnaire.
Results: Sound source localization was poorer for children with UHL than those with NH. Children with UHL also derived less SRM than those with NH, with increased masking for some conditions. Effects of UHL were larger in the two-talker than the noise masker, and SRM in two-talker speech increased with age for both groups of children. Children with UHL whose parents reported greater functional difficulties achieved less SRM when either masker was on the side of the better-hearing ear.
Conclusions: Children with UHL are clearly at a disadvantage compared with children with NH for both sound source localization and masked speech recognition with spatial separation. Parents' report of their children's real-world communication abilities suggests that spatial hearing plays an important role in outcomes for children with UHL.
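For reference, the spatial release from masking reported here is conventionally computed as the difference between the colocated and spatially separated speech recognition thresholds, SRM (dB) = SRT_colocated - SRT_separated, so that positive values indicate a benefit of separating target and masker; this is the usual convention, though individual studies may define the reference conditions slightly differently.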
Affiliation(s)
- Nicole E. Corbin
- Department of Communication Science and Disorders, University of Pittsburgh, PA
- Emily Buss
- Department of Otolaryngology—Head & Neck Surgery, School of Medicine, University of North Carolina at Chapel Hill
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
8
Pastore MT, Natale SJ, Clayton C, Dorman MF, Yost WA, Zhou Y. Effects of Head Movements on Sound-Source Localization in Single-Sided Deaf Patients With Their Cochlear Implant On Versus Off. Ear Hear 2021; 41:1660-1674. [PMID: 33136640] [PMCID: PMC7772279] [DOI: 10.1097/aud.0000000000000882]
Abstract
OBJECTIVES: We investigated the ability of single-sided deaf listeners implanted with a cochlear implant (SSD-CI) to (1) determine the front-back and left-right location of sound sources presented from loudspeakers surrounding the listener and (2) use small head rotations to further improve their localization performance. The resulting behavioral data were used for further analyses investigating the value of so-called "monaural" spectral shape cues for front-back sound source localization.
DESIGN: Eight SSD-CI patients were tested with their cochlear implant (CI) on and off. Eight normal-hearing (NH) listeners, with one ear plugged during the experiment, and another group of eight NH listeners, with neither ear plugged, were also tested. Gaussian noises of 3-sec duration were band-pass filtered to 2-8 kHz and presented from 1 of 6 loudspeakers surrounding the listener, spaced 60° apart. Perceived sound source localization was tested under conditions where the patients faced forward with the head stationary, and under conditions where they rotated their heads over a limited range (the exact range is given in the full-text article).
RESULTS: (1) Under stationary listener conditions, unilaterally-plugged NH listeners and SSD-CI listeners (with their CIs both on and off) were nearly at chance in determining the front-back location of high-frequency sound sources. (2) Allowing rotational head movements improved performance in both the front-back and left-right dimensions for all listeners. (3) For SSD-CI patients with their CI turned off, head rotations substantially reduced front-back reversals, and the combination of turning on the CI with head rotations led to near-perfect resolution of front-back sound source location. (4) Turning on the CI also improved left-right localization performance. (5) As expected, NH listeners with both ears unplugged localized to the correct front-back and left-right hemifields both with and without head movements.
CONCLUSIONS: Although SSD-CI listeners demonstrate a relatively poor ability to distinguish the front-back location of sound sources when their head is stationary, their performance is substantially improved with head movements. Most of this improvement occurs when the CI is off, suggesting that the NH ear does most of the "work" in this regard, though some additional gain is introduced by turning the CI on. During head turns, these listeners appear to rely primarily on comparing changes in head position to changes in monaural level cues produced by the direction-dependent attenuation of high-frequency sounds that results from acoustic head shadowing. In this way, SSD-CI listeners overcome limitations to the reliability of monaural spectral and level cues under stationary conditions. SSD-CI listeners may have learned, through chronic monaural experience before CI implantation, or with the relatively impoverished spatial cues provided by their CI-implanted ear, to exploit the monaural level cue. Unilaterally-plugged NH listeners were also able to use this cue during the experiment to realize approximately the same magnitude of benefit from head turns just minutes after plugging, though their performance was less accurate than that of the SSD-CI listeners, both with and without their CI turned on.
Affiliation(s)
- M Torben Pastore
- College of Health Solutions, Arizona State University, Tempe, Arizona, USA
9
Middlebrooks JC. A Search for a Cortical Map of Auditory Space. J Neurosci 2021; 41:5772-5778. [PMID: 34011526] [PMCID: PMC8265804] [DOI: 10.1523/jneurosci.0501-21.2021]
Abstract
This is the story of a search for a cortical map of auditory space. The search began with a study that was reported in the first issue of The Journal of Neuroscience (Middlebrooks and Pettigrew, 1981). That paper described some unexpected features of spatial sensitivity in the auditory cortex while failing to demonstrate the expected map. In the ensuing 40 years, we have encountered the following: panoramic spatial coding by single neurons; a rich variety of response patterns that are unmasked in the absence of general anesthesia; sharpening of spatial sensitivity when an animal is engaged in a listening task; and reorganization of spatial sensitivity in the presence of competing sounds. We have not encountered a map, but not through lack of trying. On the basis of years of negative results by our group and others, and positive results that are inconsistent with static point-to-point topography, we are confident in concluding that there just ain't no map. Instead, we have come to appreciate the highly dynamic spatial properties of cortical neurons, which serve the needs of listeners in a changing sonic environment.
Affiliation(s)
- John C Middlebrooks
- Department of Otolaryngology
- Department of Neurobiology and Behavior
- Department of Cognitive Sciences
- Department of Biomedical Engineering, University of California at Irvine, Irvine, California 92697-5310
10
Hausfeld L, Shiell M, Formisano E, Riecke L. Cortical processing of distracting speech in noisy auditory scenes depends on perceptual demand. Neuroimage 2020; 228:117670. [PMID: 33359352] [DOI: 10.1016/j.neuroimage.2020.117670]
Abstract
Selective attention is essential for the processing of multi-speaker auditory scenes because they require the perceptual segregation of the relevant speech ("target") from irrelevant speech ("distractors"). For simple sounds, it has been suggested that the processing of multiple distractor sounds depends on bottom-up factors affecting task performance. However, it remains unclear whether such dependency applies to naturalistic multi-speaker auditory scenes. In this study, we tested the hypothesis that increased perceptual demand (the processing requirement posed by the scene to separate the target speech) reduces the cortical processing of distractor speech, thus decreasing its perceptual segregation. Human participants were presented with auditory scenes including three speakers and asked to selectively attend to one speaker while their EEG was acquired. The perceptual demand of this selective listening task was varied by introducing an auditory cue (interaural time differences, ITDs) for segregating the target from the distractor speakers, while acoustic differences between the distractors were matched in ITD and loudness. We obtained a quantitative measure of the cortical segregation of distractor speakers by assessing the difference in how accurately speech-envelope-following EEG responses could be predicted by models of averaged distractor speech versus models of individual distractor speech. In agreement with our hypothesis, results show that interaural segregation cues led to improved behavioral word-recognition performance and stronger cortical segregation of the distractor speakers. The neural effect was strongest in the δ-band and at early delays (0-200 ms). Our results indicate that during low perceptual demand, the human cortex represents individual distractor speech signals as more segregated. This suggests that, in addition to purely acoustical properties, the cortical processing of distractor speakers depends on factors like perceptual demand.
Affiliation(s)
- Lars Hausfeld
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands.
- Martha Shiell
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
- Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands; Maastricht Centre for Systems Biology, 6200MD Maastricht, The Netherlands
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200MD Maastricht, The Netherlands; Maastricht Brain Imaging Centre, 6200MD Maastricht, The Netherlands
11
Middlebrooks JC, Waters MF. Spatial Mechanisms for Segregation of Competing Sounds, and a Breakdown in Spatial Hearing. Front Neurosci 2020; 14:571095. [PMID: 33041763] [PMCID: PMC7525094] [DOI: 10.3389/fnins.2020.571095]
Abstract
We live in complex auditory environments, in which we are confronted with multiple competing sounds, including the cacophony of talkers in busy markets, classrooms, offices, etc. The purpose of this article is to synthesize observations from a series of experiments that focused on how spatial hearing might aid in disentangling interleaved sequences of sounds. The experiments were unified by a non-verbal task, "rhythmic masking release", which was applied to psychophysical studies in humans and cats and to cortical physiology in anesthetized cats. Human and feline listeners could segregate competing sequences of sounds from sources that were separated by as little as ∼10°. Similarly, single neurons in the cat primary auditory cortex tended to synchronize selectively to sound sequences from one of two competing sources, again with spatial resolution of ∼10°. The spatial resolution of spatial stream segregation varied widely depending on the binaural and monaural acoustical cues that were available in various experimental conditions. This is in contrast to a measure of basic sound-source localization, the minimum audible angle, which showed largely constant acuity across those conditions. The differential utilization of acoustical cues suggests that the central spatial mechanisms for stream segregation differ from those for sound localization. The highest-acuity spatial stream segregation was derived from interaural time and level differences. Brainstem processing of those cues is thought to rely heavily on normal function of a voltage-gated potassium channel, Kv3.3. A family was studied having a dominant negative mutation in the gene for that channel. Affected family members exhibited severe loss of sensitivity for interaural time and level differences, which almost certainly would degrade their ability to segregate competing sounds in real-world auditory scenes.
Affiliation(s)
- John C. Middlebrooks
- Departments of Otolaryngology, Neurobiology and Behavior, Cognitive Sciences, and Biomedical Engineering, University of California, Irvine, Irvine, CA, United States
- Michael F. Waters
- Department of Neurology, Barrow Neurological Institute, Phoenix, AZ, United States
12
Improving Interaural Time Difference Sensitivity Using Short Inter-pulse Intervals with Amplitude-Modulated Pulse Trains in Bilateral Cochlear Implants. J Assoc Res Otolaryngol 2020; 21:105-120. [PMID: 32040655] [DOI: 10.1007/s10162-020-00743-6]
Abstract
Interaural time differences (ITDs) at low frequencies are important for sound localization and spatial speech unmasking. These ITD cues are not encoded in commonly used envelope-based stimulation strategies for cochlear implants (CIs) using high pulse rates. However, ITD sensitivity can be improved by adding extra pulses with short inter-pulse intervals (SIPIs) in unmodulated high-rate trains. Here, we investigated whether this improvement also applies to amplitude-modulated (AM) high-rate pulse trains. To this end, we systematically varied the temporal position of SIPI pulses within the envelope cycle (SIPI phase), the fundamental frequency (F0) of AM (125 Hz and 250 Hz), and AM depth (from 0.1 to 0.9). Stimuli were presented at an interaurally place-matched electrode pair at a reference pulse rate of 1000 pulses/s. Participants performed an ITD-based left/right discrimination task. SIPI insertion resulted in improved ITD sensitivity throughout the range of modulation depths and for both male and female F0s. The improvements were largest for insertion at and around the envelope peak. These results are promising for conveying salient ITD cues at high pulse rates commonly used to encode speech information.
13
Bednar A, Lalor EC. Where is the cocktail party? Decoding locations of attended and unattended moving sound sources using EEG. Neuroimage 2019; 205:116283. [PMID: 31629828] [DOI: 10.1016/j.neuroimage.2019.116283]
Abstract
Recently, we showed that in a simple acoustic scene with one sound source, auditory cortex tracks the time-varying location of a continuously moving sound. Specifically, we found that both the delta phase and alpha power of the electroencephalogram (EEG) can be used to reconstruct the sound source azimuth. However, in natural settings we are often presented with a mixture of multiple competing sounds, and so we must focus our attention on the relevant source in order to segregate it from the competing sources (the 'cocktail party' effect). While many studies have examined this phenomenon in the context of sound envelope tracking by the cortex, it is unclear how we process and utilize spatial information in complex acoustic scenes with multiple sound sources. To test this, we created an experiment where subjects listened to two concurrent sound stimuli that were moving within the horizontal plane over headphones while we recorded their EEG. Participants were tasked with paying attention to one of the two presented stimuli. The data were analyzed by deriving linear mappings, temporal response functions (TRFs), between the EEG data and the attended as well as unattended sound source trajectories. Next, we used these TRFs to reconstruct both trajectories from previously unseen EEG data. In a first experiment, we used noise stimuli and a task that involved spatially localizing embedded targets. Then, in a second experiment, we employed speech stimuli and a non-spatial speech comprehension task. Results showed that the trajectory of an attended sound source can be reliably reconstructed from both the delta phase and alpha power of the EEG, even in the presence of distracting stimuli. Moreover, the reconstruction was robust to task and stimulus type. The cortical representation of the unattended source position was below detection level for the noise stimuli, but we observed weak tracking of the unattended source location for the speech stimuli in the delta phase of the EEG. In addition, we demonstrated that the trajectory reconstruction method can in principle be used to decode selective attention on a single-trial basis; however, its performance was inferior to that of envelope-based decoders. These results suggest a possible dissociation of delta phase and alpha power of EEG in the context of sound trajectory tracking. Moreover, the demonstrated ability to localize and determine the attended speaker in complex acoustic environments is particularly relevant for cognitively controlled hearing devices.
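As a schematic of the decoding logic described above (a minimal sketch, not the authors' implementation), a linear backward model can be estimated by ridge regression from time-lagged multichannel EEG to the sound-source azimuth and then evaluated by correlating reconstructed and true trajectories on held-out data. The channel count, lag range, regularization, and toy data below are assumptions.

```python
import numpy as np

def lagged(eeg, n_lags):
    """Stack time-lagged copies of all channels: (n_samples, n_channels * n_lags).
    np.roll wraps around at the edges, which is negligible for this illustration."""
    return np.column_stack([np.roll(eeg, k, axis=0) for k in range(n_lags)])

def train_decoder(eeg, azimuth, n_lags=32, ridge=1e2):
    X = lagged(eeg, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ azimuth)

def reconstruct(eeg, weights, n_lags=32):
    return lagged(eeg, n_lags) @ weights

# Toy data: 8-channel "EEG" that weakly follows a slowly moving source azimuth.
fs, n_ch, n = 64, 8, 64 * 120
azimuth = 90 * np.sin(2 * np.pi * 0.05 * np.arange(n) / fs)
eeg = np.outer(azimuth / 90, np.random.randn(n_ch)) + np.random.randn(n, n_ch)
w = train_decoder(eeg[: n // 2], azimuth[: n // 2])       # train on the first half
rec = reconstruct(eeg[n // 2:], w)                         # reconstruct the held-out half
print(np.corrcoef(rec, azimuth[n // 2:])[0, 1])            # reconstruction accuracy
```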
Affiliation(s)
- Adam Bednar
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Center for Bioengineering, Trinity College Dublin, Dublin, Ireland.
- Edmund C Lalor
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Center for Bioengineering, Trinity College Dublin, Dublin, Ireland; Department of Biomedical Engineering, Department of Neuroscience, University of Rochester, Rochester, NY, USA.
14
Abstract
Humans and other animals use spatial hearing to rapidly localize events in the environment. However, neural encoding of sound location is a complex process involving the computation and integration of multiple spatial cues that are not represented directly in the sensory organ (the cochlea). Our understanding of these mechanisms has increased enormously in the past few years. Current research is focused on the contribution of animal models for understanding human spatial audition, the effects of behavioural demands on neural sound location encoding, the emergence of a cue-independent location representation in the auditory cortex, and the relationship between single-source and concurrent location encoding in complex auditory scenes. Furthermore, computational modelling seeks to unravel how neural representations of sound source locations are derived from the complex binaural waveforms of real-life sounds. In this article, we review and integrate the latest insights from neurophysiological, neuroimaging and computational modelling studies of mammalian spatial hearing. We propose that the cortical representation of sound location emerges from recurrent processing taking place in a dynamic, adaptive network of early (primary) and higher-order (posterior-dorsal and dorsolateral prefrontal) auditory regions. This cortical network accommodates changing behavioural requirements and is especially relevant for processing the location of real-life, complex sounds and complex auditory scenes.
15
Coffey EBJ, Arseneau-Bruneau I, Zhang X, Zatorre RJ. The Music-In-Noise Task (MINT): A Tool for Dissecting Complex Auditory Perception. Front Neurosci 2019; 13:199. [PMID: 30930734] [PMCID: PMC6427094] [DOI: 10.3389/fnins.2019.00199]
Abstract
The ability to segregate target sounds in noisy backgrounds is relevant both to neuroscience and to clinical applications. Recent research suggests that hearing-in-noise (HIN) problems are solved using combinations of sub-skills that are applied according to task demand and information availability. While evidence is accumulating for a musician advantage in HIN, the exact nature of the reported training effect is not fully understood. Existing HIN tests focus on tasks requiring understanding of speech in the presence of competing sound. Because visual, spatial and predictive cues are not systematically considered in these tasks, few tools exist to investigate the most relevant components of cognitive processes involved in stream segregation. We present the Music-In-Noise Task (MINT) as a flexible tool to expand HIN measures beyond speech perception, and for addressing research questions pertaining to the relative contributions of HIN sub-skills, inter-individual differences in their use, and their neural correlates. The MINT uses a match-mismatch trial design: in four conditions (Baseline, Rhythm, Spatial, and Visual) subjects first hear a short instrumental musical excerpt embedded in an informational masker of "multi-music" noise, followed by either a matching or scrambled repetition of the target musical excerpt presented in silence; the four conditions differ according to the presence or absence of additional cues. In a fifth condition (Prediction), subjects hear the excerpt in silence as a target first, which helps to anticipate incoming information when the target is embedded in masking sound. Data from samples of young adults show that the MINT has good reliability and internal consistency, and demonstrate selective benefits of musicianship in the Prediction, Rhythm, and Visual subtasks. We also report a performance benefit of multilingualism that is separable from that of musicianship. Average MINT scores were correlated with scores on a sentence-in-noise perception task, but only accounted for a relatively small percentage of the variance, indicating that the MINT is sensitive to additional factors and can provide a complement and extension of speech-based tests for studying stream segregation. A customizable version of the MINT is made available for use and extension by the scientific community.
Affiliation(s)
- Emily B. J. Coffey
- Department of Psychology, Concordia University, Montreal, QC, Canada
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, QC, Canada
- Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), Montreal, QC, Canada
- Isabelle Arseneau-Bruneau
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, QC, Canada
- Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), Montreal, QC, Canada
- Montreal Neurological Institute, McGill University, Montreal, QC, Canada
- Xiaochen Zhang
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
- Robert J. Zatorre
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, QC, Canada
- Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), Montreal, QC, Canada
- Montreal Neurological Institute, McGill University, Montreal, QC, Canada
16
Tissieres I, Crottaz-Herbette S, Clarke S. Implicit representation of the auditory space: contribution of the left and right hemispheres. Brain Struct Funct 2019; 224:1569-1582. [PMID: 30848352] [DOI: 10.1007/s00429-019-01853-5]
Abstract
Spatial cues contribute to the ability to segregate sound sources and thus facilitate their detection and recognition. This implicit use of spatial cues can be preserved in cases of cortical spatial deafness, suggesting that partially distinct neural networks underlie the explicit sound localization and the implicit use of spatial cues. We addressed this issue by assessing 40 patients, 20 patients with left and 20 patients with right hemispheric damage, for their ability to use auditory spatial cues implicitly in a paradigm of spatial release from masking (SRM) and explicitly in sound localization. The anatomical correlates of their performance were determined with voxel-based lesion-symptom mapping (VLSM). During the SRM task, the target was always presented at the centre, whereas the masker was presented at the centre or at one of the two lateral positions on the right or left side. The SRM effect was absent in some but not all patients; the inability to perceive the target when the masker was at one of the lateral positions correlated with lesions of the left temporo-parieto-frontal cortex or of the right inferior parietal lobule and the underlying white matter. As previously reported, sound localization depended critically on the right parietal and opercular cortex. Thus, explicit and implicit use of spatial cues depends on at least partially distinct neural networks. Our results suggest that the implicit use may rely on the left-dominant position-linked representation of sound objects, which has been demonstrated in previous EEG and fMRI studies.
Affiliation(s)
- Isabel Tissieres
- Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland
- Sonia Crottaz-Herbette
- Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland
- Stephanie Clarke
- Service de neuropsychologie et de neuroréhabilitation, Centre Hospitalier Universitaire Vaudois (CHUV), Université de Lausanne, Lausanne, Switzerland.
17
Activity in Human Auditory Cortex Represents Spatial Separation Between Concurrent Sounds. J Neurosci 2018; 38:4977-4984. [PMID: 29712782] [DOI: 10.1523/jneurosci.3323-17.2018]
Abstract
The primary and posterior auditory cortex (AC) are known for their sensitivity to spatial information, but how this information is processed is not yet understood. AC that is sensitive to spatial manipulations is also modulated by the number of auditory streams present in a scene (Smith et al., 2010), suggesting that spatial and nonspatial cues are integrated for stream segregation. We reasoned that, if this is the case, then it is the distance between sounds rather than their absolute positions that is essential. To test this hypothesis, we measured human brain activity in response to spatially separated concurrent sounds with fMRI at 7 tesla in five men and five women. Stimuli were spatialized amplitude-modulated broadband noises recorded for each participant via in-ear microphones before scanning. Using a linear support vector machine classifier, we investigated whether sound location and/or location plus spatial separation between sounds could be decoded from the activity in Heschl's gyrus and the planum temporale. The classifier was successful only when comparing patterns associated with the conditions that had the largest difference in perceptual spatial separation. Our pattern of results suggests that the representation of spatial separation is not merely the combination of single locations, but rather is an independent feature of the auditory scene.
SIGNIFICANCE STATEMENT: Often, when we think of auditory spatial information, we think of where sounds are coming from, that is, the process of localization. However, this information can also be used in scene analysis, the process of grouping and segregating features of a soundwave into objects. Essentially, when sounds are further apart, they are more likely to be segregated into separate streams. Here, we provide evidence that activity in the human auditory cortex represents the spatial separation between sounds rather than their absolute locations, indicating that scene analysis and localization processes may be independent.
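As a generic illustration of the multivoxel decoding step described above, a linear support vector machine can be cross-validated on trial-wise activity patterns; the feature matrix, labels, and fold structure below are placeholders, not the study's data or exact pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder multivoxel patterns: 60 trials x 500 voxels, two spatial conditions.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500))
y = np.repeat([0, 1], 30)                      # e.g. small vs. large separation
X[y == 1, :20] += 0.5                          # weak condition-related signal

clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)      # 5-fold stand-in for run-wise CV
print(scores.mean())                           # above-chance accuracy -> decodable
```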
18
Da Costa S, Clarke S, Crottaz-Herbette S. Keeping track of sound objects in space: The contribution of early-stage auditory areas. Hear Res 2018; 366:17-31. [PMID: 29643021] [DOI: 10.1016/j.heares.2018.03.027]
Abstract
The influential dual-stream model of auditory processing stipulates that information pertaining to the meaning and to the position of a given sound object is processed in parallel along two distinct pathways, the ventral and dorsal auditory streams. Functional independence of the two processing pathways is well documented by the conscious experience of patients with focal hemispheric lesions. On the other hand, there is growing evidence that the meaning and the position of a sound are combined early in the processing pathway, possibly already at the level of early-stage auditory areas. Here, we investigated how early auditory areas integrate sound object meaning and space (simulated by interaural time differences) using a repetition suppression fMRI paradigm at 7 T. Subjects listened passively to environmental sounds presented in blocks of repetitions of the same sound object (same category) or different sound objects (different categories), perceived either in the left or right space (no change within block) or shifted left-to-right or right-to-left halfway through the block (change within block). Environmental sounds activated bilaterally the superior temporal gyrus, middle temporal gyrus, inferior frontal gyrus, and right precentral cortex. Repetition suppression effects were measured within bilateral early-stage auditory areas in the lateral portion of Heschl's gyrus and the posterior superior temporal plane. Left lateral early-stage areas showed significant effects of position and change, as well as Category x Initial Position and Category x Change in Position interactions, while right lateral areas showed a main effect of category and a Category x Change in Position interaction. The combined evidence from our study and from previous studies speaks in favour of a position-linked representation of sound objects, which is independent of semantic encoding within the ventral stream and of spatial encoding within the dorsal stream. We argue for a third auditory stream, which has its origin in the lateral belt areas and tracks sound objects across space.
Affiliation(s)
- Sandra Da Costa
- Centre d'Imagerie BioMédicale (CIBM), EPFL et Universités de Lausanne et de Genève, Bâtiment CH, Station 6, CH-1015 Lausanne, Switzerland.
- Stephanie Clarke
- Service de Neuropsychologie et de Neuroréhabilitation, CHUV, Université de Lausanne, Avenue Pierre Decker 5, CH-1011 Lausanne, Switzerland
- Sonia Crottaz-Herbette
- Service de Neuropsychologie et de Neuroréhabilitation, CHUV, Université de Lausanne, Avenue Pierre Decker 5, CH-1011 Lausanne, Switzerland
19
Eipert L, Klinge-Strahl A, Klump GM. Processing of interaural phase differences in components of harmonic and mistuned complexes in the inferior colliculus of the Mongolian gerbil. Eur J Neurosci 2018; 47:1242-1251. [PMID: 29603825] [DOI: 10.1111/ejn.13922]
Abstract
Harmonicity and spatial location provide eminent cues for the perceptual grouping of sounds. In general, harmonicity is a strong grouping cue. In contrast, spatial cues such as interaural phase or time differences provide strong grouping of stimulus sequences but weak grouping of simultaneously presented sounds. By studying the neuronal basis underlying the interaction of these cues in the processing of simultaneous sounds using van Rossum spike-train distance measures, we aim to explain the interaction observed in psychophysical experiments. Responses to interaural phase differences imposed on single components of harmonic and mistuned complex tones, as well as noise-delay functions, were recorded as multiunit responses from the inferior colliculus of Mongolian gerbils. Results revealed a better representation of interaural phase differences when imposed on a harmonic rather than a mistuned frequency component of a complex tone. The representation of interaural phase differences was better for long integration-time windows, approximately reflecting firing rates, than for short integration-time windows reflecting the temporal pattern of the stimulus-driven response. We found only a weak impact of interaural phase differences when combined with mistuning of a component in a harmonic tone complex.
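For orientation, the van Rossum spike-train distance mentioned above convolves each spike train with a causal exponential kernel of time constant tau and integrates the squared difference between the resulting functions, D^2 = (1/tau) * integral of [f(t) - g(t)]^2 dt. The following is a minimal, discretized illustration of that definition, not the study's recording-specific analysis; tau, the time step, and the spike times are assumptions.

```python
import numpy as np

def van_rossum_distance(spikes_a, spikes_b, tau=0.01, dt=0.0005, t_max=1.0):
    """Distance between two spike trains (spike times in seconds)."""
    t = np.arange(0.0, t_max, dt)
    def filtered(spike_times):
        f = np.zeros_like(t)
        for s in spike_times:
            f += np.where(t >= s, np.exp(-(t - s) / tau), 0.0)  # causal exponential
        return f
    diff = filtered(spikes_a) - filtered(spikes_b)
    return np.sqrt(np.sum(diff ** 2) * dt / tau)

# Identical trains give 0; one extra spike gives a distance near sqrt(1/2) ~ 0.7.
a = [0.10, 0.30, 0.55]
print(van_rossum_distance(a, a))              # 0.0
print(van_rossum_distance(a, a + [0.80]))     # ~0.7 (one unpaired spike)
```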
Affiliation(s)
- Lena Eipert
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Astrid Klinge-Strahl
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany; Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
20
Jaeger M, Bleichner MG, Bauer AKR, Mirkovic B, Debener S. Did You Listen to the Beat? Auditory Steady-State Responses in the Human Electroencephalogram at 4 and 7 Hz Modulation Rates Reflect Selective Attention. Brain Topogr 2018. [DOI: 10.1007/s10548-018-0637-8]
21
Oberem J, Seibold J, Koch I, Fels J. Intentional switching in auditory selective attention: Exploring attention shifts with different reverberation times. Hear Res 2017; 359:32-39. [PMID: 29305038] [DOI: 10.1016/j.heares.2017.12.013]
Abstract
Using a well-established binaural-listening paradigm, the ability to intentionally switch auditory selective attention was examined under anechoic, low reverberation (0.8 s) and high reverberation (1.75 s) conditions. Twenty-three young, normal-hearing subjects were tested in a within-subject design to analyze the influence of reverberation time. Spoken word pairs by two speakers were presented simultaneously to subjects from two of eight azimuth positions. The stimuli consisted of a single number word (i.e., 1 to 9) followed by one of the direction words "UP" or "DOWN", spoken in German. Guided by a visual cue presented before auditory stimulus onset and indicating the position of the target speaker, subjects were asked to identify whether the target number was numerically smaller or greater than five and to categorize the direction of the second word. Switch costs (i.e., reaction-time differences between a position switch of the target and a position repetition) were larger under the high reverberation condition. Furthermore, error rates depended strongly on reverberant energy, and reverberation interacted with the congruence effect (i.e., stimuli spoken by the target and distractor may evoke the same answer (congruent) or different answers (incongruent)), indicating larger congruence effects at longer reverberation times.
Collapse
Affiliation(s)
- Josefa Oberem
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany.
| | - Julia Seibold
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany.
| | - Iring Koch
- Institute of Psychology, RWTH Aachen University, Jägerstraße 17, 52066 Aachen, Germany.
| | - Janina Fels
- Institute of Technical Acoustics, Medical Acoustics Group, RWTH Aachen University, Kopernikusstraße 5, 52074 Aachen, Germany.
| |
Collapse
|
22
|
Itatani N, Klump GM. Interaction of spatial and non-spatial cues in auditory stream segregation in the European starling. Eur J Neurosci 2017; 51:1191-1200. [PMID: 28922512 DOI: 10.1111/ejn.13716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2017] [Revised: 09/14/2017] [Accepted: 09/14/2017] [Indexed: 11/29/2022]
Abstract
Integrating sounds from the same source and segregating sounds from different sources in an acoustic scene are essential functions of the auditory system. Naturally, the auditory system makes use of multiple cues simultaneously. Here, we investigate the interaction between spatial cues and frequency cues in stream segregation in the European starling (Sturnus vulgaris) using an objective measure of perception. Neural responses to streaming sounds were recorded while the bird performed a behavioural task that yields higher sensitivity during a one-stream than during a two-stream percept. Birds were trained to detect an onset time shift of the B tone in an ABA- triplet sequence in which A and B could differ in frequency and/or spatial location. When the frequency difference, the spatial separation between the signal sources, or both were increased, behavioural time-shift detection performance deteriorated. Spatial separation had a smaller effect on performance than the frequency difference, and the two cues affected performance additively. Neural responses in the primary auditory forebrain were affected by the frequency and spatial cues. However, frequency and spatial cue differences large enough to elicit behavioural effects were not accompanied by correlated differences in the neural responses. The discrepancy between the neuronal response pattern and the behavioural response is discussed in relation to the task given to the bird. The perceptual effects of combining different cues in auditory scene analysis indicate that these cues are analysed independently and given different weights, suggesting that the streaming percept arises subsequent to the initial cue analysis.
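To make the ABA- paradigm referenced above concrete, the sketch below generates an ABA- triplet sequence in which the onset of the B tone can be shifted. The frequencies, durations, ramp length, and shift value are illustrative defaults rather than the study's stimulus parameters, and the spatial-location manipulation is omitted.

```python
import numpy as np

def aba_triplet_sequence(f_a=1000.0, f_b=1400.0, tone_dur=0.075,
                         onset_interval=0.125, n_triplets=4,
                         b_shift=0.0, fs=44100):
    """ABA- triplet sequence; b_shift delays the B-tone onset (seconds)."""
    def tone(freq):
        t = np.arange(int(tone_dur * fs)) / fs
        ramp = np.minimum(1.0, np.minimum(t, tone_dur - t) / 0.005)  # 5-ms ramps
        return np.sin(2 * np.pi * freq * t) * ramp

    triplet_samples = int(4 * onset_interval * fs)   # slots: A, B, A, silence
    out = np.zeros(n_triplets * triplet_samples)
    for k in range(n_triplets):
        for slot, freq, shift in [(0, f_a, 0.0), (1, f_b, b_shift), (2, f_a, 0.0)]:
            start = k * triplet_samples + int((slot * onset_interval + shift) * fs)
            burst = tone(freq)
            out[start:start + len(burst)] += burst
    return out

signal = aba_triplet_sequence(b_shift=0.020)  # 20-ms onset shift of the B tone
```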
Collapse
Affiliation(s)
- Naoya Itatani
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
| | - Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Carl-von-Ossietzky University Oldenburg, 26111 Oldenburg, Germany
- Cluster of Excellence Hearing4all, Carl-von-Ossietzky University Oldenburg, Oldenburg, Germany
| |
Collapse
|
23
|
David M, Lavandier M, Grimault N, Oxenham AJ. Discrimination and streaming of speech sounds based on differences in interaural and spectral cues. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:1674. [PMID: 28964066 PMCID: PMC5617732 DOI: 10.1121/1.5003809] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/09/2017] [Revised: 09/01/2017] [Accepted: 09/07/2017] [Indexed: 05/29/2023]
Abstract
Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.
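As a rough sketch of the kind of rendering described above, the following assumes a pair of head-related impulse responses (HRIRs) per simulated direction and alternates consecutive tokens between two directions. The function names, the source of the HRIRs, and the two-position setup are assumptions made for illustration, not details of the study's stimulus generation.

```python
import numpy as np
from scipy.signal import fftconvolve

def spatialize(token, hrir_left, hrir_right):
    """Render a monaural token at a virtual direction given an HRIR pair."""
    return np.stack([fftconvolve(token, hrir_left),
                     fftconvolve(token, hrir_right)], axis=-1)

def alternating_sequence(tokens, hrirs_pos1, hrirs_pos2):
    """Alternate consecutive tokens between two simulated directions."""
    rendered = [spatialize(tok, *(hrirs_pos1 if i % 2 == 0 else hrirs_pos2))
                for i, tok in enumerate(tokens)]
    return np.concatenate(rendered, axis=0)
```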
Collapse
Affiliation(s)
- Marion David
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Mathieu Lavandier
- Univ Lyon, ENTPE, Laboratoire Génie Civil et bâtiment, Rue Maurice Audin, 69518 Vaulx-en-Velin Cedex, France
| | - Nicolas Grimault
- Centre de Recherche en Neurosciences de Lyon, Université Lyon 1, Cognition Auditive et Psychoacoustique, Avenue Tony Garnier, 69366 Lyon Cedex 07, France
| | - Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
24
|
Itatani N, Klump GM. Animal models for auditory streaming. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0112. [PMID: 28044022 DOI: 10.1098/rstb.2016.0112] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/31/2016] [Indexed: 11/12/2022] Open
Abstract
Sounds in the natural environment need to be assigned to acoustic sources in order to evaluate complex auditory scenes. Separating sources will affect the analysis of auditory features of sounds. As the benefits of assigning sounds to specific sources accrue to all species communicating acoustically, the ability for auditory scene analysis is widespread among different animals. Animal studies allow for a deeper insight into the neuronal mechanisms underlying auditory scene analysis. Here, we will review the paradigms applied in the study of auditory scene analysis and the streaming of sequential sounds in animal models. We will compare the psychophysical results from the animal studies to the evidence obtained in human psychophysics of auditory streaming, i.e., in a task commonly used for measuring the capability for auditory scene analysis. Furthermore, the neuronal correlates of auditory streaming will be reviewed in different animal models, and the observations of the neurons' response measures will be related to perception. The across-species comparison will reveal whether similar demands in the analysis of acoustic scenes have resulted in similar perceptual and neuronal processing mechanisms in the wide range of species capable of auditory scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Collapse
Affiliation(s)
- Naoya Itatani
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
| | - Georg M Klump
- Cluster of Excellence Hearing4all, Animal Physiology and Behaviour Group, Department of Neuroscience, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, 26111 Oldenburg, Germany
| |
Collapse
|
25
|
Suthakar K, Ryugo DK. Descending projections from the inferior colliculus to medial olivocochlear efferents: Mice with normal hearing, early onset hearing loss, and congenital deafness. Hear Res 2017; 343:34-49. [DOI: 10.1016/j.heares.2016.06.014] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/12/2016] [Revised: 06/20/2016] [Accepted: 06/24/2016] [Indexed: 11/24/2022]
|
26
|
Carlile S, Fox A, Orchard-Mills E, Leung J, Alais D. Six Degrees of Auditory Spatial Separation. J Assoc Res Otolaryngol 2016; 17:209-21. [PMID: 27033087 PMCID: PMC4854823 DOI: 10.1007/s10162-016-0560-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2014] [Accepted: 03/09/2016] [Indexed: 11/30/2022] Open
Abstract
The location of a sound is derived computationally from acoustical cues rather than being inherent in the topography of the input signal, as in vision. Since Lord Rayleigh, the descriptions of that representation have swung between "labeled line" and "opponent process" models. Employing a simple variant of a two-point separation judgment using concurrent speech sounds, we found that spatial discrimination thresholds changed nonmonotonically as a function of the overall separation. Rather than increasing with separation, spatial discrimination thresholds first declined as two-point separation increased before reaching a turning point and increasing thereafter with further separation. This "dipper" function, with a minimum at 6° of separation, was seen for regions around the midline as well as for more lateral regions (30° and 45°). The discrimination thresholds for the binaural localization cues were linear over the same range, so these cannot explain the shape of these functions. These data and a simple computational model indicate that the perception of auditory space involves a local code or multichannel mapping emerging subsequent to the binaural cue coding.
Collapse
Affiliation(s)
- Simon Carlile
- School of Medical Sciences, University of Sydney, Sydney, NSW, 2006, Australia.
- Bosch Institute, University of Sydney, Sydney, NSW, 2006, Australia.
| | - Alex Fox
- School of Medical Sciences, University of Sydney, Sydney, NSW, 2006, Australia
| | - Emily Orchard-Mills
- School of Medical Sciences, University of Sydney, Sydney, NSW, 2006, Australia
- School of Psychology, University of Sydney, Sydney, NSW, 2006, Australia
| | - Johahn Leung
- School of Medical Sciences, University of Sydney, Sydney, NSW, 2006, Australia
| | - David Alais
- School of Psychology, University of Sydney, Sydney, NSW, 2006, Australia
| |
Collapse
|
27
|
Abstract
Stream segregation enables a listener to disentangle multiple competing sequences of sounds. A recent study from our laboratory demonstrated that cortical neurons in anesthetized cats exhibit spatial stream segregation (SSS) by synchronizing preferentially to one of two sequences of noise bursts that alternate between two source locations. Here, we examine the emergence of SSS along the ascending auditory pathway. Extracellular recordings were made in anesthetized rats from the inferior colliculus (IC), the nucleus of the brachium of the IC (BIN), the medial geniculate body (MGB), and the primary auditory cortex (A1). Stimuli consisted of interleaved sequences of broadband noise bursts that alternated between two source locations. At stimulus presentation rates of 5 and 10 bursts per second, at which human listeners report robust SSS, neural SSS is weak in the central nucleus of the IC (ICC), emerges in the BIN and in approximately two-thirds of neurons in the ventral MGB (MGBv), and is prominent throughout A1. The enhancement of SSS at the cortical level reflects both increased spatial sensitivity and increased forward suppression. We demonstrate that forward suppression in A1 does not result from synaptic inhibition at the cortical level. Instead, forward suppression might reflect synaptic depression in the thalamocortical projection. Together, our findings indicate that auditory streams are increasingly segregated along the ascending auditory pathway as distinct, mutually synchronized neural populations. SIGNIFICANCE STATEMENT Listeners are capable of disentangling multiple competing sequences of sounds that originate from distinct sources. This stream segregation is aided by differences in spatial location between the sources. A possible substrate of spatial stream segregation (SSS) has been described in the auditory cortex, but the mechanisms leading to those cortical responses are unknown. Here, we investigated SSS at three levels of the ascending auditory pathway with extracellular unit recordings in anesthetized rats. We found that neural SSS emerges within the ascending auditory pathway as a consequence of a sharpening of spatial sensitivity and increasing forward suppression. Our results highlight brainstem mechanisms that culminate in SSS at the level of the auditory cortex.
Collapse
|
28
|
Abstract
Listeners can perceive interleaved sequences of sounds from two or more sources as segregated streams. In humans, physical separation of sound sources is a major factor enabling such stream segregation. Here, we examine spatial stream segregation with a psychophysical measure in domestic cats. Cats depressed a pedal to initiate a target sequence of brief sound bursts in a particular rhythm and then released the pedal when the rhythm changed. The target bursts were interleaved with a competing sequence of bursts that could differ in source location but otherwise were identical to the target bursts. This task was possible only when the sources were heard as segregated streams. When the sound bursts had broad spectra, cats could detect the rhythm change when target and competing sources were separated by as little as 9.4°. Essentially equal levels of performance were observed when frequencies were restricted to a high band (4 to 25 kHz), in which the principal spatial cues presumably were related to sound levels. When the stimulus band was restricted to 0.4 to 1.6 kHz, leaving interaural time differences as the principal spatial cue, performance was severely degraded. The frequency sensitivity of cats in this task contrasts with that of humans, who show better spatial stream segregation with low- than with high-frequency sounds. Possible explanations for the species difference include the smaller interaural delays available to cats, due to the smaller size of their heads, and the potentially greater sound-level cues available, due to the cat's frontally directed pinnae and higher audible frequency range.
Collapse
|
29
|
Cohen YE, Bennur S, Christison-Lagay K, Gifford AM, Tsunada J. Functional Organization of the Ventral Auditory Pathway. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016; 894:381-388. [PMID: 27080679 PMCID: PMC5444378 DOI: 10.1007/978-3-319-25474-6_40] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
Abstract
The fundamental problem in audition is determining the mechanisms required by the brain to transform an unlabelled mixture of auditory stimuli into coherent perceptual representations. This process is called auditory-scene analysis. The perceptual representations that result from auditory-scene analysis are formed through a complex interaction of perceptual grouping, attention, categorization, and decision-making. Despite a great deal of scientific energy devoted to understanding these aspects of hearing, we still do not understand (1) how sound perception arises from neural activity and (2) the causal relationship between neural activity and sound perception. Here, we review the role of the "ventral" auditory pathway in sound perception. We hypothesize that, in the early parts of the auditory cortex, neural activity reflects the auditory properties of a stimulus. However, in later parts of the auditory cortex, neurons encode the sensory evidence that forms an auditory decision and are causally involved in the decision process. Finally, in the prefrontal cortex, which receives input from the auditory cortex, neural activity reflects the actual perceptual decision. Together, these studies indicate that the ventral pathway contains hierarchical circuits that are specialized for auditory perception and scene analysis.
Collapse
Affiliation(s)
- Yale E Cohen
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, USA.
- Department of Neuroscience, University of Pennsylvania, Philadelphia, USA.
- Department of Bioengineering, University of Pennsylvania, Philadelphia, USA.
| | - Sharath Bennur
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, USA
| | | | - Adam M Gifford
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, USA
| | - Joji Tsunada
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, USA
| |
Collapse
|
30
|
David M, Lavandier M, Grimault N. Sequential streaming, binaural cues and lateralization. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:3500-3512. [PMID: 26723307 DOI: 10.1121/1.4936902] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Interaural time differences (ITDs) and interaural level differences (ILDs), together with monaural spectral differences (coloration), enable the localization of sound sources. The influence of these spatial cues on obligatory stream segregation, as well as their relative importance, was assessed in experiment 1. A temporal discrimination task favored by integration was used to measure obligatory stream segregation for sequences of speech-shaped noises. Binaural and monaural differences associated with different spatial positions increased discrimination thresholds, indicating that spatial cues can induce stream segregation. The results also demonstrated that ITDs and coloration were more important cues than ILDs. Experiment 2 asked whether sound segregation takes place at the level of acoustic cue extraction (the ITD per se) or at the level of object formation (the perceived azimuth). A difference in ITDs between stimuli was introduced either consistently or inconsistently across frequencies, leading to clearly lateralized sounds or blurred lateralization, respectively. Conditions with ITDs and clearly perceived azimuths induced significantly more segregation than the condition with ITDs but reduced lateralization. The results suggest that segregation was mainly based on a difference in lateralization, although the extraction of ITDs might also have helped segregation up to a ceiling magnitude.
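The experiment-2 manipulation described above can be sketched as follows: a noise token is given either one whole-band interaural delay (consistent ITD, clear lateralization) or opposite delays in low- and high-frequency bands (inconsistent ITD, blurred lateralization). The crossover frequency, delay value, filter choices, and the use of white noise as a stand-in for speech-shaped noise are illustrative assumptions, not the stimulus parameters of the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 44100
rng = np.random.default_rng(0)
noise = rng.standard_normal(int(0.3 * fs))  # 300-ms noise token

def shifted(x, itd_s):
    """Delay (positive) or advance (negative) a signal by itd_s seconds."""
    return np.roll(x, int(round(itd_s * fs)))  # circular shift; fine for a short token

itd = 500e-6  # 500-microsecond interaural delay

# Consistent ITD: the whole right-ear signal lags the left-ear signal.
left_consistent, right_consistent = noise, shifted(noise, itd)

# Inconsistent ITD: opposite delays below and above an arbitrary 1-kHz split,
# blurring the overall lateralization while preserving per-band ITDs.
sos_low = butter(4, 1000, btype="low", fs=fs, output="sos")
sos_high = butter(4, 1000, btype="high", fs=fs, output="sos")
low_band, high_band = sosfiltfilt(sos_low, noise), sosfiltfilt(sos_high, noise)
left_inconsistent = noise
right_inconsistent = shifted(low_band, itd) + shifted(high_band, -itd)
```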
Collapse
Affiliation(s)
- Marion David
- Université de Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, F-69518 Vaulx-en-Velin Cedex, France
| | - Mathieu Lavandier
- Université de Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, F-69518 Vaulx-en-Velin Cedex, France
| | - Nicolas Grimault
- Cognition Auditive et Psychoacoustique, Centre de Recherche en Neurosciences de Lyon, Université Lyon 1, UMR CNRS 5292, Avenue Tony Garnier, 69366 Lyon Cedex 07, France
| |
Collapse
|
31
|
Vannson N, Innes-Brown H, Marozeau J. Dichotic Listening Can Improve Perceived Clarity of Music in Cochlear Implant Users. Trends Hear 2015; 19:2331216515598971. [PMID: 26316123 PMCID: PMC4593516 DOI: 10.1177/2331216515598971] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Musical enjoyment for cochlear implant (CI) recipients is often reported to be unsatisfactory. Our goal was to determine whether the musical experience of postlingually deafened adult CI recipients could be enriched by presenting the bass and treble clef parts of short polyphonic piano pieces separately to each ear (dichotic). Dichotic presentation should artificially enhance the lateralization cues of each part, help listeners to better segregate the parts, and thus provide greater clarity. We also hypothesized that perception of the intended emotion of the pieces and their overall enjoyment would be enhanced in the dichotic mode compared with the monophonic mode (both parts in the same ear) and the diotic mode (both parts in both ears). Twenty-eight piano pieces specifically composed to induce sad or happy emotions were selected. The tempo of the pieces, which ranged from lento to presto, covaried with the intended emotion (from sad to happy). Thirty participants (11 normal-hearing listeners, 11 bimodal CI and hearing-aid users, and 8 bilaterally implanted CI users) took part in this study. Participants were asked to rate the perceived clarity, the intended emotion, and their preference for each piece in the different listening modes. The results indicated that dichotic presentation produced small but significant improvements in perceived clarity ratings. We also found that preference and clarity ratings were significantly higher for pieces with fast tempi than for those with slow tempi. However, no significant differences between diotic and dichotic presentation were found for the participants' preference ratings or their judgments of intended emotion.
Collapse
Affiliation(s)
- Nicolas Vannson
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, UPS, France
- CerCo, CNRS, France
- Cochlear France S.A.S, France
| | | | - Jeremy Marozeau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Lyngby, Denmark
| |
Collapse
|
32
|
Roaring lions and chirruping lemurs: How the brain encodes sound objects in space. Neuropsychologia 2015; 75:304-13. [DOI: 10.1016/j.neuropsychologia.2015.06.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2014] [Revised: 06/07/2015] [Accepted: 06/10/2015] [Indexed: 01/29/2023]
|
33
|
Perception and coding of interaural time differences with bilateral cochlear implants. Hear Res 2015; 322:138-50. [DOI: 10.1016/j.heares.2014.10.004] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/20/2014] [Revised: 10/01/2014] [Accepted: 10/07/2014] [Indexed: 11/21/2022]
|
34
|
Abstract
The auditory system derives locations of sound sources from spatial cues provided by the interaction of sound with the head and external ears. Those cues are analyzed in specific brainstem pathways and then integrated as cortical representation of locations. The principal cues for horizontal localization are interaural time differences (ITDs) and interaural differences in sound level (ILDs). Vertical and front/back localization rely on spectral-shape cues derived from direction-dependent filtering properties of the external ears. The likely first sites of analysis of these cues are the medial superior olive (MSO) for ITDs, lateral superior olive (LSO) for ILDs, and dorsal cochlear nucleus (DCN) for spectral-shape cues. Localization in distance is much less accurate than that in horizontal and vertical dimensions, and interpretation of the basic cues is influenced by additional factors, including acoustics of the surroundings and familiarity of source spectra and levels. Listeners are quite sensitive to sound motion, but it remains unclear whether that reflects specific motion detection mechanisms or simply detection of changes in static location. Intact auditory cortex is essential for normal sound localization. Cortical representation of sound locations is highly distributed, with no evidence for point-to-point topography. Spatial representation is strictly contralateral in laboratory animals that have been studied, whereas humans show a prominent right-hemisphere dominance.
Collapse
Affiliation(s)
- John C Middlebrooks
- Departments of Otolaryngology, Neurobiology and Behavior, Cognitive Sciences, and Biomedical Engineering, University of California at Irvine, Irvine, CA, USA.
| |
Collapse
|
35
|
David M, Lavandier M, Grimault N. Room and head coloration can induce obligatory stream segregation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:5-8. [PMID: 24993189 DOI: 10.1121/1.4883387] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Multiple sound reflections from room materials and a listener's head induce slight spectral modifications of sounds. This coloration depends on the listener and source positions, and on the room itself. This study investigated whether coloration could help segregate competing sources. Obligatory streaming was evaluated for diotic speech-shaped noises using a rhythmic discrimination task. Thresholds for detecting anisochrony were always significantly higher when stimuli differed in spectrum. The tested differences corresponded to three spatial configurations involving different levels of head and room coloration. These results suggest that, despite the generally deleterious effects of reverberation on speech intelligibility, coloration could favor source segregation.
Collapse
Affiliation(s)
- Marion David
- Université de Lyon, École Nationale des Travaux Publics de l'État, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, 69518 Vaulx-en-Velin Cedex, France
| | - Mathieu Lavandier
- Université de Lyon, École Nationale des Travaux Publics de l'État, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, 69518 Vaulx-en-Velin Cedex, France
| | - Nicolas Grimault
- Unité Mixte de Recherche au Centre National de la Recherche Scientifique 5292, Centre de Recherche en Neurosciences de Lyon, Université Lyon 1, Cognition Auditive et Psychoacoustique, Avenue Tony Garnier, 69366 Lyon Cedex 07, France
| |
Collapse
|
36
|
Willaredt MA, Ebbers L, Nothwang HG. Central auditory function of deafness genes. Hear Res 2014; 312:9-20. [DOI: 10.1016/j.heares.2014.02.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/23/2013] [Revised: 01/31/2014] [Accepted: 02/10/2014] [Indexed: 01/11/2023]
|
37
|
Maddox RK, Pospisil DA, Stecker GC, Lee AKC. Directing eye gaze enhances auditory spatial cue discrimination. Curr Biol 2014; 24:748-52. [PMID: 24631242 DOI: 10.1016/j.cub.2014.02.021] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2013] [Revised: 12/12/2013] [Accepted: 02/11/2014] [Indexed: 11/29/2022]
Abstract
The present study demonstrates, for the first time, a specific enhancement of auditory spatial cue discrimination due to eye gaze. Whereas the region of sharpest visual acuity, called the fovea, can be directed at will by moving one's eyes, auditory spatial information is derived primarily from head-related acoustic cues. Past auditory studies have found better discrimination in front of the head [1-3] but have not manipulated subjects' gaze, thus overlooking potential oculomotor influences. Electrophysiological studies have shown that the inferior colliculus, a critical auditory midbrain nucleus, shows visual and oculomotor responses [4-6] and modulations of auditory activity [7-9], and that auditory neurons in the superior colliculus show shifting receptive fields [10-13]. How the auditory system leverages this crossmodal information at the behavioral level remains unknown. Here we directed subjects' gaze (with an eccentric dot) or auditory attention (with lateralized noise) while they performed an auditory spatial cue discrimination task. We found that directing gaze toward a sound significantly enhances discrimination of both interaural level and time differences, whereas directing auditory spatial attention does not. These results show that oculomotor information variably enhances auditory spatial resolution even when the head remains stationary, revealing a distinct behavioral benefit possibly arising from auditory-oculomotor interactions at an earlier level of processing than previously demonstrated.
Collapse
Affiliation(s)
- Ross K Maddox
- Institute for Learning and Brain Sciences, University of Washington, 1715 NE Columbia Road, Portage Bay Building, Box 357988, Seattle, WA 98195, USA
| | - Dean A Pospisil
- Institute for Learning and Brain Sciences, University of Washington, 1715 NE Columbia Road, Portage Bay Building, Box 357988, Seattle, WA 98195, USA
| | - G Christopher Stecker
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Eagleson Hall, Box 354875, Seattle, WA 98105, USA; Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Room 8310, Nashville, TN 37232, USA
| | - Adrian K C Lee
- Institute for Learning and Brain Sciences, University of Washington, 1715 NE Columbia Road, Portage Bay Building, Box 357988, Seattle, WA 98195, USA; Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Eagleson Hall, Box 354875, Seattle, WA 98105, USA.
| |
Collapse
|
38
|
Abstract
The fundamental perceptual unit in hearing is the 'auditory object'. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood.
Collapse
|
39
|
Mutation in the Kv3.3 voltage-gated potassium channel causing spinocerebellar ataxia 13 disrupts sound-localization mechanisms. PLoS One 2013; 8:e76749. [PMID: 24116147 PMCID: PMC3792041 DOI: 10.1371/journal.pone.0076749] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2013] [Accepted: 08/26/2013] [Indexed: 11/19/2022] Open
Abstract
Normal sound localization requires precise comparisons of sound timing and pressure levels between the two ears. The primary localization cues are interaural time differences (ITDs) and interaural level differences (ILDs). Voltage-gated potassium channels, including Kv3.3, are highly expressed in the auditory brainstem and are thought to underlie the exquisite temporal precision and rapid spike rates that characterize brainstem binaural pathways. An autosomal dominant mutation in the gene encoding Kv3.3 has been demonstrated in a large Filipino kindred manifesting as spinocerebellar ataxia type 13 (SCA13). This kindred provides a rare opportunity to test in vivo the importance of a specific channel subunit for human hearing. Here, we demonstrate psychophysically that individuals with the mutant allele exhibit profound deficits in both ITD and ILD sensitivity, despite showing no obvious impairment in pure-tone sensitivity with either ear. Surprisingly, several individuals exhibited the auditory deficits even though they were pre-symptomatic for SCA13. We would expect that impairments of binaural processing as great as those observed in this family would result in prominent deficits in localization of sound sources and in loss of the "spatial release from masking" that aids in understanding speech in the presence of competing sounds.
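For orientation, the two binaural cues named above can be estimated from a stereo signal with a few lines of code. This is a generic sketch (cross-correlation for the ITD, RMS level ratio for the ILD), not the psychophysical procedure used in the study; the example signal and values are invented.

```python
import numpy as np

def itd_ild(left, right, fs):
    """Estimate ITD (s) and ILD (dB) from left/right ear signals.

    ITD: lag of the peak of the interaural cross-correlation; a positive
    value here means the left-ear signal lags the right-ear signal.
    ILD: level difference computed from RMS amplitudes.
    """
    xcorr = np.correlate(left, right, mode="full")
    lag = np.argmax(xcorr) - (len(right) - 1)
    itd = lag / fs
    ild = 20.0 * np.log10(np.sqrt(np.mean(left ** 2)) / np.sqrt(np.mean(right ** 2)))
    return itd, ild

# Example: the left channel is a delayed, attenuated copy of the right channel.
fs = 44100
rng = np.random.default_rng(0)
right = rng.standard_normal(fs // 10)
left = 0.7 * np.roll(right, int(round(500e-6 * fs)))  # ~500-us lag, ~-3 dB
print(itd_ild(left, right, fs))
```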
Collapse
|
40
|
Abstract
In a complex auditory scene, a "cocktail party" for example, listeners can disentangle multiple competing sequences of sounds. A recent psychophysical study in our laboratory demonstrated a robust spatial component of stream segregation showing ∼8° acuity. Here, we recorded single- and multiple-neuron responses from the primary auditory cortex of anesthetized cats while presenting interleaved sound sequences that human listeners would experience as segregated streams. Sequences of broadband sounds alternated between pairs of locations. Neurons synchronized preferentially to sounds from one or the other location, thereby segregating the competing sound sequences. Neurons favoring one source location or the other tended to aggregate within the cortex, suggestive of modular organization. The spatial acuity of stream segregation was as narrow as ∼10°, markedly sharper than the broad spatial tuning for single sources that is well documented in the literature. Spatial sensitivity was sharpest among neurons having high characteristic frequencies. Neural stream segregation was predicted well by a parameter-free model that incorporated single-source spatial sensitivity and a measured forward-suppression term. We found that the forward suppression was not due to post-discharge adaptation in the cortex and, therefore, must have arisen in the subcortical pathway or at the level of thalamocortical synapses. A linear-classifier analysis of single-neuron responses to rhythmic stimuli like those used in our psychophysical study yielded thresholds overlapping those of human listeners. Overall, the results indicate that the ascending auditory system does the work of segregating auditory streams, bringing them to discrete modules in the cortex for selection by top-down processes.
Collapse
|
41
|
Yao JD, Bremen P, Middlebrooks JC. Rat primary auditory cortex is tuned exclusively to the contralateral hemifield. J Neurophysiol 2013; 110:2140-51. [PMID: 23945782 DOI: 10.1152/jn.00219.2013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
The rat is a widely used species for study of the auditory system. Psychophysical results from rats have shown an inability to discriminate sound source locations within a lateral hemifield, despite fairly sharp near-midline acuity. We tested the hypothesis that these characteristics of the rat's sound-localization psychophysics are evident in the spatial sensitivity of its cortical neurons. In addition, we sought quantitative descriptions of the in vivo spatial sensitivity of cortical neurons that would support development of an in vitro experimental model for studying cortical mechanisms of spatial hearing. We assessed the spatial sensitivity of single- and multiple-neuron responses in the primary auditory cortex (A1) of urethane-anesthetized rats. Free-field noise bursts were varied throughout 360° of azimuth in the horizontal plane at sound levels from 10 to 40 dB above neural thresholds. All neurons encountered in A1 displayed contralateral-hemifield spatial tuning: they responded strongly to contralateral sound source locations, their responses cut off sharply for locations near the frontal midline, and they showed weak or no responses to ipsilateral sources. Spatial tuning was quite stable across a 30-dB range of sound levels. Consistent with the rat psychophysical results, a linear-discriminator analysis of spike counts exhibited high spatial acuity for near-midline sounds and poor discrimination for off-midline locations. Hemifield spatial tuning is the most common pattern across all mammals tested previously. The homogeneous population of neurons in rat area A1 will make an excellent system for study of the mechanisms underlying that pattern.
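The spike-count discriminator analysis mentioned above can be illustrated with a minimal sketch: simulated spike counts from two source azimuths are classified with a cross-validated linear discriminant. The Poisson rates, trial counts, and neuron count are invented for illustration and are not the study's data or its exact analysis pipeline.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical spike counts: trials x neurons, recorded at two source azimuths.
rng = np.random.default_rng(1)
counts_a = rng.poisson(lam=8.0, size=(50, 20))  # e.g., a contralateral location
counts_b = rng.poisson(lam=5.0, size=(50, 20))  # e.g., a near-midline location

X = np.vstack([counts_a, counts_b])
y = np.repeat([0, 1], 50)

clf = LinearDiscriminantAnalysis()
accuracy = cross_val_score(clf, X, y, cv=5).mean()
print(f"cross-validated discrimination accuracy: {accuracy:.2f}")
```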
Collapse
Affiliation(s)
- Justin D Yao
- Department of Neurobiology and Behavior, University of California at Irvine, Irvine, California
| | | | | |
Collapse
|
42
|
Bremen P, Middlebrooks JC. Weighting of spatial and spectro-temporal cues for auditory scene analysis by human listeners. PLoS One 2013; 8:e59815. [PMID: 23527271 PMCID: PMC3602423 DOI: 10.1371/journal.pone.0059815] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2012] [Accepted: 02/19/2013] [Indexed: 11/18/2022] Open
Abstract
The auditory system creates a neuronal representation of the acoustic world based on spectral and temporal cues present at the listener's ears, including cues that potentially signal the locations of sounds. Discrimination of concurrent sounds from multiple sources is especially challenging. The current study is part of an effort to better understand the neuronal mechanisms governing this process, which has been termed "auditory scene analysis". In particular, we are interested in spatial release from masking by which spatial cues can segregate signals from other competing sounds, thereby overcoming the tendency of overlapping spectra and/or common temporal envelopes to fuse signals with maskers. We studied detection of pulsed tones in free-field conditions in the presence of concurrent multi-tone non-speech maskers. In "energetic" masking conditions, in which the frequencies of maskers fell within the ± 1/3-octave band containing the signal, spatial release from masking at low frequencies (~600 Hz) was found to be about 10 dB. In contrast, negligible spatial release from energetic masking was seen at high frequencies (~4000 Hz). We observed robust spatial release from masking in broadband "informational" masking conditions, in which listeners could confuse signal with masker even though there was no spectral overlap. Substantial spatial release was observed in conditions in which the onsets of the signal and all masker components were synchronized, and spatial release was even greater under asynchronous conditions. Spatial cues limited to high frequencies (>1500 Hz), which could have included interaural level differences and the better-ear effect, produced only limited improvement in signal detection. Substantially greater improvement was seen for low-frequency sounds, for which interaural time differences are the dominant spatial cue.
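As a small worked example of the quantity reported above, spatial release from masking is the difference between masked thresholds measured with co-located versus spatially separated signal and masker. The threshold values and condition labels below are invented for illustration and do not reproduce the study's results.

```python
# Hypothetical masked thresholds (signal level at threshold, dB SPL) for one
# listener in two masker conditions; values are invented for illustration.
thresholds_db = {
    "energetic_low_frequency": {"colocated": 52.0, "separated": 42.0},
    "informational_broadband": {"colocated": 58.0, "separated": 45.0},
}

for condition, t in thresholds_db.items():
    srm = t["colocated"] - t["separated"]  # spatial release from masking, dB
    print(f"{condition}: SRM = {srm:.1f} dB")
```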
Collapse
Affiliation(s)
- Peter Bremen
- Department of Otolaryngology, University of California Irvine, Irvine, California, United States of America
- Center for Hearing Research, University of California Irvine, Irvine, California, United States of America
| | - John C. Middlebrooks
- Department of Otolaryngology, University of California Irvine, Irvine, California, United States of America
- Department of Neurobiology and Behavior, University of California Irvine, Irvine, California, United States of America
- Department of Cognitive Sciences, University of California Irvine, Irvine, California, United States of America
- Department of Biomedical Engineering, University of California Irvine, Irvine, California, United States of America
- Center for Hearing Research, University of California Irvine, Irvine, California, United States of America
| |
Collapse
|
43
|
High-Acuity Spatial Stream Segregation. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2013; 787:491-9. [DOI: 10.1007/978-1-4614-1590-9_54] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|