1
Target voice probability influences enhancement in auditory selective attention. Atten Percept Psychophys 2023; 85:879-888. [PMID: 36918507 DOI: 10.3758/s13414-023-02683-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/17/2023] [Indexed: 03/16/2023]
Abstract
Auditory selective attention is thought to consist of two mechanisms: an enhancement mechanism that boosts the target signal, and a suppression mechanism that attenuates concurrent distracting signals. The current study explored the conditions necessary to observe enhancement of predictable auditory objects. Participants heard scenes consisting of three voices and a distracting noise. They were asked to find the gender singleton (target) and report whether it was saying even or odd numbers. One of the voices appeared as the high-probability target (70%) across trials. We expected responses to be faster when the high-probability target was in the scene, and results from Experiment 1 supported that prediction. However, this target enhancement effect was substantially weakened when a distracting noise was also in the scene, suggesting that the distractor captured attention and interfered with enhancement. Experiment 2 tested the hypothesis that distractor predictability modulates target enhancement by varying the probability of the distractor. Although this hypothesis was not supported, the results of Experiment 1 were replicated. Findings support the existence of an easily disruptable enhancement mechanism that boosts the representation of highly probable target objects.
2
Daly HR, Pitt MA. Distractor probability influences suppression in auditory selective attention. Cognition 2021; 216:104849. [PMID: 34332212 DOI: 10.1016/j.cognition.2021.104849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2020] [Revised: 03/05/2021] [Accepted: 07/11/2021] [Indexed: 10/20/2022]
Abstract
Auditory selective attention is thought to facilitate listening to the sound of interest (e.g., voice or music) in a noisy environment. One mechanism thought to underlie this ability is suppression of distracting stimuli. However, little is known about its operation or characteristics. We tested whether suppression in auditory selective attention capitalizes on statistical regularities in the environment to facilitate attention. Participants listened to seven-second scenes consisting of several voices speaking sequences of numbers and a distractor, which occurred more (70%) or less (30%) frequently across trials. Participants had to find the voice that was a gender singleton and report whether it was saying even or odd numbers. If suppression is an active component of auditory selective attention, task performance was expected to be better when the more frequent distractor was present. Results across the experiment and three replications revealed significantly shorter RTs when the high-probability distractor was in the scene relative to the low-probability distractor. Results are suggestive of a suppression mechanism that mitigates the detrimental influence of a frequently occurring distracting sound.
Affiliation(s)
- Heather R Daly
  - Department of Psychology, The Ohio State University, United States of America
- Mark A Pitt
  - Department of Psychology, The Ohio State University, United States of America
3
Jeong E, Ryu H, Shin JH, Kwon GH, Jo G, Lee JY. High Oxygen Exchange to Music Indicates Auditory Distractibility in Acquired Brain Injury: An fNIRS Study with a Vector-Based Phase Analysis. Sci Rep 2018; 8:16737. [PMID: 30425287 PMCID: PMC6233191 DOI: 10.1038/s41598-018-35172-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/03/2018] [Accepted: 10/31/2018] [Indexed: 01/30/2023] Open
Abstract
Attention deficits due to auditory distractibility are pervasive among patients with acquired brain injury (ABI). It remains unclear, however, whether attention deficits following ABI specific to auditory modality are associated with altered haemodynamic responses. Here, we examined cerebral haemodynamic changes using functional near-infrared spectroscopy combined with a topological vector-based analysis method. A total of thirty-seven participants (22 healthy adults, 15 patients with ABI) performed a melodic contour identification task (CIT) that simulates auditory distractibility. Findings demonstrated that the melodic CIT was able to detect auditory distractibility in patients with ABI. The rate-corrected score showed that the ABI group performed significantly worse than the non-ABI group in both CIT1 (target contour identification against environmental sounds) and CIT2 (target contour identification against target-like distraction). Phase-associated response intensity during the CITs was greater in the ABI group than in the non-ABI group. Moreover, there existed a significant interaction effect in the left dorsolateral prefrontal cortex (DLPFC) during CIT1 and CIT2. These findings indicated that stronger hemodynamic responses involving oxygen exchange in the left DLPFC can serve as a biomarker for evaluating and monitoring auditory distractibility, which could potentially lead to the discovery of the underlying mechanism that causes auditory attention deficits in patients with ABI.
Affiliation(s)
- Eunju Jeong
  - Department of Arts and Technology, Hanyang University, Seoul, 04763, Republic of Korea
  - Division of Industrial Information Studies, Hanyang University, Seoul, 04763, Republic of Korea
- Hokyoung Ryu
  - Department of Arts and Technology, Hanyang University, Seoul, 04763, Republic of Korea
  - Graduate School of Technology and Innovation Management, Hanyang University, Seoul, 04763, Republic of Korea
- Joon-Ho Shin
  - Department of Neurorehabilitation, National Rehabilitation Center, Ministry of Health and Welfare, Seoul, 01022, Republic of Korea
- Gyu Hyun Kwon
  - Department of Arts and Technology, Hanyang University, Seoul, 04763, Republic of Korea
  - Graduate School of Technology and Innovation Management, Hanyang University, Seoul, 04763, Republic of Korea
- Geonsang Jo
  - Department of Arts and Technology, Hanyang University, Seoul, 04763, Republic of Korea
- Ji-Yeong Lee
  - Department of Neurorehabilitation, National Rehabilitation Center, Ministry of Health and Welfare, Seoul, 01022, Republic of Korea
4
Deroche MLD, Limb CJ, Chatterjee M, Gracco VL. Similar abilities of musicians and non-musicians to segregate voices by fundamental frequency. J Acoust Soc Am 2017; 142:1739. [PMID: 29092612 PMCID: PMC5626570 DOI: 10.1121/1.5005496] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 02/24/2017] [Revised: 09/08/2017] [Accepted: 09/12/2017] [Indexed: 06/07/2023]
Abstract
Musicians can sometimes achieve better speech recognition in noisy backgrounds than non-musicians, a phenomenon referred to as the "musician advantage effect." In addition, musicians are known to possess a finer sense of pitch than non-musicians. The present study examined the hypothesis that the latter fact could explain the former. Four experiments measured speech reception threshold for a target voice against speech or non-speech maskers. Although differences in fundamental frequency (ΔF0s) were shown to be beneficial even when presented to opposite ears (experiment 1), the authors' attempt to maximize their use by directing the listener's attention to the target F0 led to unexpected impairments (experiment 2) and the authors' attempt to hinder their use by generating uncertainty about the competing F0s led to practically negligible effects (experiments 3 and 4). The benefits drawn from ΔF0s showed surprisingly little malleability for a cue that can be used in the complete absence of energetic masking. In half of the experiments, musicians obtained better thresholds than non-musicians, particularly in speech-on-speech conditions, but they did not reliably obtain larger ΔF0 benefits. Thus, the data do not support the hypothesis that the musician advantage effect is based on greater ability to exploit ΔF0s.
Affiliation(s)
- Mickael L D Deroche
  - Centre for Research on Brain, Language and Music, McGill University, 3640 rue de la Montagne, Montreal H3G 2A8, Canada
- Charles J Limb
  - Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco School of Medicine, 2233 Post Street, San Francisco, California 94115, USA
- Monita Chatterjee
  - Auditory Prostheses and Perception Laboratory, Boys Town National Research Hospital, 555 North 30th Street, Omaha, Nebraska 68131, USA
- Vincent L Gracco
  - Haskins Laboratories, 300 George Street, New Haven, Connecticut 06511, USA
5
Jeong E, Ryu H. Melodic Contour Identification Reflects the Cognitive Threshold of Aging. Front Aging Neurosci 2016; 8:134. [PMID: 27378907 PMCID: PMC4904015 DOI: 10.3389/fnagi.2016.00134] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/01/2016] [Accepted: 05/27/2016] [Indexed: 01/16/2023] Open
Abstract
Cognitive decline is a natural phenomenon of aging. Although there exists a consensus that sensitivity to acoustic features of music is associated with such decline, no solid evidence has yet shown that structural elements and contexts of music explain this loss of cognitive performance. This study examined the extent and the type of cognitive decline that is related to the contour identification task (CIT) using tones with different pitches (i.e., melodic contours). Both younger and older adult groups participated in the CIT given in three listening conditions (i.e., focused, selective, and alternating). Behavioral data (accuracy and response times) and hemodynamic reactions were measured using functional near-infrared spectroscopy (fNIRS). Our findings showed cognitive declines in the older adult group but with a subtle difference from the younger adult group. The accuracy of the melodic CITs given in the target-like distraction task (CIT2) was significantly lower than that in the environmental noise (CIT1) condition in the older adult group, indicating that CIT2 may be a benchmark test for age-specific cognitive decline. The fNIRS findings also agreed with this interpretation, revealing significant increases in oxygenated hemoglobin (oxyHb) concentration in the younger (p < 0.05 for Δpre - on task; p < 0.01 for Δon – post task) rather than the older adult group (n.s for Δpre - on task; n.s for Δon – post task). We further concluded that the oxyHb difference was present in the brain regions near the right dorsolateral prefrontal cortex. Taken together, these findings suggest that CIT2 (i.e., the melodic contour task in the target-like distraction) is an optimized task that could indicate the degree and type of age-related cognitive decline.
Affiliation(s)
- Eunju Jeong
  - Department of Arts and Technology, Hanyang University Seoul, South Korea
- Hokyoung Ryu
  - Department of Arts and Technology, Hanyang University Seoul, South Korea
6
Schoenmaker E, Brand T, van de Par S. The multiple contributions of interaural differences to improved speech intelligibility in multitalker scenarios. J Acoust Soc Am 2016; 139:2589. [PMID: 27250153 DOI: 10.1121/1.4948568] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
Spatial separation of talkers is known to improve speech intelligibility in a multitalker scenario. A contribution of binaural unmasking, in addition to a better-ear effect, is usually considered to account for this advantage. Binaural unmasking is assumed to result from the spectro-temporally simultaneous presence of target and masker energy with different interaural properties. However, in the case of speech targets and speech interference, the spectro-temporal signal-to-noise ratio (SNR) fluctuates strongly, resulting in audible and localizable glimpses of target speech even at adverse global SNRs. The disparate interaural properties of target and masker may thus lead to improved segregation without requiring simultaneity. This study addresses the binaural contribution to spatial release from masking due to simultaneous disparities in interaural cues between target and interferers. For that purpose stimuli were designed that lacked simultaneously occurring disparities, but yielded a percept of spatially separated speech nearly indistinguishable from that of non-modified stimuli. A phoneme recognition experiment with either three collocated or spatially separated talkers showed a substantial spatial release from masking for the modified stimuli. The results suggest that binaural unmasking made a minor contribution to spatial release from masking, and that rather the interaural cues mediated by dominant speech components were essential.
Affiliation(s)
- Esther Schoenmaker
  - Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, 26111 Oldenburg, Germany
- Thomas Brand
  - Medizinische Physik, Cluster of Excellence Hearing4all, Carl von Ossietzky University, 26111 Oldenburg, Germany
- Steven van de Par
  - Acoustics Group, Cluster of Excellence Hearing4all, Carl von Ossietzky University, 26111 Oldenburg, Germany
7
Samson F, Johnsrude IS. Effects of a consistent target or masker voice on target speech intelligibility in two- and three-talker mixtures. J Acoust Soc Am 2016; 139:1037-1046. [PMID: 27036241 DOI: 10.1121/1.4942589] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]
Abstract
When the spatial location or identity of a sound is held constant, it is not masked as effectively by competing sounds. This suggests that experience with a particular voice over time might facilitate perceptual organization in multitalker environments. The current study examines whether listeners benefit from experience with a voice only when it is the target, or also when it is a masker, using diotic presentation and a closed-set task (coordinate response measure). A reliable interaction was observed such that, in two-talker mixtures, consistency of masker or target voice over 3-7 trials significantly benefited target recognition performance, whereas in three-talker mixtures, target, but not masker, consistency was beneficial. Overall, this work suggests that voice consistency improves intelligibility, although somewhat differently when two talkers, compared to three talkers, are present, suggesting that consistent-voice information facilitates intelligibility in at least two different ways. Listeners can use a template-matching strategy to extract a known voice from a mixture when it is the target. However, consistent-voice information facilitates segregation only when two, but not three, talkers are present.
Affiliation(s)
- Fabienne Samson
  - Department of Psychology, The Brain and Mind Institute, Natural Sciences Center, Room 227, The University of Western Ontario, London, Ontario, N6A 5B7, Canada
- Ingrid S Johnsrude
  - Department of Psychology, The Brain and Mind Institute, Natural Sciences Center, Room 227, The University of Western Ontario, London, Ontario, N6A 5B7, Canada
8
Gai Y, Ruhland JL, Yin TCT. Effects of forward masking on sound localization in cats: basic findings with broadband maskers. J Neurophysiol 2013; 110:1600-10. [PMID: 23843432 DOI: 10.1152/jn.00255.2013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022] Open
Abstract
Forward masking is traditionally measured with a detection task in which the addition of a preceding masking sound results in an increased signal-detection threshold. Little is known about the influence of forward masking on localization of free-field sound for human or animal subjects. Here we recorded gaze shifts of two head-unrestrained cats during localization using a search-coil technique. A broadband (BB) noise masker was presented straight ahead. A brief signal could come from 1 of the 17 speaker locations in the frontal hemifield. The signal was either a BB or a band-limited (BL) noise. For BB targets, the presence of the forward masker reduced localization accuracy at almost all target levels (20 to 80 dB SPL) along both horizontal and vertical dimensions. Temporal decay of masking was observed when a 15-ms interstimulus gap was added between the end of the masker and the beginning of the target. A large effect of forward masking was also observed for BL targets with low (0.2-2 kHz) and mid (2-7 kHz) frequencies, indicating that the interaural timing cue is susceptible to forward masking. Except at low sound levels, a small or little effect was observed for high-frequency (7-15 kHz) targets, indicating that the interaural level and the spectral cues in that frequency range remained relatively robust. Our findings suggest that different localization mechanisms can operate independently in a complex listening environment.
Affiliation(s)
- Yan Gai
  - Department of Neuroscience, University of Wisconsin, Madison, Wisconsin
9
Kitterick PT, Clarke E, O'Shea C, Seymour J, Summerfield AQ. Target identification using relative level in multi-talker listening. J Acoust Soc Am 2013; 133:2899-2909. [PMID: 23654395 DOI: 10.1121/1.4799810] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
Previous studies have suggested that listeners can identify words spoken by a target talker amidst competing talkers if they are distinguished by their spatial location or vocal characteristics. This "direct" identification of individual words is distinct from an "indirect" identification based on an association with other words (call-signs) that uniquely label the target. The present study assessed listeners' ability to use differences in presentation level between a target and overlapping maskers to identify target words. A new sentence was spoken every 800 ms by an unpredictable talker from an unpredictable location. Listeners reported color and number words in a target sentence distinguished by a unique call-sign. When masker levels were fixed, target words could be identified directly based on their relative level. Speech-reception thresholds (SRTs) were low (-12.9 dB) and were raised by 5 dB when direct identification was disrupted by randomizing masker levels. Thus, direct identification is possible using relative level. The underlying psychometric functions were monotonic even when relative level was a reliable cue. In a further experiment, indirect identification was prevented by removing the unique call-sign cue. SRTs did not change provided that other cues were available to identify target words directly. Thus, direct identification is possible without indirect identification.
Affiliation(s)
- Pádraig T Kitterick
  - Department of Psychology, University of York, York YO10 5DD, United Kingdom