1
Cohn M, Barreda S, Zellou G. Differences in a Musician's Advantage for Speech-in-Speech Perception Based on Age and Task. J Speech Lang Hear Res 2023; 66:545-564. [PMID: 36729698] [DOI: 10.1044/2022_jslhr-22-00259]
Abstract
PURPOSE This study addresses the debated claim that musicians have an advantage in speech-in-noise perception as a result of years of targeted auditory training. We also consider the effect of age on any such advantage, comparing musicians and nonmusicians (age range: 18-66 years), all of whom had normal hearing. We manipulate the degree of fundamental frequency (fo) separation between the competing talkers, and use different tasks, to probe attentional differences that might shape a musician's advantage across ages. METHOD Participants included 29 musicians and 26 nonmusicians ranging in age from 18 to 66 years. They completed two tasks varying in attentional demands: (a) a selective attention task in which listeners identified a target sentence presented with a one-talker interferer (Experiment 1), and (b) a divided attention task in which listeners heard two vowels played simultaneously and identified both competing vowels (Experiment 2). In both paradigms, the fo separation between the two voices was manipulated (Δfo = 0, 0.156, 0.306, 1, 2, 3 semitones). RESULTS Increasing fo separation led to higher accuracy on both tasks. Additionally, we find evidence for a musician's advantage across the two studies. In the sentence identification task, younger adult musicians showed higher accuracy overall, as well as a stronger reliance on fo separation; this advantage declined with musicians' age. In the double vowel identification task, musicians of all ages showed an across-the-board advantage in detecting that two vowels were present, and used fo separation more to aid stream separation, but showed no consistent difference in double vowel identification. CONCLUSIONS Overall, we find support for a hybrid auditory encoding-attention account of music-to-speech transfer: the musician's advantage includes fo, but the benefit also depends on the attentional demands of the task and listeners' age. Taken together, this study suggests a complex relationship between age, musical experience, and speech-in-speech paradigm in shaping a musician's advantage. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21956777.
Affiliation(s)
- Michelle Cohn
- Phonetics Lab, Department of Linguistics, University of California, Davis
- Santiago Barreda
- Phonetics Lab, Department of Linguistics, University of California, Davis
- Georgia Zellou
- Phonetics Lab, Department of Linguistics, University of California, Davis
2
Farahbod H, Rogalsky C, Keator LM, Cai J, Pillay SB, Turner K, LaCroix A, Fridriksson J, Binder JR, Middlebrooks JC, Hickok G, Saberi K. Informational Masking in Aging and Brain-lesioned Individuals. J Assoc Res Otolaryngol 2023; 24:67-79. [PMID: 36471207] [PMCID: PMC9971540] [DOI: 10.1007/s10162-022-00877-9]
Abstract
Auditory stream segregation and informational masking were investigated in brain-lesioned individuals, age-matched controls with no neurological disease, and young college-age students. A psychophysical paradigm known as rhythmic masking release (RMR) was used to examine participants' ability to identify a change in a rhythmic sequence of 20-ms Gaussian noise bursts presented through headphones and filtered through generalized head-related transfer functions to produce the percept of an externalized auditory image (i.e., a 3D virtual-reality sound). The target rhythm was temporally interleaved with a masker sequence comprising similar noise bursts, in a manner that resulted in a uniform sequence with no information remaining about the target rhythm when target and masker were presented from the same location (an impossible task). Spatially separating the target and masker sequences allowed participants to determine whether the target rhythm changed midway through its presentation. RMR thresholds were defined as the minimum spatial separation between target and masker sequences that produced a 70.7%-correct performance level in a single-interval, two-alternative forced-choice adaptive tracking procedure. The main findings were (1) significantly higher RMR thresholds for individuals with brain lesions (especially those with damage to parietal areas) and (2) a left-right spatial asymmetry in performance for lesion (but not control) participants. These findings contribute to a better understanding of spatiotemporal relations in informational masking and the neural bases of auditory scene analysis.
Affiliation(s)
- Haleh Farahbod
- Department of Cognitive Sciences, University of California, Irvine, USA
- Corianne Rogalsky
- College of Health Solutions, Arizona State University, Tempe, USA
- Lynsey M. Keator
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, USA
- Julia Cai
- College of Health Solutions, Arizona State University, Tempe, USA
- Sara B. Pillay
- Department of Neurology, Medical College of Wisconsin, Milwaukee, USA
- Katie Turner
- Department of Cognitive Sciences, University of California, Irvine, USA
- Arianna LaCroix
- College of Health Sciences, Midwestern University, Glendale, USA
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, USA
- Jeffrey R. Binder
- Department of Neurology, Medical College of Wisconsin, Milwaukee, USA
- John C. Middlebrooks
- Department of Cognitive Sciences, Department of Otolaryngology, and Department of Language Science, University of California, Irvine, USA
- Gregory Hickok
- Department of Cognitive Sciences and Department of Language Science, University of California, Irvine, USA
- Kourosh Saberi
- Department of Cognitive Sciences, University of California, Irvine, USA
3
Pinto D, Kaufman M, Brown A, Zion Golumbic E. An ecological investigation of the capacity to follow simultaneous speech and preferential detection of one's own name. Cereb Cortex 2022; 33:5361-5374. [PMID: 36331339] [DOI: 10.1093/cercor/bhac424]
Abstract
Many situations require focusing attention on one speaker while monitoring the environment for potentially important information. Some have proposed that dividing attention among two speakers involves behavioral trade-offs, due to limited cognitive resources. However, the severity of these trade-offs, particularly under ecologically valid circumstances, is not well understood. We investigated the capacity to process simultaneous speech using a dual-task paradigm simulating task demands and stimuli encountered in real life. Participants listened to conversational narratives (Narrative Stream) and monitored a stream of announcements (Barista Stream) to detect when their order was called. We measured participants' performance, neural activity, and skin conductance as they engaged in this dual task. Participants achieved extremely high dual-task accuracy, with no apparent behavioral trade-offs. Moreover, robust neural and physiological responses were observed for target stimuli in the Barista Stream, alongside significant neural speech-tracking of the Narrative Stream. These results suggest that humans have substantial capacity to process simultaneous speech and do not suffer from insufficient processing resources, at least for this highly ecological task combination and level of perceptual load. Results also confirmed the ecological validity of the advantage for detecting one's own name at the behavioral, neural, and physiological levels, highlighting the contribution of personal relevance when processing simultaneous speech.
Affiliation(s)
- Danna Pinto
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan, 5290002, Israel
- Maya Kaufman
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan, 5290002, Israel
- Adi Brown
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan, 5290002, Israel
- Elana Zion Golumbic
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Ramat Gan, 5290002, Israel
4
Avivi-Reich M, Sran RK, Schneider BA. Do Age and Linguistic Status Alter the Effect of Sound Source Diffuseness on Speech Recognition in Noise? Front Psychol 2022; 13:838576. [PMID: 35369266] [PMCID: PMC8965325] [DOI: 10.3389/fpsyg.2022.838576]
Abstract
One aspect of auditory scenes that has received very little attention is the level of diffuseness of sound sources, an aspect of increasing importance given the growing use of amplification systems. When an auditory stimulus is amplified and presented over multiple, spatially separated loudspeakers, the signal's timbre is altered due to comb filtering. In a previous study we examined how increasing the diffuseness of sound sources affects listeners' ability to recognize speech presented in different types of background noise. Listeners performed similarly when the target and the masker were presented over the same number of loudspeakers. However, performance improved when the target was presented over a single loudspeaker (compact) and the masker over three spatially separated loudspeakers (diffuse), and worsened when the target was diffuse and the masker compact. In the current study, we extended this research to examine whether the effect of timbre changes with age and linguistic experience. Twenty-four older adults whose first language was English (Old-EFLs) and 24 younger adults whose second language was English (Young-ESLs) were asked to repeat nonsense sentences masked by Noise, Babble, or Speech, and their results were compared with those of the previously tested younger adults whose first language was English (Young-EFLs). Participants were divided into two experimental groups: (1) a Compact-Target group, in which the target sentences were presented over a single loudspeaker while the masker was presented over either three loudspeakers or a single loudspeaker; and (2) a Diffuse-Target group, in which the target sentences were diffuse while the masker was either compact or diffuse. The results indicate that target timbre has a negligible effect on thresholds when the timbre of the target matches the timbre of the masker in all three groups. When there is a timbre contrast between target and masker, thresholds are significantly lower when the target is compact than when it is diffuse for all three listening groups in a Noise background. However, while this difference is maintained for the Young- and Old-EFLs when the masker is Babble or Speech, speech reception thresholds in the Young-ESL group tend to be equivalent for all four combinations of target and masker timbre.
Affiliation(s)
- Meital Avivi-Reich
- Department of Communication Arts, Sciences and Disorders, Brooklyn College, City University of New York, Brooklyn, NY, United States
- Rupinder Kaur Sran
- Human Communication Lab, Department of Psychology, University of Toronto Mississauga, Toronto, ON, Canada
- Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
- Bruce A. Schneider
- Human Communication Lab, Department of Psychology, University of Toronto Mississauga, Toronto, ON, Canada
5
Calcus A, Schoof T, Rosen S, Shinn-Cunningham B, Souza P. Switching Streams Across Ears to Evaluate Informational Masking of Speech-on-Speech. Ear Hear 2021; 41:208-216. [PMID: 31107365] [PMCID: PMC6856419] [DOI: 10.1097/aud.0000000000000741]
Abstract
OBJECTIVES This study aimed to evaluate the informational component of speech-on-speech masking. Speech perception in the presence of a competing talker involves not only informational masking (IM) but also a number of masking processes involving interaction of masker and target energy in the auditory periphery. Such peripherally generated masking can be eliminated by presenting the target and masker in opposite ears (dichotically). However, this also reduces IM by providing listeners with lateralization cues that support spatial release from masking (SRM). In tonal sequences, IM can be isolated by rapidly switching the lateralization of dichotic target and masker streams across the ears, presumably producing ambiguous spatial percepts that interfere with SRM. However, it is not clear whether this technique works with speech materials. DESIGN Speech reception thresholds (SRTs) were measured in 17 young normal-hearing adults for sentences produced by a female talker in the presence of a competing male talker under three different conditions: diotic (target and masker in both ears), dichotic, and dichotic but switching the target and masker streams across the ears. Because switching rate and signal coherence were expected to influence the amount of IM observed, these two factors varied across conditions. When switches occurred, they were either at word boundaries or periodically (every 116 msec) and either with or without a brief gap (84 msec) at every switch point. In addition, SRTs were measured in a quiet condition to rule out audibility as a limiting factor. RESULTS SRTs were poorer for the four switching dichotic conditions than for the nonswitching dichotic condition, but better than for the diotic condition. Periodic switches without gaps resulted in the worst SRTs compared to the other switch conditions, thus maximizing IM. 
CONCLUSIONS These findings suggest that periodically switching the target and masker streams across the ears, without gaps, is the most effective way to disrupt SRM. This approach can therefore be used in experiments that seek a relatively pure measure of IM and could readily be extended to translational research.
Affiliation(s)
- Axelle Calcus
- UCL Speech, Hearing and Phonetic Sciences, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Laboratoire des Systèmes Perceptifs, Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL University, CNRS, 75005 Paris, France
- Tim Schoof
- UCL Speech, Hearing and Phonetic Sciences, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Stuart Rosen
- UCL Speech, Hearing and Phonetic Sciences, 2 Wakefield Street, London WC1N 1PF, United Kingdom
- Pamela Souza
- Department of Communication Sciences and Disorders, Knowles Hearing Center, Northwestern University, 2240 Campus Drive, Evanston, Illinois 60208, USA
6
Static and dynamic cocktail party listening in younger and older adults. Hear Res 2020; 395:108020. [DOI: 10.1016/j.heares.2020.108020]
7
Focused and divided attention in a simulated cocktail-party situation: ERP evidence from younger and older adults. Neurobiol Aging 2016; 41:138-149. [DOI: 10.1016/j.neurobiolaging.2016.02.018]
8
Shafiro V, Sheft S, Risley R. The intelligibility of interrupted and temporally altered speech: Effects of context, age, and hearing loss. J Acoust Soc Am 2016; 139:455-465. [PMID: 26827039] [PMCID: PMC4723407] [DOI: 10.1121/1.4939891]
Abstract
Temporal constraints on the perception of interrupted speech were investigated by comparing the intelligibility of speech that was periodically gated (PG) and subsequently either temporally compressed (PGTC) by concatenating the remaining speech fragments or temporally expanded (PGTE) by doubling the silent intervals between speech fragments. Experiment 1 examined the effects of PGTC and PGTE at different gating rates (0.5-16 Hz) on the intelligibility of words and sentences for young normal-hearing adults. In Experiment 2, older normal-hearing (ONH) and older hearing-impaired (OHI) adults were tested with sentences only. The results of Experiment 1 indicated that sentences were more intelligible than words. In both experiments, PGTC sentences were less intelligible than either PG or PGTE sentences. Compared with PG sentences, the intelligibility of PGTE sentences was significantly reduced by the same amount for the ONH and OHI groups. Temporal alterations tended to produce a U-shaped rate-intelligibility function with a dip at 2-4 Hz, indicating that temporal alterations interacted with the duration of speech fragments. The present findings demonstrate that both aging and hearing loss negatively affect the overall intelligibility of interrupted and temporally altered speech. However, a mild-to-moderate hearing loss did not exacerbate the negative effects of temporal alterations associated with aging.
Affiliation(s)
- Valeriy Shafiro
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612, USA
- Stanley Sheft
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612, USA
- Robert Risley
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612, USA
9
Gygi B, Giordano BL, Shafiro V, Kharkhurin A, Zhang PX. Predicting the timing of dynamic events through sound: Bouncing balls. J Acoust Soc Am 2015; 138:457-466. [PMID: 26233044] [DOI: 10.1121/1.4923020]
Abstract
Dynamic information in the acoustical signals produced by bouncing objects is often used by listeners to predict the objects' future behavior (e.g., when hitting a ball). This study examined factors that affect the accuracy of motor responses to the sounds of real-world dynamic events. In Experiment 1, listeners heard 2-5 bounces from a tennis ball, ping-pong ball, basketball, or wiffle ball, and tapped to indicate the time of the next bounce in the series. Across ball types and numbers of bounces, listeners were extremely accurate in predicting the correct bounce time (CT), with a mean prediction error of only 2.58% of the CT. Predictions based on a physical model of bouncing events indicated that listeners relied primarily on temporal cues when estimating the timing of the next bounce, and to a lesser extent on loudness and spectral cues. In Experiment 2, the timing of each bounce pattern was altered to correspond to the bounce timing pattern of another ball, producing stimuli with contradictory acoustic cues. Nevertheless, listeners remained highly accurate in their estimates of bounce timing. This suggests that listeners base their estimates of bouncing-object timing on the acoustic cues that provide the most veridical information about the dynamic aspects of object behavior.
Affiliation(s)
- Brian Gygi
- Speech and Hearing Research, United States Department of Veterans Affairs Northern California Health Care System, Martinez, California 94553, USA
- Bruno L Giordano
- Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, Scotland
- Valeriy Shafiro
- Department of Communication Sciences and Disorders, Rush University Medical Center, Chicago, Illinois 60612, USA
- Anatoliy Kharkhurin
- Department of International Studies, American University of Sharjah, Sharjah, United Arab Emirates
- Peter Xinya Zhang
- Department of Audio Arts and Acoustics, Columbia College, Chicago, Illinois 60605, USA
10
Shafiro V, Sheft S, Risley R, Gygi B. Effects of age and hearing loss on the intelligibility of interrupted speech. J Acoust Soc Am 2015; 137:745-756. [PMID: 25698009] [PMCID: PMC4336257] [DOI: 10.1121/1.4906275]
Abstract
How age and hearing loss affect the perception of interrupted speech may vary based on both the physical properties of preserved or obliterated speech fragments and individual listener characteristics. To investigate the perceptual processes and interruption parameters influencing intelligibility across interruption rates, participants of different ages and hearing statuses heard sentences interrupted by silence either at a single primary rate (0.5-8 Hz; 25%, 50%, or 75% duty cycle) or at an additional concurrent secondary rate (24 Hz; 50% duty cycle). Although age and hearing loss significantly affected intelligibility, the ability to integrate sub-phonemic speech fragments produced by the fast secondary rate was similar in all listener groups. Age and hearing loss interacted with rate, with the smallest group differences observed at the lowest and highest interruption rates of 0.5 and 24 Hz. Furthermore, the intelligibility of dual-rate gated sentences was higher than that of single-rate gated sentences with the same proportion of retained speech. Correlations of the intelligibility of interrupted speech with pure-tone thresholds, age, or measures of working memory and auditory spectro-temporal pattern discrimination were generally low to moderate and mostly nonsignificant. These findings demonstrate rate-dependent effects of age and hearing loss on the perception of interrupted speech, suggesting complex interactions of perceptual processes across different time scales.
Affiliation(s)
- Valeriy Shafiro
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612
- Stanley Sheft
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612
- Robert Risley
- Department of Communication Disorders and Sciences, Rush University Medical Center, 600 South Paulina Street, Suite 1012 AAC, Chicago, Illinois 60612
- Brian Gygi
- National Institute for Health Research, Nottingham Hearing Biomedical Research Unit, Nottingham, United Kingdom