1. Puschmann S, Regev M, Fakhar K, Zatorre RJ, Thiel CM. Attention-Driven Modulation of Auditory Cortex Activity during Selective Listening in a Multispeaker Setting. J Neurosci 2024; 44:e1157232023. PMID: 38388426; PMCID: PMC11007309; DOI: 10.1523/jneurosci.1157-23.2023.
Abstract
Real-world listening settings often consist of multiple concurrent sound streams. To limit perceptual interference during selective listening, the auditory system segregates and filters the relevant sensory input. Previous work provided evidence that the auditory cortex is critically involved in this process and selectively gates attended input toward subsequent processing stages. We studied at which level of auditory cortex processing this filtering of attended information occurs using functional magnetic resonance imaging (fMRI) and a naturalistic selective listening task. Forty-five human listeners (of either sex) attended to one of two continuous speech streams, presented either concurrently or in isolation. Functional data were analyzed using an inter-subject analysis to assess stimulus-specific components of ongoing auditory cortex activity. Our results suggest that stimulus-related activity in the primary auditory cortex and the adjacent planum temporale are hardly affected by attention, whereas brain responses at higher stages of the auditory cortex processing hierarchy become progressively more selective for the attended input. Consistent with these findings, a complementary analysis of stimulus-driven functional connectivity further demonstrated that information on the to-be-ignored speech stream is shared between the primary auditory cortex and the planum temporale but largely fails to reach higher processing stages. Our findings suggest that the neural processing of ignored speech cannot be effectively suppressed at the level of early cortical processing of acoustic features but is gradually attenuated once the competing speech streams are fully segregated.
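The inter-subject analysis named in this abstract has a simple computational core: correlate each listener's regional time course with the average time course of all remaining listeners, so that only stimulus-driven activity survives. Below is a minimal leave-one-out inter-subject correlation (ISC) sketch with synthetic data; it illustrates the generic technique under assumed array shapes and noise levels, not the authors' actual fMRI pipeline.

```python
# Minimal leave-one-out inter-subject correlation (ISC) sketch; illustrative
# only, with made-up data, not the authors' fMRI pipeline.
import numpy as np

def leave_one_out_isc(data):
    """data: (n_subjects, n_timepoints) time courses for one region/voxel.
    Returns one ISC value per subject: the correlation of that subject's
    time course with the mean time course of all remaining subjects."""
    n_subjects = data.shape[0]
    isc = np.empty(n_subjects)
    for s in range(n_subjects):
        others = np.delete(data, s, axis=0).mean(axis=0)
        isc[s] = np.corrcoef(data[s], others)[0, 1]
    return isc

# Example: 45 listeners, 300 volumes of a shared stimulus-driven component
# plus subject-specific noise; attention-selective regions would show higher
# ISC among listeners attending the same stream.
rng = np.random.default_rng(0)
shared = rng.standard_normal(300)
data = shared + 0.8 * rng.standard_normal((45, 300))
print(leave_one_out_isc(data).mean())
```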
Affiliation(s)
- Sebastian Puschmann
- Biological Psychology Lab, Department of Psychology, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg, Germany
- Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg, Germany
- Mor Regev
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Kayson Fakhar
- Institute of Computational Neuroscience, University Medical Center Eppendorf, Hamburg University, Hamburg Center of Neuroscience, Hamburg 20246, Germany
- Robert J Zatorre
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, Quebec H2V 2S9, Canada
- Christiane M Thiel
- Biological Psychology Lab, Department of Psychology, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg, Germany
- Cluster of Excellence "Hearing4all", Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg, Germany
2. Caprini F, Zhao S, Chait M, Agus T, Pomper U, Tierney A, Dick F. Generalization of auditory expertise in audio engineers and instrumental musicians. Cognition 2024; 244:105696. PMID: 38160651; DOI: 10.1016/j.cognition.2023.105696.
Abstract
From auditory perception to general cognition, the ability to play a musical instrument has been associated with skills both related and unrelated to music. However, it is unclear if these effects are bound to the specific characteristics of musical instrument training, as little attention has been paid to other populations such as audio engineers and designers, whose auditory expertise may match or surpass that of musicians in specific auditory tasks or more naturalistic acoustic scenarios. We explored this possibility by comparing students of audio engineering (n = 20) to matched conservatory-trained instrumentalists (n = 24) and to naive controls (n = 20) on measures of auditory discrimination, auditory scene analysis, and speech-in-noise perception. We found that audio engineers and performing musicians had generally lower psychophysical thresholds than controls, with pitch perception showing the largest effect size. Compared to controls, audio engineers could better memorise and recall auditory scenes composed of non-musical sounds, whereas instrumental musicians performed best in a sustained selective attention task with two competing streams of tones. Finally, in a diotic speech-in-babble task, musicians showed lower signal-to-noise ratio thresholds than both controls and engineers; however, a follow-up online study did not replicate this musician advantage. We also observed differences in personality that might account for group-based self-selection biases. Overall, we showed that investigating a wider range of forms of auditory expertise can help us corroborate (or challenge) the specificity of the advantages previously associated with musical instrument training.
Affiliation(s)
- Francesco Caprini
- Department of Psychological Sciences, Birkbeck, University of London, UK
- Sijia Zhao
- Department of Experimental Psychology, University of Oxford, UK
- Maria Chait
- University College London (UCL) Ear Institute, UK
- Trevor Agus
- School of Arts, English and Languages, Queen's University Belfast, UK
- Ulrich Pomper
- Department of Cognition, Emotion, and Methods in Psychology, Universität Wien, Austria
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, UK
- Fred Dick
- Department of Experimental Psychology, University College London (UCL), UK
3. Ma W, Bowers L, Behrend D, Hellmuth Margulis E, Forde Thompson W. Child word learning in song and speech. Q J Exp Psychol (Hove) 2024; 77:343-362. PMID: 37073951; DOI: 10.1177/17470218231172494.
Abstract
Listening to sung words rather than spoken words can facilitate word learning and memory in adults and school-aged children. To explore the development of this effect in young children, this study examined word learning (assessed as forming word-object associations) in 1- to 2-year-olds and 3- to 4-year-olds, and word long-term memory (LTM) in 4- to 5-year-olds several days after the initial learning. In an intermodal preferential looking paradigm, children were taught a pair of words utilising adult-directed speech (ADS) and a pair of sung words. Word learning performance was better with sung words than with ADS words in 1- to 2-year-olds (Experiments 1a and 1b), 3- to 4-year-olds (Experiment 1a), and 4- to 5-year-olds (Experiment 2b), revealing a benefit of song for word learning across all age ranges recruited. We also examined whether children successfully learned the words by comparing their performance against chance. The 1- to 2-year-olds learned only the sung words, whereas the 3- to 4-year-olds learned both sung and ADS words, suggesting that the reliance on musical features in word learning observed at ages 1-2 decreases with age. Furthermore, song facilitated the word mapping-recognition processes. The 4- to 5-year-olds' LTM performance did not differ between sung and ADS words; however, they reliably recalled sung words but not spoken words. The reliable LTM of sung words arose from hearing sung words during the initial learning rather than at test. Finally, the benefit of song for word learning and the reliable LTM of sung words observed at ages 3-5 cannot be explained as an attentional effect.
Affiliation(s)
- Weiyi Ma
- School of Human Environmental Sciences, University of Arkansas, Fayetteville, AR, USA
- Lisa Bowers
- Department of Rehabilitation, Human Resources and Communication Disorders, University of Arkansas, Fayetteville, AR, USA
- Douglas Behrend
- Department of Psychological Science, University of Arkansas, Fayetteville, AR, USA
4. MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term neuroplasticity interact during the perceptual learning of concurrent speech. Cereb Cortex 2024; 34:bhad543. PMID: 38212291; PMCID: PMC10839853; DOI: 10.1093/cercor/bhad543.
Abstract
Plasticity from auditory experience shapes the brain's encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45-min training sessions recorded simultaneously with high-density electroencephalography (EEG). We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. Although both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right-hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings reinforce the domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity, which first emerge at a cortical level.
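Phase-locking in FFRs of the kind analyzed here is commonly quantified as the spectral magnitude at the stimulus fundamental frequency. The sketch below does this for a synthetic response; the sampling rate, duration, and noise level are invented for illustration, and the paper's actual analysis is richer.

```python
# Hedged sketch: quantify FFR phase-locking as FFT magnitude at the stimulus
# F0, using a synthetic "response" (assumed parameters, not the paper's data).
import numpy as np

def f0_magnitude(ffr, fs, f0):
    """ffr: averaged time-domain response; fs: sampling rate (Hz).
    Returns the FFT magnitude at the bin nearest f0."""
    spectrum = np.abs(np.fft.rfft(ffr)) / len(ffr)
    freqs = np.fft.rfftfreq(len(ffr), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - f0))]

# Example: a noisy 100 Hz "response" sampled at 10 kHz for 0.2 s.
fs, f0 = 10_000, 100.0
t = np.arange(0, 0.2, 1.0 / fs)
rng = np.random.default_rng(1)
ffr = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal(t.size)
print(f0_magnitude(ffr, fs, f0))
```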
Affiliation(s)
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Jack Stirn
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
5. Hake R, Bürgel M, Nguyen NK, Greasley A, Müllensiefen D, Siedenburg K. Development of an adaptive test of musical scene analysis abilities for normal-hearing and hearing-impaired listeners. Behav Res Methods 2023. PMID: 37957432; DOI: 10.3758/s13428-023-02279-y.
Abstract
Auditory scene analysis (ASA) is the process through which the auditory system makes sense of complex acoustic environments by organising sound mixtures into meaningful events and streams. Although music psychology has acknowledged the fundamental role of ASA in shaping music perception, no efficient test to quantify listeners' ASA abilities in realistic musical scenarios has yet been published. This study presents a new tool for testing ASA abilities in the context of music, suitable for both normal-hearing (NH) and hearing-impaired (HI) individuals: the adaptive Musical Scene Analysis (MSA) test. The test uses a simple 'yes-no' task paradigm to determine whether the sound of a single target instrument is heard in a mixture of popular music. During the online calibration phase, 525 NH and 131 HI listeners were recruited. The level ratio between the target instrument and the mixture, the choice of target instrument, and the number of instruments in the mixture were found to be important factors affecting item difficulty, whereas stereo width (induced by inter-aural level differences) had only a minor effect. Based on a Bayesian logistic mixed-effects model, an adaptive version of the MSA test was developed. In a subsequent validation experiment with 74 listeners (20 HI), MSA scores showed acceptable test-retest reliability and moderate correlations with other music-related tests, pure-tone-average audiograms, age, musical sophistication, and working memory capacity. The MSA test is a user-friendly and efficient open-source tool for evaluating musical ASA abilities and is suitable for profiling the effects of hearing impairment on music perception.
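The adaptive logic behind such a test can be illustrated with a deliberately simplified, single-parameter Bayesian procedure: maintain a posterior over the listener's threshold and present each trial near the current estimate. The published test fits a Bayesian logistic mixed-effects item model, so everything below (the grid, slope, and guessing rate) is an assumed toy version.

```python
# Toy adaptive yes-no procedure in the spirit of an adaptive MSA-style test;
# all parameters are assumptions, far simpler than the published item model.
import numpy as np

def logistic_p(level, threshold, slope=1.0, guess=0.5):
    """P(correct) at a target-to-mixture level ratio (dB), floored at the
    guessing rate of a yes-no task."""
    return guess + (1 - guess) / (1 + np.exp(-slope * (level - threshold)))

thresholds = np.linspace(-30, 10, 81)            # candidate thresholds (dB)
posterior = np.ones_like(thresholds) / len(thresholds)

def next_level():
    # Present the next trial at the posterior-mean threshold, where the
    # assumed logistic puts P(correct) near 0.75 for a yes-no task.
    return np.sum(thresholds * posterior)

def update(level, correct):
    global posterior
    p = logistic_p(level, thresholds)            # likelihood over the grid
    posterior *= p if correct else (1 - p)
    posterior /= posterior.sum()

# Simulated listener with a true threshold of -12 dB:
rng = np.random.default_rng(2)
for _ in range(40):
    lvl = next_level()
    update(lvl, rng.random() < logistic_p(lvl, -12.0))
print(np.sum(thresholds * posterior))            # converges toward -12
```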
Affiliation(s)
- Robin Hake
- Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Michel Bürgel
- Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Ninh K Nguyen
- Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Daniel Müllensiefen
- Department of Psychology, Goldsmiths, University of London, London, UK
- Hanover Music Lab, Hochschule für Musik, Theater und Medien Hannover, Germany
- Kai Siedenburg
- Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
6. Bidelman GM, Carter JA. Continuous dynamics in behavior reveal interactions between perceptual warping in categorization and speech-in-noise perception. Front Neurosci 2023; 17:1032369. PMID: 36937676; PMCID: PMC10014819; DOI: 10.3389/fnins.2023.1032369.
Abstract
Introduction: Spoken language comprehension requires listeners to map continuous features of the speech signal onto discrete category labels. Categories are, however, malleable to surrounding context and stimulus precedence; listeners' percepts can shift dynamically depending on the sequencing of adjacent stimuli, resulting in a warping of the heard phonetic category. Here, we investigated whether such perceptual warping, which amplifies categorical hearing, might alter speech processing in noise-degraded listening scenarios. Methods: We measured continuous dynamics in perception and category judgments of an acoustic-phonetic vowel gradient via mouse tracking. Tokens were presented in serial vs. random order to induce more or less perceptual warping while listeners categorized continua in clean and noise conditions. Results: Listeners' responses were faster, and their mouse trajectories closer to the ultimate behavioral selection (marked visually on the screen), in serial vs. random order, suggesting increased perceptual attraction to category exemplars. Interestingly, order effects emerged earlier and persisted later in the trial time course when categorizing speech in noise. Discussion: These data describe interactions between perceptual warping in categorization and speech-in-noise perception: warping strengthens the behavioral attraction to relevant speech categories, making listeners more decisive (though not necessarily more accurate) in their decisions about both clean and noise-degraded speech.
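Mouse-tracking analyses like this one typically summarize each trial's trajectory by its deviation from the direct start-to-response path. Below is a sketch of one standard summary, maximum deviation; it is a generic, assumed metric for illustration, not necessarily the authors' exact measure.

```python
# Sketch of a standard mouse-tracking summary: maximum perpendicular
# deviation of the cursor path from the straight start-to-end line.
import numpy as np

def max_deviation(xy):
    """xy: (n_samples, 2) cursor coordinates from trial start to click.
    Larger values indicate stronger attraction toward the competing response."""
    start, end = xy[0], xy[-1]
    line = (end - start) / np.linalg.norm(end - start)
    rel = xy - start
    along = rel @ line                       # component along the direct path
    perp = rel - np.outer(along, line)       # component orthogonal to it
    return np.linalg.norm(perp, axis=1).max()

# Example: a path that bows toward the competitor before settling on the target.
t = np.linspace(0, 1, 50)
path = np.column_stack([t, 0.3 * np.sin(np.pi * t)])
print(max_deviation(path))                   # ~0.3
```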
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Jared A. Carter
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Hearing Sciences – Scottish Section, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Glasgow, United Kingdom
7. Cohn M, Barreda S, Zellou G. Differences in a Musician's Advantage for Speech-in-Speech Perception Based on Age and Task. J Speech Lang Hear Res 2023; 66:545-564. PMID: 36729698; DOI: 10.1044/2022_jslhr-22-00259.
Abstract
PURPOSE: This study investigates the claim that musicians have an advantage in speech-in-noise perception from years of targeted auditory training. We also consider the effect of age on any such advantage, comparing musicians and nonmusicians (age range: 18-66 years), all of whom had normal hearing. We manipulate the degree of fundamental frequency (fo) separation between the competing talkers, as well as use different tasks, to probe attentional differences that might shape a musician's advantage across ages. METHOD: Participants included 29 musicians and 26 nonmusicians. They completed two tasks varying in attentional demands: (a) a selective attention task where listeners identify a target sentence presented with a one-talker interferer (Experiment 1), and (b) a divided attention task where listeners hear two vowels played simultaneously and identify both competing vowels (Experiment 2). In both paradigms, fo separation was manipulated between the two voices (Δfo = 0, 0.156, 0.306, 1, 2, 3 semitones). RESULTS: Increasing differences in fo separation led to higher accuracy on both tasks. Additionally, we find evidence for a musician's advantage across the two studies. In the sentence identification task, younger adult musicians showed higher accuracy overall, as well as a stronger reliance on fo separation; yet this advantage declined with musicians' age. In the double vowel identification task, musicians of all ages showed an across-the-board advantage in detecting two vowels, and used fo separation more to aid in stream separation, but showed no consistent difference in double vowel identification. CONCLUSIONS: Overall, we find support for a hybrid auditory encoding-attention account of music-to-speech transfer. The musician's advantage includes fo, but the benefit also depends on the attentional demands of the task and listeners' age. Taken together, this study suggests a complex relationship among age, musical experience, and speech-in-speech paradigm in shaping a musician's advantage. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.21956777
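For reference, the fo manipulation rests on the semitone scale: a separation of s semitones scales frequency by 2**(s/12). A quick sketch, assuming a hypothetical 100 Hz reference talker:

```python
# Semitone-to-Hz arithmetic behind the fo-separation manipulation.
def shift_f0(f0_hz: float, semitones: float) -> float:
    """Return the frequency offset from f0_hz by `semitones` semitones."""
    return f0_hz * 2 ** (semitones / 12)

# The separations used in the paradigm, against an assumed 100 Hz talker:
for s in (0, 0.156, 0.306, 1, 2, 3):
    print(f"{s:>6} st -> {shift_f0(100.0, s):7.2f} Hz")
```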
Affiliation(s)
- Michelle Cohn
- Phonetics Lab, Department of Linguistics, University of California, Davis
- Santiago Barreda
- Phonetics Lab, Department of Linguistics, University of California, Davis
- Georgia Zellou
- Phonetics Lab, Department of Linguistics, University of California, Davis
8. Johns MA, Calloway RC, Phillips I, Karuzis VP, Dutta K, Smith E, Shamma SA, Goupell MJ, Kuchinsky SE. Performance on stochastic figure-ground perception varies with individual differences in speech-in-noise recognition and working memory capacity. J Acoust Soc Am 2023; 153:286. PMID: 36732241; PMCID: PMC9851714; DOI: 10.1121/10.0016756.
Abstract
Speech recognition in noisy environments can be challenging and requires listeners to accurately segregate a target speaker from irrelevant background noise. Stochastic figure-ground (SFG) tasks, in which temporally coherent inharmonic pure tones must be identified from a background, have been used to probe the non-linguistic auditory stream segregation processes important for speech-in-noise processing. However, little is known about the relationship between performance on SFG and speech-in-noise tasks, or about the individual differences that may modulate such relationships. In this study, 37 younger normal-hearing adults performed an SFG task with target figure chords consisting of four, six, eight, or ten temporally coherent tones amongst a background of randomly varying tones. Stimuli were designed to be spectrally and temporally flat. An increased number of temporally coherent tones resulted in higher accuracy and faster reaction times (RTs). For ten target tones, faster RTs were associated with better scores on the Quick Speech-in-Noise task. Individual differences in working memory capacity and self-reported musicianship further modulated these relationships. Overall, results demonstrate that the SFG task could serve as an assessment of auditory stream segregation accuracy and RT that is sensitive to individual differences in cognitive and auditory abilities, even among younger normal-hearing adults.
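An SFG stimulus of this general kind can be sketched as a sequence of random-tone chords with a repeated "figure" of coherent frequencies embedded in them. All parameters below (tone counts, chord duration, frequency range) are assumptions for illustration; the study's actual stimuli were additionally controlled to be spectrally and temporally flat.

```python
# Illustrative stochastic figure-ground (SFG) stimulus: coherent "figure"
# tones repeated across chords of otherwise random background tones.
import numpy as np

def sfg_stimulus(n_chords=20, n_background=10, n_coherent=4,
                 chord_dur=0.05, fs=16_000, seed=0):
    rng = np.random.default_rng(seed)
    freq_pool = np.geomspace(200, 7000, 60)       # log-spaced candidate tones
    figure = rng.choice(freq_pool, n_coherent, replace=False)
    t = np.arange(int(chord_dur * fs)) / fs
    chords = []
    for _ in range(n_chords):
        background = rng.choice(freq_pool, n_background, replace=False)
        freqs = np.concatenate([figure, background])
        chord = sum(np.sin(2 * np.pi * f * t) for f in freqs)
        chords.append(chord / len(freqs))         # crude level normalization
    return np.concatenate(chords)

stim = sfg_stimulus()
print(stim.shape)   # (16000,) = 20 chords x 50 ms at 16 kHz
```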
Affiliation(s)
- Michael A Johns
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Regina C Calloway
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Ian Phillips
- Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland 20889, USA
- Valerie P Karuzis
- Applied Research Laboratory of Intelligence and Security, University of Maryland, College Park, Maryland 20742, USA
- Kelsey Dutta
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA
- Ed Smith
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Shihab A Shamma
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Stefanie E Kuchinsky
- Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland 20889, USA
9. Domain-specific hearing-in-noise performance is associated with absolute pitch proficiency. Sci Rep 2022; 12:16344. PMID: 36175508; PMCID: PMC9521875; DOI: 10.1038/s41598-022-20869-2.
Abstract
Recent evidence suggests that musicians may have an advantage over non-musicians in perceiving speech against noisy backgrounds. Previously, musicians have been compared as a homogeneous group, despite demonstrated heterogeneity, which may contribute to discrepancies between studies. Here, we investigated whether "quasi"-absolute pitch (AP) proficiency, viewed as a general trait that varies across a spectrum, accounts for the musician advantage in hearing-in-noise (HIN) performance, irrespective of whether the streams are speech or musical sounds. A cohort of 12 non-musicians and 42 trained musicians, stratified into high, medium, or low AP proficiency, identified speech or melody targets masked in noise (speech-shaped, multi-talker, and multi-music) under four signal-to-noise ratios (0, −3, −6, and −9 dB). Cognitive abilities associated with HIN benefits, including auditory working memory and use of visuo-spatial cues, were assessed. AP proficiency was verified against pitch adjustment and relative pitch tasks. We found a domain-specific effect on HIN perception: quasi-AP abilities were related to improved perception of melody but not speech targets in noise. The quasi-AP advantage extended to tonal working memory and the use of spatial cues, but only during melodic stream segregation. Overall, the results do not support the putative musician advantage in speech-in-noise perception, but suggest a quasi-AP advantage in perceiving music in noisy environments.
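Fixed signal-to-noise ratios like those listed are typically set by rescaling the masker against the target's RMS level before mixing. A generic sketch, not taken from the paper:

```python
# Mix a target and a masker at a requested SNR by rescaling the masker's RMS.
import numpy as np

def rms(x):
    return np.sqrt(np.mean(x**2))

def mix_at_snr(target, masker, snr_db):
    """Scale `masker` so the target-to-masker level ratio equals snr_db."""
    masker = masker * rms(target) / (rms(masker) * 10 ** (snr_db / 20))
    return target + masker

rng = np.random.default_rng(5)
t = np.arange(16_000) / 16_000
target = np.sin(2 * np.pi * 220 * t)       # stand-in for a speech/melody target
noise = rng.standard_normal(t.size)
for snr in (0, -3, -6, -9):
    mixed = mix_at_snr(target, noise, snr)
    # Verify the achieved SNR from the residual masker:
    print(snr, round(20 * np.log10(rms(target) / rms(mixed - target)), 1))
```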
10. Brown JA, Bidelman GM. Familiarity of Background Music Modulates the Cortical Tracking of Target Speech at the "Cocktail Party". Brain Sci 2022; 12:1320. PMID: 36291252; PMCID: PMC9599198; DOI: 10.3390/brainsci12101320.
Abstract
The "cocktail party" problem (how a listener perceives speech in noisy environments) is typically studied using speech (multi-talker babble) or noise maskers. However, realistic cocktail party scenarios often include background music (e.g., coffee shops, concerts). Studies investigating music's effects on concurrent speech perception have predominantly used highly controlled synthetic music or shaped noise, which do not reflect naturalistic listening environments. Behaviorally, familiar background music and songs with vocals/lyrics inhibit concurrent speech recognition. Here, we investigated the neural bases of these effects. While recording multichannel EEG, participants listened to an audiobook while popular songs (or silence) played in the background at a 0 dB signal-to-noise ratio. Songs were either familiar or unfamiliar to listeners and featured either vocals or isolated instrumentals from the original audio recordings. Comprehension questions probed task engagement. We used temporal response functions (TRFs) to isolate cortical tracking of the target speech envelope and analyzed neural responses around 100 ms (i.e., the auditory N1 wave). We found that speech comprehension was, expectedly, impaired during background music compared to silence. Target speech tracking was further hindered by the presence of vocals. When masked by familiar music, response latencies to speech were less susceptible to informational masking, suggesting that concurrent neural tracking of speech was easier during music known to the listener. These differential effects of music familiarity were further exacerbated in listeners with less musical ability. Our neuroimaging results, and their dependence on listening skills, are consistent with early attentional-gain mechanisms whereby familiar music is easier to tune out (listeners already know the song's expectancies), allowing listeners to allocate fewer attentional resources to the background music and better monitor concurrent speech material.
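TRFs of this kind are usually estimated by ridge regression from time-lagged copies of the stimulus envelope to each EEG channel. The sketch below is a bare-bones single-channel version run on surrogate data; dedicated packages (e.g., the mTRF-Toolbox) implement the full multichannel method, so treat this as an assumed minimal illustration.

```python
# Bare-bones forward TRF: ridge regression from lagged envelope to one channel.
import numpy as np

def fit_trf(envelope, eeg, fs, tmin=-0.1, tmax=0.4, ridge=1.0):
    """Returns (lag times in s, TRF weights) over lags tmin..tmax."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    n = len(envelope)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):            # build the lagged design matrix
        if lag >= 0:
            X[lag:, j] = envelope[:n - lag]
        else:
            X[:lag, j] = envelope[-lag:]
    w = np.linalg.solve(X.T @ X + ridge * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w

# Surrogate data: "EEG" is the envelope delayed by ~100 ms plus noise.
fs = 64
rng = np.random.default_rng(3)
env = rng.standard_normal(fs * 60)
eeg = np.roll(env, int(0.1 * fs)) + rng.standard_normal(env.size)
lag_times, weights = fit_trf(env, eeg, fs)
print(lag_times[np.argmax(weights)])          # peaks near 0.1 s
```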
Affiliation(s)
- Jane A. Brown
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN 38152, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN 38152, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN 47408, USA
- Program in Neuroscience, Indiana University, Bloomington, IN 47405, USA
11. Lutfi RA, Pastore T, Rodriguez B, Yost WA, Lee J. Molecular analysis of individual differences in talker search at the cocktail-party. J Acoust Soc Am 2022; 152:1804. PMID: 36182280; PMCID: PMC9507302; DOI: 10.1121/10.0014116.
Abstract
A molecular (trial-by-trial) analysis of data from a cocktail-party, target-talker search task was used to test two general classes of explanations for individual differences in listener performance: cue-weighting models, for which errors are tied to the speech features talkers have in common with the target, and internal-noise models, for which errors are largely independent of these features. The speech of eight different talkers was played simultaneously over eight different loudspeakers surrounding the listener. The locations of the eight talkers varied at random from trial to trial. The listener's task was to identify the location of a target talker with which they had previously been familiarized. An analysis of the response counts to individual talkers showed predominant confusion with one talker sharing the same fundamental frequency and timbre as the target and, secondarily, with other talkers sharing the same timbre. These confusions occurred on a roughly constant 31% of all trials for all listeners. The remaining errors were uniformly distributed across the remaining talkers and were responsible for the large individual differences in performance observed. The results are consistent with a model in which largely stimulus-independent factors (internal noise) are responsible for the wide variation in performance across listeners.
Affiliation(s)
- Robert A Lutfi
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Torben Pastore
- Spatial Hearing Laboratory, Department of Speech and Hearing, Arizona State University, Tempe, Arizona 85281, USA
- Briana Rodriguez
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- William A Yost
- Spatial Hearing Laboratory, Department of Speech and Hearing, Arizona State University, Tempe, Arizona 85281, USA
- Jungmee Lee
- Auditory Behavioral Research Laboratory, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
12. Bidelman GM, Chow R, Noly-Gandon A, Ryan JD, Bell KL, Rizzi R, Alain C. Transcranial Direct Current Stimulation Combined With Listening to Preferred Music Alters Cortical Speech Processing in Older Adults. Front Neurosci 2022; 16:884130. PMID: 35873829; PMCID: PMC9298650; DOI: 10.3389/fnins.2022.884130.
Abstract
Emerging evidence suggests transcranial direct current stimulation (tDCS) can improve cognitive performance in older adults. Similarly, music listening may improve arousal and stimulate subsequent performance on memory-related tasks. We examined the synergistic effects of tDCS paired with music listening on auditory neurobehavioral measures to investigate causal evidence of short-term plasticity in speech processing among older adults. In a randomized sham-controlled crossover study, we measured how anodal tDCS over dorsolateral prefrontal cortex (DLPFC) paired with listening to autobiographically salient music alters neural speech processing in older adults compared to either music listening (sham stimulation) or tDCS alone. EEG assays included both frequency-following responses (FFRs) and auditory event-related potentials (ERPs) to trace neuromodulation-related changes at brainstem and cortical levels. Relative to music without tDCS (sham), we found that tDCS alone (without music) modulates the early cortical neural encoding of speech in the time frame of ~100-150 ms. Whereas tDCS by itself appeared to largely produce suppressive effects (i.e., reducing ERP amplitude), concurrent music with tDCS restored responses to music+sham levels. However, the interpretation of this effect is somewhat ambiguous, as this neural modulation could be attributable to a true effect of tDCS or to the presence/absence of music. Still, the combined benefit of tDCS+music (above tDCS alone) was correlated with listeners' education level, suggesting the benefit of neurostimulation paired with music might depend on listener demographics. tDCS changes in speech-FFRs were not observed with DLPFC stimulation. Improvements in working memory from pre- to post-session were also associated with better speech-in-noise listening skills. Our findings provide new causal evidence that combined tDCS+music, relative to tDCS alone, (i) modulates the early (100-150 ms) cortical encoding of speech and (ii) improves working memory, a cognitive skill which may indirectly bolster noise-degraded speech perception in older listeners.
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University Bloomington, Bloomington, IN, United States
- School of Communication Sciences and Disorders, The University of Memphis, Memphis, TN, United States
- Ricky Chow
- Rotman Research Institute, Baycrest Centre, Toronto, ON, Canada
- Jennifer D. Ryan
- Rotman Research Institute, Baycrest Centre, Toronto, ON, Canada
- Department of Psychology, University of Toronto, Toronto, ON, Canada
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Karen L. Bell
- Department of Audiology, San José State University, San Jose, CA, United States
- Rose Rizzi
- Department of Speech, Language and Hearing Sciences, Indiana University Bloomington, Bloomington, IN, United States
- School of Communication Sciences and Disorders, The University of Memphis, Memphis, TN, United States
- Claude Alain
- Rotman Research Institute, Baycrest Centre, Toronto, ON, Canada
- Department of Psychology, University of Toronto, Toronto, ON, Canada
- Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Music and Health Science Research Collaboratory, University of Toronto, Toronto, ON, Canada
13. Hennessy S, Mack WJ, Habibi A. Speech-in-noise perception in musicians and non-musicians: A multi-level meta-analysis. Hear Res 2022; 416:108442. PMID: 35078132; DOI: 10.1016/j.heares.2022.108442.
Abstract
Speech-in-noise perception, the ability to hear a relevant voice within a noisy background, is important for successful communication. Musicians have been reported to perform better than non-musicians on speech-in-noise tasks. This meta-analysis uses a multi-level design to assess the claim that musicians have superior speech-in-noise abilities compared to non-musicians. Across 31 studies and 62 effect sizes, the overall effect of musician status on speech-in-noise ability is significant, with a moderate effect size (g = 0.58), 95% CI [0.42, 0.74]. The overall effect of musician status was not moderated by within-study IQ equivalence, target stimulus, target contextual information, type of background noise, or age. We conclude that musicians show superior speech-in-noise abilities compared to non-musicians, not modified by age, IQ, or speech task parameters. These effects may reflect changes due to music training or predisposed auditory advantages that encourage musicianship.
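The effect-size arithmetic underlying such a meta-analysis is compact enough to sketch: a bias-corrected Hedges' g per study, then an inverse-variance pooled mean. The numbers below are hypothetical, and the plain pooling shown is a simplification of the paper's multi-level model.

```python
# Hedges' g and a simple inverse-variance pooled effect (hypothetical data).
import numpy as np

def hedges_g(m1, m2, sd1, sd2, n1, n2):
    """Bias-corrected standardized mean difference (group 1 minus group 2)."""
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    J = 1 - 3 / (4 * (n1 + n2) - 9)            # small-sample correction
    return J * (m1 - m2) / sp

def pooled_effect(gs, ns1, ns2):
    """Inverse-variance weighted mean of per-study g values."""
    gs, ns1, ns2 = map(np.asarray, (gs, ns1, ns2))
    var = (ns1 + ns2) / (ns1 * ns2) + gs**2 / (2 * (ns1 + ns2))
    w = 1 / var
    return np.sum(w * gs) / np.sum(w)

# Three hypothetical musician-vs-control studies:
gs = [hedges_g(2.0, 1.5, 1.0, 1.0, 25, 25),
      hedges_g(0.8, 0.6, 0.5, 0.6, 40, 38),
      hedges_g(60, 55, 12, 11, 30, 32)]
print([round(g, 2) for g in gs],
      round(pooled_effect(gs, [25, 40, 30], [25, 38, 32]), 2))
```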
Affiliation(s)
- Sarah Hennessy
- Brain and Creativity Institute, University of Southern California, Los Angeles, CA, United States
- Wendy J Mack
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA, United States
- Assal Habibi
- Brain and Creativity Institute, University of Southern California, Los Angeles, CA, United States
14. Shukla B, Bidelman GM. Enhanced brainstem phase-locking in low-level noise reveals stochastic resonance in the frequency-following response (FFR). Brain Res 2021; 1771:147643. PMID: 34473999; PMCID: PMC8490316; DOI: 10.1016/j.brainres.2021.147643.
Abstract
In nonlinear systems, the inclusion of low-level noise can paradoxically improve signal detection, a phenomenon known as stochastic resonance (SR). SR has been observed in human hearing, whereby sensory thresholds (e.g., signal detection and discrimination) are enhanced in the presence of noise. Here, we asked whether subcortical auditory processing (neural phase-locking) shows evidence of SR. We recorded brainstem frequency-following responses (FFRs) in young, normal-hearing listeners to near-electrophysiological-threshold (40 dB SPL) complex tones, composed of 10 iso-amplitude harmonics of a 150 Hz fundamental frequency (F0), presented concurrently with low-level noise (+20 to -20 dB SNRs). Though variable and weak across ears, some listeners showed improvement in auditory detection thresholds with subthreshold noise, confirming SR psychophysically. At the neural level, low-level FFRs were initially eradicated by noise (the expected masking effect) but were surprisingly reinvigorated at select masker levels (local maximum near ~35 dB SPL). These data suggest brainstem phase-locking to near-threshold periodic stimuli is enhanced at optimal levels of noise, the hallmark of SR. Our findings provide novel evidence for stochastic resonance in the human auditory brainstem and suggest that, under some circumstances, noise can actually benefit both the behavioral and neural encoding of complex sounds.
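Stochastic resonance itself is easy to demonstrate with a toy threshold detector: a subthreshold tone produces informative threshold crossings only once some noise is added, and the benefit collapses again when noise dominates. The sketch below is purely illustrative and unrelated to the paper's stimuli or recordings.

```python
# Toy stochastic resonance demo: noise helps a subthreshold 150 Hz tone
# drive a hard-threshold "detector", up to a point.
import numpy as np

rng = np.random.default_rng(4)
fs = 2000
t = np.arange(0, 1.0, 1 / fs)
tone = 0.8 * np.sin(2 * np.pi * 150 * t)      # peak below the 1.0 threshold

def output_correlation(noise_sd, threshold=1.0):
    """Correlate threshold crossings with the tone, a crude stand-in for
    phase-locking strength."""
    x = tone + noise_sd * rng.standard_normal(t.size)
    spikes = (x > threshold).astype(float)
    if spikes.std() == 0:
        return 0.0                             # no crossings at all
    return np.corrcoef(spikes, tone)[0, 1]

for sd in (0.0, 0.2, 0.5, 1.0, 3.0):
    print(f"noise sd {sd:3.1f}: correlation = {output_correlation(sd):.3f}")
# Zero noise -> nothing crosses; moderate noise -> crossings cluster on tone
# peaks; heavy noise -> correlation falls: the non-monotonic SR signature.
```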
Affiliation(s)
- Bhanu Shukla
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
15. Lutfi RA, Rodriguez B, Lee J. The Listener Effect in Multitalker Speech Segregation and Talker Identification. Trends Hear 2021; 25:23312165211051886. PMID: 34693853; PMCID: PMC8544763; DOI: 10.1177/23312165211051886.
Abstract
Over six decades ago, Cherry (1953) drew attention to what he called the “cocktail-party problem”: the challenge of segregating the speech of one talker from others speaking at the same time. The problem has been actively researched ever since, but for all this time one observation has eluded explanation: the wide variation in performance of individual listeners. That variation was replicated here for four major experimental factors known to impact performance: differences in task (talker segregation vs. identification), differences in the voice features of talkers (pitch vs. location), differences in the voice similarity and uncertainty of talkers (informational masking), and the presence or absence of linguistic cues. The effect of these factors on the segregation of naturally spoken sentences and synthesized vowels was largely eliminated in psychometric functions relating the performance of individual listeners to that of an ideal observer, d′ideal. The effect of listeners remained as differences in the slopes of the functions (a fixed effect), with little within-listener variability in the estimates of slope (random effect). The results make a case for considering the listener a factor in multitalker segregation and identification equal in status to any major experimental variable.
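Comparisons of listeners against an ideal observer rest on standard signal-detection arithmetic. For reference, a small sketch computing d′ from hit and false-alarm proportions (a generic textbook formula, not the paper's code):

```python
# d' from hit and false-alarm rates, with a correction for extreme proportions.
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float, n: int = 100) -> float:
    """z(hits) - z(false alarms); proportions of exactly 0 or 1 are clamped
    to 1/(2n) and 1 - 1/(2n) so the z-transform stays finite."""
    clamp = lambda p: min(max(p, 1 / (2 * n)), 1 - 1 / (2 * n))
    z = NormalDist().inv_cdf
    return z(clamp(hit_rate)) - z(clamp(fa_rate))

print(d_prime(0.85, 0.20))   # ~1.88 for a typical listener
print(d_prime(0.99, 0.01))   # ~4.65 for a near-ideal observer
```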
Affiliation(s)
- Robert A. Lutfi
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Briana Rodriguez
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
- Jungmee Lee
- Auditory Behavioral Research Lab, Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida
|
16
|
Yang X, Wang Y, Zhang R, Zhang Y. Physical and Psychoacoustic Characteristics of Typical Noise on Construction Site: "How Does Noise Impact Construction Workers' Experience?". Front Psychol 2021; 12:707868. [PMID: 34393945 PMCID: PMC8356746 DOI: 10.3389/fpsyg.2021.707868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Accepted: 07/02/2021] [Indexed: 12/04/2022] Open
Abstract
Construction noise is an integral part of urban social noise, and construction workers are directly and significantly affected by it. The noise conditions on construction sites, the acoustic environment experienced by construction workers, and the impact of noise on them are therefore highly worthy of attention. This research conducted a 7-month noise level (LAeq) measurement on the construction site of a reinforced-concrete high-rise residential building in northern China. Noise conditions within the site were analyzed across spatial areas and temporal stages. Binaural recordings of 10 typical construction noises, including earthwork machinery, concrete machinery, and hand-held machinery, were made, and their physical-acoustic and psychoacoustic characteristics were analyzed with the aid of sound quality analysis software. A total of 133 construction workers performing 12 types of tasks were asked for their subjective evaluations of the typical noises and surveyed about their noise experience on the construction site. This was done to explore the acoustic environment on the construction site, the environmental experience of construction workers, the impact of noise on hearing and on-site communication, and the corresponding influencing factors. The research showed that the noise situation on construction sites is concerning, and that construction workers are affected to varying degrees in their psychological experience, hearing ability, and on-site communication. Partial correlation analysis showed that workers' perception of noise, their hearing, and their on-site communication were affected by the noise environment, and correlated to varying degrees with post-specific noise exposure, demand for on-site communication, and age, respectively. Correlation and cluster analyses both showed that the annoyance caused by typical construction noise was related to its physical and psychoacoustic characteristics. To maintain the physical and mental health of construction workers, improvements are needed in site management, noise reduction, equipment and facility optimization, and occupational protection.
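The LAeq metric reported here is the equivalent continuous A-weighted sound level over the measurement period. Its core computation can be sketched as follows, with the A-weighting filter omitted (the input is assumed to be an already A-weighted pressure signal in pascals):

```python
# Equivalent continuous level: 10*log10 of mean squared pressure re 20 uPa.
import numpy as np

P_REF = 20e-6  # reference pressure, 20 micropascals

def laeq(p: np.ndarray) -> float:
    """p: (A-weighted) pressure samples in Pa over the measurement period."""
    return 10 * np.log10(np.mean(p**2) / P_REF**2)

# Sanity check: a 1 kHz tone with 1 Pa RMS corresponds to ~94 dB.
fs = 48_000
t = np.arange(fs) / fs
p = np.sqrt(2) * np.sin(2 * np.pi * 1000 * t)   # RMS = 1 Pa
print(round(laeq(p), 1))                         # ~94.0
```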
Affiliation(s)
- Xinhao Yang
- Eco-Building Physics Technology and Evaluation Provincial Key Lab, School of Architecture and Urban Planning, Shenyang Jianzhu University, Shenyang, China
- Yitong Wang
- Railway No.9 Bureau Group 4th Engineering Co., Ltd., Shenyang, China
- Ruining Zhang
- Eco-Building Physics Technology and Evaluation Provincial Key Lab, School of Architecture and Urban Planning, Shenyang Jianzhu University, Shenyang, China
- Yuan Zhang
- Eco-Building Physics Technology and Evaluation Provincial Key Lab, School of Architecture and Urban Planning, Shenyang Jianzhu University, Shenyang, China
|
17
|
Carter JA, Bidelman GM. Auditory cortex is susceptible to lexical influence as revealed by informational vs. energetic masking of speech categorization. Brain Res 2021; 1759:147385. [PMID: 33631210 DOI: 10.1016/j.brainres.2021.147385] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 02/02/2023]
Abstract
Speech perception requires the grouping of acoustic information into meaningful phonetic units via the process of categorical perception (CP). Environmental masking influences speech perception and CP. However, it remains unclear at which stage of processing (encoding, decision, or both) masking affects listeners' categorization of speech signals. The purpose of this study was to determine whether linguistic interference influences the early acoustic-phonetic conversion process inherent to CP. To this end, we measured source-level event-related brain potentials (ERPs) from auditory cortex (AC) and inferior frontal gyrus (IFG) as listeners rapidly categorized speech sounds along a /da/ to /ga/ continuum presented in three listening conditions: quiet, and in the presence of forward (informational masker) and time-reversed (energetic masker) 2-talker babble noise. Maskers were matched in overall SNR and spectral content and thus varied only in their degree of linguistic interference (i.e., informational masking). We hypothesized a differential effect of informational versus energetic masking on behavioral and neural categorization responses, predicting increased activation of frontal regions when disambiguating speech from noise, especially under lexical-informational maskers. We found that (1) informational masking weakens behavioral speech phoneme identification above and beyond energetic masking; (2) low-level AC activity not only codes speech categories but is susceptible to higher-order lexical interference; and (3) identifying speech amidst noise recruits a cross-hemispheric circuit (left AC → right IFG) whose engagement varies with task difficulty. These findings provide corroborating evidence for top-down influences on the early acoustic-phonetic analysis of speech through a coordinated interplay between frontotemporal brain areas.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
18. Kaplan EC, Wagner AE, Toffanin P, Başkent D. Do Musicians and Non-musicians Differ in Speech-on-Speech Processing? Front Psychol 2021; 12:623787. PMID: 33679539; PMCID: PMC7931613; DOI: 10.3389/fpsyg.2021.623787.
Abstract
Earlier studies have shown that musically trained individuals may have a benefit in adverse listening situations when compared to non-musicians, especially in speech-on-speech perception. However, the literature provides mostly conflicting results. In the current study, by employing different measures of spoken language processing, we aimed to test whether we could capture potential differences between musicians and non-musicians in speech-on-speech processing. We used an offline measure of speech perception (a sentence recall task), which yields a post-task response, and online measures of real-time spoken language processing: gaze tracking and pupillometry. We used stimuli of comparable complexity across both paradigms and tested the same groups of participants. In the sentence recall task, musicians recalled more words correctly than non-musicians. In the eye-tracking experiment, both groups showed reduced fixations to the target and competitor words' images as the level of the speech maskers increased. The time course of gaze fixations to the competitor did not differ between groups in the speech-in-quiet condition, while the time-course dynamics did differ between groups once the two-talker masker was added to the target signal. As the level of the two-talker masker increased, musicians showed reduced lexical competition, as indicated by gaze fixations to the competitor. The pupil dilation data showed differences mainly at one target-to-masker ratio, which does not allow us to draw conclusions regarding potential differences in the use of cognitive resources between groups. Overall, the eye-tracking measure enabled us to observe that musicians may use a different strategy than non-musicians to attain spoken word recognition as the noise level increases. However, further investigation, with more fine-grained alignment between the processes captured by online and offline measures, is necessary to establish whether musicians differ due to better cognitive control or better sound processing.
Affiliation(s)
- Elif Canseza Kaplan
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
- Anita E Wagner
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Paolo Toffanin
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands