1
Maguinness C, Schall S, Mathias B, Schoemann M, von Kriegstein K. Prior multisensory learning can facilitate auditory-only voice-identity and speech recognition in noise. Q J Exp Psychol (Hove) 2024:17470218241278649. [PMID: 39164830] [DOI: 10.1177/17470218241278649]
Abstract
Seeing the visual articulatory movements of a speaker while hearing their voice helps with understanding what is said. This multisensory enhancement is particularly evident in noisy listening conditions. Multisensory enhancement also extends to auditory-only conditions: auditory-only speech and voice-identity recognition are superior for speakers previously learned with their face, compared to control learning, an effect termed the "face-benefit." Whether the face-benefit can assist in maintaining robust perception in increasingly noisy listening conditions, similar to concurrent multisensory input, is unknown. Here, in two behavioural experiments, we examined this hypothesis. In each experiment, participants learned a series of speakers' voices together with their dynamic face or a control image. Following learning, participants listened to auditory-only sentences spoken by the same speakers and recognised the content of the sentences (speech recognition, Experiment 1) or the voice-identity of the speaker (Experiment 2) at increasing levels of auditory noise. For speech recognition, 14 of 30 participants (47%) showed a face-benefit; for voice-identity recognition, 19 of 25 participants (76%) did. For those participants who demonstrated a face-benefit, the benefit increased with the level of auditory noise. Taken together, the results support an audio-visual model of auditory communication and suggest that the brain can develop a flexible system in which learned facial characteristics are used to deal with varying auditory uncertainty.
Affiliation(s)
- Corrina Maguinness
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Sonja Schall
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Brian Mathias
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- School of Psychology, University of Aberdeen, Aberdeen, United Kingdom
- Martin Schoemann
- Chair of Psychological Methods and Cognitive Modelling, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
2
Flaherty MM. The role of long-term target and masker talker familiarity in children's speech-in-speech recognition. Front Psychol 2024; 15:1369195. [PMID: 38784624] [PMCID: PMC11112701] [DOI: 10.3389/fpsyg.2024.1369195]
Abstract
Objectives: This study investigated the influence of long-term talker familiarity on speech-in-speech recognition in school-age children, with a specific emphasis on the role of familiarity with the mother's voice as either the target or masker speech. Design: Open-set sentence recognition was measured adaptively in a two-talker masker. Target and masker sentences were recorded by the adult mothers of the child participants. Each child heard sentences spoken by three adult female voices during testing: their own mother's voice (familiar voice) and two unfamiliar adult female voices. Study sample: Twenty-four school-age children (8-13 years) with normal hearing. Results: When the target speech was spoken by a familiar talker (the mother), speech recognition was significantly better compared to when the target was unfamiliar. When the masker was spoken by the familiar talker, there was no difference in performance relative to the unfamiliar masker condition. Across all conditions, younger children required a more favorable signal-to-noise ratio than older children. Conclusion: Implicit long-term familiarity with a talker consistently improves children's speech-in-speech recognition across the age range tested, specifically when the target talker is familiar. However, performance remains unaffected by masker talker familiarity. Additionally, while target familiarity is advantageous, it does not entirely eliminate children's increased susceptibility to competing speech.
Affiliation(s)
- Mary M. Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, IL, United States
3
Har-Shai Yahav P, Sharaabi A, Zion Golumbic E. The effect of voice familiarity on attention to speech in a cocktail party scenario. Cereb Cortex 2024; 34:bhad475. [PMID: 38142293] [DOI: 10.1093/cercor/bhad475]
Abstract
Selective attention to one speaker in multi-talker environments can be affected by the acoustic and semantic properties of speech. One highly ecological feature of speech that has the potential to assist in selective attention is voice familiarity. Here, we tested how voice familiarity interacts with selective attention by measuring the neural speech-tracking response to both target and non-target speech in a dichotic listening "Cocktail Party" paradigm. We recorded magnetoencephalography from n = 33 participants, who were presented with concurrent narratives in two different voices and instructed to pay attention to one ear ("target") and ignore the other ("non-target"). Participants were familiarized with one of the voices during the week prior to the experiment, rendering this voice familiar to them. Using multivariate speech-tracking analysis, we estimated the neural responses to both stimuli and replicated their well-established modulation by selective attention. Importantly, speech-tracking was also affected by voice familiarity, showing an enhanced response for target speech and a reduced response for non-target speech in the contra-lateral hemisphere, when these were in a familiar vs. an unfamiliar voice. These findings offer valuable insight into how voice familiarity, and by extension auditory semantics, interact with goal-driven attention and facilitate perceptual organization and speech processing in noisy environments.
Affiliation(s)
- Paz Har-Shai Yahav
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Aviya Sharaabi
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Elana Zion Golumbic
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
4
Zaltz Y. The Impact of Trained Conditions on the Generalization of Learning Gains Following Voice Discrimination Training. Trends Hear 2024; 28:23312165241275895. [PMID: 39212078] [PMCID: PMC11367600] [DOI: 10.1177/23312165241275895]
Abstract
Auditory training can lead to notable enhancements in specific tasks, but whether these improvements generalize to untrained tasks like speech-in-noise (SIN) recognition remains uncertain. This study examined how training conditions affect generalization. Fifty-five young adults were divided into "Trained-in-Quiet" (n = 15), "Trained-in-Noise" (n = 20), and "Control" (n = 20) groups. Participants completed two sessions. The first session involved an assessment of SIN recognition and voice discrimination (VD) with word or sentence stimuli, employing combined fundamental frequency (F0) and formant-frequency voice cues. Subsequently, only the trained groups proceeded to an interleaved training phase, encompassing six VD blocks with sentence stimuli, utilizing either F0-only or formant-only cues. The second session replicated the interleaved training for the trained groups, followed by a second assessment conducted by all three groups, identical to the first session. Results showed significant improvements in the trained task regardless of training conditions. However, VD training with a single cue did not enhance VD with both cues beyond the control group's improvements, suggesting limited generalization. Notably, the Trained-in-Noise group exhibited the most significant SIN recognition improvements posttraining, implying generalization across tasks that share similar acoustic conditions. Overall, the findings suggest that training conditions affect generalization by influencing the processing levels associated with the trained task. Training in noisy conditions may prompt higher auditory and/or cognitive processing than training in quiet, potentially extending skills to tasks involving challenging listening conditions, such as SIN recognition. These insights hold significant theoretical and clinical implications, potentially advancing the development of effective auditory training protocols.
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
5
Babaoğlu G, Rachman L, Ertürk P, Özkişi Yazgan B, Sennaroğlu G, Gaudrain E, Başkent D. Perception of voice cues in school-age children with hearing aids. J Acoust Soc Am 2024; 155:722-741. [PMID: 38284822] [DOI: 10.1121/10.0024356]
Abstract
The just-noticeable differences (JNDs) of the voice cues of voice pitch (F0) and vocal-tract length (VTL) were measured in school-aged children with bilateral hearing aids and children and adults with normal hearing. The JNDs were larger for hearing-aided than normal-hearing children up to the age of 12 for F0 and into adulthood for all ages for VTL. Age was a significant factor for both groups for F0 JNDs, but only for the hearing-aided group for VTL JNDs. Age of maturation was later for F0 than VTL. Individual JNDs of the two groups largely overlapped for F0, but little for VTL. Hearing thresholds (unaided or aided, 500-4000 Hz, overlapping with mid-range speech frequencies) did not correlate with the JNDs. However, extended low-frequency hearing thresholds (unaided, 125-250 Hz, overlapping with voice F0 ranges) correlated with the F0 JNDs. Hence, age and hearing status differentially interact with F0 and VTL perception, and VTL perception seems challenging for hearing-aided children. On the other hand, even children with profound hearing loss could do the task, indicating a hearing aid benefit for voice perception. Given the significant age effect and that for F0 the hearing-aided children seem to be catching up with age-typical development, voice cue perception may continue developing in hearing-aided children.
Affiliation(s)
- Gizem Babaoğlu
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Laura Rachman
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Pınar Ertürk
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Başak Özkişi Yazgan
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Gonca Sennaroğlu
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Etienne Gaudrain
- Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
6
Flaherty MM, Price R, Murgia S, Manukian E. Can Playing a Game Improve Children's Speech Recognition? A Preliminary Study of Implicit Talker Familiarity Effects. Am J Audiol 2023:1-16. [PMID: 38056473] [DOI: 10.1044/2023_aja-23-00156]
Abstract
PURPOSE: The goal was to evaluate whether implicit talker familiarization via an interactive computer game, designed for this study, could improve children's word recognition in classroom noise. It was hypothesized that, regardless of age, children would perform better when recognizing words spoken by the talker who was heard during the game they played. METHOD: Using a one-group pretest-posttest experimental design, this study examined the impact of short-term implicit voice exposure on children's word recognition in classroom noise. Implicit voice familiarization occurred via an interactive computer game, played at home for 10 min a day for 5 days. In the game, children (8-12 years) heard one voice, intended to become the "familiar talker." Pre- and postfamiliarization, children identified words in prerecorded classroom noise. Four conditions were tested to evaluate talker familiarity and generalization effects. RESULTS: Results demonstrated an 11% improvement when recognizing words spoken by the voice heard in the game ("familiar talker"). This was observed only for words that were heard in the game and did not generalize to unfamiliarized words. Before familiarization, younger children had poorer recognition than older children in all conditions; however, after familiarization, there was no effect of age on performance for familiarized stimuli. CONCLUSIONS: Implicit short-term exposure to a talker has the potential to improve children's speech recognition. Therefore, leveraging talker familiarity through gameplay shows promise as a viable method for improving children's speech-in-noise recognition. However, given that improvements did not generalize to unfamiliarized words, careful consideration of exposure stimuli is necessary to optimize this approach.
Affiliation(s)
- Mary M Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
- Rachael Price
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
- Department of Audiology, Children's Hospital of Philadelphia, PA
- Silvia Murgia
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
- Emma Manukian
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
7
Holmes E, Johnsrude IS. Intelligibility benefit for familiar voices is not accompanied by better discrimination of fundamental frequency or vocal tract length. Hear Res 2023; 429:108704. [PMID: 36701896] [DOI: 10.1016/j.heares.2023.108704]
Abstract
Speech is more intelligible when it is spoken by familiar than unfamiliar people. If this benefit arises because key voice characteristics like perceptual correlates of fundamental frequency or vocal tract length (VTL) are more accurately represented for familiar voices, listeners may be able to discriminate smaller manipulations to such characteristics for familiar than unfamiliar voices. We measured participants' (N = 17) thresholds for discriminating pitch (correlate of fundamental frequency, or glottal pulse rate) and formant spacing (correlate of VTL; 'VTL-timbre') for voices that were familiar (participants' friends) and unfamiliar (other participants' friends). As expected, familiar voices were more intelligible. However, discrimination thresholds were no smaller for the same familiar voices. The size of the intelligibility benefit for a familiar over an unfamiliar voice did not relate to the difference in discrimination thresholds for the same voices. Also, the familiar-voice intelligibility benefit was just as large following perceptible manipulations to pitch and VTL-timbre. These results are more consistent with cognitive accounts of speech perception than traditional accounts that predict better discrimination.
Affiliation(s)
- Emma Holmes
- Department of Speech Hearing and Phonetic Sciences, UCL, London WC1N 1PF, UK; Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada.
- Ingrid S Johnsrude
- Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario N6G 1H1, Canada
8
Short Implicit Voice Training Affects Listening Effort During a Voice Cue Sensitivity Task With Vocoder-Degraded Speech. Ear Hear 2023:00003446-990000000-00113. [PMID: 36695603] [PMCID: PMC10262993] [DOI: 10.1097/aud.0000000000001335]
Abstract
OBJECTIVES: Understanding speech in real life can be challenging and effortful, such as in multiple-talker listening conditions. Fundamental frequency (fo) and vocal-tract length (vtl) voice cues can help listeners segregate between talkers, enhancing speech perception in adverse listening conditions. Previous research showed lower sensitivity to fo and vtl voice cues when the speech signal was degraded, such as in cochlear-implant hearing and vocoder listening compared with normal hearing, likely contributing to difficulties in understanding speech in adverse listening conditions. Nevertheless, when multiple talkers are present, familiarity with a talker's voice, via training or exposure, could provide a speech intelligibility benefit. In this study, the objective was to assess how implicit short-term voice training could affect perceptual discrimination of voice cues (fo+vtl), measured in sensitivity and listening effort, with or without vocoder degradations. DESIGN: Voice training was provided via listening to a recording of a book segment for approximately 30 min and answering text-related questions, to ensure engagement. Just-noticeable differences (JNDs) for fo+vtl were measured with an odd-one-out task implemented as a 3-alternative forced-choice adaptive paradigm, while simultaneously collecting pupil data. The reference voice either belonged to the trained voice or an untrained voice. Effects of voice training (trained and untrained voice), vocoding (non-vocoded and vocoded), and item variability (fixed or variable consonant-vowel triplets presented across three items) on voice cue sensitivity (fo+vtl JNDs) and listening effort (pupillometry measurements) were analyzed. RESULTS: Voice training did not have a significant effect on voice cue discrimination. As expected, fo+vtl JNDs were significantly larger for vocoded than for non-vocoded conditions, and with variable than with fixed item presentations. Generalized additive mixed-model analysis of pupil dilation over the time course of stimulus presentation showed that pupil dilation was significantly larger during fo+vtl discrimination while listening to untrained voices compared to trained voices, but only for vocoder-degraded speech. Peak pupil dilation was significantly larger for vocoded than for non-vocoded conditions, and variable items increased the pupil baseline relative to fixed items, which could suggest a higher anticipated task difficulty. CONCLUSIONS: Even though short voice training did not lead to improved sensitivity to small fo+vtl voice cue differences at the discrimination threshold level, voice training still resulted in reduced listening effort for discrimination among vocoded voice cues.
9
Njie S, Lavan N, McGettigan C. Talker and accent familiarity yield advantages for voice identity perception: A voice sorting study. Mem Cognit 2023; 51:175-187. [PMID: 35274221] [PMCID: PMC9943951] [DOI: 10.3758/s13421-022-01296-0]
Abstract
In the current study, we examine and compare the effects of talker and accent familiarity in the context of a voice identity sorting task, using naturally varying voice recording samples from the TV show Derry Girls. Voice samples were thus all spoken with a regional accent of UK/Irish English (from [London]derry). We tested four listener groups: Listeners were either familiar or unfamiliar with the TV show (and therefore the talker identities) and were either highly familiar or relatively less familiar with Northern Irish accents. Both talker and accent familiarity significantly improved the accuracy of voice identity sorting. However, the talker familiarity benefits were overall larger and more consistent. We discuss the results in light of a possible hierarchy of familiarity effects and argue that our findings may provide additional evidence for interactions of speech and identity processing pathways in voice identity perception. We also identify some key limitations in the current work and provide suggestions for future studies to address these.
Affiliation(s)
- Sheriff Njie
- Department of Speech, Hearing and Phonetic Sciences, University College London, Chandler House 2 Wakefield Street, London, WC1N 1PF, UK
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, Chandler House 2 Wakefield Street, London, WC1N 1PF, UK.
- Department of Psychology, School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London, E1 4NS, UK.
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, Chandler House 2 Wakefield Street, London, WC1N 1PF, UK.
10
Cheung S, Babel M. The own-voice benefit for word recognition in early bilinguals. Front Psychol 2022; 13:901326. [PMID: 36118470] [PMCID: PMC9478475] [DOI: 10.3389/fpsyg.2022.901326]
Abstract
The current study examines the self-voice benefit in an early bilingual population. Female Cantonese-English bilinguals produced words containing Cantonese contrasts. A subset of these minimal pairs was selected as stimuli for a perception task. Speakers' productions were grouped according to how acoustically contrastive their pronunciation of each minimal pair was, and these groupings were used to design personalized experiments for each participant, featuring their own voice and similarly contrastive tokens from other speakers. The perception task was a two-alternative forced-choice word identification paradigm in which participants heard isolated Cantonese words, which had undergone synthesis to mask the original talker identity. Listeners were more accurate in recognizing minimal pairs produced in their own (disguised) voice than in recognizing the realizations of speakers who maintained similar degrees of phonetic contrast for the same minimal pairs. Generally, individuals with larger phonetic contrasts were also more accurate in word identification for self and other voices. These results provide evidence for an own-voice benefit for early bilinguals and suggest that the phonetic distributions that undergird phonological contrasts are heavily shaped by one's own phonetic realizations.
Affiliation(s)
- Sarah Cheung
- Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
- Molly Babel
- Department of Linguistics, University of British Columbia, Vancouver, BC, Canada
11
Familiarity and task context shape the use of acoustic information in voice identity perception. Cognition 2021; 215:104780. [PMID: 34298232] [PMCID: PMC8381763] [DOI: 10.1016/j.cognition.2021.104780]
Abstract
Familiar and unfamiliar voice perception are often understood as being distinct from each other. For identity perception, theoretical work has proposed that listeners use acoustic information in different ways to perceive identity from familiar and unfamiliar voices: Unfamiliar voices are thought to be processed based on close comparisons of acoustic properties, while familiar voices are processed based on diagnostic acoustic features that activate a stored person-specific representation of that voice. To date, no empirical study has directly examined whether and how familiar and unfamiliar listeners differ in their use of acoustic information for identity perception. Here, we tested this theoretical claim by linking listeners' judgements in voice identity tasks to a complex acoustic representation: the spectral similarity of the heard voice recordings. Participants (N = 177) who were either familiar or unfamiliar with a set of voices completed an identity discrimination task (Experiment 1) or an identity sorting task (Experiment 2). In both experiments, identity judgements for familiar and unfamiliar voices were guided by spectral similarity: Pairs of recordings with greater acoustic similarity were more likely to be perceived as belonging to the same voice identity. However, while there were no differences in how familiar and unfamiliar listeners used acoustic information for identity discrimination, differences were apparent for identity sorting. Our study therefore challenges proposals that view familiar and unfamiliar voice perception as being at all times distinct. Instead, our data suggest a critical role of the listening situation in which familiar and unfamiliar voices are evaluated, characterising voice identity perception as a highly dynamic process in which listeners opportunistically make use of any kind of information they can access.
12
Holmes E, Johnsrude IS. Speech-evoked brain activity is more robust to competing speech when it is spoken by someone familiar. Neuroimage 2021; 237:118107. [PMID: 33933598] [DOI: 10.1016/j.neuroimage.2021.118107]
Abstract
When speech is masked by competing sound, people are better at understanding what is said if the talker is familiar rather than unfamiliar. The benefit is robust, but how does processing of familiar voices facilitate intelligibility? We combined high-resolution fMRI with representational similarity analysis to quantify the difference in distributed activity between clear and masked speech. We demonstrate that brain representations of spoken sentences are less affected by a competing sentence when they are spoken by a friend or partner than by someone unfamiliar, effectively showing a cortical signal-to-noise ratio (SNR) enhancement for familiar voices. This effect correlated with the familiar-voice intelligibility benefit. We functionally parcellated auditory cortex and found that the most prominent familiar-voice advantage was manifest along the posterior superior and middle temporal gyri. Overall, our results demonstrate that experience-driven improvements in intelligibility are associated with enhanced multivariate pattern activity in posterior temporal cortex.
Affiliation(s)
- Emma Holmes
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada.
- Ingrid S Johnsrude
- The Brain and Mind Institute, University of Western Ontario, London, Ontario, N6A 3K7, Canada; School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, N6G 1H1, Canada
13
Luthra S. The Role of the Right Hemisphere in Processing Phonetic Variability Between Talkers. Neurobiol Lang (Camb) 2021; 2:138-151. [PMID: 37213418] [PMCID: PMC10174361] [DOI: 10.1162/nol_a_00028]
Abstract
Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior/middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.
14
McKenzie C, Hodgetts WE, Ostevik AV, Cummine J. Listen before you drive: the effect of voice familiarity on listening comprehension and driving performance. Int J Audiol 2020; 60:621-628. [PMID: 33164608] [DOI: 10.1080/14992027.2020.1842522]
Abstract
OBJECTIVE: Voice familiarity has been reported to reduce cognitive load in complex listening environments. The extent to which the reduction in listening effort allows mental resources to be reallocated to other complex tasks needs further investigation. We sought to answer whether a familiar audiobook narrator provides benefits to (1) listening comprehension and/or (2) driving performance. DESIGN: A double-blind between-groups design was implemented. Participants were randomly assigned to the Familiar group or the Unfamiliar group. STUDY SAMPLE: Participants (n = 30) were normal-hearing adults aged 18-28 years (M = 23, SD = 2.6); 18 were female. Participants first listened to an audiobook read by either Voice 1 (Familiar condition) or Voice 2 (Unfamiliar condition). They then completed a virtual reality driving task while listening to a second audiobook, always read by Voice 1. Audiobook comprehension (30-question multiple-choice test) and driving performance (number of driving errors made) were recorded. RESULTS: Participants in the Familiar group made fewer driving errors than participants in the Unfamiliar group. There were no differences in listening comprehension. CONCLUSIONS: Increased voice familiarity positively impacts behaviour (i.e., reduced driving errors) in normal-hearing adults. We discuss our findings in the context of effortful listening frameworks.
Affiliation(s)
- Cory McKenzie
- Faculty of Science, University of Alberta, Edmonton, Canada
- William E Hodgetts
- Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, Canada; Institute for Reconstructive Sciences in Medicine, Covenant Health, Edmonton, Canada
- Amberley V Ostevik
- Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, Canada
- Jacqueline Cummine
- Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, Canada; Neuroscience and Mental Health Institute, University of Alberta, Edmonton, Canada
15
Jorgensen EJ, Stangl E, Chipara O, Hernandez H, Oleson J, Wu YH. GPS predicts stability of listening environment characteristics in one location over time among older hearing aid users. Int J Audiol 2020; 60:328-340. [PMID: 33074752] [DOI: 10.1080/14992027.2020.1831083]
Abstract
OBJECTIVE: Hearing aid technology can allow users to "geo-tag" hearing aid preferences using the Global Positioning System (GPS). This technology assumes that listening environment characteristics that affect hearing aid benefit change little in a location over time. The purpose of this study was to investigate whether certain characteristics (reverberation, signal type, listening activity, noise location, noisiness, talker familiarity, talker location, and visual cues) changed in a location over time. DESIGN: Participants completed GPS-tagged surveys on smartphones to report on characteristics of their listening environments. Coordinates were used to create indices that described how much listening environment characteristics changed in a location over time. Indices computed in one location were compared to indices computed across all locations for each participant. STUDY SAMPLE: 54 adults with hearing loss participated in this study (26 males and 38 females; 30 experienced hearing aid users and 24 new users). RESULTS: A location dependency was observed for all characteristics. Characteristics were significantly different from one another in their stability over time. CONCLUSIONS: Listening environment characteristics changed less over time in a given location than in participants' lives generally. The effectiveness of GPS-dependent hearing aid settings likely depends on the accuracy and location definition of the GPS feature.
Affiliation(s)
- Erik J Jorgensen
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
- Elizabeth Stangl
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
- Octav Chipara
- Department of Computer Science, University of Iowa, Iowa City, IA, USA
- Helin Hernandez
- Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- Jacob Oleson
- Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- Yu-Hsiang Wu
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
16
Rotman T, Lavie L, Banai K. Rapid Perceptual Learning: A Potential Source of Individual Differences in Speech Perception Under Adverse Conditions? Trends Hear 2020; 24:2331216520930541. [PMID: 32552477] [PMCID: PMC7303778] [DOI: 10.1177/2331216520930541]
Abstract
Challenging listening situations (e.g., when speech is rapid or noisy) result in substantial individual differences in speech perception. We propose that rapid auditory perceptual learning is one of the factors contributing to those individual differences. To explore this proposal, we assessed rapid perceptual learning of time-compressed speech in young adults with normal hearing and in older adults with age-related hearing loss. We also assessed the contribution of this learning as well as that of hearing and cognition (vocabulary, working memory, and selective attention) to the recognition of natural-fast speech (NFS; both groups) and speech in noise (younger adults). In young adults, rapid learning and vocabulary were significant predictors of NFS and speech-in-noise recognition. In older adults, hearing thresholds, vocabulary, and rapid learning were significant predictors of NFS recognition. In both groups, models that included learning fitted the speech data better than models that did not include learning. Therefore, under adverse conditions, rapid learning may be one of the skills listeners could employ to support speech recognition.
Affiliation(s)
- Tali Rotman
- Department of Communication Sciences and Disorders, University of Haifa
- Limor Lavie
- Department of Communication Sciences and Disorders, University of Haifa
- Karen Banai
- Department of Communication Sciences and Disorders, University of Haifa
17
Zorzin L, Carvalho GF, Kreitewolf J, Teggi R, Pinheiro CF, Moreira JR, Dach F, Bevilaqua-Grossi D. Subdiagnosis, but not presence of vestibular symptoms, predicts balance impairment in migraine patients - a cross sectional study. J Headache Pain 2020; 21:56. [PMID: 32448118] [PMCID: PMC7247141] [DOI: 10.1186/s10194-020-01128-z]
Abstract
Background: Vestibular symptoms and balance changes are common in patients with migraine, especially in those with aura and chronic migraine. However, it is not known whether the balance changes are determined by the presence of vestibular symptoms or by the migraine subdiagnosis. Therefore, the aim of this study was to verify whether the migraine subdiagnosis and/or the presence of vestibular symptoms can predict balance dysfunction in migraineurs. Methods: The study included 49 women diagnosed with migraine with aura, 53 without aura, 51 with chronic migraine, and 54 headache-free women. All participants answered a structured questionnaire regarding migraine features and the presence of vestibular symptoms, such as dizziness/vertigo. The participants performed the Modified Sensory Organization Test on an AMTI force plate. The data were analysed using a linear mixed-effects regression model. Results: The presence of vestibular symptoms did not predict postural sway, but the subdiagnosis was a significant predictor of postural sway. Migraine with aura patients exhibited more sway than migraine patients without aura when the surface was unstable. Additionally, we found high effect sizes (ES > 0.79) for postural sway differences between patients with chronic migraine or with aura compared to controls or migraine without aura, suggesting that these results are clinically relevant. Conclusions: The subdiagnosis of migraine, rather than the presence of vestibular symptoms, can predict the postural control impairments observed in migraineurs. This lends support to the notion that balance instability is related to the presence of aura and migraine chronicity, and that it should be considered even in patients without vestibular symptoms.
Affiliation(s)
- Letícia Zorzin
- Department of Health Sciences, Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900 - Vila Monte Alegre, Ribeirão Preto, SP, 14049-900, Brazil
- Gabriela F Carvalho
- Department of Health Sciences, Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900 - Vila Monte Alegre, Ribeirão Preto, SP, 14049-900, Brazil
- Jens Kreitewolf
- Department of Psychology, University of Lübeck, Lübeck, Germany
- Roberto Teggi
- Department of Ear, Nose and Throat, San Raffaele University Hospital, Milan, Italy
- Carina F Pinheiro
- Department of Health Sciences, Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900 - Vila Monte Alegre, Ribeirão Preto, SP, 14049-900, Brazil
- Jéssica R Moreira
- Department of Health Sciences, Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900 - Vila Monte Alegre, Ribeirão Preto, SP, 14049-900, Brazil
- Fabíola Dach
- Department of Neurosciences and Behavioral Sciences, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Débora Bevilaqua-Grossi
- Department of Health Sciences, Ribeirão Preto Medical School, University of São Paulo, Av. Bandeirantes, 3900 - Vila Monte Alegre, Ribeirão Preto, SP, 14049-900, Brazil.
18
Brief Report: Speech-in-Noise Recognition and the Relation to Vocal Pitch Perception in Adults with Autism Spectrum Disorder and Typical Development. J Autism Dev Disord 2020; 50:356-363. [PMID: 31583624] [DOI: 10.1007/s10803-019-04244-1]
Abstract
We tested the ability to recognise speech in noise and its relation to the ability to discriminate vocal pitch in adults with high-functioning autism spectrum disorder (ASD) and typically developed adults (matched pairwise on age, sex, and IQ). Typically developed individuals understood speech at higher noise levels than the ASD group did. Within the control group, but not within the ASD group, better speech-in-noise recognition abilities were significantly correlated with better vocal pitch discrimination abilities. Our results show that speech-in-noise recognition is restricted in people with ASD. We speculate that perceptual impairments such as difficulties in vocal pitch perception might be relevant in explaining these difficulties in ASD.
19
Domingo Y, Holmes E, Macpherson E, Johnsrude IS. Using spatial release from masking to estimate the magnitude of the familiar-voice intelligibility benefit. J Acoust Soc Am 2019; 146:3487. [PMID: 31795686] [DOI: 10.1121/1.5133628]
Abstract
The ability to segregate simultaneous speech streams is crucial for successful communication. Recent studies have demonstrated that participants can report 10%-20% more words spoken by naturally familiar (e.g., friends or spouses) than unfamiliar talkers in two-voice mixtures. This benefit is commensurate with one of the largest benefits to speech intelligibility currently known: that gained by spatially separating two talkers. However, because of differences in the methods of these previous studies, the relative benefits of spatial separation and voice familiarity are unclear. Here, the familiar-voice benefit and spatial release from masking are directly compared, and it is examined whether and how these two cues interact with one another. Talkers were recorded while speaking sentences from a published closed-set "matrix" task, and then listeners were presented with three different sentences played simultaneously. Each target sentence was played at 0° azimuth, and two masker sentences were symmetrically separated about the target. On average, participants reported 10%-30% more words correctly when the target sentence was spoken in a familiar than unfamiliar voice (collapsed over spatial separation conditions); it was found that participants gain a benefit from a familiar target similar to that gained when an unfamiliar voice is separated from two symmetrical maskers by approximately 15° azimuth.
Affiliation(s)
- Ysabel Domingo
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
- Emma Holmes
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
- Ewan Macpherson
- School of Communication Sciences and Disorders, University of Western Ontario, London, Ontario, Canada
- Ingrid S Johnsrude
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
20
Working-memory disruption by task-irrelevant talkers depends on degree of talker familiarity. Atten Percept Psychophys 2019; 81:1108-1118. [PMID: 30993655] [DOI: 10.3758/s13414-019-01727-2]
Abstract
When one is listening, familiarity with an attended talker's voice improves speech comprehension. Here, we instead investigated the effect of familiarity with a distracting talker. In an irrelevant-speech task, we assessed listeners' working memory for the serial order of spoken digits when a task-irrelevant, distracting sentence was produced by either a familiar or an unfamiliar talker (with rare omissions of the task-irrelevant sentence). We tested two groups of listeners using the same experimental procedure. The first group were undergraduate psychology students (N = 66) who had attended an introductory statistics course. Critically, each student had been taught by one of two course instructors, whose voices served as the familiar and unfamiliar task-irrelevant talkers. The second group of listeners were family members and friends (N = 20) who had known either one of the two talkers for more than 10 years. Students, but not family members and friends, made more errors when the task-irrelevant talker was familiar versus unfamiliar. Interestingly, the effect of talker familiarity was not modulated by the presence of task-irrelevant speech: Students experienced stronger working memory disruption by a familiar talker, irrespective of whether they heard a task-irrelevant sentence during memory retention or merely expected it. While previous work has shown that familiarity with an attended talker benefits speech comprehension, our findings indicate that familiarity with an ignored talker disrupts working memory for target speech. The absence of this effect in family members and friends suggests that the degree of familiarity modulates the memory disruption.
21
Case J, Seyfarth S, Levi SV. Short-term implicit voice-learning leads to a Familiar Talker Advantage: The role of encoding specificity. J Acoust Soc Am 2018; 144:EL497. [PMID: 30599692] [PMCID: PMC6279454] [DOI: 10.1121/1.5081469]
Abstract
Whereas previous research has found that a Familiar Talker Advantage (better spoken language perception for familiar voices) occurs following explicit voice-learning, Case, Seyfarth, and Levi [(2018). J. Speech, Lang., Hear. Res. 61(5), 1251-1260] failed to find this effect after implicit voice-learning. To test whether the advantage is limited to explicit voice-learning, a follow-up experiment evaluated implicit voice-learning under more similar encoding (training) and retrieval (test) conditions. Sentence recognition in noise improved significantly more for familiar than unfamiliar talkers, suggesting that short-term implicit voice-learning can lead to a Familiar Talker Advantage. This paper explores how similarity in encoding and retrieval conditions might affect the acquired processing advantage.
Affiliation(s)
- Julie Case
- Department of Communicative Sciences and Disorders, New York University, 665 Broadway, 9th floor, New York, New York 10012, USA
- Scott Seyfarth
- Department of Linguistics, Ohio State University, 1712 Neil Avenue, Oxley Hall, Columbus, Ohio 43210, USA
- Susannah V Levi
- Department of Communicative Sciences and Disorders, New York University, 665 Broadway, 9th floor, New York, New York 10012, USA
22
Holmes E, Domingo Y, Johnsrude IS. Familiar Voices Are More Intelligible, Even if They Are Not Recognized as Familiar. Psychol Sci 2018; 29:1575-1583. [PMID: 30096018] [DOI: 10.1177/0956797618779083]
Abstract
We can recognize familiar people by their voices, and familiar talkers are more intelligible than unfamiliar talkers when competing talkers are present. However, whether the acoustic voice characteristics that permit recognition and those that benefit intelligibility are the same or different is unknown. Here, we recruited pairs of participants who had known each other for 6 months or longer and manipulated the acoustic correlates of two voice characteristics (vocal tract length and glottal pulse rate). These manipulations had different effects on explicit recognition of familiar voices and on the speech-intelligibility benefit realized from them. Furthermore, even when explicit recognition of familiar voices was eliminated, they were still more intelligible than unfamiliar voices, demonstrating that familiar voices do not need to be explicitly recognized to benefit intelligibility. Processing familiar-voice information therefore appears to depend on multiple, at least partially independent, systems that are recruited depending on the perceptual goal of the listener.
Affiliation(s)
- Emma Holmes
- Brain and Mind Institute, University of Western Ontario
- Ysabel Domingo
- Brain and Mind Institute, University of Western Ontario
- Ingrid S Johnsrude
- Brain and Mind Institute, University of Western Ontario; School of Communication Sciences and Disorders, University of Western Ontario