1
Li J, Hiersche KJ, Saygin ZM. Demystifying visual word form area visual and nonvisual response properties with precision fMRI. iScience 2024; 27:111481. PMID: 39759006; PMCID: PMC11696768; DOI: 10.1016/j.isci.2024.111481.
Abstract
The visual word form area (VWFA) is a region in the left ventrotemporal cortex (VTC) whose specificity remains contentious. Using precision fMRI, we examine the VWFA's responses to numerous visual and nonvisual stimuli, comparing them to adjacent category-selective visual regions and regions involved in language and attentional demand. We find that VWFA responds moderately to non-word visual stimuli, but is unique within VTC in its pronounced selectivity for visual words. Interestingly, the VWFA is also the only category-selective visual region engaged in auditory language, unlike the ubiquitous attentional demand effect throughout the VTC. However, this language selectivity is dwarfed by its visual responses even to nonpreferred categories, indicating the VWFA is not a core (amodal) language region. We also observed two additional auditory language VTC clusters, but these had no specificity for visual words. Our detailed investigation clarifies longstanding controversies about the landscape of visual and auditory language functionality within VTC.
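The "pronounced selectivity" claim above is usually grounded in a quantitative contrast between a region's response to its preferred category and to other categories. Below is a minimal sketch of one standard selectivity index over mean fMRI responses (betas); the metric and all numbers are illustrative assumptions, not the analysis or data of Li et al.

```python
# Sketch of a common fMRI category-selectivity index. Hypothetical
# betas for a VWFA-like ROI; not the metric or data from Li et al.
import numpy as np

def selectivity_index(pref: float, nonpref: float) -> float:
    """(preferred - nonpreferred) / (preferred + nonpreferred); ranges -1..1."""
    return (pref - nonpref) / (pref + nonpref)

# Made-up mean betas (arbitrary units) per stimulus category.
betas = {"words": 1.8, "faces": 0.9, "objects": 1.0, "scenes": 0.6}
nonpref_mean = np.mean([v for k, v in betas.items() if k != "words"])
print(f"word selectivity: {selectivity_index(betas['words'], nonpref_mean):.2f}")
```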
Affiliation(s)
- Jin Li
- Department of Psychology, The Ohio State University, Columbus, OH 43210, USA
- Center for Cognitive and Behavioral Brain Imaging, The Ohio State University, Columbus, OH 43210, USA
- School of Psychology, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Kelly J. Hiersche
- Department of Psychology, The Ohio State University, Columbus, OH 43210, USA
- Center for Cognitive and Behavioral Brain Imaging, The Ohio State University, Columbus, OH 43210, USA
- Zeynep M. Saygin
- Department of Psychology, The Ohio State University, Columbus, OH 43210, USA
- Center for Cognitive and Behavioral Brain Imaging, The Ohio State University, Columbus, OH 43210, USA
2
Maguinness C, Schall S, Mathias B, Schoemann M, von Kriegstein K. Prior multisensory learning can facilitate auditory-only voice-identity and speech recognition in noise. Q J Exp Psychol (Hove) 2024:17470218241278649. PMID: 39164830; DOI: 10.1177/17470218241278649.
Abstract
Seeing the visual articulatory movements of a speaker, while hearing their voice, helps with understanding what is said. This multisensory enhancement is particularly evident in noisy listening conditions. Multisensory enhancement also occurs even in auditory-only conditions: auditory-only speech and voice-identity recognition are superior for speakers previously learned with their face, compared to control learning; an effect termed the "face-benefit." Whether the face-benefit can assist in maintaining robust perception in increasingly noisy listening conditions, similar to concurrent multisensory input, is unknown. Here, in two behavioural experiments, we examined this hypothesis. In each experiment, participants learned a series of speakers' voices together with their dynamic face or a control image. Following learning, participants listened to auditory-only sentences spoken by the same speakers and recognised either the content of the sentences (speech recognition, Experiment 1) or the voice-identity of the speaker (voice-identity recognition, Experiment 2) in increasing levels of auditory noise. For speech recognition, 14 of 30 participants (47%) showed a face-benefit; for voice-identity recognition, 19 of 25 participants (76%) did. For those participants who demonstrated a face-benefit, the benefit increased with auditory noise levels. Taken together, the results support an audio-visual model of auditory communication and suggest that the brain can develop a flexible system in which learned facial characteristics are used to deal with varying auditory uncertainty.
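To make the "face-benefit" measure concrete, the sketch below computes it the way such behavioural benefits are typically quantified: per-condition recognition accuracy for face-learned speakers minus accuracy for control-learned speakers, at each noise level. All variable names and numbers are hypothetical, not the study's data.

```python
# Sketch of a face-benefit computation across noise levels.
# Accuracies and SNR levels are made up for illustration.
import numpy as np

snr_levels_db = [4, 0, -4, -8]                    # hypothetical noise levels
acc_face = np.array([0.90, 0.82, 0.70, 0.55])     # face-learned speakers
acc_control = np.array([0.88, 0.76, 0.60, 0.42])  # control-learned speakers

face_benefit = acc_face - acc_control
for snr, fb in zip(snr_levels_db, face_benefit):
    print(f"SNR {snr:+d} dB: face-benefit = {fb:+.2f}")
# A benefit that grows as SNR drops mirrors the pattern reported above.
```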
Affiliation(s)
- Corrina Maguinness
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Sonja Schall
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Brian Mathias
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- School of Psychology, University of Aberdeen, Aberdeen, United Kingdom
- Martin Schoemann
- Chair of Psychological Methods and Cognitive Modelling, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
3
Har-Shai Yahav P, Sharaabi A, Zion Golumbic E. The effect of voice familiarity on attention to speech in a cocktail party scenario. Cereb Cortex 2024; 34:bhad475. PMID: 38142293; DOI: 10.1093/cercor/bhad475.
Abstract
Selective attention to one speaker in multi-talker environments can be affected by the acoustic and semantic properties of speech. One highly ecological feature of speech that has the potential to assist in selective attention is voice familiarity. Here, we tested how voice familiarity interacts with selective attention by measuring the neural speech-tracking response to both target and non-target speech in a dichotic listening "Cocktail Party" paradigm. We recorded magnetoencephalography (MEG) from n = 33 participants, who were presented with concurrent narratives in two different voices and instructed to attend to one ear ("target") and ignore the other ("non-target"). Participants were familiarized with one of the voices during the week prior to the experiment, rendering that voice familiar to them. Using multivariate speech-tracking analysis we estimated the neural responses to both stimuli and replicated their well-established modulation by selective attention. Importantly, speech-tracking was also affected by voice familiarity: responses were enhanced for target speech and reduced for non-target speech in the contralateral hemisphere when these were spoken in a familiar versus an unfamiliar voice. These findings offer valuable insight into how voice familiarity, and by extension auditory semantics, interacts with goal-driven attention and facilitates perceptual organization and speech processing in noisy environments.
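The "multivariate speech-tracking analysis" named above is commonly implemented as a temporal response function (TRF): a regularised linear mapping from time-lagged copies of the speech envelope to the neural signal. The sketch below shows a minimal ridge-regression TRF on simulated data; it is a generic illustration of the technique, not the authors' pipeline.

```python
# Sketch of TRF-style speech tracking via ridge regression.
# All signals are simulated; this is not the study's analysis code.
import numpy as np

def lagged_design(envelope: np.ndarray, n_lags: int) -> np.ndarray:
    """Stack time-lagged copies of the stimulus envelope (samples x lags)."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = envelope[: n - lag]
    return X

rng = np.random.default_rng(0)
n_lags, lam = 30, 1.0                    # number of lags, ridge penalty
env = np.abs(rng.standard_normal(1000))  # stand-in for a speech envelope
# Simulated neural channel: smoothed envelope plus noise.
meg = np.convolve(env, np.hanning(20), mode="same") + 0.5 * rng.standard_normal(1000)

X = lagged_design(env, n_lags)
# Ridge solution: w = (X'X + lam*I)^-1 X'y
w = np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ meg)
tracking = np.corrcoef(X @ w, meg)[0, 1]  # stimulus-response correlation
print(f"speech-tracking r = {tracking:.2f}")
```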
Affiliation(s)
- Paz Har-Shai Yahav
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Aviya Sharaabi
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
- Elana Zion Golumbic
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan 5290002, Israel
4
Zadoorian S, Rosenblum LD. The Benefit of Bimodal Training in Voice Learning. Brain Sci 2023; 13:1260. PMID: 37759861; PMCID: PMC10526927; DOI: 10.3390/brainsci13091260.
Abstract
It is known that talkers can be recognized by listening to their specific vocal qualities, such as breathiness and fundamental frequency. However, talker identification can also occur by focusing on the talkers' unique articulatory style, which is known to be available both auditorily and visually and can be shared across modalities. Evidence shows that voices heard while seeing talkers' faces are later recognized better on their own compared to voices heard alone. The present study investigated whether the facilitation of voice learning through facial cues relies on talker-specific articulatory or nonarticulatory facial information. Participants were initially trained to learn the voices of ten talkers presented either on their own or together with (a) an articulating face, (b) a static face, or (c) an isolated articulating mouth. Participants were then tested on recognizing the voices on their own, regardless of their training modality. Consistent with previous research, voices learned with articulating faces were later recognized better on their own than voices learned alone. However, isolated articulating mouths provided no such advantage, suggesting that the face-benefit in voice learning draws on facial information beyond the articulating mouth alone.
5
Mathias B, von Kriegstein K. Enriched learning: behavior, brain, and computation. Trends Cogn Sci 2023; 27:81-97. PMID: 36456401; DOI: 10.1016/j.tics.2022.10.007.
Abstract
The presence of complementary information across multiple sensory or motor modalities during learning, referred to as multimodal enrichment, can markedly benefit learning outcomes. Why is this? Here, we integrate cognitive, neuroscientific, and computational approaches to understanding the effectiveness of enrichment and discuss recent neuroscience findings indicating that crossmodal responses in sensory and motor brain regions causally contribute to the behavioral benefits of enrichment. The findings provide novel evidence for multimodal theories of enriched learning, challenge assumptions of longstanding cognitive theories, and provide counterevidence to unimodal neurobiologically inspired theories. Enriched educational methods are likely effective not only because they may engage greater levels of attention or deeper levels of processing, but also because multimodal interactions in the brain can enhance learning and memory.
Affiliation(s)
- Brian Mathias
- School of Psychology, University of Aberdeen, Aberdeen, UK
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
6
Schroeger A, Kaufmann JM, Zäske R, Kovács G, Klos T, Schweinberger SR. Atypical prosopagnosia following right hemispheric stroke: A 23-year follow-up study with M.T. Cogn Neuropsychol 2022; 39:196-207. PMID: 36202621; DOI: 10.1080/02643294.2022.2119838.
Abstract
Most findings on prosopagnosia to date suggest preserved voice recognition in prosopagnosia (except in cases with bilateral lesions). Here we report a follow-up examination of M.T., who acquired prosopagnosia following a large unilateral right-hemispheric lesion in frontal, parietal, and anterior temporal areas that spared core ventral occipitotemporal face areas. Twenty-three years after initial testing we reassessed face and object recognition skills [Henke, K., Schweinberger, S. R., Grigo, A., Klos, T., & Sommer, W. (1998). Specificity of face recognition: Recognition of exemplars of non-face objects in prosopagnosia. Cortex, 34(2), 289-296; Schweinberger, S. R., Klos, T., & Sommer, W. (1995). Covert face recognition in prosopagnosia - A dissociable function? Cortex, 31(3), 517-529] and additionally studied voice recognition. Confirming the persistence of his deficits, M.T. exhibited substantial impairments in famous face recognition and memory for learned faces, but preserved face matching and object recognition skills. Critically, he also showed substantially impaired voice recognition. These findings are consistent with the ideas that (i) prosopagnosia after right anterior temporal lesions can persist over periods of more than 20 years, and that (ii) such lesions can be associated with both facial and vocal deficits in person recognition.
Affiliation(s)
- Anna Schroeger
- Department of Psychology, Faculty of Psychology and Sports Science, Justus Liebig University, Giessen, Germany
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany
- Department for the Psychology of Human Movement and Sport, Institute of Sport Science, Friedrich Schiller University, Jena, Germany
- Jürgen M Kaufmann
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany
- DFG Research Unit Person Perception, Friedrich Schiller University, Jena, Germany
- Romi Zäske
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany
- DFG Research Unit Person Perception, Friedrich Schiller University, Jena, Germany
- Gyula Kovács
- DFG Research Unit Person Perception, Friedrich Schiller University, Jena, Germany
- Biological Psychology and Cognitive Neurosciences, Institute of Psychology, Friedrich Schiller University, Jena, Germany
- Stefan R Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University, Jena, Germany
- DFG Research Unit Person Perception, Friedrich Schiller University, Jena, Germany
7
Maguinness C, von Kriegstein K. Visual mechanisms for voice-identity recognition flexibly adjust to auditory noise level. Hum Brain Mapp 2021; 42:3963-3982. PMID: 34043249; PMCID: PMC8288083; DOI: 10.1002/hbm.25532.
Abstract
Recognising the identity of voices is a key ingredient of communication. Visual mechanisms support this ability: recognition is better for voices previously learned with their corresponding face (compared to a control condition). This so‐called ‘face‐benefit’ is supported by the fusiform face area (FFA), a region sensitive to facial form and identity. Behavioural findings indicate that the face‐benefit increases in noisy listening conditions. The neural mechanisms for this increase are unknown. Here, using functional magnetic resonance imaging, we examined responses in face‐sensitive regions while participants recognised the identity of auditory‐only speakers (previously learned by face) in high (SNR −4 dB) and low (SNR +4 dB) levels of auditory noise. We observed a face‐benefit in both noise levels, for most participants (16 of 21). In high‐noise, the recognition of face‐learned speakers engaged the right posterior superior temporal sulcus motion‐sensitive face area (pSTS‐mFA), a region implicated in the processing of dynamic facial cues. The face‐benefit in high‐noise also correlated positively with increased functional connectivity between this region and voice‐sensitive regions in the temporal lobe in the group of 16 participants with a behavioural face‐benefit. In low‐noise, the face‐benefit was robustly associated with increased responses in the FFA and to a lesser extent the right pSTS‐mFA. The findings highlight the remarkably adaptive nature of the visual network supporting voice‐identity recognition in auditory‐only listening conditions.
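The SNR −4 dB and +4 dB conditions above refer to the speech-to-noise power ratio of the auditory stimuli. Below is a minimal sketch of how noise is commonly scaled to hit a target SNR, assuming simulated signals rather than the study's actual stimuli.

```python
# Sketch of mixing speech and noise at a target SNR (in dB).
# Signals are simulated stand-ins, not the study's materials.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Return speech plus noise scaled so the mixture has the target SNR."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # SNR_dB = 10*log10(p_speech / (scale^2 * p_noise)); solve for scale.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)  # stand-in for 1 s of speech at 16 kHz
noise = rng.standard_normal(16000)
for snr in (+4, -4):
    mixed = mix_at_snr(speech, noise, snr)
    print(f"SNR {snr:+d} dB -> mixture RMS = {np.sqrt(np.mean(mixed ** 2)):.2f}")
```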
Affiliation(s)
- Corrina Maguinness
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany