1
Becker J, Korn CW, Blank H. Pupil diameter as an indicator of sound pair familiarity after statistically structured auditory sequence. Sci Rep 2024; 14:8739. PMID: 38627572; PMCID: PMC11021535; DOI: 10.1038/s41598-024-59302-1.
Abstract
Inspired by recent findings in the visual domain, we investigated whether the stimulus-evoked pupil dilation reflects temporal statistical regularities in sequences of auditory stimuli. We conducted two preregistered pupillometry experiments (experiment 1, n = 30, 21 females; experiment 2, n = 31, 22 females). In both experiments, human participants listened to sequences of spoken vowels in two conditions. In the first condition, the stimuli were presented in a random order and, in the second condition, the same stimuli were presented in a sequence structured in pairs. The second experiment replicated the first experiment with a modified timing and number of stimuli presented and without participants being informed about any sequence structure. The sound-evoked pupil dilation during a subsequent familiarity task indicated that participants learned the auditory vowel pairs of the structured condition. However, pupil diameter during the structured sequence did not differ according to the statistical regularity of the pair structure. This contrasts with similar visual studies, emphasizing the susceptibility of pupil effects during statistically structured sequences to experimental design settings in the auditory domain. In sum, our findings suggest that pupil diameter may serve as an indicator of sound pair familiarity but does not invariably respond to task-irrelevant transition probabilities of auditory sequences.
Affiliation(s)
- Janika Becker
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246, Hamburg, Germany.
- Christoph W Korn
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246, Hamburg, Germany
- Section Social Neuroscience, Department of General Psychiatry, University of Heidelberg, 69115, Heidelberg, Germany
- Helen Blank
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246, Hamburg, Germany
2
Yu K, Zhou Y, Zhang L, Li L, Li P, Wang R. How Different Types of Linguistic Information Impact Voice Perception: Evidence From the Language-Familiarity Effect. Lang Speech 2023; 66:1007-1029. PMID: 36680473; DOI: 10.1177/00238309221143062.
Abstract
Previous studies have suggested an effect of linguistic information on voice perception (e.g., the language-familiarity effect [LFE]). However, it remains unclear which type of specific information in speech contributes to voice perception, including acoustic, phonological, lexical, and semantic information. It is also underexamined whether the roles of these different types of information are modulated by the experimental paradigm (speaker discrimination vs. speaker identification). In this study, we conducted two experiments to investigate these issues regarding LFEs. Experiment 1 examined the roles of acoustic and phonological information in speaker discrimination and identification with forward and time-reversed Mandarin and Indonesian sentences. Experiment 2 further identified the roles of phonological, lexical, and semantic information with forward, word-scrambled, and reconstructed (consisting of pseudo-Mandarin words) Mandarin and forward Indonesian sentences. For Mandarin-only participants, in Experiment 1, speaker discrimination was more accurate for forward than reversed sentences, but there was no LFE for either sentence type. Speaker identification was also more accurate for forward than reversed sentences, whereas there was an LFE for forward sentences. In Experiment 2, speaker discrimination was better for word-scrambled than reconstructed Mandarin sentences. Speaker identification was more accurate for forward and word-scrambled Mandarin sentences but less accurate for reconstructed Mandarin and forward Indonesian sentences. In general, the pattern of the results for Indonesian learners was the same as that for Mandarin-only speakers. These results suggest that different kinds of information support speaker discrimination and identification in native and unfamiliar languages. The LFE in speaker identification depends on both phonological and lexical information.
Affiliation(s)
- Keke Yu
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China
- Yacong Zhou
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China; Huanghe Science and Technology University, China
- Li Li
- The Key Laboratory of Chinese Learning and International Promotion, and College of International Culture, South China Normal University, China
- Ping Li
- The Pennsylvania State University, USA
- Ruiming Wang
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China
3
Krumbiegel J, Ufer C, Blank H. Influence of voice properties on vowel perception depends on speaker context. J Acoust Soc Am 2022; 152:820. PMID: 36050169; DOI: 10.1121/10.0013363.
Abstract
Different speakers produce the same intended vowel with very different physical properties. Fundamental frequency (F0) and formant frequencies (FF), the two main parameters that discriminate between voices, also influence vowel perception. While it has been shown that listeners comprehend speech more accurately if they are familiar with a talker's voice, it is still unclear how such prior information is used when decoding the speech stream. In three online experiments, we examined the influence of speaker context via F0 and FF shifts on the perception of /o/-/u/ vowel contrasts. Participants perceived vowels from an /o/-/u/ continuum shifted toward /u/ when F0 was lowered or FF increased relative to the original speaker's voice and vice versa. This shift was reduced when the speakers were presented in a block-wise context compared to random order. Conversely, the original base voice was perceived to be shifted toward /u/ when presented in the context of a low F0 or high FF speaker, compared to a shift toward /o/ with high F0 or low FF speaker context. These findings demonstrate that F0 and FF jointly influence vowel perception in speaker context.
Affiliation(s)
- Julius Krumbiegel
- Institute for Systems Neuroscience, University Hospital Hamburg-Eppendorf, Hamburg, Germany
- Carina Ufer
- Institute for Systems Neuroscience, University Hospital Hamburg-Eppendorf, Hamburg, Germany
- Helen Blank
- Institute for Systems Neuroscience, University Hospital Hamburg-Eppendorf, Hamburg, Germany
4
Schelinski S, Tabas A, von Kriegstein K. Altered processing of communication signals in the subcortical auditory sensory pathway in autism. Hum Brain Mapp 2022; 43:1955-1972. PMID: 35037743; PMCID: PMC8933247; DOI: 10.1002/hbm.25766.
Abstract
Autism spectrum disorder (ASD) is characterised by social communication difficulties. These difficulties have been mainly explained by cognitive, motivational, and emotional alterations in ASD. The communication difficulties could, however, also be associated with altered sensory processing of communication signals. Here, we assessed the functional integrity of auditory sensory pathway nuclei in ASD in three independent functional magnetic resonance imaging experiments. We focused on two aspects of auditory communication that are impaired in ASD: voice identity perception, and recognising speech-in-noise. We found reduced processing in adults with ASD as compared to typically developed control groups (pairwise matched on sex, age, and full-scale IQ) in the central midbrain structure of the auditory pathway (inferior colliculus [IC]). The right IC responded less in the ASD as compared to the control group for voice identity, in contrast to speech recognition. The right IC also responded less in the ASD as compared to the control group when passively listening to vocal in contrast to non-vocal sounds. Within the control group, the left and right IC responded more when recognising speech-in-noise as compared to when recognising speech without additional noise. In the ASD group, this was only the case in the left, but not the right IC. The results show that communication signal processing in ASD is associated with reduced subcortical sensory functioning in the midbrain. The results highlight the importance of considering sensory processing alterations in explaining communication difficulties, which are at the core of ASD.
Affiliation(s)
- Stefanie Schelinski
- Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Alejandro Tabas
- Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein
- Faculty of Psychology, Chair of Cognitive and Clinical Neuroscience, Technische Universität Dresden, Dresden, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
5
Mihai PG, Tschentscher N, von Kriegstein K. Modulation of the Primary Auditory Thalamus When Recognizing Speech with Background Noise. J Neurosci 2021; 41:7136-7147. PMID: 34244362; PMCID: PMC8372015; DOI: 10.1523/jneurosci.2902-20.2021.
Abstract
Recognizing speech in background noise is a strenuous daily activity, yet most humans can master it. An explanation of how the human brain deals with such sensory uncertainty during speech recognition is to date missing. Previous work has shown that recognition of speech without background noise involves modulation of the auditory thalamus (medial geniculate body; MGB): there are higher responses in left MGB for speech recognition tasks that require tracking of fast-varying stimulus properties in contrast to relatively constant stimulus properties (e.g., speaker identity tasks) despite the same stimulus input. Here, we tested the hypotheses that (1) this task-dependent modulation for speech recognition increases in parallel with the sensory uncertainty in the speech signal, i.e., the amount of background noise; and that (2) this increase is present in the ventral MGB, which corresponds to the primary sensory part of the auditory thalamus. In accordance with our hypothesis, we show, by using ultra-high-resolution functional magnetic resonance imaging (fMRI) in male and female human participants, that the task-dependent modulation of the left ventral MGB (vMGB) for speech is particularly strong when recognizing speech in noisy listening conditions in contrast to situations where the speech signal is clear. The results imply that speech-in-noise recognition is supported by modifications at the level of the subcortical sensory pathway providing driving input to the auditory cortex. SIGNIFICANCE STATEMENT: Speech recognition in noisy environments is a challenging everyday task. One reason why humans can master this task is the recruitment of additional cognitive resources, as reflected in the recruitment of non-language cerebral cortex areas. Here, we show that modulation in the primary sensory pathway is also specifically involved in speech-in-noise recognition. We found that the left primary sensory thalamus (ventral medial geniculate body; vMGB) is more involved when recognizing speech signals as opposed to a control task (speaker identity recognition) when heard in background noise versus when the noise was absent. This finding implies that the brain optimizes sensory processing in subcortical sensory pathway structures in a task-specific manner to deal with speech recognition in noisy environments.
Affiliation(s)
- Paul Glad Mihai
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden 01187, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
- Nadja Tschentscher
- Research Unit Biological Psychology, Department of Psychology, Ludwig-Maximilians-University Munich, Munich 80802, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden 01187, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
6
Johnson JF, Belyk M, Schwartze M, Pinheiro AP, Kotz SA. Expectancy changes the self-monitoring of voice identity. Eur J Neurosci 2021; 53:2681-2695. PMID: 33638190; PMCID: PMC8252045; DOI: 10.1111/ejn.15162.
Abstract
Self-voice attribution can become difficult when voice characteristics are ambiguous, but functional magnetic resonance imaging (fMRI) investigations of such ambiguity are sparse. We utilized voice-morphing (self-other) to manipulate (un-)certainty in self-voice attribution in a button-press paradigm. This allowed investigating how levels of self-voice certainty alter brain activation in brain regions monitoring voice identity and unexpected changes in voice playback quality. fMRI results confirmed a self-voice suppression effect in the right anterior superior temporal gyrus (aSTG) when self-voice attribution was unambiguous. Although the right inferior frontal gyrus (IFG) was more active during a self-generated compared to a passively heard voice, the putative role of this region in detecting unexpected self-voice changes during the action was demonstrated only when hearing the voice of another speaker and not when attribution was uncertain. Further research on the link between right aSTG and IFG is required and may establish a threshold for monitoring voice identity in action. The current results have implications for a better understanding of the altered experience of self-voice feedback in auditory verbal hallucinations.
Affiliation(s)
- Joseph F Johnson
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands
- Michel Belyk
- Division of Psychology and Language Sciences, University College London, London, UK
- Michael Schwartze
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands
- Ana P Pinheiro
- Faculdade de Psicologia, Universidade de Lisboa, Lisbon, Portugal
- Sonja A Kotz
- Department of Neuropsychology and Psychopharmacology, Maastricht University, Maastricht, the Netherlands
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
7
Luthra S. The Role of the Right Hemisphere in Processing Phonetic Variability Between Talkers. Neurobiol Lang 2021; 2:138-151. PMID: 37213418; PMCID: PMC10174361; DOI: 10.1162/nol_a_00028.
Abstract
Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior / middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.
8
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Development of voice perception is dissociated across gender cues in school-age children. Sci Rep 2020; 10:5074. PMID: 32193411; PMCID: PMC7081243; DOI: 10.1038/s41598-020-61732-6.
Abstract
Children's ability to distinguish speakers' voices continues to develop throughout childhood, yet it remains unclear how children's sensitivity to voice cues, such as differences in speakers' gender, develops over time. This so-called voice gender is primarily characterized by speakers' mean fundamental frequency (F0), related to glottal pulse rate, and vocal-tract length (VTL), related to speakers' size. Here we show that children's acquisition of adult-like performance for discrimination, a lower-order perceptual task, and categorization, a higher-order cognitive task, differs across voice gender cues. Children's discrimination was adult-like around the age of 8 for VTL but still differed from adults at the age of 12 for F0. Children's perceptual weight attributed to F0 for gender categorization was adult-like around the age of 6 but around the age of 10 for VTL. Children's discrimination and weighting of F0 and VTL were only correlated for 4- to 6-year-olds. Hence, children's development of discrimination and weighting of voice gender cues are dissociated, i.e., adult-like performance for F0 and VTL is acquired at different rates and does not seem to be closely related. The different developmental patterns for auditory discrimination and categorization highlight the complexity of the relationship between perceptual and cognitive mechanisms of voice perception.
Affiliation(s)
- Leanne Nagels
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands.
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
- Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands.
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université de Lyon, Lyon, France
- Deborah Vickers
- Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
9
Stevenage SV, Symons AE, Fletcher A, Coen C. Sorting through the impact of familiarity when processing vocal identity: Results from a voice sorting task. Q J Exp Psychol (Hove) 2019; 73:519-536. PMID: 31658884; PMCID: PMC7074657; DOI: 10.1177/1747021819888064.
Abstract
The present article reports on one experiment designed to examine the importance of familiarity when processing vocal identity. A voice sorting task was used with participants who were either personally familiar or unfamiliar with three speakers. The results suggested that familiarity supported both an ability to tell different instances of the same voice together, and to tell similar instances of different voices apart. In addition, the results suggested differences between the three speakers in terms of the extent to which they were confusable, underlining the importance of vocal characteristics and stimulus selection within behavioural tasks. The results are discussed with reference to existing debates regarding the nature of stored representations as familiarity develops, and the greater difficulty of processing voices compared with faces more generally.
Affiliation(s)
- Ashley E Symons
- School of Psychology, University of Southampton, Southampton, UK
- Abi Fletcher
- School of Psychology, University of Southampton, Southampton, UK
- Chantelle Coen
- School of Psychology, University of Southampton, Southampton, UK
10
Sjerps MJ, Fox NP, Johnson K, Chang EF. Speaker-normalized sound representations in the human auditory cortex. Nat Commun 2019; 10:2465. PMID: 31165733; PMCID: PMC6549175; DOI: 10.1038/s41467-019-10365-z.
Abstract
The acoustic dimensions that distinguish speech sounds (like the vowel differences in "boot" and "boat") also differentiate speakers' voices. Therefore, listeners must normalize across speakers without losing linguistic information. Past behavioral work suggests an important role for auditory contrast enhancement in normalization: preceding context affects listeners' perception of subsequent speech sounds. Here, using intracranial electrocorticography in humans, we investigate whether and how such context effects arise in auditory cortex. Participants identified speech sounds that were preceded by phrases from two different speakers whose voices differed along the same acoustic dimension as target words (the lowest resonance of the vocal tract). In every participant, target vowels evoke a speaker-dependent neural response that is consistent with the listener's perception, and which follows from a contrast enhancement model. Auditory cortex processing thus displays a critical feature of normalization, allowing listeners to extract meaningful content from the voices of diverse speakers.
Affiliation(s)
- Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Kapittelweg 29, Nijmegen, 6525 EN, The Netherlands
- Max Planck Institute for Psycholinguistics, Wundtlaan 1, Nijmegen, 6525 XD, Netherlands
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA
- Keith Johnson
- Department of Linguistics, University of California, Berkeley, 1203 Dwinelle Hall #2650, Berkeley, California, 94720, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA.
- Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California, 94158, USA.
11
Schelinski S, von Kriegstein K. The Relation Between Vocal Pitch and Vocal Emotion Recognition Abilities in People with Autism Spectrum Disorder and Typical Development. J Autism Dev Disord 2019; 49:68-82. PMID: 30022285; PMCID: PMC6331502; DOI: 10.1007/s10803-018-3681-z.
Abstract
We tested the relation between vocal emotion and vocal pitch perception abilities in adults with high-functioning autism spectrum disorder (ASD) and pairwise matched adults with typical development. The ASD group had impaired vocal but typical non-vocal pitch and vocal timbre perception abilities. The ASD group showed less accurate vocal emotion perception than the comparison group and vocal emotion perception abilities were correlated with traits and symptoms associated with ASD. Vocal pitch and vocal emotion perception abilities were significantly correlated in the comparison group only. Our results suggest that vocal emotion recognition difficulties in ASD might not only be based on difficulties with complex social tasks, but also on difficulties with processing of basic sensory features, such as vocal pitch.
Affiliation(s)
- Stefanie Schelinski
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany
- Technische Universität Dresden, Faculty of Psychology, Bamberger Straße 7, 01187 Dresden, Germany
- Katharina von Kriegstein
- Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany
- Technische Universität Dresden, Faculty of Psychology, Bamberger Straße 7, 01187 Dresden, Germany
12
Kreitewolf J, Mathias SR, Trapeau R, Obleser J, Schönwiesner M. Perceptual grouping in the cocktail party: Contributions of voice-feature continuity. J Acoust Soc Am 2018; 144:2178. PMID: 30404485; DOI: 10.1121/1.5058684.
Abstract
Cocktail parties pose a difficult yet solvable problem for the auditory system. Previous work has shown that the cocktail-party problem is considerably easier when all sounds in the target stream are spoken by the same talker (the voice-continuity benefit). The present study investigated the contributions of two of the most salient voice features-glottal-pulse rate (GPR) and vocal-tract length (VTL)-to the voice-continuity benefit. Twenty young, normal-hearing listeners participated in two experiments. On each trial, listeners heard concurrent sequences of spoken digits from three different spatial locations and reported the digits coming from a target location. Critically, across conditions, GPR and VTL either remained constant or varied across target digits. Additionally, across experiments, the target location either remained constant (Experiment 1) or varied (Experiment 2) within a trial. In Experiment 1, listeners benefited from continuity in either voice feature, but VTL continuity was more helpful than GPR continuity. In Experiment 2, spatial discontinuity greatly hindered listeners' abilities to exploit continuity in GPR and VTL. The present results suggest that selective attention benefits from continuity in target voice features and that VTL and GPR play different roles for perceptual grouping and stream segregation in the cocktail party.
Affiliation(s)
- Jens Kreitewolf
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Samuel R Mathias
- Neurocognition, Neurocomputation and Neurogenetics (n3) Division, Yale University School of Medicine, 40 Temple Street, New Haven, Connecticut 06511, USA
- Régis Trapeau
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
- Jonas Obleser
- Department of Psychology, University of Lübeck, Maria-Goeppert-Straße 9a, D-23562 Lübeck, Germany
- Marc Schönwiesner
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada
13
Maguinness C, Roswandowitz C, von Kriegstein K. Understanding the mechanisms of familiar voice-identity recognition in the human brain. Neuropsychologia 2018; 116:179-193. DOI: 10.1016/j.neuropsychologia.2018.03.039.
14
Training-induced brain activation and functional connectivity differentiate multi-talker and single-talker speech training. Neurobiol Learn Mem 2018. PMID: 29535043; DOI: 10.1016/j.nlm.2018.03.009.
Abstract
In second-language acquisition studies, the high-talker-variability training approach has frequently been used to train participants to learn new speech patterns. However, the neuroplasticity induced by such training is poorly understood. In the present study, native English speakers were trained on non-native pitch patterns (linguistic tones from Mandarin Chinese) in multi-talker (N = 16) or single-talker (N = 16) training conditions. We focused on two aspects of multi-talker training, voice processing and access to lexical phonology, and used functional magnetic resonance imaging (fMRI) to measure the brain activation and functional connectivity (FC) of two regions of interest, the anterior part of the right superior temporal gyrus (aRSTG) and the posterior left superior temporal gyrus (pLSTG), in a tone identification task conducted before and after training. The results showed distinct patterns of associations between neural signals and learning success for multi-talker training. Specifically, post-training brain activation in the aRSTG and FC strength between the aRSTG and pLSTG were correlated with learning success in the multi-talker training group but not in the single-talker group. These results suggest that talker variability in the training procedure may enhance neural efficiency in these brain areas and strengthen the cooperation between them. Our findings highlight that the brain processing of newly learned speech patterns is influenced by the training approach used.
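The brain-behavior association reported above (post-training aRSTG activation and FC strength vs. learning success) is the kind of relationship typically quantified with a Pearson correlation across participants. A minimal sketch with entirely hypothetical numbers, not data from the study:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-participant values: ROI activation vs. learning gain.
activation = [0.2, 0.5, 0.3, 0.8, 0.6, 0.9]
gain       = [0.1, 0.4, 0.2, 0.7, 0.5, 0.8]
r = pearson_r(activation, gain)
```

A correlation computed only within the multi-talker group, as in the study's analysis, would then be contrasted against the same correlation in the single-talker group.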
15
Carey D, Miquel ME, Evans BG, Adank P, McGettigan C. Functional brain outcomes of L2 speech learning emerge during sensorimotor transformation. Neuroimage 2017; 159:18-31. [PMID: 28669904 DOI: 10.1016/j.neuroimage.2017.06.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2017] [Revised: 06/20/2017] [Accepted: 06/21/2017] [Indexed: 11/18/2022] Open
Abstract
Sensorimotor transformation (ST) may be a critical process in mapping perceived speech input onto non-native (L2) phonemes, in support of subsequent speech production. Yet, little is known concerning the role of ST with respect to L2 speech, particularly where learned L2 phones (e.g., vowels) must be produced in more complex lexical contexts (e.g., multi-syllabic words). Here, we charted the behavioral and neural outcomes of producing trained L2 vowels at word level, using a speech imitation paradigm and functional MRI. We asked whether participants would be able to faithfully imitate trained L2 vowels when they occurred in non-words of varying complexity (one or three syllables). Moreover, we related individual differences in imitation success during training to BOLD activation during ST (i.e., pre-imitation listening), and during later imitation. We predicted that superior temporal and peri-Sylvian speech regions would show increased activation as a function of item complexity and non-nativeness of vowels, during ST. We further anticipated that pre-scan acoustic learning performance would predict BOLD activation for non-native (vs. native) speech during ST and imitation. We found individual differences in imitation success for training on the non-native vowel tokens in isolation; these were preserved in a subsequent task, during imitation of mono- and trisyllabic words containing those vowels. fMRI data revealed a widespread network involved in ST, modulated by both vowel nativeness and utterance complexity: superior temporal activation increased monotonically with complexity, showing greater activation for non-native than native vowels when presented in isolation and in trisyllables, but not in monosyllables. Individual differences analyses showed that learning versus lack of improvement on the non-native vowel during pre-scan training predicted increased ST activation for non-native compared with native items, at insular cortex, pre-SMA/SMA, and cerebellum. 
Our results hold implications for the importance of ST as a process underlying successful imitation of non-native speech.
Affiliation(s)
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, TW20 0EX, UK; The Irish Longitudinal Study on Ageing (TILDA), Dept. Medical Gerontology, TCD, Dublin, Ireland
- Marc E Miquel
- William Harvey Research Institute, Queen Mary, University of London, EC1M 6BQ, UK; Clinical Physics, Barts Health NHS Trust, London, EC1A 7BE, UK
- Bronwen G Evans
- Department of Speech, Hearing & Phonetic Sciences, University College London, WC1E 6BT, UK
- Patti Adank
- Department of Speech, Hearing & Phonetic Sciences, University College London, WC1E 6BT, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AR, UK.

16
Kreitewolf J, Mathias SR, von Kriegstein K. Implicit Talker Training Improves Comprehension of Auditory Speech in Noise. Front Psychol 2017; 8:1584. [PMID: 28959226 PMCID: PMC5603660 DOI: 10.3389/fpsyg.2017.01584] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2017] [Accepted: 08/29/2017] [Indexed: 11/13/2022] Open
Abstract
Previous studies have shown that listeners are better able to understand speech when they are familiar with the talker's voice. In most of these studies, talker familiarity was ensured by explicit voice training; that is, listeners learned to identify the familiar talkers. In the real world, however, the characteristics of familiar talkers are learned incidentally, through communication. The present study investigated whether speech comprehension benefits from implicit voice training; that is, through exposure to talkers' voices without listeners explicitly trying to identify them. During four training sessions, listeners heard short sentences containing a single verb (e.g., "he writes"), spoken by one talker. The sentences were mixed with noise, and listeners identified the verb within each sentence while their speech-reception thresholds (SRT) were measured. In a final test session, listeners performed the same task, but this time they heard different sentences spoken by the familiar talker and three unfamiliar talkers. Familiar and unfamiliar talkers were counterbalanced across listeners. Half of the listeners performed a test session in which the four talkers were presented in separate blocks (blocked paradigm). For the other half, talkers varied randomly from trial to trial (interleaved paradigm). The results showed that listeners had lower SRT when the speech was produced by the familiar talker than the unfamiliar talkers. The type of talker presentation (blocked vs. interleaved) had no effect on this familiarity benefit. These findings suggest that listeners implicitly learn talker-specific information during a speech-comprehension task, and exploit this information to improve the comprehension of novel speech material from familiar talkers.
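Speech-reception thresholds like those described above are commonly measured with an adaptive staircase that makes the task harder after a correct response and easier after an error. A minimal illustrative sketch of a 1-down/1-up staircase converging on the 50%-correct point; the simulated listener and all parameters are hypothetical, not the authors' exact procedure:

```python
import math
import random

def staircase_srt(p_correct, start_snr=10.0, step=2.0, n_trials=60):
    """1-down/1-up adaptive staircase converging on ~50% correct.

    p_correct: function mapping SNR (dB) to the probability of a correct
    response -- a stand-in for a real listener identifying the verb.
    Returns the mean SNR over the second half of the track as the estimate.
    """
    snr, track = start_snr, []
    for _ in range(n_trials):
        correct = random.random() < p_correct(snr)
        snr += -step if correct else step  # harder after a hit, easier after a miss
        track.append(snr)
    half = track[len(track) // 2:]
    return sum(half) / len(half)

# Toy logistic psychometric function with its 50% point at 0 dB SNR.
def listener(snr):
    return 1.0 / (1.0 + math.exp(-snr / 2.0))

random.seed(1)                 # fixed seed for a reproducible track
srt = staircase_srt(listener)  # should land near the 50% point (0 dB)
```

A familiarity benefit would then show up as a lower SRT (better performance in noise) for the familiar talker than for unfamiliar talkers.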
Affiliation(s)
- Jens Kreitewolf
- Department of Psychology, University of Lübeck, Lübeck, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Samuel R Mathias
- Department of Psychiatry, Yale University, New Haven, CT, United States
- Katharina von Kriegstein
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychology, Humboldt University of Berlin, Berlin, Germany

17
Stevenage SV. Drawing a distinction between familiar and unfamiliar voice processing: A review of neuropsychological, clinical and empirical findings. Neuropsychologia 2017; 116:162-178. [PMID: 28694095 DOI: 10.1016/j.neuropsychologia.2017.07.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2017] [Revised: 06/04/2017] [Accepted: 07/07/2017] [Indexed: 11/29/2022]
Abstract
Thirty years on from their initial observation that familiar voice recognition is not the same as unfamiliar voice discrimination (van Lancker and Kreiman, 1987), the current paper reviews available evidence in support of a distinction between familiar and unfamiliar voice processing. Here, an extensive review of the literature is provided, drawing on evidence from four domains of interest: the neuropsychological study of healthy individuals, neuropsychological investigation of brain-damaged individuals, the exploration of voice recognition deficits in less commonly studied clinical conditions, and finally empirical data from healthy individuals. All evidence is assessed in terms of its contribution to the question of interest: is familiar voice processing distinct from unfamiliar voice processing? In this regard, the evidence provides compelling support for van Lancker and Kreiman's early observation. Two considerations result: First, the limits of research based on one or other type of voice stimulus are more clearly appreciated. Second, given the demonstration of a distinction between unfamiliar and familiar voice processing, a new wave of research is encouraged which examines the transition involved as a voice is learned.
Affiliation(s)
- Sarah V Stevenage
- Department of Psychology, University of Southampton, Highfield, Southampton, Hampshire SO17 1BJ, UK.

18
Neuromagnetic correlates of voice pitch, vowel type, and speaker size in auditory cortex. Neuroimage 2017; 158:79-89. [PMID: 28669914 DOI: 10.1016/j.neuroimage.2017.06.065] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2017] [Revised: 06/13/2017] [Accepted: 06/22/2017] [Indexed: 11/24/2022] Open
Abstract
Vowel recognition is largely immune to differences in speaker size despite the waveform differences associated with variation in speaker size. This has led to the suggestion that voice pitch and mean formant frequency (MFF) are extracted early in the hierarchy of hearing/speech processing and used to normalize the internal representation of vowel sounds. This paper presents a magnetoencephalographic (MEG) experiment designed to locate and compare neuromagnetic activity associated with voice pitch, MFF and vowel type in human auditory cortex. Sequences of six sustained vowels were used to contrast changes in the three components of vowel perception, and MEG responses to the changes were recorded from 25 participants. A staged procedure was employed to fit the MEG data with a source model having one bilateral pair of dipoles for each component of vowel perception. This dipole model showed that the activity associated with the three perceptual changes was functionally separable; the pitch source was located in Heschl's gyrus (bilaterally), while the vowel-type and formant-frequency sources were located (bilaterally) just behind Heschl's gyrus in planum temporale. The results confirm that vowel normalization begins in auditory cortex at an early point in the hierarchy of speech processing.
19
Myers EB, Theodore RM. Voice-sensitive brain networks encode talker-specific phonetic detail. Brain Lang 2017; 165:33-44. [PMID: 27898342 PMCID: PMC5237402 DOI: 10.1016/j.bandl.2016.11.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/05/2016] [Revised: 09/13/2016] [Accepted: 11/04/2016] [Indexed: 05/09/2023]
Abstract
The speech stream simultaneously carries information about talker identity and linguistic content, and the same acoustic property (e.g., voice-onset-time, or VOT) may be used for both purposes. Separable neural networks for processing talker identity and phonetic content have been identified, but it is unclear how a singular acoustic property is parsed by the neural system for talker identification versus phonetic processing. In the current study, listeners were exposed to two talkers with characteristically different VOTs. Subsequently, brain activation was measured using fMRI as listeners performed a phonetic categorization task on these stimuli. Right temporoparietal regions previously implicated in talker identification showed sensitivity to the match between VOT variant and talker, whereas left posterior temporal regions showed sensitivity to the typicality of phonetic exemplars, regardless of talker typicality. Taken together, these results suggest that neural systems for voice recognition capture talker-specific phonetic variation.
Affiliation(s)
- Emily B Myers
- University of Connecticut, Department of Speech, Language, and Hearing Sciences, 850 Bolton Road, Unit 1085, Storrs, CT 06269-1085, United States; University of Connecticut, Department of Psychological Sciences, 406 Babbidge Road, Unit 1020, Storrs, CT 06269-1020, United States; Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, United States; Connecticut Institute for the Brain and Cognitive Sciences, 337 Mansfield Road, Unit 1272, Storrs, CT 06269-1085, United States.
- Rachel M Theodore
- University of Connecticut, Department of Speech, Language, and Hearing Sciences, 850 Bolton Road, Unit 1085, Storrs, CT 06269-1085, United States; Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, United States; Connecticut Institute for the Brain and Cognitive Sciences, 337 Mansfield Road, Unit 1272, Storrs, CT 06269-1085, United States

20
Schelinski S, Borowiak K, von Kriegstein K. Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition. Soc Cogn Affect Neurosci 2016; 11:1812-1822. [PMID: 27369067 PMCID: PMC5091681 DOI: 10.1093/scan/nsw089] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2016] [Revised: 05/05/2016] [Accepted: 06/20/2016] [Indexed: 11/24/2022] Open
Abstract
The ability to recognise the identity of others is a key requirement for successful communication. Brain regions that respond selectively to voices exist in humans from early infancy on. Currently, it is unclear whether dysfunction of these voice-sensitive regions can explain voice identity recognition impairments. Here, we used two independent functional magnetic resonance imaging studies to investigate voice processing in a population that has been reported to have no voice-sensitive regions: autism spectrum disorder (ASD). Our results refute the earlier report that individuals with ASD have no responses in voice-sensitive regions: Passive listening to vocal, compared to non-vocal, sounds elicited typical responses in voice-sensitive regions in the high-functioning ASD group and controls. In contrast, the ASD group had a dysfunction in voice-sensitive regions during voice identity but not speech recognition in the right posterior superior temporal sulcus/gyrus (STS/STG), a region implicated in processing complex spectrotemporal voice features and unfamiliar voices. The right anterior STS/STG correlated with voice identity recognition performance in controls but not in the ASD group. The findings suggest that right STS/STG dysfunction is critical for explaining voice recognition impairments in high-functioning ASD and show that ASD is not characterised by a general lack of voice-sensitive responses.
Affiliation(s)
- Stefanie Schelinski
- Max Planck Institute for Human Cognitive and Brain Sciences, Max Planck Research Group, Neural mechanisms of human communication, Leipzig, 04103, Germany
- Kamila Borowiak
- Max Planck Institute for Human Cognitive and Brain Sciences, Max Planck Research Group, Neural mechanisms of human communication, Leipzig, 04103, Germany; Berlin School of Mind and Brain, Humboldt University of Berlin, Berlin, 10117
- Katharina von Kriegstein
- Max Planck Institute for Human Cognitive and Brain Sciences, Max Planck Research Group, Neural mechanisms of human communication, Leipzig, 04103, Germany; Department of Psychology, Humboldt University of Berlin, Berlin, 12489, Germany

21
Schelinski S, Roswandowitz C, von Kriegstein K. Voice identity processing in autism spectrum disorder. Autism Res 2016; 10:155-168. [PMID: 27404447 DOI: 10.1002/aur.1639] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2015] [Revised: 04/01/2016] [Accepted: 04/04/2016] [Indexed: 12/20/2022]
Abstract
People with autism spectrum disorder (ASD) have difficulties in identifying another person by face and voice. This might contribute considerably to the development of social cognition and interaction difficulties. The characteristics of the voice recognition deficit in ASD are unknown. Here, we used a comprehensive behavioral test battery to systematically investigate voice processing in high-functioning ASD (n = 16) and typically developed pair-wise matched controls (n = 16). The ASD group had particular difficulties with discriminating, learning, and recognizing unfamiliar voices, while recognizing famous voices was relatively intact. Tests on acoustic processing abilities showed that the ASD group had a specific deficit in vocal pitch perception that was dissociable from otherwise intact acoustic processing (i.e., musical pitch, musical, and vocal timbre perception). Our results allow a characterization of the voice recognition deficit in ASD: The findings indicate that in high-functioning ASD, the difficulty in recognizing voices is particularly pronounced for learning novel voices and the recognition of unfamiliar peoples' voices. This pattern might be indicative of difficulties with integrating the acoustic characteristics of the voice into a coherent percept, a function that has been previously associated with voice-selective regions in the posterior superior temporal sulcus/gyrus of the human brain. Autism Res 2017, 10: 155-168. © 2016 International Society for Autism Research, Wiley Periodicals, Inc.
Affiliation(s)
- Stefanie Schelinski
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany; Humboldt University of Berlin, Berlin, Germany
- Claudia Roswandowitz
- International Max Planck Research School on Neuroscience of Communication, Leipzig, Germany

22
Marie D, Maingault S, Crivello F, Mazoyer B, Tzourio-Mazoyer N. Surface-Based Morphometry of Cortical Thickness and Surface Area Associated with Heschl's Gyri Duplications in 430 Healthy Volunteers. Front Hum Neurosci 2016; 10:69. [PMID: 27014013 PMCID: PMC4779901 DOI: 10.3389/fnhum.2016.00069] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2015] [Accepted: 02/11/2016] [Indexed: 01/31/2023] Open
Abstract
We applied Surface-Based Morphometry to assess the variations in cortical thickness (CT) and cortical surface area (CSA) in relation to the occurrence of Heschl's gyrus (HG) duplications in each hemisphere. 430 healthy brains that had previously been classified as having a single HG, Common Stem Duplication (CSD) or Complete Posterior Duplication (CPD) in each hemisphere were analyzed. To optimally align the HG area across the different groups of gyrification, we computed a specific surface-based template composed of 40 individuals with a symmetrical HG gyrification pattern (20 single HG, 10 CPD, 10 CSD). After normalizing the 430 participants' T1 images to this specific template, we separately compared the groups constituted of participants with a single HG, CPD, and CSD in each hemisphere. The occurrence of a duplication in either hemisphere was associated with an increase in CT posterior to the primary auditory cortex. This may be the neural substrate of the expertise or enhanced abilities in speech or music processing that previous studies have associated with duplications. A decrease in CSA in the planum temporale was detected in cases with duplication in the left hemisphere. In the right hemisphere, a medial decrease in CSA and a lateral increase in CSA were present in HG when a CPD occurred, together with an increase in CSA in the depth of the superior temporal sulcus (STS) in CSD compared to a single HG. These variations associated with duplication might be related to the functions that they process jointly within each hemisphere: temporal and speech processing in the left and spectral and music processing in the right.
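Group comparisons of vertex-wise cortical thickness like the one summarized above reduce, at each vertex, to comparing two samples of thickness values. A minimal sketch using Welch's t statistic on hypothetical per-participant values (illustrative only; the study's actual analysis used whole-surface statistics on 430 brains):

```python
import math
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / math.sqrt(va / len(a) + vb / len(b))

# Hypothetical cortical-thickness values (mm) at one vertex posterior to
# the primary auditory cortex, for single-HG vs. duplication groups.
single_hg   = [2.41, 2.38, 2.45, 2.40, 2.36, 2.43]
duplication = [2.52, 2.49, 2.55, 2.50, 2.47, 2.54]
t = welch_t(duplication, single_hg)  # positive t: thicker cortex with duplication
```

In practice this test would be repeated at every vertex of the surface template, with a correction for multiple comparisons across vertices.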
Affiliation(s)
- Damien Marie
- Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Université de Bordeaux, Bordeaux, France; Centre National de la Recherche Scientifique, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France; Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France
- Sophie Maingault
- Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Université de Bordeaux, Bordeaux, France; Centre National de la Recherche Scientifique, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France; Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France
- Fabrice Crivello
- Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Université de Bordeaux, Bordeaux, France; Centre National de la Recherche Scientifique, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France; Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France
- Bernard Mazoyer
- Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Université de Bordeaux, Bordeaux, France; Centre National de la Recherche Scientifique, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France; Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France
- Nathalie Tzourio-Mazoyer
- Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Université de Bordeaux, Bordeaux, France; Centre National de la Recherche Scientifique, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France; Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Groupe d'Imagerie Neurofonctionnelle, Institut des Maladies Neurodégénératives UMR 5293, Bordeaux, France

23
Zhang C, Pugh KR, Mencl WE, Molfese PJ, Frost SJ, Magnuson JS, Peng G, Wang WSY. Functionally integrated neural processing of linguistic and talker information: An event-related fMRI and ERP study. Neuroimage 2015; 124:536-549. [PMID: 26343322 DOI: 10.1016/j.neuroimage.2015.08.064] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2014] [Revised: 08/15/2015] [Accepted: 08/28/2015] [Indexed: 11/16/2022] Open
Abstract
Speech signals contain information of both linguistic content and a talker's voice. Conventionally, linguistic and talker processing are thought to be mediated by distinct neural systems in the left and right hemispheres respectively, but there is growing evidence that linguistic and talker processing interact in many ways. Previous studies suggest that talker-related vocal tract changes are processed integrally with phonetic changes in the bilateral posterior superior temporal gyrus/superior temporal sulcus (STG/STS), because the vocal tract parameter influences the perception of phonetic information. It is yet unclear whether the bilateral STG is also activated by the integral processing of another parameter, pitch, which influences the perception of lexical tone information and is related to talker differences in tone languages. In this study, we conducted separate functional magnetic resonance imaging (fMRI) and event-related potential (ERP) experiments to examine the spatial and temporal loci of interactions of lexical tone and talker-related pitch processing in Cantonese. We found that the STG was activated bilaterally during the processing of talker changes when listeners attended to lexical tone changes in the stimuli and during the processing of lexical tone changes when listeners attended to talker changes, suggesting that lexical tone and talker processing are functionally integrated in the bilateral STG. This extends the previous study, providing evidence for a general neural mechanism of integral phonetic and talker processing in the bilateral STG. The ERP results show interactions of lexical tone and talker processing 500-800 ms after auditory word onset (a simultaneous posterior P3b and a frontal negativity).
Moreover, there is some asymmetry in the interaction, such that unattended talker changes affect linguistic processing more than vice versa, which may be related to the ambiguity that talker changes cause in speech perception and/or attention bias to talker changes. Our findings have implications for understanding the neural encoding of linguistic and talker information.
Affiliation(s)
- Caicai Zhang
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China; Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- Kenneth R Pugh
- Haskins Laboratories, New Haven, CT, USA; Department of Psychology, University of Connecticut, Storrs, CT, USA; Department of Linguistics, Yale University, New Haven, CT, USA
- W Einar Mencl
- Haskins Laboratories, New Haven, CT, USA; Department of Linguistics, Yale University, New Haven, CT, USA
- Peter J Molfese
- Haskins Laboratories, New Haven, CT, USA; Department of Psychology, University of Connecticut, Storrs, CT, USA
- James S Magnuson
- Haskins Laboratories, New Haven, CT, USA; Department of Psychology, University of Connecticut, Storrs, CT, USA
- Gang Peng
- Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CUHK-PKU-UST Joint Research Centre for Language and Human Complexity, The Chinese University of Hong Kong, Hong Kong, China; Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong, China.
- William S-Y Wang
- Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; CUHK-PKU-UST Joint Research Centre for Language and Human Complexity, The Chinese University of Hong Kong, Hong Kong, China; Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong, China; Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China

24
Visual abilities are important for auditory-only speech recognition: Evidence from autism spectrum disorder. Neuropsychologia 2014; 65:1-11. [DOI: 10.1016/j.neuropsychologia.2014.09.031] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2014] [Revised: 08/25/2014] [Accepted: 09/18/2014] [Indexed: 11/22/2022]
25
Hemispheric lateralization of linguistic prosody recognition in comparison to speech and speaker recognition. Neuroimage 2014; 102 Pt 2:332-44. [PMID: 25087482 DOI: 10.1016/j.neuroimage.2014.07.038] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2014] [Revised: 06/24/2014] [Accepted: 07/18/2014] [Indexed: 11/24/2022] Open
Abstract
Hemispheric specialization for linguistic prosody is a controversial issue. While it is commonly assumed that linguistic prosody and emotional prosody are preferentially processed in the right hemisphere, neuropsychological work directly comparing processes of linguistic prosody and emotional prosody suggests a predominant role of the left hemisphere for linguistic prosody processing. Here, we used two functional magnetic resonance imaging (fMRI) experiments to clarify the role of left and right hemispheres in the neural processing of linguistic prosody. In the first experiment, we sought to confirm previous findings showing that linguistic prosody processing compared to other speech-related processes predominantly involves the right hemisphere. Unlike previous studies, we controlled for stimulus influences by employing a prosody and speech task using the same speech material. The second experiment was designed to investigate whether a left-hemispheric involvement in linguistic prosody processing is specific to contrasts between linguistic prosody and emotional prosody or whether it also occurs when linguistic prosody is contrasted against other non-linguistic processes (i.e., speaker recognition). Prosody and speaker tasks were performed on the same stimulus material. In both experiments, linguistic prosody processing was associated with activity in temporal, frontal, parietal and cerebellar regions. Activation in temporo-frontal regions showed differential lateralization depending on whether the control task required recognition of speech or speaker: recognition of linguistic prosody predominantly involved right temporo-frontal areas when it was contrasted against speech recognition; when contrasted against speaker recognition, recognition of linguistic prosody predominantly involved left temporo-frontal areas. 
The results show that linguistic prosody processing involves functions of both hemispheres and suggest that recognition of linguistic prosody is based on an inter-hemispheric mechanism which exploits both a right-hemispheric sensitivity to pitch information and a left-hemispheric dominance in speech processing.