1. Roark CL, Paulon G, Rebaudo G, McHaney JR, Sarkar A, Chandrasekaran B. Individual differences in working memory impact the trajectory of non-native speech category learning. PLoS One 2024; 19:e0297917. [PMID: 38857268] [PMCID: PMC11164376] [DOI: 10.1371/journal.pone.0297917]
Abstract
What is the role of working memory over the course of non-native speech category learning? Prior work has predominantly focused on how working memory might influence learning assessed at a single timepoint. Here, we substantially extend this prior work by examining the role of working memory on speech learning performance over time (i.e., over several months) and leverage a multifaceted approach that provides key insights into how working memory influences learning accuracy, maintenance of knowledge over time, generalization ability, and decision processes. We found that the role of working memory in non-native speech learning depends on the timepoint of learning and whether individuals learned the categories at all. Among learners, across all stages of learning, working memory was associated with higher accuracy as well as faster and slightly more cautious decision making. Further, while learners and non-learners did not have substantially different working memory performance, learners had faster evidence accumulation and more cautious decision thresholds throughout all sessions. Working memory may enhance learning by facilitating rapid category acquisition in initial stages and enabling faster and slightly more careful decision-making strategies that may reduce the overall effort needed to learn. Our results have important implications for developing interventions to improve learning in naturalistic language contexts.
Affiliation(s)
- Casey L. Roark: Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America; Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
- Giorgio Paulon: Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Giovanni Rebaudo: Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Jacie R. McHaney: Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Abhra Sarkar: Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Bharath Chandrasekaran: Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America; Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
2. Yu K, Zhou Y, Zhang L, Li L, Li P, Wang R. How Different Types of Linguistic Information Impact Voice Perception: Evidence From the Language-Familiarity Effect. Language and Speech 2023; 66:1007-1029. [PMID: 36680473] [DOI: 10.1177/00238309221143062]
Abstract
Previous studies have suggested that linguistic information affects voice perception (e.g., the language-familiarity effect [LFE]). However, it remains unclear which specific types of information in speech contribute to voice perception: acoustic, phonological, lexical, or semantic. It is also underexamined whether the roles of these different types of information are modulated by the experimental paradigm (speaker discrimination vs. speaker identification). In this study, we conducted two experiments to investigate these issues regarding the LFE. Experiment 1 examined the roles of acoustic and phonological information in speaker discrimination and identification with forward and time-reversed Mandarin and Indonesian sentences. Experiment 2 further identified the roles of phonological, lexical, and semantic information with forward, word-scrambled, and reconstructed (consisting of pseudo-Mandarin words) Mandarin sentences and forward Indonesian sentences. For Mandarin-only participants in Experiment 1, speaker discrimination was more accurate for forward than time-reversed sentences, but there was no LFE for either sentence type. Speaker identification was also more accurate for forward than time-reversed sentences, and there was an LFE for forward sentences. In Experiment 2, speaker discrimination was better for word-scrambled than reconstructed Mandarin sentences. Speaker identification was more accurate for forward and word-scrambled Mandarin sentences but less accurate for reconstructed Mandarin and forward Indonesian sentences. In general, the pattern of results for Indonesian learners was the same as that for Mandarin-only speakers. These results suggest that different kinds of information support speaker discrimination and identification in native and unfamiliar languages. The LFE in speaker identification depends on both phonological and lexical information.
Affiliation(s)
- Keke Yu: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China
- Yacong Zhou: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China; Huanghe Science and Technology University, China
- Li Li: The Key Laboratory of Chinese Learning and International Promotion, and College of International Culture, South China Normal University, China
- Ping Li: The Pennsylvania State University, USA
- Ruiming Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, China
3. McHaney JR, Schuerman WL, Leonard MK, Chandrasekaran B. Transcutaneous Auricular Vagus Nerve Stimulation Modulates Performance but Not Pupil Size During Nonnative Speech Category Learning. Journal of Speech, Language, and Hearing Research 2023; 66:3825-3843. [PMID: 37652065] [DOI: 10.1044/2023_jslhr-22-00596]
Abstract
Purpose: Subthreshold transcutaneous auricular vagus nerve stimulation (taVNS) synchronized with behavioral training can selectively enhance nonnative speech category learning in adults. Prior work has demonstrated that behavioral performance increases when taVNS is paired with easier-to-learn Mandarin tone categories in native English listeners, relative to when taVNS is paired with harder-to-learn Mandarin tone categories or omitted. Mechanistically, this temporally precise plasticity has been attributed to noradrenergic modulation. However, prior work did not utilize methodologies that index noradrenergic modulation and, therefore, was unable to explicitly test this hypothesis. Our goal for this study was to use pupillometry to gain mechanistic insights into the behavioral effects of taVNS.
Method: Thirty-eight participants learned to categorize Mandarin tones while pupillometry was recorded. In a double-blinded design, participants were divided into two taVNS groups that, as in the prior study, differed according to whether taVNS was paired with easier-to-learn tones or harder-to-learn tones. Learning performance and pupillary responses were analyzed using linear mixed-effects models.
Results: taVNS did not have any tone-specific or group-level behavioral or pupillary effects. However, in an exploratory analysis, we observed that taVNS did lead to faster rates of learning on trials paired with stimulation, particularly for participants stimulated at lower amplitudes.
Conclusions: Our results suggest that pupillary responses may not be a reliable marker of locus coeruleus-norepinephrine system activity in humans. Future research should systematically examine the effects of stimulation amplitude on both behavior and pupillary responses.
Supplemental Material: https://doi.org/10.23641/asha.24036666
4. Gan Z, Zheng L, Wang S, Feng G. Distribution-dependent representations in auditory category learning and generalization. Front Psychol 2023; 14:1132570. [PMID: 37829077] [PMCID: PMC10566369] [DOI: 10.3389/fpsyg.2023.1132570]
Abstract
A fundamental objective in the auditory sciences is to understand how people learn to generalize auditory category knowledge to new situations. How we generalize to novel scenarios speaks to the nature of acquired category representations and to the generalization mechanisms that handle perceptual variability and novelty. The dual learning system (DLS) framework proposes that auditory category learning involves an explicit, hypothesis-testing learning system, which is optimal for learning rule-based (RB) categories, and an implicit, procedural-based learning system, which is optimal for learning categories that require pre-decisional information integration (II) across acoustic dimensions. Although the DLS framework describes distinct mechanisms for the two types of category learning, it is not yet clear what representations are acquired or how we transfer them to new contexts. Here, we conducted three experiments to examine differences between II and RB category representations by testing which acoustic and perceptual novelties and variabilities affect learners' generalization success. Learners successfully categorized different sets of untrained sounds after only eight blocks of training for both II and RB categories. Category structure and novel contexts differentially modulated generalization success. II learners showed significantly decreased generalization performance when categorizing new items drawn from an untrained perceptual area and in a context with more dispersed samples. In contrast, RB learners' generalization was resistant to changes in perceptual region but sensitive to changes in sound dispersity. Representational similarity modeling revealed that generalization in the more dispersed sampling context was accomplished differently by II and RB learners: II learners increased representations of perceptual similarity and decision distance to compensate for the decreased transfer of category representations, whereas RB learners defaulted to a more computationally costly strategy, computing the distance to the decision bound to guide generalization decisions. These results suggest that distinct representations emerge after learning the two types of category structures and that learners use different computations and flexible mechanisms to resolve generalization challenges when facing novel perceptual variability in new contexts. These findings provide new evidence for dissociated representations of auditory categories and reveal novel generalization mechanisms for resolving variability to maintain perceptual constancy.
Affiliation(s)
- Zhenzhong Gan: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou, Guangdong, China; School of Psychology, South China Normal University, Guangzhou, Guangdong, China
- Lurong Zheng: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, Guangdong, China; Guangdong Provincial Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou, Guangdong, China; School of Psychology, South China Normal University, Guangzhou, Guangdong, China
- Suiping Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, Guangdong, China
- Gangyi Feng: Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
5. Ma Y, Yu K, Yin S, Li L, Li P, Wang R. Attention Modulates the Role of Speakers' Voice Identity and Linguistic Information in Spoken Word Processing: Evidence From Event-Related Potentials. Journal of Speech, Language, and Hearing Research 2023; 66:1678-1693. [PMID: 37071787] [DOI: 10.1044/2023_jslhr-22-00420]
Abstract
Purpose: The human voice usually carries two types of information: linguistic information and identity information. However, whether and how linguistic information interacts with identity information remains controversial. This study explored the processing of identity and linguistic information during spoken word processing, considering the modulating role of attention.
Method: We conducted two event-related potential (ERP) experiments. Different speakers (self, friend, and unfamiliar speakers) and emotional words (positive, negative, and neutral words) were used to manipulate identity and linguistic information. With this manipulation, Experiment 1 explored identity and linguistic information processing with a word decision task that requires participants' explicit attention to linguistic information. Experiment 2 further investigated the issue with a passive oddball paradigm that requires little attention to either identity or linguistic information.
Results: Experiment 1 revealed an interaction among speaker, word type, and hemisphere in N400 amplitudes but not in N100 or P200 amplitudes, which suggests that identity information interacted with linguistic information at a later stage of spoken word processing. The mismatch negativity results of Experiment 2 showed no significant interaction between speaker and word pair, which indicates that identity and linguistic information were processed independently.
Conclusions: Identity information can interact with linguistic information during spoken word processing, but the interaction is modulated by the task's demands on attention. We propose an attention-modulated account of the mechanism underlying identity and linguistic information processing. Implications of our findings are discussed in light of the integration and independence theories.
Affiliation(s)
- Yunxiao Ma: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
- Keke Yu: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
- Shuqi Yin: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
- Li Li: The Key Laboratory of Chinese Learning and International Promotion, and College of International Culture, South China Normal University, Guangzhou, China
- Ping Li: Department of Chinese and Bilingual Studies, Faculty of Humanities, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Ruiming Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, Ministry of Education, & Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
6. Baese-Berk MM, Chandrasekaran B, Roark CL. The nature of non-native speech sound representations. The Journal of the Acoustical Society of America 2022; 152:3025. [PMID: 36456300] [PMCID: PMC9671621] [DOI: 10.1121/10.0015230]
Abstract
Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of "category" is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area.
Affiliation(s)
- Bharath Chandrasekaran: Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
- Casey L Roark: Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
7. Chen Y, Luo Q, Liang M, Gao L, Yang J, Feng R, Liu J, Qiu G, Li Y, Zheng Y, Lu S. Children's Neural Sensitivity to Prosodic Features of Natural Speech and Its Significance to Speech Development in Cochlear Implanted Children. Front Neurosci 2022; 16:892894. [PMID: 35903806] [PMCID: PMC9315047] [DOI: 10.3389/fnins.2022.892894]
Abstract
Catchy utterances, such as proverbs, verses, and nursery rhymes (e.g., "No pain, no gain" in English), contain strong-prosodic (SP) features and are child-friendly to repeat and memorize; yet how those prosodic features are encoded by neural activity, and how they influence speech development in children, remains largely unknown. Using functional near-infrared spectroscopy (fNIRS), this study investigated cortical responses to the perception of natural speech sentences with strong- versus weak-prosodic (SP/WP) features and evaluated speech communication ability in 21 pre-lingually deaf children with cochlear implants (CI) and 25 normal-hearing (NH) children. A comprehensive evaluation of speech communication ability was conducted with all participants to explore potential correlations between neural activity and children's speech development. SP information evoked right-lateralized cortical responses across a broad brain network in NH children and facilitated the early integration of linguistic information, highlighting children's neural sensitivity to natural SP sentences. In contrast, children with CI showed significantly weaker cortical activation and characteristic deficits in the perception of speech with SP features, suggesting that hearing loss early in life significantly impairs sensitivity to the prosodic features of sentences. Importantly, the level of neural sensitivity to SP sentences was significantly related to the speech behaviors of all child participants. These findings demonstrate the significance of prosodic features in children's speech development.
Affiliation(s)
- Yuebo Chen: Department of Otolaryngology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Qinqin Luo: Department of Chinese Language and Literature, The Chinese University of Hong Kong, Hong Kong SAR, China; School of Foreign Languages, Shenzhen University, Shenzhen, China
- Maojin Liang: Department of Otolaryngology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Leyan Gao: Neurolinguistics Teaching Laboratory, Department of Chinese Language and Literature, Sun Yat-sen University, Guangzhou, China
- Jingwen Yang: Department of Neurology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; Department of Clinical Neurolinguistics Research, Mental and Neurological Diseases Research Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Ruiyan Feng: Neurolinguistics Teaching Laboratory, Department of Chinese Language and Literature, Sun Yat-sen University, Guangzhou, China
- Jiahao Liu: Department of Otolaryngology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China; Hearing and Speech Science Department, Guangzhou Xinhua University, Guangzhou, China
- Guoxin Qiu: Department of Clinical Neurolinguistics Research, Mental and Neurological Diseases Research Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yi Li: School of Foreign Languages, Shenzhen University, Shenzhen, China
- Yiqing Zheng: Department of Otolaryngology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China; Hearing and Speech Science Department, Guangzhou Xinhua University, Guangzhou, China
- Shuo Lu: School of Foreign Languages, Shenzhen University, Shenzhen, China; Department of Clinical Neurolinguistics Research, Mental and Neurological Diseases Research Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
8. Liu L, Lai R, Singh L, Kalashnikova M, Wong PCM, Kasisopa B, Chen A, Onsuwan C, Burnham D. The tone atlas of perceptual discriminability and perceptual distance: Four tone languages and five language groups. Brain and Language 2022; 229:105106. [PMID: 35390675] [DOI: 10.1016/j.bandl.2022.105106]
Abstract
Some prior investigations suggest that tone perception is flexible and reasonably independent of native phonology, whereas others suggest it is constrained by native phonology. We address this issue in a systematic and comprehensive investigation of adult tone perception. Sampling from diverse tone- and non-tone-speaking communities, we tested discrimination of the three major tone systems (Cantonese, Thai, Mandarin) that dominate the tone perception literature, in relation to native language and language experience as well as stimulus variation (tone properties, presentation order, pitch cues), using linear mixed-effects modelling and multidimensional scaling. There was an overall discrimination advantage for tone language speakers and for native tones. However, language- and tone-specific effects, as well as presentation order effects, also emerged. Thus, over and above native phonology, stimulus variation exerts a powerful influence on tone discrimination. This study provides a tone atlas, a reference guide to inform empirical studies of tone sensitivity, both retrospectively and prospectively.
Affiliation(s)
- Liquan Liu: School of Psychology, Western Sydney University, Australia; The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia; Center for Multilingualism in Society Across the Lifespan, University of Oslo, Norway; Centre of Excellence for the Dynamics of Language, Australian Research Council, Australia
- Regine Lai: Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong
- Leher Singh: Department of Psychology, National University of Singapore, Singapore
- Marina Kalashnikova: The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia; Basque Center on Cognition, Brain and Language, Spain; Ikerbasque, Basque Foundation for Science, Spain
- Patrick C M Wong: Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong; Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong
- Benjawan Kasisopa: The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
- Ao Chen: School of Communication Sciences, Beijing Language and Culture University, China
- Chutamanee Onsuwan: Department of Linguistics, Faculty of Liberal Arts and Center of Excellence in Intelligent Informatics, Speech and Language Technology, and Service Innovation (CILS), Thammasat University, Thailand
- Denis Burnham: The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
9. Zhang W, Xiang M, Wang S. The role of left angular gyrus in the representation of linguistic composition relations. Hum Brain Mapp 2022; 43:2204-2217. [PMID: 35064707] [PMCID: PMC8996362] [DOI: 10.1002/hbm.25781]
Abstract
Language comprehension is compositional: individual words are combined structurally to form larger meaning representations. The neural basis of compositionality is at the center of a growing body of recent research. Previous work has largely used univariate analysis to investigate the question, a technique that can lose fine-grained information because it averages over neural responses. In a functional magnetic resonance imaging experiment, the present study examined different types of composition relations in Chinese phrases, using a 1-back composition relation probe (CRP) task and a 1-back word probe (WP) task. We first analyzed the data using multivariate representational similarity analysis, which better captures fine-grained representational differences among the stimuli. The results showed that the left angular gyrus (AG) represents different types of composition relations in the CRP task, but no brain areas were identified in the WP task. We also conducted a traditional univariate analysis and found greater activation in the bilateral inferior frontal gyrus in the CRP task relative to the WP task. We discuss the methodological and theoretical implications of our findings in the context of the larger language neural network identified in previous studies. Our findings highlight the role of the left AG in representing and distinguishing fine-grained linguistic composition relations.
Affiliation(s)
- Wenjia Zhang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China; School of Psychology, South China Normal University, Guangzhou, China
- Ming Xiang: Department of Linguistics, University of Chicago, Chicago, Illinois, USA
- Suiping Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
10. Zou L, Xia Z, Zhang W, Zhang X, Shu H. Brain responses during auditory word recognition vary with reading ability in Chinese school-age children. Dev Sci 2021; 25:e13216. [PMID: 34910843] [DOI: 10.1111/desc.13216]
Abstract
While the close relationship between the brain system for speech processing and reading development is well-documented in alphabetic languages, whether and how such a link exists in children learning a language without systematic grapheme-phoneme correspondence has not been directly investigated. In the present study, we measured Chinese children's brain activation during an auditory lexical decision task with functional magnetic resonance imaging. The results showed that brain areas distributed across the temporal and frontal lobes activated during spoken word recognition. In addition, the left occipitotemporal cortex (OTC) was recruited, especially under the real word condition, confirming the involvement of this orthography-related area in spoken language processing in Chinese children. Importantly, activation of the left temporoparietal cortex (TPC) in response to words and pseudowords was positively correlated with children's reading ability, supporting the salient role phonological processing plays in Chinese reading in the developing brain. Furthermore, children with higher reading scores increasingly recruited the left anterior OTC to decide on the lexical status of pseudowords, indicating that higher-skill children tend to search abstract lexical representations more deeply than lower-skill children when deciding whether spoken syllables are real words. In contrast, the precuneus was more related to trial-by-trial reaction time in lower-skill children, suggesting that effort-related neural systems differ among pupils with varying reading abilities. Taken together, these findings suggest a strong link between the neural correlates of speech processing and reading ability in Chinese children, supporting a universal basis underlying reading development across languages.
Affiliation(s)
- Lijuan Zou: School of Psychology, Shandong Normal University, Jinan, China; School of Psychology and Education, Zaozhuang University, Zaozhuang, China
- Zhichao Xia: State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; School of Systems Science, Beijing Normal University, Beijing, China
- Wei Zhang: College of Chemical Engineering and Material Science, Zaozhuang University, Zaozhuang, China
- Xianglin Zhang: State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Hua Shu: State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
11. Feng G, Gan Z, Yi HG, Ell SW, Roark CL, Wang S, Wong PCM, Chandrasekaran B. Neural dynamics underlying the acquisition of distinct auditory category structures. Neuroimage 2021; 244:118565. [PMID: 34543762] [DOI: 10.1016/j.neuroimage.2021.118565]
Abstract
Despite the multidimensional and temporally fleeting nature of auditory signals, we quickly learn to assign novel sounds to behaviorally relevant categories. The neural systems underlying the learning and representation of novel auditory categories are far from understood. Current models argue for a rigid specialization of hierarchically organized core regions that are fine-tuned to extracting and mapping relevant auditory dimensions to meaningful categories. Scaffolded within a dual-learning systems approach, we test a competing hypothesis: the spatial and temporal dynamics of emerging auditory-category representations are not driven by the underlying dimensions but are constrained by category structure and learning strategies. To test these competing models, we used functional Magnetic Resonance Imaging (fMRI) to assess representational dynamics during the feedback-based acquisition of novel non-speech auditory categories with identical dimensions but differing category structures: rule-based (RB) categories, hypothesized to involve an explicit sound-to-rule mapping network, and information integration (II) based categories, involving pre-decisional integration of dimensions via a procedural-based sound-to-reward mapping network. Adults were assigned to either the RB (n = 30, 19 females) or II (n = 30, 22 females) learning task. Despite similar behavioral learning accuracies, learning strategies derived from computational modeling and involvement of corticostriatal systems during feedback processing differed across tasks. Spatiotemporal multivariate representational similarity analysis revealed an emerging representation within an auditory sensory-motor pathway exclusively for the II learning task, prominently involving the superior temporal gyrus (STG), inferior frontal gyrus (IFG), and posterior precentral gyrus. In contrast, the RB learning task yielded distributed neural representations within regions involved in cognitive-control and attentional processes that emerged at different time points of learning. Our results unequivocally demonstrate that auditory learners' neural systems are highly flexible and show distinct spatial and temporal patterns that are not dimension-specific but reflect underlying category structures and learning strategies.
Collapse
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China.
| | - Zhenzhong Gan
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China, School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, CA 94158, United States
| | - Shawn W Ell
- Department of Psychology, Graduate School of Biomedical Sciences and Engineering, University of Maine, 5742 Little Hall, Room 301, Orono, ME 04469-5742, United States
| | - Casey L Roark
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States; Center for the Neural Basis of Cognition, Pittsburgh, PA 15232, United States
| | - Suiping Wang
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China, School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Patrick C M Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| | - Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States; Center for the Neural Basis of Cognition, Pittsburgh, PA 15232, United States.
| |
Collapse
|
12
|
McHaney JR, Tessmer R, Roark CL, Chandrasekaran B. Working memory relates to individual differences in speech category learning: Insights from computational modeling and pupillometry. BRAIN AND LANGUAGE 2021; 222:105010. [PMID: 34454285 DOI: 10.1016/j.bandl.2021.105010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/08/2021] [Revised: 07/26/2021] [Accepted: 08/10/2021] [Indexed: 05/27/2023]
Abstract
Across two experiments, we examine the relationship between individual differences in working memory (WM) and the acquisition of non-native speech categories in adulthood. While WM is associated with individual differences in a variety of learning tasks, successful acquisition of speech categories is argued to be contingent on WM-independent procedural-learning mechanisms. Thus, the role of WM in speech category learning is unclear. In Experiment 1, we show that individuals with higher WM acquire non-native speech categories faster and to a greater extent than those with lower WM. In Experiment 2, we replicate these results and show that individuals with higher WM use more optimal, procedural-based learning strategies and demonstrate more distinct speech-evoked pupillary responses for correct relative to incorrect trials. We propose that higher WM may allow for greater stimulus-related attention, resulting in more robust representations and optimal learning strategies. We discuss implications for neurobiological models of speech category learning.
Collapse
Affiliation(s)
- Jacie R McHaney
- Department of Communication Science and Disorders, University of Pittsburgh, United States
| | - Rachel Tessmer
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, United States
| | - Casey L Roark
- Department of Communication Science and Disorders, University of Pittsburgh, United States; Center for the Neural Basis of Cognition, Pittsburgh, PA, United States
| | - Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, United States.
| |
Collapse
|
13
|
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems. Expected final online publication date for the Annual Review of Psychology, Volume 73 is January 2022.
Collapse
Affiliation(s)
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA; Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA
| |
Collapse
|
14
|
Feng G, Ou J, Gan Z, Jia X, Meng D, Wang S, Wong PCM. Neural Fingerprints Underlying Individual Language Learning Profiles. J Neurosci 2021; 41:7372-7387. [PMID: 34301824 PMCID: PMC8412988 DOI: 10.1523/jneurosci.0415-21.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2021] [Revised: 07/11/2021] [Accepted: 07/14/2021] [Indexed: 11/21/2022] Open
Abstract
Human language learning differs significantly across individuals in both the process and the ultimate attainment. Although decades of research exploring the neural substrates of language learning have identified distinct and overlapping neural networks subserving learning of different components, the neural mechanisms that drive the large interindividual differences are still far from being understood. Here we examine to what extent the neural dynamics of multiple brain networks in men and women across sessions of training contribute to explaining individual differences in learning multiple linguistic components (i.e., vocabulary, morphology, and phrase and sentence structures) of an artificial language in a 7 d training and imaging paradigm with functional MRI. With machine-learning and predictive modeling, neural activation patterns across training sessions were highly predictive of individual learning success profiles derived from the four components. We identified four neural learning networks (i.e., the Perisylvian, frontoparietal, salience, and default-mode networks) and examined their dynamic contributions to the learning success prediction. Moreover, the robustness of the predictions systematically changes across networks depending on specific training phases and the learning components. We further demonstrate that a subset of network nodes in the inferior frontal, insular, and frontoparietal regions increasingly represent newly acquired language knowledge, while the multivariate connectivity between these representation regions is enhanced during learning for more successful learners. These findings allow us to understand why learners differ and are the first to attribute not only the degree of success but also patterns of language learning across components to neural fingerprints summarized from multiple neural network dynamics. SIGNIFICANCE STATEMENT Individual differences in learning a language are widely observed not only within the same component of language but also across components. This study demonstrates that the dynamics of multiple brain networks across four imaging sessions of a 7 d artificial language training paradigm contribute to individual differences in learning-outcome profiles derived from four language components. With machine-learning predictive modeling, we identified four neural learning networks, including the Perisylvian, frontoparietal, salience, and default-mode networks, that contribute to predicting individual learning-outcome profiles, and revealed language-component-general and component-specific prediction patterns across training sessions. These findings provide significant insights into understanding training-dependent neural dynamics underlying individual differences in learning success across language components.
Collapse
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| | - Jinghua Ou
- Department of Linguistics, University of Chicago, Chicago, Illinois 60637
| | - Zhenzhong Gan
- Department of Linguistics and Modern Languages, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China; School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Xiaoyan Jia
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China; School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Danting Meng
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China; School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Suiping Wang
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China; School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
| | - Patrick C M Wong
- Department of Linguistics and Modern Languages, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| |
Collapse
|
15
|
Levy DF, Wilson SM. Categorical Encoding of Vowels in Primary Auditory Cortex. Cereb Cortex 2021; 30:618-627. [PMID: 31241149 DOI: 10.1093/cercor/bhz112] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2019] [Revised: 04/05/2019] [Accepted: 05/02/2019] [Indexed: 11/14/2022] Open
Abstract
Speech perception involves mapping from a continuous and variable acoustic speech signal to discrete, linguistically meaningful units. However, it is unclear where in the auditory processing stream speech sound representations cease to be veridical (faithfully encoding precise acoustic properties) and become categorical (encoding sounds as linguistic categories). In this study, we used functional magnetic resonance imaging and multivariate pattern analysis to determine whether tonotopic primary auditory cortex (PAC), defined as tonotopic voxels falling within Heschl's gyrus, represents one class of speech sounds, vowels, veridically or categorically. For each of 15 participants, 4 individualized synthetic vowel stimuli were generated such that the vowels were equidistant in acoustic space, yet straddled a categorical boundary (with the first 2 vowels perceived as [i] and the last 2 perceived as [ɪ]). Each participant's 4 vowels were then presented in a block design with an irrelevant but attention-demanding level change detection task. We found that in PAC bilaterally, neural discrimination between pairs of vowels that crossed the categorical boundary was more accurate than neural discrimination between equivalently spaced vowel pairs that fell within a category. These findings suggest that PAC does not represent vowel sounds veridically, but that encoding of vowels is shaped by linguistically relevant phonemic categories.
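The core comparison in this paradigm, stronger neural discrimination for vowel pairs that cross the category boundary than for equally spaced within-category pairs, can be illustrated with a toy simulation. This is an illustrative sketch, not the authors' MVPA pipeline: it generates hypothetical voxel patterns with a built-in categorical code and decodes stimulus pairs with a simple leave-one-out nearest-centroid classifier; the voxel counts, effect sizes, and classifier choice are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_vox = 40, 50

# Hypothetical categorical code: vowels 0-1 share one category pattern,
# vowels 2-3 another; stimulus-specific structure is deliberately weak.
cat_pattern = {0: rng.normal(0, 1, n_vox), 1: rng.normal(0, 1, n_vox)}
stim_pattern = [rng.normal(0, 0.2, n_vox) for _ in range(4)]

def simulate(vowel):
    """Noisy trial-by-voxel patterns for one vowel stimulus."""
    cat = 0 if vowel < 2 else 1
    return cat_pattern[cat] + stim_pattern[vowel] + rng.normal(0, 1.0, (n_trials, n_vox))

data = [simulate(v) for v in range(4)]

def pair_accuracy(a, b):
    """Leave-one-out nearest-centroid decoding of two stimulus classes."""
    X = np.vstack([a, b])
    y = np.array([0] * len(a) + [1] * len(b))
    correct = 0
    for i in range(len(X)):
        mask = np.ones(len(X), bool)
        mask[i] = False  # hold out trial i
        c0 = X[mask & (y == 0)].mean(0)
        c1 = X[mask & (y == 1)].mean(0)
        pred = 0 if np.linalg.norm(X[i] - c0) < np.linalg.norm(X[i] - c1) else 1
        correct += pred == y[i]
    return correct / len(X)

# Within-category pairs (0 vs 1, 2 vs 3) vs the across-boundary pair (1 vs 2)
within = (pair_accuracy(data[0], data[1]) + pair_accuracy(data[2], data[3])) / 2
across = pair_accuracy(data[1], data[2])
print(f"within-category accuracy: {within:.2f}, across-boundary accuracy: {across:.2f}")
```

Under a categorical code the across-boundary pair is decoded more accurately than the within-category pairs, which is the signature the study tests for in PAC.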
Collapse
Affiliation(s)
- Deborah F Levy
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Stephen M Wilson
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| |
Collapse
|
16
|
Beach SD, Ozernov-Palchik O, May SC, Centanni TM, Gabrieli JDE, Pantazis D. Neural Decoding Reveals Concurrent Phonemic and Subphonemic Representations of Speech Across Tasks. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:254-279. [PMID: 34396148 PMCID: PMC8360503 DOI: 10.1162/nol_a_00034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2020] [Accepted: 02/21/2021] [Indexed: 06/13/2023]
Abstract
Robust and efficient speech perception relies on the interpretation of acoustically variable phoneme realizations, yet prior neuroimaging studies are inconclusive regarding the degree to which subphonemic detail is maintained over time as categorical representations arise. It is also unknown whether this depends on the demands of the listening task. We addressed these questions by using neural decoding to quantify the (dis)similarity of brain response patterns evoked during two different tasks. We recorded magnetoencephalography (MEG) as adult participants heard isolated, randomized tokens from a /ba/-/da/ speech continuum. In the passive task, their attention was diverted. In the active task, they categorized each token as ba or da. We found that linear classifiers successfully decoded ba vs. da perception from the MEG data. Data from the left hemisphere were sufficient to decode the percept early in the trial, while the right hemisphere was necessary but not sufficient for decoding at later time points. We also decoded stimulus representations and found that they were maintained longer in the active task than in the passive task; however, these representations did not pattern more like discrete phonemes when an active categorical response was required. Instead, in both tasks, early phonemic patterns gave way to a representation of stimulus ambiguity that coincided in time with reliable percept decoding. Our results suggest that the categorization process does not require the loss of subphonemic detail, and that the neural representation of isolated speech sounds includes concurrent phonemic and subphonemic information.
Collapse
Affiliation(s)
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
| | - Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Sidney C. May
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, USA
| | - Tracy M. Centanni
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Texas Christian University, Fort Worth, TX, USA
| | - John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| |
Collapse
|
17
|
Fuhrmeister P, Myers EB. Structural neural correlates of individual differences in categorical perception. BRAIN AND LANGUAGE 2021; 215:104919. [PMID: 33524740 DOI: 10.1016/j.bandl.2021.104919] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/26/2020] [Revised: 11/18/2020] [Accepted: 01/12/2021] [Indexed: 06/12/2023]
Abstract
Listeners perceive speech sounds categorically. While group-level differences in categorical perception have been observed in children or individuals with reading disorders, recent findings suggest that typical adults vary in how categorically they perceive sounds. The current study investigated neural sources of individual variability in categorical perception of speech. Fifty-seven participants rated phonetic tokens on a visual analogue scale; categoricity and response consistency were measured and related to measures of brain structure from MRI. Increased surface area of the right middle frontal gyrus predicted more categorical perception of a fricative continuum. This finding supports the idea that frontal regions are sensitive to phonetic category-level information and extends it to make behavioral predictions at the individual level. Additionally, more gyrification in bilateral transverse temporal gyri predicted less consistent responses on the task, perhaps reflecting subtle variation in language ability across the population.
Collapse
Affiliation(s)
- Pamela Fuhrmeister
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Storrs, CT 06269, United States.
| | - Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Storrs, CT 06269, United States
| |
Collapse
|
19
|
Mahmud MS, Yeasin M, Bidelman GM. Data-driven machine learning models for decoding speech categorization from evoked brain responses. J Neural Eng 2021; 18:10.1088/1741-2552/abecf0. [PMID: 33690177 PMCID: PMC8738965 DOI: 10.1088/1741-2552/abecf0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2020] [Accepted: 03/09/2021] [Indexed: 11/12/2022]
Abstract
Objective. Categorical perception (CP) of audio is critical to understand how the human brain perceives speech sounds despite widespread variability in acoustic properties. Here, we investigated the spatiotemporal characteristics of auditory neural activity that reflects CP for speech (i.e. differentiates phonetic prototypes from ambiguous speech sounds). Approach. We recorded 64-channel electroencephalograms as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used support vector machine classifiers and stability selection to determine when and where in the brain CP was best decoded across space and time via source-level analysis of the event-related potentials. Main results. We found that early (120 ms) whole-brain data decoded speech categories (i.e. prototypical vs. ambiguous tokens) with 95.16% accuracy (area under the curve 95.14%; F1-score 95.00%). Separate analyses on left hemisphere (LH) and right hemisphere (RH) responses showed that LH decoding was more accurate and earlier than RH (89.03% vs. 86.45% accuracy; 140 ms vs. 200 ms). Stability (feature) selection identified 13 regions of interest (ROIs) out of 68 brain regions [including auditory cortex, supramarginal gyrus, and inferior frontal gyrus (IFG)] that showed categorical representation during stimulus encoding (0-260 ms). In contrast, 15 ROIs (including fronto-parietal regions, IFG, motor cortex) were necessary to describe later decision stages (300-800 ms) of categorization, but these areas were highly associated with the strength of listeners' categorical hearing (i.e. slope of behavioral identification functions). Significance. Our data-driven multivariate models demonstrate that abstract categories emerge surprisingly early (∼120 ms) in the time course of speech processing and are dominated by engagement of a relatively compact fronto-temporal-parietal brain network.
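The stability-selection step described here, repeatedly subsampling trials, re-ranking features, and keeping only features selected consistently, can be sketched as follows. This is a simplified illustration on made-up data, not the authors' pipeline: it substitutes a t-like univariate separation score for their SVM-based selection, and the ROI count, effect size, subsample size, and stability threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_rois, k_top, n_boot = 200, 68, 13, 100

# Hypothetical data: 68 source-level ROIs, a handful carrying category
# information (prototypical vs. ambiguous tokens coded as y = 0/1).
informative = np.arange(10)
y = rng.integers(0, 2, n_trials)
X = rng.normal(0, 1, (n_trials, n_rois))
X[:, informative] += 0.8 * y[:, None]  # inject category signal

def top_features(Xs, ys, k):
    """Rank ROIs by a t-like class-separation score; return the top-k indices."""
    d = Xs[ys == 1].mean(0) - Xs[ys == 0].mean(0)
    s = np.sqrt(Xs[ys == 1].var(0) + Xs[ys == 0].var(0) + 1e-12)
    return np.argsort(-np.abs(d / s))[:k]

# Stability selection: count how often each ROI survives ranking
# across random half-subsamples of the trials.
counts = np.zeros(n_rois)
for _ in range(n_boot):
    idx = rng.choice(n_trials, n_trials // 2, replace=False)
    counts[top_features(X[idx], y[idx], k_top)] += 1
freq = counts / n_boot

stable = np.where(freq >= 0.8)[0]  # ROIs selected in >= 80% of subsamples
print("stable ROIs:", stable)
```

ROIs that carry genuine category signal are selected in nearly every subsample, while noise ROIs only enter the top ranks sporadically, which is what makes the selection "stable".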
Collapse
Affiliation(s)
- Md Sultan Mahmud
- Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
| | - Mohammed Yeasin
- Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
| | - Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States of America
- University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, United States of America
| |
Collapse
|
20
|
Mahmud MS, Yeasin M, Bidelman GM. Speech categorization is better described by induced rather than evoked neural activity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:1644. [PMID: 33765780 PMCID: PMC8267855 DOI: 10.1121/10.0003572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2023]
Abstract
Categorical perception (CP) describes how the human brain categorizes speech despite inherent acoustic variability. We examined neural correlates of CP in both evoked and induced electroencephalogram (EEG) activity to evaluate which mode best describes the process of speech categorization. Listeners labeled sounds from a vowel gradient while we recorded their EEGs. From the source-reconstructed EEG, we used band-specific evoked and induced neural activity to build parameter-optimized support vector machine models to assess how well listeners' speech categorization could be decoded via whole-brain and hemisphere-specific responses. We found that whole-brain evoked β-band activity decoded prototypical from ambiguous speech sounds with ∼70% accuracy. However, induced γ-band oscillations decoded speech categories better, with ∼95% accuracy. Induced high-frequency (γ-band) oscillations dominated CP decoding in the left hemisphere, whereas lower frequencies (θ-band) dominated decoding in the right hemisphere. Moreover, feature selection identified 14 brain regions carrying induced activity and 22 regions of evoked activity that were most salient in describing category-level speech representations. Among the areas and neural regimes explored, induced γ-band modulations were most strongly associated with listeners' behavioral CP. The data suggest that the category-level organization of speech is dominated by relatively high-frequency induced brain rhythms.
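The evoked/induced distinction at the heart of this comparison can be demonstrated with a toy signal: activity phase-locked to stimulus onset survives trial averaging (evoked), while activity with trial-to-trial phase jitter cancels in the average and appears only in single-trial power (induced). A minimal sketch with synthetic data; the frequencies, trial count, and noise level are arbitrary choices for illustration, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 500
t = np.arange(0, 1, 1 / fs)  # 1 s epoch -> 1 Hz FFT resolution
n_trials = 100

# Hypothetical single-channel EEG: a phase-locked 20 Hz component (evoked)
# plus a 40 Hz component with random phase per trial (induced), plus noise.
trials = np.array([
    np.sin(2 * np.pi * 20 * t)                              # same phase every trial
    + np.sin(2 * np.pi * 40 * t + rng.uniform(0, 2 * np.pi))  # jittered phase
    + rng.normal(0, 0.5, t.size)
    for _ in range(n_trials)
])

def band_power(x, f):
    """Power at integer frequency f (Hz) via the corresponding FFT bin."""
    return np.abs(np.fft.rfft(x))[f] ** 2

# Evoked activity: average first, then measure power.
evoked = trials.mean(0)
evoked_20 = band_power(evoked, 20)
evoked_40 = band_power(evoked, 40)

# Induced activity: measure single-trial power, average, subtract the evoked part.
induced_40 = np.mean([band_power(tr, 40) for tr in trials]) - evoked_40

print(evoked_20, evoked_40, induced_40)
```

The jittered 40 Hz component nearly vanishes from the trial average but dominates single-trial power, so it registers as induced rather than evoked activity, exactly the dissociation the study exploits.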
Collapse
Affiliation(s)
- Md Sultan Mahmud
- Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
| | - Mohammed Yeasin
- Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
| | - Gavin M Bidelman
- School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
| |
Collapse
|
21
|
Carter JA, Bidelman GM. Auditory cortex is susceptible to lexical influence as revealed by informational vs. energetic masking of speech categorization. Brain Res 2021; 1759:147385. [PMID: 33631210 DOI: 10.1016/j.brainres.2021.147385] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 02/02/2023]
Abstract
Speech perception requires the grouping of acoustic information into meaningful phonetic units via the process of categorical perception (CP). Environmental masking influences speech perception and CP. However, it remains unclear at which stage of processing (encoding, decision, or both) masking affects listeners' categorization of speech signals. The purpose of this study was to determine whether linguistic interference influences the early acoustic-phonetic conversion process inherent to CP. To this end, we measured source-level event-related brain potentials (ERPs) from auditory cortex (AC) and inferior frontal gyrus (IFG) as listeners rapidly categorized speech sounds along a /da/ to /ga/ continuum presented in three listening conditions: quiet, and in the presence of forward (informational masker) and time-reversed (energetic masker) 2-talker babble noise. Maskers were matched in overall SNR and spectral content and thus varied only in their degree of linguistic interference (i.e., informational masking). We hypothesized a differential effect of informational versus energetic masking on behavioral and neural categorization responses, where we predicted increased activation of frontal regions when disambiguating speech from noise, especially during lexical-informational maskers. We found that (1) informational masking weakens behavioral speech phoneme identification above and beyond energetic masking; (2) low-level AC activity not only codes speech categories but is susceptible to higher-order lexical interference; and (3) identifying speech amidst noise recruits a cross-hemispheric circuit (left AC → right IFG) whose engagement varies according to task difficulty. These findings provide corroborating evidence for top-down influences on the early acoustic-phonetic analysis of speech through a coordinated interplay between frontotemporal brain areas.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA.
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
22.
Li Y, Tang C, Lu J, Wu J, Chang EF. Human cortical encoding of pitch in tonal and non-tonal languages. Nat Commun 2021; 12:1161. [PMID: 33608548 PMCID: PMC7896081 DOI: 10.1038/s41467-021-21430-x]
Abstract
Languages can use a common repertoire of vocal sounds to signify distinct meanings. In tonal languages, such as Mandarin Chinese, pitch contours of syllables distinguish one word from another, whereas in non-tonal languages, such as English, pitch is used to convey intonation. The neural computations underlying language specialization in speech perception are unknown. Here, we use a cross-linguistic approach to address this. Native Mandarin- and English-speaking participants each listened to both Mandarin and English speech, while neural activity was directly recorded from the non-primary auditory cortex. Both groups show language-general coding of speaker-invariant pitch at the single electrode level. At the electrode population level, we find language-specific distribution of cortical tuning parameters in Mandarin speakers only, with enhanced sensitivity to Mandarin tone categories. Our results show that speech perception relies upon a shared cortical auditory feature processing mechanism, which may be tuned to the statistics of a given language. Different languages rely on different vocal sounds to convey meaning. Here the authors show that language-general coding of pitch occurs in the non-primary auditory cortex for both tonal (Mandarin Chinese) and non-tonal (English) languages, with some language specificity on the population level.
Affiliation(s)
- Yuanning Li
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
- Claire Tang
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
- Junfeng Lu
- Brain Function Laboratory, Neurosurgical Institute of Fudan University, Shanghai, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, China
- Jinsong Wu
- Brain Function Laboratory, Neurosurgical Institute of Fudan University, Shanghai, China; Shanghai Key Laboratory of Brain Function Restoration and Neural Regeneration, Shanghai, China; Neurologic Surgery Department, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; Institute of Brain-Intelligence Technology, Zhangjiang Lab, Shanghai, China.
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, CA, USA; Center for Integrative Neuroscience, University of California, San Francisco, CA, USA.
23.
Bidelman GM, Pearson C, Harrison A. Lexical Influences on Categorical Speech Perception Are Driven by a Temporoparietal Circuit. J Cogn Neurosci 2021; 33:840-852. [PMID: 33464162 DOI: 10.1162/jocn_a_01678]
Abstract
Categorical judgments of otherwise identical phonemes are biased toward hearing words (i.e., "Ganong effect") suggesting lexical context influences perception of even basic speech primitives. Lexical biasing could manifest via late stage postperceptual mechanisms related to decision or, alternatively, top-down linguistic inference that acts on early perceptual coding. Here, we exploited the temporal sensitivity of EEG to resolve the spatiotemporal dynamics of these context-related influences on speech categorization. Listeners rapidly classified sounds from a /gɪ/-/kɪ/ gradient presented in opposing word-nonword contexts (GIFT-kift vs. giss-KISS), designed to bias perception toward lexical items. Phonetic perception shifted toward the direction of words, establishing a robust Ganong effect behaviorally. ERPs revealed a neural analog of lexical biasing emerging within ~200 msec. Source analyses uncovered a distributed neural network supporting the Ganong including middle temporal gyrus, inferior parietal lobe, and middle frontal cortex. Yet, among Ganong-sensitive regions, only left middle temporal gyrus and inferior parietal lobe predicted behavioral susceptibility to lexical influence. Our findings confirm lexical status rapidly constrains sublexical categorical representations for speech within several hundred milliseconds but likely does so outside the purview of canonical auditory-sensory brain areas.
Affiliation(s)
- Gavin M Bidelman
- University of Memphis, TN; University of Tennessee Health Sciences Center, Memphis, TN
24.
Feng G, Li Y, Hsu SM, Wong PC, Chou TL, Chandrasekaran B. Emerging native-similar neural representations underlie non-native speech category learning success. Neurobiol Lang 2021; 2:280-307. [PMID: 34368775 PMCID: PMC8345815 DOI: 10.1162/nol_a_00035]
Abstract
Learning non-native phonetic categories in adulthood is an exceptionally challenging task, characterized by large inter-individual differences in learning speed and outcomes. The neurobiological mechanisms underlying the inter-individual differences in learning efficacy are not fully understood. Here we examined the extent to which training-induced neural representations of non-native Mandarin tone categories in English listeners (n = 53) are increasingly similar to those of the native listeners (n = 33) who acquired these categories early in infancy. We particularly assessed whether the neural similarities in representational structure between non-native learners and native listeners are robust neuromarkers of inter-individual differences in learning success. Using inter-subject neural representational similarity (IS-NRS) analysis and predictive modeling on two functional magnetic resonance imaging (fMRI) datasets, we examined the neural representational mechanisms underlying speech category learning success. Learners' neural representations that were significantly similar to the native listeners emerged in brain regions mediating speech perception following training; the extent of the emerging neural similarities with native listeners significantly predicted the learning speed and outcome in learners. The predictive power of IS-NRS outperformed models with other neural representational measures. Furthermore, neural representations underlying successful learning are multidimensional but cost-efficient in nature. The degree of the emergent native-similar neural representations was closely related to the robust neural sensitivity to feedback in the frontostriatal network. These findings provide important insights into experience-dependent representational neuroplasticity underlying successful speech learning in adulthood and could be leveraged in designing individualized feedback-based training paradigms that maximize learning efficiency.
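The paper's key measure, inter-subject neural representational similarity (IS-NRS), reduces to correlating a learner's representational dissimilarity matrix (RDM) with the average native-listener RDM. The sketch below is an illustrative toy in plain Python, not the authors' code; the RDM values and function names are invented for the example.

```python
# Toy sketch of inter-subject neural representational similarity (IS-NRS):
# correlate a learner's representational dissimilarity matrix (RDM, upper
# triangle) with the element-wise mean native-listener RDM. Invented data.
from statistics import mean

def upper_triangle(rdm):
    """Flatten the upper triangle (excluding the diagonal) of a square RDM."""
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def is_nrs(learner_rdm, native_rdms):
    """Similarity of one learner's RDM to the mean native RDM."""
    tri = [upper_triangle(r) for r in native_rdms]
    mean_native = [mean(vals) for vals in zip(*tri)]
    return pearson(upper_triangle(learner_rdm), mean_native)

# Toy 4x4 RDMs over four Mandarin tone categories.
native = [[[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]] * 3
learner = [[0, 1.1, 1.9, 2.8], [1.1, 0, 0.9, 2.1],
           [1.9, 0.9, 0, 1.2], [2.8, 2.1, 1.2, 0]]
score = is_nrs(learner, native)
```

A higher score would mark a learner whose representational geometry has converged toward the native group's.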
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Corresponding authors: Gangyi Feng, Ph.D., Brain and Mind Institute, Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China, +852-3943 3190; Bharath Chandrasekaran, Ph.D., Department of Communication Science and Disorders, University of Pittsburgh, 6074 Forbes Tower, Pittsburgh, PA 15260, (412) 383-6565
- Yu Li
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Shen-Mou Hsu
- Imaging Center for Integrated Body, Mind and Culture Research, National Taiwan University, Taipei 10617, Taiwan
- Patrick C.M. Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Tai-Li Chou
- Imaging Center for Integrated Body, Mind and Culture Research, National Taiwan University, Taipei 10617, Taiwan
- Department of Psychology, National Taiwan University, Taipei 10617, Taiwan
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
25.
Feng G, Gan Z, Llanos F, Meng D, Wang S, Wong PCM, Chandrasekaran B. A distributed dynamic brain network mediates linguistic tone representation and categorization. Neuroimage 2021; 224:117410. [PMID: 33011415 PMCID: PMC7749825 DOI: 10.1016/j.neuroimage.2020.117410]
Abstract
Successful categorization requires listeners to represent the incoming sensory information, resolve the "blooming, buzzing confusion" inherent to noisy sensory signals, and leverage the accumulated evidence towards making a decision. Despite decades of intense debate, the neural systems underlying speech categorization remain unresolved. Here we assessed the neural representation and categorization of lexical tones by native Mandarin speakers (N = 31) across a range of acoustic and contextual variabilities (talkers, perceptual saliences, and stimulus-contexts) using functional magnetic resonance imaging (fMRI) and an evidence accumulation model of decision-making. Univariate activation and multivariate pattern analyses reveal that the acoustic-variability-tolerant representations of tone category are observed within the middle portion of the left superior temporal gyrus (STG). Activation patterns in the frontal and parietal regions also contained category-relevant information that was differentially sensitive to various forms of variability. The robustness of neural representations of tone category in a distributed fronto-temporoparietal network is associated with trial-by-trial decision-making parameters. These findings support a hybrid model involving a representational core within the STG that operates dynamically within an extensive frontoparietal network to support the representation and categorization of linguistic pitch patterns.
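The evidence accumulation model invoked here is of the drift-diffusion family: noisy evidence drifts toward one of two decision boundaries, and drift rate and boundary separation are the trial-by-trial parameters such analyses relate to neural data. A minimal toy simulation (my own sketch, not the authors' model or parameter values):

```python
# Minimal drift-diffusion sketch: evidence accumulates with drift v and
# Gaussian noise until it crosses +a (correct) or -a (error).
# Toy illustration only; not the study's fitted model.
import random

def ddm_trial(v, a, dt=0.001, noise=1.0, non_decision=0.3, rng=random):
    """Simulate one trial; return (choice, reaction_time_seconds)."""
    x, t = 0.0, 0.0
    while abs(x) < a:
        x += v * dt + noise * (dt ** 0.5) * rng.gauss(0, 1)
        t += dt
    return (1 if x > 0 else 0), t + non_decision

random.seed(1)
trials = [ddm_trial(v=2.0, a=1.0) for _ in range(500)]
accuracy = sum(c for c, _ in trials) / len(trials)
mean_rt = sum(rt for _, rt in trials) / len(trials)
```

Raising the boundary `a` trades speed for accuracy; raising the drift `v` improves both, which is why the two parameters dissociate representational quality from response caution.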
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China.
- Zhenzhong Gan
- Center for the Study of Applied Psychology and School of Psychology, South China Normal University, Guangzhou 510631, China
- Fernando Llanos
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
- Danting Meng
- Center for the Study of Applied Psychology and School of Psychology, South China Normal University, Guangzhou 510631, China
- Suiping Wang
- Center for the Study of Applied Psychology and School of Psychology, South China Normal University, Guangzhou 510631, China; Guangdong Provincial Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
- Patrick C M Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States.
26.
Feng G, Yi HG, Chandrasekaran B. The Role of the Human Auditory Corticostriatal Network in Speech Learning. Cereb Cortex 2020; 29:4077-4089. [PMID: 30535138 DOI: 10.1093/cercor/bhy289]
Abstract
We establish a mechanistic account of how the mature human brain functionally reorganizes to acquire and represent new speech sounds. Native speakers of English learned to categorize Mandarin lexical tone categories produced by multiple talkers using trial-by-trial feedback. We hypothesized that the corticostriatal system is a key intermediary in mediating temporal lobe plasticity and the acquisition of new speech categories in adulthood. We conducted a functional magnetic resonance imaging experiment in which participants underwent a sound-to-category mapping task. Diffusion tensor imaging data were collected, and probabilistic fiber tracking analysis was employed to assay the auditory corticostriatal pathways. Multivariate pattern analysis showed that talker-invariant novel tone category representations emerged in the left superior temporal gyrus (LSTG) within a few hundred training trials. Univariate analysis showed that the putamen, a subregion of the striatum, was sensitive to positive feedback in correctly categorized trials. With learning, functional coupling between the putamen and LSTG increased during error processing. Furthermore, fiber tractography demonstrated robust structural connectivity between the feedback-sensitive striatal regions and the LSTG regions that represent the newly learned tone categories. Our convergent findings highlight a critical role for the auditory corticostriatal circuitry in mediating the acquisition of new speech categories.
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94158, USA
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
27.
Cheng Y, Yan L, Hu L, Wu H, Huang X, Tian Y, Wu X. Differences in network centrality between high and low myopia: a voxel-level degree centrality study. Acta Radiol 2020; 61:1388-1397. [PMID: 32098475 DOI: 10.1177/0284185120902385]
Abstract
BACKGROUND Previous studies have linked high myopia (HM) to brain activity, and the difference between HM and low myopia (LM) can be assessed. PURPOSE To study the differences in functional networks of brain activity between HM and LM by the voxel-level degree centrality (DC) method. MATERIAL AND METHODS Twenty-eight patients with HM (10 men, 18 women), 18 patients with LM (4 men, 14 women), and 59 healthy controls (27 men, 32 women) were enrolled in this study. The voxel-level DC method was used to assess spontaneous brain activity. Correlation analysis was used to explore the change of average DC value in different brain regions, in order to analyze differences in brain activity between HM and LM. RESULTS DC values of the right cerebellum anterior lobe/brainstem, right parahippocampal gyrus, and left caudate in HM patients were significantly higher than those in LM patients (P < 0.05). In contrast, DC values of the left medial frontal gyrus, right inferior frontal gyrus, left middle frontal gyrus, and left inferior parietal lobule were significantly lower in patients with HM (P < 0.05). However, there was no correlation between behavior and average DC values in different brain regions (P > 0.05). CONCLUSION Different changes in brain regions between HM and LM may indicate differences in neural mechanisms between HM and LM. DC values could be useful as biomarkers for differences in brain activity between patients with HM and LM. This study provides a new method to assess differences in functional networks of brain activity between patients with HM and LM.
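Voxel-level degree centrality itself is simple to state: each voxel's DC is the count (or weighted sum) of other voxels whose functional correlation with it exceeds a threshold. A toy sketch with invented values, not the study's data or preprocessing:

```python
# Toy sketch of voxel-level degree centrality: each node's DC is the number
# of other nodes whose correlation with it exceeds a threshold.
# Correlation values below are invented for illustration.
def degree_centrality(corr, threshold=0.25):
    n = len(corr)
    return [sum(1 for j in range(n) if j != i and corr[i][j] > threshold)
            for i in range(n)]

# Illustrative 4-node correlation matrix (symmetric, unit diagonal).
corr = [
    [1.0, 0.6, 0.3, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
]
dc = degree_centrality(corr)  # → [2, 2, 3, 1]
```

In the study's framework, these per-voxel counts (computed over whole-brain voxel time series) are what get compared between groups.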
Affiliation(s)
- Yi Cheng
- Department of Ophthalmology, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, PR China
- Li Yan
- Department of Ophthalmology, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, PR China
- Liqun Hu
- Department of Ophthalmology, Ganzhou People's Hospital of Jiangxi Province, PR China
- Hongyun Wu
- Department of Ophthalmology, Ganzhou People's Hospital of Jiangxi Province, PR China
- Xin Huang
- Department of Ophthalmology, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, PR China
- Yu Tian
- Department of Ophthalmology, Ganzhou People's Hospital of Jiangxi Province, PR China
- Xiaorong Wu
- Department of Ophthalmology, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, PR China
28.
Chien PJ, Friederici AD, Hartwigsen G, Sammler D. Intonation processing increases task-specific fronto-temporal connectivity in tonal language speakers. Hum Brain Mapp 2020; 42:161-174. [PMID: 32996647 PMCID: PMC7721241 DOI: 10.1002/hbm.25214]
Abstract
Language comprehension depends on tight functional interactions between distributed brain regions. While these interactions are established for semantic and syntactic processes, the functional network of speech intonation – the linguistic variation of pitch – has been scarcely defined. Particularly little is known about intonation in tonal languages, in which pitch not only serves intonation but also expresses meaning via lexical tones. The present study used psychophysiological interaction analyses of functional magnetic resonance imaging data to characterise the neural networks underlying intonation and tone processing in native Mandarin Chinese speakers. Participants categorised either intonation or tone of monosyllabic Mandarin words that gradually varied between statement and question and between Tone 2 and Tone 4. Intonation processing induced bilateral fronto‐temporal activity and increased functional connectivity between left inferior frontal gyrus and bilateral temporal regions, likely linking auditory perception and labelling of intonation categories in a phonological network. Tone processing induced bilateral temporal activity, associated with the auditory representation of tonal (phonemic) categories. Together, the present data demonstrate the breadth of the functional intonation network in a tonal language including higher‐level phonological processes in addition to auditory representations common to both intonation and tone.
Affiliation(s)
- Pei-Ju Chien
- International Max Planck Research School NeuroCom, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Otto Hahn Group 'Neural Bases of Intonation in Speech and Music', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Lise Meitner Research Group 'Cognition and Plasticity', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Angela D Friederici
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Gesa Hartwigsen
- Lise Meitner Research Group 'Cognition and Plasticity', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Daniela Sammler
- Otto Hahn Group 'Neural Bases of Intonation in Speech and Music', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
29.
Al-Fahad R, Yeasin M, Bidelman GM. Decoding of single-trial EEG reveals unique states of functional brain connectivity that drive rapid speech categorization decisions. J Neural Eng 2020; 17:016045. [PMID: 31822643 PMCID: PMC7004853 DOI: 10.1088/1741-2552/ab6040]
Abstract
OBJECTIVE Categorical perception (CP) is an inherent property of speech perception. The response time (RT) of listeners' perceptual speech identification is highly sensitive to individual differences. While the neural correlates of CP have been well studied in terms of the regional contributions of the brain to behavior, functional connectivity patterns that signify individual differences in listeners' speed (RT) for speech categorization are less clear. In this study, we introduce a novel approach to address these questions. APPROACH We applied several computational approaches to the EEG, including graph mining, machine learning (i.e., support vector machine), and stability selection to investigate the unique brain states (functional neural connectivity) that predict the speed of listeners' behavioral decisions. MAIN RESULTS We infer that (i) the listeners' perceptual speed is directly related to dynamic variations in their brain connectomics, (ii) global network assortativity and efficiency distinguished fast, medium, and slow RTs, (iii) the functional network underlying speeded decisions increases in negative assortativity (i.e., became disassortative) for slower RTs, (iv) slower categorical speech decisions cause excessive use of neural resources and more aberrant information flow within the CP circuitry, (v) slower responders tended to utilize functional brain networks excessively (or inappropriately) whereas fast responders (with lower global efficiency) utilized the same neural pathways but with more restricted organization. SIGNIFICANCE Findings show that neural classifiers (SVM) coupled with stability selection correctly classify behavioral RTs from functional connectivity alone with over 92% accuracy (AUC = 0.9). Our results corroborate previous studies by supporting the engagement of similar temporal (STG), parietal, motor, and prefrontal regions in CP using an entirely data-driven approach.
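To make the classification idea concrete, the toy below predicts a responder group from connectivity-like feature vectors. A simple nearest-centroid classifier stands in for the paper's SVM-with-stability-selection pipeline, and every feature value is invented:

```python
# Sketch of classifying response-time groups from connectivity features.
# A nearest-centroid classifier stands in for the paper's SVM; the feature
# vectors are toy functional-connectivity values, not real EEG.
from statistics import mean

def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*vectors)]

def classify(x, centroids):
    """Return the label of the closest class centroid (Euclidean distance)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
    return min(centroids, key=lambda label: dist(x, centroids[label]))

# Toy connectivity features for "fast" vs. "slow" responders.
train = {
    "fast": [[0.8, 0.2, 0.10], [0.7, 0.3, 0.20], [0.9, 0.1, 0.15]],
    "slow": [[0.3, 0.7, 0.60], [0.2, 0.8, 0.50], [0.25, 0.75, 0.65]],
}
centroids = {label: centroid(vecs) for label, vecs in train.items()}
prediction = classify([0.75, 0.25, 0.12], centroids)  # → "fast"
```

The real pipeline differs mainly in scale (thousands of connectivity edges, reduced by stability selection) and in using a margin-based classifier, but the decision logic is the same: map a connectivity pattern to an RT group.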
Affiliation(s)
- Rakib Al-Fahad
- Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152, USA
- Mohammed Yeasin
- Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- Gavin M. Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
30.
Chien PJ, Friederici AD, Hartwigsen G, Sammler D. Neural correlates of intonation and lexical tone in tonal and non-tonal language speakers. Hum Brain Mapp 2020; 41:1842-1858. [PMID: 31957928 PMCID: PMC7268089 DOI: 10.1002/hbm.24916]
Abstract
Intonation, the modulation of pitch in speech, is a crucial aspect of language that is processed in right‐hemispheric regions, beyond the classical left‐hemispheric language system. Whether or not this notion generalises across languages remains, however, unclear. Particularly, tonal languages are an interesting test case because of the dual linguistic function of pitch that conveys lexical meaning in form of tone, in addition to intonation. To date, only few studies have explored how intonation is processed in tonal languages, how this compares to tone and between tonal and non‐tonal language speakers. The present fMRI study addressed these questions by testing Mandarin and German speakers with Mandarin material. Both groups categorised mono‐syllabic Mandarin words in terms of intonation, tone, and voice gender. Systematic comparisons of brain activity of the two groups between the three tasks showed large cross‐linguistic commonalities in the neural processing of intonation in left fronto‐parietal, right frontal, and bilateral cingulo‐opercular regions. These areas are associated with general phonological, specific prosodic, and controlled categorical decision‐making processes, respectively. Tone processing overlapped with intonation processing in left fronto‐parietal areas, in both groups, but evoked additional activity in bilateral temporo‐parietal semantic regions and subcortical areas in Mandarin speakers only. Together, these findings confirm cross‐linguistic commonalities in the neural implementation of intonation processing but dissociations for semantic processing of tone only in tonal language speakers.
Affiliation(s)
- Pei-Ju Chien
- International Max Planck Research School NeuroCom, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Otto Hahn Group "Neural Bases of Intonation in Speech and Music", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Lise Meitner Research Group "Cognition and Plasticity", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Angela D Friederici
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Gesa Hartwigsen
- Lise Meitner Research Group "Cognition and Plasticity", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Daniela Sammler
- Otto Hahn Group "Neural Bases of Intonation in Speech and Music", Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
31.
Llanos F, Xie Z, Chandrasekaran B. Biometric identification of listener identity from frequency following responses to speech. J Neural Eng 2019; 16:056004. [PMID: 31039552 DOI: 10.1088/1741-2552/ab1e01]
Abstract
OBJECTIVE We investigate the biometric specificity of the frequency following response (FFR), an EEG marker of early auditory processing that reflects phase-locked activity from neural ensembles in the auditory cortex and subcortex (Chandrasekaran and Kraus 2010, Bidelman, 2015a, 2018, Coffey et al 2017b). Our objective is two-fold: to demonstrate that the FFR contains information beyond stimulus properties and broad group-level markers, and to assess the practical viability of the FFR as a biometric across different sounds, auditory experiences, and recording days. APPROACH We trained a hidden Markov model (HMM) to decode listener identity from FFR spectro-temporal patterns across multiple frequency bands. Our dataset included FFRs from twenty native speakers of English or Mandarin Chinese (10 per group) listening to Mandarin Chinese tones across three EEG sessions separated by days. We decoded subject identity within the same auditory context (same tone and session) and across different stimuli and recording sessions. MAIN RESULTS The HMM decoded listeners for averaging sizes as small as one single FFR. However, model performance improved for larger averaging sizes (e.g. 25 FFRs), similarity in auditory context (same tone and day), and lack of familiarity with the sounds (i.e. native English relative to native Chinese listeners). Our results also revealed important biometric contributions from frequency bands in the cortical and subcortical EEG. SIGNIFICANCE Our study provides the first deep and systematic biometric characterization of the FFR and provides the basis for biometric identification systems incorporating this neural signal.
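The decoding idea can be illustrated without an HMM: enroll one averaged-FFR template per listener, then identify a probe response by its best-correlated template. This correlation-based matcher is a deliberately simplified stand-in for the paper's hidden Markov model, and the waveforms are synthetic toys:

```python
# Biometric-identification sketch: match a probe "FFR" to the enrolled
# listener template with the highest Pearson correlation. A template matcher
# stands in for the paper's HMM; waveforms are synthetic, not real EEG.
import math

def pearson(x, y):
    mx = sum(x) / len(x); my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def identify(probe, templates):
    """Return the enrolled listener whose template best matches the probe."""
    return max(templates, key=lambda lid: pearson(probe, templates[lid]))

# Synthetic "FFRs": two listeners with different harmonic weightings.
t = [i / 100 for i in range(200)]
templates = {
    "listener_A": [math.sin(2 * math.pi * 5 * s)
                   + 0.3 * math.sin(2 * math.pi * 11 * s) for s in t],
    "listener_B": [math.sin(2 * math.pi * 5 * s)
                   + 0.9 * math.sin(2 * math.pi * 11 * s) for s in t],
}
probe = [math.sin(2 * math.pi * 5 * s)
         + 0.85 * math.sin(2 * math.pi * 11 * s) for s in t]
who = identify(probe, templates)  # → "listener_B"
```

The paper's HMM additionally models the temporal dynamics and per-band spectra of the FFR, which is what lets it generalize across stimuli and recording days.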
Affiliation(s)
- Fernando Llanos
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, United States of America
32.
Bidelman GM, Walker B. Plasticity in auditory categorization is supported by differential engagement of the auditory-linguistic network. Neuroimage 2019; 201:116022. [PMID: 31310863 DOI: 10.1016/j.neuroimage.2019.116022]
Abstract
To construct our perceptual world, the brain categorizes variable sensory cues into behaviorally-relevant groupings. Categorical representations are apparent within a distributed fronto-temporo-parietal brain network but how this neural circuitry is shaped by experience remains undefined. Here, we asked whether speech and music categories might be formed within different auditory-linguistic brain regions depending on listeners' auditory expertise. We recorded EEG in highly skilled (musicians) vs. less experienced (nonmusicians) perceivers as they rapidly categorized speech and musical sounds. Musicians showed perceptual enhancements across domains, yet source EEG data revealed a double dissociation in the neurobiological mechanisms supporting categorization between groups. Whereas musicians coded categories in primary auditory cortex (PAC), nonmusicians recruited non-auditory regions (e.g., inferior frontal gyrus, IFG) to generate category-level information. Functional connectivity confirmed nonmusicians' increased left IFG involvement reflects stronger routing of signal from PAC directed to IFG, presumably because sensory coding is insufficient to construct categories in less experienced listeners. Our findings establish that auditory experience modulates specific engagement and inter-regional communication in the auditory-linguistic network supporting categorical perception. Whereas early canonical PAC representations are sufficient to generate categories in highly trained ears, less experienced perceivers broadcast information downstream to higher-order linguistic brain areas (IFG) to construct abstract sound labels.
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
- Breya Walker
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; Department of Psychology, University of Memphis, Memphis, TN, USA; Department of Mathematical Sciences, University of Memphis, Memphis, TN, USA
33
Rampinini AC, Handjaras G, Leo A, Cecchetti L, Betta M, Marotta G, Ricciardi E, Pietrini P. Formant Space Reconstruction From Brain Activity in Frontal and Temporal Regions Coding for Heard Vowels. Front Hum Neurosci 2019; 13:32. [PMID: 30837851 PMCID: PMC6383050 DOI: 10.3389/fnhum.2019.00032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2018] [Accepted: 01/21/2019] [Indexed: 11/29/2022] Open
Abstract
Classical studies have isolated a distributed network of temporal and frontal areas engaged in the neural representation of speech perception and production. With modern literature arguing against unique roles for these cortical regions, different theories have favored either neural code-sharing or cortical space-sharing, thus trying to explain the intertwined spatial and functional organization of motor and acoustic components across the fronto-temporal cortical network. In this context, the focus of attention has recently shifted toward specific model fitting, aimed at reconstructing motor and/or acoustic spaces from brain activity within the language network. Here, we tested a model based on acoustic properties (formants) and one based on motor properties (articulation parameters), where model-free decoding of evoked fMRI activity during perception, imagery, and production of vowels had been successful. Results revealed that phonological information organizes around formant structure during the perception of vowels; interestingly, such a model was reconstructed in a broad temporal region outside of the primary auditory cortex, but also in the pars triangularis of the left inferior frontal gyrus. Conversely, articulatory features were not associated with brain activity in these regions. Overall, our results call for a degree of interdependence, based on acoustic information, between the frontal and temporal ends of the language network.
Affiliation(s)
- Andrea Leo
- IMT School for Advanced Studies Lucca, Lucca, Italy
- Monica Betta
- IMT School for Advanced Studies Lucca, Lucca, Italy
- Giovanna Marotta
- Department of Philology, Literature and Linguistics, University of Pisa, Pisa, Italy
34
Rampinini AC, Handjaras G, Leo A, Cecchetti L, Ricciardi E, Marotta G, Pietrini P. Functional and spatial segregation within the inferior frontal and superior temporal cortices during listening, articulation imagery, and production of vowels. Sci Rep 2017; 7:17029. [PMID: 29208951 PMCID: PMC5717247 DOI: 10.1038/s41598-017-17314-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2017] [Accepted: 11/24/2017] [Indexed: 11/09/2022] Open
Abstract
Classical models of language localize speech perception in the left superior temporal cortex and speech production in the inferior frontal cortex. Nonetheless, neuropsychological, structural and functional studies have questioned this subdivision, suggesting an interwoven organization of the speech function within these cortices. We tested whether sub-regions within frontal and temporal speech-related areas retain specific phonological representations during both perception and production. Using functional magnetic resonance imaging and multivoxel pattern analysis, we showed functional and spatial segregation across the left fronto-temporal cortex during listening, imagery and production of vowels. In accordance with classical models of language and evidence from functional studies, the inferior frontal and superior temporal cortices discriminated among perceived and produced vowels, respectively, while also engaging in the non-classical, alternative function, i.e., perception in the inferior frontal and production in the superior temporal cortex. Crucially, though, contiguous and non-overlapping sub-regions within these hubs performed either the classical or non-classical function, the latter also representing non-linguistic sounds (i.e., pure tones). Extending previous results and in line with integration theories, our findings not only demonstrate that sensitivity to speech listening exists in production-related regions and vice versa, but also suggest that the nature of such interwoven organization is built upon low-level perception.
Affiliation(s)
- Andrea Leo
- IMT School for Advanced Studies, Lucca, 55100, Italy
- Giovanna Marotta
- Department of Philology, Literature and Linguistics, University of Pisa, Pisa, 56100, Italy