1. Luthra S, Razin RN, Tierney AT, Holt LL, Dick F. Systematic changes in neural selectivity reflect the acquired salience of category-diagnostic dimensions. bioRxiv 2024:2024.09.21.614258. [PMID: 39386708] [PMCID: PMC11463673] [DOI: 10.1101/2024.09.21.614258]
Abstract
Humans and other animals develop remarkable behavioral specializations for identifying, differentiating, and acting on classes of ecologically important signals. Ultimately, this expertise is flexible enough to support diverse perceptual judgments: a voice, for example, simultaneously conveys what a talker says as well as myriad cues about her identity and state. Mature perception across complex signals thus involves both discovering and learning regularities that best inform diverse perceptual judgments, and weighting this information flexibly as task demands change. Here, we test whether this flexibility may involve endogenous attentional gain to task-relevant dimensions. We use two prospective auditory category learning tasks to relate a complex, entirely novel soundscape to four classes of "alien identity" and two classes of "alien size." Identity, but not size, categorization requires discovery and learning of patterned acoustic input situated in one of two simultaneous, frequency-delimited bands. This allows us to capitalize on the coarsely segregated frequency-band-specific channels of auditory tonotopic maps using fMRI to ask whether category-relevant perceptual information is prioritized relative to simultaneous, uninformative information. Among participants expert at alien identity categorization, we observe prioritization of the diagnostic frequency band that persists even when the diagnostic information becomes irrelevant in the size categorization task. Tellingly, the neural selectivity evoked implicitly in categorization aligns closely with activation driven by explicit, sustained selective attention to other sounds presented in the same frequency band. Additionally, we observe fingerprints of individual differences in the learning trajectories taken to achieve expert-level categorization in patterns of neural activity associated with the diagnostic dimension.
In all, this indicates that acquiring categories can drive the emergence of acquired attentional salience to dimensions of acoustic input.
2. Cheng THZ, Zhao TC. Validating a novel paradigm for simultaneously assessing mismatch response and frequency-following response to speech sounds. J Neurosci Methods 2024; 412:110277. [PMID: 39245330] [DOI: 10.1016/j.jneumeth.2024.110277]
Abstract
BACKGROUND: Speech sounds are processed in the human brain through intricate and interconnected cortical and subcortical structures. Two neural signatures, one largely from cortical sources (the mismatch response, MMR) and one largely from subcortical sources (the frequency-following response, FFR), are critical for assessing speech processing, as both show sensitivity to high-level linguistic information. However, the prerequisites for recording MMR and FFR differ, making them difficult to acquire simultaneously. NEW METHOD: Using a new paradigm, our study aims to capture both signals concurrently and test them against the following criteria: (1) replicating the effect that the MMR to a native speech contrast significantly differs from the MMR to a nonnative speech contrast, and (2) demonstrating that FFRs to three speech sounds can be reliably differentiated. RESULTS: Using EEG from 18 adults, we observed a decoding accuracy of 72.2% between the MMRs to native vs. nonnative speech contrasts, with a significantly larger native MMR in the expected time window. Similarly, a significant decoding accuracy of 79.6% was found for the FFR. A high stimulus-to-response cross-correlation with a 9 ms lag suggested that the FFR closely tracks the speech sounds. COMPARISON WITH EXISTING METHODS: These findings demonstrate that our paradigm reliably captures both MMR and FFR concurrently, replicating and extending past research with far fewer trials (MMR: 50 trials; FFR: 200 trials) and a shorter experiment time (12 minutes). CONCLUSIONS: This study paves the way to understanding cortical-subcortical interactions in speech and language processing, with the ultimate goal of developing an assessment tool specific to early development.
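The stimulus-to-response cross-correlation reported in this abstract can be sketched as a simple lag search: slide the response against the stimulus and keep the delay with the highest normalized correlation. This is an illustrative pure-Python sketch under assumed names and parameters, not the authors' analysis pipeline:

```python
import math

def best_lag_ms(stimulus, response, max_lag, fs):
    """Return (lag_ms, peak_r): the delay (in ms, at sampling rate fs) that
    maximizes the normalized cross-correlation between stimulus and response."""
    def pearson(x, y):
        n = min(len(x), len(y))
        x, y = x[:n], y[:n]
        mx, my = sum(x) / n, sum(y) / n
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                        sum((b - my) ** 2 for b in y))
        return num / den if den else 0.0
    best_lag, best_r = 0, -2.0
    for lag in range(max_lag + 1):
        # Shift the response earlier by `lag` samples and correlate.
        r = pearson(stimulus, response[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag * 1000.0 / fs, best_r
```

For an FFR that is essentially a delayed copy of a periodic stimulus, the recovered lag estimates the neural transmission delay (on the order of the 9 ms reported above); `max_lag` should stay below the stimulus period to avoid picking a later periodic peak.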
Affiliation(s)
- Tzu-Han Zoe Cheng
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98195, USA; Institute for Learning & Brain Sciences, University of Washington, Seattle, WA 98195, USA.
- Tian Christina Zhao
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98195, USA; Institute for Learning & Brain Sciences, University of Washington, Seattle, WA 98195, USA.
3. Honda CT, Clayards M, Baum SR. Individual differences in the consistency of neural and behavioural responses to speech sounds. Brain Res 2024; 1845:149208. [PMID: 39218332] [DOI: 10.1016/j.brainres.2024.149208]
Abstract
There are documented individual differences among adults in the consistency of speech sound processing, both at neural and behavioural levels. Some adults show more consistent neural responses to speech sounds than others, as measured by an event-related potential called the frequency-following response (FFR); similarly, some adults show more consistent behavioural responses to native speech sounds than others, as measured by two-alternative forced choice (2AFC) and visual analog scaling (VAS) tasks. Adults also differ in how successfully they can perceive non-native speech sounds. Interestingly, it remains unclear whether these differences are related within individuals. In the current study, native English-speaking adults completed native phonetic perception tasks (2AFC and VAS), a non-native (German) phonetic perception task, and an FFR recording session. From these tasks, we derived measures of the consistency of participants' neural and behavioural responses to native speech as well as their non-native perception ability. We then examined the relationships among individual differences in these measures. Analysis of the behavioural measures revealed that more consistent responses to native sounds predicted more successful perception of unfamiliar German sounds. Analysis of neural and behavioural data did not reveal clear relationships between FFR consistency and our phonetic perception measures. This multimodal work furthers our understanding of individual differences in speech processing among adults, and may eventually lead to individualized approaches for enhancing non-native language acquisition in adulthood.
Affiliation(s)
- Claire T Honda
- Integrated Program in Neuroscience, McGill University, Montreal, Canada; Centre for Research on Brain, Language and Music, Montreal, Canada.
- Meghan Clayards
- Centre for Research on Brain, Language and Music, Montreal, Canada; School of Communication Sciences and Disorders, McGill University, Montreal, Canada; Department of Linguistics, McGill University, Montreal, Canada
- Shari R Baum
- Centre for Research on Brain, Language and Music, Montreal, Canada; School of Communication Sciences and Disorders, McGill University, Montreal, Canada
4. Cao M, Pavlik PI, Bidelman GM. Enhancing lexical tone learning for second language speakers: effects of acoustic properties in Mandarin tone perception. Front Psychol 2024; 15:1403816. [PMID: 39233888] [PMCID: PMC11371754] [DOI: 10.3389/fpsyg.2024.1403816]
Abstract
Understanding the challenges faced by second language (L2) learners in lexical tone perception is crucial for effective language acquisition. This study investigates the impact of exaggerated acoustic properties on facilitating Mandarin tone learning for English speakers. Using synthesized tone stimuli, we systematically manipulated pitch contours through three key modifications: expanding the fundamental frequency (F0), increasing F0 (female voice), and extending the overall duration. Our objectives were to assess the influence of F0 expansion, higher F0, longer duration, and varied syllables on Mandarin tone learning and generalization. Participants engaged in a non-adaptive trial-by-trial tone identification task. Mixed-effects logistic regression modeling was used to analyze accuracy across learning phases, acoustic factors, and tones. Findings reveal improvements in accuracy from training to testing and generalization phases, indicating the effectiveness of perceptual training for tone perception in adult English speakers. Tone 1 emerged as the easiest to perceive, while Tone 3 posed the most challenge, consistent with established hierarchies of tonal acquisition difficulty. Analysis of acoustic factors highlighted tone-specific effects. Expanded F0 was beneficial for the identification of Tone 2 and Tone 3 but posed challenges for Tone 1 and Tone 4. Longer durations likewise exhibited varied effects across tones, aiding in the identification of Tone 3 and Tone 4 but hindering Tone 1 identification. The higher F0 was advantageous for Tone 2 but disadvantageous for Tone 3. Furthermore, the syllable ma facilitated the identification of Tone 1 and Tone 2 but not of Tone 3 and Tone 4. These findings enhance our understanding of the role of acoustic properties in L2 tone perception and have implications for the design of effective training programs for second language acquisition.
Affiliation(s)
- Meng Cao
- Optimal Learning Lab, Department of Psychology, Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Philip I Pavlik
- Optimal Learning Lab, Department of Psychology, Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Gavin M Bidelman
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
5. Skoe E, Kraus N. Neural Delays in Processing Speech in Background Noise Minimized after Short-Term Auditory Training. Biology 2024; 13:509. [PMID: 39056702] [PMCID: PMC11273880] [DOI: 10.3390/biology13070509]
Abstract
Background noise disrupts the neural processing of sound, resulting in delayed and diminished far-field auditory-evoked responses. In young adults, we previously provided evidence that cognitively based short-term auditory training can ameliorate the impact of background noise on the frequency-following response (FFR), leading to greater neural synchrony to the speech fundamental frequency (F0) in noisy listening conditions. In this same dataset (55 healthy young adults), we now examine whether training-related changes extend to the latency of the FFR, with the prediction of faster neural timing after training. FFRs were measured on two days separated by ~8 weeks. FFRs were elicited by the syllable "da" presented at a signal-to-noise ratio (SNR) of +10 dB relative to a background of multi-talker noise. Half of the participants completed 20 sessions of computerized training (Listening and Communication Enhancement Program, LACE) between test sessions, while the other half served as controls. In both groups, half of the participants were non-native speakers of English. In the control group, response latencies were unchanged at retest, but in the training group, response latencies were earlier. Findings suggest that auditory training can improve how the adult nervous system responds in noisy listening conditions, as demonstrated by decreased response latencies.
Affiliation(s)
- Erika Skoe
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT 06269, USA
- Nina Kraus
- Department of Communication Sciences, Northwestern University, Evanston, IL 60208, USA
- Cognitive Sciences, Institute for Neuroscience, Northwestern University, Evanston, IL 60208, USA
- Department of Neurobiology and Physiology, Northwestern University, Evanston, IL 60208, USA
- Department of Linguistics, Northwestern University, Evanston, IL 60208, USA
- Department of Otolaryngology, Northwestern University, Evanston, IL 60208, USA
6. Bidelman GM, Sisson A, Rizzi R, MacLean J, Baer K. Myogenic artifacts masquerade as neuroplasticity in the auditory frequency-following response. Front Neurosci 2024; 18:1422903. [PMID: 39040631] [PMCID: PMC11260751] [DOI: 10.3389/fnins.2024.1422903]
Abstract
The frequency-following response (FFR) is an evoked potential that provides a neural index of complex sound encoding in the brain. FFRs have been widely used to characterize speech and music processing, experience-dependent neuroplasticity (e.g., learning and musicianship), and biomarkers for hearing and language-based disorders that distort receptive communication abilities. It is widely assumed that FFRs stem from a mixture of phase-locked neurogenic activity from the brainstem and cortical structures along the hearing neuraxis. In this study, we challenge this prevailing view by demonstrating that upwards of ~50% of the FFR can originate from an unexpected myogenic source: contamination from the postauricular muscle (PAM) vestigial startle reflex. We measured PAM, transient auditory brainstem responses (ABRs), and sustained frequency-following response (FFR) potentials reflecting myogenic (PAM) and neurogenic (ABR/FFR) responses in young, normal-hearing listeners with varying degrees of musical training. We first establish that PAM artifact is present in all ears, varies with electrode proximity to the muscle, and can be experimentally manipulated by directing listeners' eye gaze toward the ear of sound stimulation. We then show this muscular noise easily confounds auditory FFRs, spuriously amplifying responses 3-4-fold with tandem PAM contraction and even explaining putative FFR enhancements observed in highly skilled musicians. Our findings expose a new and unrecognized myogenic source to the FFR that drives its large inter-subject variability and cast doubt on whether changes in the response typically attributed to neuroplasticity/pathology are solely of brain origin.
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Rose Rizzi
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Kaitlin Baer
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Veterans Affairs Medical Center, Memphis, TN, United States
7. Roark CL, Paulon G, Rebaudo G, McHaney JR, Sarkar A, Chandrasekaran B. Individual differences in working memory impact the trajectory of non-native speech category learning. PLoS One 2024; 19:e0297917. [PMID: 38857268] [PMCID: PMC11164376] [DOI: 10.1371/journal.pone.0297917]
Abstract
What is the role of working memory over the course of non-native speech category learning? Prior work has predominantly focused on how working memory might influence learning assessed at a single timepoint. Here, we substantially extend this prior work by examining the role of working memory on speech learning performance over time (i.e., over several months) and leverage a multifaceted approach that provides key insights into how working memory influences learning accuracy, maintenance of knowledge over time, generalization ability, and decision processes. We found that the role of working memory in non-native speech learning depends on the timepoint of learning and whether individuals learned the categories at all. Among learners, across all stages of learning, working memory was associated with higher accuracy as well as faster and slightly more cautious decision making. Further, while learners and non-learners did not have substantially different working memory performance, learners had faster evidence accumulation and more cautious decision thresholds throughout all sessions. Working memory may enhance learning by facilitating rapid category acquisition in initial stages and enabling faster and slightly more careful decision-making strategies that may reduce the overall effort needed to learn. Our results have important implications for developing interventions to improve learning in naturalistic language contexts.
Affiliation(s)
- Casey L. Roark
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
- Giorgio Paulon
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Giovanni Rebaudo
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Jacie R. McHaney
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Abhra Sarkar
- Statistics and Data Sciences, University of Texas at Austin, Austin, TX, United States of America
- Bharath Chandrasekaran
- Communication Science & Disorders, University of Pittsburgh, Pittsburgh, PA, United States of America
- Center for the Neural Basis of Cognition, Pittsburgh, PA, United States of America
8. Mukhopadhyay M, McHaney JR, Chandrasekaran B, Sarkar A. Bayesian Semiparametric Longitudinal Inverse-Probit Mixed Models for Category Learning. Psychometrika 2024; 89:461-485. [PMID: 38374497] [DOI: 10.1007/s11336-024-09947-8]
Abstract
Understanding how the adult human brain learns novel categories is an important problem in neuroscience. Drift-diffusion models are popular in such contexts for their ability to mimic the underlying neural mechanisms. One such model for gradual longitudinal learning was recently developed in Paulon et al. (J Am Stat Assoc 116:1114-1127, 2021). In practice, however, category response accuracies are often the only reliable measure recorded by behavioral scientists to describe human learning, and to our knowledge drift-diffusion models for such scenarios have not previously been considered in the literature. To address this gap, we build carefully on Paulon et al., but now with latent response times integrated out, to derive a novel, biologically interpretable class of 'inverse-probit' categorical probability models for observed categories alone. This new marginal model, however, presents significant identifiability and inferential challenges not encountered for the original joint model. We address these challenges using a novel projection-based approach with a symmetry-preserving identifiability constraint that allows us to work with conjugate priors in an unconstrained space. We adapt the model for group and individual-level inference in longitudinal settings. Building on the model's latent variable representation, we design an efficient Markov chain Monte Carlo algorithm for posterior computation. We evaluate the empirical performance of the method through simulation experiments, and illustrate its practical efficacy in applications to longitudinal tone learning studies.
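For readers unfamiliar with the model class underlying this abstract, a drift-diffusion trial can be simulated by accumulating noisy evidence to a boundary. The sketch below is a generic Euler-Maruyama first-passage simulation for illustration only; it is not the authors' inverse-probit formulation, and all names and parameter values are assumptions:

```python
import random

def ddm_trial(drift, boundary, dt=0.001, noise_sd=1.0, rng=random):
    """Simulate one drift-diffusion trial: evidence starts at 0 and drifts at
    mean rate `drift` with Gaussian noise until it crosses +boundary (correct)
    or -boundary (error). Returns (correct, response_time_in_seconds)."""
    evidence, t = 0.0, 0.0
    step_sd = noise_sd * dt ** 0.5  # noise scales with sqrt(dt)
    while abs(evidence) < boundary:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return evidence > 0, t

# With a positive drift most trials end at the correct boundary; raising the
# drift or lowering the boundary trades accuracy against response time.
rng = random.Random(1)
trials = [ddm_trial(drift=1.5, boundary=1.0, rng=rng) for _ in range(1000)]
accuracy = sum(c for c, _ in trials) / len(trials)
mean_rt = sum(t for _, t in trials) / len(trials)
```

Marginalizing out the response time of such a process, as the paper does, yields a probability model over the observed category choices alone.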
Affiliation(s)
- Minerva Mukhopadhyay
- Department of Mathematics and Statistics, Indian Institute of Technology, Kanpur, 208016, Uttar Pradesh, India
- Jacie R McHaney
- Department of Communication Sciences and Disorders, Northwestern University, 70 Arts Circle Drive, Evanston, IL, 60208, USA
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, Northwestern University, 70 Arts Circle Drive, Evanston, IL, 60208, USA
- Abhra Sarkar
- Department of Statistics and Data Sciences, University of Texas at Austin, 105 East 24th Street D9800, Austin, TX, 78712, USA.
9. Ribas-Prats T, Arenillas-Alcón S, Martínez SIF, Gómez-Roig MD, Escera C. The frequency-following response in late preterm neonates: a pilot study. Front Psychol 2024; 15:1341171. [PMID: 38784610] [PMCID: PMC11112609] [DOI: 10.3389/fpsyg.2024.1341171]
Abstract
Introduction: Infants born very preterm are at high risk of language delays, but less is known about the consequences of late prematurity. The aim of the present study is therefore to characterize the neural encoding of speech sounds in late preterm neonates in comparison with those born at term. Methods: The speech-evoked frequency-following response (FFR) was recorded to a consonant-vowel stimulus /da/ in 36 neonates in three groups: 12 preterm neonates [mean gestational age (GA) 36.05 weeks], 12 "early term" neonates (mean GA 38.3 weeks), and 12 "late term" neonates (mean GA 41.01 weeks). Results: The FFR recordings revealed a delayed neural response and weaker stimulus F0 encoding in premature neonates compared to neonates born at term. No differences in response onset time or in stimulus F0 encoding were observed between the two groups of neonates born at term, and the three groups did not differ in the neural encoding of the stimulus temporal fine structure. Discussion: These results highlight alterations in the neural encoding of speech sounds related to prematurity, which were present for the stimulus F0 but not for its temporal fine structure.
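The "strength of F0 encoding" compared across groups in studies like this one is, in its simplest form, the amplitude of the response spectrum at the stimulus fundamental. A single-bin discrete-Fourier sketch follows; the function name and parameters are illustrative assumptions, not the authors' pipeline:

```python
import math

def amplitude_at(signal, fs, freq):
    """Amplitude of `signal` (a list of samples at rate `fs` Hz) at `freq` Hz,
    computed as a single-bin discrete Fourier transform."""
    n = len(signal)
    w = 2.0 * math.pi * freq / fs
    re = sum(s * math.cos(w * i) for i, s in enumerate(signal))
    im = sum(s * math.sin(w * i) for i, s in enumerate(signal))
    return 2.0 * math.sqrt(re * re + im * im) / n

# Sanity check: a unit-amplitude 100 Hz tone yields ~1.0 at 100 Hz
# and ~0 at off-frequency bins.
fs = 2000
tone = [math.sin(2 * math.pi * 100.0 * i / fs) for i in range(fs)]
```

Comparing this amplitude at the F0 of the /da/ stimulus between the averaged FFRs of two groups gives a crude version of the F0-encoding contrast reported above.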
Affiliation(s)
- Teresa Ribas-Prats
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- Sonia Arenillas-Alcón
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- Silvia Irene Ferrero Martínez
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- BCNatal–Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Spain
- Maria Dolores Gómez-Roig
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- BCNatal–Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Spain
- Carles Escera
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
10. Bidelman G, Sisson A, Rizzi R, MacLean J, Baer K. Myogenic artifacts masquerade as neuroplasticity in the auditory frequency-following response (FFR). bioRxiv 2024:2023.10.27.564446. [PMID: 37961324] [PMCID: PMC10634913] [DOI: 10.1101/2023.10.27.564446]
Abstract
The frequency-following response (FFR) is an evoked potential that provides a "neural fingerprint" of complex sound encoding in the brain. FFRs have been widely used to characterize speech and music processing, experience-dependent neuroplasticity (e.g., learning, musicianship), and biomarkers for hearing and language-based disorders that distort receptive communication abilities. It is widely assumed that FFRs stem from a mixture of phase-locked neurogenic activity from brainstem and cortical structures along the hearing neuraxis. Here, we challenge this prevailing view by demonstrating that upwards of ~50% of the FFR can originate from a non-neural source: contamination from the postauricular muscle (PAM) vestigial startle reflex. We first establish that PAM artifact is present in all ears, varies with electrode proximity to the muscle, and can be experimentally manipulated by directing listeners' eye gaze toward the ear of sound stimulation. We then show this muscular noise easily confounds auditory FFRs, spuriously amplifying responses 3- to 4-fold with tandem PAM contraction and even explaining putative FFR enhancements observed in highly skilled musicians. Our findings expose a new and unrecognized myogenic source of the FFR that drives its large inter-subject variability and cast doubt on whether changes in the response typically attributed to neuroplasticity/pathology are solely of brain origin.
11. Ying R, Stolzberg DJ, Caras ML. Neural correlates of flexible sound perception in the auditory midbrain and thalamus. bioRxiv 2024:2024.04.12.589266. [PMID: 38645241] [PMCID: PMC11030403] [DOI: 10.1101/2024.04.12.589266]
Abstract
Hearing is an active process in which listeners must detect and identify sounds, segregate and discriminate stimulus features, and extract their behavioral relevance. Adaptive changes in sound detection can emerge rapidly, during sudden shifts in acoustic or environmental context, or more slowly as a result of practice. Although we know that context- and learning-dependent changes in the spectral and temporal sensitivity of auditory cortical neurons support many aspects of flexible listening, the contribution of subcortical auditory regions to this process is less understood. Here, we recorded single- and multi-unit activity from the central nucleus of the inferior colliculus (ICC) and the ventral subdivision of the medial geniculate nucleus (MGV) of Mongolian gerbils under two different behavioral contexts: as animals performed an amplitude modulation (AM) detection task and as they were passively exposed to AM sounds. Using a signal detection framework to estimate neurometric sensitivity, we found that neural thresholds in both regions improved during task performance, and this improvement was driven by changes in firing rate rather than phase locking. We also found that ICC and MGV neurometric thresholds improved and correlated with behavioral performance as animals learned to detect small AM depths during a multi-day perceptual training paradigm. Finally, we reveal that in the MGV, but not the ICC, context-dependent enhancements in AM sensitivity grow stronger during perceptual training, mirroring prior observations in the auditory cortex. Together, our results suggest that the auditory midbrain and thalamus contribute to flexible sound processing and perception over rapid and slow timescales.
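The signal-detection framework mentioned above reduces, in its simplest form, to computing a sensitivity index (d′) from spike counts at each AM depth and taking the smallest depth that reaches a criterion. The sketch below is a simplified illustration with assumed function names and a d′ = 1 criterion, not the authors' exact analysis:

```python
import math
import statistics

def dprime(signal_counts, noise_counts):
    """Sensitivity index: difference of mean spike counts divided by the
    root-mean of the two population variances."""
    diff = statistics.mean(signal_counts) - statistics.mean(noise_counts)
    pooled_sd = math.sqrt((statistics.pvariance(signal_counts) +
                           statistics.pvariance(noise_counts)) / 2)
    return diff / pooled_sd if pooled_sd else float("inf")

def neurometric_threshold(counts_by_depth, unmodulated_counts, criterion=1.0):
    """Smallest AM depth (e.g., in dB re: 100% modulation) whose d' relative
    to the unmodulated condition reaches `criterion`; None if no depth does."""
    for depth in sorted(counts_by_depth):
        if dprime(counts_by_depth[depth], unmodulated_counts) >= criterion:
            return depth
    return None
```

A training- or context-related improvement in such a threshold (a smaller depth reaching criterion) is what "neural thresholds improved" refers to in the abstract.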
Affiliation(s)
- Rose Ying
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland, 20742
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park, Maryland, 20742
- Daniel J. Stolzberg
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Melissa L. Caras
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland, 20742
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park, Maryland, 20742
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland, 20742
12. Bidelman GM, Bernard F, Skubic K. Hearing in categories aids speech streaming at the "cocktail party". bioRxiv 2024:2024.04.03.587795. [PMID: 38617284] [PMCID: PMC11014555] [DOI: 10.1101/2024.04.03.587795]
Abstract
Our perceptual system bins elements of the speech signal into categories to make speech perception manageable. Here, we aimed to test whether hearing speech in categories (as opposed to a continuous/gradient fashion) affords yet another benefit to speech recognition: parsing noisy speech at the "cocktail party." We measured speech recognition in a simulated 3D cocktail party environment. We manipulated task difficulty by varying the number of additional maskers presented at other spatial locations in the horizontal soundfield (1-4 talkers) and via forward vs. time-reversed maskers, promoting more and less informational masking (IM), respectively. In separate tasks, we measured isolated phoneme categorization using two-alternative forced choice (2AFC) and visual analog scaling (VAS) tasks designed to promote more/less categorical hearing and thus test putative links between categorization and real-world speech-in-noise skills. We first show that listeners can only monitor up to ~3 talkers despite up to 5 in the soundscape, and that streaming is not related to extended high-frequency hearing thresholds (though QuickSIN scores are). We then confirm that speech streaming accuracy and speed decline with additional competing talkers and amidst forward compared to time-reversed maskers with added IM. Dividing listeners into "discrete" vs. "continuous" categorizers based on their VAS labeling (i.e., whether responses were binary or continuous judgments), we then show the degree of IM experienced at the cocktail party is predicted by their degree of categoricity in phoneme labeling; more discrete listeners are less susceptible to IM than their gradient-responding peers. Our results establish a link between speech categorization skills and cocktail party processing, with a categorical (rather than gradient) listening strategy benefiting degraded speech perception.
These findings imply figure-ground deficits common in many disorders might arise through a surprisingly simple mechanism: a failure to properly bin sounds into categories.
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
- Fallon Bernard
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Kimberly Skubic
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
13
MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term neuroplasticity interact during the perceptual learning of concurrent speech. Cereb Cortex 2024; 34:bhad543. [PMID: 38212291 PMCID: PMC10839853 DOI: 10.1093/cercor/bhad543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2023] [Revised: 12/20/2023] [Accepted: 12/21/2023] [Indexed: 01/13/2024] Open
Abstract
Plasticity from auditory experience shapes the brain's encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45-min training sessions recorded simultaneously with high-density electroencephalography (EEG). We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. Although both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings reinforce the domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity, which first emerge at a cortical level.
Affiliation(s)
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Jack Stirn
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA

14
McHaney JR, Schuerman WL, Leonard MK, Chandrasekaran B. Transcutaneous Auricular Vagus Nerve Stimulation Modulates Performance but Not Pupil Size During Nonnative Speech Category Learning. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:3825-3843. [PMID: 37652065 DOI: 10.1044/2023_jslhr-22-00596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/02/2023]
Abstract
PURPOSE Subthreshold transcutaneous auricular vagus nerve stimulation (taVNS) synchronized with behavioral training can selectively enhance nonnative speech category learning in adults. Prior work has demonstrated that behavioral performance increases when taVNS is paired with easier-to-learn Mandarin tone categories in native English listeners, relative to when taVNS is paired with harder-to-learn Mandarin tone categories or when no taVNS is delivered. Mechanistically, this temporally precise plasticity has been attributed to noradrenergic modulation. However, prior work did not utilize methodologies that index noradrenergic modulation and therefore could not explicitly test this hypothesis. Our goal for this study was to use pupillometry to gain mechanistic insights into taVNS behavioral effects. METHOD Thirty-eight participants learned to categorize Mandarin tones while pupillometry was recorded. In a double-blinded design, participants were divided into two taVNS groups that, as in the prior study, differed according to whether taVNS was paired with easier-to-learn tones or harder-to-learn tones. Learning performance and pupillary responses were analyzed using linear mixed-effects models. RESULTS We found that taVNS had no tone-specific or group-level effects on either behavior or pupillary responses. However, in an exploratory analysis, we observed that taVNS did lead to faster rates of learning on trials paired with stimulation, particularly for participants stimulated at lower amplitudes. CONCLUSIONS Our results suggest that pupillary responses may not be a reliable marker of locus coeruleus-norepinephrine system activity in humans. However, future research should systematically examine the effects of stimulation amplitude on both behavior and pupillary responses. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24036666.
15
MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term experience-dependent neuroplasticity interact during the perceptual learning of concurrent speech. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.26.559640. [PMID: 37808665 PMCID: PMC10557636 DOI: 10.1101/2023.09.26.559640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/10/2023]
Abstract
Plasticity from auditory experiences shapes brain encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ∼45-minute training sessions recorded simultaneously with high-density EEG. We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. While both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings confirm domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity that first emerge at a cortical level.
16
Obasih CO, Luthra S, Dick F, Holt LL. Auditory category learning is robust across training regimes. Cognition 2023; 237:105467. [PMID: 37148640 PMCID: PMC11415078 DOI: 10.1016/j.cognition.2023.105467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2022] [Revised: 03/17/2023] [Accepted: 04/21/2023] [Indexed: 05/08/2023]
Abstract
Multiple lines of research have developed training approaches that foster category learning, with important translational implications for education. Increasing exemplar variability, blocking or interleaving by category-relevant dimension, and providing explicit instructions about diagnostic dimensions have each been shown to facilitate category learning and/or generalization. However, laboratory research often must distill the character of natural input regularities that define real-world categories. As a result, much of what we know about category learning has come from studies with simplifying assumptions. We challenge the implicit expectation that these studies reflect the process of category learning of real-world input by creating an auditory category learning paradigm that intentionally violates some common simplifying assumptions of category learning tasks. Across five experiments and nearly 300 adult participants, we used training regimes previously shown to facilitate category learning, but here drew from a more complex and multidimensional category space with tens of thousands of unique exemplars. Learning was equivalently robust across training regimes that changed exemplar variability, altered the blocking of category exemplars, or provided explicit instruction about the category-diagnostic dimension. Each regime drove essentially equivalent accuracy on measures of learning and generalization following 40 min of training. These findings suggest that auditory category learning across complex input is not as susceptible to training-regime manipulation as previously thought.
Affiliation(s)
- Chisom O Obasih
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America.
- Sahil Luthra
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America
- Frederic Dick
- Experimental Psychology, University College London, United Kingdom; Birkbeck/UCL Centre for NeuroImaging, United Kingdom
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America

17
Ou J, Xiang M, Yu ACL. Individual variability in subcortical neural encoding shapes phonetic cue weighting. Sci Rep 2023; 13:9991. [PMID: 37340072 DOI: 10.1038/s41598-023-37212-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2022] [Accepted: 06/18/2023] [Indexed: 06/22/2023] Open
Abstract
Recent studies have revealed great individual variability in phonetic cue weighting; such variation is systematic across individuals and linked to differences in general cognitive mechanisms. The present study investigated the role of subcortical encoding as a source of individual variability in cue weighting by focusing on English listeners' frequency-following responses to the tense/lax English vowel contrast, which varies in spectral and durational cues. Listeners differed in early auditory encoding, with some encoding the spectral cue more veridically than the durational one, while others exhibited the reverse pattern. These differences in cue encoding further correlated with behavioral variability in cue weighting, suggesting that specificity in cue encoding across individuals modulates how cues are weighted in downstream processes.
Affiliation(s)
- Jinghua Ou
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA
- Ming Xiang
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA
- Alan C L Yu
- Department of Linguistics, University of Chicago, 1115 E. 58th Street, Chicago, IL 60637, USA

18
Carter JA, Bidelman GM. Perceptual warping exposes categorical representations for speech in human brainstem responses. Neuroimage 2023; 269:119899. [PMID: 36720437 PMCID: PMC9992300 DOI: 10.1016/j.neuroimage.2023.119899] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2022] [Revised: 01/17/2023] [Accepted: 01/22/2023] [Indexed: 01/30/2023] Open
Abstract
The brain transforms continuous acoustic events into discrete category representations to downsample the speech signal for our perceptual-cognitive systems. Such phonetic categories are highly malleable, and their percepts can change depending on surrounding stimulus context. Previous work suggests this acoustic-phonetic mapping and the perceptual warping of speech emerge in the brain no earlier than auditory cortex. Here, we examined whether these auditory-category phenomena inherent to speech perception occur even earlier in the human brain, at the level of the auditory brainstem. We recorded speech-evoked frequency-following responses (FFRs) during a task designed to induce more/less warping of listeners' perceptual categories depending on the stimulus presentation order of a speech continuum (random, forward, backward directions). We used a novel clustered stimulus paradigm to rapidly record the high trial counts needed for FFRs concurrent with active behavioral tasks. We found that serial stimulus order caused perceptual shifts (hysteresis) near listeners' category boundary, confirming that identical speech tokens are perceived differently depending on stimulus context. Critically, we further show neural FFRs during active (but not passive) listening are enhanced for prototypical vs. category-ambiguous tokens and are biased in the direction of listeners' phonetic label even for acoustically identical speech stimuli. These findings were not observed in the stimulus acoustics nor in model FFR responses generated by a computational model of cochlear and auditory nerve transduction, confirming a central origin to the effects. Our data reveal FFRs carry category-level information and suggest top-down processing actively shapes the neural encoding and categorization of speech at subcortical levels.
These findings suggest the acoustic-phonetic mapping and perceptual warping in speech perception occur surprisingly early along the auditory neuraxis, which might aid understanding by reducing ambiguity inherent to the speech signal.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA; Division of Clinical Neuroscience, School of Medicine, Hearing Sciences - Scottish Section, University of Nottingham, Glasgow, Scotland, UK
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA; Program in Neuroscience, Indiana University, Bloomington, IN, USA

19
Bidelman GM, Carter JA. Continuous dynamics in behavior reveal interactions between perceptual warping in categorization and speech-in-noise perception. Front Neurosci 2023; 17:1032369. [PMID: 36937676 PMCID: PMC10014819 DOI: 10.3389/fnins.2023.1032369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Accepted: 02/14/2023] [Indexed: 03/05/2023] Open
Abstract
Introduction Spoken language comprehension requires listeners to map continuous features of the speech signal onto discrete category labels. Categories are, however, malleable to surrounding context and stimulus precedence; listeners' percepts can dynamically shift depending on the sequencing of adjacent stimuli, resulting in a warping of the heard phonetic category. Here, we investigated whether such perceptual warping, which amplifies categorical hearing, might alter speech processing in noise-degraded listening scenarios. Methods We measured continuous dynamics in perception and category judgments of an acoustic-phonetic vowel gradient via mouse tracking. Tokens were presented in serial vs. random orders to induce more/less perceptual warping while listeners categorized continua in clean and noise conditions. Results Listeners' responses were faster, and their mouse trajectories were closer to the ultimate behavioral selection (marked visually on the screen), in serial vs. random order, suggesting increased perceptual attraction to category exemplars. Interestingly, order effects emerged earlier and persisted later in the trial time course when categorizing speech in noise. Discussion These data describe interactions between perceptual warping in categorization and speech-in-noise perception: warping strengthens the behavioral attraction to relevant speech categories, making listeners more decisive (though not necessarily more accurate) in their decisions about both clean and noise-degraded speech.
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Jared A. Carter
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Hearing Sciences – Scottish Section, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Glasgow, United Kingdom

20
Baese-Berk MM, Chandrasekaran B, Roark CL. The nature of non-native speech sound representations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:3025. [PMID: 36456300 PMCID: PMC9671621 DOI: 10.1121/10.0015230] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 10/20/2022] [Accepted: 11/01/2022] [Indexed: 05/23/2023]
Abstract
Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of "category" is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area.
Affiliation(s)
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
- Casey L Roark
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
21
Zhao TC, Llanos F, Chandrasekaran B, Kuhl PK. Language experience during the sensitive period narrows infants' sensory encoding of lexical tones-Music intervention reverses it. Front Hum Neurosci 2022; 16:941853. [PMID: 36016666 PMCID: PMC9398460 DOI: 10.3389/fnhum.2022.941853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Accepted: 07/19/2022] [Indexed: 01/13/2023] Open
Abstract
The sensitive period for phonetic learning (6–12 months), evidenced by improved native speech processing and declining non-native speech processing, represents an early milestone in language acquisition. We examined the extent to which sensory encoding of speech is altered by experience during this period by testing two hypotheses: (1) early sensory encoding of non-native speech declines as infants gain native-language experience, and (2) music intervention reverses this decline. We longitudinally measured the frequency-following response (FFR), a robust indicator of early sensory encoding along the auditory pathway, to a Mandarin lexical tone in 7- and 11-month-old monolingual English-learning infants. Infants were randomly assigned to receive either no intervention (language-experience group) or music intervention (music-intervention group) between FFR recordings. The language-experience group exhibited the expected decline in FFR pitch-tracking accuracy for the Mandarin tone, while the music-intervention group did not. Our results support both hypotheses and demonstrate that both language and music experience alter infants' speech encoding.
Affiliation(s)
- Tian Christina Zhao
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, United States
- Fernando Llanos
- Department of Linguistics, University of Texas at Austin, Austin, TX, United States
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Patricia K. Kuhl
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, United States

22
Llanos F, Nike Gnanateja G, Chandrasekaran B. Principal component decomposition of acoustic and neural representations of time-varying pitch reveals adaptive efficient coding of speech covariation patterns. BRAIN AND LANGUAGE 2022; 230:105122. [PMID: 35460953 PMCID: PMC9934908 DOI: 10.1016/j.bandl.2022.105122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Revised: 03/30/2022] [Accepted: 04/02/2022] [Indexed: 06/14/2023]
Abstract
Understanding the effects of statistical regularities on speech processing is a central issue in auditory neuroscience. To investigate the effects of distributional covariance on the neural processing of speech features, we introduce and validate a novel approach: decomposition of time-varying signals into patterns of covariation extracted with Principal Component Analysis. We used this decomposition to assay the sensory representation of pitch covariation patterns in native Chinese listeners and non-native learners of Mandarin Chinese tones. Sensory representations were examined using the frequency-following response, a far-field potential that reflects phase-locked activity from neural ensembles along the auditory pathway. We found a more efficient representation of the covariation patterns that accounted for more redundancy in the form of distributional covariance. Notably, long-term language and short-term training experiences enhanced the sensory representation of these covariation patterns.
Affiliation(s)
- Fernando Llanos
- Department of Linguistics, The University of Texas at Austin, Austin, TX 78712, USA.
- G Nike Gnanateja
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA

23
Mankel K, Shrestha U, Tipirneni-Sajja A, Bidelman GM. Functional Plasticity Coupled With Structural Predispositions in Auditory Cortex Shape Successful Music Category Learning. Front Neurosci 2022; 16:897239. [PMID: 35837119 PMCID: PMC9274125 DOI: 10.3389/fnins.2022.897239] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Accepted: 05/25/2022] [Indexed: 11/23/2022] Open
Abstract
Categorizing sounds into meaningful groups helps listeners more efficiently process the auditory scene and is a foundational skill for speech perception and language development. Yet, how auditory categories develop in the brain through learning, particularly for non-speech sounds (e.g., music), is not well understood. Here, we asked musically naïve listeners to complete a brief (∼20 min) training session in which they learned to identify sounds from a musical interval continuum (minor-major 3rds). We used multichannel EEG to track behaviorally relevant neuroplastic changes in the auditory event-related potentials (ERPs) pre- to post-training. To rule out mere exposure-induced changes, neural effects were evaluated against a control group of 14 non-musicians who did not undergo training. We also compared individual categorization performance with structural volumetrics of bilateral Heschl's gyrus (HG) from MRI to evaluate neuroanatomical substrates of learning. Behavioral performance revealed steeper (i.e., more categorical) identification functions in the posttest that correlated with better training accuracy. At the neural level, improvement in learners' behavioral identification was characterized by smaller P2 amplitudes at posttest, particularly over the right hemisphere. Critically, learning-related changes in the ERPs were not observed in control listeners, ruling out mere exposure effects. Learners also showed smaller and thinner HG bilaterally, indicating that superior categorization was associated with structural differences in primary auditory brain regions. Collectively, our data suggest that successful auditory category learning of music sounds is characterized by short-term functional changes (i.e., greater post-training efficiency) in sensory coding processes superimposed on preexisting structural differences in bilateral auditory cortex.
Affiliation(s)
- Kelsey Mankel
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Utsav Shrestha
- Department of Biomedical Engineering, University of Memphis, Memphis, TN, United States
- Gavin M. Bidelman
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States

24
Chapelle ADL, Savard MA, Restani R, Ghaemmaghami P, Thillou N, Zardoui K, Chandrasekaran B, Coffey EBJ. Sleep affects higher-level categorization of speech sounds, but not frequency encoding. Cortex 2022; 154:27-45. [PMID: 35732089 DOI: 10.1016/j.cortex.2022.04.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2021] [Revised: 03/26/2022] [Accepted: 04/19/2022] [Indexed: 11/03/2022]
Abstract
Sleep can increase consolidation of new knowledge and skills. It is less clear whether sleep plays a role in other aspects of experience-dependent neuroplasticity, which underlie important human capabilities such as spoken language processing. Theories of sensory learning differ in their predictions: some imply rapid learning at early sensory levels, while others propose a slow, progressive timecourse in which higher-level categorical representations guide immediate, novice learning and lower-level sensory changes do not emerge until later stages. In this study, we investigated the role of sleep across both behavioural and physiological indices of auditory neuroplasticity. Forty healthy young human adults (23 female) who did not speak a tonal language participated in the study. They learned to categorize non-native Mandarin lexical tones using a sound-to-category training paradigm and were then randomly assigned to a Nap or Wake condition. Polysomnographic data were recorded to quantify sleep during a 3 h afternoon nap opportunity or an equivalent period of quiet wakeful activity. Measures of behavioural performance accuracy revealed a significant difference in learning on the sound-to-category training paradigm between the Nap and Wake groups. Conversely, a neural index of fine encoding fidelity of speech sounds, the frequency-following response (FFR), showed no change due to sleep; a null model was supported using Bayesian statistics. Together, these results support theories that propose a slow, progressive, and hierarchical timecourse for sensory learning. Sleep's effect may play the biggest role in higher-level learning, although contributions to more protracted processes of plasticity that exceed the study duration cannot be ruled out.
Affiliation(s)
- Aurélien de la Chapelle
- Lyon Neuroscience Research Centre, Lyon, France; Department of Psychology, Concordia University, Montreal, QC, Canada
- Reyan Restani
- Department of Psychology, Concordia University, Montreal, QC, Canada; Université Paris Nanterre, Paris, France
- Noam Thillou
- Department of Psychology, Concordia University, Montreal, QC, Canada
- Khashayar Zardoui
- Department of Psychology, Concordia University, Montreal, QC, Canada
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, USA
- Emily B J Coffey
- Department of Psychology, Concordia University, Montreal, QC, Canada

25
Llanos F, Zhao TC, Kuhl PK, Chandrasekaran B. The emergence of idiosyncratic patterns in the frequency-following response during the first year of life. JASA EXPRESS LETTERS 2022; 2:054401. [PMID: 35578694 PMCID: PMC9096806 DOI: 10.1121/10.0010493] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Accepted: 04/24/2022] [Indexed: 06/15/2023]
Abstract
The frequency-following response (FFR) is a scalp-recorded signal that reflects phase-locked activity from neurons across the auditory system. In addition to capturing information about sounds, the FFR conveys biometric information, reflecting individual differences in auditory processing. To investigate the development of FFR biometric patterns, we trained a pattern recognition model to recognize infants (N = 16) from FFRs collected at 7 and 11 months of age. Model recognition scores were used to index the robustness of FFR biometric patterns at each time point. Results showed better recognition scores at 11 months, demonstrating the emergence of robust, idiosyncratic FFR patterns during the first year of life.
Affiliation(s)
- Fernando Llanos
- Department of Linguistics, University of Texas at Austin, Austin, Texas 78712, USA
- T Christina Zhao
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington 98195, USA
- Patricia K Kuhl
- Institute for Learning and Brain Sciences, University of Washington, Seattle, Washington 98195, USA
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
26
Carter JA, Buder EH, Bidelman GM. Nonlinear dynamics in auditory cortical activity reveal the neural basis of perceptual warping in speech categorization. JASA EXPRESS LETTERS 2022; 2:045201. [PMID: 35434716 PMCID: PMC8984957 DOI: 10.1121/10.0009896] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Accepted: 03/03/2022] [Indexed: 06/14/2023]
Abstract
Surrounding context influences speech listening, resulting in dynamic shifts to category percepts. To examine its neural basis, event-related potentials (ERPs) were recorded during vowel identification with continua presented in random, forward, and backward orders to induce perceptual warping. Behaviorally, sequential orders shifted individual listeners' categorical boundary relative to random delivery, revealing perceptual warping (biasing) of the heard phonetic category dependent on recent stimulus history. ERPs revealed later (∼300 ms) activity localized to superior temporal and middle/inferior frontal gyri that predicted listeners' hysteresis/enhanced-contrast magnitudes. Findings demonstrate that interactions between frontotemporal brain regions govern top-down, stimulus-history effects on speech categorization.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, Tennessee 38152, USA
- Eugene H Buder
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee 38152, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, Indiana 47408, USA
27
Skoe E, García-Sierra A, Ramírez-Esparza N, Jiang S. Automatic sound encoding is sensitive to language familiarity: Evidence from English monolinguals and Spanish-English bilinguals. Neurosci Lett 2022; 777:136582. [DOI: 10.1016/j.neulet.2022.136582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Revised: 03/16/2022] [Accepted: 03/17/2022] [Indexed: 11/30/2022]
28
Novitskiy N, Maggu AR, Lai CM, Chan PHY, Wong KHY, Lam HS, Leung TY, Leung TF, Wong PCM. Early Development of Neural Speech Encoding Depends on Age but Not Native Language Status: Evidence From Lexical Tone. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2022; 3:67-86. [PMID: 37215329 PMCID: PMC10178623 DOI: 10.1162/nol_a_00049] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/21/2020] [Accepted: 06/22/2021] [Indexed: 05/24/2023]
Abstract
We investigated the development of early-latency and long-latency brain responses to native and non-native speech to shed light on the neurophysiological underpinnings of perceptual narrowing and early language development. Specifically, we postulated a two-level process to explain the decrease in sensitivity to non-native phonemes toward the end of infancy. Neurons at the earlier stages of the ascending auditory pathway mature rapidly during infancy, facilitating the encoding of both native and non-native sounds. This growth enables neurons at the later stages of the auditory pathway to assign phonological status to speech according to the infant's native language environment. To test this hypothesis, we collected early-latency and long-latency neural responses to native and non-native lexical tones from 85 Cantonese-learning children aged between 23 days and 24 months, 16 days. As expected, a broad range of presumably subcortical early-latency neural encoding measures grew rapidly and substantially during the first two years for both native and non-native tones. By contrast, long-latency cortical electrophysiological changes occurred on a much slower scale and showed sensitivity to nativeness at around six months. Our study provides a comprehensive understanding of early language development by revealing the complementary roles of earlier and later stages of speech processing in the developing brain.
Affiliation(s)
- Nikolay Novitskiy
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- Akshay R. Maggu
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- O-lab, Duke Psychology and Neuroscience, Duke University, Durham, NC, USA
- Ching Man Lai
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- Peggy H. Y. Chan
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- Department of Paediatrics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Kay H. Y. Wong
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
- Hugh Simon Lam
- Department of Paediatrics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Tak Yeung Leung
- Department of Obstetrics and Gynaecology, The Chinese University of Hong Kong, Hong Kong SAR, China
- Ting Fan Leung
- Department of Paediatrics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Patrick C. M. Wong
- Department of Linguistics and Modern Languages, Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
29
Abstract
The human brain exhibits the remarkable ability to categorize speech sounds into distinct, meaningful percepts, even in challenging tasks like learning non-native speech categories in adulthood and hearing speech in noisy listening conditions. In these scenarios, there is substantial variability in perception and behavior, both across individual listeners and individual trials. While there has been extensive work characterizing stimulus-related and contextual factors that contribute to variability, recent advances in neuroscience are beginning to shed light on another potential source of variability that has not been explored in speech processing. Specifically, there are task-independent, moment-to-moment variations in neural activity in broadly-distributed cortical and subcortical networks that affect how a stimulus is perceived on a trial-by-trial basis. In this review, we discuss factors that affect speech sound learning and moment-to-moment variability in perception, particularly arousal states—neurotransmitter-dependent modulations of cortical activity. We propose that a more complete model of speech perception and learning should incorporate subcortically-mediated arousal states that alter behavior in ways that are distinct from, yet complementary to, top-down cognitive modulations. Finally, we discuss a novel neuromodulation technique, transcutaneous auricular vagus nerve stimulation (taVNS), which is particularly well-suited to investigating causal relationships between arousal mechanisms and performance in a variety of perceptual tasks. Together, these approaches provide novel testable hypotheses for explaining variability in classically challenging tasks, including non-native speech sound learning.
30
Feng G, Gan Z, Yi HG, Ell SW, Roark CL, Wang S, Wong PCM, Chandrasekaran B. Neural dynamics underlying the acquisition of distinct auditory category structures. Neuroimage 2021; 244:118565. [PMID: 34543762 DOI: 10.1016/j.neuroimage.2021.118565] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 11/16/2022] Open
Abstract
Despite the multidimensional and temporally fleeting nature of auditory signals we quickly learn to assign novel sounds to behaviorally relevant categories. The neural systems underlying the learning and representation of novel auditory categories are far from understood. Current models argue for a rigid specialization of hierarchically organized core regions that are fine-tuned to extracting and mapping relevant auditory dimensions to meaningful categories. Scaffolded within a dual-learning systems approach, we test a competing hypothesis: the spatial and temporal dynamics of emerging auditory-category representations are not driven by the underlying dimensions but are constrained by category structure and learning strategies. To test these competing models, we used functional magnetic resonance imaging (fMRI) to assess representational dynamics during the feedback-based acquisition of novel non-speech auditory categories with identical dimensions but differing category structures: rule-based (RB) categories, hypothesized to involve an explicit sound-to-rule mapping network, and information-integration (II) categories, involving pre-decisional integration of dimensions via a procedural-based sound-to-reward mapping network. Adults were assigned to either the RB (n = 30, 19 females) or II (n = 30, 22 females) learning tasks. Despite similar behavioral learning accuracies, learning strategies derived from computational modeling and involvement of corticostriatal systems during feedback processing differed across tasks. Spatiotemporal multivariate representational similarity analysis revealed an emerging representation within an auditory sensory-motor pathway exclusively for the II learning task, prominently involving the superior temporal gyrus (STG), inferior frontal gyrus (IFG), and posterior precentral gyrus. In contrast, the RB learning task yielded distributed neural representations within regions involved in cognitive-control and attentional processes that emerged at different time points of learning. Our results unequivocally demonstrate that auditory learners' neural systems are highly flexible and show distinct spatial and temporal patterns that are not dimension-specific but reflect underlying category structures and learning strategies.
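The representational similarity analysis mentioned above is not spelled out in the abstract. As a minimal, generic sketch of the technique (correlation-distance RDMs compared via a second-order correlation; not the authors' actual spatiotemporal pipeline, and the patterns below are invented):

```python
import math

def pearson(a, b):
    # Pearson correlation between two equal-length sequences
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - r between each pair
    of condition response patterns (e.g., voxel activation vectors)."""
    n = len(patterns)
    return [[1 - pearson(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]

def rdm_similarity(rdm1, rdm2):
    """Second-order similarity: correlate the upper-triangle entries
    of two RDMs (e.g., neural RDM vs. a model/category RDM)."""
    n = len(rdm1)
    tri = lambda m: [m[i][j] for i in range(n) for j in range(i + 1, n)]
    return pearson(tri(rdm1), tri(rdm2))
```

Tracking `rdm_similarity` between a neural RDM and a category-structure RDM across training runs is one standard way such "emerging representations" are quantified.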
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China.
- Zhenzhong Gan
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China, School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
- Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, CA 94158, United States
- Shawn W Ell
- Department of Psychology, Graduate School of Biomedical Sciences and Engineering, University of Maine, 5742 Little Hall, Room 301, Orono, ME 04469-5742, United States
- Casey L Roark
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States; Center for the Neural Basis of Cognition, Pittsburgh, PA 15232, United States
- Suiping Wang
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, China, School of Psychology, Center for Studies of Psychological Application, and Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou 510631, China
- Patrick C M Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China; Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States; Center for the Neural Basis of Cognition, Pittsburgh, PA 15232, United States.
31
Gnanateja GN, Rupp K, Llanos F, Remick M, Pernia M, Sadagopan S, Teichert T, Abel TJ, Chandrasekaran B. Frequency-Following Responses to Speech Sounds Are Highly Conserved across Species and Contain Cortical Contributions. eNeuro 2021; 8:ENEURO.0451-21.2021. [PMID: 34799409 PMCID: PMC8704423 DOI: 10.1523/eneuro.0451-21.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Accepted: 11/02/2021] [Indexed: 11/21/2022] Open
Abstract
Time-varying pitch is a vital cue for human speech perception. Neural processing of time-varying pitch has been extensively assayed using scalp-recorded frequency-following responses (FFRs), an electrophysiological signal thought to reflect integrated phase-locked neural ensemble activity from subcortical auditory areas. Emerging evidence increasingly points to a putative contribution of auditory cortical ensembles to the scalp-recorded FFRs. However, the properties of cortical FFRs and precise characterization of laminar sources are still unclear. Here we used direct human intracortical recordings as well as extracranial and intracranial recordings from macaques and guinea pigs to characterize the properties of cortical sources of FFRs to time-varying pitch patterns. We found robust FFRs in the auditory cortex across all species. We leveraged representational similarity analysis as a translational bridge to characterize similarities between the human and animal models. Laminar recordings in animal models showed FFRs emerging primarily from the thalamorecipient layers of the auditory cortex. FFRs arising from these cortical sources significantly contributed to the scalp-recorded FFRs via volume conduction. Our research paves the way for a wide array of studies to investigate the role of cortical FFRs in auditory perception and plasticity.
Affiliation(s)
- G Nike Gnanateja
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Kyle Rupp
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Fernando Llanos
- Department of Linguistics, The University of Texas at Austin, Austin, Texas 78712
- Madison Remick
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Marianny Pernia
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Srivatsun Sadagopan
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Tobias Teichert
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Taylor J Abel
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
32
Skoe E, Krizman J, Spitzer ER, Kraus N. Auditory Cortical Changes Precede Brainstem Changes During Rapid Implicit Learning: Evidence From Human EEG. Front Neurosci 2021; 15:718230. [PMID: 34483831 PMCID: PMC8415395 DOI: 10.3389/fnins.2021.718230] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Accepted: 07/20/2021] [Indexed: 11/28/2022] Open
Abstract
The auditory system is sensitive to stimulus regularities such as frequently occurring sounds and sound combinations. Evidence of regularity detection can be seen in how neurons across the auditory network, from brainstem to cortex, respond to the statistical properties of the soundscape, and in children's and adults' rapid learning of recurring patterns in their environment. Although rapid auditory learning is presumed to involve functional changes to the auditory network, the chronology and directionality of these changes are not well understood. To study the mechanisms by which this learning occurs, auditory brainstem and cortical activity was simultaneously recorded via electroencephalogram (EEG) while young adults listened to novel sound streams containing recurring patterns. Neurophysiological responses were compared between easier and harder learning conditions. Collectively, the behavioral and neurophysiological findings suggest that cortical and subcortical structures each provide distinct contributions to auditory pattern learning, but that cortical sensitivity to stimulus patterns likely precedes subcortical sensitivity.
Affiliation(s)
- Erika Skoe
- Department of Speech, Language and Hearing Sciences, Connecticut Institute for Brain and Cognitive Sciences, University of Connecticut, Storrs, CT, United States
- Jennifer Krizman
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, Evanston, IL, United States
- Emily R Spitzer
- Department of Otolaryngology, Head and Neck Surgery, New York University Grossman School of Medicine, New York, NY, United States
- Nina Kraus
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, Evanston, IL, United States; Department of Neurobiology and Physiology, Northwestern University, Evanston, IL, United States; Department of Otolaryngology, Northwestern University, Evanston, IL, United States; Institute for Neuroscience, Northwestern University, Evanston, IL, United States
33
Learning nonnative speech sounds changes local encoding in the adult human cortex. Proc Natl Acad Sci U S A 2021; 118:2101777118. [PMID: 34475209 DOI: 10.1073/pnas.2101777118] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Accepted: 07/12/2021] [Indexed: 11/18/2022] Open
Abstract
Adults can learn to identify nonnative speech sounds with training, albeit with substantial variability in learning behavior. Increases in behavioral accuracy are associated with increased separability for sound representations in cortical speech areas. However, it remains unclear whether individual auditory neural populations all show the same types of changes with learning, or whether there are heterogeneous encoding patterns. Here, we used high-resolution direct neural recordings to examine local population response patterns, while native English listeners learned to recognize unfamiliar vocal pitch patterns in Mandarin Chinese tones. We found a distributed set of neural populations in bilateral superior temporal gyrus and ventrolateral frontal cortex, where the encoding of Mandarin tones changed throughout training as a function of trial-by-trial accuracy ("learning effect"), including both increases and decreases in the separability of tones. These populations were distinct from populations that showed changes as a function of exposure to the stimuli regardless of trial-by-trial accuracy. These learning effects were driven in part by more variable neural responses to repeated presentations of acoustically identical stimuli. Finally, learning effects could be predicted from speech-evoked activity even before training, suggesting that intrinsic properties of these populations make them amenable to behavior-related changes. Together, these results demonstrate that nonnative speech sound learning involves a wide array of changes in neural representations across a distributed set of brain regions.
34
Llanos F, German JS, Gnanateja GN, Chandrasekaran B. The neural processing of pitch accents in continuous speech. Neuropsychologia 2021; 158:107883. [PMID: 33989647 DOI: 10.1016/j.neuropsychologia.2021.107883] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 04/29/2021] [Accepted: 05/03/2021] [Indexed: 12/21/2022]
Abstract
Pitch accents are local pitch patterns that convey differences in word prominence and modulate the information structure of the discourse. Despite the importance to discourse in languages like English, neural processing of pitch accents remains understudied. The current study investigates the neural processing of pitch accents by native and non-native English speakers while they are listening to or ignoring 45 min of continuous, natural speech. Leveraging an approach used to study phonemes in natural speech, we analyzed thousands of electroencephalography (EEG) segments time-locked to pitch accents in a prosodic transcription. The optimal neural discrimination between pitch accent categories emerged at latencies between 100 and 200 ms. During these latencies, we found a strong structural alignment between neural and phonetic representations of pitch accent categories. In the same latencies, native listeners exhibited more robust processing of pitch accent contrasts than non-native listeners. However, these group differences attenuated when the speech signal was ignored. We can reliably capture the neural processing of discrete and contrastive pitch accent categories in continuous speech. Our analytic approach also captures how language-specific knowledge and selective attention influences the neural processing of pitch accent categories.
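The analysis described above rests on slicing continuous EEG into thousands of segments time-locked to annotated pitch accents. As a generic miniature of that epoching step (not the study's pipeline; window sizes and data below are made up):

```python
def epochs(eeg, events, pre, post):
    """Slice a continuous EEG channel (list of samples) into segments
    time-locked to event sample indices; drops events too close to the edges."""
    return [eeg[e - pre : e + post] for e in events
            if e - pre >= 0 and e + post <= len(eeg)]

def average(eps):
    """Average across epochs to estimate the event-locked response."""
    n = len(eps)
    return [sum(ep[i] for ep in eps) / n for i in range(len(eps[0]))]
```

Each epoch spans `pre` samples before and `post` samples after its event marker; averaging (or, as in the study, classifying) these time-locked segments isolates activity tied to the pitch accents.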
Affiliation(s)
- Fernando Llanos
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA; Department of Linguistics, The University of Texas at Austin, Austin, TX, USA
- James S German
- Aix-Marseille University, CNRS, LPL, Aix-en-Provence, France
- G Nike Gnanateja
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA.
35
Heidlmayr K, Ferragne E, Isel F. Neuroplasticity in the phonological system: The PMN and the N400 as markers for the perception of non-native phonemic contrasts by late second language learners. Neuropsychologia 2021; 156:107831. [PMID: 33753084 DOI: 10.1016/j.neuropsychologia.2021.107831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Revised: 12/28/2020] [Accepted: 03/17/2021] [Indexed: 02/04/2023]
Abstract
Second language (L2) learners frequently encounter persistent difficulty in perceiving certain non-native sound contrasts, i.e., a phenomenon called "phonological deafness". However, if extensive L2 experience leads to neuroplastic changes in the phonological system, then the capacity to discriminate non-native phonemic contrasts should progressively improve. Such perceptual changes should be attested by modifications at the neurophysiological level. We designed an EEG experiment in which the listeners' perceptual capacities to discriminate second language phonemic contrasts influence the processing of lexical-semantic violations. Semantic congruency of critical words in a sentence context was driven by a phonemic contrast that was unique to the L2, English (e.g.,/ɪ/-/i:/, ship - sheep). Twenty-eight young adult native speakers of French with intermediate proficiency in English listened to sentences that contained either a semantically congruent or incongruent critical word (e.g., The anchor of the ship/*sheep was let down) while EEG was recorded. Three ERP effects were found to relate to increasing L2 proficiency: (1) a left frontal auditory N100 effect, (2) a smaller fronto-central phonological mismatch negativity (PMN) effect and (3) a semantic N400 effect. No effect of proficiency was found on oscillatory markers. The current findings suggest that neuronal plasticity in the human brain allows for the late acquisition of even hard-wired linguistic features such as the discrimination of phonemic contrasts in a second language. This is the first time that behavioral and neurophysiological evidence for the critical role of neural plasticity underlying L2 phonological processing and its interdependence with semantic processing has been provided. Our data strongly support the idea that pieces of information from different levels of linguistic processing (e.g., phonological, semantic) strongly interact and influence each other during online language processing.
Affiliation(s)
- Karin Heidlmayr
- UMR 1253, iBrain, University of Tours, Inserm, Tours, France; Max-Planck Institute for Psycholinguistics, Nijmegen, the Netherlands; Laboratory CLILLAC-ARP - URP3967, Université de Paris, Paris, France; Laboratory Models, Dynamics, Corpus, CNRS/University Paris Nanterre, Paris Lumières, France.
- Emmanuel Ferragne
- Laboratory CLILLAC-ARP - URP3967, Université de Paris, Paris, France
- Frédéric Isel
- Laboratory Models, Dynamics, Corpus, CNRS/University Paris Nanterre, Paris Lumières, France
36
Carter JA, Bidelman GM. Auditory cortex is susceptible to lexical influence as revealed by informational vs. energetic masking of speech categorization. Brain Res 2021; 1759:147385. [PMID: 33631210 DOI: 10.1016/j.brainres.2021.147385] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 02/02/2023]
Abstract
Speech perception requires the grouping of acoustic information into meaningful phonetic units via the process of categorical perception (CP). Environmental masking influences speech perception and CP. However, it remains unclear at which stage of processing (encoding, decision, or both) masking affects listeners' categorization of speech signals. The purpose of this study was to determine whether linguistic interference influences the early acoustic-phonetic conversion process inherent to CP. To this end, we measured source-level, event-related brain potentials (ERPs) from auditory cortex (AC) and inferior frontal gyrus (IFG) as listeners rapidly categorized speech sounds along a /da/ to /ga/ continuum presented in three listening conditions: quiet, and in the presence of forward (informational masker) and time-reversed (energetic masker) 2-talker babble noise. Maskers were matched in overall SNR and spectral content and thus varied only in their degree of linguistic interference (i.e., informational masking). We hypothesized a differential effect of informational versus energetic masking on behavioral and neural categorization responses, where we predicted increased activation of frontal regions when disambiguating speech from noise, especially during lexical-informational maskers. We found (1) informational masking weakens behavioral speech phoneme identification above and beyond energetic masking; (2) low-level AC activity not only codes speech categories but is susceptible to higher-order lexical interference; (3) identifying speech amidst noise recruits a cross-hemispheric circuit (ACleft → IFGright) whose engagement varies according to task difficulty. These findings provide corroborating evidence for top-down influences on the early acoustic-phonetic analysis of speech through a coordinated interplay between frontotemporal brain areas.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA.
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
37
Feng G, Li Y, Hsu SM, Wong PC, Chou TL, Chandrasekaran B. Emerging native-similar neural representations underlie non-native speech category learning success. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:280-307. [PMID: 34368775 PMCID: PMC8345815 DOI: 10.1162/nol_a_00035] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Learning non-native phonetic categories in adulthood is an exceptionally challenging task, characterized by large inter-individual differences in learning speed and outcomes. The neurobiological mechanisms underlying the inter-individual differences in the learning efficacy are not fully understood. Here we examined the extent to which training-induced neural representations of non-native Mandarin tone categories in English listeners (n = 53) are increasingly similar to those of the native listeners (n = 33) who acquired these categories early in infancy. We particularly assessed whether the neural similarities in representational structure between non-native learners and native listeners are robust neuromarkers of inter-individual differences in learning success. Using inter-subject neural representational similarity (IS-NRS) analysis and predictive modeling on two functional magnetic resonance imaging (fMRI) datasets, we examined the neural representational mechanisms underlying speech category learning success. Learners' neural representations that were significantly similar to the native listeners emerged in brain regions mediating speech perception following training; the extent of the emerging neural similarities with native listeners significantly predicted the learning speed and outcome in learners. The predictive power of IS-NRS outperformed models with other neural representational measures. Furthermore, neural representations underlying successful learning are multidimensional but cost-efficient in nature. The degree of the emergent native-similar neural representations was closely related to the robust neural sensitivity to feedback in the frontostriatal network. These findings provide important insights on experience-dependent representational neuroplasticity underlying successful speech learning in adulthood and could be leveraged in designing individualized feedback-based training paradigms that maximize learning efficiency.
Affiliation(s)
- Gangyi Feng
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Corresponding authors: Gangyi Feng, Ph.D., Brain and Mind Institute, Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China, +852-3943 3190; Bharath Chandrasekaran, Ph.D., Department of Communication Science and Disorders, University of Pittsburgh, 6074 Forbes Tower, Pittsburgh, PA 15260, (412) 383-6565
- Yu Li
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Shen-Mou Hsu
- Imaging Center for Integrated Body, Mind and Culture Research, National Taiwan University, Taipei 10617, Taiwan
- Patrick C.M. Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Tai-Li Chou
- Imaging Center for Integrated Body, Mind and Culture Research, National Taiwan University, Taipei 10617, Taiwan
- Department of Psychology, National Taiwan University, Taipei 10617, Taiwan
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA

38
Paulon G, Llanos F, Chandrasekaran B, Sarkar A. Bayesian Semiparametric Longitudinal Drift-Diffusion Mixed Models for Tone Learning in Adults. J Am Stat Assoc 2020;116:1114-1127. [PMID: 34650315] [PMCID: PMC8513775] [DOI: 10.1080/01621459.2020.1801448]
Abstract
Understanding how adult humans learn nonnative speech categories such as tone information has provided novel insights into the mechanisms underlying experience-dependent brain plasticity. Scientists have traditionally examined these questions using longitudinal learning experiments under a multi-category decision-making paradigm. Drift-diffusion processes are popular in such contexts for their ability to mimic underlying neural mechanisms. Motivated by these problems, we develop a novel Bayesian semiparametric inverse Gaussian drift-diffusion mixed model for multi-alternative decision making in longitudinal settings. We design a Markov chain Monte Carlo algorithm for posterior computation. We evaluate the method's empirical performance through synthetic experiments. Applied to our motivating longitudinal tone learning study, the method provides novel insights into how the biologically interpretable model parameters evolve with learning, differ between input-response tone combinations, and differ between well- and poorly performing adults. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
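The generative process this model describes can be sketched with a minimal simulation: one drift-diffusion accumulator per response option races to its boundary, and the first passage (an inverse Gaussian time) determines the choice and response time. This illustrates the mechanism only, not the authors' Bayesian semiparametric estimation; all parameter values and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_race(drifts, bounds, offset, n_trials, dt=0.001, t_max=5.0):
    """Multi-alternative drift-diffusion race: each option's evidence is a
    Wiener process with its own drift; the first accumulator to reach its
    boundary gives the choice, and its first-passage time plus a fixed
    non-decision offset gives the response time."""
    drifts = np.asarray(drifts, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    n_opts = drifts.size
    choices = np.empty(n_trials, dtype=int)
    rts = np.empty(n_trials)
    max_steps = int(t_max / dt)
    for i in range(n_trials):
        x = np.zeros(n_opts)
        for step in range(1, max_steps + 1):
            x += drifts * dt + np.sqrt(dt) * rng.normal(size=n_opts)
            crossed = np.nonzero(x >= bounds)[0]
            if crossed.size:
                break
        # first crossing wins (ties broken by index); if nothing crossed
        # by t_max, fall back to the accumulator closest to its boundary
        winner = crossed[0] if crossed.size else int(np.argmax(x / bounds))
        choices[i] = winner
        rts[i] = offset + step * dt
    return choices, rts

# Four tone categories; option 0 is "correct" and has the largest drift.
choices, rts = simulate_race([2.5, 0.5, 0.5, 0.5], [1.5] * 4,
                             offset=0.2, n_trials=200)
accuracy = float(np.mean(choices == 0))
```

In the paper, the drifts, boundaries, and offsets are themselves modeled as smoothly evolving with training; here they are fixed for a single snapshot.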
Affiliation(s)
- Giorgio Paulon
- Department of Statistics and Data Sciences, University of Texas at Austin, Austin, TX
- Fernando Llanos
- Department of Linguistics, University of Texas at Austin, Austin, TX
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA
- Abhra Sarkar
- Department of Statistics and Data Sciences, University of Texas at Austin, Austin, TX

39
Heimler B, Amedi A. Are critical periods reversible in the adult brain? Insights on cortical specializations based on sensory deprivation studies. Neurosci Biobehav Rev 2020;116:494-507. [DOI: 10.1016/j.neubiorev.2020.06.034]

40
Llanos F, McHaney JR, Schuerman WL, Yi HG, Leonard MK, Chandrasekaran B. Non-invasive peripheral nerve stimulation selectively enhances speech category learning in adults. NPJ Sci Learn 2020;5:12. [PMID: 32802406] [PMCID: PMC7410845] [DOI: 10.1038/s41539-020-0070-0]
Abstract
Adults struggle to learn non-native speech contrasts even after years of exposure. While laboratory-based training approaches yield learning, the optimal training conditions for maximizing speech learning in adulthood are currently unknown. Vagus nerve stimulation has been shown to prime adult sensory-perceptual systems towards plasticity in animal models. Precise temporal pairing with auditory stimuli can enhance auditory cortical representations with a high degree of specificity. Here, we examined whether sub-perceptual threshold transcutaneous vagus nerve stimulation (tVNS), paired with non-native speech sounds, enhances speech category learning in adults. Twenty-four native English speakers were trained to identify non-native Mandarin tone categories. Across two groups, tVNS was paired with the tone categories that were easier or harder to learn. A control group received no stimulation but followed an identical thresholding procedure as the intervention groups. We found that tVNS robustly enhanced speech category learning and retention of correct stimulus-response associations, but only when stimulation was paired with the easier-to-learn categories. This effect emerged rapidly, generalized to new exemplars, and was qualitatively different from the normal individual variability observed in hundreds of learners who have performed the same task without stimulation. Electroencephalography recorded before and after training indicated no evidence of tVNS-induced changes in the sensory representation of auditory stimuli. These results suggest that paired tVNS induces a temporally precise neuromodulatory signal that selectively enhances the perception and memory consolidation of perceptually salient categories.
Affiliation(s)
- Fernando Llanos
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Jacie R. McHaney
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA
- William L. Schuerman
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Han G. Yi
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Matthew K. Leonard
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA

41
Tang M, Huang ZL, Zhong F, Xiang JL, Wang XD. One-week phonemic training rebuilds the memory traces of merged phonemes in merged speakers. Brain Res 2020;1740:146848. [PMID: 32330520] [DOI: 10.1016/j.brainres.2020.146848]
Abstract
A phonemic merger is a phenomenon in which acoustically very different phonemes come to be recognized as the same phoneme. In our previous study, we demonstrated that merged speakers had lost the ability to discriminate the merged phonemes pre-attentively, as revealed by their failure to elicit a mismatch negativity (MMN) in an oddball stream of the merged phonemes /n/-/l/. In this study, we investigated the recovery of this discrimination ability via phonemic training and found that, after a 7-day /n/-/l/ phonemic training regimen, merged speakers regained the ability to discriminate merged phonemes pre-attentively, as revealed by the reactivation of the MMN brain response to the /n/-/l/ phoneme categories. Our finding indicates that separate memory traces of merged phonemes can be rebuilt during the training process.
Affiliation(s)
- Mi Tang
- Faculty of Psychology, Southwest University, Chongqing 400715, China
- Zheng-Lan Huang
- Faculty of Psychology, Southwest University, Chongqing 400715, China
- Fei Zhong
- Faculty of Psychology, Southwest University, Chongqing 400715, China
- Jing-Lan Xiang
- Faculty of Psychology, Southwest University, Chongqing 400715, China
- Xiao-Dong Wang
- Faculty of Psychology, Southwest University, Chongqing 400715, China

42
Bidelman GM, Bush LC, Boudreaux AM. Effects of Noise on the Behavioral and Neural Categorization of Speech. Front Neurosci 2020;14:153. [PMID: 32180700] [PMCID: PMC7057933] [DOI: 10.3389/fnins.2020.00153]
Abstract
We investigated whether the categorical perception (CP) of speech might also provide a mechanism that aids its perception in noise. We varied signal-to-noise ratio (SNR) [clear, 0 dB, -5 dB] while listeners classified an acoustic-phonetic continuum (/u/ to /a/). Noise-related changes in behavioral categorization were only observed at the lowest SNR. Event-related brain potentials (ERPs) differentiated category vs. category-ambiguous speech by the P2 wave (~180-320 ms). Paralleling behavior, neural responses to speech with clear phonetic status (i.e., continuum endpoints) were robust to noise down to -5 dB SNR, whereas responses to ambiguous tokens declined with decreasing SNR. Results demonstrate that phonetic speech representations are more resistant to degradation than corresponding acoustic representations. Findings suggest the mere process of binning speech sounds into categories provides a robust mechanism to aid figure-ground speech perception by fortifying abstract categories from the acoustic signal and making the speech code more resistant to external interferences.
Affiliation(s)
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, United States
- Lauren C Bush
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Alex M Boudreaux
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States

43
Al-Fahad R, Yeasin M, Bidelman GM. Decoding of single-trial EEG reveals unique states of functional brain connectivity that drive rapid speech categorization decisions. J Neural Eng 2020;17:016045. [PMID: 31822643] [PMCID: PMC7004853] [DOI: 10.1088/1741-2552/ab6040]
Abstract
OBJECTIVE Categorical perception (CP) is an inherent property of speech perception. The response time (RT) of listeners' perceptual speech identification is highly sensitive to individual differences. While the neural correlates of CP have been well studied in terms of the regional contributions of the brain to behavior, the functional connectivity patterns that signify individual differences in listeners' speed (RT) of speech categorization are less clear. In this study, we introduce a novel approach to address these questions. APPROACH We applied several computational approaches to the EEG, including graph mining, machine learning (i.e., support vector machine), and stability selection, to investigate the unique brain states (functional neural connectivity) that predict the speed of listeners' behavioral decisions. MAIN RESULTS We infer that (i) listeners' perceptual speed is directly related to dynamic variations in their brain connectomics; (ii) global network assortativity and efficiency distinguished fast, medium, and slow RTs; (iii) the functional network underlying speeded decisions increases in negative assortativity (i.e., becomes disassortative) for slower RTs; (iv) slower categorical speech decisions cause excessive use of neural resources and more aberrant information flow within the CP circuitry; and (v) slower responders tended to utilize functional brain networks excessively (or inappropriately), whereas fast responders (with lower global efficiency) utilized the same neural pathways but with more restricted organization. SIGNIFICANCE Findings show that neural classifiers (SVM) coupled with stability selection correctly classify behavioral RTs from functional connectivity alone with over 92% accuracy (AUC = 0.9). Our results corroborate previous studies by supporting the engagement of similar temporal (STG), parietal, motor, and prefrontal regions in CP using an entirely data-driven approach.
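The stability-selection component of this pipeline can be sketched in isolation on synthetic data. The paper couples it with an SVM over connectivity features; in this hedged sketch a simple absolute-correlation score stands in for the classifier, and all names, sizes, and thresholds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 300 "trials" x 40 connectivity features; only the
# first 3 features actually carry information about response time.
n, p = 300, 40
X = rng.normal(size=(n, p))
rt = X[:, :3] @ np.array([1.0, -0.8, 0.6]) + 0.5 * rng.normal(size=n)

def stability_selection(X, y, n_rounds=100, frac=0.5, top_k=5):
    """Score each feature by |corr with y| on random half-samples and
    count how often it lands in the top_k; features selected in a large
    fraction of rounds are considered stable predictors."""
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        scores = np.abs([np.corrcoef(X[idx, j], y[idx])[0, 1] for j in range(p)])
        counts[np.argsort(scores)[-top_k:]] += 1
    return counts / n_rounds

freq = stability_selection(X, rt)
stable = np.nonzero(freq > 0.6)[0]   # indices of stably selected features
```

The point of the subsampling is that spuriously high scores rarely repeat across rounds, so only genuinely informative features accumulate a high selection frequency.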
Affiliation(s)
- Rakib Al-Fahad
- Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152, USA
- Mohammed Yeasin
- Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN 38152, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- Gavin M. Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, USA

44
Coffey EBJ, Nicol T, White-Schwoch T, Chandrasekaran B, Krizman J, Skoe E, Zatorre RJ, Kraus N. Evolving perspectives on the sources of the frequency-following response. Nat Commun 2019;10:5036. [PMID: 31695046] [PMCID: PMC6834633] [DOI: 10.1038/s41467-019-13003-w]
Abstract
The auditory frequency-following response (FFR) is a non-invasive index of the fidelity of sound encoding in the brain, and is used to study the integrity, plasticity, and behavioral relevance of the neural encoding of sound. In this Perspective, we review recent evidence suggesting that, in humans, the FFR arises from multiple cortical and subcortical sources, not just subcortically as previously believed, and we illustrate how the FFR to complex sounds can enhance the wider field of auditory neuroscience. Far from being of use only to study basic auditory processes, the FFR is an uncommonly multifaceted response yielding a wealth of information, with much yet to be tapped.
Affiliation(s)
- Emily B J Coffey
- Department of Psychology, Concordia University, 1455 Boulevard de Maisonneuve Ouest, Montréal, QC, H3G 1M8, Canada
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montréal, QC, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, 3640 de la Montagne, Montréal, QC, H3G 2A8, Canada
- Trent Nicol
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, 2240 Campus Dr., Evanston, IL, 60208, USA
- Travis White-Schwoch
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, 2240 Campus Dr., Evanston, IL, 60208, USA
- Bharath Chandrasekaran
- Communication Sciences and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Forbes Tower, 3600 Atwood St, Pittsburgh, PA, 15260, USA
- Jennifer Krizman
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, 2240 Campus Dr., Evanston, IL, 60208, USA
- Erika Skoe
- Department of Speech, Language, and Hearing Sciences, The Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, 2 Alethia Drive, Unit 1085, Storrs, CT, 06269, USA
- Robert J Zatorre
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montréal, QC, Canada
- Centre for Research on Brain, Language and Music (CRBLM), McGill University, 3640 de la Montagne, Montréal, QC, H3G 2A8, Canada
- Montreal Neurological Institute, McGill University, 3801 rue Université, Montréal, QC, H3A 2B4, Canada
- Nina Kraus
- Auditory Neuroscience Laboratory, Department of Communication Sciences, Northwestern University, 2240 Campus Dr., Evanston, IL, 60208, USA
- Department of Neurobiology, Northwestern University, 2205 Tech Dr., Evanston, IL, 60208, USA
- Department of Otolaryngology, Northwestern University, 420 E Superior St., Chicago, IL, 60611, USA

45
Luthra S, Fuhrmeister P, Molfese PJ, Guediche S, Blumstein SE, Myers EB. Brain-behavior relationships in incidental learning of non-native phonetic categories. Brain Lang 2019;198:104692. [PMID: 31522094] [PMCID: PMC6773471] [DOI: 10.1016/j.bandl.2019.104692]
Abstract
Research has implicated the left inferior frontal gyrus (LIFG) in mapping acoustic-phonetic input to sound category representations, both in native speech perception and non-native phonetic category learning. At issue is whether this sensitivity reflects access to phonetic category information per se or to explicit category labels, the latter often being required by experimental procedures. The current study employed an incidental learning paradigm designed to increase sensitivity to a difficult non-native phonetic contrast without inducing explicit awareness of the categorical nature of the stimuli. Functional MRI scans revealed frontal sensitivity to phonetic category structure both before and after learning. Additionally, individuals who succeeded most on the learning task showed the largest increases in frontal recruitment after learning. Overall, results suggest that processing novel phonetic category information entails a reliance on frontal brain regions, even in the absence of explicit category labels.
Affiliation(s)
- Sahil Luthra
- University of Connecticut, Department of Psychological Sciences, United States
- Pamela Fuhrmeister
- University of Connecticut, Department of Speech, Language and Hearing Sciences, United States
- Sara Guediche
- Basque Center on Cognition, Brain and Language, Spain
- Sheila E Blumstein
- Brown University, Department of Cognitive, Linguistic and Psychological Sciences, United States
- Emily B Myers
- University of Connecticut, Department of Psychological Sciences, United States; University of Connecticut, Department of Speech, Language and Hearing Sciences, United States; Haskins Laboratories, United States

46
Bidelman GM, Price CN, Shen D, Arnott SR, Alain C. Afferent-efferent connectivity between auditory brainstem and cortex accounts for poorer speech-in-noise comprehension in older adults. Hear Res 2019;382:107795. [PMID: 31479953] [DOI: 10.1016/j.heares.2019.107795]
Abstract
Speech-in-noise (SIN) comprehension deficits in older adults have been linked to changes in both subcortical and cortical auditory evoked responses. However, older adults' difficulty understanding SIN may also be related to an imbalance in signal transmission (i.e., functional connectivity) between brainstem and auditory cortices. By modeling high-density scalp recordings of speech-evoked responses with sources in brainstem (BS) and bilateral primary auditory cortices (PAC), we show that beyond attenuating neural activity, hearing loss in older adults compromises the transmission of speech information between subcortical and early cortical hubs of the speech network. We found that the strength of afferent BS→PAC neural signaling (but not the reverse efferent flow; PAC→BS) varied with mild declines in hearing acuity and this "bottom-up" functional connectivity robustly predicted older adults' performance in a SIN identification task. Connectivity was also a better predictor of SIN processing than unitary subcortical or cortical responses alone. Our neuroimaging findings suggest that in older adults (i) mild hearing loss differentially reduces neural output at several stages of auditory processing (PAC > BS), (ii) subcortical-cortical connectivity is more sensitive to peripheral hearing loss than top-down (cortical-subcortical) control, and (iii) reduced functional connectivity in afferent auditory pathways plays a significant role in SIN comprehension problems.
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA
- Caitlin N Price
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Dawei Shen
- Rotman Research Institute-Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada
- Stephen R Arnott
- Rotman Research Institute-Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada
- Claude Alain
- Rotman Research Institute-Baycrest Centre for Geriatric Care, Toronto, Ontario, Canada; University of Toronto, Department of Psychology, Toronto, Ontario, Canada; University of Toronto, Institute of Medical Sciences, Toronto, Ontario, Canada

47
Llanos F, Xie Z, Chandrasekaran B. Biometric identification of listener identity from frequency following responses to speech. J Neural Eng 2019;16:056004. [PMID: 31039552] [DOI: 10.1088/1741-2552/ab1e01]
Abstract
OBJECTIVE We investigate the biometric specificity of the frequency following response (FFR), an EEG marker of early auditory processing that reflects phase-locked activity from neural ensembles in the auditory cortex and subcortex (Chandrasekaran and Kraus 2010, Bidelman 2015a, 2018, Coffey et al 2017b). Our objective is two-fold: to demonstrate that the FFR contains information beyond stimulus properties and broad group-level markers, and to assess the practical viability of the FFR as a biometric across different sounds, auditory experiences, and recording days. APPROACH We trained a hidden Markov model (HMM) to decode listener identity from FFR spectro-temporal patterns across multiple frequency bands. Our dataset included FFRs from twenty native speakers of English or Mandarin Chinese (10 per group) listening to Mandarin Chinese tones across three EEG sessions separated by days. We decoded subject identity within the same auditory context (same tone and session) and across different stimuli and recording sessions. MAIN RESULTS The HMM decoded listeners at averaging sizes as small as a single FFR. However, model performance improved with larger averaging sizes (e.g., 25 FFRs), similarity in auditory context (same tone and day), and lack of familiarity with the sounds (i.e., native English relative to native Chinese listeners). Our results also revealed important biometric contributions from frequency bands in the cortical and subcortical EEG. SIGNIFICANCE Our study provides the first deep and systematic biometric characterization of the FFR and provides the basis for biometric identification systems incorporating this neural signal.
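The decoding logic can be illustrated with a deliberately simplified stand-in: the paper trains an HMM on FFR spectro-temporal patterns, whereas this sketch uses a nearest-template correlation decoder on synthetic signals to show why identification improves with averaging size. Everything below (listener count, trial counts, noise level) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic FFRs: each of 5 "listeners" has an idiosyncratic response
# template; single trials are that template buried in noise.
n_subj, n_trials, n_samp = 5, 40, 256
templates = rng.normal(size=(n_subj, n_samp))
trials = templates[:, None, :] + 1.5 * rng.normal(size=(n_subj, n_trials, n_samp))

def decode_listener(avg_ffr, templates):
    """Classify an averaged FFR as the listener whose template it
    correlates with most strongly (nearest-template decoder)."""
    r = [np.corrcoef(avg_ffr, t)[0, 1] for t in templates]
    return int(np.argmax(r))

def accuracy(n_avg):
    """Identification accuracy when averaging n_avg trials per listener."""
    correct = 0
    for s in range(n_subj):
        avg = trials[s, :n_avg].mean(axis=0)
        correct += decode_listener(avg, templates) == s
    return correct / n_subj

acc1, acc25 = accuracy(1), accuracy(25)
```

Averaging suppresses trial-to-trial noise, so the subject-specific component dominates the correlation, mirroring the averaging-size effect reported above.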
Affiliation(s)
- Fernando Llanos
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, United States of America

48
Bidelman GM, Sigley L, Lewis GA. Acoustic noise and vision differentially warp the auditory categorization of speech. J Acoust Soc Am 2019;146:60. [PMID: 31370660] [PMCID: PMC6786888] [DOI: 10.1121/1.5114822]
Abstract
Speech perception requires grouping acoustic information into meaningful linguistic-phonetic units via categorical perception (CP). Beyond shrinking observers' perceptual space, CP might aid degraded speech perception if categories are more resistant to noise than surface acoustic features. Combining audiovisual (AV) cues also enhances speech recognition, particularly in noisy environments. This study investigated the degree to which visual cues from a talker (i.e., mouth movements) aid speech categorization amidst noise interference by measuring participants' identification of clear and noisy speech (0 dB signal-to-noise ratio) presented in auditory-only or combined AV modalities (i.e., A, A+noise, AV, AV+noise conditions). Auditory noise expectedly weakened (i.e., shallower identification slopes) and slowed speech categorization. Interestingly, additional viseme cues largely counteracted noise-related decrements in performance and stabilized classification speeds in both clear and noise conditions suggesting more precise acoustic-phonetic representations with multisensory information. Results are parsimoniously described under a signal detection theory framework and by a reduction (visual cues) and increase (noise) in the precision of perceptual object representation, which were not due to lapses of attention or guessing. Collectively, findings show that (i) mapping sounds to categories aids speech perception in "cocktail party" environments; (ii) visual cues help lattice formation of auditory-phonetic categories to enhance and refine speech identification.
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
- Lauren Sigley
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA
- Gwyneth A Lewis
- School of Communication Sciences & Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA

49
Paulon G, Reetzke R, Chandrasekaran B, Sarkar A. Functional Logistic Mixed-Effects Models for Learning Curves From Longitudinal Binary Data. J Speech Lang Hear Res 2019;62:543-553. [PMID: 30950747] [PMCID: PMC6802892] [DOI: 10.1044/2018_jslhr-s-astm-18-0283]
Abstract
Purpose We present functional logistic mixed-effects models (FLMEMs) for estimating population- and individual-level learning curves in longitudinal experiments. Method Using functional analysis tools in a Bayesian hierarchical framework, the FLMEM captures nonlinear, smoothly varying learning curves, appropriately accommodating uncertainty in various aspects of the analysis while also borrowing information across different model layers. An R package implementing our method is available as part of the Supplemental Materials. Results Application to speech learning data from Reetzke, Xie, Llanos, and Chandrasekaran (2018) and a simulation study demonstrate the utility of the FLMEM and its many advantages over linear and logistic mixed-effects models. Conclusion The FLMEM is highly flexible and efficient in improving upon the practical limitations of linear models and logistic linear mixed-effects models. We expect the FLMEM to be a useful addition to the speech, language, and hearing scientist's toolkit. Supplemental Material https://doi.org/10.23641/asha.7822568
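A hedged sketch of the underlying idea: fitting a smooth curve for p(correct) over trials from longitudinal binary data. The paper's FLMEM is a Bayesian spline-based model with random effects (with an accompanying R package); this stand-in fits a simple two-parameter logistic curve by gradient ascent on synthetic data, and all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic longitudinal binary data: accuracy rises over 200 trials.
trials = np.arange(200)
true_p = sigmoid(-2.0 + 0.03 * trials)
correct = (rng.random(200) < true_p).astype(float)

def fit_logistic_curve(t, y, lr=0.5, n_iter=2000):
    """Fit p(correct | trial) = sigmoid(b0 + b1 * t) by gradient ascent
    on the Bernoulli log-likelihood (a parametric stand-in for the
    paper's spline-based functional curve)."""
    ts = (t - t.mean()) / t.std()      # standardize for stable steps
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = sigmoid(b0 + b1 * ts)
        b0 += lr * np.mean(y - p)      # gradient of mean log-likelihood
        b1 += lr * np.mean((y - p) * ts)
    return b0, b1, sigmoid(b0 + b1 * ts)

b0, b1, curve = fit_logistic_curve(trials, correct)
```

The fitted `curve` is the estimated learning trajectory; the FLMEM generalizes this by letting the curve's shape vary smoothly and by sharing information across participants.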
Affiliation(s)
- Giorgio Paulon
- Department of Statistics and Data Sciences, The University of Texas at Austin
- Rachel Reetzke
- Department of Psychiatry and Behavioral Medicine, University of California, Davis
- Abhra Sarkar
- Department of Statistics and Data Sciences, The University of Texas at Austin

50
Xie Z, Reetzke R, Chandrasekaran B. Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses. J Speech Lang Hear Res 2019;62:587-601. [PMID: 30950746] [PMCID: PMC6802895] [DOI: 10.1044/2018_jslhr-s-astm-18-0244]
Abstract
Purpose Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based approaches to analyzing speech-evoked neurophysiological responses. Method Two categories of ML-based approaches are introduced: decoding models, which generate a speech stimulus output using the features from the neurophysiological responses, and encoding models, which use speech stimulus features to predict neurophysiological responses. In this review, we focus on (a) a decoding model classification approach, wherein speech-evoked neurophysiological responses are classified as belonging to 1 of a finite set of possible speech events (e.g., phonological categories), and (b) an encoding model temporal response function approach, which quantifies the transformation of a speech stimulus feature to continuous neural activity. Results We illustrate the utility of the classification approach to analyze early electroencephalographic (EEG) responses to Mandarin lexical tone categories from a traditional experimental design, and to classify EEG responses to English phonemes evoked by natural continuous speech (i.e., an audiobook) into phonological categories (plosive, fricative, nasal, and vowel). We also demonstrate the utility of temporal response function to predict EEG responses to natural continuous speech from acoustic features. Neural metrics from the 3 examples all exhibit statistically significant effects at the individual level. Conclusion We propose that ML-based approaches can complement traditional analysis approaches to analyze neurophysiological responses to speech signals and provide a deeper understanding of natural speech and language processing using ecologically valid paradigms in both typical and clinical populations.
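The encoding-model temporal response function (TRF) approach described in this abstract can be sketched as lagged ridge regression on synthetic data. This is a generic illustration, not the paper's pipeline; the kernel shape, lag count, and regularization strength are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def lagged_design(stim, n_lags):
    """Design matrix whose columns are time-lagged copies of the
    stimulus feature (lags 0 .. n_lags-1 samples)."""
    n = len(stim)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[:n - k]
    return X

def fit_trf(stim, eeg, n_lags, lam=1.0):
    """Estimate a temporal response function by ridge regression:
    w = (X'X + lam*I)^-1 X'y."""
    X = lagged_design(stim, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)

# Synthetic demo: the "EEG" is the stimulus feature convolved with a
# known kernel plus noise; the fitted TRF should recover that kernel.
n, n_lags = 5000, 16
stim = rng.normal(size=n)
kernel = np.exp(-np.arange(n_lags) / 4.0)   # assumed true response function
eeg = np.convolve(stim, kernel)[:n] + 0.5 * rng.normal(size=n)

trf = fit_trf(stim, eeg, n_lags)
```

Once fitted, the TRF can predict held-out neural responses from stimulus features, which is how encoding models are typically evaluated.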
Affiliation(s)
- Zilong Xie
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Rachel Reetzke
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh