1
Ribas-Prats T, Arenillas-Alcón S, Martínez SIF, Gómez-Roig MD, Escera C. The frequency-following response in late preterm neonates: a pilot study. Front Psychol 2024; 15:1341171. PMID: 38784610; PMCID: PMC11112609; DOI: 10.3389/fpsyg.2024.1341171.
Abstract
Introduction: Infants born very early preterm are at high risk of language delays. However, less is known about the consequences of late prematurity. Hence, the aim of the present study was to characterize the neural encoding of speech sounds in late preterm neonates in comparison with those born at term.
Methods: The speech-evoked frequency-following response (FFR) was recorded to a consonant-vowel stimulus /da/ in 36 neonates in three groups: 12 preterm neonates [mean gestational age (GA) 36.05 weeks], 12 "early term" neonates (mean GA 38.3 weeks), and 12 "late term" neonates (mean GA 41.01 weeks).
Results: The FFR recordings revealed a delayed neural response and weaker stimulus F0 encoding in premature neonates compared to neonates born at term. No differences in response onset time or in stimulus F0 encoding were observed between the two groups of neonates born at term, and no differences among the three groups were observed in the neural encoding of the stimulus temporal fine structure.
Discussion: These results highlight prematurity-related alterations in the neural encoding of speech sounds, present for the stimulus F0 but not for its temporal fine structure.
Affiliation(s)
- Teresa Ribas-Prats
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- Sonia Arenillas-Alcón
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- Silvia Irene Ferrero Martínez
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- BCNatal–Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Spain
- Maria Dolores Gómez-Roig
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- BCNatal–Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Spain
- Carles Escera
- Brainlab–Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
- Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
2
Jeng FC, Matzdorf K, Hickman KL, Bauer SW, Carriero AE, McDonald K, Lin TH, Wang CY. Advancing Auditory Processing by Detecting Frequency-Following Responses Through a Specialized Machine Learning Model. Percept Mot Skills 2024; 131:417-431. PMID: 38153030; DOI: 10.1177/00315125231225767.
Abstract
In this study, we explore the feasibility and performance of detecting scalp-recorded frequency-following responses (FFRs) with a specialized machine learning (ML) model. By leveraging the feature-extraction strengths of the source separation non-negative matrix factorization (SSNMF) algorithm and its adeptness in handling limited training data, we adapted the SSNMF algorithm into a specialized ML model with a hybrid architecture to enhance FFR detection amidst background noise. We recruited 40 adults with normal hearing and evoked their scalp-recorded FFRs using the English vowel /i/ with a rising pitch contour. The model was trained on FFR-present and FFR-absent conditions, and its performance was evaluated using sensitivity, specificity, efficiency, false-positive rate, and false-negative rate metrics. The specialized SSNMF model achieved heightened sensitivity, specificity, and efficiency in detecting FFRs as the number of recording sweeps increased: sensitivity exceeded 80% at 500 sweeps and remained above 89% from 1,000 sweeps onward, while specificity and efficiency likewise improved rapidly with increasing sweeps. The progressively enhanced sensitivity, specificity, and efficiency of this specialized ML model underscore its practicality and potential for broader applications. These findings have immediate implications for FFR research and clinical use, while paving the way for further advancements in the assessment of auditory processing.
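As a side note for readers unfamiliar with the evaluation metrics named in this abstract, the sketch below illustrates how sensitivity, specificity, and efficiency could be computed for an FFR-present/FFR-absent detector. The labels, predictions, and the equating of "efficiency" with overall accuracy are illustrative assumptions, not details taken from the study.

```python
def detection_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, efficiency) for binary detection.

    y_true / y_pred: sequences of 1 (FFR present) and 0 (FFR absent).
    Efficiency is taken here as overall accuracy (an assumption).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)          # hit rate on response-present sweeps
    specificity = tn / (tn + fp)          # correct-rejection rate
    efficiency = (tp + tn) / len(y_true)  # overall proportion correct
    return sensitivity, specificity, efficiency

# Hypothetical detector output: 8 recordings, one miss and one false alarm.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec, eff = detection_metrics(truth, preds)
print(sens, spec, eff)  # 0.75 0.75 0.75
```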
Affiliation(s)
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Communication Sciences and Disorders, Asia University, Taichung, Taiwan
- Katie Matzdorf
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Kassy L Hickman
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Sydney W Bauer
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Amanda E Carriero
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Kalyn McDonald
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Tzu-Hao Lin
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Ching-Yuan Wang
- Department of Otolaryngology-HNS, China Medical University Hospital, Taichung, Taiwan
3
Giordano AT, Jeng FC, Black TR, Bauer SW, Carriero AE, McDonald K, Lin TH, Wang CY. Effects of Silent Intervals on the Extraction of Human Frequency-Following Responses Using Non-Negative Matrix Factorization. Percept Mot Skills 2023; 130:1834-1851. PMID: 37534595; DOI: 10.1177/00315125231191303.
Abstract
Source-separation non-negative matrix factorization (SSNMF) is a mathematical algorithm recently developed to extract scalp-recorded frequency-following responses (FFRs) from noise. Despite its initial success, the effects of silent intervals on algorithm performance remained undetermined. Our purpose in this study was to determine the effects of silent intervals on the extraction of FFRs, which are electrophysiological responses commonly used to evaluate auditory processing and neuroplasticity in the human brain. We used an English vowel /i/ with a rising frequency contour to evoke FFRs in 23 normal-hearing adults. The stimulus had a duration of 150 ms, and the silent interval between the offset of one stimulus and the onset of the next was also 150 ms. We computed FFR Enhancement and Noise Residue to estimate algorithm performance while silent intervals were either included (the WithSI condition) or excluded (the WithoutSI condition) from the analysis. The FFR Enhancements and Noise Residues obtained in the WithoutSI condition were significantly better (p < .05) than those obtained in the WithSI condition. On average, excluding silent intervals produced an 11.78% increment in FFR Enhancement and a 20.69% decrement in Noise Residue. These results not only quantify the effects of silent intervals on the extraction of human FFRs, but also provide recommendations for designing and improving the SSNMF algorithm in future research.
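The paper's SSNMF algorithm is not reproduced here, but a minimal sketch of the general idea, factorizing a non-negative magnitude spectrogram into additive components and reconstructing the component that carries the response energy, can be written with scikit-learn's generic NMF. The toy spectrogram, the choice of two components, and the ridge-selection heuristic are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Toy "magnitude spectrogram": 64 frequency bins x 200 time frames of
# broadband noise, plus a strong ridge at one bin standing in for the FFR.
spec = rng.random((64, 200)) * 0.3
spec[20, :] += 2.0

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(spec)  # 64 x 2 spectral bases
H = model.components_          # 2 x 200 temporal activations

# Pick the component with the most weight at the ridge bin and
# reconstruct the "response" part of the spectrogram from it alone.
tonal = int(np.argmax(W[20, :]))
response_estimate = np.outer(W[:, tonal], H[tonal, :])
print(response_estimate.shape)  # (64, 200)
```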
Affiliation(s)
- Allison T Giordano
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Taylor R Black
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Sydney W Bauer
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Amanda E Carriero
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Kalyn McDonald
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Tzu-Hao Lin
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Ching-Yuan Wang
- Department of Otolaryngology-HNS, China Medical University Hospital, Taichung, Taiwan
4
Xu C, Cheng FY, Medina S, Eng E, Gifford R, Smith S. Objective discrimination of bimodal speech using frequency following responses. Hear Res 2023; 437:108853. PMID: 37441879; DOI: 10.1016/j.heares.2023.108853.
Abstract
Bimodal hearing, in which a contralateral hearing aid is combined with a cochlear implant (CI), provides greater speech recognition benefits than using a CI alone. Factors predicting individual bimodal patient success are not fully understood. Previous studies have shown that bimodal benefits may be driven by a patient's ability to extract fundamental frequency (f0) and/or temporal fine structure cues (e.g., F1). Both of these features may be represented in frequency following responses (FFR) to bimodal speech. Thus, the goals of this study were to: 1) parametrically examine neural encoding of f0 and F1 in simulated bimodal speech conditions; 2) examine objective discrimination of FFRs to bimodal speech conditions using machine learning; 3) explore whether FFRs are predictive of perceptual bimodal benefit. Three vowels (/ε/, /i/, and /ʊ/) with identical f0 were manipulated by a vocoder (right ear) and low-pass filters (left ear) to create five bimodal simulations for evoking FFRs: Vocoder-only, Vocoder +125 Hz, Vocoder +250 Hz, Vocoder +500 Hz, and Vocoder +750 Hz. Perceptual performance on the BKB-SIN test was also measured using the same five configurations. Results suggested that neural representation of f0 and F1 FFR components were enhanced with increasing acoustic bandwidth in the simulated "non-implanted" ear. As spectral differences between vowels emerged in the FFRs with increased acoustic bandwidth, FFRs were more accurately classified and discriminated using a machine learning algorithm. Enhancement of f0 and F1 neural encoding with increasing bandwidth were collectively predictive of perceptual bimodal benefit on a speech-in-noise task. Given these results, FFR may be a useful tool to objectively assess individual variability in bimodal hearing.
Affiliation(s)
- Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin, TX 78712-0114, USA
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin, TX 78712-0114, USA
- Sarah Medina
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin, TX 78712-0114, USA
- Erica Eng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin, TX 78712-0114, USA
- René Gifford
- Department of Speech, Language, and Hearing Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin, TX 78712-0114, USA
5
Wimalarathna H, Ankmnal-Veeranna S, Allan C, Agrawal SK, Samarabandu J, Ladak HM, Allen P. Machine learning approaches used to analyze auditory evoked responses from the human auditory brainstem: A systematic review. Comput Methods Programs Biomed 2022; 226:107118. PMID: 36122495; DOI: 10.1016/j.cmpb.2022.107118.
Abstract
BACKGROUND: The application of machine learning algorithms for assessing the auditory brainstem response has gained interest in recent years, with a considerable number of publications in the literature. In this systematic review, we explore how machine learning has been used to develop algorithms that assess auditory brainstem responses. A clear and comprehensive overview is provided to allow clinicians and researchers to explore the domain and the potential translation to clinical care.
METHODS: The systematic review was performed according to PRISMA guidelines. PubMed, IEEE Xplore, and Scopus were searched for human studies that used machine learning to assess auditory brainstem responses, covering January 1, 1990, to April 3, 2021. The Covidence systematic review platform (www.covidence.org) was used throughout the process.
RESULTS: A total of 5812 studies were found through the database search and 451 duplicates were removed. Title and abstract screening further reduced the count to 89 articles, and in the subsequent full-text screening, 34 articles met the full inclusion criteria.
CONCLUSION: Three categories of applications were found: neurologic diagnosis, hearing threshold estimation, and other (neither neurologic diagnosis nor hearing threshold estimation). Neural networks and support vector machines were the most commonly used machine learning algorithms in all three categories. Only one study had conducted a clinical trial to evaluate the algorithm after development. Challenges remain in the amount of data required to train machine learning models. Suggestions for future research avenues are provided, with recommended reporting methods for researchers.
Affiliation(s)
- Hasitha Wimalarathna
- Department of Electrical & Computer Engineering, Western University, London, Ontario, Canada; National Centre for Audiology, Western University, London, Ontario, Canada
- Sangamanatha Ankmnal-Veeranna
- National Centre for Audiology, Western University, London, Ontario, Canada; College of Nursing and Health Professions, School of Speech and Hearing Sciences, The University of Southern Mississippi, J.B. George Building, Hattiesburg, MS, USA
- Chris Allan
- National Centre for Audiology, Western University, London, Ontario, Canada; School of Communication Sciences & Disorders, Western University, London, Ontario, Canada
- Sumit K Agrawal
- Department of Electrical & Computer Engineering, Western University, London, Ontario, Canada; National Centre for Audiology, Western University, London, Ontario, Canada; School of Biomedical Engineering, Western University, London, Ontario, Canada; Department of Medical Biophysics, Western University, London, Ontario, Canada; Department of Otolaryngology - Head and Neck Surgery, Western University, London, Ontario, Canada
- Jagath Samarabandu
- Department of Electrical & Computer Engineering, Western University, London, Ontario, Canada
- Hanif M Ladak
- Department of Electrical & Computer Engineering, Western University, London, Ontario, Canada; National Centre for Audiology, Western University, London, Ontario, Canada; School of Biomedical Engineering, Western University, London, Ontario, Canada; Department of Medical Biophysics, Western University, London, Ontario, Canada; Department of Otolaryngology - Head and Neck Surgery, Western University, London, Ontario, Canada
- Prudence Allen
- National Centre for Audiology, Western University, London, Ontario, Canada; School of Communication Sciences & Disorders, Western University, London, Ontario, Canada
6
Lai J, Price CN, Bidelman GM. Brainstem speech encoding is dynamically shaped online by fluctuations in cortical α state. Neuroimage 2022; 263:119627. PMID: 36122686; PMCID: PMC10017375; DOI: 10.1016/j.neuroimage.2022.119627.
Abstract
Experimental evidence in animals demonstrates cortical neurons innervate subcortex bilaterally to tune brainstem auditory coding. Yet, the role of the descending (corticofugal) auditory system in modulating earlier sound processing in humans during speech perception remains unclear. Here, we measured EEG activity as listeners performed speech identification tasks in different noise backgrounds designed to tax perceptual and attentional processing. We hypothesized brainstem speech coding might be tied to attention and arousal states (indexed by cortical α power) that actively modulate the interplay of brainstem-cortical signal processing. When speech-evoked brainstem frequency-following responses (FFRs) were categorized according to cortical α states, we found low α FFRs in noise were weaker, correlated positively with behavioral response times, and were more "decodable" via neural classifiers. Our data provide new evidence for online corticofugal interplay in humans and establish that brainstem sensory representations are continuously yoked to (i.e., modulated by) the ebb and flow of cortical states to dynamically update perceptual processing.
Affiliation(s)
- Jesyin Lai
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA; Diagnostic Imaging Department, St. Jude Children's Research Hospital, Memphis, TN, USA
- Caitlin N Price
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA; Department of Audiology and Speech Pathology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA; Department of Speech, Language and Hearing Sciences, Indiana University, 2631 East Discovery Parkway, Bloomington, IN 47408, USA; Program in Neuroscience, Indiana University, 1101 E 10th St, Bloomington, IN 47405, USA
7
Jeng FC, Jeng YS. Implementation of Machine Learning on Human Frequency-Following Responses: A Tutorial. Semin Hear 2022; 43:251-274. PMID: 36313046; PMCID: PMC9605809; DOI: 10.1055/s-0042-1756219.
Abstract
The frequency-following response (FFR) provides enriched information on how acoustic stimuli are processed in the human brain. Based on recent studies, machine learning techniques have demonstrated great utility in modeling human FFRs. This tutorial focuses on the fundamental principles, algorithmic designs, and custom implementations of several supervised models (linear regression, logistic regression, k-nearest neighbors, support vector machines) and an unsupervised model (k-means clustering). Other useful machine learning tools (Markov chains, dimensionality reduction, principal components analysis, nonnegative matrix factorization, and neural networks) are discussed as well. Each model's applicability and its pros and cons are explained. The choice of a suitable model is highly dependent on the research question, FFR recordings, target variables, extracted features, and their data types. To promote understanding, an example project implemented in Python is provided, which demonstrates practical usage of several of the discussed models on a sample dataset of six FFR features and a target response label.
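In the spirit of the tutorial's Python example project, the sketch below fits two of the supervised models it lists (k-nearest neighbors and a support vector machine) to a synthetic dataset of six FFR-style features and a binary response label. The features, labels, and hyperparameters are invented for illustration and are not the tutorial's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 6))                    # six hypothetical FFR features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic response label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

results = {}
for name, clf in [("k-NN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf"))]:
    clf.fit(X_tr, y_tr)
    results[name] = clf.score(X_te, y_te)  # held-out accuracy
    print(name, round(results[name], 2))
```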
Affiliation(s)
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, Ohio
- Yu-Shiang Jeng
- Computer Science and Engineering, Ohio State University, Columbus, Ohio
8
Smith S. Translational Applications of Machine Learning in Auditory Electrophysiology. Semin Hear 2022; 43:240-250. PMID: 36313047; PMCID: PMC9605807; DOI: 10.1055/s-0042-1756166.
Abstract
Machine learning (ML) is transforming nearly every aspect of modern life including medicine and its subfields, such as hearing science. This article presents a brief conceptual overview of selected ML approaches and describes how these techniques are being applied to outstanding problems in hearing science, with a particular focus on auditory evoked potentials (AEPs). Two vignettes are presented in which ML is used to analyze subcortical AEP data. The first vignette demonstrates how ML can be used to determine if auditory learning has influenced auditory neurophysiologic function. The second vignette demonstrates how ML analysis of AEPs may be useful in determining whether hearing devices are optimized for discriminating speech sounds.
Affiliation(s)
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, Texas
9
Zhao TC, Llanos F, Chandrasekaran B, Kuhl PK. Language experience during the sensitive period narrows infants' sensory encoding of lexical tones-Music intervention reverses it. Front Hum Neurosci 2022; 16:941853. PMID: 36016666; PMCID: PMC9398460; DOI: 10.3389/fnhum.2022.941853.
Abstract
The sensitive period for phonetic learning (6-12 months), evidenced by improved native speech processing and declined non-native speech processing, represents an early milestone in language acquisition. We examined the extent to which sensory encoding of speech is altered by experience during this period by testing two hypotheses: (1) early sensory encoding of non-native speech declines as infants gain native-language experience, and (2) music intervention reverses this decline. We longitudinally measured the frequency-following response (FFR), a robust indicator of early sensory encoding along the auditory pathway, to a Mandarin lexical tone in 7- and 11-month-old monolingual English-learning infants. Between FFR recordings, infants were randomly assigned to receive either no intervention (language-experience group) or music intervention (music-intervention group). The language-experience group exhibited the expected decline in FFR pitch-tracking accuracy for the Mandarin tone, while the music-intervention group did not. Our results support both hypotheses and demonstrate that both language and music experiences alter infants' speech encoding.
Affiliation(s)
- Tian Christina Zhao
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, United States
- Fernando Llanos
- Department of Linguistics, University of Texas at Austin, Austin, TX, United States
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA, United States
- Patricia K. Kuhl
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, United States
10
Llanos F, Nike Gnanateja G, Chandrasekaran B. Principal component decomposition of acoustic and neural representations of time-varying pitch reveals adaptive efficient coding of speech covariation patterns. Brain Lang 2022; 230:105122. PMID: 35460953; PMCID: PMC9934908; DOI: 10.1016/j.bandl.2022.105122.
Abstract
Understanding the effects of statistical regularities on speech processing is a central issue in auditory neuroscience. To investigate the effects of distributional covariance on the neural processing of speech features, we introduce and validate a novel approach: decomposition of time-varying signals into patterns of covariation extracted with Principal Component Analysis. We used this decomposition to assay the sensory representation of pitch covariation patterns in native Chinese listeners and non-native learners of Mandarin Chinese tones. Sensory representations were examined using the frequency-following response, a far-field potential that reflects phase-locked activity from neural ensembles along the auditory pathway. We found a more efficient representation of the covariation patterns that accounted for more redundancy in the form of distributional covariance. Notably, long-term language and short-term training experiences enhanced the sensory representation of these covariation patterns.
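A minimal sketch of the decomposition approach the abstract describes, projecting time-varying pitch contours onto patterns of covariation extracted with Principal Component Analysis, might look as follows; the rising and falling f0 prototypes and noise levels are synthetic stand-ins, not the study's stimuli.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 100)

# Two prototype f0 contours (Hz): one rising, one falling.
rising = 100 + 40 * t
falling = 140 - 40 * t

# 30 noisy realizations of each prototype; each row is one pitch track.
tracks = np.vstack([proto + rng.normal(scale=3.0, size=(30, t.size))
                    for proto in (rising, falling)])

pca = PCA(n_components=2).fit(tracks)
scores = pca.transform(tracks)  # each contour as a point in covariation space

# The dominant covariation pattern should capture the rise/fall contrast.
print(scores.shape, pca.explained_variance_ratio_[0] > 0.5)
```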
Affiliation(s)
- Fernando Llanos
- Department of Linguistics, The University of Texas at Austin, Austin, TX 78712, USA
- G Nike Gnanateja
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15260, USA
11
Jeng FC, Lin TH, Hart BN, Montgomery-Reagan K, McDonald K. Non-negative matrix factorization improves the efficiency of recording frequency-following responses in normal-hearing adults and neonates. Int J Audiol 2022:1-11. PMID: 35522832; DOI: 10.1080/14992027.2022.2071345.
Abstract
OBJECTIVE: One challenge in extracting the scalp-recorded frequency-following response (FFR) is its inherently small amplitude, which means that the response cannot be identified with confidence when only a relatively small number of recording sweeps are included in the averaging procedure.
DESIGN: This study examined how the non-negative matrix factorisation (NMF) algorithm with a source separation constraint could be applied to improve the efficiency of FFR recordings. Conventional FFRs elicited by an English vowel /i/ with a rising frequency contour were collected.
STUDY SAMPLE: Fifteen normal-hearing adults and 15 normal-hearing neonates were recruited.
RESULTS: Improvements in the FFR recordings, defined as the correlation coefficient and root-mean-square differences across a sweep series of amplitude spectrograms before and after application of the source separation NMF (SSNMF) algorithm, were characterised through an exponential curve-fitting model. Analysis of variance indicated that the SSNMF algorithm enhanced the FFRs recorded in both groups of participants.
CONCLUSIONS: Such improvements enabled FFR extraction from a relatively small number of recording sweeps, and opened a new window to better understand how speech sounds are processed in the human brain.
Affiliation(s)
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Tzu-Hao Lin
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Breanna N Hart
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
- Kalyn McDonald
- Communication Sciences and Disorders, Ohio University, Athens, OH, USA
12
Llanos F, Zhao TC, Kuhl PK, Chandrasekaran B. The emergence of idiosyncratic patterns in the frequency-following response during the first year of life. JASA Express Lett 2022; 2:054401. PMID: 35578694; PMCID: PMC9096806; DOI: 10.1121/10.0010493.
Abstract
The frequency-following response (FFR) is a scalp-recorded signal that reflects phase-locked activity from neurons across the auditory system. In addition to capturing information about sounds, the FFR conveys biometric information, reflecting individual differences in auditory processing. To investigate the development of FFR biometric patterns, we trained a pattern recognition model to recognize infants (N = 16) from FFRs collected at 7 and 11 months. Model recognition scores were used to index the robustness of FFR biometric patterns at each age. Results showed better recognition scores at 11 months, demonstrating the emergence of robust idiosyncratic FFR patterns during the first year of life.
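A toy illustration of indexing biometric robustness with a recognition score, classifying which individual a held-out response came from, might look like the following; the subject templates, trial counts, and nearest-centroid classifier are assumptions, not the study's pattern recognition model.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(3)
n_subjects, n_trials, n_feats = 16, 10, 20

# Each "infant" gets an idiosyncratic template; trials are noisy copies.
templates = rng.normal(size=(n_subjects, n_feats))
X = (np.repeat(templates, n_trials, axis=0)
     + rng.normal(scale=0.5, size=(n_subjects * n_trials, n_feats)))
y = np.repeat(np.arange(n_subjects), n_trials)

# Hold out the last trial of every subject and train on the rest.
test_mask = np.tile(np.arange(n_trials) == n_trials - 1, n_subjects)
clf = NearestCentroid().fit(X[~test_mask], y[~test_mask])

# Recognition score: fraction of held-out trials assigned to the right subject.
recognition_score = clf.score(X[test_mask], y[test_mask])
print(recognition_score)
```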
Affiliation(s)
- Fernando Llanos
- Department of Linguistics, University of Texas at Austin, Austin, Texas 78712, USA
- T Christina Zhao
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington 98195, USA
- Patricia K Kuhl
- Institute for Learning and Brain Sciences, University of Washington, Seattle, Washington 98195, USA
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
13
Cheng FY, Xu C, Gold L, Smith S. Rapid Enhancement of Subcortical Neural Responses to Sine-Wave Speech. Front Neurosci 2022; 15:747303. PMID: 34987356; PMCID: PMC8721138; DOI: 10.3389/fnins.2021.747303.
Abstract
The efferent auditory nervous system may be a potent force in shaping how the brain responds to behaviorally significant sounds. Previous human experiments using the frequency following response (FFR) have shown efferent-induced modulation of subcortical auditory function online and over short- and long-term time scales; however, a contemporary understanding of FFR generation raises new questions about whether previous effects were constrained solely to the auditory subcortex. The present experiment used sine-wave speech (SWS), an acoustically sparse stimulus in which dynamic pure tones represent speech formant contours, to evoke FFRSWS. Because of the higher stimulus frequencies used in SWS, this approach biased neural responses toward brainstem generators and allowed three stimuli (/bɔ/, /bu/, and /bo/) to be used to evoke FFRSWS before and after listeners in a training group were made aware that they were hearing a degraded speech stimulus. All SWS stimuli were rapidly perceived as speech when presented with a SWS carrier phrase, and average token identification reached ceiling performance during a perceptual training phase. Compared to a control group that remained naïve throughout the experiment, training-group FFRSWS amplitudes were enhanced post-training for each stimulus. Further, linear support vector machine classification of training-group FFRSWS improved significantly post-training compared to the control group, indicating that training-induced neural enhancements were sufficient to bolster machine learning classification accuracy. These results suggest that the efferent auditory system may rapidly modulate auditory brainstem representation of sounds depending on their context and perception as non-speech or speech.
Affiliation(s)
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Lisa Gold
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States

14
Gnanateja GN, Rupp K, Llanos F, Remick M, Pernia M, Sadagopan S, Teichert T, Abel TJ, Chandrasekaran B. Frequency-Following Responses to Speech Sounds Are Highly Conserved across Species and Contain Cortical Contributions. eNeuro 2021; 8:ENEURO.0451-21.2021. [PMID: 34799409 PMCID: PMC8704423 DOI: 10.1523/eneuro.0451-21.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Accepted: 11/02/2021] [Indexed: 11/21/2022] Open
Abstract
Time-varying pitch is a vital cue for human speech perception. Neural processing of time-varying pitch has been extensively assayed using scalp-recorded frequency-following responses (FFRs), an electrophysiological signal thought to reflect integrated phase-locked neural ensemble activity from subcortical auditory areas. Emerging evidence increasingly points to a putative contribution of auditory cortical ensembles to the scalp-recorded FFRs. However, the properties of cortical FFRs and precise characterization of laminar sources are still unclear. Here we used direct human intracortical recordings as well as extracranial and intracranial recordings from macaques and guinea pigs to characterize the properties of cortical sources of FFRs to time-varying pitch patterns. We found robust FFRs in the auditory cortex across all species. We leveraged representational similarity analysis as a translational bridge to characterize similarities between the human and animal models. Laminar recordings in animal models showed FFRs emerging primarily from the thalamorecipient layers of the auditory cortex. FFRs arising from these cortical sources significantly contributed to the scalp-recorded FFRs via volume conduction. Our research paves the way for a wide array of studies to investigate the role of cortical FFRs in auditory perception and plasticity.
Affiliation(s)
- G Nike Gnanateja
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Kyle Rupp
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Fernando Llanos
- Department of Linguistics, The University of Texas at Austin, Austin, Texas 78712
- Madison Remick
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Marianny Pernia
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Srivatsun Sadagopan
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Tobias Teichert
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Taylor J Abel
- Department of Neurological Surgery, UPMC Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Center for Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania 15261

15
Llanos F, McHaney JR, Schuerman WL, Yi HG, Leonard MK, Chandrasekaran B. Non-invasive peripheral nerve stimulation selectively enhances speech category learning in adults. NPJ SCIENCE OF LEARNING 2020; 5:12. [PMID: 32802406 PMCID: PMC7410845 DOI: 10.1038/s41539-020-0070-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/26/2019] [Accepted: 06/05/2020] [Indexed: 05/30/2023]
Abstract
Adults struggle to learn non-native speech contrasts even after years of exposure. While laboratory-based training approaches yield learning, the optimal training conditions for maximizing speech learning in adulthood are currently unknown. Vagus nerve stimulation has been shown to prime adult sensory-perceptual systems towards plasticity in animal models. Precise temporal pairing with auditory stimuli can enhance auditory cortical representations with a high degree of specificity. Here, we examined whether sub-perceptual threshold transcutaneous vagus nerve stimulation (tVNS), paired with non-native speech sounds, enhances speech category learning in adults. Twenty-four native English speakers were trained to identify non-native Mandarin tone categories. Across two groups, tVNS was paired with the tone categories that were easier- or harder-to-learn. A control group received no stimulation but followed an identical thresholding procedure as the intervention groups. We found that tVNS robustly enhanced speech category learning and retention of correct stimulus-response associations, but only when stimulation was paired with the easier-to-learn categories. This effect emerged rapidly, generalized to new exemplars, and was qualitatively different from the normal individual variability observed in hundreds of learners who have performed in the same task without stimulation. Electroencephalography recorded before and after training indicated no evidence of tVNS-induced changes in the sensory representation of auditory stimuli. These results suggest that paired tVNS induces a temporally precise neuromodulatory signal that selectively enhances the perception and memory consolidation of perceptually salient categories.
Affiliation(s)
- Fernando Llanos
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Jacie R. McHaney
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260 USA
- William L. Schuerman
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143 USA
- Han G. Yi
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143 USA
- Matthew K. Leonard
- Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143 USA
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260 USA

16
Llanos F, Xie Z, Chandrasekaran B. Biometric identification of listener identity from frequency following responses to speech. J Neural Eng 2019; 16:056004. [PMID: 31039552 DOI: 10.1088/1741-2552/ab1e01] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
OBJECTIVE We investigate the biometric specificity of the frequency following response (FFR), an EEG marker of early auditory processing that reflects phase-locked activity from neural ensembles in the auditory cortex and subcortex (Chandrasekaran and Kraus 2010, Bidelman, 2015a, 2018, Coffey et al 2017b). Our objective is two-fold: to demonstrate that the FFR contains information beyond stimulus properties and broad group-level markers, and to assess the practical viability of the FFR as a biometric across different sounds, auditory experiences, and recording days. APPROACH We trained a hidden Markov model (HMM) to decode listener identity from FFR spectro-temporal patterns across multiple frequency bands. Our dataset included FFRs from twenty native speakers of English or Mandarin Chinese (10 per group) listening to Mandarin Chinese tones across three EEG sessions separated by days. We decoded subject identity within the same auditory context (same tone and session) and across different stimuli and recording sessions. MAIN RESULTS The HMM decoded listeners for averaging sizes as small as a single FFR. However, model performance improved for larger averaging sizes (e.g. 25 FFRs), similarity in auditory context (same tone and day), and lack of familiarity with the sounds (i.e. native English relative to native Chinese listeners). Our results also revealed important biometric contributions from frequency bands in the cortical and subcortical EEG. SIGNIFICANCE Our study provides the first deep and systematic biometric characterization of the FFR and provides the basis for biometric identification systems incorporating this neural signal.
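The study's decoder is an HMM over spectro-temporal FFR features. As a much-simplified illustration of the underlying idea, that averaged FFRs carry listener-specific structure usable for identification, here is a correlation-based template matcher on synthetic data. This is a stand-in method, not the paper's HMM, and every signal parameter below is invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, dur, n_subj, n_trials = 1000, 0.25, 5, 40
t = np.arange(int(fs * dur)) / fs

# Each simulated listener gets a distinctive "signature" waveform (hypothetical)
signatures = [np.sin(2 * np.pi * (100 + 15 * s) * t) for s in range(n_subj)]

def simulate_ffrs(sig, n):
    """Return n noisy single-trial responses built from one signature."""
    return np.stack([sig + rng.normal(0, 1.5, t.size) for _ in range(n)])

# Enrollment: average many trials per listener into one template
enroll = [simulate_ffrs(sig, n_trials) for sig in signatures]
templates = np.stack([trials.mean(axis=0) for trials in enroll])

# Identification: assign each held-out single trial to the best-correlated template
correct, probes_per_subj = 0, 20
for s, sig in enumerate(signatures):
    for probe in simulate_ffrs(sig, probes_per_subj):
        r = [np.corrcoef(probe, tmpl)[0, 1] for tmpl in templates]
        correct += int(np.argmax(r) == s)
acc = correct / (n_subj * probes_per_subj)
print(f"single-trial identification accuracy: {acc:.2f}")
```

The paper's HMM additionally models how FFR spectral content evolves over time within a response, which is what lets it generalize across stimuli and sessions.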
Affiliation(s)
- Fernando Llanos
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, United States of America
17
Xie Z, Reetzke R, Chandrasekaran B. Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:587-601. [PMID: 30950746 PMCID: PMC6802895 DOI: 10.1044/2018_jslhr-s-astm-18-0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/18/2018] [Revised: 10/28/2018] [Accepted: 11/26/2018] [Indexed: 05/27/2023]
Abstract
Purpose Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based approaches to analyzing speech-evoked neurophysiological responses. Method Two categories of ML-based approaches are introduced: decoding models, which generate a speech stimulus output using the features from the neurophysiological responses, and encoding models, which use speech stimulus features to predict neurophysiological responses. In this review, we focus on (a) a decoding model classification approach, wherein speech-evoked neurophysiological responses are classified as belonging to 1 of a finite set of possible speech events (e.g., phonological categories), and (b) an encoding model temporal response function approach, which quantifies the transformation of a speech stimulus feature to continuous neural activity. Results We illustrate the utility of the classification approach to analyze early electroencephalographic (EEG) responses to Mandarin lexical tone categories from a traditional experimental design, and to classify EEG responses to English phonemes evoked by natural continuous speech (i.e., an audiobook) into phonological categories (plosive, fricative, nasal, and vowel). We also demonstrate the utility of temporal response function to predict EEG responses to natural continuous speech from acoustic features. Neural metrics from the 3 examples all exhibit statistically significant effects at the individual level. Conclusion We propose that ML-based approaches can complement traditional analysis approaches to analyze neurophysiological responses to speech signals and provide a deeper understanding of natural speech and language processing using ecologically valid paradigms in both typical and clinical populations.
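The temporal response function (TRF) approach described here, in which time-lagged stimulus features predict continuous neural activity, can be sketched with ridge regression on simulated data. The envelope, lag window, and noise level below are arbitrary choices for the illustration, not the paper's settings or a specific toolbox's implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
fs, n = 100, 2000             # hypothetical sampling rate (Hz) and signal length
stim = rng.normal(size=n)     # stand-in for a continuous acoustic feature (e.g. envelope)

# Simulated "EEG": the response integrates the stimulus over 0-300 ms lags
true_trf = np.hanning(int(0.3 * fs))
eeg = np.convolve(stim, true_trf, mode="full")[:n] + rng.normal(0, 0.5, n)

# Encoding model: lagged design matrix (0-300 ms) fit with ridge regression
lags = np.arange(int(0.3 * fs))
X = np.stack([np.roll(stim, lag) for lag in lags], axis=1)
for i, lag in enumerate(lags):
    X[:lag, i] = 0.0          # zero out samples wrapped around by np.roll
model = Ridge(alpha=1.0).fit(X, eeg)

# Model quality is typically summarized as prediction correlation
pred = model.predict(X)
r = np.corrcoef(pred, eeg)[0, 1]
print(f"prediction correlation r = {r:.2f}")
```

The fitted coefficients (`model.coef_`) play the role of the TRF: one weight per stimulus lag, describing how the neural response tracks the feature over time. In practice the prediction correlation would be computed on held-out data rather than the training segment used here.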
Affiliation(s)
- Zilong Xie
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Rachel Reetzke
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh

18
Tracing the Trajectory of Sensory Plasticity across Different Stages of Speech Learning in Adulthood. Curr Biol 2018; 28:1419-1427.e4. [PMID: 29681473 DOI: 10.1016/j.cub.2018.03.026] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2017] [Revised: 01/17/2018] [Accepted: 03/14/2018] [Indexed: 12/11/2022]
Abstract
Although challenging, adults can learn non-native phonetic contrasts with extensive training [1, 2], indicative of perceptual learning beyond an early sensitivity period [3, 4]. Training can alter low-level sensory encoding of newly acquired speech sound patterns [5]; however, the time-course, behavioral relevance, and long-term retention of such sensory plasticity are unclear. Some theories argue that sensory plasticity underlying signal enhancement is immediate and critical to perceptual learning [6, 7]. Others, like the reverse hierarchy theory (RHT), posit a slower time-course for sensory plasticity [8]. RHT proposes that higher-level categorical representations guide immediate, novice learning, while lower-level sensory changes do not emerge until expert stages of learning [9]. We trained 20 English-speaking adults to categorize a non-native phonetic contrast (Mandarin lexical tones) using a criterion-dependent sound-to-category training paradigm. Sensory and perceptual indices were assayed across operationally defined learning phases (novice, experienced, over-trained, and 8-week retention) by measuring the frequency-following response, a neurophonic potential that reflects fidelity of sensory encoding, and the perceptual identification of a tone continuum. Our results demonstrate that while robust changes in sensory encoding and perceptual identification of Mandarin tones emerged with training and were retained, such changes followed different timescales. Sensory changes were evidenced and related to behavioral performance only when participants were over-trained. In contrast, changes in perceptual identification reflecting improvement in categorical percept emerged relatively earlier. Individual differences in perceptual identification, and not sensory encoding, related to faster learning. Our findings support the RHT: sensory plasticity accompanies, rather than drives, expert levels of non-native speech learning.