1
|
Hong Y, Ryun S, Chung CK. Evoking artificial speech perception through invasive brain stimulation for brain-computer interfaces: current challenges and future perspectives. Front Neurosci 2024; 18:1428256. [PMID: 38988764 PMCID: PMC11234843 DOI: 10.3389/fnins.2024.1428256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2024] [Accepted: 06/10/2024] [Indexed: 07/12/2024] Open
Abstract
Encoding artificial perceptions through brain stimulation, especially that of higher cognitive functions such as speech perception, is one of the most formidable challenges in brain-computer interfaces (BCI). Brain stimulation has been used for functional mapping in clinical practices for the last 70 years to treat various disorders affecting the nervous system, including epilepsy, Parkinson's disease, essential tremors, and dystonia. Recently, direct electrical stimulation has been used to evoke various forms of perception in humans, ranging from sensorimotor, auditory, and visual to speech cognition. Successfully evoking and fine-tuning artificial perceptions could revolutionize communication for individuals with speech disorders and significantly enhance the capabilities of brain-computer interface technologies. However, despite the extensive literature on encoding various perceptions and the rising popularity of speech BCIs, inducing artificial speech perception is still largely unexplored, and its potential has yet to be determined. In this paper, we examine the various stimulation techniques used to evoke complex percepts and the target brain areas for the input of speech-like information. Finally, we discuss strategies to address the challenges of speech encoding and discuss the prospects of these approaches.
Collapse
Affiliation(s)
- Yirye Hong
- Department of Brain and Cognitive Sciences, College of Natural Sciences, Seoul National University, Seoul, Republic of Korea
| | - Seokyun Ryun
- Neuroscience Research Institute, Seoul National University Medical Research Center, Seoul, Republic of Korea
| | - Chun Kee Chung
- Neuroscience Research Institute, Seoul National University Medical Research Center, Seoul, Republic of Korea
| |
Collapse
|
2
|
Mohn JL, Baese-Berk MM, Jaramillo S. Selectivity to acoustic features of human speech in the auditory cortex of the mouse. Hear Res 2024; 441:108920. [PMID: 38029503 PMCID: PMC10787375 DOI: 10.1016/j.heares.2023.108920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/21/2023] [Revised: 10/29/2023] [Accepted: 11/20/2023] [Indexed: 12/01/2023]
Abstract
A better understanding of the neural mechanisms of speech processing can have a major impact in the development of strategies for language learning and in addressing disorders that affect speech comprehension. Technical limitations in research with human subjects hinder a comprehensive exploration of these processes, making animal models essential for advancing the characterization of how neural circuits make speech perception possible. Here, we investigated the mouse as a model organism for studying speech processing and explored whether distinct regions of the mouse auditory cortex are sensitive to specific acoustic features of speech. We found that mice can learn to categorize frequency-shifted human speech sounds based on differences in formant transitions (FT) and voice onset time (VOT). Moreover, neurons across various auditory cortical regions were selective to these speech features, with a higher proportion of speech-selective neurons in the dorso-posterior region. Last, many of these neurons displayed mixed-selectivity for both features, an attribute that was most common in dorsal regions of the auditory cortex. Our results demonstrate that the mouse serves as a valuable model for studying the detailed mechanisms of speech feature encoding and neural plasticity during speech-sound learning.
Collapse
Affiliation(s)
- Jennifer L Mohn
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403, United States of America
| | - Melissa M Baese-Berk
- Department of Linguistics, University of Oregon, Eugene, OR 97403, United States of America; Department of Linguistics, University of Chicago, Chicago, IL 60637, United States of America(1)
| | - Santiago Jaramillo
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403, United States of America.
| |
Collapse
|
3
|
Mohn JL, Baese-Berk MM, Jaramillo S. Selectivity to acoustic features of human speech in the auditory cortex of the mouse. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.20.558699. [PMID: 37790479 PMCID: PMC10542132 DOI: 10.1101/2023.09.20.558699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/05/2023]
Abstract
A better understanding of the neural mechanisms of speech processing can have a major impact in the development of strategies for language learning and in addressing disorders that affect speech comprehension. Technical limitations in research with human subjects hinder a comprehensive exploration of these processes, making animal models essential for advancing the characterization of how neural circuits make speech perception possible. Here, we investigated the mouse as a model organism for studying speech processing and explored whether distinct regions of the mouse auditory cortex are sensitive to specific acoustic features of speech. We found that mice can learn to categorize frequency-shifted human speech sounds based on differences in formant transitions (FT) and voice onset time (VOT). Moreover, neurons across various auditory cortical regions were selective to these speech features, with a higher proportion of speech-selective neurons in the dorso-posterior region. Last, many of these neurons displayed mixed-selectivity for both features, an attribute that was most common in dorsal regions of the auditory cortex. Our results demonstrate that the mouse serves as a valuable model for studying the detailed mechanisms of speech feature encoding and neural plasticity during speech-sound learning.
Collapse
Affiliation(s)
- Jennifer L. Mohn
- Institute of Neuroscience, University of Oregon. Eugene, OR 97403
| | | | | |
Collapse
|
4
|
Franken MK, Liu BC, Ostry DJ. Towards a somatosensory theory of speech perception. J Neurophysiol 2022; 128:1683-1695. [PMID: 36416451 PMCID: PMC9762980 DOI: 10.1152/jn.00381.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2022] [Revised: 11/19/2022] [Accepted: 11/19/2022] [Indexed: 11/24/2022] Open
Abstract
Speech perception is known to be a multimodal process, relying not only on auditory input but also on the visual system and possibly on the motor system as well. To date there has been little work on the potential involvement of the somatosensory system in speech perception. In the present review, we identify the somatosensory system as another contributor to speech perception. First, we argue that evidence in favor of a motor contribution to speech perception can just as easily be interpreted as showing somatosensory involvement. Second, physiological and neuroanatomical evidence for auditory-somatosensory interactions across the auditory hierarchy indicates the availability of a neural infrastructure that supports somatosensory involvement in auditory processing in general. Third, there is accumulating evidence for somatosensory involvement in the context of speech specifically. In particular, tactile stimulation modifies speech perception, and speech auditory input elicits activity in somatosensory cortical areas. Moreover, speech sounds can be decoded from activity in somatosensory cortex; lesions to this region affect perception, and vowels can be identified based on somatic input alone. We suggest that the somatosensory involvement in speech perception derives from the somatosensory-auditory pairing that occurs during speech production and learning. By bringing together findings from a set of studies that have not been previously linked, the present article identifies the somatosensory system as a presently unrecognized contributor to speech perception.
Collapse
Affiliation(s)
| | | | - David J Ostry
- McGill University, Montreal, Quebec, Canada
- Haskins Laboratories, New Haven, Connecticut
| |
Collapse
|
5
|
Baese-Berk MM, Chandrasekaran B, Roark CL. The nature of non-native speech sound representations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:3025. [PMID: 36456300 PMCID: PMC9671621 DOI: 10.1121/10.0015230] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 10/20/2022] [Accepted: 11/01/2022] [Indexed: 05/23/2023]
Abstract
Most current theories and models of second language speech perception are grounded in the notion that learners acquire speech sound categories in their target language. In this paper, this classic idea in speech perception is revisited, given that clear evidence for formation of such categories is lacking in previous research. To understand the debate on the nature of speech sound representations in a second language, an operational definition of "category" is presented, and the issues of categorical perception and current theories of second language learning are reviewed. Following this, behavioral and neuroimaging evidence for and against acquisition of categorical representations is described. Finally, recommendations for future work are discussed. The paper concludes with a recommendation for integration of behavioral and neuroimaging work and theory in this area.
Collapse
Affiliation(s)
| | - Bharath Chandrasekaran
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
| | - Casey L Roark
- Department of Communication Sciences and Disorders, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
| |
Collapse
|
6
|
Li Z, Hong B, Wang D, Nolte G, Engel AK, Zhang D. Speaker-listener neural coupling reveals a right-lateralized mechanism for non-native speech-in-noise comprehension. Cereb Cortex 2022; 33:3701-3714. [PMID: 35975617 DOI: 10.1093/cercor/bhac302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 11/14/2022] Open
Abstract
While the increasingly globalized world has brought more and more demands for non-native language communication, the prevalence of background noise in everyday life poses a great challenge to non-native speech comprehension. The present study employed an interbrain approach based on functional near-infrared spectroscopy (fNIRS) to explore how people adapt to comprehend non-native speech information in noise. A group of Korean participants who acquired Chinese as their non-native language was invited to listen to Chinese narratives at 4 noise levels (no noise, 2 dB, -6 dB, and - 9 dB). These narratives were real-life stories spoken by native Chinese speakers. Processing of the non-native speech was associated with significant fNIRS-based listener-speaker neural couplings mainly over the right hemisphere at both the listener's and the speaker's sides. More importantly, the neural couplings from the listener's right superior temporal gyrus, the right middle temporal gyrus, as well as the right postcentral gyrus were found to be positively correlated with their individual comprehension performance at the strongest noise level (-9 dB). These results provide interbrain evidence in support of the right-lateralized mechanism for non-native speech processing and suggest that both an auditory-based and a sensorimotor-based mechanism contributed to the non-native speech-in-noise comprehension.
Collapse
Affiliation(s)
- Zhuoran Li
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China.,Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
| | - Bo Hong
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China.,Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
| | - Daifa Wang
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
| | - Guido Nolte
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, 20246 Hamburg, Germany
| | - Andreas K Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg Eppendorf, 20246 Hamburg, Germany
| | - Dan Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China.,Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
| |
Collapse
|
7
|
Abstract
The human brain exhibits the remarkable ability to categorize speech sounds into distinct, meaningful percepts, even in challenging tasks like learning non-native speech categories in adulthood and hearing speech in noisy listening conditions. In these scenarios, there is substantial variability in perception and behavior, both across individual listeners and individual trials. While there has been extensive work characterizing stimulus-related and contextual factors that contribute to variability, recent advances in neuroscience are beginning to shed light on another potential source of variability that has not been explored in speech processing. Specifically, there are task-independent, moment-to-moment variations in neural activity in broadly-distributed cortical and subcortical networks that affect how a stimulus is perceived on a trial-by-trial basis. In this review, we discuss factors that affect speech sound learning and moment-to-moment variability in perception, particularly arousal states—neurotransmitter-dependent modulations of cortical activity. We propose that a more complete model of speech perception and learning should incorporate subcortically-mediated arousal states that alter behavior in ways that are distinct from, yet complementary to, top-down cognitive modulations. Finally, we discuss a novel neuromodulation technique, transcutaneous auricular vagus nerve stimulation (taVNS), which is particularly well-suited to investigating causal relationships between arousal mechanisms and performance in a variety of perceptual tasks. Together, these approaches provide novel testable hypotheses for explaining variability in classically challenging tasks, including non-native speech sound learning.
Collapse
|