1. Ashokumar M, Schwartz JL, Ito T. Changes in Speech Production Following Perceptual Training With Orofacial Somatosensory Inputs. J Speech Lang Hear Res 2024;67:3962-3973. PMID: 38497731. DOI: 10.1044/2023_jslhr-23-00249.
Abstract
PURPOSE Orofacial somatosensory inputs play an important role in speech motor control and speech learning. Since receiving specific auditory-somatosensory inputs during speech perceptual training alters speech perception, similar perceptual training could also alter speech production. We examined whether production performance changed following perceptual training with orofacial somatosensory inputs. METHOD We focused on the French vowels /e/ and /ø/, which contrast articulatorily in horizontal gestures. Perceptual training consisted of a vowel identification task contrasting /e/ and /ø/. During training, somatosensory stimulation was applied to the first group of participants as facial skin stretch in the backward direction. We recorded the target vowels uttered by the participants before and after the perceptual training and compared their F1, F2, and F3 formants. We also tested a control group with no somatosensory stimulation and another somatosensory group trained on a different vowel continuum (/e/-/i/). RESULTS Perceptual training with somatosensory stimulation induced changes in F2 and F3 of the produced vowels. F2 decreased consistently in the two somatosensory groups. F3 increased following the /e/-/ø/ training and decreased following the /e/-/i/ training. The F2 change was significantly correlated with the perceptual shift between the first and second halves of the training phase in the somatosensory group with the /e/-/ø/ training, but not with the /e/-/i/ training. The control group showed no effect on F2 and F3, and only a tendency toward an F1 increase. CONCLUSION The results suggest that somatosensory inputs associated with speech sound inputs can play a role in speech training and learning in both production and perception.
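As a rough illustration of the acoustic measurements this study rests on, the sketch below extracts F1-F3 at a vowel's temporal midpoint before and after training; it assumes the praat-parselmouth Python library, and the file names and single midpoint measurement are placeholder choices, not the authors' pipeline.

```python
# Minimal sketch: compare F1-F3 of a vowel recorded before vs. after training.
# Assumes praat-parselmouth is installed; the .wav file names are hypothetical.
import parselmouth

def mid_vowel_formants(wav_path, n_formants=3):
    """Return F1..Fn in Hz, measured at the temporal midpoint of the recording."""
    snd = parselmouth.Sound(wav_path)
    formants = snd.to_formant_burg()
    t_mid = snd.duration / 2
    return [formants.get_value_at_time(i + 1, t_mid) for i in range(n_formants)]

pre = mid_vowel_formants("e_pre_training.wav")
post = mid_vowel_formants("e_post_training.wav")
for i, (before, after) in enumerate(zip(pre, post), start=1):
    print(f"F{i}: {before:.0f} Hz -> {after:.0f} Hz (change {after - before:+.0f} Hz)")
```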
Affiliation(s)
- Takayuki Ito
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, France
2. Kim KS, Gaines JL, Parrell B, Ramanarayanan V, Nagarajan SS, Houde JF. Mechanisms of sensorimotor adaptation in a hierarchical state feedback control model of speech. PLoS Comput Biol 2023;19:e1011244. PMID: 37506120. PMCID: PMC10434967. DOI: 10.1371/journal.pcbi.1011244.
Abstract
Upon perceiving sensory errors during movements, the human sensorimotor system updates future movements to compensate for the errors, a phenomenon called sensorimotor adaptation. One component of this adaptation is thought to be driven by sensory prediction errors, that is, discrepancies between predicted and actual sensory feedback. However, the mechanisms by which prediction errors drive adaptation remain unclear. Here, auditory prediction error-based mechanisms involved in speech auditory-motor adaptation were examined via the Feedback Aware Control of Tasks in Speech (FACTS) model. Consistent with theoretical perspectives in both non-speech and speech motor control, the hierarchical architecture of FACTS relies on both higher-level task (vocal tract constriction) and lower-level articulatory state representations. Importantly, FACTS also computes sensory prediction errors as part of its state feedback control mechanism, a well-established framework in the field of motor control. We explored potential adaptation mechanisms and found that adaptive behavior was present only when prediction errors updated the articulatory-to-task state transformation. In contrast, designs in which prediction errors updated only the forward sensory prediction models did not generate adaptation. Thus, FACTS demonstrated that 1) prediction errors can drive adaptation through task-level updates, and 2) adaptation is likely driven by updates to task-level control rather than (only) to forward predictive models. Additionally, simulating adaptation with FACTS generated a number of important hypotheses regarding previously reported phenomena, such as the source(s) of incomplete adaptation and the factor(s) driving changes in the second formant frequency during adaptation to a first formant perturbation. The proposed model design paves the way for a hierarchical state feedback control framework to be examined in the context of sensorimotor adaptation in both speech and non-speech effector systems.
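The FACTS implementation itself is not reproduced here, but a deliberately minimal scalar caricature can convey the paper's central contrast: adaptation emerges when the auditory prediction error revises the articulatory-to-task mapping used for control, and not when it only sharpens a separate forward predictor. All targets, gains, and learning rates below are invented.

```python
# Toy caricature (not the FACTS model): a scalar "formant" task under a
# constant +100 Hz shift applied to auditory feedback.
target = 500.0   # intended task value (Hz)
perturb = 100.0  # feedback shift (Hz)
lr = 0.2         # arbitrary learning rate

def simulate(update_control_map, n_trials=60):
    t_hat = 1.0  # estimated articulatory-to-task gain used to plan movements
    f_hat = 1.0  # gain used only by the forward sensory predictor
    for _ in range(n_trials):
        a = target / t_hat       # articulation planned to hit the target
        heard = a + perturb      # true mapping is identity; feedback is shifted
        if update_control_map:
            err = heard - t_hat * a  # prediction shares the control mapping
            t_hat += lr * err / a    # error revises the mapping -> adaptation
        else:
            err = heard - f_hat * a  # prediction comes from a separate model
            f_hat += lr * err / a    # error only improves the prediction
    return a

print("update control mapping:", round(simulate(True), 1))   # drifts toward 400 Hz
print("update predictor only:", round(simulate(False), 1))   # stays at 500 Hz
```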
Affiliation(s)
- Kwang S. Kim
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Jessica L. Gaines
- Graduate Program in Bioengineering, University of California Berkeley-University of California San Francisco, San Francisco, California, United States of America
- Benjamin Parrell
- Department of Communication Sciences and Disorders, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Vikram Ramanarayanan
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California, United States of America
- Modality.AI, San Francisco, California, United States of America
- Srikantan S. Nagarajan
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California, United States of America
- John F. Houde
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California, United States of America
3. Ménard L, Beaudry L, Perrier P. Effects of somatosensory perturbation on the perception of French /u/. JASA Express Lett 2023;3:2887654. PMID: 37125874. DOI: 10.1121/10.0017933.
Abstract
To examine whether somatosensory feedback related to articulatory configuration is involved in speech perception, 30 French-speaking adults performed a speech discrimination task whose stimuli were vowel pairs drawn from a continuum between French /u/ (a rounded vowel requiring a small lip area) and /œ/ (a rounded vowel associated with a larger lip area). Listeners performed the test in two conditions: with a 2-cm-diameter lip tube in place (mimicking /œ/) and without the lip tube (neutral lip position). Results show that, in the lip-tube condition, listeners perceived more stimuli as /œ/, in line with the proposal that an auditory-somatosensory interaction exists.
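A standard way to quantify such a boundary shift is to fit a logistic psychometric function to the identification proportions in each condition; the sketch below does this with SciPy on made-up response proportions, not the study's data.

```python
# Sketch: locate the /u/-/œ/ category boundary in each condition by fitting
# a logistic psychometric function; the proportions below are invented.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

steps = np.arange(1, 8)  # a hypothetical 7-step /u/-to-/œ/ continuum
p_oe_no_tube = np.array([0.02, 0.05, 0.15, 0.45, 0.80, 0.95, 0.99])
p_oe_tube    = np.array([0.05, 0.12, 0.35, 0.70, 0.92, 0.98, 0.99])  # more /œ/

(b_no_tube, _), _ = curve_fit(logistic, steps, p_oe_no_tube, p0=[4, 1])
(b_tube, _), _ = curve_fit(logistic, steps, p_oe_tube, p0=[4, 1])
print(f"boundary without tube: {b_no_tube:.2f} steps, with tube: {b_tube:.2f} steps")
```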
Affiliation(s)
- Lucie Ménard
- Laboratoire de Phonétique, Université du Québec à Montréal, Center for Research on Brain, Language, and Music, CP. 8888, succ. Centre-Ville, Montreal, Québec H3C 3P8, Canada
- Lambert Beaudry
- Laboratoire de Phonétique, Université du Québec à Montréal, Center for Research on Brain, Language, and Music, CP. 8888, succ. Centre-Ville, Montreal, Québec H3C 3P8, Canada
- Pascal Perrier
- Université Grenoble Alpes, Centre National de la Recherche Scientifique (CNRS), Grenoble Institut National Polytechnique (INP), Institute of Engineering, and GIPSA-Lab, 38000 Grenoble, France
4. Floegel M, Kasper J, Perrier P, Kell CA. How the conception of control influences our understanding of actions. Nat Rev Neurosci 2023;24:313-329. PMID: 36997716. DOI: 10.1038/s41583-023-00691-z.
Abstract
Wilful movement requires neural control. Commonly, neural computations are thought to generate motor commands that bring the musculoskeletal system - that is, the plant - from its current physical state into a desired physical state. The current state can be estimated from past motor commands and from sensory information. Modelling movement on the basis of this concept of plant control strives to explain behaviour by identifying the computational principles for control signals that can reproduce the observed features of movements. From an alternative perspective, movements emerge in a dynamically coupled agent-environment system from the pursuit of subjective perceptual goals. Modelling movement on the basis of this concept of perceptual control aims to identify the controlled percepts and their coupling rules that can give rise to the observed characteristics of behaviour. In this Perspective, we discuss a broad spectrum of approaches to modelling human motor control and their notions of control signals, internal models, handling of sensory feedback delays and learning. We focus on the influence that the plant control and the perceptual control perspective may have on decisions when modelling empirical data, which may in turn shape our understanding of actions.
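As a loose, one-dimensional illustration of the two notions of control discussed here, the toy loops below either compute commands from a comparison between desired and current plant state, or act so as to keep a percept at its reference despite a disturbance; the dynamics and gains are arbitrary assumptions.

```python
# Toy contrast between plant control and perceptual control on a 1-D system.
dt, n_steps = 0.01, 1000

# Plant control: command derived from (desired state - estimated state).
x, x_desired, k = 0.0, 1.0, 5.0
for _ in range(n_steps):
    u = k * (x_desired - x)  # motor command from a state comparison
    x += dt * u              # the plant integrates the command
print(f"plant control: state = {x:.3f} (desired 1.0)")

# Perceptual control: the controlled variable is the percept itself.
state, reference, gain, disturbance = 0.0, 1.0, 5.0, 0.3
for _ in range(n_steps):
    percept = state + disturbance               # what the agent senses
    state += dt * gain * (reference - percept)  # act to cancel perceptual error
print(f"perceptual control: percept = {state + disturbance:.3f} (reference 1.0)")
```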
Affiliation(s)
- Mareike Floegel
- Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
- Johannes Kasper
- Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
- Pascal Perrier
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Christian A Kell
- Department of Neurology and Brain Imaging Center, Goethe University Frankfurt, Frankfurt, Germany
5. Li T, Zhu X, Wu X, Gong Y, Jones JA, Liu P, Chang Y, Yan N, Chen X, Liu H. Continuous theta burst stimulation over left and right supramarginal gyri demonstrates their involvement in auditory feedback control of vocal production. Cereb Cortex 2022;33:11-22. PMID: 35174862. DOI: 10.1093/cercor/bhac049.
Abstract
The supramarginal gyrus (SMG) has been implicated in auditory-motor integration for vocal production. However, whether the SMG is bilaterally or unilaterally involved in auditory feedback control of vocal production in a causal manner remains unclear. The present event-related potential (ERP) study investigated the causal roles of the left and right SMG in auditory-vocal integration using neuronavigated continuous theta burst stimulation (c-TBS). Twenty-four young adults produced sustained vowel phonations and heard their voice unexpectedly pitch-shifted by ±200 cents after receiving active or sham c-TBS over the left or right SMG. Compared to sham stimulation, c-TBS over the left or right SMG led to significantly smaller vocal compensations for pitch perturbations, accompanied by smaller cortical P2 responses. Moreover, no significant differences were found in the vocal and ERP responses when comparing active c-TBS over the left vs. right SMG. These findings provide neurobehavioral evidence for a causal influence of both the left and right SMG on auditory feedback control of vocal production. Decreased vocal compensations paralleled by reduced P2 responses following c-TBS over the bilateral SMG support their role in auditory-motor transformation in a bottom-up manner: receiving auditory feedback information and mediating vocal compensations for feedback errors.
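Pitch perturbations and compensations in this literature are expressed in cents, a log-scale unit of frequency ratio; the short sketch below converts between Hz and cents, with all F0 values invented for illustration.

```python
# Sketch: a +200-cent feedback shift and an opposing vocal response in cents.
import math

def cents(f, ref):
    """Distance of frequency f from reference ref, in cents (1200 per octave)."""
    return 1200 * math.log2(f / ref)

baseline_f0 = 200.0                         # habitual F0 in Hz, hypothetical
heard_f0 = baseline_f0 * 2 ** (200 / 1200)  # feedback shifted up by 200 cents
response_f0 = 195.0                         # compensatory F0, hypothetical
print(f"perturbation: {cents(heard_f0, baseline_f0):+.0f} cents")
print(f"compensation: {cents(response_f0, baseline_f0):+.1f} cents (opposes the shift)")
```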
Affiliation(s)
- Tingni Li
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Xiaoxia Zhu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Xiuqin Wu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Yulai Gong
- Department of Neurological Rehabilitation, Affiliated Sichuan Provincial Rehabilitation Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 611135, China
- Jeffery A Jones
- Psychology Department and Laurier Centre for Cognitive Neuroscience, Wilfrid Laurier University, Waterloo, Ontario, N2L 3C5, Canada
- Peng Liu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Yichen Chang
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Nan Yan
- CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Xi Chen
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Hanjun Liu
- Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510080, China
- Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
6. Ashokumar M, Guichet C, Schwartz JL, Ito T. Correlation between the effect of orofacial somatosensory inputs in speech perception and speech production performance. Audit Percept Cogn 2022;6:97-107. PMID: 37260602. PMCID: PMC10229140. DOI: 10.1080/25742442.2022.2134674.
Abstract
Introduction Orofacial somatosensory inputs modify the perception of speech sounds. Such auditory-somatosensory integration likely develops alongside the acquisition of speech production. We examined whether the somatosensory effect in speech perception varies with individual characteristics of speech production. Methods The somatosensory effect in speech perception was assessed as the change in the category boundary between /e/ and /ø/ in a vowel identification test when somatosensory stimulation, facial skin deformation in the rearward direction corresponding to the articulatory movement for /e/, was applied together with the auditory input. Speech production performance was quantified by the acoustic distances between the average first, second, and third formants of /e/ and /ø/ utterances recorded in a separate test. Results The category boundary between /e/ and /ø/ was significantly shifted towards /ø/ due to the somatosensory stimulation, consistent with previous research. The amplitude of the boundary shift was significantly correlated with the acoustic distance between the mean second formants (and marginally the third formants) of /e/ and /ø/ productions, with no correlation with the first formant distance. Discussion Greater acoustic distances can be related to larger contrasts between the articulatory targets of vowels in speech production. These results suggest that the somatosensory effect in speech perception can be linked to speech production performance.
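The correlation analysis described here boils down to relating a per-speaker acoustic distance to a per-speaker boundary shift; the sketch below shows the computation with SciPy on fabricated numbers, purely to make the analysis concrete.

```python
# Sketch: per-speaker F2 distance between /e/ and /ø/ vs. perceptual boundary
# shift; all values are invented stand-ins for the measured data.
import numpy as np
from scipy.stats import pearsonr

f2_e  = np.array([2100, 2050, 2200, 1980, 2150, 2120, 2060, 2010, 2180, 2090])
f2_oe = np.array([1550, 1600, 1500, 1650, 1540, 1580, 1620, 1660, 1510, 1590])
f2_distance = f2_e - f2_oe  # larger distance = more contrasted productions

boundary_shift = np.array([0.42, 0.30, 0.55, 0.18, 0.47,
                           0.40, 0.28, 0.15, 0.52, 0.35])
r, p = pearsonr(f2_distance, boundary_shift)
print(f"r = {r:.2f}, p = {p:.3f}")
```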
Affiliation(s)
- Monica Ashokumar
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Clément Guichet
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Jean-Luc Schwartz
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Takayuki Ito
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Haskins Laboratories, New Haven, USA
7. Nault DR, Mitsuya T, Purcell DW, Munhall KG. Perturbing the consistency of auditory feedback in speech. Front Hum Neurosci 2022;16:905365. PMID: 36092651. PMCID: PMC9453207. DOI: 10.3389/fnhum.2022.905365.
Abstract
Sensory information, including auditory feedback, is used by talkers to maintain fluent speech articulation. Current models of speech motor control posit that speakers continually adjust their motor commands based on discrepancies between the sensory predictions made by a forward model and the sensory consequences of their speech movements. Here, in two within-subject design experiments, we used a real-time formant manipulation system to explore how reliant speech articulation is on the accuracy or predictability of auditory feedback information. This involved introducing random formant perturbations during vowel production that varied systematically in their spatial location in formant space (Experiment 1) and their temporal consistency (Experiment 2). Our results indicate that, on average, speakers’ responses to auditory feedback manipulations varied with the relevance and degree of the error introduced in the various feedback conditions. In Experiment 1, speakers’ average production was not reliably influenced by random perturbations, introduced on every utterance, that shifted the first (F1) and second (F2) formants to various locations of formant space with an overall average of 0 Hz. However, when the applied perturbations had a mean of +100 Hz in F1 and −125 Hz in F2, speakers demonstrated reliable compensatory responses that reflected the average magnitude of the perturbations. In Experiment 2, speakers did not significantly compensate for perturbations of varying magnitudes that were held constant for one or three trials at a time. Speakers’ average productions did, however, deviate significantly from a control condition when perturbations were held constant for six trials. Within the context of these conditions, our findings provide evidence that the control of speech movements is, at least in part, dependent upon the reliability and stability of the sensory information received over time.
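The two experimental designs reduce to different perturbation schedules in F1/F2 space; a small sketch of how such schedules could be generated follows, with the means taken from the abstract and all other numbers (trial counts, spread, seed) arbitrary.

```python
# Sketch: perturbation schedules in the spirit of the two experiments.
import numpy as np

rng = np.random.default_rng(0)
n_trials = 120  # arbitrary

# Experiment 1: a fresh random (F1, F2) offset on every utterance, either
# zero-mean or biased toward (+100 Hz, -125 Hz).
zero_mean = rng.normal(loc=[0.0, 0.0], scale=50.0, size=(n_trials, 2))
biased = rng.normal(loc=[100.0, -125.0], scale=50.0, size=(n_trials, 2))

# Experiment 2: each random magnitude is held constant for k trials in a row.
def held_schedule(k, n=n_trials):
    blocks = rng.normal(0.0, 50.0, size=int(np.ceil(n / k)))
    return np.repeat(blocks, k)[:n]

for k in (1, 3, 6):
    print(f"held for {k} trial(s):", held_schedule(k)[:6].round(1))
```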
Affiliation(s)
- Daniel R. Nault
- Department of Psychology, Queen’s University, Kingston, ON, Canada
- Takashi Mitsuya
- School of Communication Sciences and Disorders, Western University, London, ON, Canada
- National Centre for Audiology, Western University, London, ON, Canada
- David W. Purcell
- School of Communication Sciences and Disorders, Western University, London, ON, Canada
- National Centre for Audiology, Western University, London, ON, Canada
- Kevin G. Munhall
- Department of Psychology, Queen’s University, Kingston, ON, Canada
8. Oschkinat M, Hoole P. Compensation to real-time temporal auditory feedback perturbation depends on syllable position. J Acoust Soc Am 2020;148:1478. PMID: 33003874. DOI: 10.1121/10.0001765.
Abstract
Auditory feedback perturbations involving spectral shifts have indicated a crucial contribution of auditory feedback to the planning and execution of speech. Much less is known, however, about the contribution of auditory feedback to the temporal properties of speech. The current study aimed to provide insight into the representation of temporal properties of speech and the relevance of auditory feedback for speech timing. Real-time auditory feedback perturbations were applied in the temporal domain, namely, stretching and compressing of consonant-consonant-vowel (CCV) durations in onset + nucleus vs vowel-consonant-consonant (VCC) durations in nucleus + coda. Since CCV forms a gesturally more cohesive and stable structure than VCC, greater articulatory adjustments to the nucleus + coda (VCC) perturbation were expected. The results show that speakers compensate for focal temporal feedback alterations. Responses to the VCC perturbation were greater than to the CCV perturbation, suggesting less deformability of onsets when confronted with temporally perturbed auditory feedback. Further, responses to the CCV perturbation mostly reflected within-trial reactive compensation, whereas VCC compensation was more pronounced and indicative of adaptive behavior. Accordingly, planning and execution of the temporal properties of speech are indeed guided by auditory feedback, but the precise nature of the reaction to perturbations is linked to the structural position in the syllable and the associated feedforward timing strategies.
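One simple way to express responses to such temporal perturbations is a compensation index: the fraction of the perceived lengthening that the speaker undoes by shortening the produced segment. The sketch below computes this for invented durations and an invented stretch factor.

```python
# Sketch: compensation index for a temporally stretched feedback segment.
baseline_ms = 180.0   # habitual VCC (nucleus + coda) duration, hypothetical
stretch = 1.25        # feedback stretched by 25%, hypothetical
produced_ms = 165.0   # speaker shortens the segment in response, hypothetical

heard_ms = baseline_ms * stretch
compensation = (baseline_ms - produced_ms) / (heard_ms - baseline_ms)
print(f"compensation index: {compensation:.2f}  (1.0 would be full compensation)")
```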
Affiliation(s)
- Miriam Oschkinat
- Institute of Phonetics and Speech Processing, Ludwig Maximilian University of Munich, Schellingstrasse 3, Munich, 80799, Germany
- Philip Hoole
- Institute of Phonetics and Speech Processing, Ludwig Maximilian University of Munich, Schellingstrasse 3, Munich, 80799, Germany
9. Olasagasti I, Giraud AL. Integrating prediction errors at two time scales permits rapid recalibration of speech sound categories. eLife 2020;9:e44516. PMID: 32223894. PMCID: PMC7217692. DOI: 10.7554/eLife.44516.
Abstract
Speech perception presumably arises from internal models of how specific sensory features are associated with speech sounds. These features change constantly (e.g., different speakers, articulation modes, etc.), and listeners need to recalibrate their internal models by appropriately weighing new versus old evidence. Models of speech recalibration classically ignore this volatility. The effect of volatility in tasks where sensory cues were associated with arbitrary experimenter-defined categories was well described by models that continuously adapt the learning rate while keeping a single representation of the category. Using neurocomputational modelling, we show that recalibration of natural speech sound categories is better described by representing the latter at different time scales. We illustrate our proposal by modeling fast recalibration of speech sounds after experiencing the McGurk effect. We propose that working representations of speech categories are driven both by their current environment and by their long-term memory representations.

People can distinguish words or syllables even though they may sound different with every speaker. This striking ability reflects the fact that our brain is continually modifying the way we recognise and interpret the spoken word based on what we have heard before, by comparing past experience with the most recent one to update expectations. This phenomenon also occurs in the McGurk effect: an auditory illusion in which someone hears one syllable but sees a person saying another syllable and ends up perceiving a third distinct sound. Abstract models, which provide a functional rather than a mechanistic description of what the brain does, can test how humans use expectations and prior knowledge to interpret the information delivered by the senses at any given moment. Olasagasti and Giraud have now built an abstract model of how brains recalibrate perception of natural speech sounds. By fitting the model to existing experimental data on the McGurk effect, the results suggest that, rather than using a single sound representation that is adjusted with each sensory experience, the brain recalibrates sounds at two different timescales. Over and above slow “procedural” learning, the findings show that there is also rapid recalibration of how different sounds are interpreted. This working representation of speech enables adaptation to changing or noisy environments and illustrates that the process is far more dynamic and flexible than previously thought.
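The core computational idea, a category representation updated at a fast and a slow timescale, can be caricatured with two delta rules sharing one prediction error; the learning rates and evidence sequence below are arbitrary, not the paper's fitted values.

```python
# Sketch: two-timescale recalibration of a category's working representation.
fast_rate, slow_rate = 0.5, 0.01  # arbitrary learning rates
fast, slow = 0.0, 0.0             # rapid offset and long-term trace

observations = [1.0] * 10 + [0.0] * 40  # brief exposure to shifted evidence
trace = []
for y in observations:
    working = slow + fast      # percept interpreted through both timescales
    err = y - working
    fast += fast_rate * err    # rapid recalibration
    slow += slow_rate * err    # slow "procedural" learning
    trace.append(round(working, 2))

print(trace[:12])   # jumps quickly toward the new evidence
print(trace[-1])    # then relaxes back near the long-term trace
```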
Affiliation(s)
- Itsaso Olasagasti
- Department of Basic Neuroscience, University of Geneva, Geneva, Switzerland
- Anne-Lise Giraud
- Department of Basic Neuroscience, University of Geneva, Geneva, Switzerland
10. Speakers are able to categorize vowels based on tongue somatosensation. Proc Natl Acad Sci U S A 2020;117:6255-6263. PMID: 32123070. DOI: 10.1073/pnas.1911142117.
Abstract
Auditory speech perception enables listeners to access phonological categories from speech sounds. During speech production and speech motor learning, speakers experience matched auditory and somatosensory inputs. Accordingly, access to phonetic units might also be provided by somatosensory information. The present study assessed whether humans can identify vowels using somatosensory feedback, without auditory feedback. A tongue-positioning task was used in which participants were required to achieve different tongue postures within the /e, ε, a/ articulatory range, in a procedure that was entirely nonspeech-like, involving distorted visual feedback of tongue shape. Tongue postures were measured using electromagnetic articulography. At the end of each tongue-positioning trial, subjects were required to whisper with the vocal tract in the reached configuration, with masked auditory feedback, and to identify the vowel associated with the reached tongue posture. Masked auditory feedback ensured that vowel categorization was based on somatosensory rather than auditory feedback. A separate group of subjects was required to classify the whispered sounds auditorily. In addition, we modeled the link between vowel categories and tongue postures in normal speech production with a Bayesian classifier based on the tongue postures recorded from the same speakers for several repetitions of the /e, ε, a/ vowels during a separate speech production task. Overall, our results indicate that vowel categorization is possible with somatosensory feedback alone, with an accuracy similar to that of the auditory perception of whispered sounds, and in congruence with normal speech articulation, as accounted for by the Bayesian classifier.
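A Bayesian classifier of the kind described, one Gaussian per vowel over tongue-posture features, can be sketched in a few lines; the two-dimensional features and training samples below are invented stand-ins for the articulographic recordings.

```python
# Sketch: Gaussian Bayesian classifier linking tongue postures to vowels.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
# Hypothetical 2-D posture samples per vowel (e.g., tongue height/backness).
training = {
    "e": rng.normal([0.0, 2.0], 0.3, size=(30, 2)),
    "E": rng.normal([0.5, 1.2], 0.3, size=(30, 2)),  # stands in for /ε/
    "a": rng.normal([1.0, 0.4], 0.3, size=(30, 2)),
}
models = {v: multivariate_normal(x.mean(axis=0), np.cov(x.T))
          for v, x in training.items()}

def classify(posture):
    # Flat prior over vowels: pick the category with the highest likelihood.
    return max(models, key=lambda v: models[v].pdf(posture))

print(classify(np.array([0.45, 1.25])))  # -> "E"
```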
11. Santoni C, de Boer G, Thaut M, Bressmann T. Influence of Altered Auditory Feedback on Oral-Nasal Balance in Song. J Voice 2020;34:157.e9-157.e15. DOI: 10.1016/j.jvoice.2018.06.014.
12. Patri JF, Diard J, Perrier P. Modeling Sensory Preference in Speech Motor Planning: A Bayesian Modeling Framework. Front Psychol 2019;10:2339. PMID: 31708828. PMCID: PMC6824204. DOI: 10.3389/fpsyg.2019.02339.
Abstract
Experimental studies of speech production involving compensations for auditory and somatosensory perturbations, and adaptation after training, suggest that both types of sensory information are taken into account in planning and monitoring speech production. Interestingly, individual sensory preferences have been observed in this context: subjects who compensate less for somatosensory perturbations compensate more for auditory perturbations, and vice versa. We propose to integrate this sensory preference phenomenon into a model of speech motor planning using a probabilistic framework in which speech units are characterized in both auditory and somatosensory terms. Sensory preference is implemented in the model according to two approaches. In the first approach, often used in motor control models accounting for sensory integration, sensory preference is attributed to the relative precision (i.e., inverse of the variance) of the sensory characterization of the speech motor goals associated with phonological units (phonemes, in the context of this paper). In the second, more original variant, sensory preference is implemented by modulating the sensitivity of the comparison between the predicted sensory consequences of motor commands and the sensory characterizations of the phonemes. We present simulation results using these two variants, in the context of adaptation to an auditory perturbation, implemented in a 2-dimensional biomechanical model of the tongue. Simulation results show that both variants lead to qualitatively similar results. Distinguishing them experimentally would require precise analyses of partial compensation patterns. However, the second variant implements sensory preference without changing the sensory characterizations of the phonemes. This dissociates sensory preference from the sensory characterizations of the phonemes, and makes the account of sensory preference more flexible. Indeed, in the second variant the sensory characterizations of the phonemes can remain stable while sensory preference varies as a response to cognitive or attentional control. This opens new perspectives for capturing the speech production variability associated with aging, disorders, and speaking conditions.
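In a one-dimensional caricature, the two proposed implementations of sensory preference can be reduced to different weightings of a quadratic compromise between conflicting auditory and somatosensory goals; all numbers below are arbitrary and only illustrate why the two variants behave alike.

```python
# Sketch: sensory preference as goal precision vs. comparison sensitivity.
aud_goal, som_goal = 0.0, 1.0  # deliberately conflicting sensory targets

def planned_output(w_aud, w_som):
    """Minimizer of w_aud*(x - aud_goal)**2 + w_som*(x - som_goal)**2."""
    return (w_aud * aud_goal + w_som * som_goal) / (w_aud + w_som)

# Variant 1: auditory preference encoded as a more precise auditory goal.
print(planned_output(w_aud=4.0, w_som=1.0))  # 0.2, pulled toward audition

# Variant 2: goal precisions stay equal; the auditory error term is instead
# up-weighted by a sensitivity gain in the comparison.
precision, sens_aud = 1.0, 4.0
print(planned_output(w_aud=sens_aud * precision, w_som=precision))  # 0.2 again
```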
Affiliation(s)
- Jean-François Patri
- Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Université Grenoble Alpes, CNRS, LPNC, Grenoble, France
- Cognition Motion and Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
- Julien Diard
- Université Grenoble Alpes, CNRS, LPNC, Grenoble, France
- Pascal Perrier
- Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
13. Grandchamp R, Rapin L, Perrone-Bertolotti M, Pichat C, Haldin C, Cousin E, Lachaux JP, Dohen M, Perrier P, Garnier M, Baciu M, Lœvenbruck H. The ConDialInt Model: Condensation, Dialogality, and Intentionality Dimensions of Inner Speech Within a Hierarchical Predictive Control Framework. Front Psychol 2019;10:2019. PMID: 31620039. PMCID: PMC6759632. DOI: 10.3389/fpsyg.2019.02019.
Abstract
Inner speech has been shown to vary in form along several dimensions. Along condensation, condensed forms of inner speech have been described that are thought to lack acoustic, phonological, and even syntactic qualities; expanded forms, at the other extreme, display articulatory and auditory properties. Along dialogality, inner speech can be monologal, when we engage in internal soliloquy, or dialogal, when we recall past conversations or imagine future dialogs involving our own voice as well as that of others addressing us. Along intentionality, it can be intentional (when we deliberately rehearse material in short-term memory) or it can arise unintentionally (during mind wandering). We introduce the ConDialInt model, a neurocognitive predictive control model of inner speech that accounts for its varieties along these three dimensions. ConDialInt spells out the condensation dimension by including inhibitory control at the conceptualization, formulation, or articulatory planning stage. It accounts for dialogality by assuming internal model adaptations and by speculating on neural processes underlying perspective switching. It explains the differences between intentional and spontaneous varieties in terms of monitoring. We present an fMRI study in which we probed varieties of inner speech along dialogality and intentionality, to examine the validity of the neuroanatomical correlates posited in ConDialInt. Condensation was also informally tackled. Our data support the hypothesis that expanded inner speech recruits speech production processes down to articulatory planning, resulting in a predicted signal, the inner voice, with auditory qualities. Along dialogality, covertly using an avatar's voice resulted in the activation of right-hemisphere homologs of the regions involved in internal own-voice soliloquy and in reduced cerebellar activation, consistent with internal model adaptation. Switching from first-person to third-person perspective resulted in activations in the precuneus and parietal lobules. Along intentionality, compared with intentional inner speech, mind wandering with inner speech episodes was associated with greater bilateral inferior frontal activation and decreased activation in left temporal regions. This is consistent with the reported subjective evanescence of such episodes and presumably reflects condensation processes. Our results provide neuroanatomical evidence compatible with predictive control and in favor of the assumptions made in the ConDialInt model.
Affiliation(s)
- Romain Grandchamp
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Lucile Rapin
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Cédric Pichat
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Célise Haldin
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Emilie Cousin
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Jean-Philippe Lachaux
- INSERM U1028, CNRS UMR5292, Brain Dynamics and Cognition Team, Lyon Neurosciences Research Center, Bron, France
- Marion Dohen
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Pascal Perrier
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Maëva Garnier
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Monica Baciu
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Hélène Lœvenbruck
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
14. Tamura S, Ito K, Hirose N, Mori S. Precision of voicing perceptual identification is altered in association with voice-onset time production changes. Exp Brain Res 2019;237:2197-2204. DOI: 10.1007/s00221-019-05584-1.
15. Vilain A, Dole M, Lœvenbruck H, Pascalis O, Schwartz JL. The role of production abilities in the perception of consonant category in infants. Dev Sci 2019;22:e12830. PMID: 30908771. DOI: 10.1111/desc.12830.
Abstract
The influence of motor knowledge on speech perception is well established, but the functional role of the motor system is still poorly understood. The present study explores the hypothesis that speech production abilities may help infants discover phonetic categories in the speech stream, in spite of coarticulation effects. To this aim, we examined the influence of babbling abilities on consonant categorization in 6- and 9-month-old infants. Using an intersensory matching procedure, we investigated the infants' capacity to associate auditory information about a consonant in various vowel contexts with visual information about the same consonant, and to map auditory and visual information onto a common phoneme representation. Moreover, a parental questionnaire evaluated the infants' consonantal repertoire. In a first experiment using /b/-/d/ consonants, we found that infants who displayed babbling abilities and produced the /b/ and/or the /d/ consonants in repetitive sequences were able to correctly perform intersensory matching, while non-babblers were not. In a second experiment using the /v/-/z/ pair, which is as visually contrasted as the /b/-/d/ pair but which is usually not produced at the tested ages, no significant matching was observed, for any group of infants, babbling or not. These results demonstrate, for the first time, that the emergence of babbling could play a role in the extraction of vowel-independent representations for consonant place of articulation. They have important implications for speech perception theories, as they highlight the role of sensorimotor interactions in the development of phoneme representations during the first year of life.
Affiliation(s)
- Anne Vilain
- GIPSA-Lab, Speech & Cognition Department, CNRS, Université Grenoble Alpes, Grenoble INP, Grenoble, France
- Marjorie Dole
- GIPSA-Lab, Speech & Cognition Department, CNRS, Université Grenoble Alpes, Grenoble INP, Grenoble, France
- Hélène Lœvenbruck
- LPNC, CNRS, Université Grenoble Alpes, Université Savoie Mont Blanc, Grenoble, France
- Olivier Pascalis
- LPNC, CNRS, Université Grenoble Alpes, Université Savoie Mont Blanc, Grenoble, France
- Jean-Luc Schwartz
- GIPSA-Lab, Speech & Cognition Department, CNRS, Université Grenoble Alpes, Grenoble INP, Grenoble, France
16. Barnaud ML, Schwartz JL, Bessière P, Diard J. Computer simulations of coupled idiosyncrasies in speech perception and speech production with COSMO, a perceptuo-motor Bayesian model of speech communication. PLoS One 2019;14:e0210302. PMID: 30633745. PMCID: PMC6329510. DOI: 10.1371/journal.pone.0210302.
Abstract
The existence of a functional relationship between the speech perception and production systems is now widely accepted, but the exact nature and role of this relationship remain unclear. The existence of idiosyncrasies in production and in perception sheds interesting light on the nature of the link. Indeed, a number of studies explore inter-individual variability in auditory and motor prototypes within a given language, and provide evidence for a link between the two sets. In this paper, we attempt to simulate one study on coupled idiosyncrasies in the perception and production of French oral vowels within COSMO, a Bayesian computational model of speech communication. First, we show that if the learning process in COSMO includes a communicative mechanism between a Learning Agent and a Master Agent, vowel production does display idiosyncrasies. Second, we implement within COSMO three models of speech perception that are, respectively, auditory, motor, and perceptuo-motor. We show that no idiosyncrasy in perception can be obtained in the auditory model, since it is optimally tuned to the learning environment, which does not include the motor variability of the Learning Agent. On the contrary, the motor and perceptuo-motor models produce perception idiosyncrasies correlated with idiosyncrasies in production. We draw conclusions about the role and importance of motor processes in speech perception, and propose a perceptuo-motor model in which auditory processing enables optimal processing of learned sounds while motor processing is helpful in unlearned, adverse conditions.
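The comparison between auditory, motor, and perceptuo-motor decoders can be caricatured as Bayesian fusion over a small phoneme set; the likelihoods below are invented solely to show how the fused posterior combines the two pathways.

```python
# Sketch: auditory, motor, and fused (perceptuo-motor) phoneme posteriors.
import numpy as np

phonemes = ["i", "e", "a"]
p_sound_given_ph = np.array([0.20, 0.50, 0.30])  # auditory likelihoods (invented)
p_motor_given_ph = np.array([0.45, 0.35, 0.20])  # motor likelihoods (invented)
prior = np.ones(3) / 3                           # flat phoneme prior

def posterior(likelihood):
    post = likelihood * prior
    return post / post.sum()

for name, lik in [("auditory", p_sound_given_ph),
                  ("motor", p_motor_given_ph),
                  ("perceptuo-motor", p_sound_given_ph * p_motor_given_ph)]:
    print(name, dict(zip(phonemes, posterior(lik).round(2))))
```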
Affiliation(s)
- Marie-Lou Barnaud
- Univ. Grenoble Alpes, Gipsa-lab, Grenoble, France
- CNRS, Gipsa-lab, Grenoble, France
- Univ. Grenoble Alpes, LPNC, Grenoble, France
- CNRS, LPNC, Grenoble, France
- Jean-Luc Schwartz
- Univ. Grenoble Alpes, Gipsa-lab, Grenoble, France
- CNRS, Gipsa-lab, Grenoble, France
- Julien Diard
- Univ. Grenoble Alpes, LPNC, Grenoble, France
- CNRS, LPNC, Grenoble, France