1
Chang YM, Jeong PY, Hwang K, Ihn BY, McAuliffe MJ, Sim H, Levy ES. Effects of Speech Cues on Acoustics and Intelligibility of Korean-Speaking Children With Cerebral Palsy. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024; 67:2856-2871. [PMID: 38573834 DOI: 10.1044/2024_jslhr-23-00457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2024]
Abstract
PURPOSE Reduced speech intelligibility is often a hallmark of children with dysarthria secondary to cerebral palsy (CP), but effects of speech strategies for increasing intelligibility are understudied, especially in children who speak languages other than English. This study examined the effects of (the Korean translation of) two cues, "speak with your big mouth" and "speak with your strong voice," on speech acoustics and intelligibility of Korean-speaking children with CP. METHOD Fifteen Korean-speaking children with CP repeated words and sentences in habitual, big mouth, and strong voice conditions. Acoustic analyses were performed and intelligibility was assessed by means of 90 blinded listeners' ease-of-understanding (EoU) ratings and percentage of words correctly transcribed (PWC). RESULTS In response to both cues, children's vocal intensity and utterance duration increased significantly and differentially, whereas their vowel space area gains did not reach statistical significance. EoU increased significantly in the big mouth condition at word, but not sentence, level, whereas in the strong voice condition, EoU increased significantly at both levels. PWC increases were not statistically significant. Considerable variability in children's responses to cues was noted overall. CONCLUSIONS Korean-speaking children with CP modify their speech styles differentially when provided with cues aimed to increase their articulatory working space and vocal intensity. The results provide preliminary support for the use of the strong voice cue, in particular, to increase EoU. While the findings do not offer conclusive evidence of the intelligibility benefits of these cues, investigation with a larger sample size should provide further insight into optimal cueing strategies for increasing intelligibility in this population. Implications for language-specific versus language-independent treatment approaches are discussed. 
SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25521052.
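Vowel space area, one of the acoustic measures reported in this abstract, is commonly computed as the area of the polygon whose vertices are the corner vowels in the F1-F2 plane. The following is a minimal sketch of that calculation using the shoelace formula. The function name `vowel_space_area`, the choice of /i/, /a/, /u/ as corner vowels, and all formant values are illustrative assumptions and are not taken from the study.

```python
from typing import List, Tuple


def vowel_space_area(corners: List[Tuple[float, float]]) -> float:
    """Return the polygon area (Hz^2) spanned by (F1, F2) points.

    Uses the shoelace formula; the points must be listed in order
    around the polygon (clockwise or counterclockwise).
    """
    if len(corners) < 3:
        raise ValueError("need at least three corner vowels")
    twice_area = 0.0
    n = len(corners)
    for i in range(n):
        f1_a, f2_a = corners[i]
        f1_b, f2_b = corners[(i + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical mean formants (Hz) for /i/, /a/, /u/ in two speaking conditions.
habitual = [(350.0, 2400.0), (800.0, 1400.0), (400.0, 1000.0)]
cued = [(320.0, 2600.0), (880.0, 1450.0), (380.0, 900.0)]

change = vowel_space_area(cued) - vowel_space_area(habitual)
print(f"change in vowel space area: {change:.0f} Hz^2")
```

A positive change would indicate an expanded vowel space under the cue; whether such a change is statistically reliable is a separate question that this sketch does not address.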
Affiliation(s)
- Pil-Yeon Jeong
- Ewha Womans University Center for Child Development and Disability, Seoul, South Korea
- Bo-Yeon Ihn
- Teachers College, Columbia University, New York, NY
- Erika S Levy
- Teachers College, Columbia University, New York, NY
2
Levy ES, Moya-Galé G. Revisiting Dysarthria Treatment Across Languages: The Hybrid Approach. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024; 67:2893-2902. [PMID: 38056466 DOI: 10.1044/2023_jslhr-23-00629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
PURPOSE Ten years after Miller and Lowit's (2014) groundbreaking book providing a cross-linguistic perspective on motor speech disorders, we ask where we are regarding dysarthria treatment across languages in two specific populations: adults with Parkinson's disease (PD) and children with cerebral palsy (CP). METHOD In this commentary, we consider preliminary evidence for both language-independent and language-specific approaches to treatment and propose a hybrid approach to speech treatment across languages, centered on the individual with dysarthria who speaks any given language. CONCLUSIONS Treatment research on individuals with dysarthria secondary to PD and CP is advancing, but several areas remain to be explored. Next steps are suggested for addressing the paucity and complexity of cross-linguistic speech treatment research.
3
Nip ISB. Articulatory and Vocal Fold Movement Patterns During Loud Speech in Children With Cerebral Palsy. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024; 67:477-493. [PMID: 38227476 PMCID: PMC11000802 DOI: 10.1044/2023_jslhr-23-00411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 09/19/2023] [Accepted: 11/25/2023] [Indexed: 01/17/2024]
Abstract
PURPOSE Speech motor control changes underlying louder speech are poorly understood in children with cerebral palsy (CP). The current study evaluates changes in the oral articulatory and laryngeal subsystems in children with CP and their typically developing (TD) peers during louder speech. METHOD Nine children with CP and nine age- and sex-matched TD peers produced sentence repetitions in two conditions: (a) with their habitual rate and loudness and (b) with louder speech. Lip and jaw movements were recorded with optical motion capture. Acoustic recordings were obtained to evaluate vocal fold articulation. RESULTS Children with CP had smaller jaw movements, larger lower lip movements, slower jaw speeds, faster lip speeds, reduced interarticulator coordination, reduced low-frequency spectral tilt, and lower cepstral peak prominences (CPP) in comparison to their TD peers. Both groups produced louder speech with larger lip and jaw movements, faster lip and jaw speeds, increased temporal coordination, reduced movement variability, reduced spectral tilt, and increased CPP. CONCLUSIONS Children with CP differ from their TD peers in the speech motor control of both the oral articulatory and laryngeal subsystems. Both groups alter oral articulatory and vocal fold movements when cued to speak loudly, which may contribute to the increased intelligibility associated with louder speech. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24970302.
4
Alaka B, Shibwabo B. Models and Approaches for Comprehension of Dysarthric Speech Using Natural Language Processing: Systematic Review. JMIR Rehabil Assist Technol 2023; 10:e44489. [PMID: 37889538 PMCID: PMC10655903 DOI: 10.2196/44489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 05/11/2023] [Accepted: 07/24/2023] [Indexed: 10/28/2023]
Abstract
BACKGROUND Speech intelligibility and speech comprehension for dysarthric speech have attracted much attention recently. Dysarthria is characterized by irregularities in the speed, strength, pitch, breath control, range, steadiness, and accuracy of muscle movements required for articulatory aspects of speech production. OBJECTIVE This study examined the contributions made by other studies involved in dysarthric speech comprehension. We focused on the modes of meaning extraction used in generalizing speaker-listener underpinnings in light of semantic ontology extraction as a desired technique, applied method types, speech representations used, and databases sourced from. METHODS This study involved a systematic literature review using 7 electronic databases: Cochrane Database of Systematic Reviews, Web of Science Core Collection, Scopus, PubMed, ACM, IEEE Xplore, and Google Scholar. The main eligibility criterion was the extraction of meaning from dysarthric speech using natural language processing or understanding approaches to improve on dysarthric speech comprehension. In total, out of 834 search results, 30 studies that matched the eligibility requirements were acquired following screening by 2 independent reviewers, with a lack of consensus being resolved through joint discussion or consultation with a third party. In order to evaluate the studies' methodological quality, the risk of bias assessment was based on the Cochrane risk-of-bias tool version 2 (RoB2) with 23 of the studies (77%) registering low risk of bias and 7 studies (23%) raising some concern over the risk of bias. The overall quality assessment of the study was done using TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis). RESULTS Following a review of 30 primary studies, this study revealed that the reviewed studies focused on natural language understanding or clinical approaches, with an increase in proposed solutions from 2020 onwards. 
Most studies relied on speaker-dependent speech features, while others used speech patterns, semantic knowledge, or hybrid approaches. The prevalent use of vector representation aligned with natural language understanding models, while Mel-frequency cepstral coefficient representation and no representation approaches were applied in neural networks. Hybrid representation studies aimed to reconstruct dysarthric speech or improve comprehension. Comprehensive databases, like TORGO and UA-Speech, were commonly used in combination with other curated databases, while primary data was preferred for specific or unique research objectives. CONCLUSIONS We found significant gaps in dysarthric speech comprehension characterized by the lack of inclusion of important listener or speech-independent features in the speech representations, mode of extraction, and data sources used. Further research is therefore proposed regarding the formulation of models that accommodate listener and speech-independent features through semantic ontologies that will be useful in the inclusion of key features of listener and speech-independent features for meaning extraction of dysarthric speech.
Affiliation(s)
- Benard Alaka
- School of Computing and Engineering Sciences, Strathmore University, Nairobi, Kenya
- Bernard Shibwabo
- School of Computing and Engineering Sciences, Strathmore University, Nairobi, Kenya
5
Moya-Galé G, Walsh SJ, Goudarzi A. Automatic Assessment of Intelligibility in Noise in Parkinson’s Disease: A Validation Method (Preprint). J Med Internet Res 2022; 24:e40567. [PMID: 36264608 PMCID: PMC9634525 DOI: 10.2196/40567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Revised: 09/05/2022] [Accepted: 09/16/2022] [Indexed: 11/30/2022]
Abstract
Background Most individuals with Parkinson disease (PD) experience a degradation in their speech intelligibility. Research on the use of automatic speech recognition (ASR) to assess intelligibility is still sparse, especially when trying to replicate communication challenges in real-life conditions (ie, noisy backgrounds). Developing technologies to automatically measure intelligibility in noise can ultimately assist patients in self-managing their voice changes due to the disease. Objective The goal of this study was to pilot-test and validate the use of a customized web-based app to assess speech intelligibility in noise in individuals with dysarthria associated with PD. Methods In total, 20 individuals with dysarthria associated with PD and 20 healthy controls (HCs) recorded a set of sentences using their phones. The Google Cloud ASR API was used to automatically transcribe the speakers’ sentences. An algorithm was created to embed speakers’ sentences in +6-dB signal-to-noise multitalker babble. Results from ASR performance were compared to those from 30 listeners who orthographically transcribed the same set of sentences. Data were reduced into a single event, defined as a success if the artificial intelligence (AI) system transcribed a random speaker or sentence as well or better than the average of 3 randomly chosen human listeners. These data were further analyzed by logistic regression to assess whether AI success differed by speaker group (HCs or speakers with dysarthria) or was affected by sentence length. A discriminant analysis was conducted on the human listener data and AI transcriber data independently to compare the ability of each data set to discriminate between HCs and speakers with dysarthria. Results The data analysis indicated a 0.8 probability (95% CI 0.65-0.91) that AI performance would be as good or better than the average human listener. AI transcriber success probability was not found to be dependent on speaker group. 
AI transcriber success was found to decrease with sentence length, losing an estimated 0.03 probability of transcribing as well as the average human listener for each word increase in sentence length. The AI transcriber data were found to offer the same discrimination of speakers into categories (HCs and speakers with dysarthria) as the human listener data. Conclusions ASR has the potential to assess intelligibility in noise in speakers with dysarthria associated with PD. Our results hold promise for the use of AI with this clinical population, although a full range of speech severity needs to be evaluated in future work, as well as the effect of different speaking tasks on ASR.
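The Methods above mention embedding sentences in multitalker babble at a +6-dB signal-to-noise ratio. One common way to do this, sketched below, is to scale the noise so that the ratio of speech RMS to noise RMS matches the target SNR and then add the two signals. This is an illustration under stated assumptions, not the authors' algorithm: the function names are hypothetical, and the sinusoidal "speech" and "babble" signals are synthetic stand-ins for real recordings.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a non-empty signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def scale_noise_to_snr(speech: Sequence[float],
                       noise: Sequence[float],
                       target_snr_db: float) -> List[float]:
    """Trim the noise to the speech length and scale it to the target SNR."""
    if not speech:
        raise ValueError("speech signal is empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as the speech")
    trimmed = list(noise[:len(speech)])
    noise_rms = rms(trimmed)
    if noise_rms == 0.0:
        raise ValueError("noise signal is silent")
    # SNR(dB) = 20 * log10(rms_speech / rms_noise), solved for rms_noise.
    wanted_noise_rms = rms(speech) / (10.0 ** (target_snr_db / 20.0))
    gain = wanted_noise_rms / noise_rms
    return [gain * x for x in trimmed]


def measure_snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """SNR in dB between a speech signal and a noise signal."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


# Synthetic stand-ins: a 220 Hz tone as "speech", three summed tones as "babble".
RATE = 16000
speech = [math.sin(2 * math.pi * 220 * t / RATE) for t in range(1600)]
babble = [sum(math.sin(2 * math.pi * f * t / RATE) for f in (130, 310, 470))
          for t in range(2000)]

scaled_babble = scale_noise_to_snr(speech, babble, 6.0)
noisy_speech = [s + n for s, n in zip(speech, scaled_babble)]
print(f"achieved SNR: {measure_snr_db(speech, scaled_babble):.2f} dB")
```

Scaling the noise rather than the speech keeps the speech level constant across conditions, which is one reasonable design choice when the same recordings are also presented to human listeners.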
Affiliation(s)
- Gemma Moya-Galé
- Department of Communication Sciences & Disorders, Long Island University, Brooklyn, NY, United States
- Stephen J Walsh
- Department of Mathematics and Statistics, Utah State University, Logan, UT, United States
6
Carl M, Levy ES, Icht M. Speech treatment for Hebrew-speaking adolescents and young adults with developmental dysarthria: A comparison of mSIT and Beatalk. INTERNATIONAL JOURNAL OF LANGUAGE & COMMUNICATION DISORDERS 2022; 57:660-679. [PMID: 35363414 DOI: 10.1111/1460-6984.12715] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/02/2021] [Accepted: 02/16/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Individuals with developmental dysarthria typically demonstrate reduced functioning of one or more of the speech subsystems, which negatively impacts speech intelligibility and communication within social contexts. A few treatment approaches are available for improving speech production and intelligibility among individuals with developmental dysarthria. However, these approaches have only limited application and research findings among adolescents and young adults. AIMS To determine and compare the effectiveness of two treatment approaches, the modified Speech Intelligibility Treatment (mSIT) and the Beatalk technique, on speech production and intelligibility among Hebrew-speaking adolescents and young adults with developmental dysarthria. METHODS & PROCEDURES Two matched groups of adolescents and young adults with developmental dysarthria participated in the study. Each received one of the two treatments, mSIT or Beatalk, over the course of 9 weeks. Measures of speech intelligibility, articulatory accuracy, voice and vowel acoustics were assessed both pre- and post-treatment. OUTCOMES & RESULTS Both the mSIT and Beatalk groups demonstrated gains in at least some of the outcome measures. Participants in the mSIT group exhibited improvement in speech intelligibility and voice measures, while participants in the Beatalk group demonstrated increased articulatory accuracy and gains in voice measures from pre- to post-treatment. Significant increases were noted post-treatment for first formant values for select vowels. CONCLUSIONS & IMPLICATIONS Results of this preliminary study are promising for both treatment approaches. The differentiated results indicate their distinct application to speech intelligibility deficits. The current findings also hold clinical significance for treatment among adolescents and young adults with motor speech disorders and application for a language other than English. 
WHAT THIS PAPER ADDS What is already known on the subject: Developmental dysarthria (e.g., secondary to cerebral palsy) is a motor speech disorder that negatively impacts speech intelligibility, and thus communication participation. Select treatment approaches are available with the aim of improving speech intelligibility in individuals with developmental dysarthria; however, these approaches are limited in number and have only seldom been applied specifically to adolescents and young adults. What this paper adds to existing knowledge: The current study presents preliminary data regarding two treatment approaches, the mSIT and Beatalk technique, administered to Hebrew-speaking adolescents and young adults with developmental dysarthria in a group setting. Results demonstrate the initial effectiveness of the treatment approaches, with different gains noted for each approach across speech and voice domains. What are the potential or actual clinical implications of this work? The findings add to the existing literature on potential treatment approaches aiming to improve speech production and intelligibility among individuals with developmental dysarthria. The presented approaches also show promise for group-based treatments as well as the potential for improvement among adolescents and young adults with motor speech disorders.
Affiliation(s)
- Micalle Carl
- Department of Communication Disorders, Ariel University, Ariel, Israel
- Erika S Levy
- Teachers College, Columbia University, New York, NY, USA
- Michal Icht
- Department of Communication Disorders, Ariel University, Ariel, Israel
7
Research on Open Oral English Scoring System Based on Neural Network. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:1346543. [PMID: 35502353 PMCID: PMC9056240 DOI: 10.1155/2022/1346543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2022] [Revised: 03/20/2022] [Accepted: 04/08/2022] [Indexed: 11/18/2022]
Abstract
This study designs and implements a scoring system for open oral English using neural network (NN) technology. The system scores an oral recording at the phonetic level and at the text level and combines the two to evaluate the speaker's overall oral proficiency. The spoken speech and the spoken content are scored separately by different scoring models, and the two results are added to give the final score; the spoken content is obtained by text transcription of the recording by an external speech recognition engine. An acoustic sensor is adopted to collect pronunciation signals of spoken English. Modern signal processing and automatic pattern recognition technology are used to distinguish the quality of spoken pronunciation. Similar semantic units are marked between acoustic feature sequences; the comparison exploits the parallel processing of the multiple computing cores of modern GPUs, allowing multiple units to execute the comparison algorithm independently at the same time. Experiments show that the model in this study achieves better comprehensive scoring performance. The scoring model is of great significance to the development of educational informatization and intelligence, and it also provides a reference for the construction of intelligent oral scoring systems.
8
Moya-Galé G, Keller B, Escorial S, Levy ES. Speech Treatment Effects on Narrative Intelligibility in French-Speaking Children With Dysarthria. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2154-2168. [PMID: 33719503 DOI: 10.1044/2020_jslhr-20-00258] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 06/12/2023]
Abstract
Purpose This study examined the effects of Speech Intelligibility Treatment (SIT) on intelligibility and naturalness of narrative speech produced by francophone children with dysarthria due to cerebral palsy. Method Ten francophone children with dysarthria were randomized to one of two treatments, SIT or Hand-Arm Bimanual Intensive Therapy Including Lower Extremities, a physical therapy (PT) treatment. Both treatments were conducted in a camp setting and were comparable in dosage. The children were recorded pre- and posttreatment producing a story narrative. Intelligibility was measured by means of 60 blinded listeners' orthographic transcription accuracy (percentage of words transcribed correctly). The listeners also rated the children's naturalness on a visual analogue scale. Results A significant pre- to posttreatment increase in intelligibility was found for the SIT group, but not for the PT group, with great individual variability observed among the children. No significant changes were found for naturalness ratings or sound pressure level in the SIT group or the PT group posttreatment. Articulation rate increased in both treatment groups, although not differentially across treatments. Conclusions Findings from this first treatment study on intelligibility in francophone children with dysarthria suggest that SIT shows promise for increasing narrative intelligibility in this population. Acoustic contributors to the increased intelligibility remain to be explored further. Supplemental Material https://doi.org/10.23641/asha.14161943.
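Intelligibility in this abstract is measured as the percentage of words transcribed correctly by listeners. A minimal sketch of one way to score a single transcription follows. The scoring rules here (case-insensitive, punctuation stripped, word order ignored, each target word credited at most once per occurrence) and the function name `percent_words_correct` are assumptions for illustration; the study's own scoring protocol may differ.

```python
from collections import Counter
from typing import List


def _tokens(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip edge punctuation."""
    words = (w.strip(".,!?;:\"'") for w in text.lower().split())
    return [w for w in words if w]


def percent_words_correct(target: str, transcript: str) -> float:
    """Percentage of target words that appear in the listener's transcript."""
    target_words = _tokens(target)
    if not target_words:
        raise ValueError("target sentence has no words")
    heard = Counter(_tokens(transcript))
    correct = 0
    for word in target_words:
        if heard[word] > 0:
            heard[word] -= 1  # each transcribed word can match only once
            correct += 1
    return 100.0 * correct / len(target_words)


# Hypothetical target sentence and listener transcription.
score = percent_words_correct("The dog ran home.", "the dog went home")
print(f"{score:.1f}% of words correct")
```

In a listener study, scores like this would typically be averaged over listeners and sentences for each speaker and condition before any statistical comparison.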
Affiliation(s)
- Gemma Moya-Galé
- Department of Communication Sciences and Disorders, Long Island University, Brooklyn, NY
- Bryan Keller
- Department of Human Development, Teachers College, Columbia University, New York, NY
- Sergio Escorial
- Departamento de Psicobiología y Metodología en Ciencias del Comportamiento, Universidad Complutense de Madrid, Spain
- Erika S Levy
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY
9
Levy ES, Chang YM, Hwang K, McAuliffe MJ. Perceptual and Acoustic Effects of Dual-Focus Speech Treatment in Children With Dysarthria. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2301-2316. [PMID: 33656916 DOI: 10.1044/2020_jslhr-20-00301] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 06/12/2023]
Abstract
Purpose Children with dysarthria secondary to cerebral palsy may experience reduced speech intelligibility and diminished communicative participation. However, minimal research has been conducted examining the outcomes of behavioral speech treatments in this population. This study examined the effect of Speech Intelligibility Treatment (SIT), a dual-focus speech treatment targeting increased articulatory excursion and vocal intensity, on intelligibility of narrative speech, speech acoustics, and communicative participation in children with dysarthria. Method American English-speaking children with dysarthria (n = 17) received SIT in a 3-week summer camplike setting at Columbia University. SIT follows motor-learning principles to train the child-friendly, dual-focus strategy, "Speak with your big mouth and strong voice." Children produced a story narrative at baseline, immediate posttreatment (POST), and at 6-week follow-up (FUP). Outcomes were examined via blinded listener ratings of ease of understanding (n = 108 adult listeners), acoustic analyses, and questionnaires focused on communicative participation. Results SIT resulted in significant increases in ease of understanding at POST, that were maintained at FUP. There were no significant changes to vocal intensity, speech rate, or vowel spectral characteristics, with the exception of an increase in second formant difference between vowels following SIT. Significantly enhanced communicative participation was evident at POST and FUP. Considerable variability in response to SIT was observed between children. Conclusions Dual-focus treatment shows promise for improving intelligibility and communicative participation in children with dysarthria, although responses to treatment vary considerably across children. Possible mechanisms underlying the intelligibility gains, enhanced communicative participation, and variability in treatment effects are discussed.
Affiliation(s)
- Erika S Levy
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY
- Younghwa M Chang
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY
- KyungHae Hwang
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY
- Megan J McAuliffe
- School of Psychology, Speech and Hearing and New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Christchurch, New Zealand
10
Hsu SC, McAuliffe MJ, Lin P, Wu RM, Levy ES. Acoustic and Perceptual Consequences of Speech Cues for Mandarin Speakers With Parkinson's Disease. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2019; 28:521-535. [PMID: 31136238 DOI: 10.1044/2018_ajslp-18-0020] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
Purpose This study investigated the effects of cueing for increased loudness and reduced speech rate on scaled intelligibility and acoustics of speech produced by Mandarin speakers with hypokinetic dysarthria due to Parkinson's disease (PD). Method Eleven speakers with PD read passages in habitual, loud, and slow speaking conditions. Fifteen listeners rated ease of understanding (EOU) of the speech samples on a visual analog scale. Effects of the cues on EOU, vocal loudness, pitch range, pause duration and frequency, articulation rate, and vowel space, as well as relationships between EOU gains and acoustic features, were analyzed. Results EOU increased significantly in the loud condition only. The loud cue resulted in increased intensity, and the slow cue resulted both in reduced articulation rate and increased pause frequency. In the loud condition, EOU increased significantly as intensity increased and vowel centralization decreased. In the slow condition, EOU tended to increase as intensity increased and vowel centralization decreased but did not reach statistical significance. Conclusion Cueing for loud speech may yield greater EOU gains than cueing for slow speech in Mandarin speakers with PD. Theoretical and clinical implications are discussed, although further investigations with more participants and a larger range of dysarthria severity are warranted.
Affiliation(s)
- Sih-Chiao Hsu
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY
- Megan J McAuliffe
- Department of Communication Disorders and New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Christchurch
- Peiyi Lin
- Institute for Learning Technologies, Teachers College, Columbia University, New York, NY
- Ruey-Meei Wu
- Centre of Parkinson and Movement Disorders, Department of Neurology, National Taiwan University Hospital, Taipei
- College of Medicine, National Taiwan University, Taipei
- Erika S Levy
- Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY