1
Jeong SM, Kim S, Lee EC, Kim HJ. Exploring Spectrogram-Based Audio Classification for Parkinson's Disease: A Study on Speech Classification and Qualitative Reliability Verification. Sensors (Basel, Switzerland) 2024; 24:4625. [PMID: 39066023 PMCID: PMC11280556 DOI: 10.3390/s24144625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 07/15/2024] [Accepted: 07/16/2024] [Indexed: 07/28/2024]
Abstract
Patients with Parkinson's disease suffer from voice impairment. In this study, we introduce models that classify healthy speakers and Parkinson's patients from their speech. We used AST (audio spectrogram transformer), a transformer-based speech classification model that has recently outperformed CNN-based models in many fields, and PSLA (pretraining, sampling, labeling, and aggregation), a high-performing CNN-based model in the existing speech classification field. This study compares and analyzes the models from both quantitative and qualitative perspectives. First, quantitatively, PSLA outperformed AST by more than 4% in accuracy, and its AUC was also higher: 94.16% for AST versus 97.43% for PSLA. Furthermore, we qualitatively evaluated the ability of the models to capture the acoustic features of Parkinson's through various CAM (class activation map)-based XAI (eXplainable AI) models such as GradCAM and EigenCAM. Based on PSLA, we found that the model focuses well on the muffled frequency band of Parkinson's speech, and the heatmap analysis of false positives and false negatives shows that the speech features are also visually represented when the model makes incorrect predictions. The contribution of this paper is that we not only found a suitable model for diagnosing Parkinson's through speech using two different types of models but also validated the predictions of the model in practice.
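The accuracy and AUC figures quoted in this abstract are standard binary-classification metrics. As a minimal sketch of how such numbers are typically derived from per-recording model scores (the labels and scores below are hypothetical and are not data from the study; the 0.5 threshold is also an assumption):

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = Parkinson's speech, 0 = control speech.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason abstracts such as this one often report both.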
Affiliation(s)
- Seung-Min Jeong
  - Department of AI & Informatics, Graduate School, Sangmyung University, Hongjimun 2-gil 20, Jongno-gu, Seoul 03016, Republic of Korea; (S.-M.J.); (S.K.)
- Seunghyun Kim
  - Department of AI & Informatics, Graduate School, Sangmyung University, Hongjimun 2-gil 20, Jongno-gu, Seoul 03016, Republic of Korea; (S.-M.J.); (S.K.)
- Eui Chul Lee
  - Department of Human-Centered Artificial Intelligence, Sangmyung University, Hongjimun 2-gil 20, Jongno-gu, Seoul 03016, Republic of Korea
- Han Joon Kim
  - Department of Neurology, Seoul National University College of Medicine, Seoul National University Hospital, Daehak-ro 101, Jongno-gu, Seoul 03080, Republic of Korea
2
Mirkoska V, Antonsson M, Hartelius L, Nylén F. Detection of Subclinical Motor Speech Deficits after Presumed Low-Grade Glioma Surgery. Brain Sci 2023; 13:1631. [PMID: 38137079 PMCID: PMC10741922 DOI: 10.3390/brainsci13121631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 11/11/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023]
Abstract
Motor speech performance was compared before and after surgical resection of presumed low-grade gliomas. This pre- and post-surgery study was conducted on 15 patients (mean age = 41) with low-grade glioma classified based on anatomic features. Repetitions of /pa/, /ta/, /ka/, and /pataka/ recorded before and 3 months after surgery were analyzed regarding rate and regularity. A significant reduction (6 to 5.6 syllables/s) pre- vs. post-surgery was found in the rate for /ka/, which is comparable to the approximate average decline over 10-15 years of natural aging reported previously. For all other syllable types, rates were within normal age-adjusted ranges in both preoperative and postoperative sessions. The decline in /ka/ rate might reflect a subtle reduction in motor speech production, but the effects were not severe. All but one patient continued to perform within normal ranges post-surgery; one performed two standard deviations below age-appropriate norms pre- and post-surgery in all syllable tasks. The patient experienced motor speech difficulties, which may be related to the tumor's location in an area important for speech. Low-grade glioma may reduce maximum speech-motor performance in individual patients, but larger samples are needed to elucidate how often the effect occurs.
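The abstract analyzes syllable repetitions for rate and regularity and compares performance with age-appropriate norms. A minimal sketch of how such quantities could be computed from syllable onset times is given below. The regularity measure (coefficient of variation of inter-onset intervals), the onset times, and the normative mean and standard deviation are all assumptions made for illustration; the abstract does not specify the study's exact definitions.

```python
from statistics import mean, stdev


def ddk_rate(onsets):
    """Syllables per second: completed inter-onset intervals over elapsed time."""
    if len(onsets) < 2:
        raise ValueError("need at least two syllable onsets")
    return (len(onsets) - 1) / (onsets[-1] - onsets[0])


def ddk_interval_cv(onsets):
    """Coefficient of variation of inter-onset intervals (lower = more regular)."""
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three syllable onsets")
    return stdev(intervals) / mean(intervals)


def z_score(value, norm_mean, norm_sd):
    """Distance from a normative mean, in standard deviations."""
    return (value - norm_mean) / norm_sd


# Hypothetical onset times (seconds) for a perfectly regular /ka/ sequence.
onsets = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
rate = ddk_rate(onsets)
print(f"rate = {rate:.2f} syllables/s, interval CV = {ddk_interval_cv(onsets):.3f}")

# Hypothetical norm (mean 6.0, SD 0.5 syllables/s), not taken from the study.
print(f"z = {z_score(rate, 6.0, 0.5):.2f}")
```

Under this convention, a speaker performing "two standard deviations below" the norm would have a z-score of about -2.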
Affiliation(s)
- Vesna Mirkoska
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden; (M.A.); (L.H.)
- Malin Antonsson
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden; (M.A.); (L.H.)
- Lena Hartelius
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden; (M.A.); (L.H.)
- Fredrik Nylén
  - Department of Clinical Sciences, Umeå University, 90736 Umeå, Sweden
3
Johansson IL, Samuelsson C, Müller N. Consonant articulation acoustics and intelligibility in Swedish speakers with Parkinson's disease: a pilot study. Clinical Linguistics & Phonetics 2023; 37:845-865. [PMID: 35833475 DOI: 10.1080/02699206.2022.2095926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2021] [Revised: 05/16/2022] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
Imprecise consonant articulation is common in speakers with Parkinson's disease and can affect intelligibility. The research on the relationship between acoustic speech measures and intelligibility in Parkinson's disease is limited, and most of the research has been conducted on English. This pilot study investigated aspects of consonant articulation acoustics in eleven Swedish speakers with Parkinson's disease and six neurologically healthy persons. The focus of the study was on consonant cluster production, articulatory motion rate and variation, and voice onset time, and how these acoustic features correlate with speech intelligibility. Among the measures in the present study, typicality ratings of heterorganic consonant clusters /spr/ and /skr/ had the strongest correlations with intelligibility. Measures based on syllable repetition, such as repetition rate and voice onset time, showed varying results with weak to moderate correlations with intelligibility. One conclusion is that some acoustic measures may be more sensitive than others to the impact of the underlying sensory-motor impairment and dysarthria on speech production and intelligibility in speakers with Parkinson's disease. Some aspects of articulation appear to be equally demanding in terms of acoustic realisation for elderly healthy speakers and for speakers with Parkinson's disease, such as sequential motion rate measures. Clinically, this would imply that for the purpose of detecting signs of disordered speech motor control, choosing measures with less variation among older speakers without articulation impairment would lead to more robust results.
Affiliation(s)
- Inga-Lena Johansson
  - Department of Biomedical and Clinical Sciences/Speech and Language Pathology, Linköping University, Linköping, Sweden
- Christina Samuelsson
  - Department of Biomedical and Clinical Sciences/Speech and Language Pathology, Linköping University, Linköping, Sweden
  - Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institute, Solna, Sweden
- Nicole Müller
  - Department of Biomedical and Clinical Sciences/Speech and Language Pathology, Linköping University, Linköping, Sweden
  - Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
4
Roland V, Huet K, Harmegnies B, Piccaluga M, Verhaegen C, Delvaux V. Vowel production: a potential speech biomarker for early detection of dysarthria in Parkinson's disease. Front Psychol 2023; 14:1129830. [PMID: 37701868 PMCID: PMC10493417 DOI: 10.3389/fpsyg.2023.1129830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 07/26/2023] [Indexed: 09/14/2023]
Abstract
Objectives: Our aim is to detect early, subclinical speech biomarkers of dysarthria in Parkinson's disease (PD), i.e., systematic atypicalities in speech that remain subtle and are not easily detectable by the clinician, so that the patient is labeled "non-dysarthric." Based on promising exploratory work, we examine here whether vowel articulation, as assessed by three acoustic metrics, can be used as an early indicator of speech difficulties associated with Parkinson's disease. Study design: This is a prospective case-control study. Methods: Sixty-three individuals with PD and 35 without PD (healthy controls, HC) participated in this study. Of the 63 PD patients, 43 had been diagnosed with dysarthria (DPD) and 20 had not (NDPD). Sustained vowels were recorded for each speaker and formant frequencies were measured. The analyses focus on three acoustic metrics: individual vowel triangle areas (tVSA), the vowel articulation index (VAI), and the Phi index. Results: tVSA was found to be significantly smaller for DPD speakers than for HC. The VAI showed significant differences between these two groups, indicating greater centralization and lower vowel contrasts in the DPD speakers. In addition, DPD and NDPD speakers had lower Phi values, indicating a lower organization of their vowel system compared to the HC. Results also showed that the VAI was the most efficient metric for distinguishing between DPD and NDPD, whereas the Phi index was the best acoustic metric for discriminating NDPD from HC. Conclusion: This acoustic study identified potential subclinical vowel-related speech biomarkers of dysarthria in speakers with Parkinson's disease who have not been diagnosed with dysarthria.
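Two of the three metrics named in this abstract have widely used closed-form definitions over the first two formants (F1, F2) of the corner vowels /i/, /a/, and /u/. The sketch below assumes those common formulations (triangle area by the shoelace formula, and VAI as the ratio of formants that grow with vowel expansion to those that shrink); the study's exact definitions may differ, the Phi index is deliberately not implemented, and the formant values are hypothetical.

```python
def triangle_vsa(i, a, u):
    """Vowel triangle area in Hz^2; each vowel is an (F1, F2) pair in Hz."""
    (f1i, f2i), (f1a, f2a), (f1u, f2u) = i, a, u
    # Shoelace formula for the triangle with vertices /i/, /a/, /u/.
    return abs(f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a)) / 2.0


def vowel_articulation_index(i, a, u):
    """VAI = (F2i + F1a) / (F1i + F1u + F2u + F2a); lower = more centralized."""
    (f1i, f2i), (f1a, f2a), (f1u, f2u) = i, a, u
    return (f2i + f1a) / (f1i + f1u + f2u + f2a)


# Hypothetical formant values (Hz), not measurements from the study.
vowel_i = (300.0, 2300.0)
vowel_a = (750.0, 1300.0)
vowel_u = (350.0, 800.0)

print(f"tVSA = {triangle_vsa(vowel_i, vowel_a, vowel_u):.0f} Hz^2")
print(f"VAI  = {vowel_articulation_index(vowel_i, vowel_a, vowel_u):.3f}")
```

Both measures shrink as vowels centralize, which is consistent with the smaller tVSA and lower vowel contrast the abstract reports for speakers with dysarthria.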
Affiliation(s)
- Virginie Roland
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Kathy Huet
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Bernard Harmegnies
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Myriam Piccaluga
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Clémence Verhaegen
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Véronique Delvaux
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
  - National Fund for Scientific Research, Brussels, Belgium
5
Patel S, Grabowski C, Dayalu V, Testa AJ. Speech error rates after a sports-related concussion. Front Psychol 2023; 14:1135441. [PMID: 36960009 PMCID: PMC10027790 DOI: 10.3389/fpsyg.2023.1135441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Accepted: 02/13/2023] [Indexed: 03/09/2023]
Abstract
Background: Alterations in speech have long been identified as indicators of various neurologic conditions including traumatic brain injury, neurodegenerative diseases, and stroke. The extent to which speech errors occur in milder brain injuries, such as sports-related concussions, is unknown. The present study examined speech error rates in student athletes after a sports-related concussion compared to pre-injury speech performance in order to determine the presence and relevant characteristics of changes in speech production in this less easily detected neurologic condition. Methods: A within-subjects pre/post-injury design was used. A total of 359 Division I student athletes participated in pre-season baseline speech testing. Of these, 27 athletes (18-22 years) who sustained a concussion also participated in speech testing in the days immediately following diagnosis of concussion. Picture description tasks were utilized to prompt connected speech samples. These samples were recorded and then transcribed for identification of errors and disfluencies. These were coded by two trained raters using a 6-category system that included 14 types of error metrics. Results: Repeated measures analysis of variance was used to compare the difference in error rates at baseline and post-concussion. Results revealed significant increases in the speech error categories of pauses and time fillers (interjections/fillers). Additionally, regression analysis showed that a different pattern of errors and disfluencies occurs after a sports-related concussion (primarily time fillers) compared to pre-injury (primarily pauses). Conclusion: Results demonstrate that speech error rates increase following even mild head injuries, in particular, sports-related concussion. Furthermore, the speech error patterns driving this increase in speech errors, namely the rate of pauses and interjections, are distinct features of this neurological injury, in contrast with more severe injuries that are marked by articulation errors and an overall reduction in verbal output. Future studies should consider speech as a diagnostic tool for concussion.
Affiliation(s)
- Sona Patel
  - Department of Speech-Language Pathology, Seton Hall University, Nutley, NJ, United States
  - Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, United States
  - *Correspondence: Sona Patel,
- Caryn Grabowski
  - Department of Speech-Language Pathology, Seton Hall University, Nutley, NJ, United States
- Vikram Dayalu
  - Department of Speech-Language Pathology, Seton Hall University, Nutley, NJ, United States
- Anthony J. Testa
  - Center for Sports Medicine, Seton Hall University, South Orange, NJ, United States
6
On the Primary Influences of Age on Articulation and Phonation in Maximum Performance Tasks. Languages 2021. [DOI: 10.3390/languages6040174] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
Maximum performance tasks have been identified as possible domains where incipient signs of neurological disease may be detected in simple speech and voice samples. However, it is likely that these will simultaneously be influenced by the age and sex of the speaker. In this study, a comprehensive set of acoustic quantifications was collected from the literature and applied to sustained [a] productions and Alternating Motion Rate diadochokinetic (DDK) syllable sequences produced by 130 healthy speakers (62 women, 68 men), aged 20–90 years. The participants were asked to produce as stable (sustained [a] and DDK) and fast (DDK) productions as possible. The full set of features was reduced to a functional subset that most efficiently modeled sex-specific differences between younger and older speakers using a cross-validation procedure. Twelve measures of [a] and 16 measures of DDK sequences were identified across men and women and investigated in terms of how they were altered with increasing age of speakers. Increased production instability is observed in both tasks, primarily above the age of 60 years. DDK sequences were slower in older speakers, but also altered in their syllable and segment level acoustic properties. Increasing age does not appear to affect phonation or articulation uniformly, and men and women are affected differently in most quantifications investigated.
7
Chu SY, Foong JH, Lee J, Ben-David BM, Barlow SM, Hsu C. Oral diadochokinetic rates across languages: Multilingual speakers comparison. International Journal of Language & Communication Disorders 2021; 56:1026-1036. [PMID: 34331497 DOI: 10.1111/1460-6984.12653] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/11/2020] [Revised: 06/01/2021] [Accepted: 06/04/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND: It is unclear whether oral diadochokinetic rate (oral-DDK) performance is affected by different languages within a multilingual country. AIMS: This study investigated the effects of age, sex, and stimulus type (real word in L1, L2 vs. non-word) on oral-DDK rates among healthy Malaysian-Malay speakers in order to establish language- and age-sensitive norms. The second aim was to compare the nonword 'pataka' oral-DDK rates produced by Malaysian-Malay speakers with currently available normative data for Hebrew speakers and Malaysian-Mandarin speakers. METHODS & PROCEDURES: Oral-DDK performance of 90 participants (aged 20-77 years) using nonword ('pataka'), Malay real word ('patahkan'), and English real word ('buttercake') was audio recorded. The number of syllables produced in 8 seconds was calculated. Mixed analysis of variance (ANOVA) was conducted to examine the effects of stimulus type (nonword, Malay, and English real word), sex (male, female), age (younger, 20-40 years; middle, 41-60 years; older, ≥61 years), and their interactions on the oral-DDK rate. Data obtained were also compared with the raw data of Malaysian-Mandarin and Hebrew speakers from the previous studies. OUTCOMES & RESULTS: A normative oral-DDK rate has been established for healthy Malaysian-Malay speakers. The oral-DDK rate was significantly affected by stimuli (p < 0.001). Malay real word showed the slowest rate, whereas there was no significant difference between English real word and nonword. The oral-DDK rate for Malay speakers was significantly higher than for Mandarin and Hebrew speakers across stimuli (all p < 0.01). Interestingly, oral-DDK rates were not affected by age group for Malay speakers. CONCLUSIONS & IMPLICATIONS: Stimulus type and language affect the oral-DDK rate, indicating that speech-language therapists should consider using language-specific norms when assessing multilingual speakers. WHAT THIS PAPER ADDS: What is already known on the subject: Age, sex, and language are factors that need to be considered when developing an oral-DDK normative protocol. It is unclear whether oral-DDK performance is affected by different languages within a multilingual country. What this paper adds to existing knowledge: No ageing effect across real word versus nonword on oral-DDK performance was observed among Malaysian-Malay speakers, in contrast with the currently available literature, which reports that speech movements slow down as we age. Additionally, Malaysian-Malay speakers have faster oral-DDK rates than Malaysian-Mandarin and Hebrew speakers across all stimuli. What are the potential or actual clinical implications of this work? Establishing normative data for different languages will enable speech-language therapists to select the appropriate reference dataset based on the language mastery of these multilingual speakers.
Affiliation(s)
- Shin Ying Chu
  - Faculty of Health Sciences, Centre for Healthy Ageing and Wellness (H-CARE), Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz, Kuala Lumpur, Malaysia
- Jia Hao Foong
  - Faculty of Health Sciences, Speech Science Programme, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz, Kuala Lumpur, Malaysia
- Jaehoon Lee
  - Department of Educational Psychology and Leadership, Texas Tech University, Lubbock, Texas, USA
- Boaz M Ben-David
  - Communication Aging and Neuropsychology Lab (CANlab), Baruch Ivcher School of Psychology, Interdisciplinary Center (IDC) Herzliya, Israel
  - Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada
  - Toronto Rehabilitation Institute, University Health Networks (UHN), Toronto, Ontario, Canada
- Steven M Barlow
  - Department of Special Education and Communication Disorders, Department of Biological Systems Engineering, Center for Brain, Biology and Behavior, Communication Neuroscience Laboratories, University of Nebraska, Lincoln, Nebraska, USA
- Cristiane Hsu
  - Department of Audiology and Speech-Language Pathology, Mackay Medical College, New Taipei City, Taiwan
8
Solomon NP, Brungart DS, Wince JR, Abramowitz JC, Eitel MM, Cohen J, Lippa SM, Brickell TA, French LM, Lange RT. Syllabic Diadochokinesis in Adults With and Without Traumatic Brain Injury: Severity, Stability, and Speech Considerations. American Journal of Speech-Language Pathology 2021; 30:1400-1409. [PMID: 33630660 DOI: 10.1044/2020_ajslp-20-00158] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Purpose: Syllabic diadochokinesis (DDK) is a standard assessment task for motor speech disorders. This study aimed to compare rate and regularity of DDK according to the presence or absence of traumatic brain injury (TBI) and severity of TBI, examine the stability of DDK over time, and explore associations between DDK and extemporaneous speech. Method: Military service members and veterans were categorized into three groups: no history of TBI (control), uncomplicated mild TBI (mTBI), and moderate through severe (including penetrating) TBI (msTBI). Participants produced rapid alternating-motion and sequential-motion syllable repetitions during one or two sessions. A semi-automated protocol determined syllabic rate and regularity. Perceptual ratings of selected participants' connected speech samples were compared to DDK results. Results: Two hundred sixty-three service members and veterans provided data from one session and 69 from two sessions separated by 1.9 years (SD = 1.0). DDKs were significantly slower overall for mTBI and msTBI groups compared to controls. Regularity of productions did not differ significantly across groups. A significant Group × Task interaction revealed that the msTBI group produced sequential-motion syllable repetitions but not alternating-motion repetitions with greater regularity, whereas the opposite occurred for control and mTBI groups. DDK results did not differ significantly between sessions. Perceptual speech analysis for 30 participants, including 20 with atypical or questionable DDK performance, revealed two participants with mildly abnormal speech. Conclusions: Overall, DDK productions are slower than normal in adults with moderate, severe, and penetrating TBI and are stable over time. Regularity of productions did not differentiate groups, although this result differed according to task. There were surprisingly few people identified with disordered speech, making comparisons to DDK data tenuous, and indicating that dysarthria is a rare complication in a population of adults with mostly uncomplicated mTBI who are not selected from referrals to a speech-language pathology clinic.
Affiliation(s)
- Nancy Pearl Solomon
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Uniformed Services University of the Health Sciences, Bethesda, MD
- Douglas S Brungart
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Uniformed Services University of the Health Sciences, Bethesda, MD
- Jessica R Wince
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Towson University, Baltimore, MD
- Jordan C Abramowitz
  - Walter Reed National Military Medical Center, Bethesda, MD
  - University of Maryland, College Park, MD
- Megan M Eitel
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Henry M. Jackson Foundation, Rockville, MD
  - Defense and Veterans Brain Injury Center, Silver Spring, MD
- Julie Cohen
  - Walter Reed National Military Medical Center, Bethesda, MD
  - University of Maryland, College Park, MD
  - Henry M. Jackson Foundation, Rockville, MD
- Sara M Lippa
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Defense and Veterans Brain Injury Center, Silver Spring, MD
  - National Intrepid Center of Excellence, Bethesda, MD
- Tracey A Brickell
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Uniformed Services University of the Health Sciences, Bethesda, MD
  - Defense and Veterans Brain Injury Center, Silver Spring, MD
  - National Intrepid Center of Excellence, Bethesda, MD
  - General Dynamics Information Technology, Falls Church, VA
- Louis M French
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Uniformed Services University of the Health Sciences, Bethesda, MD
  - Defense and Veterans Brain Injury Center, Silver Spring, MD
  - National Intrepid Center of Excellence, Bethesda, MD
- Rael T Lange
  - Walter Reed National Military Medical Center, Bethesda, MD
  - Defense and Veterans Brain Injury Center, Silver Spring, MD
  - National Intrepid Center of Excellence, Bethesda, MD
  - General Dynamics Information Technology, Falls Church, VA
  - University of British Columbia, Vancouver, Canada
9
Sidorova J, Anisimova M. Impact of Diabetes Mellitus on Voice: A Methodological Commentary. J Voice 2020; 36:294.e1-294.e12. [PMID: 32739034 DOI: 10.1016/j.jvoice.2020.05.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2020] [Revised: 05/14/2020] [Accepted: 05/26/2020] [Indexed: 11/18/2022]
Affiliation(s)
- Julia Sidorova
  - Blekinge Institute of Technology, Vallhallavagän 1, Karlskrona, 37141, Sweden.
- Maria Anisimova
  - Zurich University of Applied Sciences, Technikumstrasse, 9, 8400, Winterthur
10
Karlsson F, Schalling E, Laakso K, Johansson K, Hartelius L. Assessment of speech impairment in patients with Parkinson's disease from acoustic quantifications of oral diadochokinetic sequences. The Journal of the Acoustical Society of America 2020; 147:839. [PMID: 32113309 DOI: 10.1121/10.0000581] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/17/2019] [Accepted: 12/20/2019] [Indexed: 06/10/2023]
Abstract
This investigation aimed at determining whether an acoustic quantification of the oral diadochokinetic (DDK) task may be used to predict the perceived level of speech impairment when speakers with Parkinson's disease (PD) are reading a standard passage. DDK sequences with repeated [pa], [ta], and [ka] syllables were collected from 108 recordings (68 unique speakers with PD), along with recordings of the speakers reading a standardized text. The passage readings were assessed in five dimensions individually by four speech-language pathologists in a blinded and randomized procedure. The 46 acoustic DDK measures were merged with the perceptual ratings of read speech in the same recording session. Ordinal regression models were trained repeatedly on 80% of ratings and acoustic DDK predictors per dimension in 10-folds, and evaluated in testing data. The models developed from [ka] sequences achieved the best performance overall in predicting the clinicians' ratings of passage readings. The developed [pa] and [ta] models showed a much lower performance across all dimensions. The addition of samples with severe impairments and further automation of the procedure is required for the models to be used for screening purposes by non-expert clinical staff.
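The abstract describes models trained repeatedly on one portion of the data and evaluated on held-out data across 10 folds. It does not spell out how the 80% training portion and the 10 folds combine, so the sketch below shows only plain k-fold index splitting as an illustration of the general idea. The function name, the seed, and the use of 108 samples (the number of recordings mentioned in the abstract) are assumptions, and no ordinal regression model is fitted here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[start::k] for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            idx
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for idx in fold
        ]
        yield train, test


# Hypothetical usage with one index per recording.
splits = list(k_fold_indices(108, k=10))
for fold_number, (train, test) in enumerate(splits):
    print(f"fold {fold_number}: {len(train)} training, {len(test)} testing")
```

In a full pipeline, a model would be fitted on each `train` list and scored on the matching `test` list, and the per-fold scores would then be averaged.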
Affiliation(s)
- Fredrik Karlsson
  - Department of Clinical Science, Speech and Language Pathology, Umeå University, Umeå SE90187, Sweden
- Ellika Schalling
  - Department of Clinical Science, Intervention and Technology, Division of Speech and Language Pathology, Karolinska Institutet, Stockholm SE14186, Sweden
- Katja Laakso
  - Institute of Neuroscience and Physiology, Department of Health and Rehabilitation, University of Gothenburg, Gothenburg SE40530, Sweden
- Kerstin Johansson
  - Department of Clinical Science, Intervention and Technology, Division of Speech and Language Pathology, Karolinska Institutet, Stockholm SE14186, Sweden
- Lena Hartelius
  - Institute of Neuroscience and Physiology, Department of Health and Rehabilitation, University of Gothenburg, Gothenburg SE40530, Sweden