1
Ghaheri P, Nasiri H, Shateri A, Homafar A. Diagnosis of Parkinson's disease based on voice signals using SHAP and hard voting ensemble method. Comput Methods Biomech Biomed Engin 2024; 27:1858-1874. [PMID: 37771234 DOI: 10.1080/10255842.2023.2263125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 08/24/2023] [Accepted: 09/17/2023] [Indexed: 09/30/2023]
Abstract
Parkinson's disease (PD) is the second most common progressive neurological condition after Alzheimer's. The significant number of individuals afflicted with this illness makes it essential to develop a method to diagnose the condition in its early phases. PD is typically identified from motor symptoms or via neuroimaging techniques. These methods are expensive, time-consuming, and unavailable to the general public, and they are not very accurate. Another issue to be addressed is the black-box nature of machine learning methods, which need interpretation. These issues encourage us to develop a novel technique using Shapley additive explanations (SHAP) and a Hard Voting Ensemble Method based on voice signals to diagnose PD more accurately. Another purpose of this study is to interpret the output of the model and determine the most important features in diagnosing PD. The present article uses Pearson Correlation Coefficients to understand the relationship between input features and the output. Input features with high correlation are selected and then classified by Extreme Gradient Boosting, Light Gradient Boosting Machine, Gradient Boosting, and Bagging. Moreover, the weights in the Hard Voting Ensemble Method are determined based on the performance of the mentioned classifiers. At the final stage, SHAP is used to determine the most important features in PD diagnosis. The effectiveness of the proposed method is validated using the 'Parkinson Dataset with Replicated Acoustic Features' from the UCI machine learning repository. It has achieved an accuracy of 85.42%. The findings demonstrate that the proposed method outperformed state-of-the-art approaches and can assist physicians in diagnosing Parkinson's cases.
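The weighted hard-voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting rule, so this example assumes each classifier's vote is weighted by a validation accuracy. The function name `weighted_hard_vote`, the model keys, the weights, and the predicted labels are all hypothetical toy values, not results from the study.

```python
from collections import defaultdict


def weighted_hard_vote(predictions, weights):
    """Combine per-model class labels by weighted majority vote.

    predictions: dict mapping model name -> list of predicted labels
    weights:     dict mapping model name -> vote weight (assumed here to be
                 a validation accuracy; the study's actual rule may differ)
    """
    n_samples = len(next(iter(predictions.values())))
    voted = []
    for i in range(n_samples):
        tally = defaultdict(float)
        for name, labels in predictions.items():
            # Each model adds its weight to the label it predicted.
            tally[labels[i]] += weights[name]
        # The label with the largest accumulated weight wins.
        voted.append(max(tally, key=tally.get))
    return voted


# Hypothetical labels (1 = PD, 0 = healthy) for three recordings.
predictions = {
    "xgboost": [1, 0, 1],
    "lightgbm": [1, 1, 0],
    "gradient_boosting": [0, 1, 0],
    "bagging": [1, 0, 0],
}
# Hypothetical vote weights, one per classifier.
weights = {
    "xgboost": 0.85,
    "lightgbm": 0.82,
    "gradient_boosting": 0.80,
    "bagging": 0.78,
}

ensemble_labels = weighted_hard_vote(predictions, weights)
print(ensemble_labels)
```

In the second recording the models split two against two, and the weights decide the outcome. This is the practical difference between weighted and unweighted hard voting.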
Affiliation(s)
- Paria Ghaheri
  - Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
- Hamid Nasiri
  - Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Ahmadreza Shateri
  - Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
- Arman Homafar
  - Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
2
Schalling E, Winkler H, Franzén E. HiCommunication as a novel speech and communication treatment for Parkinson's disease: A feasibility study. Brain Behav 2021; 11:e02150. [PMID: 33943030 PMCID: PMC8213924 DOI: 10.1002/brb3.2150] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/06/2020] [Revised: 02/05/2021] [Accepted: 03/14/2021] [Indexed: 01/07/2023] Open
Abstract
INTRODUCTION Speech and communication problems are common in Parkinson's disease (PD) and can result in social withdrawal and reduced quality of life. Intervention may improve symptoms but transfer and maintenance remain challenging for many. Access to treatment may also be limited. Group intervention incorporating principles for experience-dependent plasticity may address these challenges. The aim of this study was to develop and study feasibility aspects of a new intervention program for group training of speech and communication in people with PD. MATERIALS & METHODS Development and content of the program called HiCommunication are described. Core target areas are voice, articulation, word-finding and memory. Five participants with mild-moderate PD completed this feasibility trial. Attendance rate and possible adverse events as well as the participants' experiences were documented. A speech recording and dysarthria testing were completed to study feasibility of the assessment procedure and evaluate possible changes in voice sound level and intelligibility. RESULTS Attendance rate was 89%. No adverse events occurred. Participants reported a positive experience and limited fatigue. Assessment was completed in approximately 30 min and was well tolerated. Four of five participants had an increased voice sound level during text-reading postintervention and mean intelligibility improved. CONCLUSIONS Results indicate that HiCommunication is feasible for people with mild-moderate PD. The program was appreciated and well tolerated. Positive outcomes regarding voice sound level and intelligibility were observed; however, the number of participants was very limited. The results motivate further study of the effects of HiCommunication in a randomized controlled trial, which is ongoing.
Affiliation(s)
- Ellika Schalling
  - Division of Speech and Language Pathology, Department of Clinical Science Intervention and Technology, Karolinska Institutet, Stockholm, Sweden; Medical Unit Speech and Language Pathology, Karolinska University Hospital, Stockholm, Sweden
- Helena Winkler
  - Medical Unit Speech and Language Pathology, Karolinska University Hospital, Stockholm, Sweden
- Erika Franzén
  - Division of Physiotherapy, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Stockholm, Sweden; Medical Unit Occupational Therapy and Physical Therapy, Karolinska University Hospital, Stockholm, Sweden
3
Prediction and Estimation of Parkinson’s Disease Severity Based on Voice Signal. J Voice 2020; 36:439.e9-439.e20. [DOI: 10.1016/j.jvoice.2020.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/28/2020] [Revised: 06/07/2020] [Accepted: 06/08/2020] [Indexed: 10/23/2022]
4
Wodzinski M, Skalski A, Hemmerling D, Orozco-Arroyave JR, Noth E. Deep Learning Approach to Parkinson's Disease Detection Using Voice Recordings and Convolutional Neural Network Dedicated to Image Classification. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2020; 2019:717-720. [PMID: 31945997 DOI: 10.1109/embc.2019.8856972] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 11/09/2022]
Abstract
This study presents an approach to Parkinson's disease detection using vowels with sustained phonation and a ResNet architecture originally dedicated to image classification. We calculated the spectrum of the audio recordings and used it as an image input to the ResNet architecture pre-trained using the ImageNet and SVD databases. To prevent overfitting, the dataset was strongly augmented in the time domain. The Parkinson's dataset (from the PC-GITA database) consists of 100 participants (50 healthy and 50 diagnosed with Parkinson's disease). Each participant was recorded 3 times. The obtained accuracy on the validation set is above 90%, which is comparable to the current state-of-the-art methods. The results are promising because it turned out that features learned on natural images are able to transfer the knowledge to artificial images representing the spectrogram of the voice signal. What is more, we showed that it is possible to perform a successful detection of Parkinson's disease using only frequency-based features. A spectrogram enables visual representation of the frequency spectrum of a signal. It makes it possible to follow the frequency changes of a signal over time.
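As a rough illustration of the spectrogram idea in the abstract, the sketch below computes a magnitude spectrogram with a short-time Fourier transform using only the Python standard library. The function name `spectrogram`, the frame length, the hop size, the Hann window, and the synthetic 100 Hz tone are all assumptions made for this example. The abstract does not specify the study's actual preprocessing.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per frame."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic stand-in for a sustained vowel: a pure 100 Hz tone.
sample_rate = 800  # samples per second (hypothetical)
tone_hz = 100
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames; peak at", peak_hz, "Hz")
```

Each row of `spec` is one time slice, so stacking the rows gives the time-frequency image that an image classifier could take as input. A practical pipeline would normally use an FFT routine from a numerical library instead of this direct DFT.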
5
Ramig L, Halpern A, Spielman J, Fox C, Freeman K. Speech treatment in Parkinson's disease: Randomized controlled trial (RCT). Mov Disord 2018; 33:1777-1791. [PMID: 30264896 PMCID: PMC6261685 DOI: 10.1002/mds.27460] [Citation(s) in RCA: 112] [Impact Index Per Article: 18.7] [Received: 02/20/2018] [Revised: 05/07/2018] [Accepted: 05/22/2018] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND As many as 89% of people with Parkinson's disease (PD) develop speech disorders. OBJECTIVES This randomized controlled trial evaluated two speech treatments for PD matched in intensive dosage and high-effort mode of delivery, differing in subsystem target: voice (respiratory-laryngeal) versus articulation (orofacial-articulatory). METHODS PD participants were randomized to 1-month LSVT LOUD (voice), LSVT ARTIC (articulation), or UNTXPD (untreated) groups. Speech clinicians specializing in PD delivered treatment. Primary outcome was sound pressure level (SPL) in reading and spontaneous speech, and secondary outcome was participant-reported Modified Communication Effectiveness Index (CETI-M), evaluated at baseline, 1, and 7 months. Healthy controls were matched by age and sex. RESULTS At baseline, the combined PD group (n = 64) was significantly worse than healthy controls (n = 20) for SPL (P < 0.05) and CETI-M (P = 0.0001). At 1 and 7 months, SPL between-group comparisons showed greater improvements for LSVT LOUD (n = 22) than LSVT ARTIC (n = 20; P < 0.05) and UNTXPD (n = 22; P < 0.05). Sound pressure level differences between LSVT ARTIC and UNTXPD at 1 and 7 months were not significant (P > 0.05). For CETI-M, between-group comparisons showed greater improvements for LSVT LOUD and LSVT ARTIC than UNTXPD at 1 month (P = 0.02; P = 0.02). At 7 months, CETI-M between-group differences were not significant (P = 0.08). Within-group CETI-M improvements for LSVT LOUD were maintained through 7 months (P = 0.0011). CONCLUSIONS LSVT LOUD showed greater improvements than both LSVT ARTIC and UNTXPD for SPL at 1 and 7 months. For CETI-M, both LSVT LOUD and LSVT ARTIC improved at 1 month relative to UNTXPD. Only LSVT LOUD maintained CETI-M improvements at 7 months. © 2018 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society.
Affiliation(s)
- Lorraine Ramig
  - University of Colorado‐Boulder, Boulder, Colorado, USA
  - National Center for Voice and Speech‐Denver, Denver, Colorado, USA
  - Columbia University‐New York City, New York, New York, USA
  - LSVT Global, Inc.‐Tucson, Tucson, Arizona, USA
- Angela Halpern
  - University of Colorado‐Boulder, Boulder, Colorado, USA
  - National Center for Voice and Speech‐Denver, Denver, Colorado, USA
  - LSVT Global, Inc.‐Tucson, Tucson, Arizona, USA
- Jennifer Spielman
  - University of Colorado‐Boulder, Boulder, Colorado, USA
  - National Center for Voice and Speech‐Denver, Denver, Colorado, USA
- Cynthia Fox
  - National Center for Voice and Speech‐Denver, Denver, Colorado, USA
  - LSVT Global, Inc.‐Tucson, Tucson, Arizona, USA
6
Perju-Dumbrava L, Lau K, Phyland D, Papanikolaou V, Finlay P, Beare R, Bardin P, Stuckey S, Kempster P, Thyagarajan D. Arytenoid cartilage movements are hypokinetic in Parkinson's disease: A quantitative dynamic computerised tomographic study. PLoS One 2017; 12:e0186611. [PMID: 29099841 PMCID: PMC5669420 DOI: 10.1371/journal.pone.0186611] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/14/2016] [Accepted: 10/04/2017] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Voice change is one of the earliest features of Parkinson's disease. However, quantitative studies of vocal fold dynamics, which are needed to provide insight into disease biology, aid diagnosis, or track progression, are few. METHODS We therefore quantified arytenoid cartilage movements and glottic area during repeated phonation in 15 patients with Parkinson's disease (symptom duration < 6 years) and 19 controls, with 320-slice computerised tomography (CT). We related these measures to perceptual voice evaluations and spirometry. We hypothesised that Parkinson's disease patients have a smaller inter-arytenoid distance, a preserved or larger glottic area because vocal cord bowing has previously been reported, less variability in loudness, more voice dysdiadochokinesis and breathiness and a shortened phonation time because of arytenoid hypokinesis relative to glottic area. RESULTS Inter-arytenoid distance in Parkinson's disease patients was moderately smaller (Mdn = 0.106, IQR = 0.091-0.116) than in controls (Mdn = 0.132, IQR = 0.116-0.166) (W = 212, P = 0.015, r = -0.42), normalised for anatomical and other inter-subject variance, analysed with two-tailed Wilcoxon's rank sum test. This finding was confirmed in a linear mixed model analysis: Parkinson's disease significantly predicted a reduction in the dependent variable, inter-arytenoid distance (b = -0.87, SEb = 0.39, 95% CI [-1.66, -0.08], t(31) = -2.24, P = 0.032). There was no difference in glottic area. On perceptual voice evaluation, patients had more breathiness and dysdiadochokinesis, a shorter maximum phonation time, and less variability in loudness than controls. There was no difference in spirometry after adjustment for smoking history. CONCLUSIONS As predicted, vocal fold adduction movements are reduced in Parkinson's disease on repeated phonation but glottic area is maintained. Some perceptual characteristics of Parkinsonian speech reflect these changes. We are the first to use 320-slice CT to study laryngeal motion. Our findings indicate how Parkinson's disease affects intrinsic laryngeal muscle position and excursion.
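The reported effect size can be related to the rank-sum statistic through one common convention, r = Z / sqrt(N). The sketch below rests on several assumptions that the abstract does not confirm: that W = 212 is the Mann-Whitney U form of the statistic, that the normal approximation is used without tie or continuity correction, and that the group sizes are the 15 patients and 19 controls stated in the abstract. The function name `rank_sum_effect_size` is hypothetical.

```python
import math


def rank_sum_effect_size(u_stat, n1, n2):
    """Normal-approximation z and effect size r = z / sqrt(N) for a rank-sum test."""
    mean_u = n1 * n2 / 2                            # expected U under the null
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # null standard deviation, no ties
    z = (u_stat - mean_u) / sd_u
    r = z / math.sqrt(n1 + n2)
    return z, r


# Values taken from the abstract; treating W as U is an assumption.
z, r = rank_sum_effect_size(u_stat=212, n1=15, n2=19)
print(f"z = {z:.2f}, |r| = {abs(r):.2f}")
```

Under these assumptions the magnitude comes out at roughly 0.41, close to the reported |r| of 0.42. The small gap may reflect a different way of deriving Z, and the sign depends only on which group is listed first.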
Affiliation(s)
- Ken Lau
  - Department of Medical Imaging, Monash Medical Center, Clayton, Victoria, Australia
- Debbie Phyland
  - Department of Surgery, Monash Medical Center, Clayton, Victoria, Australia
- Vicki Papanikolaou
  - Department of Respiratory Medicine, Monash Medical Center, Clayton, Victoria, Australia
- Paul Finlay
  - Department of Respiratory Medicine, Monash Medical Center, Clayton, Victoria, Australia
- Richard Beare
  - Department of Neuroscience, Monash Medical Center, Clayton, Victoria, Australia
- Philip Bardin
  - Department of Respiratory Medicine, Monash Medical Center, Clayton, Victoria, Australia
- Stephen Stuckey
  - Department of Medical Imaging, Monash Medical Center, Clayton, Victoria, Australia
- Peter Kempster
  - Department of Neuroscience, Monash Medical Center, Clayton, Victoria, Australia
- Dominic Thyagarajan
  - Department of Neuroscience, Monash Medical Center, Clayton, Victoria, Australia
7
Fujii S, Wan CY. The Role of Rhythm in Speech and Language Rehabilitation: The SEP Hypothesis. Front Hum Neurosci 2014; 8:777. [PMID: 25352796 PMCID: PMC4195275 DOI: 10.3389/fnhum.2014.00777] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 07/03/2014] [Accepted: 09/12/2014] [Indexed: 11/16/2022] Open
Abstract
For thousands of years, human beings have engaged in rhythmic activities such as drumming, dancing, and singing. Rhythm can be a powerful medium to stimulate communication and social interactions, due to the strong sensorimotor coupling. For example, the mere presence of an underlying beat or pulse can result in spontaneous motor responses such as hand clapping, foot stepping, and rhythmic vocalizations. Examining the relationship between rhythm and speech is fundamental not only to our understanding of the origins of human communication but also in the treatment of neurological disorders. In this paper, we explore whether rhythm has therapeutic potential for promoting recovery from speech and language dysfunctions. Although clinical studies are limited to date, existing experimental evidence demonstrates rich rhythmic organization in both music and language, as well as overlapping brain networks that are crucial in the design of rehabilitation approaches. Here, we propose the “SEP” hypothesis, which postulates that (1) “sound envelope processing” and (2) “synchronization and entrainment to pulse” may help stimulate brain networks that underlie human communication. Ultimately, we hope that the SEP hypothesis will provide a useful framework for facilitating rhythm-based research in various patient populations.
Affiliation(s)
- Shinya Fujii
  - Heart and Stroke Foundation Canadian Partnership for Stroke Recovery, Sunnybrook Research Institute, Toronto, ON, Canada
- Catherine Y Wan
  - Department of Radiology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
8
Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective. Behav Brain Sci 2014; 37:529-46. [DOI: 10.1017/s0140525x13003099] [Citation(s) in RCA: 147] [Impact Index Per Article: 14.7] [Indexed: 12/18/2022]
Abstract
Any account of “what is special about the human brain” (Passingham 2008) must specify the neural basis of our unique ability to produce speech and delineate how these remarkable motor capabilities could have emerged in our hominin ancestors. Clinical data suggest that the basal ganglia provide a platform for the integration of primate-general mechanisms of acoustic communication with the faculty of articulate speech in humans. Furthermore, neurobiological and paleoanthropological data point at a two-stage model of the phylogenetic evolution of this crucial prerequisite of spoken language: (i) monosynaptic refinement of the projections of motor cortex to the brainstem nuclei that steer laryngeal muscles, presumably, as part of a “phylogenetic trend” associated with increasing brain size during hominin evolution; (ii) subsequent vocal-laryngeal elaboration of cortico-basal ganglia circuitries, driven by human-specific FOXP2 mutations. This concept implies vocal continuity of spoken language evolution at the motor level, elucidating the deep entrenchment of articulate speech into a “nonverbal matrix” (Ingold 1994), which is not accounted for by gestural-origin theories. Moreover, it provides a solution to the question for the adaptive value of the “first word” (Bickerton 2009) since even the earliest and most simple verbal utterances must have increased the versatility of vocal displays afforded by the preceding elaboration of monosynaptic corticobulbar tracts, giving rise to enhanced social cooperation and prestige. At the ontogenetic level, the proposed model assumes age-dependent interactions between the basal ganglia and their cortical targets, similar to vocal learning in some songbirds. In this view, the emergence of articulate speech builds on the “renaissance” of an ancient organizational principle and, hence, may represent an example of “evolutionary tinkering” (Jacob 1977).
9
Schalling E, Gustafsson J, Ternström S, Bulukin Wilén F, Södersten M. Effects of Tactile Biofeedback by a Portable Voice Accumulator on Voice Sound Level in Speakers with Parkinson's Disease. J Voice 2013; 27:729-37. [DOI: 10.1016/j.jvoice.2013.04.014] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 01/29/2013] [Accepted: 04/29/2013] [Indexed: 12/01/2022]