1
Ibarra EJ, Arias-Londoño JD, Zañartu M, Godino-Llorente JI. Towards a Corpus (and Language)-Independent Screening of Parkinson's Disease from Voice and Speech through Domain Adaptation. Bioengineering (Basel) 2023; 10:1316. [PMID: 38002440 PMCID: PMC10669342 DOI: 10.3390/bioengineering10111316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 11/03/2023] [Accepted: 11/10/2023] [Indexed: 11/26/2023] Open
Abstract
End-to-end deep learning models have shown promising results for the automatic screening of Parkinson's disease by voice and speech. However, these models often suffer degradation in their performance when applied to scenarios involving multiple corpora. In addition, they also show corpus-dependent clusterings. These facts indicate a lack of generalisation or the presence of certain shortcuts in the decision, and also suggest the need for developing new corpus-independent models. In this respect, this work explores the use of domain adversarial training as a viable strategy to develop models that retain their discriminative capacity to detect Parkinson's disease across diverse datasets. The paper presents three deep learning architectures and their domain adversarial counterparts. The models were evaluated with sustained vowels and diadochokinetic recordings extracted from four corpora with different demographics, dialects or languages, and recording conditions. The results showed that the space distribution of the embedding features extracted by the domain adversarial networks exhibits a higher intra-class cohesion. This behaviour is supported by a decrease in the variability and inter-domain divergence computed within each class. The findings suggest that domain adversarial networks are able to learn the common characteristics present in Parkinsonian voice and speech, which are supposed to be corpus, and consequently, language independent. Overall, this effort provides evidence that domain adaptation techniques refine the existing end-to-end deep learning approaches for Parkinson's disease detection from voice and speech, achieving more generalizable models.
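The abstract above does not describe how the adversarial objective is implemented. A common formulation of domain adversarial training is the gradient reversal approach, and the sketch below illustrates it on a deliberately tiny model, assuming a scalar feature extractor `w`, a class head `a`, and a domain (corpus) head `b`. All names, including `dann_gradients` and the trade-off weight `lam`, are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def dann_gradients(w, a, b, x, y, d, lam):
    """Gradients for one sample of a toy domain adversarial model.

    Assumed (hypothetical) model: feature f = w * x, class head
    sigmoid(a * f) for PD vs. control label y, domain head
    sigmoid(b * f) for corpus label d. Both heads use binary
    cross-entropy, whose derivative w.r.t. the logit is (p - target).
    """
    f = w * x
    p_y = sigmoid(a * f)  # predicted probability of the class label
    p_d = sigmoid(b * f)  # predicted probability of the domain label

    # Each head minimises its own loss in the usual way.
    grad_a = (p_y - y) * f
    grad_b = (p_d - d) * f

    # Gradients of both losses with respect to the shared feature.
    dLy_df = (p_y - y) * a
    dLd_df = (p_d - d) * b

    # Gradient reversal: the domain gradient reaches the feature
    # extractor with its sign flipped and scaled by lam, pushing the
    # features towards being uninformative about the corpus.
    grad_w = (dLy_df - lam * dLd_df) * x
    return grad_w, grad_a, grad_b
```

With `lam = 0` the update reduces to ordinary supervised training. Larger values trade class discrimination against corpus invariance.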
Affiliation(s)
- Emiro J. Ibarra
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Avenida España 1680, Casilla 110-V, Valparaíso 2390123, Chile
- Julián D. Arias-Londoño
- Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avda. Ciudad Universitaria, 30, 28040 Madrid, Spain
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Avenida España 1680, Casilla 110-V, Valparaíso 2390123, Chile
- Juan I. Godino-Llorente
- Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avda. Ciudad Universitaria, 30, 28040 Madrid, Spain
2
Suppa A, Asci F, Costantini G, Bove F, Piano C, Pistoia F, Cerroni R, Brusa L, Cesarini V, Pietracupa S, Modugno N, Zampogna A, Sucapane P, Pierantozzi M, Tufo T, Pisani A, Peppe A, Stefani A, Calabresi P, Bentivoglio AR, Saggio G. Effects of deep brain stimulation of the subthalamic nucleus on patients with Parkinson's disease: a machine-learning voice analysis. Front Neurol 2023; 14:1267360. [PMID: 37928137 PMCID: PMC10622670 DOI: 10.3389/fneur.2023.1267360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 09/20/2023] [Indexed: 11/07/2023] Open
Abstract
Introduction: Deep brain stimulation of the subthalamic nucleus (STN-DBS) can exert relevant effects on the voice of patients with Parkinson's disease (PD). In this study, we used artificial intelligence to objectively analyze the voices of PD patients with STN-DBS.
Materials and methods: In a cross-sectional study, we enrolled 108 controls and 101 patients with PD. The cohort of PD was divided into two groups: the first group included 50 patients with STN-DBS, and the second group included 51 patients receiving the best medical treatment. The voices were clinically evaluated using the Unified Parkinson's Disease Rating Scale part-III subitem for voice (UPDRS-III-v). We recorded and then analyzed voices using specific machine-learning algorithms. The likelihood ratio (LR) was also calculated as an objective measure for clinical-instrumental correlations.
Results: Clinically, voice impairment was greater in STN-DBS patients than in those who received oral treatment. Using machine learning, we objectively and accurately distinguished between the voices of STN-DBS patients and those under oral treatments. We also found significant clinical-instrumental correlations since the greater the LRs, the higher the UPDRS-III-v scores.
Discussion: STN-DBS deteriorates speech in patients with PD, as objectively demonstrated by machine-learning voice analysis.
Affiliation(s)
- Antonio Suppa
- Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
- IRCCS Neuromed Institute, Pozzilli, IS, Italy
- Francesco Asci
- Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
- IRCCS Neuromed Institute, Pozzilli, IS, Italy
- Giovanni Costantini
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
- Francesco Bove
- Neurology Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Carla Piano
- Neurology Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Francesca Pistoia
- Department of Biotechnological and Applied Clinical Sciences, University of L'Aquila, Coppito, AQ, Italy
- Neurology Unit, San Salvatore Hospital, Coppito, AQ, Italy
- Rocco Cerroni
- Department of System Medicine, University of Rome Tor Vergata, Rome, Italy
- Livia Brusa
- Neurology Unit, S. Eugenio Hospital, Rome, Italy
- Valerio Cesarini
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
- Sara Pietracupa
- Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
- IRCCS Neuromed Institute, Pozzilli, IS, Italy
- Tommaso Tufo
- Neurosurgery Unit, Policlinico A. Gemelli University Hospital Foundation IRCCS, Rome, Italy
- Neurosurgery Department, Fakeeh University Hospital, Dubai, United Arab Emirates
- Antonio Pisani
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- IRCCS Mondino Foundation, Pavia, Italy
- Alessandro Stefani
- Department of System Medicine, University of Rome Tor Vergata, Rome, Italy
- Paolo Calabresi
- Neurology Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Giovanni Saggio
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
3
Mondol SIMMR, Kim R, Lee S. Hybrid Machine Learning Framework for Multistage Parkinson's Disease Classification Using Acoustic Features of Sustained Korean Vowels. Bioengineering (Basel) 2023; 10:984. [PMID: 37627869 PMCID: PMC10451837 DOI: 10.3390/bioengineering10080984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Revised: 08/17/2023] [Accepted: 08/18/2023] [Indexed: 08/27/2023] Open
Abstract
Recent research has achieved a great classification rate for separating healthy people from those with Parkinson's disease (PD) using speech and the voice. However, these studies have primarily treated early and advanced stages of PD as equal entities, neglecting the distinctive speech impairments and other symptoms that vary across the different stages of the disease. To address this limitation, and improve diagnostic precision, this study assesses the selected acoustic features of dysphonia, as they relate to PD and the Hoehn and Yahr stages, by combining various preprocessing techniques and multiple classification algorithms, to create a comprehensive and robust solution for classification tasks. The dysphonia features extracted from the three sustained Korean vowels /아/(a), /이/(i), and /우/(u) exhibit diversity and strong correlations. To address this issue, the analysis of variance F-Value feature selection classifier from scikit-learn was employed, to identify the topmost relevant features. Additionally, to overcome the class imbalance problem, the synthetic minority over-sampling technique was utilized. To ensure fair comparisons, and mitigate the influence of individual classifiers, four commonly used machine learning classifiers, namely random forest (RF), support vector machine (SVM), k-nearest neighbor (kNN), and multi-layer perceptron (MLP), were employed. This approach enables a comprehensive evaluation of the feature extraction methods, and minimizes the variance in the final classification models. The proposed hybrid machine learning pipeline using the acoustic features of sustained vowels efficiently detects the early and mid-advanced stages of PD with a detection accuracy of 95.48%, and with a detection accuracy of 86.62% for the 4-stage, and a detection accuracy of 89.48% for the 3-stage classification of PD. 
This study successfully demonstrates the significance of utilizing the diverse acoustic features of dysphonia in the classification of PD and its stages.
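The feature-selection step mentioned in the abstract ranks features by their one-way analysis of variance F-value. As a rough illustration of what that score measures, the sketch below computes it with the standard library only and keeps the top-ranked features. In scikit-learn the corresponding tools are `f_classif` combined with `SelectKBest`. The helper names `anova_f` and `top_k_features` and the toy data are assumptions for illustration, not the authors' code.

```python
def anova_f(values, labels):
    """One-way ANOVA F-value of a single feature across class labels."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Between-class and within-class sums of squares.
    ssb = sum(len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
              for vs in groups.values())
    ssw = sum((v - sum(vs) / len(vs)) ** 2
              for vs in groups.values() for v in vs)

    if ssw == 0:
        # Classes are perfectly separated (or the feature is constant).
        return float("inf") if ssb > 0 else 0.0
    return (ssb / (k - 1)) / (ssw / (n - k))


def top_k_features(rows, labels, k):
    """Indices of the k features with the largest F-value, plus all scores."""
    n_features = len(rows[0])
    scores = [anova_f([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: feature 0 separates the two classes, feature 1 does not.
rows = [[1, 1], [2, 2], [3, 1], [4, 2], [5, 1], [6, 2]]
labels = [0, 0, 0, 1, 1, 1]
selected, scores = top_k_features(rows, labels, k=1)
```

A high F-value means the class means of a feature differ strongly relative to its spread within each class, which is why such features are kept.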
Affiliation(s)
- S. I. M. M. Raton Mondol
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
- Ryul Kim
- Department of Neurology, Inha University Hospital, Inha University College of Medicine, Incheon 22212, Republic of Korea
- Sangmin Lee
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
4
Ge W, Lueck C, Suominen H, Apthorp D. Has machine learning over-promised in healthcare? A critical analysis and a proposal for improved evaluation, with evidence from Parkinson’s disease. Artif Intell Med 2023; 139:102524. [PMID: 37100503 DOI: 10.1016/j.artmed.2023.102524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 03/17/2023]
Abstract
Adoption of artificial intelligence (AI) by the medical community has long been anticipated, endorsed by a stream of machine learning literature showcasing AI systems that yield extraordinary performance. However, many of these systems are likely over-promising and will under-deliver in practice. One key reason is the community's failure to acknowledge and address the presence of inflationary effects in the data. These simultaneously inflate evaluation performance and prevent a model from learning the underlying task, thus severely misrepresenting how that model would perform in the real world. This paper investigated the impact of these inflationary effects on healthcare tasks, as well as how these effects can be addressed. Specifically, we defined three inflationary effects that occur in medical data sets and allow models to easily reach small training losses and prevent skillful learning. We investigated two data sets of sustained vowel phonation from participants with and without Parkinson's disease, and revealed that published models which have achieved high classification performances on these were artificially enhanced due to the inflationary effects. Our experiments showed that removing each inflationary effect corresponded with a decrease in classification accuracy, and that removing all inflationary effects reduced the evaluated performance by up to 30%. Additionally, the performance on a more realistic test set increased, suggesting that the removal of these inflationary effects enabled the model to better learn the underlying task and generalize. Source code is available at https://github.com/Wenbo-G/pd-phonation-analysis under the MIT license.
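The abstract does not name its three inflationary effects, so the following is only an assumed example of the general kind of safeguard such an evaluation might involve: keeping every recording of a given participant on one side of the train/test split, so that a model cannot score well merely by recognising speakers. The function `subject_wise_split` and its parameters are hypothetical and are not taken from the linked repository.

```python
import random


def subject_wise_split(subject_ids, test_fraction=0.25, seed=0):
    """Split recording indices so that no subject appears in both sets.

    subject_ids holds one (hypothetical) participant identifier per
    recording, in recording order.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)  # seeded for a reproducible split
    rng.shuffle(subjects)

    # Hold out at least one whole subject.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train_idx = [i for i, s in enumerate(subject_ids)
                 if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids)
                if s in test_subjects]
    return train_idx, test_idx


# Toy example: four participants with two recordings each.
ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
train_idx, test_idx = subject_wise_split(ids)
```

Splitting by recording instead of by participant would let both recordings of the same person land on opposite sides of the split.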
6
Ngo QC, Motin MA, Pah ND, Drotár P, Kempster P, Kumar D. Computerized analysis of speech and voice for Parkinson's disease: A systematic review. Comput Methods Programs Biomed 2022; 226:107133. [PMID: 36183641 DOI: 10.1016/j.cmpb.2022.107133] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/23/2022] [Revised: 09/13/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE: Speech impairment is an early symptom of Parkinson's disease (PD). This study has summarized the literature related to speech and voice in detecting PD and assessing its severity.
METHODS: A systematic review of the literature from 2010 to 2021 to investigate analysis methods and signal features. The keywords "Automatic analysis" in conjunction with "PD speech" or "PD voice" were used, and the PubMed and ScienceDirect databases were searched. A total of 838 papers were found on the first run, of which 189 were selected. One hundred and forty-seven were found to be suitable for the review. The different datasets, recording protocols, signal analysis methods and features that were reported are listed. Values of the features that separate PD patients from healthy controls were tabulated. Finally, the barriers that limit the wide use of computerized speech analysis are discussed.
RESULTS: Speech and voice may be valuable markers for PD. However, large differences between the datasets make it difficult to compare different studies. In addition, speech analytic methods that are not informed by physiological understanding may alienate clinicians.
CONCLUSIONS: The potential usefulness of speech and voice for the detection and assessment of PD is confirmed by evidence from the classification and correlation results.
Affiliation(s)
- Mohammod Abdul Motin
- Biosignals Lab, RMIT University, Melbourne, Australia; Department of Electrical & Electronic Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh
- Nemuel Daniel Pah
- Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Indonesia
- Peter Drotár
- Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001, Kosice, Slovakia
- Peter Kempster
- Neurosciences Department, Monash Health, Clayton, VIC, Australia; Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC, Australia
- Dinesh Kumar
- Biosignals Lab, RMIT University, Melbourne, Australia.