1
Convey RB, Laukkanen AM, Ylinen S, Penttilä N. Analysis of Voice Changes in Early-Stage Parkinson's Disease with AVQI and ABI: A Follow-up Study. J Voice 2024:S0892-1997(24)00160-7. [PMID: 38897855 DOI: 10.1016/j.jvoice.2024.05.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/17/2024] [Accepted: 05/18/2024] [Indexed: 06/21/2024]
Abstract
OBJECTIVES The purpose of this pilot study was to examine voice quality changes in individuals with early-stage Parkinson's disease (PD) utilizing the Acoustic Voice Quality Index (AVQI) and Acoustic Breathiness Index (ABI) over approximately a 1-year period. STUDY DESIGN Follow-up study. METHODS Baseline and follow-up data were gathered from the PDSTUlong speech corpus. The data for both time points included: speaker background information, sustained vowels, reading samples, and measures of PD severity (Hoehn and Yahr scores and Unified Parkinson's Disease Rating Scale III scores [UPDRS-III]). All speakers (N = 12) were native Finnish speakers. AVQI v03.01 and ABI analyses were completed in VOXplot v2.0.1. Changes in AVQI and ABI scores between baseline and follow-up were examined via causal analysis. Further, AVQI and ABI were analyzed in relation to measures of PD severity. RESULTS Baseline mean AVQI score was 1.79 (range 0.14-4.83, SD = 1.60), whereas follow-up mean AVQI score was 2.25 (range 0.55-4.53, SD = 1.36). Baseline mean ABI score, in turn, was 2.92 (range 1.27-5.31, SD = 1.57), whereas follow-up mean ABI score was 3.42 (range 1.40-5.40, SD = 1.38). A significant difference was found between baseline and follow-up measures for both AVQI (Z = -2.002, P = 0.045) and ABI (Z = -2.197, P = 0.028). A significant difference in smoothed cepstral peak prominence (Z = -2.118, P = 0.034) and harmonics-to-noise ratio (Z = -1.961, P = 0.050) was also found between the two measurement periods. Changes in AVQI and ABI were not correlated with the change in measures of PD severity. CONCLUSION Over approximately 1 year, a statistically significant change was observed in AVQI and ABI scores, even in such a small dataset. The specific qualities of breathiness and hoarseness showed the most significant progression. Changes in voice quality were more prominent in ABI analysis.
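As a reading aid, the reported P values can be cross-checked against the reported Z statistics. The sketch below assumes each Z is referred to a standard normal distribution with a two-sided alternative; the abstract does not name the test, so this is an assumption rather than the authors' stated procedure. The function name `two_sided_p` and the dictionary labels are illustrative only.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a statistic z, ASSUMING z ~ N(0, 1) under the null."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics copied from the abstract; the keys are shorthand labels
reported_z = {
    "AVQI": -2.002,
    "ABI": -2.197,
    "smoothed CPP": -2.118,
    "HNR": -1.961,
}

for name, z in reported_z.items():
    print(f"{name}: Z = {z:+.3f}, two-sided p = {two_sided_p(z):.3f}")
```

Under that assumption, the recomputed values agree with the P values quoted in the abstract to three decimal places.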
Affiliation(s)
- Rachel B Convey
  - Faculty of Social Sciences, Tampere University, Tampere, Finland.
- Sari Ylinen
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
- Nelly Penttilä
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
2
Saghiri MA, Vakhnovetsky J, Amanabi M, Karamifar K, Farhadi M, Amini SB, Conte M. Exploring the impact of type II diabetes mellitus on voice quality. Eur Arch Otorhinolaryngol 2024; 281:2707-2716. [PMID: 38319369 DOI: 10.1007/s00405-024-08485-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 01/15/2024] [Indexed: 02/07/2024]
Abstract
PURPOSE This cross-sectional study aimed to investigate the potential of voice analysis as a prescreening tool for type II diabetes mellitus (T2DM) by examining the differences in voice recordings between non-diabetic and T2DM participants. METHODS Sixty participants diagnosed as non-diabetic (n = 30) or T2DM (n = 30) were recruited on the basis of specific inclusion and exclusion criteria in Iran between February 2020 and September 2023. Participants were matched according to their year of birth and then placed into six age categories. Using the WhatsApp application, participants recorded the translated versions of speech elicitation tasks. Seven acoustic features [fundamental frequency, jitter, shimmer, harmonic-to-noise ratio (HNR), cepstral peak prominence (CPP), voice onset time (VOT), and formants (F1-F2)] were extracted from each recording and analyzed using Praat software. Data were analyzed with Kolmogorov-Smirnov, two-way ANOVA, post hoc Tukey, binary logistic regression, and Student's t tests. RESULTS The comparison between groups showed significant differences in fundamental frequency, jitter, shimmer, CPP, and HNR (p < 0.05), while there were no significant differences in formants and VOT (p > 0.05). Binary logistic regression showed that shimmer was the most significant predictor of the disease group. There was also a significant difference between diabetes status and age, in the case of CPP. CONCLUSIONS Participants with type II diabetes exhibited significant vocal variations compared to non-diabetic controls.
Affiliation(s)
- M A Saghiri
  - Biomaterial and Prosthodontics Laboratory, Department of Restorative Dentistry, Rutgers School of Dental Medicine, Rutgers Biomedical and Health Sciences, MSB C639A, 185 South Orange Avenue, Newark, NJ, 07103, USA.
  - Department of Endodontics, University of the Pacific, Arthur A. Dugoni School of Dentistry, San Francisco, CA, USA.
- Julia Vakhnovetsky
  - Sector of Innovation in Dentistry, Dr. Hajar Afsar Lajevardi Research Cluster (DHAL), Hackensack, NJ, USA
  - Rutgers School of Dental Medicine, Newark, NJ, USA
  - University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Kasra Karamifar
  - Sector of Innovation in Medicine and Dentistry, Dr. Hajar Afsar Lajevardi Research Cluster (DHAL), Hackensack, NJ, USA
- Maziar Farhadi
  - Sector of Innovation in Medicine and Dentistry, Dr. Hajar Afsar Lajevardi Research Cluster (DHAL), Hackensack, NJ, USA
- Saeid B Amini
  - Dr. Hajar Afsar Lajevardi Research Cluster (DHAL), Hackensack, NJ, USA
- Michael Conte
  - Office for Clinical Affairs, Rutgers School of Dental Medicine, Newark, NJ, USA
3
Schultz BG, Joukhadar Z, Nattala U, Quiroga MDM, Noffs G, Rojas S, Reece H, Van Der Walt A, Vogel AP. Disease Delineation for Multiple Sclerosis, Friedreich Ataxia, and Healthy Controls Using Supervised Machine Learning on Speech Acoustics. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4278-4285. [PMID: 37792655 DOI: 10.1109/tnsre.2023.3321874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/06/2023]
Abstract
Neurodegenerative disease often affects speech. Speech acoustics can be used as objective clinical markers of pathology. Previous investigations of pathological speech have primarily compared controls with one specific condition and excluded comorbidities. We broaden the utility of speech markers by examining how multiple acoustic features can delineate diseases. We used supervised machine learning with gradient boosting (CatBoost) to delineate healthy speech from speech of people with multiple sclerosis or Friedreich ataxia. Participants performed a diadochokinetic task where they repeated alternating syllables. We subjected 74 spectral and temporal prosodic features from the speech recordings to machine learning. Results showed that Friedreich ataxia, multiple sclerosis and healthy controls were all identified with high accuracy (over 82%). Twenty-one acoustic features were strong markers of neurodegenerative diseases, falling under the categories of spectral qualia, spectral power, and speech rate. We demonstrated that speech markers can delineate neurodegenerative diseases and distinguish healthy speech from pathological speech with high accuracy. Findings emphasize the importance of examining speech outcomes when assessing indicators of neurodegenerative disease. We propose large-scale initiatives to broaden the scope for differentiating other neurological diseases and affective disorders.
4
Ngo QC, Motin MA, Pah ND, Drotár P, Kempster P, Kumar D. Computerized analysis of speech and voice for Parkinson's disease: A systematic review. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 226:107133. [PMID: 36183641 DOI: 10.1016/j.cmpb.2022.107133] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/23/2022] [Revised: 09/13/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Speech impairment is an early symptom of Parkinson's disease (PD). This study has summarized the literature related to speech and voice in detecting PD and assessing its severity. METHODS A systematic review of the literature from 2010 to 2021 to investigate analysis methods and signal features. The keywords "Automatic analysis" in conjunction with "PD speech" or "PD voice" were used, and the PubMed and ScienceDirect databases were searched. A total of 838 papers were found on the first run, of which 189 were selected. One hundred and forty-seven were found to be suitable for the review. The different datasets, recording protocols, signal analysis methods and features that were reported are listed. Values of the features that separate PD patients from healthy controls were tabulated. Finally, the barriers that limit the wide use of computerized speech analysis are discussed. RESULTS Speech and voice may be valuable markers for PD. However, large differences between the datasets make it difficult to compare different studies. In addition, speech analytic methods that are not informed by physiological understanding may alienate clinicians. CONCLUSIONS The potential usefulness of speech and voice for the detection and assessment of PD is confirmed by evidence from the classification and correlation results.
Affiliation(s)
- Mohammod Abdul Motin
  - Biosignals Lab, RMIT University, Melbourne, Australia; Department of Electrical & Electronic Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh
- Nemuel Daniel Pah
  - Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Indonesia
- Peter Drotár
  - Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001, Kosice, Slovakia
- Peter Kempster
  - Neurosciences Department, Monash Health, Clayton, VIC, Australia; Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC, Australia
- Dinesh Kumar
  - Biosignals Lab, RMIT University, Melbourne, Australia.
5
Bao G, Lin M, Sang X, Hou Y, Liu Y, Wu Y. Classification of Dysphonic Voices in Parkinson's Disease with Semi-Supervised Competitive Learning Algorithm. BIOSENSORS 2022; 12:502. [PMID: 35884305 PMCID: PMC9312485 DOI: 10.3390/bios12070502] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/10/2022] [Revised: 07/04/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
This article proposes a novel semi-supervised competitive learning (SSCL) algorithm for vocal pattern classification in Parkinson’s disease (PD). The acoustic parameters of voice records were grouped into the families of jitter, shimmer, harmonic-to-noise, frequency, and nonlinear measures. Linear correlations were computed within each acoustic parameter family. According to the correlation matrix results, the jitter, shimmer, and harmonic-to-noise parameters were highly correlated in terms of Pearson’s correlation coefficients. Principal component analysis (PCA) was then applied to eliminate the redundant dimensions of the acoustic parameters within each family. The Mann-Whitney-Wilcoxon hypothesis test was used to evaluate whether the PCA-projected features differed significantly between the healthy subjects and PD patients. Eight dominant PCA-projected features were selected based on the eigenvalue threshold criterion and the statistical significance level (p < 0.05) of the hypothesis test. The proposed SSCL algorithm comprises competitive prototype seed selection, K-means optimization, and nearest neighbor classification. The pattern classification experiments showed that the proposed SSCL method provides excellent diagnostic performance in terms of accuracy (0.838), recall (0.825), specificity (0.85), precision (0.846), F-score (0.835), Matthews correlation coefficient (0.675), area under the receiver operating characteristic curve (0.939), and Kappa coefficient (0.675), which were consistently better than those of conventional KNN or SVM classifiers.
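The reported metrics are not independent of one another: given recall, specificity, and the class proportions, the remaining confusion-matrix metrics follow by arithmetic. The sketch below rebuilds them under the assumption of equally sized healthy and PD groups. The abstract does not state the class balance, so `pos_fraction=0.5` is an assumption, and the function name `metrics_from_rates` is hypothetical. The area under the ROC curve cannot be recovered this way and is left out.

```python
import math


def metrics_from_rates(recall: float, specificity: float, pos_fraction: float = 0.5) -> dict:
    """Derive confusion-matrix metrics from recall and specificity.

    pos_fraction is the ASSUMED share of positive (PD) samples; 0.5 means
    balanced classes, which the abstract does not confirm.
    """
    # Confusion matrix expressed as proportions of all samples
    tp = recall * pos_fraction
    fn = pos_fraction - tp
    tn = specificity * (1.0 - pos_fraction)
    fp = (1.0 - pos_fraction) - tn

    accuracy = tp + tn
    precision = tp / (tp + fp)
    f_score = 2.0 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement
    chance = (tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)
    kappa = (accuracy - chance) / (1.0 - chance)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
        "kappa": kappa,
    }


derived = metrics_from_rates(recall=0.825, specificity=0.85)
for name, value in derived.items():
    print(f"{name}: {value:.4f}")
```

With balanced classes assumed, the derived accuracy, precision, F-score, Matthews correlation coefficient, and Kappa all land within rounding distance of the figures quoted above, which suggests the reported values are internally consistent.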
6
Saghiri MA, Vakhnovetsky A, Vakhnovetsky J. Scoping review of the relationship between diabetes and voice quality. Diabetes Res Clin Pract 2022; 185:109782. [PMID: 35176400 DOI: 10.1016/j.diabres.2022.109782] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/25/2021] [Revised: 01/30/2022] [Accepted: 02/11/2022] [Indexed: 11/17/2022]
Abstract
AIMS The objective of this scoping review is to synthesize all of the known information about the relationship between diabetes mellitus and voice quality and to explore its potential applications for new technology. METHODS We conducted a scoping literature review of articles published between March 2000 and September 2021 using the following databases: PubMed, Web of Science, Scopus, and Embase. Additionally, we did a manual search of Google Scholar. The search strategy abides by the PRISMA-ScR guidelines. Studies pertaining to the relationship between diabetes and the voice were categorized separately for further evaluation. RESULTS Out of the 2732 originally identified articles, nine were ultimately included in this scoping review. The chosen articles address both diabetes and its impact on a variety of vocal parameters. CONCLUSIONS There is currently very little research investigating the relationship between diabetes, neuropathy, and phonatory symptoms. Additionally, existing publications contain some contradictory findings. Further research that incorporates imaging technology is needed to clarify the physiological explanations for the differences observed between healthy individuals and those with diabetes mellitus. Such information can be used to develop noninvasive technology for diabetes diagnosis and monitoring.
Affiliation(s)
- Mohammad Ali Saghiri
  - Department of Restorative Dentistry, Rutgers School of Dental Medicine, NJ, United States; Department of Endodontics, University of the Pacific, Arthur A. Dugoni School of Dentistry, San Francisco, CA, United States.
- Julia Vakhnovetsky
  - Sector of Angiogenesis Regenerative Medicine, Dr. Hajar Afsar Lajevardi Research Cluster (DHAL), Hackensack, NJ, United States; Biomaterial and Prosthodontics Laboratory, Department of Restorative Dentistry, Rutgers School of Dental Medicine, NJ, United States
7
Yu Q, Zou X, Quan F, Dong Z, Yin H, Liu J, Zuo H, Xu J, Han Y, Zou D, Li Y, Cheng O. Parkinson's disease patients with freezing of gait have more severe voice impairment than non-freezers during "ON state". J Neural Transm (Vienna) 2022; 129:277-286. [PMID: 34989833 DOI: 10.1007/s00702-021-02458-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/01/2021] [Accepted: 12/26/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND Speech disorders and freezing of gait (FOG) in Parkinson's disease (PD) may have some common pathological mechanisms. The purpose of this study was to compare the acoustic parameters of PD patients with dopamine-responsive FOG (PD-FOG) and without FOG (PD-nFOG) during "ON state" and explore the ability of "ON state" voice features to distinguish PD-FOG from PD-nFOG. METHODS A total of 120 subjects, including 40 PD patients with dopamine-responsive FOG, 40 PD-nFOG, and 40 healthy controls (HCs) were recruited. All subjects underwent neuropsychological tests. Speech samples were recorded through sustained vowel pronunciation tasks during the "ON state" and then analyzed with the Praat software. A set of 27 voice features was extracted from each sample for comparison. A support vector machine (SVM) was used to build mathematical models to classify PD-FOG and PD-nFOG. RESULTS Compared with PD-nFOG, the jitter, the standard deviation of fundamental frequency (F0SD), the standard deviation of pulse period (pulse period SD) and the noise-to-harmonics ratio (NHR) were increased, and the maximum phonation time (MPT) was decreased in PD-FOG. The above voice features were correlated with the freezing of gait questionnaire (FOGQ). The average accuracy, specificity, and sensitivity of SVM models based on 27 voice features for classifying PD-FOG and PD-nFOG were 73.57%, 75.71%, and 71.43%, respectively. CONCLUSIONS PD-FOG patients have more severe voice impairment than PD-nFOG patients during "ON state".
Affiliation(s)
- Qian Yu
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Xiaoya Zou
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Fengying Quan
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Zhaoying Dong
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Huimei Yin
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Jinjing Liu
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Hongzhou Zuo
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Jiaman Xu
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Yu Han
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Dezhi Zou
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China
- Yongming Li
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400030, China
- Oumei Cheng
  - Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing, 400016, China.
8
Application of Genetic Algorithms for the Selection of Neural Network Architecture in the Monitoring System for Patients with Parkinson’s Disease. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11125470] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
This article describes an approach for collecting and pre-processing data from phone owners, including their voice, in order to classify their condition using data mining methods. The most important results presented are the approaches developed for processing patient voices and the use of genetic algorithms to select the architecture of the neural network in the monitoring system for patients with Parkinson’s disease. The process used to pre-process a person’s voice is described in order to determine the main parameters that can be used in assessing a person’s condition. It is shown that the efficiency of using genetic algorithms for constructing neural networks depends on the composition of the data. As a result, the best accuracy in assessing the patient’s condition can be obtained with a hybrid approach, where one part of the neural network architecture is selected manually, by analysis, while the other part is built automatically.
9
Monitoring Parkinson's disease progression based on recorded speech with missing ordinal responses and replicated covariates. Comput Biol Med 2021; 134:104503. [PMID: 34091382 DOI: 10.1016/j.compbiomed.2021.104503] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Revised: 05/10/2021] [Accepted: 05/15/2021] [Indexed: 11/19/2022]
Abstract
Monitoring Parkinson's Disease (PD) progression is an important task for improving the quality of life of affected people. This task can be performed by extracting features from voice recordings and applying specifically designed statistical models, leading to systems that improve the ability to monitor the progression of PD in an objective, remote, non-invasive, fast, and economically sustainable way. An experiment was conducted with 36 subjects to study the progression of PD over 4 years using the Hoehn and Yahr (HY) scale and features extracted from the phonation of the vowel /a/. The collected dataset had many missing values, which should be addressed jointly with the non-decreasing nature of the disease and the within-subject variability due to the use of replicated features. To handle these issues, a Hidden Markov model for longitudinal data was designed and implemented using a data augmentation scheme based on different latent variables. Markov chain Monte Carlo methods were used to sample from the posterior distribution. The proposed approach has been tested on simulated data, providing good accuracy rates in the context of a multiclass problem. It has also been applied to the real data obtained from the conducted experiment, providing imputed and predicted HY stages compatible with the progression of PD. The conducted experiment and the proposed approach help fill a gap in the scientific literature on experiments and methodologies for tracking PD progression based on acoustic features and the HY scale. This would help to derive an expert system that can be integrated into the protocols of neurology units in hospital centers.
10
An improved framework for Parkinson’s disease prediction using Variational Mode Decomposition-Hilbert spectrum of speech signal. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.04.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]