1
Mračková M, Mareček R, Mekyska J, Košťálová M, Rektorová I. Levodopa may modulate specific speech impairment in Parkinson's disease: an fMRI study. J Neural Transm (Vienna) 2024; 131:181-187. [PMID: 37943390 DOI: 10.1007/s00702-023-02715-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 10/22/2023] [Indexed: 11/10/2023]
Abstract
Hypokinetic dysarthria (HD) is a difficult-to-treat symptom affecting quality of life in patients with Parkinson's disease (PD). Levodopa may partially alleviate some symptoms of HD in PD, but the neural correlates of these effects are not fully understood. The aim of our study was to identify neural mechanisms by which levodopa affects articulation and prosody in patients with PD. Altogether 20 PD patients participated in a task fMRI study (overt sentence reading). Using a single dose of levodopa after an overnight withdrawal of dopaminergic medication, levodopa-induced BOLD signal changes within the articulatory pathway (in regions of interest; ROIs) were studied. We also correlated levodopa-induced BOLD signal changes with the changes in acoustic parameters of speech. We observed no significant changes in acoustic parameters due to acute levodopa administration. After levodopa administration as compared to the OFF dopaminergic condition, patients showed task-induced BOLD signal decreases in the left ventral thalamus (p = 0.0033). The changes in thalamic activation were associated with changes in pitch variation (R = 0.67, p = 0.006), while the changes in caudate nucleus activation were related to changes in the second formant variability which evaluates precise articulation (R = 0.70, p = 0.003). The results are in line with the notion that levodopa does not have a major impact on HD in PD, but it may induce neural changes within the basal ganglia circuitries that are related to changes in speech prosody and articulation.
Affiliation(s)
- Martina Mračková
  - First Department of Neurology, Faculty of Medicine, Masaryk University and St. Anne's University Hospital Brno, Brno, Czech Republic
  - Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University Brno, Brno, Czech Republic
- Radek Mareček
  - Multimodal and Functional Neuroimaging Research Group, Central European Institute of Technology, CEITEC, Masaryk University Brno, Brno, Czech Republic
- Jiří Mekyska
  - Department of Telecommunications, Brno University of Technology, Brno, Czech Republic
- Milena Košťálová
  - Department of Neurology, Faculty of Medicine, Masaryk University and Faculty Hospital Brno, Brno, Czech Republic
- Irena Rektorová
  - First Department of Neurology, Faculty of Medicine, Masaryk University and St. Anne's University Hospital Brno, Brno, Czech Republic
  - Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University Brno, Brno, Czech Republic
2
Skibińska J, Hosek J. Computerized analysis of hypomimia and hypokinetic dysarthria for improved diagnosis of Parkinson's disease. Heliyon 2023; 9:e21175. [PMID: 37908703 PMCID: PMC10613914 DOI: 10.1016/j.heliyon.2023.e21175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 10/07/2023] [Accepted: 10/17/2023] [Indexed: 11/02/2023] Open
Abstract
Background and Objective: An aging society requires easy-to-use approaches for diagnosis and monitoring of neurodegenerative disorders, such as Parkinson's disease (PD), so that clinicians can effectively adjust a treatment policy and improve patients' quality of life. Current methods of PD diagnosis and monitoring usually require the patients to come to a hospital, where they undergo several neurological and neuropsychological examinations. These examinations are usually time-consuming, expensive, and performed just a few times per year. Hence, this study explores the possibility of fusing computerized analysis of hypomimia and hypokinetic dysarthria (two motor symptoms manifested in the majority of PD patients) with the goal of proposing a new methodology of PD diagnosis that could be easily integrated into mHealth systems. Methods: We enrolled 73 PD patients and 46 age- and gender-matched healthy controls, who performed several speech/voice tasks while recorded by a microphone and a camera. Acoustic signals were parametrized in the fields of phonation, articulation and prosody. Video recordings of a face were analyzed in terms of facial landmarks movement. Both modalities were consequently modeled by the XGBoost algorithm. Results: The acoustic analysis enabled diagnosis of PD with 77% balanced accuracy, while in the case of the facial analysis, we observed 81% balanced accuracy. The fusion of both modalities increased the balanced accuracy to 83% (88% sensitivity and 78% specificity). The most informative speech exercise in the multimodality system turned out to be a tongue twister. Additionally, we identified muscle movements that are characteristic of hypomimia. Conclusions: The introduced methodology, which is based on the myriad of speech exercises likewise audio and video modality, allows for the detection of PD with an accuracy of up to 83%. The speech exercise - tongue twisters occurred to be the most valuable from the clinical point of view. Additionally, the clinical interpretation of the created models is illustrated. The presented computer-supported methodology could serve as an extra tool for neurologists in PD detection and the proposed potential solution of mHealth will facilitate the patient's and doctor's life.
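As a quick consistency check on the figures reported in this abstract: for a two-class problem, balanced accuracy is the mean of sensitivity and specificity. The sketch below is only an illustration of that definition; the function name is ours and is not taken from the cited paper.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy of a binary classifier: mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Values reported for the fused audio + video model (88% sensitivity, 78% specificity).
fused = balanced_accuracy(0.88, 0.78)
print(f"balanced accuracy = {fused:.2f}")
```

The result agrees with the 83% balanced accuracy quoted for the fused model.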
Affiliation(s)
- Justyna Skibińska
  - Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 12, Brno, 61600, Czechia
  - Unit of Electrical Engineering, Tampere University, Kalevantie 4, Tampere, 33100, Finland
- Jiri Hosek
  - Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 12, Brno, 61600, Czechia
3
Jegan R, Jayagowri R. Voice pathology detection using optimized convolutional neural networks and explainable artificial intelligence-based analysis. Comput Methods Biomech Biomed Engin 2023:1-17. [PMID: 37850553 DOI: 10.1080/10255842.2023.2270102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 10/08/2023] [Indexed: 10/19/2023]
Abstract
This article proposes a noninvasive computer-aided assessment approach based on an optimized convolutional neural network (CNN) for healthy and pathological voice detection. First, the input voice samples are transformed into mel-spectrogram time-frequency visual representations and fed to the CNN model for training. The time-frequency image captures inherent speech variations beneficial for healthy and pathological voice sample detection. The weights and biases of the trained CNN are further optimized using the artificial bee colony (ABC) optimization algorithm, resulting in an optimized CNN that is used for testing on unseen data. The proposed approach is evaluated using three popular and publicly available datasets: SVD, AVPD and VOICED. Experimental results show that the proposed ABC-optimized CNN model improves accuracy by 1.02% compared with a conventional CNN, illustrating its data-independent discriminative representation ability. Finally, gradient-weighted class activation mapping (Grad-CAM), an explainable artificial intelligence (XAI) technique, is utilized to make the decision understandable.
Affiliation(s)
- Roohum Jegan
  - Department of Electronics and Communication Engineering, BMS College of Engineering, Bengaluru, Karnataka, India
- R Jayagowri
  - Department of Electronics and Communication Engineering, BMS College of Engineering, Bengaluru, Karnataka, India
4
Marchese MR, Sensoli F, Campagnini S, Cianchetti M, Nacci A, Ursino F, D’Alatri L, Galli J, Carrozza MC, Paludetti G, Mannini A. Artificial intelligence for the recognition of benign lesions of vocal folds from audio recordings. Acta Otorhinolaryngol Ital 2023; 43:317-323. [PMID: 37519137 PMCID: PMC10551729 DOI: 10.14639/0392-100x-n2309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Accepted: 03/22/2023] [Indexed: 08/01/2023]
Abstract
Objective: The diagnosis of benign lesions of the vocal fold (BLVF) is still challenging. The analysis of the acoustic signals through the implementation of machine learning models can be a viable solution aimed at offering support for clinical diagnosis. Materials and methods: In this study, a support vector machine was trained and cross-validated (10-fold cross-validation) using 138 features extracted from the acoustic signals of 418 patients with polyps, nodules, oedema, and cysts. The model's performance was presented as accuracy and average F1-score. The results were also analysed in male (M) and female (F) subgroups. Results: The validation accuracy was 55%, 80%, and 54% on the overall cohort, and in M and F, respectively. Better performances were observed in the detection of cysts and nodules (58% and 62%, respectively) vs polyps and oedema (47% and 53%, respectively). The results on each lesion and the different patterns of the model on M and F are in line with clinical observations, obtaining better results on F and detection of sensitive polyps in M. Conclusions: This study showed moderately accurate detection of four types of BLVF using acoustic signals. The analysis of the diagnostic results on gender subgroups highlights different behaviours of the diagnostic model.
Affiliation(s)
- Maria Raffaella Marchese
  - Unità Operativa Complessa di Otorinolaringoiatria, Dipartimento di Neuroscienze, Organi di Senso e Torace, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Federico Sensoli
  - Institute of Biorobotics, Scuola Superiore Sant’Anna, Pontedera, Italy
- Silvia Campagnini
  - Institute of Biorobotics, Scuola Superiore Sant’Anna, Pontedera, Italy
  - IRCCS Fondazione Don Carlo Gnocchi, Firenze, Italy
- Matteo Cianchetti
  - Institute of Biorobotics, Scuola Superiore Sant’Anna, Pontedera, Italy
- Andrea Nacci
  - U.O. Otorinolaringoiatria Audiologia e Foniatria, Azienda Ospedaliero Universitaria Pisana, Pisa, Italy
- Francesco Ursino
  - Istituto Nazionale di Ricerche in Foniatria “G. Bartalena”, Pisa, Italy
- Lucia D’Alatri
  - Unità Operativa Complessa di Otorinolaringoiatria, Dipartimento di Neuroscienze, Organi di Senso e Torace, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Sezione di Otorinolaringoiatria, Dipartimento Universitario Testa-Collo e Organi di Senso, Università Cattolica del Sacro Cuore, Rome, Italy
- Jacopo Galli
  - Unità Operativa Complessa di Otorinolaringoiatria, Dipartimento di Neuroscienze, Organi di Senso e Torace, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Sezione di Otorinolaringoiatria, Dipartimento Universitario Testa-Collo e Organi di Senso, Università Cattolica del Sacro Cuore, Rome, Italy
- Gaetano Paludetti
  - Unità Operativa Complessa di Otorinolaringoiatria, Dipartimento di Neuroscienze, Organi di Senso e Torace, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Sezione di Otorinolaringoiatria, Dipartimento Universitario Testa-Collo e Organi di Senso, Università Cattolica del Sacro Cuore, Rome, Italy
- Andrea Mannini
  - Institute of Biorobotics, Scuola Superiore Sant’Anna, Pontedera, Italy
  - IRCCS Fondazione Don Carlo Gnocchi, Firenze, Italy
5
Frassineti L, Calà F, Sforza E, Onesimo R, Leoni C, Lanatà A, Zampino G, Manfredi C. Quantitative acoustical analysis of genetic syndromes in the number listing task. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
6
Faragó P, Ștefănigă SA, Cordoș CG, Mihăilă LI, Hintea S, Peștean AS, Beyer M, Perju-Dumbravă L, Ileșan RR. CNN-Based Identification of Parkinson's Disease from Continuous Speech in Noisy Environments. Bioengineering (Basel) 2023; 10:531. [PMID: 37237601 DOI: 10.3390/bioengineering10050531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 04/21/2023] [Accepted: 04/24/2023] [Indexed: 05/28/2023] Open
Abstract
Parkinson's disease is a progressive neurodegenerative disorder caused by dopaminergic neuron degeneration. Parkinsonian speech impairment is one of the earliest presentations of the disease and, along with tremor, is suitable for pre-diagnosis. It is defined by hypokinetic dysarthria and accounts for respiratory, phonatory, articulatory, and prosodic manifestations. The topic of this article targets artificial-intelligence-based identification of Parkinson's disease from continuous speech recorded in a noisy environment. The novelty of this work is twofold. First, the proposed assessment workflow performed speech analysis on samples of continuous speech. Second, we analyzed and quantified Wiener filter applicability for speech denoising in the context of Parkinsonian speech identification. We argue that the Parkinsonian features of loudness, intonation, phonation, prosody, and articulation are contained in the speech, speech energy, and Mel spectrograms. Thus, the proposed workflow follows a feature-based speech assessment to determine the feature variation ranges, followed by speech classification using convolutional neural networks. We report the best classification accuracies of 96% on speech energy, 93% on speech, and 92% on Mel spectrograms. We conclude that the Wiener filter improves both feature-based analysis and convolutional-neural-network-based classification performances.
Affiliation(s)
- Paul Faragó
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sebastian-Aurelian Ștefănigă
  - Department of Computer Science, Faculty of Mathematics and Computer Science, West University of Timisoara, 300223 Timisoara, Romania
- Claudia-Georgiana Cordoș
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Laura-Ioana Mihăilă
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sorin Hintea
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Ana-Sorina Peștean
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Michel Beyer
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
  - Medical Additive Manufacturing Research Group (Swiss MAM), Department of Biomedical Engineering, University of Basel, CH-4123 Allschwil, Switzerland
- Lăcrămioara Perju-Dumbravă
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Robert Radu Ileșan
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
7
Jiang W, Li M, Shabaz M, Sharma A, Haq MA. Generation of Voice Signal Tone Sandhi and Melody Based on Convolutional Neural Network. ACM T ASIAN LOW-RESO 2022. [DOI: 10.1145/3545569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Abstract
There is a need to prevent the generation of criminal activities in the voice signals due to changing voices by intruders to cover up their personal identities. The voice signal change detection based on convolutional neural network is proposed in this work that uses three commonly used voice processing software to change the tone of the voice library: Audacity, CoolEdit and RTISI. The research further raises 5 semitones for each voice, which are recorded at different levels, as +4, +5, +6, +7 and +8 respectively. Simultaneously, every speech is lowered by 5 halftones and which are further represented as -4, -5, -6, -7 and -8 respectively. The convolution neural network corresponding to network b-3 is determined as the final classifier in this article through experiments. The average accuracy A1 of its three categories has reached more than 97%, the detection accuracy A2 of electronic tone sandhi speech has reached more than 97%, and the false alarm rate FAR of the original speech is less than 1.9%. The outcomes obtained shows that the detection algorithm in this paper is effective, and it has good generalization ability.
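To make the semitone shifts mentioned in this abstract concrete, the sketch below converts a shift of n semitones into a frequency ratio. It assumes 12-tone equal temperament, where one semitone is a factor of 2^(1/12); the abstract does not say how the listed tools implement the shift, and the function name is ours.

```python
def semitone_ratio(n: int) -> float:
    """Frequency ratio for a shift of n semitones, assuming 12-tone equal temperament."""
    return 2.0 ** (n / 12.0)


# Shift levels named in the abstract: +4..+8 (raised) and -4..-8 (lowered).
for n in (4, 5, 6, 7, 8, -4, -5, -6, -7, -8):
    print(f"{n:+d} semitones -> frequency ratio {semitone_ratio(n):.4f}")
```

Under this assumption a shift of +12 semitones doubles every frequency and -12 halves it, so the levels above correspond to ratios between roughly 0.63 and 1.59.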
Affiliation(s)
- Wei Jiang
  - Department of Music, Shandong University of Science and Technology, Qingdao Shandong, 266590, China
- Mengqi Li
  - Department of Music, Shandong University of Science and Technology, Qingdao Shandong, 266590, China
- Mohammad Shabaz
  - Model Institute of Engineering and Technology, Jammu, J&K, India
- Ashutosh Sharma
  - School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
- Mohd Anul Haq
  - Department of Computer Science, College of Computer Science and Information Science, Majmaah University, Saudi Arabia
8
Feasibility of telemedicine research visits in people with Parkinson's disease residing in medically underserved areas. J Clin Transl Sci 2022; 6:e133. [PMID: 36590358 PMCID: PMC9794963 DOI: 10.1017/cts.2022.459] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/04/2022] [Revised: 08/25/2022] [Accepted: 09/05/2022] [Indexed: 01/04/2023] Open
Abstract
Introduction: Gait, balance, and cognitive impairment make travel cumbersome for People with Parkinson's disease (PwPD). About 75% of PwPD cared for at the University of Arkansas for Medical Sciences' Movement Disorders Clinic reside in medically underserved areas (MUAs). Validated remote evaluations could help improve their access to care. Our goal was to explore the feasibility of telemedicine research visits for the evaluation of multi-modal function in PwPD in a rural state. Methods: In-home telemedicine research visits were performed in PwPD. Motor and non-motor disease features were evaluated and quantified by trained personnel, digital survey instruments for self-assessments, digital voice recordings, and scanned and digitized Archimedes spiral drawings. Participant's MUA residence was determined after evaluations were completed. Results: Twenty of the fifty PwPD enrolled resided in MUAs. The groups were well matched for disease duration, modified motor UPDRS, and Montreal Cognitive assessment scores but MUA participants were younger. Ninety-two percent were satisfied with their visit, and 61% were more likely to participate in future telemedicine research. MUA participants traveled longer distances, with higher travel costs, lower income, and education level. While 50% of MUA participants reported self-reliance for in-person visits, 85% reported self-reliance for the telemedicine visit. We rated audio-video quality highly in approximately 60% of visits in both groups. There was good correlation with prior in-person research assessments in a subset of participants. Conclusions: In-home research visits for PwPD in MUAs are feasible and could help improve access to care and research participation in these traditionally underrepresented populations.
9
Zakariah M, B R, Ajmi Alotaibi Y, Guo Y, Tran-Trung K, Elahi MM. An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks. Comput Math Methods Med 2022; 2022:7814952. [PMID: 35529259 PMCID: PMC9071878 DOI: 10.1155/2022/7814952] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/21/2022] [Revised: 02/17/2022] [Accepted: 03/07/2022] [Indexed: 11/17/2022]
Abstract
Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are on the rise, even though they are frequently overlooked. According to a recent study, voice pathology detection systems can successfully help the assessment of voice abnormalities and enable the early diagnosis of voice pathology. For instance, in the early identification and diagnosis of voice problems, the automatic system for distinguishing healthy and diseased voices has gotten much attention. As a result, artificial intelligence-assisted voice analysis brings up new possibilities in healthcare. The work was aimed at assessing the utility of several automatic speech signal analysis methods for diagnosing voice disorders and suggesting a strategy for classifying healthy and diseased voices. The proposed framework integrates the efficacy of three voice characteristics: chroma, mel spectrogram, and mel frequency cepstral coefficient (MFCC). We also designed a deep neural network (DNN) capable of learning from the retrieved data and producing a highly accurate voice-based disease prediction model. The study describes a series of studies using the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced in high, low, and average pitches. We also maintained the "continuous sentence" audio files collected from SVD to select how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, superior to prior attempts in the same domain. Additionally, the model attains an accuracy of 88.01% by integrating speaker gender information. The designed model trained on selected diseases can also obtain a maximum accuracy of 96.77% (cordectomy × healthy). As a result, the suggested framework is the best fit for the healthcare industry.
Affiliation(s)
- Mohammed Zakariah
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia
- Reshma B
  - Division of Electronics Engineering, School of Engineering, Cochin University of Science and Technology, India
- Yousef Ajmi Alotaibi
  - Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia
- Kiet Tran-Trung
  - Faculty of Computer Science, Ho Chi Minh City Open University, 97 Vo Van Tan, Ward Vo Thi Sau, District 3, Ho Chi Minh City Code postal: 70000, Vietnam
- Mohammad Mamun Elahi
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
10
Hireš M, Gazda M, Drotár P, Pah ND, Motin MA, Kumar DK. Convolutional neural network ensemble for Parkinson's disease detection from voice recordings. Comput Biol Med 2021; 141:105021. [PMID: 34799077 DOI: 10.1016/j.compbiomed.2021.105021] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/04/2021] [Revised: 11/02/2021] [Accepted: 11/03/2021] [Indexed: 11/03/2022]
Abstract
The computerized detection of Parkinson's disease (PD) will facilitate population screening and frequent monitoring and provide a more objective measure of symptoms, benefiting both patients and healthcare providers. Dysarthria is an early symptom of the disease and examining it for computerized diagnosis and monitoring has been proposed. Deep learning-based approaches have advantages for such applications because they do not require manual feature extraction, and while this approach has achieved excellent results in speech recognition, its utilization in the detection of pathological voices is limited. In this work, we present an ensemble of convolutional neural networks (CNNs) for the detection of PD from the voice recordings of 50 healthy people and 50 people with PD obtained from PC-GITA, a publicly available database. We propose a multiple-fine-tuning method to train the base CNN. This approach reduces the semantical gap between the source task that has been used for network pretraining and the target task by expanding the training process by including training on another dataset. Training and testing were performed for each vowel separately, and a 10-fold validation was performed to test the models. The performance was measured by using accuracy, sensitivity, specificity and area under the ROC curve (AUC). The results show that this approach was able to distinguish between the voices of people with PD and those of healthy people for all vowels. While there were small differences between the different vowels, the best performance was when /a/ was considered; we achieved 99% accuracy, 86.2% sensitivity, 93.3% specificity and 89.6% AUC. This shows that the method has potential for use in clinical practice for the screening, diagnosis and monitoring of PD, with the advantage that vowel-based voice recordings can be performed online without requiring additional hardware.
Affiliation(s)
- Máté Hireš
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
- Matej Gazda
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
- Peter Drotár
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
11
Tripathi A, Bhosale S, Kopparapu SK. Automatic speaker independent dysarthric speech intelligibility assessment system. Comput Speech Lang 2021. [DOI: 10.1016/j.csl.2021.101213] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022]
12
Roldan-Vasco S, Orozco-Duque A, Suarez-Escudero JC, Orozco-Arroyave JR. Machine learning based analysis of speech dimensions in functional oropharyngeal dysphagia. Comput Methods Programs Biomed 2021; 208:106248. [PMID: 34260973 DOI: 10.1016/j.cmpb.2021.106248] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/23/2021] [Accepted: 06/15/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE: The normal swallowing process requires a complex coordination of anatomical structures driven by sensory and cranial nerves. Alterations in such coordination cause swallowing malfunctions, namely dysphagia. The dysphagia screening methods are quite subjective and experience dependent. Bearing in mind that the swallowing process and speech production share some anatomical structures and mechanisms of neurological control, this work aims to evaluate the suitability of automatic speech processing and machine learning techniques for screening of functional dysphagia. METHODS: Speech recordings were collected from 46 patients with functional oropharyngeal dysphagia produced by neurological causes, and 46 healthy controls. The dimensions of speech including phonation, articulation, and prosody were considered through different speech tasks. Specific features per dimension were extracted and analyzed using statistical tests. Machine learning models were applied per dimension via nested cross-validation. Hyperparameters were selected using the AUC-ROC as optimization criterion. RESULTS: The Random Forest in the articulation related speech tasks retrieved the highest performance measures (AUC=0.86±0.10, sensitivity=0.91±0.12) for individual analysis of dimensions. In addition, the combination of speech dimensions with a voting ensemble improved the results, which suggests a contribution of information from different feature sets extracted from speech signals in dysphagia conditions. CONCLUSIONS: The proposed approach based on speech related models is suitable for the automatic discrimination between dysphagic and healthy individuals. These findings seem to have potential use in the screening of functional oropharyngeal dysphagia in a non-invasive and inexpensive way.
Affiliation(s)
- Sebastian Roldan-Vasco
  - Faculty of Engineering, Instituto Tecnológico Metropolitano, Medellín, Colombia
  - Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia
- Andres Orozco-Duque
  - Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Camilo Suarez-Escudero
  - School of Health Sciences, Faculty of Medicine, Universidad Pontificia Bolivariana, Medellín, Colombia
  - Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Rafael Orozco-Arroyave
  - Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia
  - Pattern Recognition Lab, Friedrich-Alexander-Universität, Erlangen-Nürnberg, Germany
13
Hidalgo-De la Guía I, Garayzábal-Heinze E, Gómez-Vilda P, Martínez-Olalla R, Palacios-Alonso D. Acoustic Analysis of Phonation in Children With Smith-Magenis Syndrome. Front Hum Neurosci 2021; 15:661392. [PMID: 34149380 PMCID: PMC8209519 DOI: 10.3389/fnhum.2021.661392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/30/2021] [Accepted: 04/27/2021] [Indexed: 11/13/2022] Open
Abstract
Complex simultaneous neuropsychophysiological mechanisms are responsible for the processing of the information to be transmitted and for the neuromotor planning of the articulatory organs involved in speech. The nature of this set of mechanisms is closely linked to the clinical state of the subject. Thus, for example, in populations with neurodevelopmental deficits, these underlying neuropsychophysiological procedures are deficient and determine their phonation. Most of these cases with neurodevelopmental deficits are due to a genetic abnormality, as is the case in the population with Smith–Magenis syndrome (SMS). SMS is associated with neurodevelopmental deficits, intellectual disability, and a cohort of characteristic phenotypic features, including voice quality, which does not seem to be in line with the gender, age, and complexion of the diagnosed subject. The phonatory profile and speech features in this syndrome are dysphonia, high f0, excess vocal muscle stiffness, fluency alterations, numerous syllabic simplifications, phoneme omissions, and unintelligibility of speech. This exploratory study investigates whether the neuromotor deficits in children with SMS adversely affect phonation as compared to typically developing children without neuromotor deficits, which has not been previously determined. The authors compare the phonatory performance of a group of children with SMS (N = 12) with a healthy control group of children (N = 12) matched in age, gender, and grouped into two age ranges. The first group ranges from 5 to 7 years old, and the second group goes from 8 to 12 years old. Group differences were determined for two forms of acoustic analysis performed on repeated recordings of the sustained vowel /a/: F1 and F2 extraction and cepstral peak prominence (CPP). It is expected that the results will enlighten the question of the underlying neuromotor aspects of phonation in SMS population. These findings could provide evidence of the susceptibility of phonation of speech to neuromotor disturbances, regardless of their origin.
Affiliation(s)
- Pedro Gómez-Vilda
- Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Daniel Palacios-Alonso
- Escuela Técnica Superior de Ingeniería Informática, Universidad Rey Juan Carlos, Madrid, Spain
|
14
|
Tăuţan AM, Ionescu B, Santarnecchi E. Artificial intelligence in neurodegenerative diseases: A review of available tools with a focus on machine learning techniques. Artif Intell Med 2021; 117:102081. [PMID: 34127244 DOI: 10.1016/j.artmed.2021.102081] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/14/2020] [Revised: 02/21/2021] [Accepted: 04/26/2021] [Indexed: 10/21/2022]
Abstract
Neurodegenerative diseases have shown an increasing incidence in the older population in recent years. A significant amount of research has been conducted to characterize these diseases. Computational methods, and particularly machine learning techniques, are now very useful tools in helping and improving the diagnosis as well as the disease monitoring process. In this paper, we provide an in-depth review on existing computational approaches used in the whole neurodegenerative spectrum, namely for Alzheimer's, Parkinson's, and Huntington's Diseases, Amyotrophic Lateral Sclerosis, and Multiple System Atrophy. We propose a taxonomy of the specific clinical features, and of the existing computational methods. We provide a detailed analysis of the various modalities and decision systems employed for each disease. We identify and present the sleep disorders which are present in various diseases and which represent an important asset for onset detection. We overview the existing data set resources and evaluation metrics. Finally, we identify current remaining open challenges and discuss future perspectives.
Affiliation(s)
- Alexandra-Maria Tăuţan
- University "Politehnica" of Bucharest, Splaiul Independenţei 313, 060042 Bucharest, Romania.
- Bogdan Ionescu
- University "Politehnica" of Bucharest, Splaiul Independenţei 313, 060042 Bucharest, Romania.
- Emiliano Santarnecchi
- Berenson-Allen Center for Noninvasive Brain Stimulation, Harvard Medical School, 330 Brookline Avenue, Boston, United States.
|
15
|
Gómez A, Tsanas A, Gómez P, Palacios-Alonso D, Rodellar V, Álvarez A. Acoustic to kinematic projection in Parkinson’s disease dysarthria. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102422] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/19/2023]
|
16
|
Gómez A, Gómez P, Palacios D, Rodellar V, Nieto V, Álvarez A, Tsanas A. A Neuromotor to Acoustical Jaw-Tongue Projection Model With Application in Parkinson's Disease Hypokinetic Dysarthria. Front Hum Neurosci 2021; 15:622825. [PMID: 33790751 PMCID: PMC8005556 DOI: 10.3389/fnhum.2021.622825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2020] [Accepted: 02/17/2021] [Indexed: 11/13/2022]
Abstract
Aim: The present work proposes the study of the neuromotor activity of the masseter-jaw-tongue articulation during diadochokinetic exercising to establish functional statistical relationships between surface Electromyography (sEMG), 3D Accelerometry (3DAcc), and acoustic features extracted from the speech signal, with the aim of characterizing Hypokinetic Dysarthria (HD). A database of multi-trait signal recordings from age-matched control and PD participants is used in the experimental study. Hypothesis: The main assumption is that the information shared between sEMG, 3D acceleration, and acoustic features may be quantified using linear regression methods. Methods: Recordings from a cohort of eight age-matched control participants (4 males, 4 females) and eight PD participants (4 males, 4 females) were collected during the utterance of a diadochokinetic exercise (the fast repetition of diphthong [aI]). The dynamic and acoustic absolute kinematic velocities produced during the exercises were estimated by acoustic filter inversion and numerical integration and differentiation of the speech signal. The amplitude distributions of the absolute kinematic and acoustic velocities (AKV and AFV) are estimated to allow comparisons in terms of Mutual Information. Results: The regression results show the relationships between sEMG and dynamic and acoustic estimates. The projection methodology may help in understanding the basic neuromotor muscle activity underlying neurodegenerative speech, in the remote monitoring of neuromotor and neurocognitive diseases using speech as the vehicular tool, and in the study of other speech-related disorders. The study also showed strong and significant cross-correlations between articulation kinematics, both for the control and the PD cohorts. The absolute kinematic variables present an observable difference for the PD participants compared to the control group. Conclusion: Kinematic distributions derived from acoustic analysis may be useful biomarkers for characterizing HD in neuromotor disorders, providing new insights into PD.
Affiliation(s)
- Andrés Gómez
- Old Medical School, Medical School, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom; NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Pedro Gómez
- NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Daniel Palacios
- NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain; Escuela Técnica Superior de Ingeniería Informática-Universidad Rey Juan Carlos, Móstoles, Spain
- Victoria Rodellar
- NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Víctor Nieto
- NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Agustín Álvarez
- NeuSpeLab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Madrid, Spain
- Athanasios Tsanas
- Old Medical School, Medical School, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
|
17
|
Tena A, Claria F, Solsona F, Meister E, Povedano M. Detection of Bulbar Involvement in Patients With Amyotrophic Lateral Sclerosis by Machine Learning Voice Analysis: Diagnostic Decision Support Development Study. JMIR Med Inform 2021; 9:e21331. [PMID: 33688838 PMCID: PMC7991994 DOI: 10.2196/21331] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/11/2020] [Revised: 10/26/2020] [Accepted: 01/17/2021] [Indexed: 11/13/2022]
Abstract
Background: Bulbar involvement is a term used in amyotrophic lateral sclerosis (ALS) that refers to motor neuron impairment in the corticobulbar area of the brainstem, which produces a dysfunction of speech and swallowing. One of the earliest symptoms of bulbar involvement is voice deterioration characterized by grossly defective articulation; extremely slow, laborious speech; marked hypernasality; and severe harshness. Bulbar involvement requires well-timed and carefully coordinated interventions. Therefore, early detection is crucial to improving the quality of life and lengthening the life expectancy of patients with ALS who present with this dysfunction. Recent research efforts have focused on voice analysis to capture bulbar involvement. Objective: The main objective of this paper was (1) to design a methodology for diagnosing bulbar involvement efficiently through the acoustic parameters of uttered vowels in Spanish, and (2) to demonstrate that the performance of the automated diagnosis of bulbar involvement is superior to human diagnosis. Methods: The study focused on the extraction of features from the phonatory subsystem (jitter, shimmer, harmonics-to-noise ratio, and pitch) from the utterance of the five Spanish vowels. Then, we used various supervised classification algorithms, preceded by principal component analysis of the features obtained. Results: To date, support vector machines have performed better (accuracy 95.8%) than the models analyzed in the related work. We also show how the model can improve human diagnosis, which can often misdiagnose bulbar involvement. Conclusions: The results obtained are very encouraging and demonstrate the efficiency and applicability of the automated model presented in this paper. It may be an appropriate tool to help in the diagnosis of ALS by multidisciplinary clinical teams, in particular to improve the diagnosis of bulbar involvement.
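For readers unfamiliar with the phonatory features named in this abstract, the following is a minimal sketch of how local jitter and local shimmer are commonly computed from cycle-to-cycle measurements. It is not the authors' implementation; the function names and the input format (a list of glottal period lengths or peak amplitudes, assumed to be already extracted from a sustained vowel) are assumptions made for illustration.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute difference between consecutive cycles,
    divided by the mean value over all cycles."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods):
    """Cycle-to-cycle variation of the glottal period (dimensionless)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (dimensionless)."""
    return relative_perturbation(amplitudes)


if __name__ == "__main__":
    # Hypothetical period lengths in seconds, for illustration only.
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    print(f"local jitter: {local_jitter(periods):.4f}")
```

Both measures are usually reported as percentages; multiplying the returned ratio by 100 gives that form.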
Affiliation(s)
- Alberto Tena
- Information and Communication Technologies Group, International Centre for Numerical Methods in Engineering, Barcelona, Spain
- Francec Claria
- Department of Computer Science, Universitat de Lleida, Lleida, Spain
- Francesc Solsona
- Department of Computer Science, Universitat de Lleida, Lleida, Spain
- Einar Meister
- Institute of Cybernetics, Tallinn University of Technology, Tallinn, Estonia
- Monica Povedano
- Motoneuron Functional Unit, Hospital Universitari de Bellvitge, Barcelona, Spain
|
18
|
Nguyen Van S, Lobo Marques JA, Biala TA, Li Y. Identification of Latent Risk Clinical Attributes for Children Born Under IUGR Condition Using Machine Learning Techniques. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 200:105842. [PMID: 33257111 DOI: 10.1016/j.cmpb.2020.105842] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/02/2020] [Accepted: 11/10/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND AND OBJECTIVE: Intrauterine Growth Restriction (IUGR) is a condition in which a fetus does not grow to the expected weight during pregnancy. There are several well-documented causes in the literature for this issue, such as maternal disorders and genetic influences. Nevertheless, besides the risk during the pregnancy and labour periods, the long-term impact of the IUGR condition on child development is an area of research in itself. The main objective of this work is to propose a machine learning solution to identify the most significant features of importance based on physiological, clinical or socioeconomic factors correlated with previous IUGR condition after 10 years of birth. METHODS: In this work, 41 IUGR (18 male) and 34 Non-IUGR (22 male) children were followed up, on average, 9 years after birth (9.1786 ± 0.6784 years old). A group of machine learning algorithms is proposed to classify children previously identified as born under IUGR condition based on 24-hour monitoring of ECG (Holter) and blood pressure (ABPM), and other clinical and socioeconomic attributes. In addition, an algorithm of relevance determination based on the classifier is also proposed, to determine the level of importance of the considered features. RESULTS: The proposed classification solution achieved accuracy up to 94.73%, and better performance than seven state-of-the-art machine learning algorithms. Also, relevant latent factors related to HRV and BP monitoring are proposed, such as: day-time heart rate (day-time HR), day-night systolic blood pressure (day-night SBP), 24-hour standard deviation (SD) of SBP, dropped, morning cortisol creatinine, 24-hour mean of SDs of all NN intervals for each 5 minutes segment (24-hour SDNNi), among others. CONCLUSION: With the outstanding accuracy of our proposed solutions, the classification system and the indication of relevant attributes may support medical teams in the clinical monitoring of IUGR children during their childhood development.
Affiliation(s)
- Sau Nguyen Van
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- T A Biala
- University of Leicester, Leicester, UK and the Biotechnology Research Center, Libya.
- Ye Li
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
|
19
|
Ribeiro M, Henriques T, Castro L, Souto A, Antunes L, Costa-Santos C, Teixeira A. The Entropy Universe. ENTROPY 2021; 23:e23020222. [PMID: 33670121 PMCID: PMC7916845 DOI: 10.3390/e23020222] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/15/2021] [Revised: 02/06/2021] [Accepted: 02/08/2021] [Indexed: 11/16/2022]
Abstract
About 160 years ago, the concept of entropy was introduced in thermodynamics by Rudolf Clausius. Since then, it has been continually extended, interpreted, and applied by researchers in many scientific fields, such as general physics, information theory, chaos theory, data mining, and mathematical linguistics. This paper presents The Entropy Universe, which aims to review the many variants of entropies applied to time-series. The purpose is to answer research questions such as: How did each entropy emerge? What is the mathematical definition of each variant of entropy? How are entropies related to each other? What are the most applied scientific fields for each entropy? We describe in depth the relationship between the most applied entropies in time-series for different scientific fields, establishing bases for researchers to properly choose the variant of entropy most suitable for their data. The number of citations over the past sixteen years of each paper proposing a new entropy was also assessed. The Shannon/differential, the Tsallis, the sample, the permutation, and the approximate entropies were the most cited ones. Based on the ten research areas with the most significant number of records obtained in the Web of Science and Scopus, the areas in which the entropies are more applied are computer science, physics, mathematics, and engineering. The universe of entropies is growing each day, due either to the introduction of new variants or to novel applications. Knowing each entropy's strengths and limitations is essential to ensure the proper improvement of this research field.
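Two of the variants this abstract lists as most cited, Shannon entropy and permutation entropy, can be sketched in a few lines. The code below is an illustrative sketch using the standard textbook definitions, not code from the reviewed paper; the function names, the default embedding order of 3, and the tie-breaking rule are assumptions.

```python
from collections import Counter
from math import factorial, log2


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def permutation_entropy(series, order=3, normalize=True):
    """Shannon entropy of the ordinal patterns of length `order`.

    Each window is mapped to the index ordering that sorts it; ties
    are broken by position (an assumption of this sketch). Expects
    len(series) >= order.
    """
    patterns = [
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(len(series) - order + 1)
    ]
    h = shannon_entropy(patterns)
    # The maximum possible entropy is log2(order!), reached when all
    # ordinal patterns are equally likely.
    return h / log2(factorial(order)) if normalize else h


if __name__ == "__main__":
    print(shannon_entropy("aabb"))
    print(permutation_entropy([4, 7, 9, 10, 6, 11, 3]))
```

A strictly monotonic series yields a single ordinal pattern and therefore zero permutation entropy, which is a quick sanity check when adapting the sketch.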
Affiliation(s)
- Maria Ribeiro
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), 4200-465 Porto, Portugal
- Computer Science Department, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Teresa Henriques
- Centre for Health Technology and Services Research (CINTESIS), Faculty of Medicine University of Porto, 4200-450 Porto, Portugal
- Department of Community Medicine, Information and Health Decision Sciences-MEDCIDS, Faculty of Medicine, University of Porto, 4200-450 Porto, Portugal
- Luísa Castro
- Centre for Health Technology and Services Research (CINTESIS), Faculty of Medicine University of Porto, 4200-450 Porto, Portugal
- André Souto
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Departamento de Informática, Faculdade de Ciências da Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Instituto de Telecomunicações, 1049-001 Lisboa, Portugal
- Luís Antunes
- Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), 4200-465 Porto, Portugal
- Computer Science Department, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Cristina Costa-Santos
- Centre for Health Technology and Services Research (CINTESIS), Faculty of Medicine University of Porto, 4200-450 Porto, Portugal
- Department of Community Medicine, Information and Health Decision Sciences-MEDCIDS, Faculty of Medicine, University of Porto, 4200-450 Porto, Portugal
- Andreia Teixeira
- Centre for Health Technology and Services Research (CINTESIS), Faculty of Medicine University of Porto, 4200-450 Porto, Portugal
- Department of Community Medicine, Information and Health Decision Sciences-MEDCIDS, Faculty of Medicine, University of Porto, 4200-450 Porto, Portugal
- Instituto Politécnico de Viana do Castelo, 4900-347 Viana do Castelo, Portugal
|
20
|
Estimation of Parkinson's disease severity using speech features and extreme gradient boosting. Med Biol Eng Comput 2020; 58:2757-2773. [PMID: 32910301 DOI: 10.1007/s11517-020-02250-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/28/2020] [Accepted: 08/20/2020] [Indexed: 10/23/2022]
Abstract
In recent years, there has been increasing interest in building e-health systems. Such systems, built to deliver health services using internet and communication technologies, aim to reduce the costs arising from outpatient visits. Some of the related recent studies propose machine learning-based telediagnosis and telemonitoring systems for Parkinson's disease (PD). Motivated by the studies showing the potential of speech disorders in PD telemonitoring systems, in this study, we aim to estimate the severity of PD from voice recordings of the patients using the motor Unified Parkinson's Disease Rating Scale (UPDRS) as the evaluation metric. For this purpose, we apply various speech processing algorithms to the voice signals of the patients and then use these features as input to a two-stage estimation model. The first step is to apply a wrapper-based feature selection algorithm, called Boruta, and select the most informative speech features. The second step is to feed the selected set of features to a decision tree-based boosting algorithm, extreme gradient boosting, which has recently been applied successfully in many machine learning tasks due to its generalization ability and speed. The feature selection analysis showed that the vibration pattern of the vocal fold is an important indicator of PD severity. In addition, we investigate the effectiveness of using age and years passed since diagnosis as covariates together with speech features. The lowest mean absolute error, 3.87, was obtained by combining these covariates and speech features with prediction-level fusion. Graphical abstract: Framework for the proposed UPDRS estimation model.
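The abstract reports its result as a mean absolute error obtained with prediction-level fusion. A minimal sketch of both ideas follows, assuming that fusion means a weighted average of the outputs of separately trained regressors; the abstract does not specify the fusion rule, so that choice, the function names, and the numbers in the usage example are all hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("length mismatch")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fuse_predictions(prediction_lists, weights=None):
    """Weighted average of per-model predictions (equal weights by default).

    Assumption of this sketch: 'prediction-level fusion' is taken to be
    a simple average of the model outputs for each patient.
    """
    k = len(prediction_lists)
    if weights is None:
        weights = [1.0 / k] * k
    return [
        sum(w * p for w, p in zip(weights, preds))
        for preds in zip(*prediction_lists)
    ]


if __name__ == "__main__":
    # Hypothetical motor UPDRS scores and model outputs.
    updrs = [20.0, 35.0, 12.0]
    speech_model = [24.0, 30.0, 15.0]
    covariate_model = [18.0, 36.0, 10.0]
    fused = fuse_predictions([speech_model, covariate_model])
    print(mean_absolute_error(updrs, fused))
```

Averaging can lower the error when the individual models err in opposite directions, which is one plausible reason to fuse at the prediction level rather than concatenating features.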
|
21
|
Gómez-Rodellar A, Palacios-Alonso D, Ferrández Vicente JM, Mekyska J, Álvarez-Marquina A, Gómez-Vilda P. A Methodology to Differentiate Parkinson's Disease and Aging Speech Based on Glottal Flow Acoustic Analysis. Int J Neural Syst 2020; 30:2050058. [PMID: 32880202 DOI: 10.1142/s0129065720500586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Speech is controlled by axial neuromotor systems; therefore, it is highly sensitive to the effects of neurodegenerative illnesses such as Parkinson's Disease (PD). Patients suffering from PD present important alterations in speech, which are manifested in phonation, articulation, prosody, and fluency. These alterations may be evaluated using statistical methods on features obtained from glottal, spectral, cepstral, or fractal descriptions of speech. This work introduces an evaluation paradigm based on Information Theory (IT) to differentiate the effects of PD and aging on glottal amplitude distributions. The study is conducted on a database including 48 PD patients (24 males, 24 females), 48 age-matched healthy controls (HC, 24 males, 24 females), and 48 mid-age normative subjects (NS, 24 males, 24 females). It may be concluded from the study that Hierarchical Clustering (HiCl) methods produce a clear separation between the phonation of PD patients and that of NS subjects (accuracy of 89.6% for both male and female subsets), but the separation between PD patients and HC subjects is less efficient (accuracy of 75.0% for the male subset and 70.8% for the female subset). Conversely, using feature selection and Support Vector Machine (SVM) classification, the differentiation between PD and HC is substantially improved (accuracy of 94.8% for the male subset and 92.8% for the female subset). This improvement was mainly boosted by feature selection, at a cost of information and generalization losses. The results point to the possibility that speech deterioration may affect HC phonation with aging, reducing its difference from PD phonation.
Affiliation(s)
- Andrés Gómez-Rodellar
- Usher Institute, Medical School, University of Edinburgh, Old Medical School, Teviot Place, Edinburgh, EH8 9AG UK
- Daniel Palacios-Alonso
- Escuela Técnica Superior de Ingeniería Informática, Universidad Rey Juan Carlos, Calle Tulipán, s/n, 28933 Móstoles, Madrid, Spain
- José M Ferrández Vicente
- Universidad Politécnica de Cartagena, Campus Universitario Muralla del Mar, Pza. Hospital 1, 30202 Cartagena, Spain
- Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
- Agustín Álvarez-Marquina
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Pedro Gómez-Vilda
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
|
22
|
Voice Pathology Detection and Classification Using Convolutional Neural Network Model. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10113723] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Indexed: 11/16/2022]
Abstract
Voice pathology disorders can be effectively detected using computer-aided voice pathology classification tools. These tools can diagnose voice pathologies at an early stage and offer appropriate treatment. This study aims to develop a powerful feature extraction voice pathology detection tool based on Deep Learning. In this paper, a pre-trained Convolutional Neural Network (CNN) was applied to a dataset of voice pathology to maximize the classification accuracy. This study also proposes a distinctive training method combined with various training strategies in order to generalize the application of the proposed system to a wide range of problems related to voice disorders. The proposed system was tested using a voice database, namely the Saarbrücken voice database (SVD). The experimental results show that the proposed CNN method for speech pathology detection achieves accuracy up to 95.41%. It also obtains 94.22% and 96.13% for F1-Score and Recall, respectively. The proposed system shows a high capability for real clinical application, offering fast automatic diagnosis and treatment solutions, with classification achieved within 3 s.
|
23
|
Fonseca ES, Guido RC, Junior SB, Dezani H, Gati RR, Mosconi Pereira DC. Acoustic investigation of speech pathologies based on the discriminative paraconsistent machine (DPM). Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101615] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
|
24
|
Characterization of Parkinson’s disease dysarthria in terms of speech articulation kinematics. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2019.04.029] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/19/2022]
|
25
|
|
26
|
On the design of automatic voice condition analysis systems. Part I: Review of concepts and an insight to the state of the art. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2018.12.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/14/2022]
|
27
|
Gómez-Vilda P, Gómez-Rodellar A, Vicente JMF, Mekyska J, Palacios-Alonso D, Rodellar-Biarge V, Álvarez-Marquina A, Eliasova I, Kostalova M, Rektorova I. Neuromechanical Modelling of Articulatory Movements from Surface Electromyography and Speech Formants. Int J Neural Syst 2019; 29:1850039. [DOI: 10.1142/s0129065718500399] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022]
Abstract
Speech articulation is produced by the movements of muscles in the larynx, pharynx, mouth and face. Therefore, speech shows acoustic features, such as formants, which are directly related to the neuromotor actions of these muscles. The first two formants are strongly related to jaw and tongue muscular activity. Speech can be used as a simple and ubiquitous signal, easy to record and process, either locally or on e-Health platforms. This fact may open a wide set of applications in the study of functional grading and monitoring of neurodegenerative diseases. A relevant question, in this sense, is how closely speech correlates and neuromotor actions are related. This preliminary study is intended to find answers to this question by using surface electromyographic recordings on the masseter and the acoustic kinematics related to the first formant. It is shown in the study that relevant correlations can be found between the surface electromyographic activity (dynamic muscle behavior) and the positions and first derivatives of the first formant (kinematic variables related to vertical velocity and acceleration of the joint jaw and tongue biomechanical system). As an application example, it is shown that the probability density function associated with these kinematic variables is more sensitive than classical features such as Vowel Space Area (VSA) or Formant Centralization Ratio (FCR) in characterizing neuromotor degeneration in Parkinson’s Disease.
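The two classical baselines named at the end of this abstract, VSA and FCR, are commonly computed from the first two formants of the corner vowels /a/, /i/ and /u/. The sketch below uses the usual triangular-area and centralization-ratio definitions from the general literature rather than anything stated in this abstract; the function names and the formant values in the usage example are illustrative assumptions.

```python
def vowel_space_area(a, i, u):
    """Triangular VSA in Hz^2 from (F1, F2) pairs of /a/, /i/, /u/,
    computed with the shoelace formula for a triangle's area."""
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    return 0.5 * abs(
        f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a)
    )


def formant_centralization_ratio(a, i, u):
    """FCR: formants that rise with vowel centralization divided by
    formants that fall with it; larger values mean a more centralized
    (reduced) vowel space."""
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    return (f2u + f2a + f1i + f1u) / (f2i + f1a)


if __name__ == "__main__":
    # Hypothetical (F1, F2) values in Hz, for illustration only.
    a, i, u = (800.0, 1300.0), (300.0, 2300.0), (300.0, 800.0)
    print(vowel_space_area(a, i, u))
    print(formant_centralization_ratio(a, i, u))
```

Both quantities depend only on static vowel targets, which is the limitation the abstract contrasts with distributions of kinematic variables.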
Affiliation(s)
- Pedro Gómez-Vilda
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Andrés Gómez-Rodellar
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- José M. Ferrández Vicente
- Universidad Politécnica de Cartagena, Campus Universitario Muralla del Mar, Pza. Hospital 1, 30202 Cartagena, Spain
- Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
- Daniel Palacios-Alonso
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Escuela Técnica Superior de Ingeniería Informática - Universidad Rey Juan Carlos, Campus de Móstoles, Tulipán s/n, 28933 Móstoles, Madrid, Spain
- Victoria Rodellar-Biarge
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Agustín Álvarez-Marquina
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Ilona Eliasova
- First Department of Neurology, Faculty of Medicine and St. Anne’s University Hospital, Masaryk University, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- Milena Kostalova
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- Department of Neurology, Faculty Hospital and Masaryk University, Jihlavska 20, 63900 Brno, Czech Republic
- Irena Rektorova
- First Department of Neurology, Faculty of Medicine and St. Anne’s University Hospital, Masaryk University, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
|
28
|
Changes in Phonation and Their Relations with Progress of Parkinson’s Disease. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8122339] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
Hypokinetic dysarthria, which is associated with Parkinson’s disease (PD), affects several speech dimensions, including phonation. Although the scientific community has dealt with a quantitative analysis of phonation in PD patients, comprehensive research revealing probable relations between phonatory features and the progress of PD is missing. Therefore, the aim of this study is to explore these relations and model them mathematically to be able to estimate the progress of PD during a two-year follow-up. We enrolled 51 PD patients who were assessed by three commonly used clinical scales. In addition, we quantified eight possible phonatory disorders in five vowels. To identify the relationship between baseline phonatory features and changes in clinical scores, we performed a partial correlation analysis. Finally, we trained XGBoost models to predict the changes in clinical scores during a two-year follow-up. Over the two years, the patients’ voices became more aperiodic with increased microperturbations of frequency and amplitude. Next, the XGBoost models were able to predict changes in clinical scores with an error in the range of 11–26%. Although we identified some significant correlations between changes in phonatory features and clinical scores, they are less interpretable. This study suggests that it is possible to predict the progress of PD based on the acoustic analysis of phonation. Moreover, it recommends utilizing the sustained vowel /i/ instead of /a/.
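The partial correlation analysis mentioned in this abstract measures the association between two variables after removing the linear effect of a third. A minimal first-order sketch follows, using the standard formula built from pairwise Pearson coefficients; the choice of a single covariate and the variable roles in the usage example are assumptions, since the abstract does not list the covariates that were controlled for.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


if __name__ == "__main__":
    # Hypothetical data: baseline feature, score change, and a covariate.
    feature = [1.0, 2.0, 3.0, 4.0]
    score_change = [2.0, 1.0, 4.0, 3.0]
    covariate = [1.0, -1.0, -1.0, 1.0]
    print(partial_corr(feature, score_change, covariate))
```

When the covariate is uncorrelated with both variables, as in the usage example, the partial correlation reduces to the ordinary Pearson coefficient.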
|
29
|
|
30
|
Abstract
This paper addresses the processing of speech data and their utilization in a decision support system. The main aim of this work is to utilize machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. As classifiers, three state-of-the-art methods were used: K-nearest neighbors, random forests, and support vector machine. We analyzed the performance of classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.
|
31
|
Gómez-Vilda P, Galaz Z, Mekyska J, Vicente JMF, Gómez-Rodellar A, Palacios-Alonso D, Smekal Z, Eliasova I, Kostalova M, Rektorova I. Vowel Articulation Dynamic Stability Related to Parkinson's Disease Rating Features: Male Dataset. Int J Neural Syst 2018; 29:1850037. [PMID: 30336711 DOI: 10.1142/s0129065718500375] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
Neurodegenerative pathologies such as Parkinson's Disease (PD) show important distortions in speech, affecting fluency, prosody, articulation and phonation. Classically, measurements based on articulation gestures altering formant positions, such as the Vocal Space Area (VSA) or the Formant Centralization Ratio (FCR), have been proposed to measure speech distortion, but these markers are based mainly on static positions of sustained vowels. The present study introduces a measurement based on the mutual information distance among probability density functions of kinematic correlates derived from formant dynamics. An absolute kinematic velocity associated with the position of the jaw and tongue articulation gestures is estimated and modeled statistically. The distribution of this feature may differentiate PD patients from normative speakers during sustained vowel emission. The study is based on a limited database of 53 male PD patients, contrasted with a very selected and stable set of eight normative speakers. In this sense, distances based on Kullback-Leibler divergence seem to be sensitive to PD articulation instability. Correlation studies show a statistically relevant relationship between information contents based on articulation instability and certain motor and nonmotor clinical scores, such as freezing of gait or sleep disorders. Remarkably, one of the statistically relevant correlations points to the time interval elapsed since the first diagnosis. These results stress the need to define scoring scales specifically designed for speech disability estimation and monitoring methodologies in degenerative diseases of neuromotor origin.
Affiliation(s)
- Pedro Gómez-Vilda
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
| | - Zoltan Galaz
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - José M. Ferrández Vicente
- Universidad Politécnica de Cartagena, Campus Universitario Muralla del Mar, Pza. Hospital 1, 30202 Cartagena, Spain
| | - Andrés Gómez-Rodellar
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
| | - Daniel Palacios-Alonso
- Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
- Escuela Técnica Superior de Ingeniería Informática – Universidad Rey Juan Carlos, Campus de Móstoles, Tulipán, s/n, 28933 Móstoles, Madrid, Spain
| | - Zdenek Smekal
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Ilona Eliasova
- First Department of Neurology, Faculty of Medicine, and St. Anne’s University Hospital, Masaryk University, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
| | - Milena Kostalova
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- Department of Neurology, Faculty Hospital and Masaryk University, Jihlavska 20, 63900 Brno, Czech Republic
| | - Irena Rektorova
- First Department of Neurology, Faculty of Medicine, and St. Anne’s University Hospital, Masaryk University, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
| |
|
32
|
Biomechanical Description of Phonation in Children Affected by Williams Syndrome. J Voice 2018; 32:515.e15-515.e28. [DOI: 10.1016/j.jvoice.2017.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/12/2016] [Revised: 06/30/2017] [Accepted: 07/05/2017] [Indexed: 11/20/2022]
|
33
|
Mekyska J, Galaz Z, Kiska T, Zvoncak V, Mucha J, Smekal Z, Eliasova I, Kostalova M, Mrackova M, Fiedorova D, Faundez-Zanuy M, Solé-Casals J, Gomez-Vilda P, Rektorova I. Quantitative Analysis of Relationship Between Hypokinetic Dysarthria and the Freezing of Gait in Parkinson's Disease. Cognit Comput 2018; 10:1006-1018. [PMID: 30595758 PMCID: PMC6294819 DOI: 10.1007/s12559-018-9575-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/10/2017] [Accepted: 06/13/2018] [Indexed: 12/27/2022]
Abstract
Hypokinetic dysarthria (HD) and freezing of gait (FOG) are both axial symptoms that occur in patients with Parkinson's disease (PD). It is assumed that they share some common pathophysiological mechanisms and therefore that speech disorders in PD can predict FOG deficits within a horizon of some years. The aim of this study is to employ a complex quantitative analysis of phonation, articulation and prosody in PD patients in order to identify the relationship between HD and FOG, and to establish a mathematical model that would predict FOG deficits using acoustic analysis at baseline. We enrolled 75 PD patients who were assessed by 6 clinical scales including the Freezing of Gait Questionnaire (FOG-Q). We subsequently extracted 19 acoustic measures quantifying speech disorders in the fields of phonation, articulation and prosody. To identify the relationship between HD and FOG, we performed a partial correlation analysis. Finally, based on the selected acoustic measures, we trained regression models to predict the change in FOG during a 2-year follow-up. We identified significant correlations between FOG-Q scores and the acoustic measures based on formant frequencies (quantifying the movement of the tongue and jaw) and speech rate. Using the regression models, we were able to predict a change in particular FOG-Q scores with an error of between 7.4 and 17.0%. This study suggests that FOG in patients with PD is mainly linked to improper articulation, a disturbed speech rate and intelligibility. We have also proved that the acoustic analysis of HD at baseline can be used as a predictor of the FOG deficit during 2 years of follow-up. This knowledge enables researchers to introduce new cognitive systems that predict gait difficulties in PD patients.
Affiliation(s)
- Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Zoltan Galaz
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Tomas Kiska
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Vojtech Zvoncak
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Jan Mucha
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Zdenek Smekal
- Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
| | - Ilona Eliasova
- First Department of Neurology, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
| | - Milena Kostalova
- Applied Neuroscience Research Group, Central European Institute of Technology, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
- Department of Neurology, Faculty Hospital and Masaryk University, Jihlavska 20, 63900 Brno, Czech Republic
| | - Martina Mrackova
- Applied Neuroscience Research Group, Central European Institute of Technology, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
| | - Dagmar Fiedorova
- Applied Neuroscience Research Group, Central European Institute of Technology, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
| | - Marcos Faundez-Zanuy
- Escola Superior Politecnica, Tecnocampus, Avda. Ernest Lluch 32, 08302 Mataro, Barcelona Spain
| | - Jordi Solé-Casals
- Data and Signal Processing Research Group, University of Vic – Central University of Catalonia, Perot Rocaguinarda 17, 08500 Vic, Catalonia Spain
| | - Pedro Gomez-Vilda
- Neuromorphic Processing Laboratory (NeuVox Lab), Center for Biomedical Technology, Universidad Politécnica de Madrid Campus de Montegancedo, s/n, 28223, Pozuelo de Alarcón, Madrid Spain
| | - Irena Rektorova
- First Department of Neurology, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Applied Neuroscience Research Group, Central European Institute of Technology, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
| |
|
34
|
On the analysis of speech and disfluencies for automatic detection of Mild Cognitive Impairment. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3494-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/18/2023]
Abstract
Alzheimer’s disease is characterized by a progressive and irreversible cognitive deterioration. At an earlier stage, so-called Mild Cognitive Impairment, or cognitive loss, appears. Nevertheless, this earlier stage does not seem sufficiently severe to interfere with independent abilities of daily life, so it is usually diagnosed inappropriately. Thus, its detection is a crucial challenge to be addressed by medical specialists. This paper presents a novel proposal for such early diagnosis based on automatic analysis of speech and disfluencies, and Deep Learning methodologies. The proposed tools could be useful for supporting Mild Cognitive Impairment diagnosis. The Deep Learning approach includes Convolutional Neural Networks and nonlinear multifeature modeling. Additionally, an automatic hybrid methodology is used to select the most relevant features by means of the nonparametric Mann–Whitney U test and Support Vector Machine Attribute evaluation.
|
35
|
|
36
|
López-de-Ipiña K, Calvo P, Faundez-Zanuy M, Clavé P, Nascimento W, Martinez de Lizarduy U, Alvarez D, Arreola V, Ortega O, Mekyska J, Sanz-Cartagena P. Automatic voice analysis for dysphagia detection. SPEECH LANGUAGE AND HEARING 2017. [DOI: 10.1080/2050571x.2017.1369017] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/18/2022]
Affiliation(s)
- K. López-de-Ipiña
- Universidad del País Vasco, Euskal Herriko Unibertsitatea, Donostia, Spain
| | - P. Calvo
- Universidad del País Vasco, Euskal Herriko Unibertsitatea, Donostia, Spain
| | | | - P. Clavé
- Hospital de Mataró, Consorci Sanitari del Maresme, Barcelona, Spain
- Centro de Investigación Biomedica en Red de Enfermedades Hepaticas y Digestivas, Barcelona, Spain
| | - W. Nascimento
- Hospital de Mataró, Consorci Sanitari del Maresme, Barcelona, Spain
- Medical School of Ribeirao Preto – USP, São Paulo, Brasil
| | | | - D. Alvarez
- Hospital de Mataró, Consorci Sanitari del Maresme, Barcelona, Spain
| | - V. Arreola
- Hospital de Mataró, Consorci Sanitari del Maresme, Barcelona, Spain
| | - O. Ortega
- Hospital de Mataró, Consorci Sanitari del Maresme, Barcelona, Spain
| | - Jiri Mekyska
- Brno University of Technology, Brno, Czech Republic
| | | |
|
37
|
|
38
|
Gómez-Vilda P, Palacios-Alonso D, Rodellar-Biarge V, Álvarez-Marquina A, Nieto-Lluis V, Martínez-Olalla R. Parkinson's disease monitoring by biomechanical instability of phonation. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.06.092] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 10/19/2022]
|
39
|
Gómez-Vilda P, Mekyska J, Ferrández JM, Palacios-Alonso D, Gómez-Rodellar A, Rodellar-Biarge V, Galaz Z, Smekal Z, Eliasova I, Kostalova M, Rektorova I. Parkinson Disease Detection from Speech Articulation Neuromechanics. Front Neuroinform 2017; 11:56. [PMID: 28970792 PMCID: PMC5609562 DOI: 10.3389/fninf.2017.00056] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 05/19/2017] [Accepted: 08/09/2017] [Indexed: 12/03/2022] Open
Abstract
Aim: The research described is intended to give a description of articulation dynamics as a correlate of the kinematic behavior of the jaw-tongue biomechanical system, encoded as a probability distribution of an absolute joint velocity. This distribution may be used in detecting and grading speech from patients affected by neurodegenerative illnesses, such as Parkinson Disease. Hypothesis: The working hypothesis is that the probability density function of the absolute joint velocity includes information on the stability of phonation when applied to sustained vowels, as well as on fluency if applied to connected speech. Methods: A dataset of sustained vowels recorded from Parkinson Disease patients is contrasted with similar recordings from normative subjects. The probability distribution of the absolute kinematic velocity of the jaw-tongue system is extracted from each utterance. A Random Least Squares Feed-Forward Network (RLSFN) has been used as a binary classifier working on the pathological and normative datasets in a leave-one-out strategy. Monte Carlo simulations have been conducted to estimate the influence of the stochastic nature of the classifier. Two datasets, one for each gender, were tested, including 26 normative and 53 pathological subjects in the male set, and 25 normative and 38 pathological in the female set. Results: Male and female data subsets were tested in single runs, yielding equal error rates under 0.6% (Accuracy over 99.4%). Due to the stochastic nature of each experiment, Monte Carlo runs were conducted to test the reliability of the methodology. The average detection results after 200 Monte Carlo runs of a 200 hyperplane hidden layer RLSFN are given in terms of Sensitivity (males: 0.9946, females: 0.9942), Specificity (males: 0.9944, females: 0.9941) and Accuracy (males: 0.9945, females: 0.9942). The area under the ROC curve is 0.9947 (males) and 0.9945 (females). The equal error rate is 0.0054 (males) and 0.0057 (females).
Conclusions: The proposed methodology shows that the use of highly normalized descriptors, such as the probability distribution of kinematic variables of vowel articulation stability, which has some interesting properties in terms of information theory, boosts the potential of simple yet powerful classifiers in producing quite acceptable detection results in Parkinson Disease.
Affiliation(s)
- Pedro Gómez-Vilda
- NeuVox Lab, Biomedical Technology Center, Universidad Politécnica de Madrid, Madrid, Spain
| | - Jiri Mekyska
- Department of Telecommunications, Brno University of Technology, Brno, Czechia
| | - José M Ferrández
- Department of Electronics, Computer Technology and Projects, Universidad Politécnica de Cartagena, Cartagena, Spain
| | - Daniel Palacios-Alonso
- NeuVox Lab, Biomedical Technology Center, Universidad Politécnica de Madrid, Madrid, Spain
| | - Andrés Gómez-Rodellar
- NeuVox Lab, Biomedical Technology Center, Universidad Politécnica de Madrid, Madrid, Spain
| | | | - Zoltan Galaz
- Department of Telecommunications, Brno University of Technology, Brno, Czechia
| | - Zdenek Smekal
- Department of Telecommunications, Brno University of Technology, Brno, Czechia
| | - Ilona Eliasova
- First Department of Neurology, Faculty of Medicine and St. Anne's University Hospital, Masaryk University, Brno, Czechia; Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Brno, Czechia
| | - Milena Kostalova
- Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Brno, Czechia; Department of Neurology, Faculty Hospital and Masaryk University, Brno, Czechia
| | - Irena Rektorova
- First Department of Neurology, Faculty of Medicine and St. Anne's University Hospital, Masaryk University, Brno, Czechia; Applied Neuroscience Research Group, Central European Institute of Technology, CEITEC, Masaryk University, Brno, Czechia
| |
|
40
|
Shilaskar S, Ghatol A, Chatur P. Medical decision support system for extremely imbalanced datasets. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2016.08.077] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 11/28/2022]
|
41
|
Naranjo L, Pérez CJ, Martín J, Campos-Roca Y. A two-stage variable selection and classification approach for Parkinson's disease detection by using voice recording replications. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2017; 142:147-156. [PMID: 28325442 DOI: 10.1016/j.cmpb.2017.02.019] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 03/11/2016] [Revised: 01/27/2017] [Accepted: 02/09/2017] [Indexed: 06/06/2023]
Abstract
BACKGROUND AND OBJECTIVE In the scientific literature, there is a lack of variable selection and classification methods that consider replicated data. The problem motivating this work is the discrimination of people suffering from Parkinson's disease from healthy subjects based on acoustic features automatically extracted from replicated voice recordings. METHODS A two-stage variable selection and classification approach has been developed to properly match the replication-based experimental design. The way the statistical approach has been specified allows the computational problems to be solved by using an easy-to-implement Gibbs sampling algorithm. RESULTS The proposed approach produces an acceptable predictive capacity for PD discrimination with the considered database, despite the fact that the sample size is relatively small. Specifically, the accuracy rate, sensitivity and specificity are 86.2%, 82.5%, and 90.0%, respectively. However, the most important fact is that the interpretability of the results improves while, at the same time, better chain mixing and a lower computation time are shown with respect to the classification-only approaches presented in the scientific literature. CONCLUSIONS To the best of the authors' knowledge, this is the first approach developed to properly consider intra-subject variability for variable selection and classification. Although the proposed approach has been applied to PD discrimination, it can be applied in other contexts with similar replication-based experimental designs.
Affiliation(s)
- Lizbeth Naranjo
- Departamento de Matemáticas, Facultad de Ciencias, Universidad Nacional Autónoma de México, México D.F., Mexico.
| | - Carlos J Pérez
- Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain.
| | - Jacinto Martín
- Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain.
| | - Yolanda Campos-Roca
- Departamento de Tecnologías de los Computadores y las Comunicaciones, Universidad de Extremadura, Cáceres, Spain.
| |
|
42
|
Speech disorders in Parkinson’s disease: early diagnostics and effects of medication and brain stimulation. J Neural Transm (Vienna) 2017; 124:303-334. [PMID: 28101650 DOI: 10.1007/s00702-017-1676-0] [Citation(s) in RCA: 98] [Impact Index Per Article: 14.0] [Received: 09/19/2016] [Accepted: 01/04/2017] [Indexed: 01/31/2023]
|
43
|
Speech prosody impairment predicts cognitive decline in Parkinson’s disease. Parkinsonism Relat Disord 2016; 29:90-5. [DOI: 10.1016/j.parkreldis.2016.05.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 03/09/2016] [Revised: 05/02/2016] [Accepted: 05/18/2016] [Indexed: 11/22/2022]
|
44
|
Drotár P, Mekyska J, Rektorová I, Masarová L, Smékal Z, Faundez-Zanuy M. Evaluation of handwriting kinematics and pressure for differential diagnosis of Parkinson's disease. Artif Intell Med 2016; 67:39-46. [PMID: 26874552 DOI: 10.1016/j.artmed.2016.01.004] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 02/17/2015] [Revised: 12/30/2015] [Accepted: 01/13/2016] [Indexed: 11/28/2022]
Abstract
OBJECTIVE We present the PaHaW Parkinson's disease handwriting database, consisting of handwriting samples from Parkinson's disease (PD) patients and healthy controls. Our goal is to show that kinematic features and pressure features in handwriting can be used for the differential diagnosis of PD. METHODS AND MATERIAL The database contains records from 37 PD patients and 38 healthy controls performing eight different handwriting tasks. The tasks include drawing an Archimedean spiral, repetitively writing orthographically simple syllables and words, and writing of a sentence. In addition to the conventional kinematic features related to the dynamics of handwriting, we investigated new pressure features based on the pressure exerted on the writing surface. To discriminate between PD patients and healthy subjects, three different classifiers were compared: K-nearest neighbors (K-NN), ensemble AdaBoost classifier, and support vector machines (SVM). RESULTS For predicting PD based on kinematic and pressure features of handwriting, the best performing model was SVM with classification accuracy of Pacc=81.3% (sensitivity Psen=87.4% and specificity of Pspe=80.9%). When evaluated separately, pressure features proved to be relevant for PD diagnosis, yielding Pacc=82.5% compared to Pacc=75.4% using kinematic features. CONCLUSION Experimental results showed that an analysis of kinematic and pressure features during handwriting can help assess subtle characteristics of handwriting and discriminate between PD patients and healthy controls.
Affiliation(s)
- Peter Drotár
- Department of Telecommunications, Brno University of Technology, Technická 12, 61200 Brno, Czech Republic
| | - Jiří Mekyska
- Department of Telecommunications, Brno University of Technology, Technická 12, 61200 Brno, Czech Republic
| | - Irena Rektorová
- First Department of Neurology, Faculty of Medicine, St. Anne's University Hospital, Pekarska 664, 66591 Brno, Czech Republic.
| | - Lucia Masarová
- First Department of Neurology, Faculty of Medicine, St. Anne's University Hospital, Pekarska 664, 66591 Brno, Czech Republic
| | - Zdeněk Smékal
- Department of Telecommunications, Brno University of Technology, Technická 12, 61200 Brno, Czech Republic
| | - Marcos Faundez-Zanuy
- Signal Processing Group, Tecnocampus, Escola Universitaria Politecnica de Mataro, Avda. Ernest Lluch 32, 08302 Mataro, Spain
| |
|
45
|
Moro-Velázquez L, Gómez-García JA, Godino-Llorente JI. Voice Pathology Detection Using Modulation Spectrum-Optimized Metrics. Front Bioeng Biotechnol 2016; 4:1. [PMID: 26835449 PMCID: PMC4718980 DOI: 10.3389/fbioe.2016.00001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/30/2015] [Accepted: 01/06/2016] [Indexed: 11/13/2022] Open
Abstract
Many acoustic parameters are employed for pathological assessment tasks and have served as tools for clinicians to distinguish between normophonic and pathological voices. However, many of these parameters require appropriate tuning in order to maximize their efficiency. In this work, a group of new and already proposed modulation spectrum (MS) metrics are optimized over different time and frequency ranges, with the aim of maximizing efficiency in the detection of pathological voices. The optimization of the metrics is performed simultaneously in two different voice databases in order to identify which tuning ranges produce better generalization. The experiments were cross-validated so as to ensure the validity of the results. A third database is used to test the optimized metrics. In spite of some differences, results indicate that the behavior of the metrics in the optimization process follows similar tendencies for the tuning databases, confirming the generalization capabilities of the proposed MS metrics. In addition, the tuning process reveals which bands of the modulation spectra have relevant information for each metric, which has a physical interpretation with respect to the phonatory system. Efficiency values up to 90.6% are obtained in one tuning database, while in the other, the maximum efficiency reaches 71.1%. The results obtained also show separability between normophonic and pathological states using the proposed metrics, which can be exploited for voice pathology detection or assessment.
|
46
|
Elfmarková N, Gajdoš M, Mračková M, Mekyska J, Mikl M, Rektorová I. Impact of Parkinson's disease and levodopa on resting state functional connectivity related to speech prosody control. Parkinsonism Relat Disord 2015; 22 Suppl 1:S52-5. [PMID: 26363673 DOI: 10.1016/j.parkreldis.2015.09.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 08/30/2015] [Revised: 09/01/2015] [Accepted: 09/02/2015] [Indexed: 11/29/2022]
Abstract
BACKGROUND Impaired speech prosody is common in Parkinson's disease (PD). We assessed the impact of PD and levodopa on MRI resting-state functional connectivity (rs-FC) underlying speech prosody control. METHODS We studied 19 PD patients in the OFF and ON dopaminergic conditions and 15 age-matched healthy controls using functional MRI and seed partial least squares correlation (PLSC) analysis. In the PD group, we also correlated levodopa-induced rs-FC changes with the results of acoustic analysis. RESULTS The PLSC analysis revealed a significant impact of PD but not of medication on the rs-FC strength of spatial correlation maps seeded by the anterior cingulate (p = 0.006), the right orofacial primary sensorimotor cortex (OF_SM1; p = 0.025) and the right caudate head (CN; p = 0.047). In the PD group, levodopa-induced changes in the CN and OF_SM1 connectivity strengths were related to changes in speech prosody. CONCLUSIONS We demonstrated an impact of PD but not of levodopa on rs-FC within the brain networks related to speech prosody control. When only the PD patients were taken into account, an association was found between treatment-induced changes in speech prosody and changes in rs-FC within the associative striato-prefrontal and motor speech networks.
Affiliation(s)
- Nela Elfmarková
- Brain and Mind Research Program, Central European Institute of Technology, CEITEC MU, Masaryk University, Brno, Czech Republic; First Department of Neurology, School of Medicine, Masaryk University and St. Anne's Hospital, Brno, Czech Republic
| | - Martin Gajdoš
- Brain and Mind Research Program, Central European Institute of Technology, CEITEC MU, Masaryk University, Brno, Czech Republic
| | - Martina Mračková
- Brain and Mind Research Program, Central European Institute of Technology, CEITEC MU, Masaryk University, Brno, Czech Republic; First Department of Neurology, School of Medicine, Masaryk University and St. Anne's Hospital, Brno, Czech Republic
| | - Jiří Mekyska
- Department of Telecommunications, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
| | - Michal Mikl
- Brain and Mind Research Program, Central European Institute of Technology, CEITEC MU, Masaryk University, Brno, Czech Republic
| | - Irena Rektorová
- Brain and Mind Research Program, Central European Institute of Technology, CEITEC MU, Masaryk University, Brno, Czech Republic; First Department of Neurology, School of Medicine, Masaryk University and St. Anne's Hospital, Brno, Czech Republic.
| |
|