1
Wang Y, Wang H, Li Z, Zhang H, Yang L, Li J, Tang Z, Hou S, Wang Q. Sound as a bell: a deep learning approach for health status classification through speech acoustic biomarkers. Chin Med 2024; 19:101. [PMID: 39049005 PMCID: PMC11267751 DOI: 10.1186/s13020-024-00973-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 07/16/2024] [Indexed: 07/27/2024] [Open Access]
Abstract
BACKGROUND Human health is a complex, dynamic concept encompassing a spectrum of states influenced by genetic, environmental, physiological, and psychological factors. Traditional Chinese Medicine categorizes health into nine body constitutional types, each reflecting unique balances or imbalances in vital energies, influencing physical, mental, and emotional states. Advances in machine learning models offer promising avenues for diagnosing conditions like Alzheimer's, dementia, and respiratory diseases by analyzing speech patterns, enabling complementary non-invasive disease diagnosis. The study aims to use speech audio to identify subhealth populations characterized by unbalanced constitution types.
METHODS Participants, aged 18-45, were selected from the Acoustic Study of Health. Audio recordings were collected using ATR2500X-USB microphones and Praat software. Exclusion criteria included recent illness, dental issues, and specific medical histories. The audio data were preprocessed to Mel-frequency cepstral coefficients (MFCCs) for model training. Three deep learning models, a 1-Dimensional Convolution Network (Conv1D), a 2-Dimensional Convolution Network (Conv2D), and a Long Short-Term Memory network (LSTM), were implemented in Python to classify health status. Saliency maps were generated to provide model explainability.
RESULTS The study used 1,378 recordings from balanced (healthy) and 1,413 from unbalanced (subhealth) types. The Conv1D model achieved a training accuracy of 91.91% and validation accuracy of 84.19%. The Conv2D model had 96.19% training accuracy and 84.93% validation accuracy. The LSTM model showed 92.79% training accuracy and 87.13% validation accuracy, with early signs of overfitting. AUC scores were 0.92 and 0.94 (Conv1D), 0.99 (Conv2D), and 0.97 (LSTM). All models demonstrated robust performance, with Conv2D excelling in discrimination accuracy.
CONCLUSIONS The deep learning classification of human speech audio for health status using body constitution types showed promising results with Conv1D, Conv2D, and LSTM models. Analysis of ROC curves, training accuracy, and validation accuracy showed that all models robustly distinguished between balanced and unbalanced constitution types. Conv2D excelled with good accuracy, while Conv1D and LSTM also performed well, affirming their reliability. The study integrates constitution theory and deep learning technologies to classify subhealth populations using a noninvasive approach, thereby promoting personalized medicine and early intervention strategies.
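The MFCC preprocessing mentioned in the methods rests on the mel frequency scale. As a rough illustration only, the sketch below uses one common mel-scale formula, m = 2595 log10(1 + f/700), to place filter-bank band edges. The abstract does not state which mel variant, number of bands, or toolkit the authors used, so the helper names `hz_to_mel`, `mel_to_hz`, and `mel_band_edges` and all parameter values are assumptions made for this example.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mel using one common formula (an assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> list:
    """Return n_bands + 2 edge frequencies (Hz), equally spaced on the mel scale.

    Triangular mel filters are typically built between consecutive edges;
    the band count here is a free parameter, not a value from the study.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


if __name__ == "__main__":
    # Hypothetical settings: 10 bands between 0 Hz and 8 kHz.
    for edge in mel_band_edges(0.0, 8000.0, 10):
        print(round(edge, 1))
```

Edges that are equally spaced in mel grow further apart in Hz as frequency rises, which is why MFCCs resolve low frequencies more finely than high ones.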
Affiliation(s)
- Yanbing Wang
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Haiyan Wang
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Zhuoxuan Li
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Haoran Zhang
  - School of Management, Beijing University of Chinese Medicine, Beijing, 100029, China
- Liwen Yang
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Jiarui Li
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Zixiang Tang
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Shujuan Hou
  - National Institute of TCM Constitution and Preventive Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
- Qi Wang
  - National Institute of TCM Constitution and Preventive Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China
2
Shabber SM, Sumesh EP. AFM signal model for dysarthric speech classification using speech biomarkers. Front Hum Neurosci 2024; 18:1346297. [PMID: 38445096 PMCID: PMC10912169 DOI: 10.3389/fnhum.2024.1346297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/05/2024] [Indexed: 03/07/2024] [Open Access]
Abstract
Neurological disorders include various conditions affecting the brain, spinal cord, and nervous system, which result in reduced performance in different organs and muscles throughout the human body. Dysarthria is a neurological disorder that significantly impairs an individual's ability to communicate effectively through speech. Individuals with dysarthria are characterized by muscle weakness that results in slow, slurred, and less intelligible speech production. Efficient identification of speech disorders at the beginning stages helps doctors suggest proper medications. The classification of dysarthric speech assumes a pivotal role as a diagnostic tool, enabling accurate differentiation between healthy speech patterns and those affected by dysarthria. Achieving a clear distinction between dysarthric speech and the speech of healthy individuals is made possible through the application of advanced machine learning techniques. In this work, we conducted feature extraction by utilizing the amplitude and frequency modulated (AFM) signal model, resulting in the generation of a comprehensive array of unique features. A method involving Fourier-Bessel series expansion is employed to separate various components within a complex speech signal into distinct elements. Subsequently, the Discrete Energy Separation Algorithm is utilized to extract essential parameters, namely the amplitude envelope and instantaneous frequency, from each component within the speech signal. To ensure the robustness and applicability of our findings, we harnessed data from various sources, including TORGO, UA Speech, and Parkinson datasets. Furthermore, the performance of the KNN, SVM, LDA, NB, and Boosted Tree classifiers was evaluated using multiple measures, including the area under the curve, F1-score, sensitivity, and accuracy. Our analyses resulted in classification accuracies ranging from 85% to 97.8% and F1-scores ranging between 0.90 and 0.97.
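The energy-separation step described above can be illustrated with a small sketch. The abstract does not say which variant of the Discrete Energy Separation Algorithm was used, so the code below assumes the DESA-2 form built on the Teager-Kaiser energy operator, and it operates on a single component that has already been isolated (the Fourier-Bessel decomposition is not shown). The function names `teager_energy` and `desa2` are hypothetical.

```python
import math


def teager_energy(x, n):
    """Teager-Kaiser energy operator at sample n: x[n]^2 - x[n-1] * x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2(x):
    """Estimate amplitude envelope and instantaneous frequency (rad/sample).

    Assumed DESA-2 form, with y[n] = x[n+1] - x[n-1]:
        omega[n] ~= 0.5 * acos(1 - psi(y)[n] / (2 * psi(x)[n]))
        |a[n]|   ~= 2 * psi(x)[n] / sqrt(psi(y)[n])
    Estimates are returned for samples 2 .. len(x) - 3; samples where an
    energy is non-positive yield NaN instead of an estimate.
    """
    n_total = len(x)
    y = [0.0] * n_total
    for n in range(1, n_total - 1):
        y[n] = x[n + 1] - x[n - 1]

    amps, freqs = [], []
    for n in range(2, n_total - 2):
        psi_x = teager_energy(x, n)
        psi_y = teager_energy(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            amps.append(float("nan"))
            freqs.append(float("nan"))
            continue
        # Clamp to the valid acos domain to absorb rounding error.
        arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        freqs.append(0.5 * math.acos(arg))
        amps.append(2.0 * psi_x / math.sqrt(psi_y))
    return amps, freqs


if __name__ == "__main__":
    # Synthetic single-tone check: amplitude 2.0, frequency 0.3 rad/sample.
    tone = [2.0 * math.cos(0.3 * n + 0.5) for n in range(50)]
    a, w = desa2(tone)
    print(a[0], w[0])
```

For a constant-amplitude, constant-frequency cosine the estimates are exact up to rounding; for real speech components they are only approximations, and this form resolves frequencies below a quarter of the sampling rate.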
3
Ali L, Javeed A, Noor A, Rauf HT, Kadry S, Gandomi AH. Parkinson's disease detection based on features refinement through L1 regularized SVM and deep neural network. Sci Rep 2024; 14:1333. [PMID: 38228772 PMCID: PMC10791701 DOI: 10.1038/s41598-024-51600-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 01/07/2024] [Indexed: 01/18/2024] [Open Access]
Abstract
In previous studies, replicated and multiple types of speech data have been used for Parkinson's disease (PD) detection. However, two main problems in these studies are lower PD detection accuracy and inappropriate validation methodologies leading to unreliable results. This study discusses the effects of inappropriate validation methodologies used in previous studies and highlights the use of appropriate alternative validation methods that would ensure generalization. To enhance PD detection accuracy, we propose a two-stage diagnostic system that refines the extracted set of features through an L1-regularized linear support vector machine and classifies the refined subset of features through a deep neural network. To rigorously evaluate the effectiveness of the proposed diagnostic system, experiments are performed on two different voice recording-based benchmark datasets. For both datasets, the proposed diagnostic system achieves 100% accuracy under leave-one-subject-out (LOSO) cross-validation (CV) and 97.5% accuracy under k-fold CV. The results show that the proposed system outperforms the existing methods regarding PD detection accuracy. The results suggest that the proposed diagnostic system is essential to improving non-invasive diagnostic decision support in PD.
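The validation concern raised in this abstract can be made concrete with a small sketch of leave-one-subject-out splitting. When each subject contributes several recordings, a record-wise k-fold split can place recordings from the same person in both the training and test sets, whereas LOSO holds out all of one subject's recordings at a time. The function name `leave_one_subject_out` and the subject labels are hypothetical, and the sketch shows only the splitting logic, not the authors' feature selection or network.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids[i] is the subject who produced recording i. Every recording
    of the held-out subject goes to the test fold, so no speaker appears on
    both sides of a split.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    for sid, test_idx in by_subject.items():
        train_idx = [i for i, s in enumerate(subject_ids) if s != sid]
        yield sid, train_idx, test_idx


if __name__ == "__main__":
    # Hypothetical data: three recordings from s1, two from s2, one from s3.
    subjects = ["s1", "s1", "s1", "s2", "s2", "s3"]
    for held_out, train, test in leave_one_subject_out(subjects):
        print(held_out, train, test)
```

Any feature selection or scaling would need to be fitted inside each training fold; fitting it on the full dataset before splitting reintroduces the leakage that subject-wise splitting is meant to prevent.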
Affiliation(s)
- Liaqat Ali
  - Department of Electrical Engineering, University of Science and Technology Bannu, Bannu, Pakistan
- Ashir Javeed
  - Aging Research Center, Karolinska Institutet, Solna, Sweden
- Adeeb Noor
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, 80221, Jeddah, Saudi Arabia
- Seifedine Kadry
  - Department of Applied Data Science, Noroff University College, Kristiansand, Norway
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, 346, United Arab Emirates
  - Department of Electrical and Computer Engineering, Lebanese American University, Byblos, Lebanon
- Amir H Gandomi
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, 2007, Australia
  - University Research and Innovation Center (EKIK), Óbuda University, Budapest, 1034, Hungary
4
da Silva ACF, de Araújo Lima-Filho LM, Almeida AA, Coêlho HFC, Ribeiro VV, Lopes LW. Spectrographic Voice Analysis Protocol (SAP): Convergent, Concurrent, and Accuracy Validity. J Voice 2023:S0892-1997(23)00283-7. [PMID: 37863674 DOI: 10.1016/j.jvoice.2023.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 10/22/2023]
Abstract
OBJECTIVE To verify the convergent and concurrent validity of the Spectrographic Voice Analysis Protocol (SAP) and its accuracy in discriminating dysphonic from nondysphonic patients.
METHOD The study used 82 vowel /Ɛ/ samples and their respective narrowband spectrograms, analyzed with SAP. Cepstral peak prominence (CPP) and cepstral peak prominence smoothed (CPPS) verified the convergent validity of the SAP total score, while the general grade of vocal deviation (GG) verified the concurrent validity of the SAP total score. The receiver operating characteristic (ROC) curve and its accuracy, sensitivity, and specificity values, positive predictive value (PPV) and negative predictive value (NPV), and positive likelihood ratio (LR+) and negative likelihood ratio (LR-) verified the accuracy of the SAP score in discriminating dysphonic from nondysphonic individuals.
RESULTS Dysphonic and nondysphonic individuals had different SAP total scores. In the convergent validity analysis, the SAP score had a weak and a moderate negative correlation with CPP and CPPS, respectively, as well as a moderate positive correlation with GG. SAP performed well in discriminating dysphonic from nondysphonic individuals (area under the curve = 82.0%; sensitivity = 91.7%; specificity = 51.7%; PPV = 93.7%; NPV = 44.0%; LR+ = 6.21; LR- = 0.53) based on the 8-point cutoff score.
CONCLUSION SAP has convergent validity with CPP and CPPS and concurrent validity with GG. The SAP total score performed well in discriminating dysphonic from nondysphonic individuals. However, the specificity, NPV, and LR- values justify using SAP cautiously, always in combination with other information in clinical voice assessment.
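For readers unfamiliar with the accuracy measures listed in this abstract, the sketch below shows the standard definitions computed from a 2x2 confusion matrix. The counts in the usage example are invented for illustration and are not the study's data; the function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from confusion counts.

    tp/fn: cases with the condition that test positive/negative.
    fp/tn: cases without the condition that test positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Likelihood ratios; guard against division by zero at perfect values.
    lr_pos = sensitivity / (1.0 - specificity) if specificity < 1.0 else float("inf")
    lr_neg = (1.0 - sensitivity) / specificity if specificity > 0.0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to make the arithmetic easy to follow.
    for name, value in diagnostic_metrics(tp=45, fp=10, fn=5, tn=40).items():
        print(f"{name}: {value:.3f}")
```

PPV and NPV depend on how common the condition is in the sample, while sensitivity, specificity, and the likelihood ratios do not, which is one reason a protocol can show a high PPV alongside a modest specificity.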
Affiliation(s)
- Anna Alice Almeida
  - Universidade Federal da Paraíba (UFPB), Decision Models and Health Program, João Pessoa, Paraíba, Brazil
- Vanessa Veis Ribeiro
  - Universidade de Brasília (UNB), Speech-Language and Hearing Department, Brasília, Federal District, Brazil
- Leonardo Wanderley Lopes
  - Universidade Federal da Paraíba (UFPB), Decision Models and Health Program, João Pessoa, Paraíba, Brazil
5
Biswas SK, Nath Boruah A, Saha R, Raj RS, Chakraborty M, Bordoloi M. Early detection of Parkinson disease using stacking ensemble method. Comput Methods Biomech Biomed Engin 2023; 26:527-539. [PMID: 35587795 DOI: 10.1080/10255842.2022.2072683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Parkinson's disease (PD) is a common progressive neurodegenerative disorder that occurs due to corrosion of the substantia nigra, located in the thalamic region of the human brain, which is responsible for the transmission of neural signals throughout the human body using a brain chemical termed "dopamine." Diagnosis of PD is difficult, as it is often affected by the characteristics of the patients' medical data, which include the presence of various indicators, imbalanced patient data records, similar cases of healthy and affected persons, etc. Hence, the process of diagnosis may sometimes also be affected by human error. To overcome this problem, some intelligent models have been proposed; however, most of them are single classifier-based models, and as a result they cannot handle noisy and imbalanced data properly and thus sometimes overfit. To reduce bias and variance, and to avoid the overfitting of a single classifier-based model, this paper proposes an ensemble-based PD diagnosis model, named Ensembled Expert System for Diagnosis of Parkinson's Disease (EESDPD), with relevant features and a simple stacking ensemble technique. The proposed EESDPD aggregates diverse assumptions for making the prediction. The performance of the proposed EESDPD is compared with the performances of logistic regression, SVM, Naïve Bayes, Random Forest, XGBoost, simple Decision Tree, B-TDS-PD and B-TESM-PD in terms of classification accuracy, precision, recall and F1-score measures.
Affiliation(s)
- Saroj Kumar Biswas
  - Computer Science and Engineering Department, National Institute of Technology, Silchar, India
- Arpita Nath Boruah
  - Computer Science and Engineering Department, National Institute of Technology, Silchar, India
- Rajib Saha
  - Computer Science and Engineering Department, National Institute of Technology, Silchar, India
- Ravi Shankar Raj
  - Computer Science and Engineering Department, National Institute of Technology, Silchar, India
- Manomita Chakraborty
  - School of Computer Science and Engineering, VIT-AP University, Amaravathi, India
- Monali Bordoloi
  - School of Computer Science and Engineering, VIT-AP University, Amaravathi, India
6
Likhachov DS, Vashkevich MI, Petrovsky NA, Azarov ES. Small-size spectral features for machine learning in voice signal analysis and classification tasks. Informatics 2023. [DOI: 10.37661/1816-0301-2023-20-1-102-112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023] [Open Access]
Abstract
Objectives. The problem addressed is the development of a method for calculating small-sized spectral features that increases the efficiency of existing machine learning systems for analyzing and classifying voice signals.
Methods. Spectral features are extracted using a generative approach, which involves calculating a discrete Fourier spectrum for a sequence of samples generated using an autoregressive model of the input voice signal. The generated sequence processed by the discrete Fourier transform takes the periodicity of the transform into account and thereby increases the accuracy of spectral estimation of the analyzed signal.
Results. A generative method for calculating spectral features intended for use in machine learning systems for the analysis and classification of voice signals is proposed and described. An experimental analysis of the accuracy and stability of the spectrum representation of a test signal with a known spectral composition has been carried out using spectral envelopes. The envelopes were calculated using the proposed generative method and using the discrete Fourier transform with different analysis windows (rectangular window and Hann window). The analysis showed that spectral envelopes obtained using the proposed method more accurately represent the spectrum of the test signal according to the minimum square error criterion. The effectiveness of voice signal classification with the proposed features is compared with that of features based on mel-frequency cepstral coefficients. A diagnostic system for amyotrophic lateral sclerosis was used as a basic test system to evaluate the effectiveness of the proposed approach in practice.
Conclusion. The experimental results showed a significant increase in classification accuracy when using the proposed approach for calculating features compared with features based on mel-frequency cepstral coefficients.
Affiliation(s)
- D. S. Likhachov
  - Belarusian State University of Informatics and Radioelectronics
- N. A. Petrovsky
  - Belarusian State University of Informatics and Radioelectronics
- E. S. Azarov
  - Belarusian State University of Informatics and Radioelectronics
7
Shrivas A, Deshpande S, Gidaye G, Nirmal J, Ezzine K, Frikha M, Desai K, Shinde S, Oza AD, Burduhos-Nergis DD, Burduhos-Nergis DP. Employing Energy and Statistical Features for Automatic Diagnosis of Voice Disorders. Diagnostics (Basel) 2022; 12:2758. [PMID: 36428819 PMCID: PMC9689977 DOI: 10.3390/diagnostics12112758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 10/30/2022] [Accepted: 11/09/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
The presence of laryngeal disease affects vocal fold dynamics and thus causes changes in pitch, loudness, and other characteristics of the human voice. Many frameworks based on the acoustic analysis of speech signals have been created in recent years; however, they are evaluated on just one or two corpora and are not independent of voice illnesses and human bias. In this article, a unified wavelet-based paradigm for evaluating voice diseases is presented. This approach is independent of voice diseases, human bias, or dialect. The vocal folds' dynamics are impacted by the voice disorder, and this further modifies the sound source. Therefore, inverse filtering is used to capture the modified voice source. Furthermore, fundamental-frequency-independent statistical and energy metrics are derived from each spectral sub-band to characterize the retrieved voice source. Speech recordings of the sustained vowel /a/ were collected from four different datasets in German, Spanish, English, and Arabic to run several intra- and inter-dataset experiments. The performance indicators achieved by the classifiers show that energy and statistical features uncover vital information on a variety of clinical voices, and therefore the suggested approach can be used as a complementary means for the automatic medical assessment of voice diseases.
Affiliation(s)
- Avinash Shrivas
  - Department of Computer Science & Technology, Degree College of Physical Education, Sant Gadge Baba Amravati University, Amravati 444605, India
  - Correspondence: (A.S.); (D.P.B.-N.); Tel.: +91-9819261821 (A.S.)
- Shrinivas Deshpande
  - Department of Computer Science & Technology, Degree College of Physical Education, Sant Gadge Baba Amravati University, Amravati 444605, India
- Girish Gidaye
  - Department of Electronics and Computer Science, Vidyalankar Institute of Technology, Mumbai University, Mumbai 400037, India
- Jagannath Nirmal
  - Department of Electronics Engineering, Somaiya Vidyavihar University, Mumbai 400077, India
- Kadria Ezzine
  - ATISP, ENET’COM, Sfax University, Sfax 3000, Tunisia
- Kamalakar Desai
  - Department of Electronics and Telecommunication Engineering, Bharati Vidyapeeth’s College of Engineering, Shivaji University, Kolhapur 416013, India
- Sachin Shinde
  - Department of Mechanical Engineering, Datta Meghe College of Engineering, Mumbai University, Airoli, Navi Mumbai 400708, India
- Ankit D. Oza
  - Department of Computer Sciences and Engineering, Institute of Advanced Research, The University for Innovation, Gandhinagar 382426, India
- Dumitru Doru Burduhos-Nergis
  - Faculty of Materials Science and Engineering, Gheorghe Asachi Technical University of Iasi, 700050 Iasi, Romania
- Diana Petronela Burduhos-Nergis
  - Faculty of Materials Science and Engineering, Gheorghe Asachi Technical University of Iasi, 700050 Iasi, Romania
8
Rana A, Dumka A, Singh R, Panda MK, Priyadarshi N. A Computerized Analysis with Machine Learning Techniques for the Diagnosis of Parkinson's Disease: Past Studies and Future Perspectives. Diagnostics (Basel) 2022; 12:2708. [PMID: 36359550 PMCID: PMC9689408 DOI: 10.3390/diagnostics12112708] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/12/2022] [Revised: 10/30/2022] [Accepted: 11/02/2022] [Indexed: 08/03/2023] [Open Access]
Abstract
According to the World Health Organization (WHO), Parkinson's disease (PD) is a neurodegenerative disease of the brain that causes motor symptoms including slower movement, rigidity, tremor, and imbalance in addition to other problems like Alzheimer's disease (AD), psychiatric problems, insomnia, anxiety, and sensory abnormalities. Techniques including artificial intelligence (AI), machine learning (ML), and deep learning (DL) have been established for the classification of PD and normal controls (NC) with similar therapeutic appearances in order to address these problems and improve the diagnostic procedure for PD. In this article, we examine a literature survey of research articles published up to September 2022 in order to present an in-depth analysis of the use of datasets, various modalities, experimental setups, and architectures that have been applied in the diagnosis of subjective disease. This analysis includes a total of 217 research publications with a list of the various datasets, methodologies, and features. These findings suggest that ML/DL methods and novel biomarkers hold promising results for application in medical decision-making, leading to a more methodical and thorough detection of PD. Finally, we highlight the challenges and provide appropriate recommendations on selecting approaches that might be used for subgrouping and connection analysis with structural magnetic resonance imaging (sMRI), DaTSCAN, and single-photon emission computerized tomography (SPECT) data for future Parkinson's research.
Affiliation(s)
- Arti Rana
  - Computer Science & Engineering, Veer Madho Singh Bhandari Uttarakhand Technical University, Dehradun 248007, Uttarakhand, India
- Ankur Dumka
  - Department of Computer Science and Engineering, Women Institute of Technology, Dehradun 248007, Uttarakhand, India
  - Department of Computer Science & Engineering, Graphic Era Deemed to be University, Dehradun 248001, Uttarakhand, India
- Rajesh Singh
  - Division of Research and Innovation, Uttaranchal Institute of Technology, Uttaranchal University, Dehradun 248007, Uttarakhand, India
  - Department of Project Management, Universidad Internacional Iberoamericana, Campeche 24560, Mexico
- Manoj Kumar Panda
  - Department of Electrical Engineering, G.B. Pant Institute of Engineering and Technology, Pauri 246194, Uttarakhand, India
- Neeraj Priyadarshi
  - Department of Electrical Engineering, JIS College of Engineering, Kolkata 741235, West Bengal, India
9
Karan B, Sahu SS, Orozco-Arroyave JR. An investigation about the relationship between dysarthria level of speech and the neurological state of Parkinson’s patients. Biocybern Biomed Eng 2022. [DOI: 10.1016/j.bbe.2022.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/07/2023]
10
Ali L, Chakraborty C, He Z, Cao W, Imrana Y, Rodrigues JJPC. A novel sample and feature dependent ensemble approach for Parkinson’s disease detection. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07046-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Parkinson’s disease (PD) is a neurological disease that has been reported to have affected most people worldwide. Recent research pointed out that about 90% of PD patients possess voice disorders. Motivated by this fact, many researchers proposed methods based on multiple types of speech data for PD prediction. However, these methods either suffer from low accuracy or lack generalization. To develop an approach that is free of these issues, in this paper we propose a novel ensemble approach. The contributions of this paper are twofold. The first is an investigation of feature selection integration with a deep neural network (DNN), validating its effectiveness by comparing its performance with a conventional DNN and other similar integrated systems. The second is the development of a novel ensemble model, namely EOFSC (Ensemble model with Optimal Features and Sample Dependant Base Classifiers), that exploits the findings of recently published studies. Recent research pointed out that for different types of voice data, different optimal models are obtained which are sensitive to different types of samples and subsets of features. In this paper, we further consolidate the findings by utilizing the proposed integrated system and propose the development of EOFSC. For multiple types of vowel phonations, multiple base classifiers are obtained which are sensitive to different subsets of features. These feature- and sample-dependent base classifiers are integrated, and the proposed EOFSC model is constructed. To obtain the final prediction of the EOFSC model, the majority voting methodology is adopted. Experimental results point out that feature selection integration with neural networks improves the performance of conventional neural networks. Additionally, feature selection integration with DNN outperforms feature selection integration with conventional machine learning models. Finally, the newly developed ensemble model is observed to improve PD detection accuracy by 6.5%.
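The majority-voting step mentioned in this abstract can be sketched in a few lines. This is a generic hard-voting illustration, not the EOFSC model itself: the base classifiers, their feature subsets, and the label encoding (1 for PD, 0 for healthy) are assumptions, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse label predictions from several base classifiers by hard voting.

    predictions: one list of labels per classifier, all of equal length.
    Returns one fused label per sample. With an even number of voters a tie
    is resolved in favour of the label seen first among the tied ones, so an
    odd number of classifiers is the safer choice for binary problems.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    fused = []
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        fused.append(label)
    return fused


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers on four samples.
    clf_a = [1, 0, 1, 1]
    clf_b = [1, 1, 0, 1]
    clf_c = [0, 0, 1, 1]
    print(majority_vote([clf_a, clf_b, clf_c]))
```

Hard voting of this kind only helps when the base classifiers make partly different errors, which is consistent with the abstract's emphasis on classifiers that are sensitive to different samples and feature subsets.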
11
Ryu JY, Chung HY, Choi KY. Potential role of artificial intelligence in craniofacial surgery. Arch Craniofac Surg 2021; 22:223-231. [PMID: 34732033 PMCID: PMC8568494 DOI: 10.7181/acfs.2021.00507] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/07/2021] [Accepted: 10/20/2021] [Indexed: 12/25/2022] [Open Access]
Abstract
The field of artificial intelligence (AI) is rapidly advancing, and AI models are increasingly applied in the medical field, especially in medical imaging, pathology, natural language processing, and biosignal analysis. On the basis of these advances, telemedicine, which allows people to receive medical services outside of hospitals or clinics, is also developing in many countries. The mechanisms of deep learning used in medical AI include convolutional neural networks, residual neural networks, and generative adversarial networks. Herein, we investigate the possibility of using these AI methods in the field of craniofacial surgery, with potential applications including craniofacial trauma, congenital anomalies, and cosmetic surgery.
Affiliation(s)
- Jeong Yeop Ryu
  - Department of Plastic and Reconstructive Surgery, School of Medicine, Kyungpook National University, Daegu, Korea
- Ho Yun Chung
  - Department of Plastic and Reconstructive Surgery, School of Medicine, Kyungpook National University, Daegu, Korea
  - Cell & Matrix Research Institute, School of Medicine, Kyungpook National University, Daegu, Korea
- Kang Young Choi
  - Department of Plastic and Reconstructive Surgery, School of Medicine, Kyungpook National University, Daegu, Korea
12
Sentiment analysis of pets using deep learning technologies in artificial intelligence of things system. Soft Comput 2021. [DOI: 10.1007/s00500-021-06038-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023]
13
Karan B, Sahu SS, Orozco-Arroyave JR, Mahto K. Non-negative matrix factorization-based time-frequency feature extraction of voice signal for Parkinson's disease prediction. Comput Speech Lang 2021. [DOI: 10.1016/j.csl.2021.101216] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]
14
Belkhou A, Jbari A, Badlaoui OE. A computer-aided-diagnosis system for neuromuscular diseases using Mel frequency Cepstral coefficients. Scientific African 2021. [DOI: 10.1016/j.sciaf.2021.e00904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] [Open Access]
15
Mei J, Desrosiers C, Frasnelli J. Machine Learning for the Diagnosis of Parkinson's Disease: A Review of Literature. Front Aging Neurosci 2021; 13:633752. [PMID: 34025389 PMCID: PMC8134676 DOI: 10.3389/fnagi.2021.633752] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 11/26/2020] [Accepted: 03/22/2021] [Indexed: 12/26/2022] [Open Access]
Abstract
Diagnosis of Parkinson's disease (PD) is commonly based on medical observations and assessment of clinical signs, including the characterization of a variety of motor symptoms. However, traditional diagnostic approaches may suffer from subjectivity as they rely on the evaluation of movements that are sometimes subtle to human eyes and therefore difficult to classify, leading to possible misclassification. In the meantime, early non-motor symptoms of PD may be mild and can be caused by many other conditions. Therefore, these symptoms are often overlooked, making diagnosis of PD at an early stage challenging. To address these difficulties and to refine the diagnosis and assessment procedures of PD, machine learning methods have been implemented for the classification of PD and healthy controls or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes). To provide a comprehensive overview of data modalities and machine learning methods that have been used in the diagnosis and differential diagnosis of PD, in this study, we conducted a literature review of studies published until February 14, 2020, using the PubMed and IEEE Xplore databases. A total of 209 studies were included, extracted for relevant information and presented in this review, with an investigation of their aims, sources of data, types of data, machine learning methods and associated outcomes. These studies demonstrate a high potential for adaptation of machine learning methods and novel biomarkers in clinical decision making, leading to increasingly systematic, informed diagnosis of PD.
Affiliation(s)
- Jie Mei
- Chemosensory Neuroanatomy Lab, Department of Anatomy, Université du Québec à Trois-Rivières (UQTR), Trois-Rivières, QC, Canada
- Christian Desrosiers
- Laboratoire d'Imagerie, de Vision et d'Intelligence Artificielle (LIVIA), Department of Software and IT Engineering, École de Technologie Supérieure, Montreal, QC, Canada
- Johannes Frasnelli
- Chemosensory Neuroanatomy Lab, Department of Anatomy, Université du Québec à Trois-Rivières (UQTR), Trois-Rivières, QC, Canada
- Centre de Recherche de l'Hôpital du Sacré-Coeur de Montréal, Centre Intégré Universitaire de Santé et de Services Sociaux du Nord-de-l'Île-de-Montréal (CIUSSS du Nord-de-l'Île-de-Montréal), Montreal, QC, Canada
16
Alves M, Silva G, Bispo BC, Dajer ME, Rodrigues PM. Voice Disorders Detection Through Multiband Cepstral Features of Sustained Vowel. J Voice 2021; 37:322-331. [PMID: 33663909 DOI: 10.1016/j.jvoice.2021.01.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/30/2020] [Revised: 01/18/2021] [Accepted: 01/21/2021] [Indexed: 11/28/2022]
Abstract
This study aims to detect voice disorders related to vocal fold nodule, Reinke's edema, and neurological pathologies through multiband cepstral features of the sustained vowel /a/. Detection is performed between pairs of study groups, and multiband analysis is accomplished using the wavelet transform. For each pair of groups, parameter selection is carried out. Time series of the selected parameters are used as input for four classifiers with leave-one-out cross-validation. Classification accuracies of 100% are achieved for all pairs including the control group, surpassing state-of-the-art methods based on cepstral features, while accuracies higher than 88.50% are obtained for the pathological pairs. The results indicate that the method may be adequate to assist in the diagnosis of the voice disorders addressed. The results must be updated in the future with a larger population to ensure generalization.
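The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 1-nearest-neighbor classifier, the function names, and the toy two-dimensional feature vectors are all assumptions standing in for the selected cepstral parameters and the four classifiers used in the study.

```python
from math import dist

def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query` (1-NN)."""
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]

def leave_one_out_accuracy(features, labels):
    """Hold out each sample once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Hypothetical feature vectors; real inputs would be the selected parameters.
toy_features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
toy_labels = ["control"] * 3 + ["pathological"] * 3

accuracy = leave_one_out_accuracy(toy_features, toy_labels)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

With N samples the loop trains N models, each tested on a single held-out sample, which is why the scheme suits small cohorts like the one described here.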
Affiliation(s)
- Marco Alves
- Universidade Católica Portuguesa, CBQF - Centro de Biotecnologia e Química Fina - Laboratório Associado, Escola Superior de Biotecnologia, Porto, Portugal.
- Gabriel Silva
- Universidade Católica Portuguesa, CBQF - Centro de Biotecnologia e Química Fina - Laboratório Associado, Escola Superior de Biotecnologia, Porto, Portugal.
- Bruno C Bispo
- Department of Electrical and Electronic Engineering, Federal University of Santa Catarina, Florianópolis-SC, Brazil.
- María E Dajer
- Department of Electrical Engineering, Federal University of Technology - Paraná, Cornélio Procópio-PR, Brazil.
- Pedro M Rodrigues
- Universidade Católica Portuguesa, CBQF - Centro de Biotecnologia e Química Fina - Laboratório Associado, Escola Superior de Biotecnologia, Porto, Portugal.
17
Classification of ALS patients based on acoustic analysis of sustained vowel phonations. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2020.102350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
18
Jeancolas L, Petrovska-Delacrétaz D, Mangone G, Benkelfat BE, Corvol JC, Vidailhet M, Lehéricy S, Benali H. X-Vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection From Speech. Front Neuroinform 2021; 15:578369. [PMID: 33679361 PMCID: PMC7935511 DOI: 10.3389/fninf.2021.578369] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/30/2020] [Accepted: 01/18/2021] [Indexed: 01/18/2023] Open
Abstract
Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease and the gender effect. In this article, we have adapted the latest speaker recognition system, called x-vectors, in order to detect PD at an early stage using voice analysis. X-vectors are embeddings extracted from Deep Neural Networks (DNNs), which provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard classifier MFCC-GMM (Mel-Frequency Cepstral Coefficients with Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (recently diagnosed PD subjects and healthy controls) with a high-quality microphone and via the telephone network. Men and women were analyzed separately in order to have more precise models and to assess a possible gender effect. Several experimental and methodological aspects were tested in order to analyze their impact on classification performance. We assessed the impact of the audio segment durations, data augmentation, type of dataset used for the neural network training, kind of speech tasks, and back-end analyses. The x-vector technique provided better classification performance than MFCC-GMM for the text-independent tasks, and seemed to be particularly suited for the early detection of PD in women (7–15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
Affiliation(s)
- Laetitia Jeancolas
- Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France; Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
- Graziella Mangone
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France; Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Badr-Eddine Benkelfat
- Laboratoire SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, Palaiseau, France
- Jean-Christophe Corvol
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France; Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Marie Vidailhet
- Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France; Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neurology, Clinical Investigation Center for Neurosciences, Paris, France
- Stéphane Lehéricy
- Paris Brain Institute-ICM, Centre de NeuroImagerie de Recherche-CENIR, Paris, France; Sorbonne University, Inserm, CNRS, Paris Brain Institute-ICM, Paris, France; Assistance Publique Hôpitaux de Paris, Hôpital Pitié-Salpêtrière, Department of Neuroradiology, Paris, France
- Habib Benali
- Department of Electrical & Computer Engineering, PERFORM Center, Concordia University, Montreal, QC, Canada
19
Prediction and Estimation of Parkinson’s Disease Severity Based on Voice Signal. J Voice 2020; 36:439.e9-439.e20. [DOI: 10.1016/j.jvoice.2020.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/28/2020] [Revised: 06/07/2020] [Accepted: 06/08/2020] [Indexed: 10/23/2022]
20
Jahnavi BS, Supraja BS, Lalitha S. A vital neurodegenerative disorder detection using speech cues. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-179714] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Affiliation(s)
- B. Sai Jahnavi
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- B. Sai Supraja
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
- S. Lalitha
- Department of Electronics and Communication Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, India
21
Tougui I, Jilbab A, El Mhamdi J. Heart disease classification using data mining tools and machine learning techniques. Health and Technology 2020. [DOI: 10.1007/s12553-020-00438-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
22
Soumaya Z, Taoufiq BD, Benayad N, Achraf B, Ammoumou A. A Hybrid Method for the Diagnosis and Classifying Parkinson's Patients based on Time-frequency Domain Properties and K-nearest Neighbor. Journal of Medical Signals & Sensors 2020; 10:60-66. [PMID: 32166079 PMCID: PMC7038745 DOI: 10.4103/jmss.jmss_61_18] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/06/2018] [Revised: 07/13/2019] [Accepted: 09/07/2019] [Indexed: 11/12/2022]
Abstract
Tremor of the hands and arms is the main symptom of Parkinson's disease (PD). However, impairment of the vocal cords leads to speech defects, which are another reliable symptom of the disease. This article presents a diagnostic model of PD based on the time–frequency wavelet transform (WT) and Mel-frequency cepstral coefficients (MFCC). The proposed processing transforms the vocal signal with a WT-based method, extracts the MFCC coefficients, and finally classifies sick and healthy subjects with the K-nearest neighbor (KNN) classifier. The analysis uses a database that contains 18 healthy subjects and 20 patients. The Daubechies mother wavelet is used to compress the vocal signal before the MFCC coefficients are extracted. For the diagnosis of PD, the KNN classifier gives 89% accuracy when 52% of the database is used as training data; when this percentage is increased from 52% to 73%, accuracy reaches 98.68%, which is higher than that of the support-vector machine classifier. The KNN is conclusive in the determination of PD. Moreover, the more training data is used, the more precise the results are.
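The KNN evaluation at different training fractions described in this abstract can be sketched as follows. This is a hedged illustration only: `knn_predict`, `holdout_accuracy`, the value `k=3`, and the toy feature vectors are assumptions, and the code does not reproduce the study's wavelet and MFCC features or its reported accuracies.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Majority vote over the k training samples nearest to `query`."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def holdout_accuracy(samples, train_fraction, k=3):
    """Train on the first `train_fraction` of samples, test on the remainder."""
    cut = int(len(samples) * train_fraction)
    train, test = samples[:cut], samples[cut:]
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

# Hypothetical (feature vector, label) pairs standing in for MFCC features.
samples = [
    ((0.0, 0.2), "healthy"), ((5.0, 5.1), "pd"),
    ((0.3, 0.1), "healthy"), ((4.8, 5.2), "pd"),
    ((0.1, 0.4), "healthy"), ((5.3, 4.9), "pd"),
    ((0.2, 0.0), "healthy"), ((5.1, 5.3), "pd"),
]

for fraction in (0.5, 0.75):
    print(fraction, holdout_accuracy(samples, fraction))
```

A real evaluation would also need to keep all recordings from one speaker on the same side of the split; this sketch ignores that concern.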
Affiliation(s)
- Zayrit Soumaya
- Laboratory Industrial Engineering, Information Processing and Logistics (GITIL), Faculty of Science Ain Chok. University Hassan II - Casablanca, Morocco
- Belhoussine Drissi Taoufiq
- Laboratory Industrial Engineering, Information Processing and Logistics (GITIL), Faculty of Science Ain Chok. University Hassan II - Casablanca, Morocco
- Nsiri Benayad
- Laboratory Research Center STIS, M2CS, Higher School of Technical Education of Rabat (ENSET), Mohammed V University in Rabat, Morocco
- Benba Achraf
- Electronic Systems Sensors and Nanobiotechnologies (E2SN), ENSET, Mohammed V University in Rabat, Morocco
- Abdelkrim Ammoumou
- Laboratory Industrial Engineering, Information Processing and Logistics (GITIL), Faculty of Science Ain Chok. University Hassan II - Casablanca, Morocco
23
Kuresan H, Samiappan D, Masunda S. Fusion of WPT and MFCC feature extraction in Parkinson's disease diagnosis. Technol Health Care 2020; 27:363-372. [PMID: 30664511 DOI: 10.3233/thc-181306] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/21/2023]
Abstract
BACKGROUND Parkinson's disease (PD) is a progressive neurological disorder. To provide customized patient care, diagnosis, and monitoring using smart gadgets, smartphones, and smartwatches, there is a need for a system that works in natural as well as controlled environments. OBJECTIVE AND METHODS The primary purpose is to record a speech signal and identify whether or not it indicates Parkinson's disease. For this work, three feature extraction methods were compared: wavelet packet transform (WPT), MFCC, and a fusion of MFCC and WPT. Two classifiers were used: HMM and SVM. RESULTS In this study, the fusion of MFCC and WPT with HMM shows the best performance parameters. CONCLUSION The best of the three feature extraction and classifier results are described in this paper.
24
Gaballah A, Parsa V, Andreetta M, Adams S. Objective and Subjective Speech Quality Assessment of Amplification Devices for Patients With Parkinson’s Disease. IEEE Trans Neural Syst Rehabil Eng 2019; 27:1226-1235. [DOI: 10.1109/tnsre.2019.2915172] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
25
On the design of automatic voice condition analysis systems. Part I: Review of concepts and an insight to the state of the art. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2018.12.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/14/2022]
26
Hegde S, Shetty S, Rai S, Dodderi T. A Survey on Machine Learning Approaches for Automatic Detection of Voice Disorders. J Voice 2018; 33:947.e11-947.e33. [PMID: 30316551 DOI: 10.1016/j.jvoice.2018.07.014] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 03/29/2018] [Revised: 07/06/2018] [Accepted: 07/10/2018] [Indexed: 10/28/2022]
Abstract
The human voice production system is an intricate biological device capable of modulating pitch and loudness. Inherent internal and/or external factors often damage the vocal folds and result in some change of voice. The consequences are reflected in body functioning and emotional standing. Hence, it is paramount to identify voice changes at an early stage and provide the patient with an opportunity to overcome any ramification and enhance their quality of life. In this line of work, automatic detection of voice disorders using machine learning techniques plays a key role, as it is proven to help ease the process of understanding the voice disorder. In recent years, many researchers have investigated techniques for an automated system that helps clinicians with early diagnosis of voice disorders. In this paper, we present a survey of research work conducted on automatic detection of voice disorders and explore how it is able to identify the different types of voice disorders. We also analyze different databases, feature extraction techniques, and machine learning approaches used in these research works.
Affiliation(s)
- Sarika Hegde
- NMAM Institute of Technology, Udupi, Karnataka, India.
- Smitha Rai
- NMAM Institute of Technology, Udupi, Karnataka, India
- Thejaswi Dodderi
- Nitte Institute of Speech & Hearing, Mangaluru, Karnataka, India
27
Upadhya SS, Cheeran A, Nirmal J. Thomson Multitaper MFCC and PLP voice features for early detection of Parkinson disease. Biomed Signal Process Control 2018. [DOI: 10.1016/j.bspc.2018.07.019] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
28
Benba A, Jilbab A, Hammouch A. Using Human Factor Cepstral Coefficient on Multiple Types of Voice Recordings for Detecting Patients with Parkinson's Disease. Ing Rech Biomed 2017. [DOI: 10.1016/j.irbm.2017.10.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 10/18/2022]
29
Rusz J, Novotny M, Hlavnicka J, Tykalova T, Ruzicka E. High-Accuracy Voice-Based Classification Between Patients With Parkinson’s Disease and Other Neurological Diseases May Be an Easy Task With Inappropriate Experimental Design. IEEE Trans Neural Syst Rehabil Eng 2017; 25:1319-1321. [PMID: 28113773 DOI: 10.1109/tnsre.2016.2621885] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022]