1
Wang Y, Li F, Zhang X, Wang P, Li Y, Zhang Y. Intra-subject enveloped multilayer fuzzy sample compression for speech diagnosis of Parkinson's disease. Med Biol Eng Comput 2024; 62:371-388. [PMID: 37874453 DOI: 10.1007/s11517-023-02944-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Accepted: 10/02/2023] [Indexed: 10/25/2023]
Abstract
Machine learning-based Parkinson's disease (PD) speech diagnosis is a current research hotspot. However, existing methods use each corpus sample as the base unit for modeling. Since different corpus samples within the same subject have different sensitive speech features, it is difficult to obtain unified and stable sensitive speech features (diagnostic markers) that reflect the pathology of the whole subject. Therefore, this study aims to compress the corpus samples within each subject to facilitate the search for diagnostic markers with high diagnostic accuracy. A two-step sample compression module (TSCM) is proposed to solve this problem. It includes two major parts: a sample pruning module (SPM) and a sample fuzzy clustering mechanism (SFCMD). By stacking multiple TSCMs, a multilayer sample compression module (MSCM) is formed to obtain multilayer compression samples. After that, a simultaneous sample/feature selection mechanism (SS/FSM) is designed for feature selection. Based on the multilayer compression samples processed by MSCM and SS/FSM, a novel ensemble learning algorithm (EMSFE) is designed with a sparse fusion ensemble learning mechanism (SFELM). The proposed EMSFE is validated by visualization of extracted features and performance comparison with related algorithms. The experimental results show that the proposed algorithm can effectively extract stable diagnostic markers by compressing the corpus samples within the subject. Furthermore, based on leave-one-subject-out (LOSO) cross validation, the proposed algorithm with an extreme learning machine (ELM) classifier achieves accuracies of 92.5%, 93.75% and 91.67% on three datasets, respectively. The proposed EMSFE can extract unified and stable sensitive features that accurately reflect the overall pathology of the subject, which can better meet the requirements of clinical applications. The code and datasets can be found at: https://github.com/wywwwww/EMSFE-supplementary-material.git.
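The leave-one-subject-out (LOSO) protocol named in the abstract can be illustrated with a minimal sketch. It shows only how the folds are formed, not the EMSFE pipeline itself; the function name `loso_splits` and the toy subject IDs are assumptions made for illustration.

```python
from collections import defaultdict

def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx) for each subject.

    All samples of one subject form the test fold, so no subject
    contributes samples to both training and testing.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for sid, test_idx in by_subject.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(subject_ids)) if i not in held_out]
        yield sid, train_idx, test_idx

# Hypothetical toy data: 3 subjects with several corpus samples each.
subjects = ["s1", "s1", "s2", "s2", "s2", "s3"]
splits = list(loso_splits(subjects))
for sid, train_idx, test_idx in splits:
    print(sid, train_idx, test_idx)
```

Splitting by subject rather than by sample matters here because several corpus samples come from the same speaker; a per-sample split would let one speaker appear on both sides of the evaluation.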
Affiliation(s)
- Yiwen Wang
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Fan Li
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Xiaoheng Zhang
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Pin Wang
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Yongming Li
  - School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
- Yanling Zhang
  - Department of Neurology, Southwest Hospital, Army Medical University, Chongqing, 400038, China

2
Wen P, Zhang Y, Wen G. Intelligent personalized diagnosis modeling in advanced medical system for Parkinson's disease using voice signals. Mathematical Biosciences and Engineering: MBE 2023; 20:8085-8102. [PMID: 37161187 DOI: 10.3934/mbe.2023351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Machine learning methods are currently used for the early detection of Parkinson's disease (PD) from voice signals. Because the vocal system of each person is unique, and the same person's pronunciation can differ at different times, the training samples used in machine learning can differ greatly from the speech signal of the patient to be diagnosed, frequently resulting in poor diagnostic performance. To address this, this paper presents a new intelligent personalized diagnosis method (PDM) for Parkinson's disease. The method begins by constructing new training data: the best classifier is assigned to each training sample, which is composed of features from the speech signals of patients. Subsequently, a meta-classifier is trained on the new training data. Finally, for the signal of each test patient, the method uses the meta-classifier to select the most appropriate classifier and then applies the selected classifier to classify the signal, so that a more accurate diagnosis of the test patient can be obtained. The novelty of the proposed method is that it uses different classifiers to diagnose PD for different patients, whereas current methods use the same classifier to diagnose all patients to be tested. Results of a large number of experiments show that PDM not only improves the performance but also exceeds the existing methods in speed.
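The three steps in the abstract (assign a best classifier to each training sample, train a meta-classifier on that assignment, let the meta-classifier pick a classifier per test signal) can be sketched as follows. This is only an illustration of the idea, not the authors' implementation: the threshold classifiers, the "first classifier that is correct" rule for choosing the best classifier, the 1-nearest-neighbour meta-classifier, and the toy data are all assumptions.

```python
def make_threshold_clf(feature, threshold):
    """Hypothetical base classifier: predict 1 if x[feature] > threshold."""
    return lambda x: int(x[feature] > threshold)

def assign_best_classifier(x, y, classifiers):
    """Index of the first classifier that labels (x, y) correctly, else 0.

    Assumption: the abstract does not say how "best" is decided.
    """
    for k, clf in enumerate(classifiers):
        if clf(x) == y:
            return k
    return 0

def nearest_neighbour(x, points, targets):
    """1-NN stand-in for the meta-classifier (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in points]
    return targets[dists.index(min(dists))]

# Toy training data (made up): two features per sample, binary label.
X_train = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8), (0.1, 0.1)]
y_train = [1, 1, 1, 1, 0]
classifiers = [make_threshold_clf(0, 0.5), make_threshold_clf(1, 0.5)]

# Step 1: new training data, mapping each sample to its best classifier index.
meta_targets = [assign_best_classifier(x, y, classifiers)
                for x, y in zip(X_train, y_train)]

def predict(x):
    # Step 2: the meta-classifier selects a classifier for this sample.
    k = nearest_neighbour(x, X_train, meta_targets)
    # Step 3: the selected classifier produces the diagnosis.
    return classifiers[k](x)

print(meta_targets, predict((0.15, 0.85)), predict((0.05, 0.05)))
```

In this toy setup neither base classifier alone labels all five training samples correctly, which is the situation the per-sample selection is meant to handle.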
Affiliation(s)
- Pengcheng Wen
  - College of Intelligent Systems Science and Engineering, Hubei University for Nationalities, Enshi 445000, China
- Yuhan Zhang
  - Southern Medical University, Affiliated Dongguan Songshan Lake Central Hospital, Dongguan 523000, China
- Guihua Wen
  - School of Computer Science & Engineering, South China University of Technology, Guangzhou 510000, China

3
Meng W, Zhang Q, Ma S, Cai M, Liu D, Liu Z, Yang J. A lightweight CNN and Transformer hybrid model for mental retardation screening among children from spontaneous speech. Comput Biol Med 2022; 151:106281. [PMID: 36399858 DOI: 10.1016/j.compbiomed.2022.106281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 10/17/2022] [Accepted: 10/30/2022] [Indexed: 11/06/2022]
Abstract
Mental retardation (MR) is a group of mental disorders characterized by low intelligence and social adjustment difficulties. Early diagnosis enables timely intervention for children with MR and can ease the degree of disability. Children with MR always have impaired speech functions compared to normal children, which is significant for clinical diagnosis. On this basis, our study proposes a spontaneous speech-based framework (MT-Net) for screening MR, which merges mobile inverted bottleneck convolutional blocks (MBConv) and visual Transformer blocks. MT-Net takes log-mel spectrograms converted from raw interview speech as its data source, and utilizes MBConv and visual Transformer to learn low-level and high-level features well. In addition, SpecAugment, a data augmentation strategy, is used to expand our audio dataset to further enhance the performance of MT-Net. The experimental results show that our proposed MT-Net outperforms Transformer networks (ViT) and convolutional neural networks (ResNet18, MobileNetV2, EfficientNetV2), achieving an accuracy of 91.60% after using SpecAugment. Our proposed MT-Net has fewer parameters, low computational cost and high prediction accuracy, and is expected to serve as an auxiliary screening tool for MR.
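As a rough illustration of the SpecAugment strategy mentioned in the abstract, the sketch below applies one frequency mask and one time mask to a spectrogram stored as a list of mel rows. It covers only the two masking operations (time warping is omitted), and the function name, mask sizes, and toy input are assumptions rather than the settings used for MT-Net.

```python
import random

def spec_augment(spec, max_freq_mask=2, max_time_mask=3, rng=None):
    """Return a copy of `spec` (list of mel rows, each a list of frames)
    with one random frequency band and one random time span set to 0.0.

    Assumes max_freq_mask <= number of rows and
    max_time_mask <= number of frames.
    """
    rng = rng or random.Random()
    n_mels, n_frames = len(spec), len(spec[0])
    out = [row[:] for row in spec]           # leave the input untouched

    f = rng.randint(0, max_freq_mask)        # mask height in mel bins
    f0 = rng.randint(0, n_mels - f)          # first masked bin
    for m in range(f0, f0 + f):
        out[m] = [0.0] * n_frames

    t = rng.randint(0, max_time_mask)        # mask width in frames
    t0 = rng.randint(0, n_frames - t)        # first masked frame
    for row in out:
        row[t0:t0 + t] = [0.0] * t
    return out

# Hypothetical 4x6 "log-mel spectrogram" filled with ones.
spec = [[1.0] * 6 for _ in range(4)]
augmented = spec_augment(spec, rng=random.Random(0))
for row in augmented:
    print(row)
```

Each call with a fresh random state yields a differently masked copy, so repeated calls on the same recording expand the training set without new audio.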
Affiliation(s)
- Wei Meng
  - School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
- Qianhong Zhang
  - School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
- Simeng Ma
  - Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Mincheng Cai
  - School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
- Dujuan Liu
  - Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Zhongchun Liu
  - Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Jun Yang
  - School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China

4
Li Y, Liu C, Wang P, Zhang H, Wei A, Zhang Y. Envelope multi-type transformation ensemble algorithm of Parkinson speech samples. Appl Intell 2022. [DOI: 10.1007/s10489-022-04345-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]

5
Ngo QC, Motin MA, Pah ND, Drotár P, Kempster P, Kumar D. Computerized analysis of speech and voice for Parkinson's disease: A systematic review. Computer Methods and Programs in Biomedicine 2022; 226:107133. [PMID: 36183641 DOI: 10.1016/j.cmpb.2022.107133] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/23/2022] [Revised: 09/13/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE: Speech impairment is an early symptom of Parkinson's disease (PD). This study summarizes the literature on the use of speech and voice to detect PD and assess its severity.
METHODS: A systematic review of the literature from 2010 to 2021 was conducted to investigate analysis methods and signal features. The keywords "Automatic analysis" in conjunction with "PD speech" or "PD voice" were used, and the PubMed and ScienceDirect databases were searched. A total of 838 papers were found on the first run, of which 189 were selected. One hundred and forty-seven were found to be suitable for the review. The different datasets, recording protocols, signal analysis methods and features that were reported are listed. Values of the features that separate PD patients from healthy controls were tabulated. Finally, the barriers that limit the wide use of computerized speech analysis are discussed.
RESULTS: Speech and voice may be valuable markers for PD. However, large differences between the datasets make it difficult to compare different studies. In addition, speech analytic methods that are not informed by physiological understanding may alienate clinicians.
CONCLUSIONS: The potential usefulness of speech and voice for the detection and assessment of PD is confirmed by evidence from the classification and correlation results.
Affiliation(s)
- Mohammod Abdul Motin
  - Biosignals Lab, RMIT University, Melbourne, Australia; Department of Electrical & Electronic Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh
- Nemuel Daniel Pah
  - Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Indonesia
- Peter Drotár
  - Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001, Kosice, Slovakia
- Peter Kempster
  - Neurosciences Department, Monash Health, Clayton, VIC, Australia; Department of Medicine, School of Clinical Sciences, Monash University, Clayton, VIC, Australia
- Dinesh Kumar
  - Biosignals Lab, RMIT University, Melbourne, Australia

6
A Speech-Based Hybrid Decision Support System for Early Detection of Parkinson's Disease. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-07249-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]

7
Yang M, Ma J, Wang P, Huang Z, Li Y, Liu H, Hameed Z. Hierarchical Boosting Dual-Stage Feature Reduction Ensemble Model for Parkinson's Disease Speech Data. Diagnostics (Basel) 2021; 11:2312. [PMID: 34943549 PMCID: PMC8700329 DOI: 10.3390/diagnostics11122312] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/29/2021] [Revised: 11/24/2021] [Accepted: 11/30/2021] [Indexed: 11/16/2022]
Abstract
As a neurodegenerative disease, Parkinson's disease (PD) is hard to identify at the early stage, while using speech data to build a machine learning diagnosis model has proved effective in its early diagnosis. However, speech data show high degrees of redundancy, repetition, and unnecessary noise, which influence the accuracy of diagnosis results. Although feature reduction (FR) could alleviate this issue, traditional FR is one-sided: traditional feature extraction can construct high-quality features but without feature preference, while traditional feature selection can achieve feature preference but cannot construct high-quality features. To address this issue, the Hierarchical Boosting Dual-Stage Feature Reduction Ensemble Model (HBD-SFREM) is proposed in this paper. The major contributions of HBD-SFREM are as follows: (1) The instance space of the deep hierarchy is built by an iterative deep extraction mechanism. (2) The manifold feature extraction method embeds the nearest neighbor feature preference method to form the dual-stage feature reduction pair. (3) The dual-stage feature reduction pair is iteratively performed by the AdaBoost mechanism to obtain instance features with higher quality, thus achieving a substantial improvement in model recognition accuracy. (4) The deep hierarchy instance space is integrated into the original instance space to improve the generalization of the algorithm. Three PD speech datasets and a self-collected dataset are used to test HBD-SFREM in this paper. Compared with other FR algorithms and deep learning algorithms, the accuracy of HBD-SFREM in PD speech recognition is improved significantly and is not affected by a small sample dataset. Thus, HBD-SFREM could give a reference for other related studies.
Affiliation(s)
- Mingyao Yang
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China
- Jie Ma
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China
- Pin Wang
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China
- Zhiyong Huang
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China
  - Correspondence: Tel.: +86-138-83216321 (Z.H.); +86-023-65103544 (Y.L.)
- Yongming Li
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China
  - Correspondence: Tel.: +86-138-83216321 (Z.H.); +86-023-65103544 (Y.L.)
- He Liu
  - Chongqing Academy of Educational Sciences, Chongqing 400000, China
- Zeeshan Hameed
  - College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400000, China

8
Er MB, Isik E, Isik I. Parkinson’s detection based on combined CNN and LSTM using enhanced speech signals with Variational mode decomposition. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.103006] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/27/2022]

9
Gunduz H. An efficient dimensionality reduction method using filter-based feature selection and variational autoencoders on Parkinson's disease classification. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102452] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 12/12/2022]