1
|
Chiu CC, Wu CM, Chien TN, Kao LJ, Li C. Predicting ICU Readmission from Electronic Health Records via BERTopic with Long Short Term Memory Network Approach. J Clin Med 2024; 13:5503. [PMID: 39336990 PMCID: PMC11432694 DOI: 10.3390/jcm13185503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2024] [Revised: 09/04/2024] [Accepted: 09/10/2024] [Indexed: 09/30/2024] [Open Access]
Abstract
Background: The increasing rate of intensive care unit (ICU) readmissions poses significant challenges in healthcare, impacting both costs and patient outcomes. Predicting patient readmission after discharge is crucial for improving medical quality and reducing expenses. Traditional analyses of electronic health record (EHR) data have primarily focused on numerical data, often neglecting valuable text data. Methods: This study employs a hybrid model combining BERTopic and Long Short-Term Memory (LSTM) networks to predict ICU readmissions. Leveraging the MIMIC-III database, we utilize both quantitative and text data to enhance predictive capabilities. Our approach integrates the strengths of unsupervised topic modeling with supervised deep learning, extracting potential topics from patient records and transforming discharge summaries into topic vectors for more interpretable and personalized predictions. Results: Utilizing a comprehensive dataset of 36,232 ICU patient records, our model achieved an AUROC score of 0.80, thereby surpassing the performance of traditional machine learning models. The implementation of BERTopic facilitated effective utilization of unstructured data, generating themes that effectively guide the selection of relevant predictive factors for patient readmission prognosis. This significantly enhanced the model's interpretative accuracy and predictive capability. Additionally, the integration of importance ranking methods into our machine learning framework allowed for an in-depth analysis of the significance of various variables. This approach provided crucial insights into how different input variables interact and impact predictions of patient readmission across various clinical contexts. Conclusions: The practical application of BERTopic technology in our hybrid model contributes to more efficient patient management and serves as a valuable tool for developing tailored treatment strategies and resource optimization. 
This study highlights the significance of integrating unstructured text data with traditional quantitative data to develop more accurate and interpretable predictive models in healthcare, emphasizing the importance of individualized care and cost-effective healthcare paradigms.
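The AUROC of 0.80 reported above can be read as the probability that a randomly chosen readmitted patient receives a higher risk score than a randomly chosen non-readmitted patient. A minimal Python sketch of that pairwise definition follows; the function name `auroc` and the toy labels and scores are illustrative assumptions, not code or data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auroc(y_true, y_score))
```

The double loop is quadratic in the number of records, which is acceptable for illustration; a rank-based computation would be preferable for a dataset of tens of thousands of records.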
Affiliation(s)
- Chih-Chou Chiu: Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Chung-Min Wu: Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Te-Nien Chien: College of Management, National Taipei University of Technology, Taipei 106, Taiwan
- Ling-Jing Kao: Department of Business Management, National Taipei University of Technology, Taipei 106, Taiwan
- Chengcheng Li: College of Management, National Taipei University of Technology, Taipei 106, Taiwan
|
2
|
Du G, Zhang J, Jiang M, Long J, Lin Y, Li S, Tan KC. Graph-Based Class-Imbalance Learning With Label Enhancement. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:6081-6095. [PMID: 34928806 DOI: 10.1109/tnnls.2021.3133262] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/14/2023]
Abstract
Class imbalance is a common issue in the community of machine learning and data mining. The class-imbalance distribution can make most classical classification algorithms neglect the significance of the minority class and tend toward the majority class. In this article, we propose a label enhancement method to solve the class-imbalance problem in a graph manner, which estimates the numerical label and trains the inductive model simultaneously. It gives a new perspective on the class-imbalance learning based on the numerical label rather than the original logical label. We also present an iterative optimization algorithm and analyze the computation complexity and its convergence. To demonstrate the superiority of the proposed method, several single-label and multilabel datasets are applied in the experiments. The experimental results show that the proposed method achieves a promising performance and outperforms some state-of-the-art single-label and multilabel class-imbalance learning methods.
|
3
|
Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy with application in gene selection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
4
|
Dai L, Zhang J, Du G, Li C, Wei R, Li S. Toward embedding-based multi-label feature selection with label and feature collaboration. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07924-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
|
5
|
Karrar A, Mabrouk MS, Abdel Wahed M, Sayed AY. Auto diagnostic system for detecting solitary and juxtapleural pulmonary nodules in computed tomography images using machine learning. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07844-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2022]
Abstract
Lung cancer, as it appears in computed tomography (CT) scans, is one of the most serious cancers in the world, with the lowest survival rate after diagnosis. Lung nodules may be isolated from (solitary) or attached to (juxtapleural) other structures such as blood vessels or the pleura. Diagnosing lung nodules according to their location increases the survival rate because it supports diagnostic and therapeutic quality assurance. In this paper, a computer-aided diagnosis (CADx) system is proposed to classify solitary and juxtapleural nodules inside the lungs. Two auto-diagnostic schemes based on supervised learning are developed for lung nodule classification. In the first scheme, two segmentation approaches are proposed, (bounding box + maximum intensity projection) and (thresholding + K-means clustering), after which first- and second-order features are extracted. Fisher score ranking is used as the feature selection method, keeping the top five, ten, and fifteen ranked features, and a Support Vector Machine (SVM) classifier is used. In the second scheme, the same segmentation approaches are used with deep convolutional neural networks (DCNN), a successful tool for deep learning classification. Because the data sample is limited and imbalanced, tenfold cross-validation and random oversampling are used for both schemes. For diagnosis of the solitary nodule, the first scheme achieved its highest accuracy and sensitivity, 91.4% and 89.3%, respectively, using an SVM with a radial basis function kernel, the (thresholding + K-means clustering) segmentation approach, and the top 15 ranked features. In the second scheme, DCNN achieved its highest accuracy and sensitivity, 96% and 95%, respectively, for detecting the solitary nodule when applying the (bounding box + maximum intensity projection) segmentation approach. The receiver operating characteristic curve is used to evaluate classifier performance; the maximum AUC of 90.3% is achieved with the DCNN classifier for detecting solitary nodules. This CAD system acts as a second opinion for the radiologist to help in the early diagnosis of lung cancer. The accuracy, sensitivity, and specificity of scheme I (SVM) and scheme II (DCNN) showed promising results in comparison to other published studies.
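To illustrate the Fisher score ranking mentioned in this abstract, the following Python sketch scores each feature by its between-class scatter divided by its within-class scatter and keeps the top-ranked features. This is the common textbook form of the Fisher score, assumed here because the abstract does not give the exact formula; the names `fisher_score` and `top_k_features` and the toy data are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Fisher score of one feature: between-class over within-class scatter."""
    overall_mean = fmean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        between += len(group) * (fmean(group) - overall_mean) ** 2
        within += len(group) * pvariance(group)
    # A feature with zero within-class variance separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def top_k_features(columns, labels, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = [fisher_score(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)[:k]


# Hypothetical toy data: two features (one column each), four samples, two classes.
columns = [[1.0, 1.2, 3.0, 3.2], [0.5, 2.0, 0.6, 2.1]]
labels = [0, 0, 1, 1]
print(top_k_features(columns, labels, 1))
```

In the toy data the first feature separates the two classes cleanly and the second does not, so the first feature is ranked highest.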
|
6
|
Zhang J, Zou J, Su Z, Tang J, Kang Y, Xu H, Liu Z, Fan S. A class-aware supervised contrastive learning framework for imbalanced fault diagnosis. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
7
|
Wang S, Zhu X. Predictive Modeling of Hospital Readmission: Challenges and Solutions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2975-2995. [PMID: 34133285 DOI: 10.1109/tcbb.2021.3089682] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Hospital readmission prediction learns models from historical medical data to predict the probability that a patient returns to hospital within a certain period, e.g., 30 or 90 days, after discharge. The motivation is to help health providers deliver better treatment and post-discharge strategies, lower the hospital readmission rate, and eventually reduce medical costs. Due to the inherent complexity of diseases and healthcare ecosystems, modeling hospital readmission faces many challenges. A variety of methods have been developed, but the existing literature fails to deliver a complete picture that answers some fundamental questions, such as what the main challenges and solutions in modeling hospital readmission are; what typical features and models are used for readmission prediction; how to achieve meaningful and transparent predictions for decision making; and what conflicts may arise when deploying predictive approaches in real-world use. In this paper, we systematically review computational models for hospital readmission prediction and propose a taxonomy of challenges featuring four main categories: (1) data variety and complexity; (2) data imbalance, locality and privacy; (3) model interpretability; and (4) model implementation. The review summarizes methods in each category and highlights technical solutions proposed to address the challenges. In addition, a review of datasets and resources available for hospital readmission modeling provides firsthand materials to support researchers and practitioners in designing new approaches for effective and efficient hospital readmission prediction.
|
8
|
Neural network input feature selection using structured l2-norm penalization. Appl Intell 2022. [DOI: 10.1007/s10489-022-03539-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Artificial neural networks are referred to as universal approximators due to their inherent ability to reconstruct complex linear and nonlinear output maps, conceived as input-output relationships, from data sets. This can be done by reducing large networks via regularization in order to establish compact models with fewer parameters that describe the vital dependencies in the data. When data sets contain non-informative input features, a continuous, optimal input feature selection technique can lead to improved prediction or classification. We propose a continuous input selection technique based on a dimensional reduction mechanism that uses a 'structured' l2-norm regularization. The technique identifies the most informative feature subsets of a given data set via an adaptive training mechanism. The adaptation introduces a novel, modified gradient approach during training to deal with the non-differentiability of the structured norm penalty. When the method is applied to process data sets, the results indicate that the most informative inputs of artificial neural networks can be selected using a structured l2-norm penalization.
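A rough Python sketch of what a structured l2-norm penalty on network inputs can look like is given below. The abstract does not state the exact penalty, so this assumes the common group-lasso form, in which all first-layer weights leaving one input form a group and the penalty is the sum of the groups' l2 norms. The names `structured_l2_penalty` and `selected_inputs`, the tolerance, and the toy weights are assumptions for illustration only.

```python
import math


def group_norms(weights):
    """l2 norm of each input's outgoing weights; weights[i][j] links input i to unit j."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]


def structured_l2_penalty(weights, lam):
    """Assumed group-lasso penalty: lam times the sum of per-input l2 norms."""
    return lam * sum(group_norms(weights))


def selected_inputs(weights, tol=1e-6):
    """Inputs whose whole weight group has not been driven to (near) zero."""
    return [i for i, norm in enumerate(group_norms(weights)) if norm > tol]


# Hypothetical first-layer weights: three inputs, two hidden units.
W = [[3.0, 4.0], [0.0, 0.0], [0.6, 0.8]]
print(structured_l2_penalty(W, lam=0.1))
print(selected_inputs(W))
```

Because each group norm is not differentiable at zero, plain gradient descent needs special handling there, which is consistent with the modified gradient approach the abstract describes.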
|
9
|
|
10
|
Majority-to-minority resampling for boosting-based classification under imbalanced data. Appl Intell 2022. [DOI: 10.1007/s10489-022-03585-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
11
|
Forecasting Hospital Readmissions with Machine Learning. Healthcare (Basel) 2022; 10:981. [PMID: 35742033 PMCID: PMC9222500 DOI: 10.3390/healthcare10060981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 05/21/2022] [Accepted: 05/21/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Hospital readmissions are regarded as a compounding economic factor for healthcare systems. In fact, the readmission rate is used in many countries as an indicator of the quality of services provided by a health institution. The ability to forecast patients’ readmissions allows for timely intervention and better post-discharge strategies, preventing future life-threatening events, and reducing medical costs to either the patient or the healthcare system. In this paper, four machine learning models are used to forecast readmissions: support vector machines with a linear kernel, support vector machines with an RBF kernel, balanced random forests, and weighted random forests. The dataset consists of 11,172 actual records of hospitalizations obtained from the General Hospital of Komotini “Sismanogleio” with a total of 24 independent variables. Each record is composed of administrative, medical-clinical, and operational variables. The experimental results indicate that the balanced random forest model outperforms the competition, reaching a sensitivity of 0.70 and an AUC value of 0.78.
|
12
|
Balasaraswathi VR, Mary Shamala L, Hamid Y, Pachhaiammal Alias Priya M, Shobana M, Sugumaran M. An Efficient Feature Selection for Intrusion Detection System Using B-HKNN and C2 Search Based Learning Model. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10854-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
13
|
Demir F, Akbulut Y. A new deep technique using R-CNN model and L1NSR feature selection for brain MRI classification. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
|
14
|
AI on the edge: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10141-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
15
|
AI Models for Predicting Readmission of Pneumonia Patients within 30 Days after Discharge. Electronics 2022. [DOI: 10.3390/electronics11050673] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/27/2023]
Abstract
A model capable of precisely predicting readmission is a target being pursued worldwide. The objective of this study is to design predictive models using artificial intelligence methods and data retrieved from the National Health Insurance Research Database of Taiwan for identifying high-risk pneumonia patients with 30-day all-cause readmissions. An integrated genetic algorithm (GA) and support vector machine (SVM), namely IGS, was used to design predictive models optimized with three objective functions. In IGS, GA was used for selecting salient features and optimal SVM parameters, while SVM was used for constructing the models. For comparison, logistic regression (LR) and deep neural network (DNN) were also applied for model construction. The IGS model with AUC used as the objective function achieved an accuracy, sensitivity, specificity, and area under the ROC curve (AUC) of 70.11%, 73.46%, 69.26%, and 0.7758, respectively, outperforming the models designed with LR (65.77%, 78.44%, 62.54%, and 0.7689, respectively) and DNN (61.50%, 79.34%, 56.95%, and 0.7547, respectively), as well as previously reported models constructed using electronic health record data, which had an AUC of 0.71–0.74. It can be used for automatically detecting pneumonia patients at risk of all-cause readmission within 30 days after discharge so that suitable interventions can be administered to reduce readmissions and healthcare costs.
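The accuracy, sensitivity, and specificity figures quoted in this abstract all derive from the four cells of a binary confusion matrix. The following Python sketch shows the standard definitions; the function name `binary_metrics` and the toy predictions are illustrative assumptions and do not reproduce the study's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = readmitted).

    Assumes both classes occur in y_true; otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # readmitted, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # readmitted, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not readmitted, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not readmitted, flagged
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: three readmitted and five non-readmitted patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))
```

Sensitivity and specificity trade off against each other as the decision threshold moves, which is why the abstract also reports the threshold-independent AUC.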
|
16
|
Pan Z, Cai F, Chen W, Chen H. Session-based recommendation with an importance extraction module. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-06966-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
17
|
A Simple and Effective Approach Based on a Multi-Level Feature Selection for Automated Parkinson's Disease Detection. J Pers Med 2022; 12:55. [PMID: 35055370 PMCID: PMC8781034 DOI: 10.3390/jpm12010055] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/01/2021] [Revised: 12/27/2021] [Accepted: 12/30/2021] [Indexed: 12/07/2022] [Open Access]
Abstract
Parkinson’s disease (PD), a slowly progressing neurodegenerative disorder, negatively affects people’s daily lives. Early diagnosis is of great importance to minimize the effects of PD. One of the most important symptoms in the early diagnosis of PD is the monotony and distortion of speech. Artificial intelligence-based approaches can help specialists and physicians to automatically detect these disorders. In this study, a new and powerful approach based on multi-level feature selection was proposed to detect PD from features containing voice recordings of already-diagnosed cases. At the first level, feature selection was performed with the Chi-square and L1-Norm SVM algorithms (CLS). Then, the features that were extracted from these algorithms were combined to increase the representation power of the samples. At the last level, those samples that were highly distinctive from the combined feature set were selected with feature importance weights using the ReliefF algorithm. In the classification stage, popular classifiers such as KNN, SVM, and DT were used for machine learning, and the best performance was achieved with the KNN classifier. Moreover, the hyperparameters of the KNN classifier were selected with the Bayesian optimization algorithm, which further improved the performance of the proposed approach. The proposed approach was evaluated using a 10-fold cross-validation technique on a dataset containing PD and normal classes, and a classification accuracy of 95.4% was achieved.
|
18
|
Huang YT, Chiang DL, Chen TS, Wang SD, Lai FP, Lin YD. Lagrange interpolation-driven access control mechanism: Towards secure and privacy-preserving fusion of personal health records. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
|
19
|
Jia H, Zhang W, Zheng R, Wang S, Leng X, Cao N. Ensemble mutation slime mould algorithm with restart mechanism for feature selection. Int J Intell Syst 2021. [DOI: 10.1002/int.22776] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Affiliation(s)
- Heming Jia: College of Information Engineering, Sanming University, Sanming, China
- Wanying Zhang: College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, China
- Rong Zheng: College of Information Engineering, Sanming University, Sanming, China
- Shuang Wang: College of Information Engineering, Sanming University, Sanming, China
- Xin Leng: College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, China
- Ning Cao: College of Information Engineering, Sanming University, Sanming, China
|
20
|
United equilibrium optimizer for solving multimodal image registration. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
21
|
|
22
|
Feature selection via minimizing global redundancy for imbalanced data. Appl Intell 2021. [DOI: 10.1007/s10489-021-02855-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
23
|
|
24
|
Paul D, Jain A, Saha S, Mathew J. Multi-objective PSO based online feature selection for multi-label classification. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106966] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 02/07/2023]
|
25
|
Meng R, Soper B, Lee HKH, Liu VX, Greene JD, Ray P. Nonstationary multivariate Gaussian processes for electronic health records. J Biomed Inform 2021; 117:103698. [PMID: 33617985 DOI: 10.1016/j.jbi.2021.103698] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/04/2020] [Revised: 12/22/2020] [Accepted: 02/01/2021] [Indexed: 10/22/2022]
Abstract
Advances in the modeling and analysis of electronic health records (EHR) have the potential to improve patient risk stratification, leading to better patient outcomes. The modeling of complex temporal relations across the multiple clinical variables inherent in EHR data is largely unexplored. Existing approaches to modeling EHR data often lack the flexibility to handle time-varying correlations across multiple clinical variables, or they are too complex for clinical interpretation. Therefore, we propose a novel nonstationary multivariate Gaussian process model for EHR data to address the aforementioned drawbacks of existing methodologies. Our proposed model is able to capture time-varying scale, correlation and smoothness across multiple clinical variables. We also provide details on two inference approaches: maximum a posteriori and Hamiltonian Monte Carlo. Our model is validated on synthetic data, and we then demonstrate its effectiveness on EHR data from Kaiser Permanente Division of Research (KPDOR). Finally, we use the KPDOR EHR data to investigate the relationships between a clinical patient risk metric and the latent processes of our proposed model and demonstrate statistically significant correlations between these entities.
Affiliation(s)
- Rui Meng: Department of Statistics, University of California, Santa Cruz, CA, United States
- Braden Soper: Lawrence Livermore National Laboratory, Livermore, CA, United States
- Herbert K H Lee: Department of Statistics, University of California, Santa Cruz, CA, United States
- Vincent X Liu: Kaiser Permanente Division of Research, Oakland, CA, United States
- John D Greene: Kaiser Permanente Division of Research, Oakland, CA, United States
- Priyadip Ray: Lawrence Livermore National Laboratory, Livermore, CA, United States
|
26
|
|
27
|
A Microcosmic Syndrome Differentiation Model for Metabolic Syndrome with Multilabel Learning. Evidence-Based Complementary and Alternative Medicine 2020; 2020:9081641. [PMID: 33294001 PMCID: PMC7714575 DOI: 10.1155/2020/9081641] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/21/2020] [Revised: 10/22/2020] [Accepted: 10/31/2020] [Indexed: 12/14/2022]
Abstract
Background: Metabolic syndrome (MS) is a complex multisystem disease. Traditional Chinese medicine (TCM) is effective in preventing and treating MS. Syndrome differentiation, which is composed of location and/or nature syndrome elements, is the basis of TCM treatment. At present, objective and comprehensive syndrome differentiation in MS still faces some problems, and this study proposes a solution to two of them. First, TCM syndromes are concurrent; that is, multiple TCM syndromes may develop in the same patient. Second, there is a lack of holistic exploration of the relationship between microscopic indexes and TCM syndromes. Multilabel learning (MLL) methods from machine learning can address both problems, and a microcosmic syndrome differentiation model can be built that provides a foundation for a subsequent model of multidimensional syndrome differentiation in MS. Methods: A standardized scale of TCM four-diagnostic information for MS was designed and used to obtain the results of TCM diagnosis. The microcosmic syndrome differentiation model was constructed from 39 physicochemical indexes using an MLL technique called ML-kNN. First, the multilabel learning method was compared with three commonly used single learning algorithms. Then, the ML-kNN results based on physicochemical indexes were compared with those based on TCM information. Finally, the influence of the parameter k on the diagnostic model was investigated and the best k value was chosen for TCM diagnosis. Results: A total of 698 cases were collected for modeling the microcosmic diagnosis of MS. The ML-kNN model clearly outperformed the others overall, with an average diagnostic precision of 71.4%. The results from ML-kNN based on physicochemical indexes were similar to those based on TCM information. In addition, the k value had little influence on the prediction results from ML-kNN. Conclusions: In the present study, the microcosmic syndrome differentiation model of MS built with MLL techniques was good at predicting syndrome elements and could be used to solve diagnosis problems involving multiple labels. The results also suggest a complex correlation between TCM syndrome elements and physicochemical indexes, which is worth future investigation to promote the development of objective differentiation of MS.
|