1
Zhao B, Huepenbecker S, Zhu G, Rajan SS, Fujimoto K, Luo X. Comorbidity network analysis using graphical models for electronic health records. Front Big Data 2023; 6:846202. [PMID: 37663273 PMCID: PMC10470017 DOI: 10.3389/fdata.2023.846202] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2021] [Accepted: 07/25/2023] [Indexed: 09/05/2023]
Abstract
Importance: The comorbidity network represents multiple diseases and their relationships in a graph. Understanding comorbidity networks among critical care unit (CCU) patients can help doctors diagnose patients faster, minimize missed diagnoses, and potentially decrease morbidity and mortality.
Objective: The main objective of this study was to identify the comorbidity network among CCU patients using a novel application of a machine learning method (graphical modeling method). The second objective was to compare the machine learning method with a traditional pairwise method in simulation.
Method: This cross-sectional study used CCU patients' data from the Medical Information Mart for Intensive Care-3 (MIMIC-3) dataset, an electronic health record (EHR) of patients with CCU hospitalizations within Beth Israel Deaconess Hospital from 2001 to 2012. A machine learning method (graphical modeling method) was applied to identify the comorbidity network of 654 diagnosis categories among 46,511 patients.
Results: Of the 654 diagnosis categories, the graphical modeling method identified a comorbidity network of 2,806 associations among 510 diagnosis categories. Two medical professionals reviewed the comorbidity network and confirmed that the associations were consistent with current medical understanding. The strongest association in the network was between "poisoning by psychotropic agents" and "accidental poisoning by tranquilizers" (logOR 8.16), and the most connected diagnosis category was "disorders of fluid, electrolyte, and acid-base balance" (63 associated diagnosis categories). Other strong associations included "diagnoses of mitral and aortic valve" and "other rheumatic heart disease" (logOR 5.15). Using a data-driven approach, the method also partitioned the diagnosis categories into 14 modularity classes. The method outperformed traditional pairwise comorbidity network methods in simulation studies.
Conclusion and relevance: Our graphical modeling method inferred a logical comorbidity network whose associations were consistent with current medical understanding and outperformed traditional network methods in simulation. Our comorbidity network method can potentially assist CCU doctors in diagnosing patients faster and minimizing missed diagnoses.
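For readers unfamiliar with the log odds ratio (logOR) scale used above, the following is a minimal sketch of one common pairwise measure, computed from a 2x2 table of two binary diagnosis indicators. It illustrates the kind of traditional pairwise approach the abstract compares against, not the authors' graphical model, which estimates each association conditional on the other diagnoses. The function name, the toy data, and the 0.5 continuity correction are assumptions made for illustration.

```python
import math

def pairwise_log_odds_ratio(a, b):
    """Marginal log odds ratio between two binary diagnosis indicators.

    a, b: equal-length sequences of 0/1, one entry per patient.
    Adds 0.5 to every cell of the 2x2 table (Haldane-Anscombe correction,
    an assumption made here so that empty cells do not cause a division by zero).
    """
    if len(a) != len(b):
        raise ValueError("indicator vectors must have the same length")
    n11 = sum(1 for x, y in zip(a, b) if x and y)          # both diagnoses
    n10 = sum(1 for x, y in zip(a, b) if x and not y)      # only the first
    n01 = sum(1 for x, y in zip(a, b) if not x and y)      # only the second
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)  # neither
    return math.log(((n11 + 0.5) * (n00 + 0.5)) / ((n10 + 0.5) * (n01 + 0.5)))

# Hypothetical toy data: 8 patients, two diagnosis categories.
dx_a = [1, 1, 1, 0, 0, 0, 0, 1]
dx_b = [1, 1, 0, 0, 0, 0, 1, 1]
log_or = pairwise_log_odds_ratio(dx_a, dx_b)
print(round(log_or, 3))
```

A positive value means the two diagnoses co-occur more often than expected under independence. A pairwise measure like this cannot separate direct associations from those induced by a shared third diagnosis, which is the motivation for conditional (graphical) models.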
Affiliation(s)
- Bo Zhao: Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Sarah Huepenbecker: Department of Gynecologic Oncology and Reproductive Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Gen Zhu: Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Suja S. Rajan: Department of Management, Policy and Community Health, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Kayo Fujimoto: Department of Health Promotion and Behavioral Sciences, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Xi Luo: Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
2
Li W, Zhou Q, Liu W, Xu C, Tang ZR, Dong S, Wang H, Li W, Zhang K, Li R, Zhang W, Hu Z, Shibin S, Liu Q, Kuang S, Yin C. A Machine Learning-Based Predictive Model for Predicting Lymph Node Metastasis in Patients With Ewing's Sarcoma. Front Med (Lausanne) 2022; 9:832108. [PMID: 35463005 PMCID: PMC9020377 DOI: 10.3389/fmed.2022.832108] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/09/2021] [Accepted: 02/24/2022] [Indexed: 11/13/2022]
Abstract
Objective: To provide a reference for clinicians and support clinical work, we sought to develop and validate a risk prediction model for lymph node metastasis (LNM) of Ewing's sarcoma (ES) based on machine learning (ML) algorithms.
Methods: Clinicopathological data of 923 ES patients from the Surveillance, Epidemiology, and End Results (SEER) database and 51 ES patients from a multi-center external validation set were retrospectively collected. We applied ML algorithms to establish a risk prediction model. Model performance was assessed using 10-fold cross-validation in the training set and receiver operating characteristic (ROC) curve analysis in the external validation set. After determining the best model, a web-based calculator was built to promote clinical application.
Results: LNM was confirmed or could not be evaluated in 13.86% (135 of 974) of ES patients. In multivariate logistic regression, race, T stage, M stage, and lung metastases were independent predictors of LNM in ES. Six prediction models were established using random forest (RF), naive Bayes classifier (NBC), decision tree (DT), xgboost (XGB), gradient boosting machine (GBM), and logistic regression (LR). In 10-fold cross-validation, the average area under the curve (AUC) ranged from 0.705 to 0.764. In ROC curve analysis, the AUC ranged from 0.612 to 0.727. The RF model performed best. Accordingly, a web-based calculator was developed (https://share.streamlit.io/liuwencai2/es_lnm/main/es_lnm.py).
Conclusion: With the help of clinicopathological data, clinicians can better identify LNM in ES patients. The risk prediction models established in this study performed well, especially the RF model.
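The two evaluation ingredients named above, k-fold splitting and the AUC, can be sketched in plain Python as follows. This is an illustrative sketch only: the helper names `roc_auc` and `k_fold_indices` and the toy labels and scores are hypothetical, and the authors' actual pipeline (software, fold assignment, model fitting) is not described in the abstract. In a full cross-validation, a model would be trained on nine folds, scored on the held-out fold, and the per-fold AUCs averaged.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k disjoint, interleaved folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]

# Hypothetical toy example: 1 = lymph node metastasis, 0 = none.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
folds = k_fold_indices(23, k=10)
print(toy_auc, [len(f) for f in folds])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported ranges.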
Affiliation(s)
- Wenle Li: Department of Orthopedics, Xianyang Central Hospital, Xianyang, China; Clinical Medical Research Center, Xianyang Central Hospital, Xianyang, China
- Qian Zhou: Department of Respiratory and Critical Care Medicine, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Wencai Liu: Department of Orthopaedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Chan Xu: Department of Respiratory and Critical Care Medicine, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China; Department of Dermatology, Xianyang Central Hospital, Xianyang, China
- Zhi-Ri Tang: School of Physics and Technology, Wuhan University, Wuhan, China
- Shengtao Dong: Department of Spine Surgery, Second Affiliated Hospital of Dalian Medical University, Dalian, China
- Haosheng Wang: Department of Orthopaedics, The Second Hospital of Jilin University, Changchun, China
- Wanying Li: Clinical Medical Research Center, Xianyang Central Hospital, Xianyang, China
- Kai Zhang: Department of Orthopedics, Xianyang Central Hospital, Xianyang, China; Clinical Medical Research Center, Xianyang Central Hospital, Xianyang, China
- Rong Li: The First Clinical Medical College, Shaanxi University of Traditional Chinese Medicine, Xianyang, China
- Wenshi Zhang: The First Clinical Medical College, Shaanxi University of Traditional Chinese Medicine, Xianyang, China
- Zhaohui Hu: Department of Spinal Surgery, Liuzhou People's Hospital, Liuzhou, China
- Su Shibin: Department of Business Management, Xiamen Bank, Xiamen, China
- Qiang Liu: Clinical Medical Research Center, Xianyang Central Hospital, Xianyang, China
- Sirui Kuang: Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Chengliang Yin: Faculty of Medicine, Macau University of Science and Technology, Macau, China
3
Zhou L, Zheng X, Yang D, Wang Y, Bai X, Ye X. Application of multi-label classification models for the diagnosis of diabetic complications. BMC Med Inform Decis Mak 2021; 21:182. [PMID: 34098959 PMCID: PMC8182940 DOI: 10.1186/s12911-021-01525-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Accepted: 04/28/2021] [Indexed: 12/23/2022]
Abstract
Background: Early diagnosis of diabetic complications is of great clinical significance and in high demand. Given the complexity of diabetic complications, we applied a multi-label classification (MLC) model to predict four diabetic complications simultaneously using data from modern electronic health records (EHRs), and leveraged the correlations between the complications to further improve prediction accuracy.
Methods: We obtained demographic characteristics and laboratory data from the EHRs of patients admitted to Changzhou No. 2 People's Hospital, the affiliated hospital of Nanjing Medical University in China, from May 2013 to June 2020. The data included 93 biochemical indicators and 9,765 patients. We used the Pearson correlation coefficient (PCC) to analyze the correlations between different diabetic complications from a statistical perspective. We used an MLC model, based on the Random Forest (RF) technique, to leverage these correlations and predict four complications simultaneously. We explored four different MLC models: Label Power Set (LP), Classifier Chains (CC), Ensemble Classifier Chains (ECC), and Calibrated Label Ranking (CLR). We used traditional Binary Relevance (BR) as a comparison. We used 11 different performance metrics and the area under the receiver operating characteristic curve (AUROC) to evaluate these models. We analyzed the weights of the learned model and illustrated (1) the top 10 key indicators of different complications and (2) the correlations between different diabetic complications.
Results: The MLC models, including CC, ECC, and CLR, outperformed the traditional BR method in most performance metrics; the ECC models performed best in Hamming loss (0.1760), Accuracy (0.7020), F1_Score (0.7855), Precision (0.8649), F1_micro (0.8078), F1_macro (0.7773), Recall_micro (0.8631), Recall_macro (0.8009), and AUROC (0.8231). The two diabetic complication correlation matrices drawn from the PCC analysis and the MLC models were consistent with each other and indicated that the complications were correlated to different extents. The top 10 key indicators given by the model are valuable in medical application.
Conclusions: Our MLC model can effectively utilize the potential correlation between different diabetic complications to further improve prediction accuracy. This model should be explored further in other complex diseases with multiple complications.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12911-021-01525-7.
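Several of the multi-label metrics reported above (Hamming loss, F1_micro, F1_macro) have simple definitions that a short sketch can make concrete. The code below uses the standard textbook definitions; the function names and the toy label matrices are hypothetical, and the abstract does not state which software the authors used to compute their metrics.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual (patient, label) entries predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total

def _f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_micro_macro(y_true, y_pred):
    """Return (micro-F1, macro-F1) for binary multi-label predictions."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Micro: pool the counts over all labels, then compute one F1.
    micro = _f1(sum(c[0] for c in counts),
                sum(c[1] for c in counts),
                sum(c[2] for c in counts))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(_f1(*c) for c in counts) / n_labels
    return micro, macro

# Hypothetical toy data: 4 patients (rows) x 4 complications (columns).
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 1], [0, 1, 0, 1]]
hl = hamming_loss(y_true, y_pred)
micro_f1, macro_f1 = f1_micro_macro(y_true, y_pred)
print(hl, round(micro_f1, 4), macro_f1)
```

Micro-averaging weights every label decision equally, so frequent complications dominate, whereas macro-averaging weights every complication equally, so a rarely predicted label can pull the score down. Lower is better for Hamming loss and higher is better for the F1 scores.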
Affiliation(s)
- Liang Zhou: Department of Endocrinology, Changzhou No.2 People's Hospital Affiliated to Nanjing Medical University, 29 Xinglongxiang Road, Changzhou City, 213000, Jiangsu Province, China
- Xiaoyuan Zheng: Department of Endocrinology, Changzhou No.2 People's Hospital Affiliated to Nanjing Medical University, 29 Xinglongxiang Road, Changzhou City, 213000, Jiangsu Province, China
- Di Yang: Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Ying Wang: Department of Endocrinology, Changzhou No.2 People's Hospital Affiliated to Nanjing Medical University, 29 Xinglongxiang Road, Changzhou City, 213000, Jiangsu Province, China
- Xuesong Bai: Capital Medical University, Beijing, 100053, China
- Xinhua Ye: Department of Endocrinology, Changzhou No.2 People's Hospital Affiliated to Nanjing Medical University, 29 Xinglongxiang Road, Changzhou City, 213000, Jiangsu Province, China