1
Reza MS, Amin R, Yasmin R, Kulsum W, Ruhi S. Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data. Heliyon 2024; 10:e24536. [PMID: 38312584 PMCID: PMC10834804 DOI: 10.1016/j.heliyon.2024.e24536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 01/06/2024] [Accepted: 01/10/2024] [Indexed: 02/06/2024] Open
Abstract
Diabetes mellitus, a chronic metabolic disorder, continues to be a major public health issue around the world. It is estimated that one in every two diabetics is undiagnosed. Early diagnosis and management of diabetes can prevent or delay the onset of complications. Using a variety of machine learning and deep learning models combined through stacking, our study aims to detect the disease early. In this study, we propose two stacking-based models for diabetes classification using a combination of the PIMA Indian diabetes dataset, simulated data, and additional data collected from a local healthcare facility. We use both classical and deep neural network (NN) stacking ensemble methods to combine the predictions of multiple classification models and improve classification accuracy and robustness. In the evaluation protocol, we used both the train-test split and cross-validation (CV) techniques to validate our proposed model. The highest accuracy is obtained by a stacking ensemble with three NN architectures, resulting in an accuracy of 95.50 %, precision of 94 %, recall of 97 %, and F1-score of 96 % using 5-fold CV on the simulation study. The stacked accuracy obtained from ML algorithms for the PIMA Indian diabetes dataset is 75.03 % using the train-test split protocol, while the accuracy obtained from the CV protocol is 77.10 % on the stacked model. The range of performance scores that outperformed the CV protocol was 2.23 %-12 %. Our proposed method achieves accuracy ranging from 92 % to 95 % and precision, recall, and F1-score ranging from 88 % to 96 % using classical and deep NN-based stacking methods on the primary dataset. The proposed dataset and ensemble method could be useful in the early detection and treatment of diabetes, as well as in the advancement of machine learning and data analysis techniques in the healthcare industry.
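The evaluation protocol described in this abstract (k-fold CV, then accuracy, precision, recall, and F1-score) can be illustrated with a minimal standard-library sketch. The function names `k_fold_indices` and `classification_scores` and the example counts are hypothetical and are not taken from the study; the folds here are contiguous and unshuffled, which is a simplification.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def classification_scores(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not data from the paper).
scores = classification_scores(tp=40, fp=10, fn=10, tn=40)
```

In a CV protocol, the scores would typically be computed once per fold on the held-out indices and then averaged.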
Affiliation(s)
- Md Shamim Reza
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Ruhul Amin
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Rubia Yasmin
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Woomme Kulsum
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
- Sabba Ruhi
  - Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh
2
Shi X, Cui Y, Wang S, Pan Y, Wang B, Lei M. Development and validation of a web-based artificial intelligence prediction model to assess massive intraoperative blood loss for metastatic spinal disease using machine learning techniques. Spine J 2024; 24:146-160. [PMID: 37704048 DOI: 10.1016/j.spinee.2023.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 09/01/2023] [Accepted: 09/02/2023] [Indexed: 09/15/2023]
Abstract
BACKGROUND CONTEXT Intraoperative blood loss is a significant concern in patients with metastatic spinal disease. Early identification of patients at high risk of experiencing massive intraoperative blood loss is crucial as it allows for the development of appropriate surgical plans and facilitates timely interventions. However, prior studies show that accurate prediction of intraoperative blood loss remains limited. PURPOSE The purpose of this study was to develop and validate a web-based artificial intelligence (AI) model to predict massive intraoperative blood loss during surgery for metastatic spinal disease. STUDY DESIGN/SETTING An observational cohort study. PATIENT SAMPLE Two hundred seventy-six patients with metastatic spinal tumors undergoing decompressive surgery from two hospitals were included for analysis. Of these, 200 patients were assigned to the derivation cohort for model development and internal validation, while the remaining 76 were allocated to the external validation cohort. OUTCOME MEASURES The primary outcome was massive intraoperative blood loss defined as an estimated blood loss of 2,500 cc or more. METHODS Data on patients' demographics, tumor conditions, oncological therapies, surgical strategies, and laboratory examinations were collected in the derivation cohort. SMOTETomek resampling (a combination of the Synthetic Minority Oversampling Technique and Tomek Links undersampling) was performed to balance the classes of the dataset and obtain an expanded dataset. The patients were randomly divided into two groups in a proportion of 7:3, with the larger group used for model development and the remainder for internal validation. External validation was performed in another cohort of 76 patients with metastatic spinal tumors undergoing decompressive surgery from a teaching hospital. A logistic regression (LR) model and five machine learning models, including K-Nearest Neighbor (KNN), Decision Tree (DT), eXtreme Gradient Boosting Machine (XGBM), Random Forest (RF), and Support Vector Machine (SVM), were used to develop prediction models. Model prediction performance was evaluated using area under the curve (AUC), recall, specificity, F1 score, Brier score, and log loss. A scoring system incorporating 10 evaluation metrics was developed to comprehensively evaluate the prediction performance. RESULTS The incidence of massive intraoperative blood loss was 23.50% (47/200). The model comprised five clinical variables: tumor type, smoking status, Eastern Cooperative Oncology Group (ECOG) score, surgical process, and preoperative platelet level. The XGBM model performed the best in AUC (0.857 [95% CI: 0.827, 0.877]), accuracy (0.771), recall (0.854), F1 score (0.787), Brier score (0.150), and log loss (0.461), and the RF model ranked second in AUC (0.826 [95% CI: 0.793, 0.861]) and precision (0.705), whereas the AUC of the LR model was only 0.710 (95% CI: 0.665, 0.771), the accuracy was 0.627, the recall was 0.610, and the F1 score was 0.617. According to the scoring system, the XGBM model obtained the highest total score of 55, which signifies the best predictive performance among the evaluated models. External validation showed that the AUC of the XGBM model reached 0.809 (95% CI: 0.778, 0.860) and the accuracy was 0.733. The XGBM model was further deployed online and can be freely accessed at https://starxueshu-massivebloodloss-main-iudy71.streamlit.app/. CONCLUSIONS The XGBM model may be a useful AI tool to assess the risk of intraoperative blood loss in patients with metastatic spinal disease undergoing decompressive surgery.
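Two of the metrics named in this abstract, the Brier score and log loss, score predicted probabilities rather than hard labels. As a minimal sketch using their standard definitions for a binary outcome (the function names and the four example predictions are hypothetical, not data from the study):

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the outcomes under the predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Hypothetical outcomes and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4]

brier = brier_score(y_true, y_prob)  # (0.01 + 0.04 + 0.16 + 0.16) / 4
loss = log_loss(y_true, y_prob)
```

For both metrics lower is better. A model that always predicts 0.5 scores a Brier score of 0.25 and a log loss of ln 2 (about 0.693), which gives a rough reference point for reading reported values.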
Affiliation(s)
- Xuedong Shi
  - Department of Orthopedic Surgery, Peking University First Hospital, No. 8 Xishiku St, Beijing, Xicheng District, 100032, China
- Yunpeng Cui
  - Department of Orthopedic Surgery, Peking University First Hospital, No. 8 Xishiku St, Beijing, Xicheng District, 100032, China
- Shengjie Wang
  - Department of Orthopaedic Surgery, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University, No. 222 Huanhu West Third Road, Pudong New Area, Shanghai, 200233, China
- Yuanxing Pan
  - Department of Orthopedic Surgery, Peking University First Hospital, No. 8 Xishiku St, Beijing, Xicheng District, 100032, China
- Bing Wang
  - Department of Orthopedic Surgery, Peking University First Hospital, No. 8 Xishiku St, Beijing, Xicheng District, 100032, China
- Mingxing Lei
  - Department of Orthopedic Surgery, Hainan Hospital of Chinese PLA General Hospital, No. 80 Jianglin Rd, Sanya, Haitang District, 572022, China
  - Department of Orthopedic Surgery, National Clinical Research Center for Orthopedics, Sports Medicine and Rehabilitation, No. 28 Fuxing Road, Beijing, Haidian District, 100039, China
  - Department of Orthopedic Surgery, Chinese PLA General Hospital, No. 28 Fuxing Rd, Beijing, Haidian District, 100039, China
3
Hendawi R, Li J, Roy S. A Mobile App That Addresses Interpretability Challenges in Machine Learning-Based Diabetes Predictions: Survey-Based User Study. JMIR Form Res 2023; 7:e50328. [PMID: 37955948 PMCID: PMC10682931 DOI: 10.2196/50328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 09/12/2023] [Accepted: 10/08/2023] [Indexed: 11/14/2023] Open
Abstract
BACKGROUND Machine learning approaches, including deep learning, have demonstrated remarkable effectiveness in the diagnosis and prediction of diabetes. However, these approaches often operate as opaque black boxes, leaving health care providers in the dark about the reasoning behind predictions. This opacity poses a barrier to the widespread adoption of machine learning in diabetes and health care, leading to confusion and eroding trust. OBJECTIVE This study aimed to address this critical issue by developing and evaluating an explainable artificial intelligence (AI) platform, XAI4Diabetes, designed to empower health care professionals with a clear understanding of AI-generated predictions and recommendations for diabetes care. XAI4Diabetes not only delivers diabetes risk predictions but also furnishes easily interpretable explanations for complex machine learning models and their outcomes. METHODS XAI4Diabetes features a versatile multimodule explanation framework that leverages machine learning, knowledge graphs, and ontologies. The platform comprises the following four essential modules: (1) knowledge base, (2) knowledge matching, (3) prediction, and (4) interpretation. By harnessing AI techniques, XAI4Diabetes forecasts diabetes risk and provides valuable insights into the prediction process and outcomes. A structured, survey-based user study assessed the app's usability and influence on participants' comprehension of machine learning predictions in real-world patient scenarios. RESULTS A prototype mobile app was meticulously developed and subjected to thorough usability studies and satisfaction surveys. The evaluation study findings underscore the substantial improvement in medical professionals' comprehension of key aspects, including the (1) diabetes prediction process, (2) data sets used for model training, (3) data features used, and (4) relative significance of different features in prediction outcomes. 
Most participants reported heightened understanding of and trust in AI predictions following their use of XAI4Diabetes. The satisfaction survey results further revealed a high level of overall user satisfaction with the tool. CONCLUSIONS This study introduces XAI4Diabetes, a versatile multi-model explainable prediction platform tailored to diabetes care. By enabling transparent diabetes risk predictions and delivering interpretable insights, XAI4Diabetes empowers health care professionals to comprehend the AI-driven decision-making process, thereby fostering transparency and trust. These advancements hold the potential to mitigate biases and facilitate the broader integration of AI in diabetes care.
Affiliation(s)
- Rasha Hendawi
  - North Dakota State University, Fargo, ND, United States
- Juan Li
  - North Dakota State University, Fargo, ND, United States
- Souradip Roy
  - North Dakota State University, Fargo, ND, United States
4
Chen S, Phuc PT, Nguyen P, Burton W, Lin S, Lin W, Lu CY, Hsu M, Cheng C, Hsu JC. A novel prediction model of the risk of pancreatic cancer among diabetes patients using multiple clinical data and machine learning. Cancer Med 2023; 12:19987-19999. [PMID: 37737056 PMCID: PMC10587954 DOI: 10.1002/cam4.6547] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/24/2023] [Revised: 08/14/2023] [Accepted: 09/06/2023] [Indexed: 09/23/2023] Open
Abstract
INTRODUCTION Pancreatic cancer is associated with poor prognosis. Considering the increased global incidence of diabetes and that individuals with diabetes are considered a high-risk subpopulation for pancreatic cancer, it is critical to detect the risk of pancreatic cancer within populations of persons living with diabetes. This study aimed to develop a novel prediction model for pancreatic cancer risk among patients with diabetes, using a real-world database containing clinical features and employing numerous artificial intelligence algorithms. METHODS This retrospective observational study analyzed data on patients with Type 2 diabetes from a multisite Taiwanese EMR database between 2009 and 2019. Predictors were selected in accordance with the literature review and clinical perspectives. The prediction models were constructed using machine learning algorithms such as logistic regression, linear discriminant analysis, gradient boosting machine, and random forest. RESULTS The cohort consisted of 66,384 patients. The Linear Discriminant Analysis (LDA) model generated the highest AUROC of 0.9073, followed by the Voting Ensemble and Gradient Boosting Machine models. LDA, the best model, exhibited an accuracy of 84.03%, a sensitivity of 0.8611, and a specificity of 0.8403. The most significant predictors identified for pancreatic cancer risk were glucose, glycated hemoglobin, hyperlipidemia comorbidity, antidiabetic drug use, and lipid-modifying drug use. CONCLUSION This study successfully developed a highly accurate 4-year risk model for pancreatic cancer in patients with diabetes using real-world clinical data and multiple machine-learning algorithms. Potentially, our predictors offer an opportunity to identify pancreatic cancer early and thus increase prevention and intervention windows to impact survival in diabetic patients.
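Accuracy, sensitivity, and specificity are linked by a standard identity: accuracy is the prevalence-weighted average of sensitivity and specificity. The sketch below applies that identity to the sensitivity and specificity quoted in this abstract; the function name and the prevalence values are hypothetical, since the abstract does not report the fraction of cases in the evaluation set.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# Sensitivity and specificity as quoted in the abstract.
SENSITIVITY = 0.8611
SPECIFICITY = 0.8403

# Prevalence values are assumptions chosen only to show the trend.
for prevalence in (0.001, 0.01, 0.5):
    acc = accuracy_from_rates(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence}: accuracy={acc:.4f}")
```

When the outcome is rare, accuracy sits very close to specificity. That is one possible reading of why the reported accuracy (84.03%) coincides with the reported specificity (0.8403), assuming all three figures were computed on the same evaluation set.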
Affiliation(s)
- Shih‐Min Chen
  - School of Pharmacy, Taipei Medical University, Taipei, Taiwan
- Phan Thanh Phuc
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Phung‐Anh Nguyen
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Whitney Burton
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Weei‐Chin Lin
  - Section of Hematology/Oncology, Department of Medicine and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, USA
- Christine Y. Lu
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
  - Kolling Institute, Faculty of Medicine and Health, The University of Sydney and the Northern Sydney Local Health District, Sydney, New South Wales, Australia
  - School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- Min‐Huei Hsu
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Chi‐Tsun Cheng
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Jason C. Hsu
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
5
Ou SM, Tsai MT, Lee KH, Tseng WC, Yang CY, Chen TH, Bin PJ, Chen TJ, Lin YP, Sheu WHH, Chu YC, Tarng DC. Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms. BioData Min 2023; 16:8. [PMID: 36899426 PMCID: PMC10007785 DOI: 10.1186/s13040-023-00324-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/04/2022] [Accepted: 02/17/2023] [Indexed: 03/12/2023] Open
Abstract
OBJECTIVES Type 2 diabetes mellitus (T2DM) imposes a great burden on healthcare systems, and these patients experience higher long-term risks for developing end-stage renal disease (ESRD). Managing diabetic nephropathy becomes more challenging when kidney function starts declining. Therefore, developing predictive models for the risk of developing ESRD in newly diagnosed T2DM patients may be helpful in clinical settings. METHODS We established machine learning models constructed from a subset of clinical features collected from 53,477 newly diagnosed T2DM patients from January 2008 to December 2018 and then selected the best model. The cohort was divided, with 70% and 30% of patients randomly assigned to the training and testing sets, respectively. RESULTS The discriminative ability of our machine learning models, including logistic regression, extra tree classifier, random forest, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine, was evaluated across the cohort. XGBoost yielded the highest area under the receiver operating characteristic curve (AUC) of 0.953, followed by extra tree and GBDT, with AUC values of 0.952 and 0.938, respectively, on the testing dataset. The SHapley Additive exPlanations (SHAP) summary plot for the XGBoost model illustrated that the top five important features included baseline serum creatinine, mean serum creatinine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, spot urine protein-to-creatinine ratio, and female gender. CONCLUSIONS Because our machine learning prediction models were based on routinely collected clinical features, they can be used as risk assessment tools for developing ESRD. By identifying high-risk patients, intervention strategies may be provided at an early stage.
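The AUC used here to rank models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the four example scores are hypothetical and are not data from the study, and the quadratic loop is only suitable for small inputs.

```python
def auroc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = auroc(y_true, y_score)  # 3 of 4 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values such as 0.953 indicate that most positive/negative pairs are ordered correctly.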
Affiliation(s)
- Shuo-Ming Ou
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Ming-Tsun Tsai
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Kuo-Hua Lee
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Wei-Cheng Tseng
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chih-Yu Yang
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Tz-Heng Chen
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Pin-Jie Bin
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Tzeng-Ji Chen
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Family Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Department of Family Medicine, Taipei Veterans General Hospital, Hsinchu Branch, Hsinchu, Taiwan
  - Institute of Hospital and Health Care Administration, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yao-Ping Lin
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Wayne Huey-Herng Sheu
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Institute of Molecular and Genetic Medicine, National Health Research Institute, Miaoli, Taiwan
- Yuan-Chia Chu
  - Information Management Office, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - Big Data Center, Taipei Veterans General Hospital, Taipei, Taiwan
  - Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Der-Cherng Tarng
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department and Institute of Physiology, National Yang Ming Chiao Tung University, Taipei, Taiwan