Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang K, Tian J, Zheng C, Yang H, Ren J, Li C, Han Q, Zhang Y. Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning. Risk Manag Healthc Policy 2021;14:2453-2463. [PMID: 34149290 PMCID: PMC8206455 DOI: 10.2147/rmhp.s310295] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2021] [Accepted: 05/24/2021] [Indexed: 01/14/2023] Open

For:	Wang K, Tian J, Zheng C, Yang H, Ren J, Li C, Han Q, Zhang Y. Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning. Risk Manag Healthc Policy 2021;14:2453-2463. [PMID: 34149290 PMCID: PMC8206455 DOI: 10.2147/rmhp.s310295] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2021] [Accepted: 05/24/2021] [Indexed: 01/14/2023] Open

Number

Cited by Other Article(s)

Li Y, Du C, Ge S, Zhang R, Shao Y, Chen K, Li Z, Ma F. Hematoma expansion prediction based on SMOTE and XGBoost algorithm. BMC Med Inform Decis Mak 2024;24:172. [PMID: 38898499 PMCID: PMC11186182 DOI: 10.1186/s12911-024-02561-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2023] [Accepted: 05/30/2024] [Indexed: 06/21/2024] Open

Yu M, Yuan Z, Li R, Shi B, Wan D, Dong X. Interpretable machine learning model to predict surgical difficulty in laparoscopic resection for rectal cancer. Front Oncol 2024;14:1337219. [PMID: 38380369 PMCID: PMC10878416 DOI: 10.3389/fonc.2024.1337219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2023] [Accepted: 01/15/2024] [Indexed: 02/22/2024] Open

Abstract

Background

Laparoscopic total mesorectal excision (LaTME) is standard surgical methods for rectal cancer, and LaTME operation is a challenging procedure. This study is intended to use machine learning to develop and validate prediction models for surgical difficulty of LaTME in patients with rectal cancer and compare these models' performance.

Methods

We retrospectively collected the preoperative clinical and MRI pelvimetry parameter of rectal cancer patients who underwent laparoscopic total mesorectal resection from 2017 to 2022. The difficulty of LaTME was defined according to the scoring criteria reported by Escal. Patients were randomly divided into training group (80%) and test group (20%). We selected independent influencing features using the least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression method. Adopt synthetic minority oversampling technique (SMOTE) to alleviate the class imbalance problem. Six machine learning model were developed: light gradient boosting machine (LGBM); categorical boosting (CatBoost); extreme gradient boost (XGBoost), logistic regression (LR); random forests (RF); multilayer perceptron (MLP). The area under receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity and F1 score were used to evaluate the performance of the model. The Shapley Additive Explanations (SHAP) analysis provided interpretation for the best machine learning model. Further decision curve analysis (DCA) was used to evaluate the clinical manifestations of the model.

Results

A total of 626 patients were included. LASSO regression analysis shows that tumor height, prognostic nutrition index (PNI), pelvic inlet, pelvic outlet, sacrococcygeal distance, mesorectal fat area and angle 5 (the angle between the apex of the sacral angle and the lower edge of the pubic bone) are the predictor variables of the machine learning model. In addition, the correlation heatmap shows that there is no significant correlation between these seven variables. When predicting the difficulty of LaTME surgery, the XGBoost model performed best among the six machine learning models (AUROC=0.855). Based on the decision curve analysis (DCA) results, the XGBoost model is also superior, and feature importance analysis shows that tumor height is the most important variable among the seven factors.

Conclusions

This study developed an XGBoost model to predict the difficulty of LaTME surgery. This model can help clinicians quickly and accurately predict the difficulty of surgery and adopt individualized surgical methods.

Collapse

Ge J, Ji Y, Wang F, Zhou X, Wei J, Qi C. Correlation Between Cystatin C and the Severity of Cardiac Dysfunction in Patients with Systolic Heart Failure. Risk Manag Healthc Policy 2023;16:2419-2426. [PMID: 38024499 PMCID: PMC10655600 DOI: 10.2147/rmhp.s437678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Accepted: 11/02/2023] [Indexed: 12/01/2023] Open

Zhang Y, Wang H, Yin C, Shu T, Yu J, Jian J, Jian C, Duan M, Kadier K, Xu Q, Wang X, Xiang T, Liu X. Development of a prediction model for the risk of 30-day unplanned readmission in older patients with heart failure: A multicenter retrospective study. Nutr Metab Cardiovasc Dis 2023;33:1878-1887. [PMID: 37500347 DOI: 10.1016/j.numecd.2023.05.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Revised: 05/21/2023] [Accepted: 05/31/2023] [Indexed: 07/29/2023]

Kim R, Suresh K, Rosenberg MA, Tan MS, Malone DC, Allen LA, Kao DP, Anderson HD, Tiwari P, Trinkley KE. A machine learning evaluation of patient characteristics associated with prescribing of guideline-directed medical therapy for heart failure. Front Cardiovasc Med 2023;10:1169574. [PMID: 37416920 PMCID: PMC10321403 DOI: 10.3389/fcvm.2023.1169574] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2023] [Accepted: 06/01/2023] [Indexed: 07/08/2023] Open

Abstract

Introduction/background

Patients with heart failure and reduced ejection fraction (HFrEF) are consistently underprescribed guideline-directed medications. Although many barriers to prescribing are known, identification of these barriers has relied on traditional a priori hypotheses or qualitative methods. Machine learning can overcome many limitations of traditional methods to capture complex relationships in data and lead to a more comprehensive understanding of the underpinnings driving underprescribing. Here, we used machine learning methods and routinely available electronic health record data to identify predictors of prescribing.

Methods

We evaluated the predictive performance of machine learning algorithms to predict prescription of four types of medications for adults with HFrEF: angiotensin converting enzyme inhibitor/angiotensin receptor blocker (ACE/ARB), angiotensin receptor-neprilysin inhibitor (ARNI), evidence-based beta blocker (BB), or mineralocorticoid receptor antagonist (MRA). The models with the best predictive performance were used to identify the top 20 characteristics associated with prescribing each medication type. Shapley values were used to provide insight into the importance and direction of the predictor relationships with medication prescribing.

Results

For 3,832 patients meeting the inclusion criteria, 70% were prescribed an ACE/ARB, 8% an ARNI, 75% a BB, and 40% an MRA. The best-predicting model for each medication type was a random forest (area under the curve: 0.788-0.821; Brier score: 0.063-0.185). Across all medications, top predictors of prescribing included prescription of other evidence-based medications and younger age. Unique to prescribing an ARNI, the top predictors included lack of diagnoses of chronic kidney disease, chronic obstructive pulmonary disease, or hypotension, as well as being in a relationship, nontobacco use, and alcohol use.

Discussion/conclusions

We identified multiple predictors of prescribing for HFrEF medications that are being used to strategically design interventions to address barriers to prescribing and to inform further investigations. The machine learning approach used in this study to identify predictors of suboptimal prescribing can also be used by other health systems to identify and address locally relevant gaps and solutions to prescribing.

Collapse

Eysenbach G, Chao HJ, Chiang YC, Chen HY. Explainable Machine Learning Techniques To Predict Amiodarone-Induced Thyroid Dysfunction Risk: Multicenter, Retrospective Study With External Validation. J Med Internet Res 2023;25:e43734. [PMID: 36749620 PMCID: PMC9944157 DOI: 10.2196/43734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2022] [Revised: 12/25/2022] [Accepted: 01/16/2023] [Indexed: 02/08/2023] Open

Abstract

BACKGROUND

Machine learning offers new solutions for predicting life-threatening, unpredictable amiodarone-induced thyroid dysfunction. Traditional regression approaches for adverse-effect prediction without time-series consideration of features have yielded suboptimal predictions. Machine learning algorithms with multiple data sets at different time points may generate better performance in predicting adverse effects.

OBJECTIVE

We aimed to develop and validate machine learning models for forecasting individualized amiodarone-induced thyroid dysfunction risk and to optimize a machine learning-based risk stratification scheme with a resampling method and readjustment of the clinically derived decision thresholds.

METHODS

This study developed machine learning models using multicenter, delinked electronic health records. It included patients receiving amiodarone from January 2013 to December 2017. The training set was composed of data from Taipei Medical University Hospital and Wan Fang Hospital, while data from Taipei Medical University Shuang Ho Hospital were used as the external test set. The study collected stationary features at baseline and dynamic features at the first, second, third, sixth, ninth, 12th, 15th, 18th, and 21st months after amiodarone initiation. We used 16 machine learning models, including extreme gradient boosting, adaptive boosting, k-nearest neighbor, and logistic regression models, along with an original resampling method and 3 other resampling methods, including oversampling with the borderline-synthesized minority oversampling technique, undersampling-edited nearest neighbor, and over- and undersampling hybrid methods. The model performance was compared based on accuracy; Precision, recall, F₁-score, geometric mean, area under the curve of the receiver operating characteristic curve (AUROC), and the area under the precision-recall curve (AUPRC). Feature importance was determined by the best model. The decision threshold was readjusted to identify the best cutoff value and a Kaplan-Meier survival analysis was performed.

RESULTS

The training set contained 4075 patients from Taipei Medical University Hospital and Wan Fang Hospital, of whom 583 (14.3%) developed amiodarone-induced thyroid dysfunction, while the external test set included 2422 patients from Taipei Medical University Shuang Ho Hospital, of whom 275 (11.4%) developed amiodarone-induced thyroid dysfunction. The extreme gradient boosting oversampling machine learning model demonstrated the best predictive outcomes among all 16 models. The accuracy; Precision, recall, F₁-score, G-mean, AUPRC, and AUROC were 0.923, 0.632, 0.756, 0.688, 0.845, 0.751, and 0.934, respectively. After readjusting the cutoff, the best value was 0.627, and the F₁-score reached 0.699. The best threshold was able to classify 286 of 2422 patients (11.8%) as high-risk subjects, among which 275 were true-positive patients in the testing set. A shorter treatment duration; higher levels of thyroid-stimulating hormone and high-density lipoprotein cholesterol; and lower levels of free thyroxin, alkaline phosphatase, and low-density lipoprotein were the most important features.

CONCLUSIONS

Machine learning models combined with resampling methods can predict amiodarone-induced thyroid dysfunction and serve as a support tool for individualized risk prediction and clinical decision support.

Collapse

Song W, Liu Y, Qiu L, Qing J, Li A, Zhao Y, Li Y, Li R, Zhou X. Machine learning-based warning model for chronic kidney disease in individuals over 40 years old in underprivileged areas, Shanxi Province. Front Med (Lausanne) 2023;9:930541. [PMID: 36698845 PMCID: PMC9868668 DOI: 10.3389/fmed.2022.930541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 12/19/2022] [Indexed: 01/11/2023] Open

Abstract

Introduction

Chronic kidney disease (CKD) is a progressive disease with high incidence but early imperceptible symptoms. Since China's rural areas are subject to inadequate medical check-ups and single disease screening programme, it could easily translate into end-stage renal failure. This study aimed to construct an early warning model for CKD tailored to impoverished areas by employing machine learning (ML) algorithms with easily accessible parameters from ten rural areas in Shanxi Province, thereby, promoting a forward shift of treatment time and improving patients' quality of life.

Methods

From April to November 2019, CKD opportunistic screening was carried out in 10 rural areas in Shanxi Province. First, general information, physical examination data, blood and urine specimens were collected from 13,550 subjects. Afterward, feature selection of explanatory variables was performed using LASSO regression, and target datasets were balanced using the SMOTE (synthetic minority over-sampling technique) algorithm, i.e., albuminuria-to-creatinine ratio (ACR) and α1-microglobulin-to-creatinine ratio (MCR). Next, Bagging, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) were employed for classification of ACR outcomes and MCR outcomes, respectively.

Results

12,330 rural residents were included in this study, with 20 explanatory variables. The cases with increased ACR and increased MCR represented 1,587 (12.8%) and 1,456 (11.8%), respectively. After conducting LASSO, 14 and 15 explanatory variables remained in these two datasets, respectively. Bagging, RF, and XGBoost performed well in classification, with the AUC reaching 0.74, 0.87, 0.87, 0.89 for ACR outcomes and 0.75, 0.88, 0.89, 0.90 for MCR outcomes. The five variables contributing most to the classification of ACR outcomes and MCR outcomes constituted SBP, TG, TC, and Hcy, DBP and age, TG, SBP, Hcy and FPG, respectively. Overall, the machine learning algorithms could emerge as a warning model for CKD.

Conclusion

ML algorithms in conjunction with rural accessible indexes boast good performance in classification, which allows for an early warning model for CKD. This model could help achieve large-scale population screening for CKD in poverty-stricken areas and should be promoted to improve the quality of life and reduce the mortality rate.

Collapse

Adeoye J, Zheng LW, Thomson P, Choi SW, Su YX. Explainable ensemble learning model improves identification of candidates for oral cancer screening. Oral Oncol 2023;136:106278. [PMID: 36525782 DOI: 10.1016/j.oraloncology.2022.106278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2022] [Revised: 11/26/2022] [Accepted: 12/06/2022] [Indexed: 12/15/2022]

Zhang X, Gavaldà R, Baixeries J. Interpretable prediction of mortality in liver transplant recipients based on machine learning. Comput Biol Med 2022;151:106188. [PMID: 36306583 DOI: 10.1016/j.compbiomed.2022.106188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2022] [Revised: 09/24/2022] [Accepted: 10/08/2022] [Indexed: 12/27/2022]

Abstract

BACKGROUND

Accurate prediction of the mortality of post-liver transplantation is an important but challenging task. It relates to optimizing organ allocation and estimating the risk of possible dysfunction. Existing risk scoring models, such as the Balance of Risk (BAR) score and the Survival Outcomes Following Liver Transplantation (SOFT) score, do not predict the mortality of post-liver transplantation with sufficient accuracy. In this study, we evaluate the performance of machine learning models and establish an explainable machine learning model for predicting mortality in liver transplant recipients.

METHOD

The optimal feature set for the prediction of the mortality was selected by a wrapper method based on binary particle swarm optimization (BPSO). With the selected optimal feature set, seven machine learning models were applied to predict mortality over different time windows. The best-performing model was used to predict mortality through a comprehensive comparison and evaluation. An interpretable approach based on machine learning and SHapley Additive exPlanations (SHAP) is used to explicitly explain the model's decision and make new discoveries.

RESULTS

With regard to predictive power, our results demonstrated that the feature set selected by BPSO outperformed both the feature set in the existing risk score model (BAR score, SOFT score) and the feature set processed by principal component analysis (PCA). The best-performing model, extreme gradient boosting (XGBoost), was found to improve the Area Under a Curve (AUC) values for mortality prediction by 6.7%, 11.6%, and 17.4% at 3 months, 3 years, and 10 years, respectively, compared to the SOFT score. The main predictors of mortality and their impact were discussed for different age groups and different follow-up periods.

CONCLUSIONS

Our analysis demonstrates that XGBoost can be an ideal method to assess the mortality risk in liver transplantation. In combination with the SHAP approach, the proposed framework provides a more intuitive and comprehensive interpretation of the predictive model, thereby allowing the clinician to better understand the decision-making process of the model and the impact of factors associated with mortality risk in liver transplantation.

Collapse

Masukawa K, Aoyama M, Yokota S, Nakamura J, Ishida R, Nakayama M, Miyashita M. Machine learning models to detect social distress, spiritual pain, and severe physical psychological symptoms in terminally ill patients with cancer from unstructured text data in electronic medical records. Palliat Med 2022;36:1207-1216. [PMID: 35773973 DOI: 10.1177/02692163221105595] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]

Song W, Zhou X, Duan Q, Wang Q, Li Y, Li A, Zhou W, Sun L, Qiu L, Li R, Li Y. Using random forest algorithm for glomerular and tubular injury diagnosis. Front Med (Lausanne) 2022;9:911737. [PMID: 35966858 PMCID: PMC9366016 DOI: 10.3389/fmed.2022.911737] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2022] [Accepted: 07/04/2022] [Indexed: 11/16/2022] Open

Abstract

Objectives

Chronic kidney disease (CKD) is a common chronic condition with high incidence and insidious onset. Glomerular injury (GI) and tubular injury (TI) represent early manifestations of CKD and could indicate the risk of its development. In this study, we aimed to classify GI and TI using three machine learning algorithms to promote their early diagnosis and slow the progression of CKD.

Methods

Demographic information, physical examination, blood, and morning urine samples were first collected from 13,550 subjects in 10 counties in Shanxi province for classification of GI and TI. Besides, LASSO regression was employed for feature selection of explanatory variables, and the SMOTE (synthetic minority over-sampling technique) algorithm was used to balance target datasets, i.e., GI and TI. Afterward, Random Forest (RF), Naive Bayes (NB), and logistic regression (LR) were constructed to achieve classification of GI and TI, respectively.

Results

A total of 12,330 participants enrolled in this study, with 20 explanatory variables. The number of patients with GI, and TI were 1,587 (12.8%) and 1,456 (11.8%), respectively. After feature selection by LASSO, 14 and 15 explanatory variables remained in these two datasets. Besides, after SMOTE, the number of patients and normal ones were 6,165, 6,165 for GI, and 6,165, 6,164 for TI, respectively. RF outperformed NB and LR in terms of accuracy (78.14, 80.49%), sensitivity (82.00, 84.60%), specificity (74.29, 76.09%), and AUC (0.868, 0.885) for both GI and TI; the four variables contributing most to the classification of GI and TI represented SBP, DBP, sex, age and age, SBP, FPG, and GHb, respectively.

Conclusion

RF boasts good performance in classifying GI and TI, which allows for early auxiliary diagnosis of GI and TI, thus facilitating to help alleviate the progression of CKD, and enjoying great prospects in clinical practice.

Collapse

Wang K, Tian J, Zheng C, Yang H, Ren J, Liu Y, Han Q, Zhang Y. Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput Biol Med 2021;137:104813. [PMID: 34481185 DOI: 10.1016/j.compbiomed.2021.104813] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 08/25/2021] [Accepted: 08/25/2021] [Indexed: 01/14/2023]