1
Cao SS, Liu XM, Song BT, Hu YY. Interpretable machine learning models for predicting clinical pregnancies associated with surgical sperm retrieval from testes of different etiologies: a retrospective study. BMC Urol 2024; 24:156. [PMID: 39075422 PMCID: PMC11285258 DOI: 10.1186/s12894-024-01537-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 07/08/2024] [Indexed: 07/31/2024] Open
Abstract
BACKGROUND The relationship between surgical sperm retrieval of different etiologies and clinical pregnancy is unclear. We aimed to develop a robust and interpretable machine learning (ML) model, explained with SHapley Additive exPlanations (SHAP), for predicting clinical pregnancy after surgical sperm retrieval from testes of different etiologies. METHODS A total of 345 infertile couples who underwent intracytoplasmic sperm injection (ICSI) treatment with surgical sperm retrieval due to different etiologies from February 2020 to March 2023 at the reproductive center were retrospectively analyzed. Six ML models were used to predict clinical pregnancy after ICSI. After evaluating the performance of the six ML models, the Extreme Gradient Boosting (XGBoost) model was selected as the best model, and SHAP was used to interpret its predictions of clinical pregnancy and to reveal the decision-making process of the model. RESULTS Across the area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, F1 score, Brier score, and the area under the precision-recall (P-R) curve (AP), the XGBoost model performed best (AUROC: 0.858, 95% confidence interval (CI): 0.778-0.936; accuracy: 79.71%; Brier score: 0.151). The global SHAP summary plot showed that female age was the most important feature influencing the model output. The SHAP plot showed that younger female age, larger testicular volume (TV), non-tobacco use, higher anti-Müllerian hormone (AMH), lower follicle-stimulating hormone (FSH) in females, lower FSH in males, membership in the temporary ejaculatory disorders (TED) group, and not belonging to the non-obstructive azoospermia (NOA) group were all associated with an increased probability of clinical pregnancy. CONCLUSIONS The XGBoost model predicts clinical pregnancies associated with testicular sperm retrieval of different etiologies with high accuracy, reliability, and robustness. It can support clinical counseling decisions for patients undergoing surgical sperm retrieval of various etiologies.
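Two of the metrics named in this abstract, AUROC and the Brier score, can be computed from first principles. The sketch below is only an illustration in plain Python: the function names and the toy labels and probabilities are assumptions made for the example, not the authors' code or data.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print(round(auroc(y_true, y_prob), 3), round(brier_score(y_true, y_prob), 3))
```

A lower Brier score indicates better-calibrated probabilities, while a higher AUROC indicates better ranking of positives over negatives, which is why the two are usually reported together.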
Affiliation(s)
- Shun-Shun Cao
  - Pediatric Endocrinology, Genetics and Metabolism, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, 325000, China
- Xiao-Ming Liu
  - Reproductive Medicine Center, Obstetrics and Gynecology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, 325000, China
- Bo-Tian Song
  - Reproductive Medicine Center, Obstetrics and Gynecology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, 325000, China
- Yang-Yang Hu
  - Reproductive Medicine Center, Obstetrics and Gynecology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, 325000, China
2
Yan JK. A methodological showcase: utilizing minimal clinical parameters for early-stage mortality risk assessment in COVID-19-positive patients. PeerJ Comput Sci 2024; 10:e2017. [PMID: 38855224 PMCID: PMC11157615 DOI: 10.7717/peerj-cs.2017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 04/03/2024] [Indexed: 06/11/2024]
Abstract
The scarcity of data is likely to have a negative effect on machine learning (ML). Yet, in the health sciences, data is diverse and can be costly to acquire. Therefore, it is critical to develop methods that can reach similar accuracy with minimal clinical features. This study explores a methodology that aims to build a model using minimal clinical parameters to reach comparable performance to a model trained with a more extensive list of parameters. To develop this methodology, a dataset of over 1,000 COVID-19-positive patients was used. A machine learning model combining 24 clinical parameters reached over 90% accuracy using Random Forest (RF) and logistic regression. Furthermore, to obtain minimal clinical parameters to predict the mortality of COVID-19 patients, the features were weighted using both Shapley values and RF feature importance to identify the most important factors. The six most highly weighted features that could produce the highest performance metrics were combined for the final model. The accuracy of the final model, which used a combination of six features, is 90% with the random forest classifier and 91% with the logistic regression model. This performance is close to that of a model using 24 combined features (92%), suggesting that highly weighted minimal clinical parameters can be used to reach similar performance. The six clinical parameters identified here are acute kidney injury, glucose level, age, troponin, oxygen level, and acute hepatic injury. Among those parameters, acute kidney injury was the highest-weighted feature. Together, a methodology was developed that uses substantially fewer clinical parameters to reach performance metrics similar to those of a model trained with a large dataset, highlighting a novel approach to address the problems of clinical data collection for machine learning.
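The abstract does not say how the Shapley-based and RF-based weights were combined, so the following is only one plausible sketch: normalize each set of scores to sum to 1, average them, and keep the top k features. The function names, the generic feature labels, and the scores are all hypothetical.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative importance scores so they sum to 1."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


def top_k_features(
    shap_importance: Dict[str, float],
    rf_importance: Dict[str, float],
    k: int,
) -> List[str]:
    """Average two normalized importance rankings and return the k best features.

    Averaging is an assumption made for this sketch; the source does not
    specify the combination rule.
    """
    shap_n = normalize(shap_importance)
    rf_n = normalize(rf_importance)
    combined = {name: (shap_n[name] + rf_n[name]) / 2 for name in shap_n}
    return sorted(combined, key=lambda name: combined[name], reverse=True)[:k]


# Hypothetical scores for four placeholder features (not values from the study).
shap_scores = {"f1": 4.0, "f2": 1.0, "f3": 3.0, "f4": 2.0}
rf_scores = {"f1": 0.5, "f2": 0.1, "f3": 0.2, "f4": 0.2}

print(top_k_features(shap_scores, rf_scores, k=2))
```

In a workflow like the one described, the reduced model would then be retrained on the selected features and its accuracy compared with that of the full-feature model.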
3
Qiu T, Chen M, Gao S, Huang J, Wang W, Wang L, Li H. Application effect study of a combination of TeamSTEPPS with modularization teaching in the context of clinical instruction in trauma care. Sci Rep 2024; 14:4712. [PMID: 38409342 PMCID: PMC10897387 DOI: 10.1038/s41598-024-55509-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 02/24/2024] [Indexed: 02/28/2024] Open
Abstract
This study explored the effect of combining Team Strategies and Tools to Enhance Performance and Patient Safety (TeamSTEPPS) with modularization teaching in clinical instruction in trauma care. A total of 244 nursing students who participated in clinical practice in orthopaedic wards from March 2020 to April 2022 were divided into two groups that received the same trauma care teaching content. The control group (n = 119) used the traditional teaching approach, and the experimental group (n = 125) used a combination of TeamSTEPPS with a modularization teaching model. A questionnaire was used to assess students' theoretical knowledge, practical skills, self-concepts and professional benefits after one month, with the goal of determining their end-of-course performance. The theoretical knowledge scores of the control group and the experimental group were 89.56 ± 4.06 and 91.62 ± 2.84, respectively, and the difference was statistically significant (P < 0.05). Students preferred the combination of TeamSTEPPS with the modularization teaching model to the traditional instructional method in terms of practical skills, professional self-concepts and professional benefits (P < 0.05). Combining TeamSTEPPS with modularization teaching in clinical instruction in trauma care contributed significantly to nursing students' mastery of theoretical knowledge and practical skills, enhanced their sense of professional identity, instilled a correct occupational ideology in these students, and enhanced the professional benefits they were able to obtain.
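The reported group means, standard deviations, and sample sizes are enough to compute a two-sample test statistic by hand. The abstract does not name the test that was used, so the sketch below assumes Welch's unequal-variance t-test purely as an illustration; the function name is hypothetical.

```python
import math
from typing import Tuple


def welch_t_from_summary(
    mean1: float, sd1: float, n1: int,
    mean2: float, sd2: float, n2: int,
) -> Tuple[float, float]:
    """Welch's t statistic and approximate degrees of freedom from summary stats."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics as reported: control (n = 119) vs experimental (n = 125).
t_stat, dof = welch_t_from_summary(89.56, 4.06, 119, 91.62, 2.84, 125)
print(f"t = {t_stat:.2f}, df = {dof:.0f}")
```

With roughly two hundred degrees of freedom, an absolute t value well above 2 is in line with the reported P < 0.05.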
Affiliation(s)
- Tieying Qiu
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Min Chen
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Suyuan Gao
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Jin Huang
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Weixing Wang
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Liping Wang
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
- Haiyang Li
  - Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, 410011, China
4
Li Q, Li J, Chen J, Zhao X, Zhuang J, Zhong G, Song Y, Lei L. A machine learning-based prediction model for postoperative delirium in cardiac valve surgery using electronic health records. BMC Cardiovasc Disord 2024; 24:56. [PMID: 38238677 PMCID: PMC10795338 DOI: 10.1186/s12872-024-03723-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 01/11/2024] [Indexed: 01/22/2024] Open
Abstract
BACKGROUND Previous models for predicting delirium after cardiac surgery have remained inadequate. This study aimed to develop and validate a machine learning-based prediction model for postoperative delirium (POD) in cardiac valve surgery patients. METHODS The electronic medical information of the cardiac surgical intensive care unit (CSICU) was extracted from a tertiary and major referral hospital in southern China over 1 year, from June 2019 to June 2020. A total of 507 patients admitted to the CSICU after cardiac valve surgery were included in this study. Seven classical machine learning algorithms (Random Forest Classifier, Logistic Regression, Support Vector Machine Classifier, K-nearest Neighbors Classifier, Gaussian Naive Bayes, Gradient Boosting Decision Tree, and Perceptron) were used to develop delirium prediction models under full (q = 31) and selected (q = 19) feature sets, respectively. RESULTS The Random Forest classifier performed exceptionally well on both feature sets, with an Area Under the Curve (AUC) of 0.92 for the full feature set and an AUC of 0.86 for the selected feature set. Additionally, it achieved a relatively low Expected Calibration Error (ECE) and the highest Average Precision (AP), with an AP of 0.80 for the full feature set and an AP of 0.73 for the selected feature set. To further evaluate the best-performing Random Forest classifier, SHAP (SHapley Additive exPlanations) was used, and the importance matrix plot, scatter plots, and summary plots were generated. CONCLUSIONS We established machine learning-based prediction models to predict POD in patients undergoing cardiac valve surgery. The random forest model had the best predictive performance and can help improve the prognosis of patients with POD.
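Expected Calibration Error has several variants, and the abstract does not state which one was used. The sketch below assumes a common binary-classification form: predictions are grouped into equal-width probability bins, and the gap between mean predicted probability and observed positive rate is averaged with weights proportional to bin size. The function name and the toy inputs are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    n_bins: int = 10,
) -> float:
    """Bin-size-weighted mean gap between predicted probability and observed rate."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for _, p in members) / len(members)
        observed_rate = sum(y for y, _ in members) / len(members)
        ece += len(members) / n * abs(observed_rate - mean_prob)
    return ece


# Toy example, invented for illustration: confident and mostly correct predictions.
print(expected_calibration_error([0, 0, 1, 1], [0.15, 0.15, 0.85, 0.85]))
```

An ECE of 0 means the predicted probabilities match the observed frequencies in every occupied bin; larger values indicate over- or under-confidence.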
Affiliation(s)
- Qiuying Li
  - Department of Cardiac Surgical Intensive Care Unit, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
  - Shantou University Medical College (SUMC), Shantou, 515041, China
- Jiaxin Li
  - Department of Cardiac Surgical Intensive Care Unit, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
- Jiansong Chen
  - Department of Cardiovascular Surgery, Guangdong General Hospital's Nanhai Hospital, The Second People's Hospital of Nanhai District, Foshan, Guangdong, 528251, China
- Xu Zhao
  - Institute of Clinical Pharmacology, Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Jian Zhuang
  - Department of Cardiovascular Surgery, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
  - Guangdong Provincial Key Laboratory of South China Structural Heart Disease, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
- Guoping Zhong
  - Institute of Clinical Pharmacology, Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Yamin Song
  - Department of Cardiac Surgical Intensive Care Unit, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
- Liming Lei
  - Department of Cardiac Surgical Intensive Care Unit, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
  - Guangdong Provincial Key Laboratory of South China Structural Heart Disease, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Southern Medical University, Guangzhou, 510080, China
  - Shantou University Medical College (SUMC), Shantou, 515041, China
5
Chi CY, Moghadas-Dastjerdi H, Winkler A, Ao S, Chen YP, Wang LW, Su PI, Lin WS, Tsai MS, Huang CH. Clinical Validation of Explainable Deep Learning Model for Predicting the Mortality of In-Hospital Cardiac Arrest Using Diagnosis Codes of Electronic Health Records. Rev Cardiovasc Med 2023; 24:265. [PMID: 39076399 PMCID: PMC11270098 DOI: 10.31083/j.rcm2409265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 06/12/2023] [Accepted: 06/26/2023] [Indexed: 07/31/2024] Open
Abstract
Background Using deep learning for disease outcome prediction is an approach that has made large advances in recent years. Notwithstanding its excellent performance, clinicians are also interested in learning how input affects prediction. Clinical validation of explainable deep learning models is as yet unexplored. This study aims to evaluate the performance of the Deep SHapley Additive exPlanations (D-SHAP) model in accurately identifying the diagnosis code associated with the highest mortality risk. Methods Data on 168,693 patients with at least one in-hospital cardiac arrest (IHCA), together with 1,569,478 clinical records, were extracted from Taiwan's National Health Insurance Research Database. We propose a D-SHAP model to provide insights into deep learning model predictions. We trained a deep learning model to predict the 30-day mortality likelihoods of IHCA patients and used D-SHAP to see how the diagnosis codes affected the model's predictions. Physicians were asked to annotate a cardiac arrest dataset and provide expert opinions, which we used to validate our proposed method. A 1-to-4-point annotation of each record (current decision) along with four previous records (historical decision) was used to validate the current and historical D-SHAP values. Results A subset consisting of 402 patients with at least one cardiac arrest record was randomly selected from the IHCA cohort. The median age was 72 years, with mean and standard deviation of 69 ± 17 years. Results indicated that D-SHAP can identify the cause of mortality based on the diagnosis codes. The top five most important diagnosis codes, namely respiratory failure, sepsis, pneumonia, shock, and acute kidney injury, were consistent with the physicians' opinion. Some diagnoses, such as urinary tract infection, showed a discrepancy between D-SHAP and clinical judgment due to the lower frequency of the disease and its occurrence in combination with other comorbidities. Conclusions The D-SHAP framework was found to be an effective tool to explain deep neural networks and identify most of the important diagnoses for predicting patients' 30-day mortality. However, physicians should always carefully consider the structure of the original database and underlying pathophysiology.
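Deep SHAP approximates Shapley values for neural networks. To show the quantity being approximated, the sketch below computes exact Shapley values by enumerating feature subsets for a tiny hand-written function. This is not the D-SHAP method from the paper: the toy model, the baseline-replacement convention for "absent" features, and all names are assumptions for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values; absent features are replaced by baseline values."""
    n = len(x)

    def value(present: Set[int]) -> float:
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi[i] += weight * (value(present | {i}) - value(present))
    return phi


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical risk score: one additive term plus one interaction term."""
    return 2.0 * features[0] + features[1] * features[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is split equally between the two features involved. Exact enumeration grows exponentially with the number of features, which is why approximations such as Deep SHAP are used for models with many inputs.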
Affiliation(s)
- Chien-Yu Chi
  - Department of Emergency Medicine, National Taiwan University Hospital Yunlin Branch, 640 Yunlin, Taiwan
- Adrian Winkler
  - Knowtions Research Inc., Toronto, Ontario M5J 2S1, Canada
- Shuang Ao
  - Knowtions Research Inc., Toronto, Ontario M5J 2S1, Canada
- Yen-Pin Chen
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan
- Liang-Wei Wang
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan
- Pei-I Su
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan
- Wei-Shu Lin
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan
- Min-Shan Tsai
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan
- Chien-Hua Huang
  - Department of Emergency Medicine, National Taiwan University, 100 Taipei, Taiwan