1. Li X, Wang Z, Zhao W, Shi R, Zhu Y, Pan H, Wang D. Machine learning algorithm for predict the in-hospital mortality in critically ill patients with congestive heart failure combined with chronic kidney disease. Ren Fail 2024; 46:2315298. [PMID: 38357763; PMCID: PMC10877653; DOI: 10.1080/0886022x.2024.2315298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 02/01/2024] [Indexed: 02/16/2024] Open
Abstract
BACKGROUND The objective of this study was to develop and validate a machine learning (ML) model to predict in-hospital mortality among critically ill patients with congestive heart failure (CHF) combined with chronic kidney disease (CKD). METHODS After feature selection with least absolute shrinkage and selection operator (LASSO) regression, six methods were used to construct models. The optimal model was selected based on the area under the curve (AUC). The chosen model was interpreted using SHapley Additive exPlanation (SHAP) values and the Local Interpretable Model-Agnostic Explanations (LIME) algorithm. RESULTS This study enrolled 5041 patients with CHF combined with CKD from 2008 to 2019, using data from the Medical Information Mart for Intensive Care database. After selection, 22 of the 47 variables collected after intensive care unit admission were identified as mortality-associated and used to develop the ML models. Among the six models, the eXtreme Gradient Boosting (XGBoost) model achieved the highest AUC, 0.837. The SHAP values identified the sequential organ failure assessment score, age, simplified acute physiology score II, and urine output as the four most influential variables in the XGBoost model. In addition, the LIME algorithm was used to explain individual predictions. CONCLUSIONS We developed and validated ML models for predicting in-hospital mortality in critically ill patients with CHF combined with CKD. The XGBoost model performed best among the ML models evaluated.
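The selection rule described in this abstract (keep the model with the highest AUC) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the labels, scores, and model names are made-up toy values, and `roc_auc` and `best_model_by_auc` are hypothetical helper names.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_model_by_auc(labels, model_scores):
    """Return the name of the model with the highest AUC and all AUCs."""
    aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
    return max(aucs, key=aucs.get), aucs


# Toy data (assumption): 1 = died in hospital, 0 = survived.
labels = [0, 0, 1, 1, 0, 1]
model_scores = {
    "model_a": [0.10, 0.40, 0.35, 0.80, 0.20, 0.70],
    "model_b": [0.50, 0.60, 0.40, 0.90, 0.30, 0.20],
}
best, aucs = best_model_by_auc(labels, model_scores)
print(best, aucs)
```

The pairwise definition is quadratic in the number of cases; it is used here only because it makes the meaning of AUC explicit.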
Affiliation(s)
- Xunliang Li
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Zhijuan Wang
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Wenman Zhao
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Rui Shi
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Yuyu Zhu
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Haifeng Pan
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Department of Epidemiology and Biostatistics, School of Public Health, Anhui Medical University, Hefei, China
  - Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei, China
- Deguang Wang
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
2. Kolasseri AE, B V. Comparative study of machine learning and statistical survival models for enhancing cervical cancer prognosis and risk factor assessment using SEER data. Sci Rep 2024; 14:22203. [PMID: 39333298; PMCID: PMC11437206; DOI: 10.1038/s41598-024-72790-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 09/10/2024] [Indexed: 09/29/2024] Open
Abstract
Cervical cancer is a common malignant tumor of the female reproductive system and the leading cause of death among women worldwide. The survival prediction method can be used to effectively analyze the time to event, which is essential in any clinical study. This study aims to bridge the gap between traditional statistical methods and machine learning in survival analysis by revealing which techniques are most effective in predicting survival, with a particular emphasis on improving prediction accuracy and identifying key risk factors for cervical cancer. Women with cervical cancer diagnosed between 2013 and 2015 were included in our study using data from the Surveillance, Epidemiology, and End Results (SEER) database. Using this dataset, the study assesses the performance of Weibull, Cox proportional hazards models, and Random Survival Forests in terms of predictive accuracy and risk factor identification. The findings reveal that machine learning models, particularly Random Survival Forests (RSF), outperform traditional statistical methods in both predictive accuracy and the discernment of crucial prognostic factors, underscoring the advantages of machine learning in handling complex survival data. However, for a survival dataset with a small number of predictors, statistical models should be used first. The study finds that RSF models enhance survival analysis with more accurate predictions and insights into survival risk factors but highlights the need for larger datasets and further research on model interpretability and clinical applicability.
Affiliation(s)
- Anjana Eledath Kolasseri
  - Department of Mathematics, School of Advanced Sciences, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Venkataramana B
  - Department of Mathematics, School of Advanced Sciences, Vellore Institute of Technology, Vellore, Tamil Nadu, India
3. Peng C, Peng L, Yang F, Yu H, Chen Q, Guo Y, Xu S, Jin Z. The prediction of the survival in patients with severe trauma during prehospital care: Analyses based on NTDB database. Eur J Trauma Emerg Surg 2024; 50:1599-1609. [PMID: 38483558; DOI: 10.1007/s00068-024-02484-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/17/2023] [Accepted: 02/19/2024] [Indexed: 10/08/2024]
Abstract
PURPOSE Trauma causes heavy casualties and a heavy economic burden every year. The study aimed to use machine learning (ML) survival algorithms to predict the 8- and 24-hour survival of patients with severe trauma. METHODS A retrospective study was conducted using data from the National Trauma Data Bank (NTDB). Four survival models, namely three ML algorithms (survival tree [ST], random forest for survival [RFS], and gradient boosting machine [GBM]) and a Cox proportional hazards model (Cox), were used to develop the survival prediction models. Model performance was then compared in the test datasets using the C-index, integrated Brier score (IBS), and calibration curves. RESULTS A total of 191,240 individuals diagnosed with severe trauma between 2015 and 2018 were identified. Glasgow Coma Scale (GCS), trauma type, age, SaO2, respiratory rate (RR), systolic blood pressure (SBP), EMS transport time, EMS on-scene time, pulse, and EMS response time were identified as the main predictors. For predicting 8-hour survival with the complete cases, the C-indexes in the test sets were 0.853 (0.845, 0.861), 0.823 (0.812, 0.834), 0.871 (0.862, 0.879), and 0.857 (0.849, 0.865) for Cox, ST, RFS, and GBM, respectively. Similar results were observed in the 24-hour survival prediction models. The prediction error curves based on IBS also showed a similar pattern for these models. Additionally, a free web-based calculator was developed for potential clinical use. CONCLUSION The RFS algorithm provides a non-parametric alternative to regression models and may be of clinical use for estimating the survival probability of severe trauma patients.
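The C-index used to compare these models can be illustrated with a small sketch of Harrell's concordance index for right-censored data. It is not the authors' code: the survival times, event flags, and risk scores are invented toy values, and `concordance_index` is a hypothetical helper name.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter time than subject j. The pair is concordant when
    the model gives i the higher risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (assumption): hours to death or censoring, 1 = death observed.
times = [2, 5, 7, 8, 12]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.6, 0.7, 0.3, 0.1]  # higher value = higher predicted risk
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.823 to 0.871 should be read.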
Affiliation(s)
- Chi Peng
  - Department of Health Statistics, Naval Military Medical University, No. 800 Xiangyin Road, Shanghai, 200433, China
- Liwei Peng
  - Department of Neurosurgery, Tangdu Hospital, Air Force Medical University, Xi'an, 710038, China
- Fan Yang
  - Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Third Military Medical University (Army Medical University) and Key Laboratory of Tumor Immunopathology, Chongqing, 400014, China
- Hang Yu
  - Department of Emergency, Changhai Hospital, Naval Medical University, No. 168 Changhai Road, Shanghai, 200433, China
- Qi Chen
  - Department of Health Statistics, Naval Military Medical University, No. 800 Xiangyin Road, Shanghai, 200433, China
- Yibin Guo
  - Department of Health Statistics, Naval Military Medical University, No. 800 Xiangyin Road, Shanghai, 200433, China
- Shuogui Xu
  - Department of Emergency, Changhai Hospital, Naval Medical University, No. 168 Changhai Road, Shanghai, 200433, China
- Zhichao Jin
  - Department of Health Statistics, Naval Military Medical University, No. 800 Xiangyin Road, Shanghai, 200433, China
4. Gan T, Guan H, Li P, Huang X, Li Y, Zhang R, Li T. Risk prediction models for cardiovascular events in hemodialysis patients: A systematic review. Semin Dial 2024; 37:101-109. [PMID: 37743062; DOI: 10.1111/sdi.13181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 06/25/2023] [Accepted: 09/10/2023] [Indexed: 09/26/2023]
Abstract
OBJECTIVE To perform a systematic review of risk prediction models for cardiovascular (CV) events in hemodialysis (HD) patients, and provide a reference for the application and optimization of related prediction models. METHODS PubMed, The Cochrane Library, Web of Science, and Embase databases were searched from inception to 1 February 2023. Two authors independently conducted the literature search, selection, and screening. The Prediction model Risk Of Bias Assessment Tool (PROBAST) was applied to evaluate the risk of bias and applicability of the included literature. RESULTS A total of nine studies containing 12 models were included, with performance measured by the area under the receiver operating characteristic curve (AUC) lying between 0.70 and 0.88. Age, diabetes mellitus (DM), C-reactive protein (CRP), and albumin (ALB) were the most commonly identified predictors of CV events in HD patients. While the included models demonstrated good applicability, there were still certain risks of bias, primarily related to inadequate handling of missing data and transformation of continuous variables, as well as a lack of model performance validation. CONCLUSION The included models showed good overall predictive performance and can assist healthcare professionals in the early identification of high-risk individuals for CV events in HD patients. In the future, the modeling methods should be improved, or the existing models should undergo external validation to provide better guidance for clinical practice.
Affiliation(s)
- Tiantian Gan
  - School of Nursing, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hua Guan
  - Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
- Pengli Li
  - Department of Nephrology, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
- Xinping Huang
  - School of Nursing, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Yue Li
  - Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
- Rui Zhang
  - Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
- Tingxin Li
  - Health Management Center, Sichuan Academy of Medical Sciences·Sichuan People's Hospital, Chengdu, China
5. Park SW, Yeo NY, Kang S, Ha T, Kim TH, Lee D, Kim D, Choi S, Kim M, Lee D, Kim D, Kim WJ, Lee SJ, Heo YJ, Moon DH, Han SS, Kim Y, Choi HS, Oh DK, Lee SY, Park M, Lim CM, Heo J. Early Prediction of Mortality for Septic Patients Visiting Emergency Room Based on Explainable Machine Learning: A Real-World Multicenter Study. J Korean Med Sci 2024; 39:e53. [PMID: 38317451; PMCID: PMC10843974; DOI: 10.3346/jkms.2024.39.e53] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/15/2023] [Accepted: 12/05/2023] [Indexed: 02/07/2024] Open
Abstract
BACKGROUND Worldwide, sepsis is the leading cause of death in hospitals. If mortality in patients with sepsis can be predicted early, medical resources can be allocated efficiently. We constructed machine learning (ML) models to predict the mortality of patients with sepsis in a hospital emergency department. METHODS This study prospectively collected nationwide data from an ongoing multicenter cohort of patients with sepsis identified in the emergency department. Patients were enrolled from 19 hospitals between September 2019 and December 2020. Using data from 3,657 survivors and 1,455 deaths, six ML models (logistic regression, support vector machine, random forest, extreme gradient boosting [XGBoost], light gradient boosting machine, and categorical boosting [CatBoost]) were constructed using fivefold cross-validation to predict mortality. With these models, 44 clinical variables measured on the day of admission were compared with six sequential organ failure assessment (SOFA) components (PaO2/FIO2 [PF], platelets [PLT], bilirubin, cardiovascular, Glasgow Coma Scale score, and creatinine). The confidence interval (CI) was obtained by performing 10,000 repeated measurements via random sampling of the test dataset. All results were explained and interpreted using SHapley Additive exPlanations (SHAP). RESULTS Among the 5,112 participants, CatBoost exhibited the highest area under the curve (AUC) of 0.800 (95% CI, 0.756-0.840) using clinical variables. Using the SOFA components for the same patients, XGBoost exhibited the highest AUC of 0.678 (95% CI, 0.626-0.730). As interpreted by SHAP, albumin, lactate, blood urea nitrogen, and international normalized ratio were determined to significantly affect the results. Additionally, PF and PLT among the SOFA components significantly influenced the prediction results. CONCLUSION Newly established ML-based models achieved good prediction of mortality in patients with sepsis. Using several clinical variables acquired at baseline can provide more accurate early predictions than using SOFA components. Additionally, the impact of each variable was identified.
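The abstract reports confidence intervals obtained from 10,000 repeated random samplings of the test dataset. One common way to do this is a percentile bootstrap, sketched below. Treating the procedure as a bootstrap with replacement is an assumption, since the abstract does not specify the sampling scheme; the data are toy values and `bootstrap_auc_ci` is a hypothetical helper name.

```python
import random


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile bootstrap interval (assumed scheme)."""
    rng = random.Random(seed)
    n = len(labels)
    point = roc_auc(labels, scores)  # raises early if only one class is present
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined for a single-class resample
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return point, lower, upper


# Toy test set (assumption): 1 = death, 0 = survivor.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.65, 0.90]
point, lower, upper = bootstrap_auc_ci(labels, scores)
print(point, lower, upper)
```

With a test set this small the interval is very wide; the study's reported width (for example 0.756-0.840) reflects a much larger test sample.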
Affiliation(s)
- Sang Won Park
  - Department of Medical Informatics, School of Medicine, Kangwon National University, Chuncheon, Korea
  - Institute of Medical Science, School of Medicine, Kangwon National University, Chuncheon, Korea
- Na Young Yeo
  - Department of Medical Bigdata Convergence, Kangwon National University, Chuncheon, Korea
- Seonguk Kang
  - Department of Convergence Security, Kangwon National University, Chuncheon, Korea
- Taejun Ha
  - Department of Biomedical Research Institute, Kangwon National University Hospital, Chuncheon, Korea
- Tae-Hoon Kim
  - University-Industry Cooperation Foundation, Kangwon National University, Chuncheon, Korea
- DooHee Lee
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- Dowon Kim
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- Seheon Choi
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- Minkyu Kim
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- DongHoon Lee
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- DoHyeon Kim
  - Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea
- Woo Jin Kim
  - Department of Medical Informatics, School of Medicine, Kangwon National University, Chuncheon, Korea
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
- Seung-Joon Lee
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
- Yeon-Jeong Heo
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
- Da Hye Moon
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
- Seon-Sook Han
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
- Yoon Kim
  - University-Industry Cooperation Foundation, Kangwon National University, Chuncheon, Korea
  - Department of Computer Science and Engineering, Kangwon National University, Chuncheon, Korea
- Hyun-Soo Choi
  - University-Industry Cooperation Foundation, Kangwon National University, Chuncheon, Korea
  - Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea
- Dong Kyu Oh
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Su Yeon Lee
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- MiHyeon Park
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Chae-Man Lim
  - Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Jeongwon Heo
  - Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea
  - Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea
6. Li S, Yi H, Leng Q, Wu Y, Mao Y. New perspectives on cancer clinical research in the era of big data and machine learning. Surg Oncol 2024; 52:102009. [PMID: 38215544; DOI: 10.1016/j.suronc.2023.102009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 10/16/2023] [Indexed: 01/14/2024]
Abstract
In the 21st century, the development of medical science has entered the era of big data, and machine learning has become an essential tool for mining medical big data. The establishment of the SEER database has provided a wealth of epidemiological data for cancer clinical research, and the number of studies based on SEER and machine learning has been growing in recent years. This article reviews recent research based on SEER and machine learning and finds that the current focus of such studies is primarily on the development and validation of models using machine learning algorithms, with the main directions being lymph node metastasis prediction, distant metastasis prediction, and prognosis-related research. Compared to traditional models, machine learning algorithms have the advantage of stronger adaptability, but also suffer from disadvantages such as overfitting and poor interpretability, which need to be weighed in practical applications. At present, machine learning algorithms, as the foundation of artificial intelligence, have just begun to emerge in the field of cancer clinical research. The future development of oncology will enter a more precise era of cancer research, characterized by larger data, higher dimensions, and more frequent information exchange. Machine learning is bound to shine brightly in this field.
Affiliation(s)
- Shujun Li
  - Department of Hematology, Xiangya Hospital, Central South University, Changsha, 410008, China
  - National Clinical Research Center for Geriatric Diseases (Xiangya Hospital), China
  - Hunan Hematology Oncology Clinical Medical Research Center, China
- Hang Yi
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Qihao Leng
  - Xiangya School of Medicine, Central South University, Changsha, 410013, Hunan Province, China
- You Wu
  - Institute for Hospital Management, School of Medicine, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing, China
  - Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, 21205, USA
- Yousheng Mao
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
7. Chen C, Zhang W, Yan G, Tang C. Identifying metabolic dysfunction-associated steatotic liver disease in patients with hypertension and pre-hypertension: An interpretable machine learning approach. Digit Health 2024; 10:20552076241233135. [PMID: 38389508; PMCID: PMC10883118; DOI: 10.1177/20552076241233135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 01/30/2024] [Indexed: 02/24/2024] Open
Abstract
OBJECTIVE Metabolic dysfunction-associated steatotic liver disease (MASLD) is one of the most prevalent liver diseases and is associated with pre-hypertension and hypertension. Our research aims to develop interpretable machine learning (ML) models to accurately identify MASLD in hypertensive and pre-hypertensive populations. METHODS The dataset of 4722 hypertensive and pre-hypertensive patients was drawn from subjects in the NAGALA study. Six ML models, including the decision tree, K-nearest neighbor, gradient boosting, naive Bayes, support vector machine, and random forest (RF) models, were used in this study. The optimal model was selected according to model performance evaluated by K-fold cross-validation (k = 5) using the area under the receiver operating characteristic curve (AUC), average precision (AP), accuracy, sensitivity, specificity, and F1. Shapley additive explanation (SHAP) values were employed for both global and local interpretation of the model results. RESULTS The prevalence of MASLD in hypertensive and pre-hypertensive patients was 44.3% (362 cases) and 28.3% (1107 cases), respectively. The RF model outperformed the other five models with an AUC of 0.889, AP of 0.800, accuracy of 0.819, sensitivity of 0.816, specificity of 0.821, and F1 of 0.729. According to the SHAP analysis, the top five important features were alanine aminotransferase, body mass index, waist circumference, high-density lipoprotein cholesterol, and total cholesterol. Further analysis of the feature selection in the RF model revealed that incorporating all features led to optimal model performance. CONCLUSIONS ML algorithms, especially the RF algorithm, improve the accuracy of MASLD identification, and the global and local interpretation of the RF model results shows how various features affect the likelihood of MASLD in patients with hypertension and pre-hypertension.
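The threshold-based metrics listed in this abstract (accuracy, sensitivity, specificity, F1) all derive from the same confusion matrix. The sketch below shows the standard definitions; the counts are invented for illustration and are not taken from the study, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes occur and
    at least one case is predicted positive.
    """
    sensitivity = tp / (tp + fn)          # recall on true positives
    specificity = tn / (tn + fp)          # recall on true negatives
    precision = tp / (tp + fp)            # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy counts (assumption): 50 diseased and 150 healthy subjects.
metrics = classification_metrics(tp=40, fp=20, tn=130, fn=10)
print(metrics)
```

Because F1 ignores true negatives and depends on precision, it can sit well below accuracy when the positive class is the minority, which is the general pattern seen in the reported F1 of 0.729 against an accuracy of 0.819.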
Affiliation(s)
- Chen Chen
  - School of Cyber Science and Engineering, Southeast University, Nanjing, Jiangsu, China
  - School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
- Wenkang Zhang
  - Department of Cardiology, Zhongda Hospital, Southeast University, Nanjing, Jiangsu, China
  - School of Medicine, Southeast University, Nanjing, Jiangsu, China
- Gaoliang Yan
  - Department of Cardiology, Zhongda Hospital, Southeast University, Nanjing, Jiangsu, China
- Chengchun Tang
  - Department of Cardiology, Zhongda Hospital, Southeast University, Nanjing, Jiangsu, China
  - School of Medicine, Southeast University, Nanjing, Jiangsu, China
8. Peng ZH, Tian JH, Chen BH, Zhou HB, Bi H, He MX, Li MR, Zheng XY, Wang YW, Chong T, Li ZL. Development of machine learning prognostic models for overall survival of prostate cancer patients with lymph node-positive. Sci Rep 2023; 13:18424. [PMID: 37891423; PMCID: PMC10611782; DOI: 10.1038/s41598-023-45804-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 10/24/2023] [Indexed: 10/29/2023] Open
Abstract
Prostate cancer (PCa) patients with lymph node involvement (LNI) constitute a single-risk group with varied prognoses. Existing studies on this group have focused solely on those who underwent radical prostatectomy (RP), using statistical models to predict prognosis. This study aimed to develop an easily accessible individual survival prediction tool based on multiple machine learning (ML) algorithms to predict survival probability for PCa patients with LNI. A total of 3280 PCa patients with LNI were identified from the Surveillance, Epidemiology, and End Results (SEER) database, covering the years 2000-2019. The primary endpoint was overall survival (OS). Gradient Boosting Survival Analysis (GBSA), Random Survival Forest (RSF), and Extra Survival Trees (EST) were used to develop prognosis models, which were compared to Cox regression. Discrimination was evaluated using the time-dependent area under the receiver operating characteristic curve (time-dependent AUC) and the concordance index (c-index). Calibration was assessed using the time-dependent Brier score (time-dependent BS) and the integrated Brier score (IBS). Moreover, the beeswarm summary plot in SHAP (SHapley Additive exPlanations) was used to display the contribution of variables to the results. The 3280 patients were randomly split into a training cohort (n = 2624) and a validation cohort (n = 656). Nine variables, including age at diagnosis, race, marital status, clinical T stage, prostate-specific antigen (PSA) level at diagnosis, Gleason Score (GS), number of positive lymph nodes, RP, and radiotherapy (RT), were used to develop the models. The mean time-dependent AUC for GBSA, RSF, and EST was 0.782 (95% confidence interval [CI] 0.779-0.783), 0.779 (95% CI 0.776-0.780), and 0.781 (95% CI 0.778-0.782), respectively, all higher than that of the Cox regression model, 0.770 (95% CI 0.769-0.773). Additionally, all models demonstrated similar calibration, with low IBS. A web-based prediction tool was developed using the best-performing model, GBSA, and is accessible at https://pengzihexjtu-pca-n1.streamlit.app/. ML algorithms performed better than Cox regression, and the web-based tool may help to guide patient treatment and follow-up.
Affiliation(s)
- Zi-He Peng
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Juan-Hua Tian
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Bo-Hong Chen
  - Department of Urology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Hai-Bin Zhou
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Hang Bi
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Min-Xin He
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Ming-Rui Li
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Xin-Yu Zheng
  - Department of Urology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Ya-Wen Wang
  - Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Tie Chong
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Zhao-Lun Li
  - Department of Urology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
9. Yang X, Qiu H, Wang L, Wang X. Predicting Colorectal Cancer Survival Using Time-to-Event Machine Learning: Retrospective Cohort Study. J Med Internet Res 2023; 25:e44417. [PMID: 37883174; PMCID: PMC10636616; DOI: 10.2196/44417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 03/22/2023] [Accepted: 09/29/2023] [Indexed: 10/27/2023] Open
Abstract
BACKGROUND Machine learning (ML) methods have shown great potential in predicting colorectal cancer (CRC) survival. However, the ML models introduced thus far have mainly focused on binary outcomes and have not considered the time-to-event nature of survival data. OBJECTIVE This study aims to evaluate the performance of ML approaches for modeling time-to-event survival data and develop transparent models for predicting CRC-specific survival. METHODS The data set used in this retrospective cohort study contains information on patients who were newly diagnosed with CRC between December 28, 2012, and December 27, 2019, at West China Hospital, Sichuan University. We assessed the performance of 6 representative ML models, including random survival forest (RSF), gradient boosting machine (GBM), DeepSurv, DeepHit, neural net-extended time-dependent Cox (or Cox-Time), and neural multitask logistic regression (N-MTLR), in predicting CRC-specific survival. Multiple imputation by chained equations was applied to handle missing values. Multivariable analysis and clinical experience were used to select significant features associated with CRC survival. Model performance was evaluated in stratified 5-fold cross-validation repeated 5 times by using the time-dependent concordance index, integrated Brier score, calibration curves, and decision curves. The SHapley Additive exPlanations method was applied to calculate feature importance. RESULTS A total of 2157 patients with CRC were included in this study. Among the 6 time-to-event ML models, the DeepHit model exhibited the best discriminative ability (time-dependent concordance index 0.789, 95% CI 0.779-0.799) and the RSF model produced better-calibrated survival estimates (integrated Brier score 0.096, 95% CI 0.094-0.099), but these differences were not statistically significant. Additionally, the RSF, GBM, DeepSurv, Cox-Time, and N-MTLR models had predictive accuracy comparable to that of the Cox Proportional Hazards model in terms of discrimination and calibration. The calibration curves showed that all the ML models exhibited good 5-year survival calibration. The decision curves for CRC-specific survival at 5 years showed that all the ML models, especially RSF, had higher net benefits than the default strategies of treating all or no patients across a range of clinically reasonable risk thresholds. The SHapley Additive exPlanations method revealed that R0 resection, tumor-node-metastasis staging, and the number of positive lymph nodes were important factors for 5-year CRC-specific survival. CONCLUSIONS This study showed the potential of applying time-to-event ML predictive algorithms to help predict CRC-specific survival. The RSF, GBM, Cox-Time, and N-MTLR algorithms could provide nonparametric alternatives to the Cox Proportional Hazards model in estimating the survival probability of patients with CRC. The transparent time-to-event ML models may help clinicians to predict survival for these patients more accurately and improve patient outcomes by enabling personalized treatment plans that are informed by explainable ML models.
Affiliation(s)
- Xulin Yang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Hang Qiu
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
  - Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, China
- Liya Wang
  - Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, China
- Xiaodong Wang
  - Department of Gastrointestinal Surgery, West China Hospital, Sichuan University, Chengdu, China
|
10
|
Xia K, Chen D, Jin S, Yi X, Luo L. Prediction of lung papillary adenocarcinoma-specific survival using ensemble machine learning models. Sci Rep 2023; 13:14827. [PMID: 37684259 PMCID: PMC10491759 DOI: 10.1038/s41598-023-40779-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 08/16/2023] [Indexed: 09/10/2023] Open
Abstract
Accurate prognostic prediction is crucial for treatment decision-making in lung papillary adenocarcinoma (LPADC). The aim of this study was to predict cancer-specific survival in LPADC using ensemble machine learning and classical Cox regression models. Moreover, models were evaluated to provide recommendations based on quantitative data for personalized treatment of LPADC. Data of patients diagnosed with LPADC (2004-2018) were extracted from the Surveillance, Epidemiology, and End Results database. The set of samples was randomly divided into the training and validation sets at a ratio of 7:3. Three ensemble models were selected, namely gradient boosting survival (GBS), random survival forest (RSF), and extra survival trees (EST). In addition, Cox proportional hazards (CoxPH) regression was used to construct the prognostic models. The Harrell's concordance index (C-index), integrated Brier score (IBS), and area under the time-dependent receiver operating characteristic curve (time-dependent AUC) were used to evaluate the performance of the predictive models. A user-friendly web access panel was provided to easily evaluate the model for the prediction of survival and treatment recommendations. A total of 3615 patients were randomly divided into the training and validation cohorts (n = 2530 and 1085, respectively). The extra survival trees, RSF, GBS, and CoxPH models showed good discriminative ability and calibration in both the training and validation cohorts (mean of time-dependent AUC: > 0.84 and > 0.82; C-index: > 0.79 and > 0.77; IBS: < 0.16 and < 0.17, respectively). The RSF and GBS models were more consistent than the CoxPH model in predicting long-term survival. We implemented the developed models as web applications for deployment into clinical practice (accessible through https://shinyshine-820-lpaprediction-model-z3ubbu.streamlit.app/ ). All four prognostic models showed good discriminative ability and calibration. 
The RSF and GBS models exhibited the highest effectiveness among all models in predicting the long-term cancer-specific survival of patients with LPADC. This approach may facilitate the development of personalized treatment plans and prediction of prognosis for LPADC.
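Harrell's concordance index, one of the metrics named in the abstract above, can be illustrated with a minimal sketch. This is not the authors' code: the function name `harrell_c_index` and the toy inputs are hypothetical, and pairs with tied observed times are simply skipped, which is a simplification of the full definition.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk.

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times (simplification) and pairs whose shorter
        # time is censored, because their true order is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored.
print(harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index values reported above are read.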
Affiliation(s)
- Kaide Xia
  - Guiyang Maternal and Child Health Care Hospital, Guiyang Children's Hospital, Guiyang, China
- Dinghua Chen
  - Department of General Surgery, The Forth People's Hospital of Guiyang, Guiyang, China
- Shuai Jin
  - School of Big Health, Guizhou Medical University, Guiyang, China
- Xinglin Yi
  - Department of Respiratory Medicine, Third Military Medical University, Chongqing, China
- Li Luo
  - Department of Clinical Laboratory, The Second People's Hospital of Guiyang, Guiyang, China
|
11
|
Hao Y, Liang D, Zhang S, Wu S, Li D, Wang Y, Shi M, He Y. Machine learning for predicting the survival in osteosarcoma patients: Analysis based on American and Hebei Province cohort. Biomolecules & Biomedicine 2023; 23:883-893. [PMID: 36967662 PMCID: PMC10494842 DOI: 10.17305/bb.2023.8804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/23/2023] [Accepted: 03/23/2023] [Indexed: 06/18/2023]
Abstract
Osteosarcoma, a rare malignant tumor, has a poor prognosis. This study aimed to find the best prognostic model for osteosarcoma. There were 2912 patients included from the SEER database and 225 patients from Hebei Province. Patients from the SEER database (2008-2015) were included in the development dataset. Patients from the SEER database (2004-2007) and Hebei Province cohort were included in the external test datasets. The Cox model and three tree-based machine learning algorithms (survival tree [ST], random survival forest [RSF] and gradient boosting machine [GBM]) were used to develop the prognostic models by 10-fold cross-validation with 200 iterations. Additionally, performance of models in the multivariable group was compared with the TNM group. The 3-year and 5-year cancer specific survival (CSS) were 72.71% and 65.92% in the development dataset, respectively. The predictive ability in the multivariable group was superior to that in the TNM group. The calibration curves and consistency in the multivariable group were superior to those in the TNM group. The Cox and RSF models performed better than the ST and GBM models. A nomogram was constructed to predict the 3-year and 5-year CSS of osteosarcoma patients. The RSF model can be used as a nonparametric alternative to the Cox model. The constructed nomogram based on the Cox model can provide reference for clinicians to formulate specific therapeutic decisions both in America and China.
Affiliation(s)
- Yahui Hao
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Di Liang
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Shuo Zhang
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Siqi Wu
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Daojuan Li
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Yingying Wang
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Miaomiao Shi
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
- Yutong He
  - Cancer Institute, The Fourth Hospital of Hebei Medical University/The Tumor Hospital of Hebei Province, Shijiazhuang, China
|
12
|
Xiu Y, Jiang C, Zhang S, Yu X, Qiao K, Huang Y. Prediction of nonsentinel lymph node metastasis in breast cancer patients based on machine learning. World J Surg Oncol 2023; 21:244. [PMID: 37563717 PMCID: PMC10416453 DOI: 10.1186/s12957-023-03109-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Accepted: 07/12/2023] [Indexed: 08/12/2023] Open
Abstract
BACKGROUND To develop the best machine learning (ML) model to predict nonsentinel lymph node metastases (NSLNM) in breast cancer patients. METHODS From June 2016 to August 2022, 1005 breast cancer patients were included in this retrospective study. Univariate and multivariate analyses were performed using logistic regression. Six ML models were introduced, and their performance was compared. RESULTS NSLNM occurred in 338 (33.6%) of 1005 patients. The best ML model was XGBoost, whose average area under the curve (AUC) based on 10-fold cross-validation was 0.722. It performed better than the nomogram, which was based on logistic regression (AUC: 0.764 vs. 0.706). CONCLUSIONS The XGBoost ML model can predict NSLNM in breast cancer patients well.
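The "average AUC based on 10-fold cross-validation" reported above can be sketched as follows. This is an illustrative, standard-library-only example and not the authors' pipeline: `roc_auc` and `mean_cv_auc` are hypothetical helper names, and the per-fold labels and scores are assumed to come from a model that was already fitted on the other folds.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0     # positive ranked above negative
            elif p == n:
                wins += 0.5     # ties count as half
    return wins / (len(positives) * len(negatives))


def mean_cv_auc(fold_results):
    """Average the AUC over folds; each fold is a (labels, scores) pair."""
    aucs = [roc_auc(labels, scores) for labels, scores in fold_results]
    return sum(aucs) / len(aucs)


# Toy held-out predictions for two folds (hypothetical values).
folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
]
print(mean_cv_auc(folds))
```

Averaging per-fold AUCs, as here, is one common convention; pooling all held-out predictions before computing a single AUC is another, and the abstract does not say which was used.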
Affiliation(s)
- Yuting Xiu
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
- Cong Jiang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
- Shiyuan Zhang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
- Xiao Yu
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
- Kun Qiao
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
- Yuanxi Huang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, 150086, China
|
13
|
Pan X, Feng T, Liu C, Savjani RR, Chin RK, Sharon Qi X. A survival prediction model via interpretable machine learning for patients with oropharyngeal cancer following radiotherapy. J Cancer Res Clin Oncol 2023; 149:6813-6825. [PMID: 36807760 DOI: 10.1007/s00432-023-04644-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 02/08/2023] [Indexed: 02/21/2023]
Abstract
PURPOSE To explore interpretable machine learning (ML) methods, with the hope of adding more prognostic value, for predicting survival for patients with Oropharyngeal-Cancer (OPC). METHODS A cohort of 427 OPC patients (Training 341, Test 86) from the TCIA database was analyzed. Radiomic features of the gross-tumor-volume (GTV), extracted from planning CT using Pyradiomics, and patient characteristics such as HPV p16 status were considered as potential predictors. A multi-level dimension reduction algorithm consisting of Least-Absolute-Shrinkage-and-Selection-Operator (Lasso) and Sequential-Floating-Backward-Selection (SFBS) was proposed to effectively remove redundant/irrelevant features. The interpretable model was constructed by quantifying the contribution of each feature to the Extreme-Gradient-Boosting (XGBoost) decision with the Shapley-Additive-exPlanations (SHAP) algorithm. RESULTS The Lasso-SFBS algorithm proposed in this study finally selected 14 features, and our prediction model achieved an area-under-ROC-curve (AUC) of 0.85 on the test dataset based on this feature set. The ranking of the contribution values calculated by SHAP shows that the top predictors most correlated with survival were ECOG performance status, wavelet-LLH_firstorder_Mean, chemotherapy, wavelet-LHL_glcm_InverseVariance, and tumor size. Patients who had chemotherapy, positive HPV p16 status, and lower ECOG performance status tended to have higher SHAP scores and longer survival; those who were older at diagnosis or had a history of heavy drinking and smoking (pack-years) tended to have lower SHAP scores and shorter survival. CONCLUSION We demonstrated the predictive value of combined patient characteristics and imaging features for the overall survival of OPC patients. The multi-level dimension reduction algorithm can reliably identify the most plausible predictors associated with overall survival. The interpretable patient-specific survival prediction model, capturing correlations between each predictor and clinical outcome, was developed to facilitate clinical decision-making for personalized treatment.
Affiliation(s)
- Xiaoying Pan
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
  - Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Tianhao Feng
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
  - Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Chen Liu
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
  - Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Ricky R Savjani
  - Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Robert K Chin
  - Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
- X Sharon Qi
  - Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, 90095, USA
|
14
|
Yi X, Xu W, Tang G, Zhang L, Wang K, Luo H, Zhou X. Individual risk and prognostic value prediction by machine learning for distant metastasis in pulmonary sarcomatoid carcinoma: a large cohort study based on the SEER database and the Chinese population. Front Oncol 2023; 13:1105224. [PMID: 37434968 PMCID: PMC10332636 DOI: 10.3389/fonc.2023.1105224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 06/06/2023] [Indexed: 07/13/2023] Open
Abstract
Background This study aimed to develop diagnostic and prognostic models for patients with pulmonary sarcomatoid carcinoma (PSC) and distant metastasis (DM). Methods To develop the diagnostic model for DM, patients from the Surveillance, Epidemiology, and End Results (SEER) database were divided into a training set and an internal test set at a ratio of 7 to 3, while those from the Chinese hospital were assigned to the external test set. Univariate logistic regression was employed in the training set to screen for DM-related risk factors, which were included in six machine learning (ML) models. Furthermore, patients from the SEER database were randomly divided into a training set and a validation set at a ratio of 7 to 3 to develop the prognostic model, which predicts survival of PSC patients with DM. Univariate and multivariate Cox regression analyses were also performed in the training set to identify independent factors, and a prognostic nomogram for cancer-specific survival (CSS) was constructed for PSC patients with DM. Results For the diagnostic model for DM, 589 patients with PSC in the training set, 255 patients in the internal test set, and 94 patients in the external test set were eventually enrolled. The extreme gradient boosting (XGB) algorithm performed best on the external test set, with an area under the curve (AUC) of 0.821. For the prognostic model, 270 PSC patients with DM in the training set and 117 patients in the test set were enrolled. The nomogram displayed good accuracy, with an AUC of 0.803 for 3-month CSS and 0.869 for 6-month CSS in the test set. Conclusion The ML model accurately identified individuals at high risk for DM who needed more careful follow-up, including appropriate preventative therapeutic strategies. The prognostic nomogram accurately predicted CSS in PSC patients with DM.
Affiliation(s)
- Xinglin Yi
  - Department of Respiratory Medicine, Southwest Hospital of Third Military Medical University, Chongqing, China
- Wenhao Xu
  - Department of Urinary Medicine Center, Southwest Hospital of Third Military Medical University, Chongqing, China
- Guihua Tang
  - Department of Respiratory Medicine, Southwest Hospital of Third Military Medical University, Chongqing, China
- Lingye Zhang
  - Department of Respiratory Medicine, Southwest Hospital of Third Military Medical University, Chongqing, China
- Kaishan Wang
  - Department of Neurosurgery Department, Southwest Hospital of Third Military Medical University, Chongqing, China
- Hu Luo
  - Department of Respiratory Medicine, Southwest Hospital of Third Military Medical University, Chongqing, China
- Xiangdong Zhou
  - Department of Respiratory Medicine, Southwest Hospital of Third Military Medical University, Chongqing, China
|
15
|
Liu Y, Wu Z, Feng Y, Gao J, Wang B, Lian C, Diao B. Integration analysis of single-cell and spatial transcriptomics reveal the cellular heterogeneity landscape in glioblastoma and establish a polygenic risk model. Front Oncol 2023; 13:1109037. [PMID: 37397378 PMCID: PMC10308022 DOI: 10.3389/fonc.2023.1109037] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/27/2022] [Accepted: 05/31/2023] [Indexed: 07/04/2023] Open
Abstract
Background Glioblastoma (GBM) is the most common and most lethal malignant brain tumor in adults. Tumor heterogeneity is the leading cause of treatment failure. However, the relationship between cellular heterogeneity, tumor microenvironment, and GBM progression is still elusive. Methods Integrated analysis of single-cell RNA sequencing (scRNA-seq) and spatial transcriptome sequencing (stRNA-seq) of GBM was conducted to analyze the spatial tumor microenvironment. We investigated the subpopulation heterogeneity of malignant cells through gene set enrichment analyses, cell communication analyses, and pseudotime analyses. Genes that changed significantly in the pseudotime analysis were screened to create a tumor progress-related gene risk score (TPRGRS) using Cox regression algorithms in the bulk RNA-sequencing (bulkRNA-seq) dataset. We combined the TPRGRS and clinical characteristics to predict the prognosis of patients with GBM. Furthermore, functional analysis was applied to uncover the underlying mechanisms of the TPRGRS. Results GBM cells were accurately mapped to their spatial locations, revealing their spatial colocalization. The malignant cells were divided into five clusters with transcriptional and functional heterogeneity, including unclassified malignant cells and astrocyte-like, mesenchymal-like, oligodendrocyte-progenitor-like, and neural-progenitor-like malignant cells. Cell-cell communication analysis in scRNA-seq and stRNA-seq identified ligand-receptor pairs of the CXCL, EGF, FGF, and MIF signaling pathways as bridges, implying that the tumor microenvironment may cause malignant cells' transcriptomic adaptability and disease progression. Pseudotime analysis showed the differentiation trajectory of GBM cells from proneural to mesenchymal transition and identified genes or pathways that affect cell differentiation. The TPRGRS successfully divided patients with GBM in three datasets into high- and low-risk groups and was shown to be a prognostic factor independent of routine clinicopathological characteristics. Functional analysis revealed that the TPRGRS was associated with growth factor binding, cytokine activity, signaling receptor activator activity functions, and oncogenic pathways. Further analysis revealed the association of the TPRGRS with gene mutations and immunity in GBM. Finally, the external datasets and qRT-PCR verified high expression of the TPRGRS mRNAs in GBM cells. Conclusion Our study provides novel insights into heterogeneity in GBM based on scRNA-seq and stRNA-seq data. Moreover, our study proposed a malignant cell transition-based TPRGRS through integrated analysis of bulkRNA-seq and scRNA-seq data, combined with the routine clinicopathological evaluation of tumors, which may provide more personalized drug regimens for GBM patients.
Affiliation(s)
- Yaxuan Liu
  - School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
  - Department of Basic Medicine, General Hospital of Central Theatre Command, Wuhan, Hubei, China
- Zhenyu Wu
  - Department of Urology, The First People's Hospital of Foshan, Foshan, Guangdong, China
- Yueyuan Feng
  - Cancer Hospital, The First People's Hospital of Foshan, Foshan, Foshan, Guangdong, China
- Jiawei Gao
  - College of Medicine, JiShou University, Xiangxi, Hunan, China
- Bo Wang
  - College of Medicine, JiShou University, Xiangxi, Hunan, China
- Changlin Lian
  - Cancer Hospital, The First People's Hospital of Foshan, Foshan, Foshan, Guangdong, China
- Bo Diao
  - School of Laboratory Medicine and Biotechnology, Southern Medical University, Guangzhou, Guangdong, China
  - Department of Basic Medicine, General Hospital of Central Theatre Command, Wuhan, Hubei, China
  - Department of Neurosurgery, Wuhan General Hospital of Guangzhou Command and Hubei Key Laboratory of Central Nervous System Tumor and Intervention, Wuhan, Hubei, China
|
16
|
Chen W, Zhou B, Jeon CY, Xie F, Lin YC, Butler RK, Zhou Y, Luong TQ, Lustigova E, Pisegna JR, Wu BU. Machine learning versus regression for prediction of sporadic pancreatic cancer. Pancreatology 2023; 23:396-402. [PMID: 37130760 PMCID: PMC10406388 DOI: 10.1016/j.pan.2023.04.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/30/2022] [Revised: 04/10/2023] [Accepted: 04/23/2023] [Indexed: 05/04/2023]
Abstract
BACKGROUND/OBJECTIVES There is currently no widely accepted approach to identify patients at increased risk for sporadic pancreatic cancer (PC). We aimed to compare the performance of two machine-learning models with a regression-based model in predicting pancreatic ductal adenocarcinoma (PDAC), the most common form of PC. METHODS This retrospective cohort study consisted of patients 50-84 years of age enrolled in either Kaiser Permanente Southern California (KPSC, model training, internal validation) or the Veterans Affairs (VA, external testing) between 2008 and 2017. The performance of random survival forests (RSF) and eXtreme gradient boosting (XGB) models was compared to that of Cox proportional hazards regression (COX). Heterogeneity of the three models was assessed. RESULTS The KPSC and the VA cohorts consisted of 1.8 and 2.7 million patients with 1792 and 4582 incident PDAC cases within 18 months, respectively. Predictors selected into all three models included age, abdominal pain, weight change, and glycated hemoglobin (A1c). Additionally, RSF selected change in alanine transaminase (ALT), whereas XGB and COX selected the rate of change in ALT. The COX model appeared to have a lower AUC (KPSC: 0.737, 95% CI 0.710-0.764; VA: 0.706, 0.699-0.714) than RSF (KPSC: 0.767, 0.744-0.791; VA: 0.731, 0.724-0.739) and XGB (KPSC: 0.779, 0.755-0.802; VA: 0.742, 0.735-0.750). Among patients with top 5% predicted risk from all three models (N = 29,663), 117 developed PDAC, of which RSF, XGB and COX captured 84 (9 unique), 87 (4 unique), and 87 (19 unique) cases, respectively. CONCLUSIONS The three models complement each other, but each has unique contributions.
Affiliation(s)
- Wansu Chen
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Botao Zhou
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Fagen Xie
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Yu-Chen Lin
  - Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Rebecca K Butler
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Yichen Zhou
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Tiffany Q Luong
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Eva Lustigova
  - Kaiser Permanente Southern California Research and Evaluation, Pasadena, CA, USA
- Joseph R Pisegna
  - Division of Gastroenterology and Hepatology, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA; Departments of Medicine and Human Genetics David Geffen School of Medicine at UCLA, USA
- Bechien U Wu
  - Center for Pancreatic Care, Department of Gastroenterology, Los Angeles Medical Center, Southern California Permanente Medical Group, Los Angeles, CA, USA
|
17
|
Li X, Wu R, Zhao W, Shi R, Zhu Y, Wang Z, Pan H, Wang D. Machine learning algorithm to predict mortality in critically ill patients with sepsis-associated acute kidney injury. Sci Rep 2023; 13:5223. [PMID: 36997585 PMCID: PMC10063657 DOI: 10.1038/s41598-023-32160-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/30/2022] [Accepted: 03/23/2023] [Indexed: 04/01/2023] Open
Abstract
This study aimed to establish and validate a machine learning (ML) model for predicting in-hospital mortality in patients with sepsis-associated acute kidney injury (SA-AKI). This study collected data on SA-AKI patients from 2008 to 2019 using the Medical Information Mart for Intensive Care IV. After employing Lasso regression for feature selection, six ML approaches were used to build the model. The optimal model was chosen based on precision and area under the curve (AUC). In addition, the best model was interpreted using SHapley Additive exPlanations (SHAP) values and the Local Interpretable Model-Agnostic Explanations (LIME) algorithm. There were 8129 sepsis patients eligible for participation; the median age was 68.7 (interquartile range: 57.2-79.6) years, and 57.9% (4708/8129) were male. After selection, 24 of the 44 clinical characteristics gathered after intensive care unit admission remained linked with prognosis and were used to develop the ML models. Among the six models developed, the eXtreme Gradient Boosting (XGBoost) model had the highest AUC, at 0.794. According to the SHAP values, the sequential organ failure assessment score, respiration, simplified acute physiology score II, and age were the four most influential variables in the XGBoost model. Individual predictions were explained using the LIME algorithm. We built and verified ML models that excel in early mortality risk prediction in SA-AKI, and the XGBoost model performed best.
Affiliation(s)
- Xunliang Li
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Ruijuan Wu
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Wenman Zhao
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Rui Shi
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Yuyu Zhu
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Zhijuan Wang
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
- Haifeng Pan
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Department of Epidemiology and Biostatistics, School of Public Health, Anhui Medical University, Hefei, People's Republic of China
  - Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei, People's Republic of China
- Deguang Wang
  - Department of Nephrology, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
  - Institute of Kidney Disease, Inflammation and Immunity Mediated Diseases, The Second Affiliated Hospital of Anhui Medical University, Anhui Medical University, Hefei, People's Republic of China
|
18
|
Sun H, Wu S, Li S, Jiang X. Which model is better in predicting the survival of laryngeal squamous cell carcinoma?: Comparison of the random survival forest based on machine learning algorithms to Cox regression: analyses based on SEER database. Medicine (Baltimore) 2023; 102:e33144. [PMID: 36897699 PMCID: PMC9997795 DOI: 10.1097/md.0000000000033144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 02/10/2023] [Indexed: 03/11/2023] Open
Abstract
Prediction of postoperative survival for laryngeal carcinoma patients is very important. This study attempts to demonstrate the use of the random survival forest (RSF) and Cox regression models to predict overall survival in laryngeal squamous cell carcinoma (LSCC) and to compare their performance. A total of 8677 patients diagnosed with LSCC from 2004 to 2015 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Multivariate imputation by chained equations was applied to fill in the missing data. The Lasso regression algorithm was used to find potential predictors. RSF and Cox regression were used to develop the survival prediction models. Harrell's concordance index (C-index), area under the curve (AUC), Brier score, and calibration plot were used to evaluate the predictive performance of the 2 models. For 3-year survival prediction, the C-index in the training set was 0.74 (0.011) and 0.84 (0.013) for Cox and RSF, respectively. For 5-year survival prediction, the C-index in the training set was 0.75 (0.022) and 0.80 (0.011) for Cox and RSF, respectively. Similar results were found in the validation set. The AUC was 0.795 for RSF and 0.715 for Cox in the training set, and 0.765 for RSF and 0.705 for Cox in the validation set. The prediction error curves for each model based on the Brier score showed that the RSF model had lower prediction errors in both the training and validation groups. Moreover, the calibration curves of the 2 models were similar in both the training and validation sets. The RSF model performed better than the Cox regression model. The RSF algorithm provides a relatively better alternative for clinical use in estimating the survival probability of LSCC patients.
Affiliation(s)
- Haili Sun
- Ping Yang Hospital Affiliated to Wenzhou Medical University, Wenzhou, China
| | - Shuangshuang Wu
- Ping Yang Hospital Affiliated to Wenzhou Medical University, Wenzhou, China
| | - Shaoxiao Li
- Ping Yang Hospital Affiliated to Wenzhou Medical University, Wenzhou, China
| | - Xiaohua Jiang
- Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
| |
|
19
|
Ruan Z, Quan Q, Wang Q, Jiang J, Peng R. New Staging System and Prognostic Model for Malignant Phyllodes Tumor Patients without Distant Metastasis: A Development and Validation Study. J Clin Med 2023; 12:jcm12051889. [PMID: 36902676 PMCID: PMC10003404 DOI: 10.3390/jcm12051889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 02/12/2023] [Accepted: 02/16/2023] [Indexed: 03/08/2023] Open
Abstract
PURPOSE To build a new staging system and new prognostic models for malignant phyllodes tumor of the breast (MPTB). METHODS We performed a comprehensive analysis of the data from the SEER database. RESULTS We characterized MPTB by comparing 1085 MPTB cases with 382,718 invasive ductal carcinoma cases. We established a new stage- and age-stratification system for MPTB patients. Furthermore, we built two prognostic models for MPTB patients. The validity of these models was confirmed through multifaceted and multidata verification. CONCLUSIONS Our study provides a staging system and prognostic models for MPTB patients, which can not only help to predict patient outcomes but also enhance the understanding of the prognostic factors associated with MPTB.
Affiliation(s)
- Zhaohui Ruan
- Department of VIP Section, Sun Yat-sen University Cancer Center, State Key Laboratory Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou 510060, China
- Changping Laboratory, Beijing 102206, China
| | - Qi Quan
- Department of VIP Section, Sun Yat-sen University Cancer Center, State Key Laboratory Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou 510060, China
| | - Qianyu Wang
- Department of VIP Section, Sun Yat-sen University Cancer Center, State Key Laboratory Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou 510060, China
| | - Jiaxin Jiang
- Department of VIP Section, Sun Yat-sen University Cancer Center, State Key Laboratory Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou 510060, China
| | - Roujun Peng
- Department of VIP Section, Sun Yat-sen University Cancer Center, State Key Laboratory Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou 510060, China
- Correspondence:
| |
|
20
|
Park SB, Kim KU, Park YW, Hwang JH, Lim CH. Application of 18F-fluorodeoxyglucose PET/CT radiomic features and machine learning to predict early recurrence of non-small cell lung cancer after curative-intent therapy. Nucl Med Commun 2023; 44:161-168. [PMID: 36458424 DOI: 10.1097/mnm.0000000000001646] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 12/03/2022]
Abstract
OBJECTIVE To predict the recurrence of non-small cell lung cancer (NSCLC) within 2 years after curative-intent treatment using a machine-learning approach with PET/CT-based radiomics. PATIENTS AND METHODS A total of 77 NSCLC patients who underwent pretreatment 18F-fluorodeoxyglucose PET/CT were retrospectively analyzed. Five clinical features (age, sex, tumor stage, tumor histology, and smoking status) and 48 radiomic features extracted from primary tumors on PET were used for binary classifications. These were ranked, and a subset of useful features was selected based on Gini coefficient scores in terms of associations with relapsed status. Areas under the receiver operating characteristic curve (AUC) were obtained from six machine-learning algorithms (support vector machine, random forest, neural network, naive Bayes, logistic regression, and gradient boosting). Model performances were compared and validated via random sampling. RESULTS A PET/CT-based radiomic model was developed and validated for predicting the recurrence of NSCLC during the first 2 years after curative-intent treatment. The most important features were SD and variance of standardized uptake value, followed by low-intensity short-zone emphasis and high-intensity zone emphasis. The naive Bayes model with the 15 best-ranked features displayed the best performance (AUC: 0.816). Prediction models using the five best PET-derived features outperformed those using five clinical variables. CONCLUSION The machine learning model using PET-derived radiomic features showed good performance for predicting the recurrence of NSCLC during the first 2 years after curative-intent therapy. PET/CT-based radiomic features may help clinicians improve the risk stratification of relapsed NSCLC.
Affiliation(s)
| | - Ki-Up Kim
- Department of Allergy and Respiratory Medicine
| | | | - Jung Hwa Hwang
- Department of Radiology, Soonchunhyang University Hospital, Seoul, Republic of Korea
| | | |
|
21
|
Tran TT, Lee J, Gunathilake M, Kim J, Kim SY, Cho H, Kim J. A comparison of machine learning models and Cox proportional hazards models regarding their ability to predict the risk of gastrointestinal cancer based on metabolic syndrome and its components. Front Oncol 2023; 13:1049787. [PMID: 36937438 PMCID: PMC10018751 DOI: 10.3389/fonc.2023.1049787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 01/20/2023] [Indexed: 03/06/2023] Open
Abstract
Background Little is known about applying machine learning (ML) techniques to identify the important variables contributing to the occurrence of gastrointestinal (GI) cancer in epidemiological studies. We aimed to compare different ML models to a Cox proportional hazards (CPH) model regarding their ability to predict the risk of GI cancer based on metabolic syndrome (MetS) and its components. Methods A total of 41,837 participants were included in a prospective cohort study. Incident cancer cases were identified by following up with participants until December 2019. We used CPH, random survival forest (RSF), survival trees (ST), gradient boosting (GB), survival support vector machine (SSVM), and extra survival trees (EST) models to explore the impact of MetS on GI cancer prediction. We used the C-index and integrated Brier score (IBS) to compare the models. Results In all, 540 incident GI cancer cases were identified. The GB and SSVM models exhibited comparable performance to the CPH model concerning the C-index (0.725). We also recorded a similar IBS for all models (0.017). Fasting glucose and waist circumference were considered important predictors. Conclusions Our study found comparably good performance concerning the C-index for the ML models and CPH model. This finding suggests that ML models may be considered another method for survival analysis when the CPH model's conditions are not satisfied.
Affiliation(s)
- Tao Thi Tran
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Jeonghee Lee
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Madhawa Gunathilake
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Junetae Kim
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Sun-Young Kim
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Hyunsoon Cho
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Jeongseon Kim
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
- *Correspondence: Jeongseon Kim,
| |
|
22
|
Sim R, Chong CW, Loganadan NK, Adam NL, Hussein Z, Lee SWH. Comparison of a chronic kidney disease predictive model for type 2 diabetes mellitus in Malaysia using Cox regression versus machine learning approach. Clin Kidney J 2022; 16:549-559. [PMID: 36865020 PMCID: PMC9972828 DOI: 10.1093/ckj/sfac252] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/06/2022] [Indexed: 12/12/2022] Open
Abstract
Background Diabetes is one of the leading causes of chronic kidney disease (CKD) and end-stage renal disease. This study aims to develop and validate different risk predictive models for incident CKD and CKD progression in people with type 2 diabetes (T2D). Methods We reviewed a cohort of people with T2D seeking care from two tertiary hospitals in the metropolitan cities of the states of Selangor and Negeri Sembilan from January 2012 to May 2021. To identify the 3-year predictors of developing CKD (primary outcome) and CKD progression (secondary outcome), the dataset was randomly split into a training and a test set. A Cox proportional hazards (CoxPH) model was developed to identify predictors of developing CKD. The resultant CoxPH model was compared with other machine learning models on performance using the C-statistic. Results The cohort included 1992 participants, of whom 295 had developed CKD and 442 reported worsening of kidney function. The equation for the 3-year risk of developing CKD included gender, haemoglobin A1c, triglyceride and serum creatinine levels, estimated glomerular filtration rate, history of cardiovascular disease and diabetes duration. For risk of CKD progression, the model included systolic blood pressure, retinopathy and proteinuria. The CoxPH model was better at prediction than the other machine learning models examined for incident CKD (C-statistic: training 0.826; test 0.874) and CKD progression (C-statistic: training 0.611; test 0.655). The risk calculator can be found at https://rs59.shinyapps.io/071221/. Conclusions The Cox regression model was the best performing model for predicting the 3-year risk of incident CKD and CKD progression in people with T2D in a Malaysian cohort.
Affiliation(s)
- Ruth Sim
- School of Pharmacy, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, Subang Jaya, Selangor, Malaysia
| | - Chun Wie Chong
- School of Pharmacy, Monash University Malaysia, Jalan Lagoon Selatan, Bandar Sunway, Subang Jaya, Selangor, Malaysia
| | - Navin Kumar Loganadan
- Department of Pharmacy, Putrajaya Hospital, Ministry of Health Malaysia, Jalan P9, Presint 7, Putrajaya, Wilayah Persekutuan Putrajaya, Malaysia
| | - Noor Lita Adam
- Department of Medicine, Hospital Tuanku Jaafar, Ministry of Health Malaysia, Jalan Rasah, Bukit Rasah, Seremban, Negeri Sembilan, Malaysia
| | - Zanariah Hussein
- Department of Medicine, Putrajaya Hospital, Ministry of Health Malaysia, Jalan P9, Presint 7, Putrajaya, Wilayah Persekutuan Putrajaya, Malaysia
| | | |
|
23
|
Machine Learning Algorithms for Prediction of Survival by Stress Echocardiography in Chronic Coronary Syndromes. J Pers Med 2022; 12:jpm12091523. [PMID: 36143307 PMCID: PMC9504503 DOI: 10.3390/jpm12091523] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/06/2022] [Revised: 09/13/2022] [Accepted: 09/13/2022] [Indexed: 11/28/2022] Open
Abstract
Stress echocardiography (SE) is based on regional wall motion abnormalities and coronary flow velocity reserve (CFVR). Their independent prognostic capabilities could be better studied with a machine learning (ML) approach. The study aims to assess the SE outcome data by conducting an analysis with an ML approach. We included 6881 prospectively recruited and retrospectively analyzed patients with suspected (n = 4279) or known (n = 2602) coronary artery disease submitted to clinically driven dipyridamole SE. The outcome measure was all-cause death. A random forest survival model was implemented to model the survival function according to the patient's characteristics; 1002 patients recruited by a single, independent center formed the external validation cohort. During a median follow-up of 3.4 years (IQR 1.6–7.5), 814 (12%) patients died. The mortality risk was higher for patients aged >60 years, with a resting ejection fraction < 60%, resting WMSI, positive stress-rest WMSI scores, and CFVR < 3. The C-index was 0.79 in the internal and 0.81 in the external validation data set. Survival functions for individual patients were easily obtained with an open access web app. An ML approach can be fruitfully applied to outcome data obtained with SE. Survival showed a constantly increasing relationship with a CFVR < 3.0 and stress-rest wall motion score index > Since processing is largely automated, this approach can be easily scaled to larger and more comprehensive data sets to further refine stratification, guide therapy and be ultimately adopted as an open-source online decision tool.
|
24
|
Peng J, Lu Y, Chen L, Qiu K, Chen F, Liu J, Xu W, Zhang W, Zhao Y, Yu Z, Ren J. The prognostic value of machine learning techniques versus cox regression model for head and neck cancer. Methods 2022; 205:123-132. [PMID: 35798257 DOI: 10.1016/j.ymeth.2022.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Revised: 05/18/2022] [Accepted: 07/01/2022] [Indexed: 10/17/2022] Open
Abstract
BACKGROUND Accurate prognostic prediction for head and neck cancer (HNC) is important for the improvement of clinical management. We aimed to compare the prognostic value of various machine learning techniques (MLTs) and the statistical Cox regression model for different types of HNC. METHODS Clinical data of HNC patients were extracted from the Surveillance, Epidemiology, and End Results (SEER) database from 1974 to 2016. The prediction performance of five ML models, including random forest (RF), gradient boosting decision tree (GBDT), support vector machine (SVM), neural network (NN) and deep learning (DL), was compared with that of the statistical Cox regression model by estimating the concordance index (C-index), integrated Brier score (IBS), time-dependent receiver operating characteristic (ROC) curve and the area under the curve (AUC). RESULTS Our results showed that the RF model outperformed all other models in prognostic prediction for all tumor sites of HNC, particularly for major salivary gland cancer (MSGC, C-index: 88.730 ± 0.8700, IBS: 7.680 ± 0.4800), oral cavity cancer (OCC, C-index: 84.250 ± 0.6700, IBS: 11.480 ± 0.3300) and oropharyngeal cancer (OPC, C-index: 82.510 ± 0.5400, IBS: 10.120 ± 0.1400). We also analyzed the importance of each clinical variable in the RF model, in which age and tumor size presented the strongest positive prognostic effects. Additionally, similar results were observed in the internal (6th edition of the AJCC TNM staging system cohort) and external (the TCGA HNC cohort) validations. CONCLUSIONS The RF model is a promising prognostic prediction tool for HNC patients, regardless of the anatomic subsite.
Affiliation(s)
- Jiajia Peng
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
| | - Yongmei Lu
- Department of Computer Science, Sichuan University, Chengdu, China
| | - Li Chen
- Department of Computer Science, Sichuan University, Chengdu, China
| | - Ke Qiu
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China
| | - Fei Chen
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China
| | - Jun Liu
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China
| | - Wei Xu
- Department of Computer Science, Sichuan University, Chengdu, China
| | - Wei Zhang
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
| | - Yu Zhao
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China.
| | - Zhonghua Yu
- Department of Computer Science, Sichuan University, Chengdu, China.
| | - Jianjun Ren
- Department of Oto-Rhino-Laryngology, West China Hospital, Sichuan University, Chengdu, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China; Department of Biostatistics, Princess Margaret Cancer Centre and Dalla Lana School of Public Health, Toronto, Ontario, Canada.
| |
|
25
|
Suresh K, Severn C, Ghosh D. Survival prediction models: an introduction to discrete-time modeling. BMC Med Res Methodol 2022; 22:207. [PMID: 35883032 PMCID: PMC9316420 DOI: 10.1186/s12874-022-01679-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/01/2022] [Accepted: 07/08/2022] [Indexed: 12/05/2022] Open
Abstract
Background Prediction models for time-to-event outcomes are commonly used in biomedical research to obtain subject-specific probabilities that aid in making important clinical care decisions. There are several regression and machine learning methods for building these models that have been designed or modified to account for the censoring that occurs in time-to-event data. Discrete-time survival models, which have often been overlooked in the literature, provide an alternative approach for predictive modeling in the presence of censoring with limited loss in predictive accuracy. These models can take advantage of the range of nonparametric machine learning classification algorithms and their available software to predict survival outcomes. Methods Discrete-time survival models are applied to a person-period data set to predict the hazard of experiencing the failure event in pre-specified time intervals. This framework allows for any binary classification method to be applied to predict these conditional survival probabilities. Using time-dependent performance metrics that account for censoring, we compare the predictions from parametric and machine learning classification approaches applied within the discrete time-to-event framework to those from continuous-time survival prediction models. We outline the process for training and validating discrete-time prediction models, and demonstrate its application using the open-source R statistical programming environment. Results Using publicly available data sets, we show that some discrete-time prediction models achieve better prediction performance than the continuous-time Cox proportional hazards model. Random survival forests, a machine learning algorithm adapted to survival data, also had improved performance compared to the Cox model, but was sometimes outperformed by the discrete-time approaches. 
In comparing the binary classification methods in the discrete time-to-event framework, the relative performance of the different methods varied depending on the data set. Conclusions We present a guide for developing survival prediction models using discrete-time methods and assessing their predictive performance with the aim of encouraging their use in medical research settings. These methods can be applied to data sets that have continuous time-to-event outcomes and multiple clinical predictors. They can also be extended to accommodate new binary classification algorithms as they become available. We provide R code for fitting discrete-time survival prediction models in a GitHub repository. Supplementary Information The online version contains supplementary material available at (10.1186/s12874-022-01679-6).
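The person-period construction described in this abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names `to_person_period` and `survival_from_hazards`, the interval edges, and the toy records are all hypothetical. It also assumes one common convention, namely that a subject censored partway through an interval still contributes a row with outcome 0 for that interval.

```python
def to_person_period(subjects, interval_edges):
    """Expand (id, time, event) records into one row per subject per interval at risk.

    interval_edges: increasing right endpoints, e.g. [1, 2, 3] defines the
    intervals (0, 1], (1, 2], (2, 3].
    """
    rows = []
    for subject_id, time, event in subjects:
        start = 0
        for k, end in enumerate(interval_edges, start=1):
            if time <= start:
                break  # subject is no longer at risk
            in_interval = start < time <= end
            rows.append({
                "id": subject_id,
                "interval": k,
                # y = 1 only in the interval where the event was observed
                "y": 1 if (event and in_interval) else 0,
            })
            if in_interval:
                break  # follow-up ends in this interval
            start = end
    return rows


def survival_from_hazards(hazards):
    """Turn per-interval hazards h_k into S_k = prod_{j<=k} (1 - h_j)."""
    surv, out = 1.0, []
    for h in hazards:
        surv *= 1.0 - h
        out.append(surv)
    return out


# Hypothetical records: subject "a" has the event at t=2.5, "b" is censored at t=1.5.
rows = to_person_period([("a", 2.5, 1), ("b", 1.5, 0)], interval_edges=[1, 2, 3])
for r in rows:
    print(r)
print(survival_from_hazards([0.1, 0.2]))
```

Any binary classifier can then be fitted to the `y` column, with the interval index and covariates as features, to estimate the hazards; `survival_from_hazards` converts those predicted hazards into a survival curve.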
Affiliation(s)
- Krithika Suresh
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, USA.
| | - Cameron Severn
- Child Health Biostatistics Core Department of Pediatrics, Section of Endocrinology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, USA
| | - Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, USA
| |
|
26
|
Yue S, Li S, Huang X, Liu J, Hou X, Zhao Y, Niu D, Wang Y, Tan W, Wu J. Machine learning for the prediction of acute kidney injury in patients with sepsis. J Transl Med 2022; 20:215. [PMID: 35562803 PMCID: PMC9101823 DOI: 10.1186/s12967-022-03364-0] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 11/23/2021] [Accepted: 03/26/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Acute kidney injury (AKI) is the most common and serious complication of sepsis, accompanied by high mortality and disease burden. The early prediction of AKI is critical for timely intervention and ultimately improves prognosis. This study aims to establish and validate predictive models based on novel machine learning (ML) algorithms for AKI in critically ill patients with sepsis. METHODS Data of patients with sepsis were extracted from the Medical Information Mart for Intensive Care III (MIMIC-III) database. Feature selection was performed using the Boruta algorithm. ML algorithms such as logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM), decision tree, random forest, Extreme Gradient Boosting (XGBoost), and artificial neural network (ANN) were applied for model construction using tenfold cross-validation. The performances of these models were assessed in terms of discrimination, calibration, and clinical application. Moreover, the discrimination of the ML-based models was compared with that of the Sequential Organ Failure Assessment (SOFA) and the customized Simplified Acute Physiology Score (SAPS) II model. RESULTS A total of 3176 critically ill patients with sepsis were included for analysis, of whom 2397 (75.5%) developed AKI during hospitalization. A total of 36 variables were selected for model construction. The LR, KNN, SVM, decision tree, random forest, ANN, XGBoost, SOFA and SAPS II score models achieved areas under the receiver operating characteristic curve of 0.7365, 0.6637, 0.7353, 0.7492, 0.7787, 0.7547, 0.821, 0.6457 and 0.7015, respectively. The XGBoost model had the best predictive performance in terms of discrimination, calibration, and clinical application among all models. CONCLUSION The ML models can be reliable tools for predicting AKI in septic patients. 
The XGBoost model has the best predictive performance, which can be used to assist clinicians in identifying high-risk patients and implementing early interventions to reduce mortality.
Affiliation(s)
- Suru Yue
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Shasha Li
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Xueying Huang
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Jie Liu
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Xuefei Hou
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Yumei Zhao
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Dongdong Niu
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Yufeng Wang
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China
| | - Wenkai Tan
- Department of Gastroenterology, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.
| | - Jiayuan Wu
- Clinical Research Service Center, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China. .,Collaborative Innovation Engineering Technology Research Center of Clinical Medical Big Data Cloud Service in Medical Consortium of West Guangdong Province, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, 524001, Guangdong Province, China.
| |
|
27
|
Using Explainable Machine Learning to Explore the Impact of Synoptic Reporting on Prostate Cancer. Algorithms 2022. [DOI: 10.3390/a15020049] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
Machine learning (ML) models have proven to be an attractive alternative to traditional statistical methods in oncology. However, they are often regarded as black boxes, hindering their adoption for answering real-life clinical questions. In this paper, we show a practical application of explainable machine learning (XML). Specifically, we explored the effect that synoptic reporting (SR; i.e., reports where data elements are presented as discrete data items) in Pathology has on the survival of a population of 14,878 Dutch prostate cancer patients. We compared the performance of a Cox Proportional Hazards model (CPH) against that of an eXtreme Gradient Boosting model (XGB) in predicting patient ranked survival. We found that the XGB model (c-index = 0.67) performed significantly better than the CPH (c-index = 0.58). Moreover, we used Shapley Additive Explanations (SHAP) values to generate a quantitative mathematical representation of how features—including usage of SR—contributed to the models’ output. The XGB model in combination with SHAP visualizations revealed interesting interaction effects between SR and the rest of the most important features. These results hint that SR has a moderate positive impact on predicted patient survival. Moreover, adding an explainability layer to predictive ML models can open their black box, making them more accessible and easier to understand by the user. This can make XML-based techniques appealing alternatives to the classical methods used in oncological research and in health care in general.
|
28
|
Yang CH, Chen YS, Moi SH, Chen JB, Wang L, Chuang LY. Machine learning approaches for the mortality risk assessment of patients undergoing hemodialysis. Ther Adv Chronic Dis 2022; 13:20406223221119617. [PMID: 36062293 PMCID: PMC9434675 DOI: 10.1177/20406223221119617] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/06/2021] [Accepted: 07/27/2022] [Indexed: 11/15/2022] Open
Abstract
Introduction: Mortality is a major primary endpoint for long-term hemodialysis (HD) patients. The clinical status of HD patients is generally assessed through longitudinal clinical observations such as monthly laboratory examinations and physical examinations. Methods: A total of 829 HD patients who met the inclusion criteria were analyzed. All patients were tracked from January 2009 to December 2013. This study applied fully adjusted Cox proportional hazards (CoxPH), stepwise-CoxPH, random survival forest (RSF)-CoxPH, and whale optimization algorithm (WOA)-CoxPH models for all-cause mortality risk assessment in HD patients. The performance of the proposed CoxPH model selections was evaluated using the concordance index. Results: The WOA-CoxPH model obtained the highest concordance index compared with the RSF-CoxPH and typical selection CoxPH models. The eight significant parameters obtained from the WOA-CoxPH model, including age, diabetes mellitus (DM), hemoglobin (Hb), albumin, creatinine (Cr), potassium (K), Kt/V, and cardiothoracic ratio, also showed significant survival differences between low- and high-risk characteristics in single-factor analysis. By integrating the risk characteristics of each single factor, patients who had seven or more risk characteristics among the eight selected parameters were classified as the high-risk subgroup, and the remaining patients as the low-risk subgroup. The integrated low- and high-risk subgroups showed a greater discrepancy than each single risk factor selected by the WOA-CoxPH model. Conclusion: The study findings revealed that the WOA-CoxPH model could provide better risk assessment performance than the RSF-CoxPH and typical selection CoxPH models in HD patients. In summary, patients who had seven or more risk characteristics among the eight selected parameters were at potentially increased risk of all-cause mortality in the HD population.
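The dichotomization rule stated in this abstract (seven or more of eight risk characteristics defines the high-risk subgroup) can be written as a short Python sketch. The function name `risk_subgroup` and the dictionary keys are hypothetical, and the per-parameter cutoffs that decide whether a value counts as a risk characteristic are not given in the abstract, so the flags are taken as inputs here.

```python
# The eight parameters named in the abstract (key spellings are illustrative).
PARAMETERS = (
    "age", "diabetes_mellitus", "hemoglobin", "albumin",
    "creatinine", "potassium", "kt_v", "cardiothoracic_ratio",
)


def risk_subgroup(risk_flags, threshold=7):
    """Classify a patient from per-parameter risk flags.

    risk_flags: dict mapping each parameter name to True if the patient has
    the high-risk characteristic for that parameter (cutoffs are assumed to
    have been applied upstream; they are not stated in the abstract).
    """
    missing = [p for p in PARAMETERS if p not in risk_flags]
    if missing:
        raise ValueError(f"missing risk flags: {missing}")
    count = sum(bool(risk_flags[p]) for p in PARAMETERS)
    return "high" if count >= threshold else "low"


# Hypothetical patient with seven of the eight risk characteristics.
patient = {p: True for p in PARAMETERS}
patient["potassium"] = False
print(risk_subgroup(patient))
```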
Affiliation(s)
- Cheng-Hong Yang
  - Department of Information Management, Tainan University of Technology, Tainan
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung
  - Biomedical Engineering, Kaohsiung Medical University, Kaohsiung
  - School of Dentistry, Kaohsiung Medical University, Kaohsiung
  - Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung
- Yin-Syuan Chen
  - Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung
- Sin-Hua Moi
  - Center of Cancer Program Development, E-Da Cancer Hospital, I-Shou University, Kaohsiung 82445
- Jin-Bor Chen
  - Department of Neurology, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301
- Lin Wang
  - Department of Nephrology, Dalian University Affiliated Xinhua Hospital, Dalian, 116001, China
- Li-Yeh Chuang
  - Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung 84004
|
29
|
A comparative study of forest methods for time-to-event data: variable selection and predictive performance. BMC Med Res Methodol 2021; 21:193. [PMID: 34563138 PMCID: PMC8465777 DOI: 10.1186/s12874-021-01386-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Accepted: 09/02/2021] [Indexed: 11/17/2022]
Abstract
Background: As a popular method in machine learning, the forests approach is an attractive alternative to the Cox model. Random survival forests (RSF) is the most popular survival forests method, but it has drawbacks, such as a selection bias towards covariates with many possible split points. Conditional inference forests (CIF) is known to reduce this selection bias via a two-step split procedure that implements hypothesis tests and separates variable selection from splitting, but it is computationally expensive. The recently proposed random forests with maximally selected rank statistics (MSR-RF) appears to be a substantial improvement on RSF and CIF. Methods: In this paper we used a simulation study and a real data application to compare the prediction and variable selection performance of three survival forests methods: RSF, CIF and MSR-RF. To evaluate variable selection performance, we combined all simulations to calculate how frequently the correct variables ranked at the top of the variable importance measures, where a higher frequency means better selection ability. We used the Integrated Brier Score (IBS) and c-index to measure the prediction accuracy of all three methods. The smaller the IBS value, the better the prediction. Results: Simulations show that the three forests methods differ slightly in prediction performance. MSR-RF and RSF might perform better than CIF when there are only continuous or binary variables in the datasets. For variable selection performance, when there are multiple categorical variables in the datasets, the selection frequency of RSF seems to be the lowest in most cases. MSR-RF and CIF have higher selection rates, and CIF performs well especially with the interaction term. The fact that the degree of correlation among the variables has little effect on the selection frequency indicates that the three forest methods can handle correlated data. When there are only continuous variables in the datasets, MSR-RF performs better. When there are only binary variables in the datasets, RSF and MSR-RF have more advantages than CIF. When the variable dimension increases, MSR-RF and RSF seem to be more robust than CIF. Conclusions: All three methods show advantages in prediction and variable selection performance under different situations. The recently proposed MSR-RF methodology has practical value and is well worth popularizing. It is important to identify the appropriate method in real use according to the research aim and the nature of the covariates. Supplementary Information: The online version contains supplementary material available at 10.1186/s12874-021-01386-8.
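The c-index used as a prediction metric in this abstract can be illustrated with a small pure-Python sketch of Harrell's concordance index for right-censored data. This is a simplified version under stated assumptions, not the authors' implementation: the function name `concordance_index` and the toy data are hypothetical, pairs with tied survival times are skipped, and tied risk scores count as half concordant.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    subject who failed earlier was assigned the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored.
times = [1.0, 2.0, 3.0, 4.0]
events = [True, True, False, True]
perfect_risks = [4.0, 3.0, 2.0, 1.0]  # higher risk for earlier failure
print(concordance_index(times, events, perfect_risks))
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0.0 to a fully reversed ranking.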
|
30
|
Quist J, Taylor L, Staaf J, Grigoriadis A. Random Forest Modelling of High-Dimensional Mixed-Type Data for Breast Cancer Classification. Cancers (Basel) 2021; 13:991. [PMID: 33673506 PMCID: PMC7956671 DOI: 10.3390/cancers13050991] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/28/2020] [Revised: 02/16/2021] [Accepted: 02/20/2021] [Indexed: 11/16/2022]
Abstract
Advances in high-throughput technologies encourage the generation of large amounts of multiomics data to investigate complex diseases, including breast cancer. Given that the aetiologies of such diseases extend beyond a single biological entity, and that essential biological information can be carried by all data regardless of data type, integrative analyses are needed to identify clinically relevant patterns. To facilitate such analyses, we present a permutation-based framework for random forest methods which simultaneously allows the unbiased integration of mixed-type data and assessment of relative feature importance. The performance of the approach was evaluated through simulation studies and machine learning datasets. The results showed minimal multicollinearity and limited overfitting. To further assess the performance, the permutation-based framework was applied to high-dimensional mixed-type data from two independent breast cancer cohorts. Reproducibility and robustness of our approach were demonstrated by the concordance in relative feature importance between the cohorts, along with consistencies in clustering profiles. One of the identified clusters was shown to be prognostic for clinical outcome after standard-of-care adjuvant chemotherapy and outperformed current intrinsic molecular breast cancer classifications.
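The abstract does not describe the details of its permutation-based framework, so the sketch below shows only the generic idea of permutation feature importance: shuffle one feature column, re-score the model, and treat the drop in accuracy as that feature's importance. It is not the authors' method. The names `permutation_importance` and `toy_model`, the accuracy metric, and the toy data are all hypothetical.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(predict: Callable[[Row], bool], X: Sequence[Row], y: Sequence[bool]) -> float:
    """Fraction of rows whose prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(
    predict: Callable[[Row], bool],
    X: Sequence[Row],
    y: Sequence[bool],
    n_repeats: int = 5,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        scores = []
        for _ in range(n_repeats):
            column = [row[col] for row in X]
            rng.shuffle(column)  # break the link between this feature and y
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, column)]
            scores.append(accuracy(predict, X_perm, y))
        importances.append(baseline - sum(scores) / n_repeats)
    return importances


def toy_model(row: Row) -> bool:
    """Hypothetical fitted model that uses only the first feature."""
    return row[0] > 0.5


# Toy data: feature 0 determines the label, feature 1 is unrelated.
n = 40
X = [[i / n, (i * 7 % n) / n] for i in range(n)]
y = [row[0] > 0.5 for row in X]
importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy model ignores the second feature, shuffling it leaves predictions unchanged and its importance is zero, while shuffling the first feature degrades accuracy.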
Affiliation(s)
- Jelmar Quist
  - Cancer Bioinformatics, Cancer Centre at Guy’s Hospital, King’s College London, London SE1 9RT, UK
  - School of Cancer and Pharmaceutical Sciences, King’s College London, London SE1 1UL, UK
  - Breast Cancer Now Research Unit, Cancer Centre at Guy’s Hospital, King’s College London, London SE1 9RT, UK
- Lawson Taylor
  - Cancer Bioinformatics, Cancer Centre at Guy’s Hospital, King’s College London, London SE1 9RT, UK
  - School of Cancer and Pharmaceutical Sciences, King’s College London, London SE1 1UL, UK
- Johan Staaf
  - Division of Oncology, Department of Clinical Sciences Lund, Lund University, Medicon Village, SE-223 81 Lund, Sweden
- Anita Grigoriadis
  - Cancer Bioinformatics, Cancer Centre at Guy’s Hospital, King’s College London, London SE1 9RT, UK
  - School of Cancer and Pharmaceutical Sciences, King’s College London, London SE1 1UL, UK
  - Breast Cancer Now Research Unit, Cancer Centre at Guy’s Hospital, King’s College London, London SE1 9RT, UK
|