1
Gonzalez R, Saha A, Campbell CJ, Nejat P, Lokker C, Norgan AP. Seeing the random forest through the decision trees. Supporting learning health systems from histopathology with machine learning models: Challenges and opportunities. J Pathol Inform 2024; 15:100347. [PMID: 38162950 PMCID: PMC10755052 DOI: 10.1016/j.jpi.2023.100347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/06/2023] [Accepted: 11/01/2023] [Indexed: 01/03/2024] Open
Abstract
This paper discusses some overlooked challenges faced when working with machine learning models for histopathology and presents a novel opportunity to support "Learning Health Systems" with them. Initially, the authors elaborate on these challenges after separating them according to their mitigation strategies: those that need innovative approaches, time, or future technological capabilities and those that require a conceptual reappraisal from a critical perspective. Then, a novel opportunity to support "Learning Health Systems" by integrating hidden information extracted by ML models from digitalized histopathology slides with other healthcare big data is presented.
Affiliation(s)
- Ricardo Gonzalez
  - DeGroote School of Business, McMaster University, Hamilton, Ontario, Canada
  - Division of Computational Pathology and Artificial Intelligence, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
- Ashirbani Saha
  - Department of Oncology, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
  - Escarpment Cancer Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada
- Clinton J.V. Campbell
  - William Osler Health System, Brampton, Ontario, Canada
  - Department of Pathology and Molecular Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
- Peyman Nejat
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, United States
- Cynthia Lokker
  - Health Information Research Unit, Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Andrew P. Norgan
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
2
Liu S, Chen YX, Dai B, Chen L. Development and Validation of a Novel Machine Learning Model to Predict the Survival of Patients with Gastrointestinal Neuroendocrine Neoplasms. Neuroendocrinology 2024; 114:733-748. [PMID: 38710164 DOI: 10.1159/000539187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 05/01/2024] [Indexed: 05/08/2024]
Abstract
INTRODUCTION: Well-calibrated models for personalized prognostication of patients with gastrointestinal neuroendocrine neoplasms (GINENs) are limited. This study aimed to develop and validate a machine-learning model to predict the survival of patients with GINENs.
METHODS: The following models were compared: oblique random survival forest (ORSF), Cox proportional hazards model, Cox model with least absolute shrinkage and selection operator penalization, CoxBoost, Survival Gradient Boosting Machine, Extreme Gradient Boosting survival regression, DeepHit, DeepSurv, DNNSurv, logistic-hazard model, and PC-hazard model. We further tuned hyperparameters and selected variables for the best-performing model, the ORSF. The final ORSF model was then validated.
RESULTS: A total of 43,444 patients with GINENs were included. The median (interquartile range) survival time was 53 (19-102) months. The ORSF model performed best; age, histology, M stage, tumor size, primary tumor site, sex, tumor number, surgery, lymph nodes removed, N stage, race, and grade were ranked as important variables. However, chemotherapy and radiotherapy were not necessary for the ORSF model. The ORSF model had an overall C-index of 0.86 (95% confidence interval, 0.85-0.87). The areas under the receiver operating characteristic curve at 1, 3, 5, and 10 years were 0.91, 0.89, 0.87, and 0.80, respectively. Decision curve analysis showed that the ORSF model had greater clinical usefulness than the American Joint Committee on Cancer stage. A nomogram and an online tool were provided.
CONCLUSION: The machine learning ORSF model could precisely predict the survival of patients with GINENs, with the ability to identify patients at high risk of death, and could probably guide clinical practice.
Affiliation(s)
- Si Liu
  - Department of Pediatrics, Shengjing Hospital of China Medical University, Shenyang, China
- Yun-Xiang Chen
  - Department of Library, Shengjing Hospital of China Medical University, Shenyang, China
- Bing Dai
  - Department of Pediatrics, Shengjing Hospital of China Medical University, Shenyang, China
- Li Chen
  - Department of Pediatrics, Shengjing Hospital of China Medical University, Shenyang, China
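The decision curve analysis mentioned in the Liu et al. abstract above rests on the net-benefit quantity, which weighs true positives against false positives at a chosen risk threshold. The following is a minimal sketch for a binary outcome at a fixed time horizon, not the authors' time-dependent survival implementation; the function name `net_benefit` and the toy data are assumptions made for illustration.

```python
from typing import Sequence


def net_benefit(probs: Sequence[float], outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk >= threshold.

    Standard binary-outcome formula: TP/n - (FP/n) * pt / (1 - pt).
    Hypothetical helper; censoring is ignored in this simplified sketch.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    n = len(outcomes)
    # Count patients flagged as high risk, split by observed outcome.
    true_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    # False positives are down-weighted by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data (assumed): predicted risks and observed events.
    risks = [0.9, 0.8, 0.3, 0.1]
    events = [1, 0, 1, 0]
    for pt in (0.2, 0.5):
        print(pt, net_benefit(risks, events, pt))
```

A decision curve plots this value over a range of thresholds for each model and for the treat-all and treat-none strategies; the model with the higher curve over the clinically relevant thresholds is judged more useful.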
3
Zhang D, Luan J, Liu B, Yang A, Lv K, Hu P, Han X, Yu H, Shmuel A, Ma G, Zhang C. Comparison of MRI radiomics-based machine learning survival models in predicting prognosis of glioblastoma multiforme. Front Med (Lausanne) 2023; 10:1271687. [PMID: 38098850 PMCID: PMC10720716 DOI: 10.3389/fmed.2023.1271687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 11/15/2023] [Indexed: 12/17/2023] Open
Abstract
Objective: To compare the performance of radiomics-based machine learning survival models in predicting the prognosis of glioblastoma multiforme (GBM) patients.
Methods: 131 GBM patients were included in our study. The traditional Cox proportional-hazards (CoxPH) model and four machine learning models (SurvivalTree, random survival forest (RSF), DeepSurv, DeepHit) were constructed, and the performance of the five models was evaluated using the C-index.
Results: After screening, 1792 radiomics features were obtained. Applying least absolute shrinkage and selection operator (LASSO) regression yielded the seven radiomics features most strongly related to prognosis. The CoxPH model demonstrated that age (HR = 1.576, p = 0.037), Karnofsky performance status (KPS) score (HR = 1.890, p = 0.006), radiomics risk score (HR = 3.497, p = 0.001), and radiomics risk level (HR = 1.572, p = 0.043) were associated with poorer prognosis. The DeepSurv model performed best among the five models, with C-indices of 0.882 and 0.732 for the training and test sets, respectively. The other four models performed worse: CoxPH (0.663 training set / 0.635 test set), SurvivalTree (0.702/0.655), RSF (0.735/0.667), DeepHit (0.608/0.560).
Conclusion: This study confirmed the superior performance of deep learning algorithms based on radiomics relative to the traditional method in predicting the overall survival of GBM patients; specifically, the DeepSurv model showed the best predictive ability.
Affiliation(s)
- Di Zhang
  - Department of Radiology, Liaocheng People’s Hospital, Shandong First Medical University & Shandong Academy of Medical Sciences, Liaocheng, Shandong, China
- Jixin Luan
  - China-Japan Friendship Hospital (Institute of Clinical Medical Sciences), Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
  - Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Bing Liu
  - China-Japan Friendship Hospital (Institute of Clinical Medical Sciences), Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
  - Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Aocai Yang
  - China-Japan Friendship Hospital (Institute of Clinical Medical Sciences), Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
  - Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Kuan Lv
  - Peking University China-Japan Friendship School of Clinical Medicine, Beijing, China
- Pianpian Hu
  - Peking University China-Japan Friendship School of Clinical Medicine, Beijing, China
- Xiaowei Han
  - Department of Radiology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
- Hongwei Yu
  - Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Amir Shmuel
  - McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, QC, Canada
  - Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
- Guolin Ma
  - Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Chuanchen Zhang
  - Department of Radiology, Liaocheng People’s Hospital, Shandong First Medical University & Shandong Academy of Medical Sciences, Liaocheng, Shandong, China
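The C-index used to rank the five models in the Zhang et al. entry above measures how often a model assigns the higher risk to the patient who fails first. Below is a minimal sketch of Harrell's C-index for right-censored data, assuming a higher score means a higher predicted risk; the function name `concordance_index` and the toy data are illustrative assumptions, not the study's code.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risk_scores: Sequence[float]
) -> float:
    """Harrell's C-index (hypothetical helper, O(n^2) pairwise version).

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's observed or censored time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier failure got the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (assumed): months of follow-up, event flags, predicted risks.
    t = [2, 4, 6, 8]
    e = [1, 1, 0, 1]
    r = [0.9, 0.7, 0.8, 0.1]
    print(concordance_index(t, e, r))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the gap between training and test values reported above is worth reading alongside the test value itself.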
4
Stankevičiūtė K, Woillard JB, Peck RW, Marquet P, van der Schaar M. Bridging the Worlds of Pharmacometrics and Machine Learning. Clin Pharmacokinet 2023; 62:1551-1565. [PMID: 37803104 DOI: 10.1007/s40262-023-01310-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 09/19/2023] [Indexed: 10/08/2023]
Abstract
Precision medicine requires individualized modeling of disease and drug dynamics, with machine learning-based computational techniques gaining increasing popularity. The complexity of either field, however, makes current pharmacological problems opaque to machine learning practitioners, and state-of-the-art machine learning methods inaccessible to pharmacometricians. To help bridge the two worlds, we provide an introduction to current problems and techniques in pharmacometrics that ranges from pharmacokinetic and pharmacodynamic modeling to pharmacometric simulations, model-informed precision dosing, and systems pharmacology, and review some of the machine learning approaches to address them. We hope this will facilitate collaboration between experts, with the complementary strengths of principled pharmacometric modeling and the flexibility of machine learning leading to synergistic effects in pharmacological applications.
Affiliation(s)
- Kamilė Stankevičiūtė
  - Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
- Jean-Baptiste Woillard
  - INSERM U1248 P&T, University of Limoges, 2 rue du Pr Descottes, 87000, Limoges, France
  - Department of Pharmacology and Toxicology, CHU Limoges, Limoges, France
- Richard W Peck
  - Department of Pharmacology and Therapeutics, University of Liverpool, Liverpool, UK
  - Pharma Research and Development, Roche Innovation Center, Basel, Switzerland
- Pierre Marquet
  - INSERM U1248 P&T, University of Limoges, 2 rue du Pr Descottes, 87000, Limoges, France
  - Department of Pharmacology and Toxicology, CHU Limoges, Limoges, France
- Mihaela van der Schaar
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
  - The Alan Turing Institute, London, UK
5
Hsiao YC, Kuo CY, Lin FJ, Wu YW, Lin TH, Yeh HI, Chen JW, Wu CC. Machine Learning Models for ASCVD Risk Prediction in an Asian Population - How to Validate the Model is Important. Acta Cardiologica Sinica 2023; 39:901-912. [PMID: 38022427 PMCID: PMC10646597 DOI: 10.6515/acs.202311_39(6).20230528a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 05/28/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Atherosclerotic cardiovascular disease (ASCVD) is prevalent worldwide, including in Taiwan; however, widely accepted tools to assess the risk of ASCVD are lacking in Taiwan. Machine learning models are potentially useful for risk evaluation. In this study, we used two cohorts to test the feasibility of machine learning with transfer learning for developing an ASCVD risk prediction model in Taiwan.
Methods: Two multi-center observational registry cohorts, T-SPARCLE and T-PPARCLE, were used in this study. The variables selected were based on European, U.S., and Asian guidelines. Both registries recorded the ASCVD outcomes of the patients. Ten-fold validation and temporal validation methods were used to evaluate the performance of the binary classification analysis [prediction of major adverse cardiovascular (CV) events in one year]. Time-to-event analyses were also performed.
Results: In the binary classification analysis, eXtreme Gradient Boosting (XGBoost) and random forest had the best performance, with areas under the receiver operating characteristic curve (AUC-ROC) of 0.72 (0.68-0.76) and 0.73 (0.69-0.77), respectively, although they were not significantly better than the other models. Temporal validation was also performed, and the data showed significant differences in the distribution of various features and in the event rate. In temporal validation, the AUC-ROC of XGBoost dropped to 0.66 (0.59-0.73) and that of random forest dropped to 0.69 (0.62-0.76), and both became numerically worse than the logistic regression model. In the time-to-event analysis, most models had a concordance index of around 0.70.
Conclusions: Machine learning models with appropriate transfer learning may be a useful tool for the development of CV risk prediction models and may help improve patient care in the future.
Affiliation(s)
- Yu-Chung Hsiao
  - Department of Internal Medicine, National Taiwan University Hospital
- Chen-Yuan Kuo
  - Center for Healthy Longevity and Aging Sciences, National Yang Ming Chiao Tung University
- Fang-Ju Lin
  - Graduate Institute of Clinical Pharmacy & School of Pharmacy, College of Medicine, National Taiwan University
  - Department of Pharmacy, National Taiwan University Hospital, Taipei
- Yen-Wen Wu
  - Division of Cardiology, Cardiovascular Medical Center, Far Eastern Memorial Hospital, New Taipei City
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei
  - Graduate Institute of Medicine, Yuan Ze University, Taoyuan
- Tsung-Hsien Lin
  - Division of Cardiology, Department of Internal Medicine, Kaohsiung Medical University Hospital
  - Faculty of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung
- Hung-I Yeh
  - MacKay Memorial Hospital, MacKay Medical College
- Jaw-Wen Chen
  - Department of Medical Research and Education, Taipei Veterans General Hospital
- Chau-Chung Wu
  - Department of Internal Medicine, National Taiwan University Hospital
  - Graduate Institute of Medical Education & Bioethics, College of Medicine, National Taiwan University, Taipei, Taiwan
6
Zhang C, Li Z, Yang Z, Huang B, Hou Y, Chen Z. A Dynamic Prediction Model Supporting Individual Life Expectancy Prediction Based on Longitudinal Time-Dependent Covariates. IEEE J Biomed Health Inform 2023; 27:4623-4632. [PMID: 37471185 DOI: 10.1109/jbhi.2023.3292475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2023]
Abstract
In the field of clinical chronic diseases, common prediction results (such as survival rate) and the effect-size measure hazard ratio (HR) are relative indicators that convey rather abstract information. However, clinicians and patients are more interested in simple and intuitive concepts of (survival) time, such as how long a patient may live or how much longer a patient in a treatment group will live. In addition, long follow-up generates longitudinal time-dependent covariate information, and patients want to know how long they will survive at each follow-up visit. In this study, based on a time-scale indicator, the restricted mean survival time (RMST), we proposed a dynamic RMST prediction model that considers longitudinal time-dependent covariates and uses joint model techniques. The model can describe the trajectory of longitudinal time-dependent covariates and predict the average survival times of patients at different time points (such as follow-up visits). Simulation studies using Monte Carlo cross-validation showed that the dynamic RMST prediction model was superior to the static RMST model. In addition, the dynamic RMST prediction model was applied to a primary biliary cirrhosis (PBC) population to dynamically predict the average survival times of the patients; the average C-index in internal validation reached 0.81, which was better than that of the static RMST regression. Therefore, the proposed dynamic RMST prediction model performs better in prediction and can provide a scientific basis for clinicians and patients to make clinical decisions.
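The restricted mean survival time (RMST) that the Zhang et al. entry above builds on is the area under the survival curve up to a horizon tau. The sketch below computes the plain, static version from a Kaplan-Meier estimate; it does not implement the dynamic joint-model approach the abstract proposes. The names `kaplan_meier` and `rmst` and the toy data are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival) steps at each distinct event time (hypothetical helper)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            steps.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return steps


def rmst(times: Sequence[float], events: Sequence[int], tau: float) -> float:
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle under the current step
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # final partial rectangle up to tau
    return area


if __name__ == "__main__":
    # Toy data (assumed): follow-up in years, 1 = death observed, 0 = censored.
    print(rmst([1, 2, 3, 4], [1, 0, 1, 1], tau=4.0))
```

The result reads directly as "expected years alive within the first tau years", which is the intuitive time-scale interpretation the abstract contrasts with hazard ratios.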
7
Hu S, Chang CP, Snyder J, Deshmukh V, Newman M, Date A, Galvao C, Porucznik CA, Gren LH, Sanchez A, Lloyd S, Haaland B, O'Neil B, Hashibe M. Comparing Active Surveillance and Watchful Waiting With Radical Treatment Using Machine Learning Models Among Patients With Prostate Cancer. JCO Clin Cancer Inform 2023; 7:e2300083. [PMID: 37988640 PMCID: PMC10681553 DOI: 10.1200/cci.23.00083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 07/20/2023] [Accepted: 09/13/2023] [Indexed: 11/23/2023] Open
Abstract
PURPOSE: In 2021, 59.6% of low-risk patients with prostate cancer were under active surveillance (AS) as their first course of treatment. However, few studies have investigated AS and watchful waiting (WW) separately. The objectives of this study were to develop and validate a population-level machine learning model for distinguishing AS and WW in the conservative treatment group, and to investigate initial cancer management trends from 2004 to 2017 and the risk of chronic diseases among patients with prostate cancer with different treatment modalities.
METHODS: In a cohort of 18,134 patients with prostate adenocarcinoma diagnosed between 2004 and 2017, 1,926 patients with available AS/WW information were analyzed using machine learning algorithms with 10-fold cross-validation. Models were evaluated using performance metrics and the Brier score. Cox proportional hazards models were used to estimate hazard ratios for chronic disease risk.
RESULTS: Logistic regression models achieved a test area under the receiver operating characteristic curve of 0.73, F-score of 0.79, accuracy of 0.71, and Brier score of 0.29, demonstrating good calibration, precision, and recall values. We noted a sharp increase in AS use between 2004 and 2016 among patients with low-risk prostate cancer and a moderate increase among intermediate-risk patients between 2008 and 2017. Compared with the AS group, radical treatment was associated with a lower risk of prostate cancer-specific mortality but higher risks of Alzheimer disease, anemia, glaucoma, hyperlipidemia, and hypertension.
CONCLUSION: A machine learning approach accurately distinguished AS and WW groups in conservative treatment in this decision analytical model study. Our results provide insight into the necessity to separate AS and WW in population-based studies.
Affiliation(s)
- Siqi Hu
  - Huntsman Cancer Institute, Salt Lake City, UT
  - Division of Public Health, Department of Family and Preventive Medicine, University of Utah School of Medicine, Salt Lake City, UT
- Chun-Pin Chang
  - Huntsman Cancer Institute, Salt Lake City, UT
  - Division of Public Health, Department of Family and Preventive Medicine, University of Utah School of Medicine, Salt Lake City, UT
- John Snyder
  - Intermountain Healthcare, Salt Lake City, UT
- Michael Newman
  - University of Utah Health Sciences Center, Salt Lake City, UT
- Ankita Date
  - Pedigree and Population Resource, Population Sciences, Huntsman Cancer Institute, Salt Lake City, UT
- Carlos Galvao
  - Pedigree and Population Resource, Population Sciences, Huntsman Cancer Institute, Salt Lake City, UT
- Christina A. Porucznik
  - Division of Public Health, Department of Family and Preventive Medicine, University of Utah School of Medicine, Salt Lake City, UT
- Lisa H. Gren
  - Division of Public Health, Department of Family and Preventive Medicine, University of Utah School of Medicine, Salt Lake City, UT
- Alejandro Sanchez
  - Division of Urology, Department of Surgery, University of Utah School of Medicine, Salt Lake City, UT
- Shane Lloyd
  - Department of Radiation Oncology, University of Utah School of Medicine, Salt Lake City, UT
- Benjamin Haaland
  - Huntsman Cancer Institute, Salt Lake City, UT
  - Department of Population Health Sciences, University of Utah School of Medicine, Salt Lake City, UT
- Brock O'Neil
  - Division of Urology, Department of Surgery, University of Utah School of Medicine, Salt Lake City, UT
- Mia Hashibe
  - Huntsman Cancer Institute, Salt Lake City, UT
  - Division of Public Health, Department of Family and Preventive Medicine, University of Utah School of Medicine, Salt Lake City, UT
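The Brier score reported in the Hu et al. entry above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values are better. A minimal sketch follows; the function name `brier_score` and the toy probabilities are assumptions for illustration and are unrelated to the study's data.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Hypothetical helper; 0.0 is a perfect forecast, 1.0 is the worst possible.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


if __name__ == "__main__":
    # Toy data (assumed): predicted probability of class 1, observed label.
    print(brier_score([0.9, 0.2, 0.6], [1, 0, 1]))
    # Reference point: a constant 0.5 forecast always scores 0.25.
    print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```

Because the score mixes calibration and discrimination, it is usually read against a reference forecast such as the constant 0.5 prediction or the observed event rate rather than in isolation.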
8
Jin L, Zhao Q, Fu S, Cao F, Hou B, Ma J. Development and validation of machine learning models to predict survival of patients with resected stage-III NSCLC. Front Oncol 2023; 13:1092478. [PMID: 36994203 PMCID: PMC10040845 DOI: 10.3389/fonc.2023.1092478] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/08/2022] [Accepted: 02/13/2023] [Indexed: 03/18/2023] Open
Abstract
Objective: To compare the performance of three machine learning algorithms with the tumor, node, and metastasis (TNM) staging system in survival prediction and to validate the individual adjuvant treatment recommendation plan based on the optimal model.
Methods: We trained and validated three machine learning survival models (a deep learning neural network, a random forest, and a Cox proportional hazards model) using data from patients with stage III NSCLC who underwent resection, drawn from the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) database from 2012 to 2017. The survival prediction performance of all models was assessed using the concordance index (C-index), and the averaged C-index was used for cross-validation. The optimal model was externally validated in an independent cohort from Shaanxi Provincial People’s Hospital. We then compared the performance of the optimal model with that of the TNM staging system. Finally, we developed a cloud-based recommendation system for adjuvant therapy that visualizes the survival curve of each treatment plan and deployed it on the internet.
Results: A total of 4617 patients were included in this study. The deep learning network predicted the survival of patients with resected stage III NSCLC more stably and accurately than the random survival forest and the Cox proportional hazards model on the internal test dataset (C-index = 0.834 vs. 0.678 vs. 0.640) and performed better than the TNM staging system (C-index = 0.820 vs. 0.650) in external validation. Patients who followed the reference from the recommendation system had superior survival compared with those who did not. The predicted 5-year survival curve for each adjuvant treatment plan could be accessed in the recommender system via a browser.
Conclusion: The deep learning model has several advantages over the linear model and the random forest model in prognostic prediction and treatment recommendation. This novel analytical approach may provide accurate predictions of individual survival and treatment recommendations for patients with resected stage III NSCLC.
Affiliation(s)
- Long Jin
  - Department of Radiation Oncology, Shaanxi Provincial People’s Hospital, Xi’an, China
- Qifan Zhao
  - School of Material Science & Engineering, Huazhong University of Science and Technology, Wuhan, China
- Shenbo Fu
  - Department of Radiation Oncology, Shaanxi Provincial Cancer Hospital, Xi’an, China
- Fei Cao
  - Department of Oncology, Shaanxi Provincial People’s Hospital, Xi’an, China
- Bin Hou
  - Department of Thoracic Surgery, Shaanxi Provincial People’s Hospital, Xi’an, China
- Jia Ma
  - Shaanxi Provincial People’s Hospital, Xi’an, China
  - *Correspondence: Jia Ma,