1
Moon H, Tran L, Lee A, Kwon T, Lee M. Prediction of Treatment Recommendations Via Ensemble Machine Learning Algorithms for Non-Small Cell Lung Cancer Patients in Personalized Medicine. Cancer Inform 2024; 23:11769351241272397. [PMID: 39421723 PMCID: PMC11483699 DOI: 10.1177/11769351241272397] [Received: 01/28/2024] [Accepted: 07/14/2024] [Indexed: 10/19/2024] Open
Abstract
Objectives: The primary goal of this research is to develop treatment-related genomic predictive markers for non-small cell lung cancer (NSCLC) by integrating various machine learning algorithms that recommend near-optimal individualized patient treatment for chemotherapy, in an effort to maximize efficacy or minimize treatment-related toxicity. This research can contribute toward developing a more refined, accurate and effective therapy accounting for specific patient needs. Methods: To accomplish our research goal, we implement ensemble learning algorithms: bagging with regularized Cox regression models, and nonparametric tree-based models via Random Survival Forests. A comprehensive meta-database was compiled from the NCBI Gene Expression Omnibus data repository for lung cancer patients to capture and utilize complex genomic patterns that can predict treatment outcomes more accurately. Results: The developed novel prediction algorithm demonstrates the ability to support complex clinical decision-making processes in the treatment of NSCLC. It effectively addresses patient heterogeneity, offering predictions that are both refined and personalized, improving the precision of chemotherapy regimens prescribed to eligible patients. Conclusion: This research should contribute to the substantial advancement of cancer treatments by improving the accuracy and efficacy of chemotherapy treatments for a targeted group of patients who need the right treatment. The integration of complex machine learning techniques with genomic data holds substantial potential to transform current cancer treatment paradigms by providing robust support in clinical decision-making.
Affiliation(s)
- Hojin Moon
  - Department of Mathematics and Statistics, California State University, Long Beach, Long Beach, CA, USA
- Lauren Tran
  - Department of Epidemiology, School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
- Andrew Lee
  - College of Chemistry, University of California, Berkeley, CA, USA
- Taeksoo Kwon
  - School of Information and Computer Science, University of California, Irvine, CA, USA
- Minho Lee
  - School of Math and Computer Science, Irvine Valley College, Irvine, CA, USA
2
Yang C, Xu J, Wang S, Wang Y, Zhang Y, Piao C. Machine learning to predict the early recurrence of intrahepatic cholangiocarcinoma: A systematic review and meta‑analysis. Oncol Lett 2024; 28:385. [PMID: 38966582 PMCID: PMC11222917 DOI: 10.3892/ol.2024.14518] [Received: 11/30/2023] [Accepted: 04/12/2024] [Indexed: 07/06/2024] Open
Abstract
The prediction of early recurrence of intrahepatic cholangiocarcinoma (ICC) has been widely investigated; however, the predictive value is currently insufficient. To determine the effectiveness of machine learning (ML) for the diagnosis of early recurrent ICC, particularly in comparison with clinical models, the present study aimed to determine which ML model had the best diagnostic performance for inpatients with recurrent ICC. To identify eligible studies, three electronic databases were searched from inception to November 2023. A pairwise meta-analysis was performed to evaluate diagnostic accuracy using a random-effects model. A network meta-analysis was performed to identify the most effective ML-based diagnostic model based on the surface under the cumulative ranking curve score. A total of 5 studies of acceptable quality containing 1,247 patients with ICC were included in the present study. The pairwise meta-analysis found that the diagnostic accuracy of ML-based models was greater than that of clinical models (surface under the cumulative ranking curve score closer to 1, with significant differences), providing initial evidence that ML-based models have greater diagnostic power than clinical models. According to the network meta-analysis, the nomogram performed the best, indicating that this ML model achieved the best diagnostic accuracy for patients with recurrent ICC. In conclusion, ML-based diagnostic models for patients with recurrent ICC performed better than the clinical model. The nomogram model ranked first among the models and is therefore recommended for patients with recurrent ICC.
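The surface under the cumulative ranking curve score mentioned in the abstract above summarizes, for each model in a network meta-analysis, how likely it is to rank near the top; it ranges from 0 (always last) to 1 (always first). A minimal sketch of the standard calculation follows. The function name and the rank probabilities in the usage lines are hypothetical illustrations and are not taken from the study.

```python
def sucra(rank_probs):
    """Surface under the cumulative ranking curve for one model.

    rank_probs[k] is the probability that the model holds rank k+1
    (rank 1 = best) among len(rank_probs) competing models.
    """
    n_models = len(rank_probs)
    if n_models < 2:
        raise ValueError("need at least two competing models")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")

    cumulative = 0.0  # P(rank <= j)
    total = 0.0       # running sum of the cumulative probabilities
    # Sum cumulative probabilities over ranks 1 .. n_models - 1.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (n_models - 1)


# Hypothetical rank probabilities for three models (illustration only).
example_scores = {
    "model_a": sucra([0.7, 0.2, 0.1]),
    "model_b": sucra([0.2, 0.5, 0.3]),
    "model_c": sucra([0.1, 0.3, 0.6]),
}
best_model = max(example_scores, key=example_scores.get)
```

A model that is certain to rank first scores 1 and one that is certain to rank last scores 0, which is why a score closer to 1 is read as better performance.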
Affiliation(s)
- Chao Yang
  - Information Construction Department, Department of Ethnic Culture and Vocational Education, Liaoning National Normal College, Shenyang, Liaoning 110032, P.R. China
- Jianhui Xu
  - Information Construction Department, Department of Ethnic Culture and Vocational Education, Liaoning National Normal College, Shenyang, Liaoning 110032, P.R. China
- Shuai Wang
  - Department of Clinical Pharmacy, Shenyang Pharmaceutical University, Shenyang, Liaoning 110016, P.R. China
- Ying Wang
  - Information Construction Department, Department of Ethnic Culture and Vocational Education, Liaoning National Normal College, Shenyang, Liaoning 110032, P.R. China
- Yingshi Zhang
  - Department of Clinical Pharmacy, Shenyang Pharmaceutical University, Shenyang, Liaoning 110016, P.R. China
- Chengzhe Piao
  - Information Construction Department, Department of Ethnic Culture and Vocational Education, Liaoning National Normal College, Shenyang, Liaoning 110032, P.R. China
3
Teshale AB, Htun HL, Vered M, Owen AJ, Freak-Poli R. A Systematic Review of Artificial Intelligence Models for Time-to-Event Outcome Applied in Cardiovascular Disease Risk Prediction. J Med Syst 2024; 48:68. [PMID: 39028429 PMCID: PMC11271333 DOI: 10.1007/s10916-024-02087-7] [Received: 03/28/2024] [Accepted: 07/09/2024] [Indexed: 07/20/2024]
Abstract
Artificial intelligence (AI)-based predictive models for early detection of cardiovascular disease (CVD) risk are increasingly being utilised. However, AI-based risk prediction models that account for right-censored data have been overlooked. This systematic review (PROSPERO protocol CRD42023492655) includes 33 studies that utilised machine learning (ML) and deep learning (DL) models for survival outcome in CVD prediction. We provided details on the employed ML and DL models, eXplainable AI (XAI) techniques, and type of included variables, with a focus on social determinants of health (SDoH) and gender stratification. Approximately half of the studies were published in 2023, with the majority from the United States. Random Survival Forest (RSF), Survival Gradient Boosting models, and Penalised Cox models were the most frequently employed ML models. DeepSurv was the most frequently employed DL model. DL models were better at predicting CVD outcomes than ML models. Permutation-based feature importance and Shapley values were the most utilised XAI methods for explaining AI models. Moreover, only one in five studies performed gender-stratification analysis and very few incorporated the wide range of SDoH factors in their prediction model. In conclusion, the evidence indicates that RSF and DeepSurv models are currently the optimal models for predicting CVD outcomes. This study also highlights the better predictive ability of DL survival models compared to ML models. Future research should ensure the appropriate interpretation of AI models, accounting for SDoH and gender stratification, as gender plays a significant role in CVD occurrence.
Affiliation(s)
- Achamyeleh Birhanu Teshale
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
  - Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Htet Lin Htun
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Mor Vered
  - Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Alice J Owen
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Rosanne Freak-Poli
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
  - Stroke and Ageing Research, Department of Medicine, School of Clinical Sciences at Monash Health, Monash University, Melbourne, VIC, Australia
4
Astley JR, Reilly JM, Robinson S, Wild JM, Hatton MQ, Tahir BA. Explainable deep learning-based survival prediction for non-small cell lung cancer patients undergoing radical radiotherapy. Radiother Oncol 2024; 193:110084. [PMID: 38244779 DOI: 10.1016/j.radonc.2024.110084] [Received: 08/03/2023] [Revised: 12/20/2023] [Accepted: 01/02/2024] [Indexed: 01/22/2024]
Abstract
BACKGROUND AND PURPOSE: Survival is frequently assessed using Cox proportional hazards (CPH) regression; however, CPH may be too simplistic as it assumes a linear relationship between covariables and the outcome. Alternative, non-linear machine learning (ML)-based approaches, such as random survival forests (RSFs) and, more recently, deep learning (DL) have been proposed; however, these techniques are largely black-box in nature, limiting explainability. We compared CPH, RSF and DL to predict overall survival (OS) of non-small cell lung cancer (NSCLC) patients receiving radiotherapy using pre-treatment covariables. We employed explainable techniques to provide insights into the contribution of each covariable on OS prediction. MATERIALS AND METHODS: The dataset contained 471 stage I-IV NSCLC patients treated with radiotherapy. We built CPH, RSF and DL OS prediction models using several baseline covariable combinations. 10-fold Monte-Carlo cross-validation was employed with a split of 70%:10%:20% for training, validation and testing, respectively. We primarily evaluated performance using the concordance index (C-index) and integrated Brier score (IBS). Local interpretable model-agnostic explanation (LIME) values, adapted for use in survival analysis, were computed for each model. RESULTS: The DL method exhibited a significantly improved C-index of 0.670 compared to the CPH and a significantly improved IBS of 0.121 compared to the CPH and RSF approaches. LIME values suggested that, for the DL method, the three most important covariables in OS prediction were stage, administration of chemotherapy and oesophageal mean radiation dose. CONCLUSION: We show that, using pre-treatment covariables, a DL approach demonstrates superior performance over CPH and RSF for OS prediction and use explainable techniques to provide transparency and interpretability.
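The concordance index (C-index) used in the abstract above scores how often a model assigns the higher risk to the patient who experiences the event sooner, with 0.5 corresponding to random ordering and 1.0 to perfect ordering. A minimal sketch of Harrell's C-index for right-censored data follows. The function name, the choice to skip pairs with tied times, and the toy inputs are assumptions for illustration; they are not the study's implementation or data.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 (or True) if the event was observed, 0 if censored
    risks  : predicted risk score per patient (higher = earlier event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: tied times are skipped
        # Let `first` be the patient with the shorter observed time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0   # model ordered the pair correctly
        elif risks[first] == risks[second]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

Established survival-analysis libraries provide tested implementations with fuller tie handling; this sketch only illustrates the pairwise comparison behind the metric.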
Affiliation(s)
- Joshua R Astley
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK
- James M Reilly
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK
- Stephen Robinson
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK
- Jim M Wild
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK; Insigneo Institute for in Silico Medicine, The University of Sheffield, Sheffield, UK
- Matthew Q Hatton
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK
- Bilal A Tahir
  - Division of Clinical Medicine, The University of Sheffield, Sheffield, UK; Insigneo Institute for in Silico Medicine, The University of Sheffield, Sheffield, UK
5
Altuhaifa FA, Win KT, Su G. Predicting lung cancer survival based on clinical data using machine learning: A review. Comput Biol Med 2023; 165:107338. [PMID: 37625260 DOI: 10.1016/j.compbiomed.2023.107338] [Received: 05/10/2023] [Revised: 07/31/2023] [Accepted: 08/07/2023] [Indexed: 08/27/2023]
Abstract
Machine learning has gained popularity in predicting survival time in the medical field. This review examines studies utilizing machine learning and data-mining techniques to predict lung cancer survival using clinical data. A systematic literature review searched MEDLINE, Scopus, and Google Scholar databases, following reporting guidelines and using the COVIDENCE system. Studies published from 2000 to 2023 employing machine learning for lung cancer survival prediction were included. Risk of bias assessment used the prediction model risk of bias assessment tool. Thirty studies were reviewed, with 13 (43.3%) using the surveillance, epidemiology, and end results database. Missing data handling was addressed in 12 (40%) studies, primarily through data transformation and conversion. Feature selection algorithms were used in 19 (63.3%) studies, with age, sex, and N stage being the most chosen features. Random forest was the predominant machine learning model, used in 17 (56.6%) studies. While the number of lung cancer survival prediction studies is limited, the use of machine learning models based on clinical data has grown since 2012. Consideration of diverse patient cohorts and data pre-processing are crucial. Notably, most studies did not account for missing data, normalization, scaling, or standardized data, potentially introducing bias. Therefore, a comprehensive study on lung cancer survival prediction using clinical data is needed, addressing these challenges.
Affiliation(s)
- Fatimah Abdulazim Altuhaifa
  - School of Computing and Information Technology, University of Wollongong, NSW, 2500, Australia; Saudi Arabia Ministry of Higher Education, Riyadh, Saudi Arabia
- Khin Than Win
  - School of Computing and Information Technology, University of Wollongong, NSW, 2500, Australia
- Guoxin Su
  - School of Computing and Information Technology, University of Wollongong, NSW, 2500, Australia