1
Janbain A, Farolfi A, Guenegou-Arnoux A, Romengas L, Scharl S, Fanti S, Serani F, Peeken JC, Katsahian S, Strouthos I, Ferentinos K, Koerber SA, Vogel ME, Combs SE, Vrachimis A, Morganti AG, Spohn SK, Grosu AL, Ceci F, Henkenberens C, Kroeze SG, Guckenberger M, Belka C, Bartenstein P, Hruby G, Emmett L, Omerieh AA, Schmidt-Hegemann NS, Mose L, Aebersold DM, Zamboglou C, Wiegel T, Shelan M. A Machine Learning Approach for Predicting Biochemical Outcome After PSMA-PET-Guided Salvage Radiotherapy in Recurrent Prostate Cancer After Radical Prostatectomy: Retrospective Study. JMIR Cancer 2024; 10:e60323. [PMID: 39303279 PMCID: PMC11452751 DOI: 10.2196/60323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Revised: 07/06/2024] [Accepted: 08/07/2024] [Indexed: 09/22/2024] Open
Abstract
BACKGROUND Salvage radiation therapy (sRT) is often the sole curative option in patients with biochemical recurrence after radical prostatectomy. We developed and validated a nomogram to predict freedom from biochemical failure after sRT. OBJECTIVE This study aims to evaluate prostate-specific membrane antigen-positron emission tomography (PSMA-PET)-based sRT efficacy for postprostatectomy prostate-specific antigen (PSA) persistence or recurrence. Objectives include developing a random survival forest (RSF) model for predicting biochemical failure, comparing it with a Cox model, and assessing predictive accuracy over time. Multinational cohort data will validate the model's performance, aiming to improve clinical management of recurrent prostate cancer. METHODS This multicenter retrospective study collected data from 13 medical facilities across 5 countries: Germany, Cyprus, Australia, Italy, and Switzerland. A total of 1029 patients who underwent sRT following PSMA-PET-based assessment for PSA persistence or recurrence were included. Patients were treated between July 2013 and June 2020, with clinical decisions guided by PSMA-PET results and contemporary standards. The primary end point was freedom from biochemical failure, defined as 2 consecutive PSA rises >0.2 ng/mL after treatment. Data were divided into training (708 patients), testing (271 patients), and external validation (50 patients) sets for machine learning algorithm development and validation. RSF models were used, with 1000 trees per model, optimizing predictive performance using the Harrell concordance index and Brier score. Statistical analysis used R Statistical Software (R Foundation for Statistical Computing), and ethical approval was obtained from participating institutions. RESULTS Baseline characteristics of 1029 patients undergoing sRT following PSMA-PET-based assessment were analyzed. The median age at sRT was 70 (IQR 64-74) years. PSMA-PET scans revealed local recurrences in 43.9% (430/979) and nodal recurrences in 27.2% (266/979) of patients. Treatment included dose-escalated sRT to pelvic lymphatics in 35.6% (349/979) of cases. The external outlier validation set showed distinct features, including higher rates of positive lymph nodes (47/50, 94% vs 266/979, 27.2% in the learning cohort) and lower delivered sRT doses (<66 Gy in 57/979, 5.8% vs 46/50, 92% of patients; P<.001). The RSF model, validated internally and externally, demonstrated robust predictive performance (Harrell C-index range: 0.54-0.91) across training and validation datasets, outperforming a previously published nomogram. CONCLUSIONS The developed RSF model demonstrates enhanced predictive accuracy, potentially improving patient outcomes and assisting clinicians in making treatment decisions.
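The abstract above reports model performance as a Harrell concordance index. As a rough illustration of what that number measures, here is a minimal pure-Python sketch. It is not the authors' R code; the function name `harrell_c_index` and the toy inputs are hypothetical, and ties in predicted risk are assumed to count as half-concordant.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and an observed event. The pair is concordant when subject i
    also has the higher predicted risk. Ties in risk count as 0.5
    (an assumption of this sketch).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event flags (1 = biochemical
# failure observed, 0 = censored), and predicted risk scores.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported range of 0.54-0.91 should be read.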
Affiliation(s)
- Ali Janbain
  - European Hospital Georges-Pompidou, Clinical research unit, INSERM Clinical Investigation Center, Paris Cité University, Paris, France
- Andrea Farolfi
  - Division of Nuclear Medicine, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
- Armelle Guenegou-Arnoux
  - European Hospital Georges-Pompidou, Clinical research unit, INSERM Clinical Investigation Center, Paris Cité University, Paris, France
- Louis Romengas
  - European Hospital Georges-Pompidou, Clinical research unit, INSERM Clinical Investigation Center, Paris Cité University, Paris, France
- Sophia Scharl
  - Department of Radiation Oncology, University of Ulm, Ulm, Germany
- Stefano Fanti
  - Division of Nuclear Medicine, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
- Francesca Serani
  - Division of Nuclear Medicine, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
- Jan C Peeken
  - Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Munich, Germany
- Sandrine Katsahian
  - European Hospital Georges-Pompidou, Clinical research unit, INSERM Clinical Investigation Center, Paris Cité University, Paris, France
- Iosif Strouthos
  - Department of Radiation Oncology, German Oncology Center, University Hospital of the European University, Limassol, Cyprus
- Konstantinos Ferentinos
  - Department of Radiation Oncology, German Oncology Center, University Hospital of the European University, Limassol, Cyprus
- Stefan A Koerber
  - Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany
- Marco E Vogel
  - Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Munich, Germany
- Stephanie E Combs
  - Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Munich, Germany
- Alexis Vrachimis
  - Department of Radiation Oncology, German Oncology Center, University Hospital of the European University, Limassol, Cyprus
- Simon KB Spohn
  - Department of Radiation Oncology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Anca-Ligia Grosu
  - Department of Radiation Oncology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Francesco Ceci
  - Division of Nuclear Medicine, IEO European Institute of Oncology IRCCS, Milan, Italy
- Christoph Henkenberens
  - Department of Radiotherapy and Special Oncology, Medical School Hannover, Hannover, Germany
- Stephanie GC Kroeze
  - Department of Radiation Oncology, University Hospital Zürich, University of Zurich, Zurich, Switzerland
- Matthias Guckenberger
  - Department of Radiation Oncology, University Hospital Zürich, University of Zurich, Zurich, Switzerland
- Claus Belka
  - Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany
- Peter Bartenstein
  - Department of Nuclear Medicine, University Hospital, LMU Munich, Munich, Germany
- George Hruby
  - Department of Radiation Oncology, Royal North Shore Hospital-University of Sydney, Sydney, Australia
- Louise Emmett
  - Department of Theranostics and Nuclear Medicine, St Vincent's Hospital Sydney, Sydney, Australia
- Ali Afshar Omerieh
  - Department of Nuclear Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Lucas Mose
  - Department of Radiation Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Daniel M Aebersold
  - Department of Radiation Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Constantinos Zamboglou
  - Department of Radiation Oncology, German Oncology Center, University Hospital of the European University, Limassol, Cyprus
- Thomas Wiegel
  - Department of Radiation Oncology, University of Ulm, Ulm, Germany
- Mohamed Shelan
  - Department of Radiation Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
2
Fan G, Yang S, Qin J, Huang L, Li Y, Liu H, Liao X. Machine Learning Predict Survivals of Spinal and Pelvic Ewing's Sarcoma with the SEER Database. Global Spine J 2024; 14:1125-1136. [PMID: 36281905 PMCID: PMC11289541 DOI: 10.1177/21925682221134049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
STUDY DESIGN Retrospective Cohort Study. OBJECTIVES This study aimed to develop survival prediction models for spinal Ewing's sarcoma (EWS) based on machine learning (ML). METHODS We extracted the SEER registry's clinical data of EWS diagnosed between 1975 and 2016. Three feature selection methods extracted clinical features. Four ML algorithms (Cox, random survival forest (RSF), CoxBoost, DeepCox) were trained to predict the overall survival (OS) and cancer-specific survival (CSS) of spinal EWS. The concordance index (C-index), integrated Brier score (IBS) and mean area under the curve (AUC) were used to assess the prediction performance of different ML models. The top initial ML models with the best performance on each evaluation index (C-index, IBS and mean AUC) were finally stacked into ensemble models, which were compared with the traditional TNM stage model by 3-/5-/10-year Receiver Operating Characteristic (ROC) curves and Decision Curve Analysis (DCA). RESULTS A total of 741 patients with spinal EWS were identified. C-index, IBS and mean AUC for the final ensemble ML model in predicting OS were 0.693/0.158/0.829 during independent testing, and 0.719/0.171/0.819 in predicting CSS. The ensemble ML model also achieved an AUC of 0.705/0.747/0.851 for predicting 3-/5-/10-year OS during independent testing, and 0.734/0.779/0.830 for predicting 3-/5-/10-year CSS, both of which outperformed the traditional TNM stage. DCA curves also showed the advantages of the ensemble models over the traditional TNM stage. CONCLUSION ML was an effective and promising technique in predicting survival of spinal EWS, and the ensemble models were superior to the traditional TNM stage model.
Affiliation(s)
- Guoxin Fan
  - National Key Clinical Pain Medicine of China, Huazhong University of Science and Technology Union Shenzhen Hospital, China
  - Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Science Center, China
  - Department of Pain Medicine and Shenzhen Municipal Key Laboratory for Pain Medicine, The 6th Affiliated Hospital of Shenzhen University Health Science Center, China
  - Department of Spine Surgery, Third Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Sheng Yang
  - Department of Orthopedics, Shanghai Tenth Peoples Hospital, Tongji University School of Medicine, China
- Jiaqi Qin
  - Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta, China
- Longfei Huang
  - Department of Orthopedics, Nanchang Hongdu Hospital of Traditional Chinese Medicine, China
- Yufeng Li
  - Department of Sports Medicine, The Eighth Affiliated Hospital Sun Yat-sen University, China
- Huaqing Liu
  - Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta, China
- Xiang Liao
  - National Key Clinical Pain Medicine of China, Huazhong University of Science and Technology Union Shenzhen Hospital, China
  - Department of Pain Medicine and Shenzhen Municipal Key Laboratory for Pain Medicine, The 6th Affiliated Hospital of Shenzhen University Health Science Center, China
3
Wani NA, Kumar R, Bedi J. DeepXplainer: An interpretable deep learning based approach for lung cancer detection using explainable artificial intelligence. Computer Methods and Programs in Biomedicine 2024; 243:107879. [PMID: 37897989 DOI: 10.1016/j.cmpb.2023.107879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 10/17/2023] [Accepted: 10/20/2023] [Indexed: 10/30/2023]
Abstract
BACKGROUND AND OBJECTIVE Artificial intelligence (AI) has several uses in the healthcare industry, including healthcare management, medical forecasting, practical decision-making, and diagnosis. AI technologies have reached human-like performance, but their use is limited since they are still largely viewed as opaque black boxes. This distrust remains the primary factor limiting their real-world application, particularly in healthcare. As a result, there is a need for interpretable predictors that provide better predictions and also explain their predictions. METHODS This study introduces "DeepXplainer", a new interpretable hybrid deep learning-based technique for detecting lung cancer and providing explanations of the predictions. This technique is based on a convolutional neural network and XGBoost. XGBoost is used for class label prediction after "DeepXplainer" has automatically learned the features of the input using its many convolutional layers. To explain the predictions, an explainable artificial intelligence method known as "SHAP" is implemented. RESULTS The open-source "Survey Lung Cancer" dataset was processed using this method. On multiple parameters, including accuracy, sensitivity, F1-score, etc., the proposed method outperformed the existing methods. The proposed method obtained an accuracy of 97.43%, a sensitivity of 98.71%, and an F1-score of 98.08. After the model has made predictions with this high degree of accuracy, each prediction is explained by implementing an explainable artificial intelligence method at both the local and global levels. CONCLUSIONS A deep learning-based classification model for lung cancer is proposed with three primary components: one for feature learning, another for classification, and a third for providing explanations for the predictions made by the proposed hybrid (ConvXGB) model. The proposed "DeepXplainer" has been evaluated using a variety of metrics, and the results demonstrate that it outperforms the current benchmarks. By providing explanations for the predictions, the proposed approach may help doctors detect and treat lung cancer more effectively.
Affiliation(s)
- Niyaz Ahmad Wani
  - Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala (PIN: 147004), Punjab, India
- Ravinder Kumar
  - Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala (PIN: 147004), Punjab, India
- Jatin Bedi
  - Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala (PIN: 147004), Punjab, India
4
Luna HGC, Imasa MS, Juat N, Hernandez KV, Sayo TM, Cristal-Luna G, Asur-Galang SM, Bellengan M, Duga KJ, Buenaobra BB, De los Santos MI, Medina D, Samo J, Literal VM, Bascos NA, Sy-Naval S. The differential prognostic implications of PD-L1 expression in the outcomes of Filipinos with EGFR-mutant NSCLC treated with tyrosine kinase inhibitors. Transl Lung Cancer Res 2023; 12:1896-1911. [PMID: 37854154 PMCID: PMC10579834 DOI: 10.21037/tlcr-23-118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 07/20/2023] [Indexed: 10/20/2023]
Abstract
Background The tumor immune microenvironment influences tumor evolution in non-small cell lung cancer (NSCLC). Yet, the prognostic value of programmed death-ligand 1 (PD-L1) in epidermal growth factor receptor (EGFR)-mutant NSCLC remains controversial. Additionally, prognostic studies in Filipinos with EGFR-mutant NSCLC remain unexplored to this day. Methods We prospectively studied the outcomes of EGFR-mutant NSCLC in a Filipino cohort, and retrospectively verified the survival trend using The Cancer Genome Atlas (TCGA) cohort. The Kaplan-Meier method and generalized linear regression were used to assess survival. Expression and DNA methylation of cluster of differentiation 274 (CD274, the gene that codes for PD-L1) were examined from TCGA tumor profiles. Pearson's correlation was used to correlate PD-L1 expression with outcomes associated with occurrence of EGFR mutations, tyrosine kinase inhibitor (TKI) types, and programmed cell death protein 1 (PD-1) expression. Proteome network analysis was used to examine the correlation between drug resistance and PD-L1. Results PD-L1 positivity was associated with significantly longer progression-free survival (PFS; P=0.0096) but had a significantly contrasting influence on overall survival (OS; P=0.0011). PD-L1 positivity (in both protein and RNA) was associated with longer median OS (mOS) in exon21 L858R, whereas negativity was associated with longer mOS in exon19 deletion (exon19del). Stratification (high, low, negative) of PD-L1 expression lacked significant prognostic value (all P>0.05). PD-L1/CD274 expression (P<0.05) and DNA methylation (P<0.001) varied significantly among NSCLC subtypes and in different disease stages. Erlotinib treatment produced the longest median progression-free survival (mPFS; 874 days) relative to other EGFR-TKIs (137-311 days). PD-L1 lacked a significant correlation with EGFR-TKIs. Consistent with the immune-regulation activities of PD-1, higher expression led to relatively shorter mOS. PD-1 correlated positively with PD-L1 expression and occurrence of exon21 L858R. Conclusions PD-L1 differentially influenced the outcomes of Filipinos with EGFR-mutant NSCLC. NSCLC subtypes, disease stage, and PD-1 expression may impact the collective outcomes associated with PD-L1 and EGFR-sensitizing mutations.
Affiliation(s)
- Herdee Gloriane C. Luna
  - Lung Center of the Philippines, Quezon City, Philippines
  - National Kidney and Transplant Institute, Quezon City, Philippines
- Necy Juat
  - National Kidney and Transplant Institute, Quezon City, Philippines
- Treah May Sayo
  - Lung Center of the Philippines, Quezon City, Philippines
- Sheena Marie Asur-Galang
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Mirasol Bellengan
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Kent John Duga
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Bien Brian Buenaobra
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Marvin I. De los Santos
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Daniel Medina
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Jamirah Samo
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Venus Minerva Literal
  - Clinical Proteomics for Cancer Initiative, Department of Science and Technology-Philippine Council for Health Research and Development, Taguig City, Philippines
- Neil Andrew Bascos
  - National Institute of Molecular Biology and Biotechnology, University of the Philippines Diliman, Quezon City, Philippines
  - Protein, Proteomics and Metabolomics Facility, Philippine Genome Center, University of the Philippines System, Quezon City, Philippines
5
Moon JW, Yang E, Kim JH, Kwon OJ, Park M, Yi CA. Predicting Non-Small-Cell Lung Cancer Survival after Curative Surgery via Deep Learning of Diffusion MRI. Diagnostics (Basel) 2023; 13:2555. [PMID: 37568918 PMCID: PMC10417371 DOI: 10.3390/diagnostics13152555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 07/19/2023] [Accepted: 07/27/2023] [Indexed: 08/13/2023] Open
Abstract
BACKGROUND The objective of this study is to evaluate the predictive power of a survival model using deep learning of diffusion-weighted images (DWI) in patients with non-small-cell lung cancer (NSCLC). METHODS DWI at b-values of 0, 100, and 700 s/mm² (DWI0, DWI100, DWI700) were preoperatively obtained for 100 NSCLC patients who underwent curative surgery (57 men, 43 women; mean age, 62 years). The ADC0-100 (perfusion-sensitive ADC), ADC100-700 (perfusion-insensitive ADC), ADC0-100-700, and demographic features were collected as input data and 5-year survival was collected as output data. Our survival model adopted transfer learning from a pre-trained VGG-16 network, whereby the softmax layer was replaced with a binary classification layer for the prediction of 5-year survival. Three channels of input data were selected in combination out of DWIs and ADC images, and their accuracies and AUCs were compared for the best performance during 10-fold cross validation. RESULTS 66 patients survived, and 34 patients died. The predictive performance was the best in the following combination: DWI0-ADC0-100-ADC0-100-700 (accuracy: 92%; AUC: 0.904). This was followed by DWI0-DWI700-ADC0-100-700, DWI0-DWI100-DWI700, and DWI0-DWI0-DWI0 (accuracy: 91%, 81%, 76%; AUC: 0.889, 0.763, 0.711, respectively). Survival prediction models trained with ADC performed significantly better than the one trained with DWI only (p-values < 0.05). The survival prediction was improved when demographic features were added to the model with only DWIs, but the benefit of clinical information was not prominent when added to the best performing model using both DWI and ADC. CONCLUSIONS Deep learning may play a role in the survival prediction of lung cancer. The performance of learning can be enhanced by inputting precedented, proven functional parameters of the ADC instead of the original data of DWIs only.
Affiliation(s)
- Jung Won Moon
  - Department of Radiology, Kangnam Sacred Heart Hospital, Hallym University School of Medicine, Seoul 07441, Republic of Korea
- Ehwa Yang
  - Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- Jae-Hun Kim
  - Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- O Jung Kwon
  - Division of Respiratory and Critical Care Medicine, Department of Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- Minsu Park
  - Department of Information and Statistics, Chungnam National University, Daejeon 34134, Republic of Korea
- Chin A Yi
  - Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
6
Tang M, Gao L, He B, Yang Y. Machine learning based prognostic model of Chinese medicine affecting the recurrence and metastasis of I-III stage colorectal cancer: A retrospective study in China. Front Oncol 2022; 12:1044344. [PMID: 36465374 PMCID: PMC9714626 DOI: 10.3389/fonc.2022.1044344] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/14/2022] [Accepted: 10/31/2022] [Indexed: 06/30/2024] Open
Abstract
Background To construct a prognostic model of colorectal cancer (CRC) recurrence and metastasis (R&M) that includes traditional Chinese medicine (TCM) factors, based on different machine learning (ML) methods, with the aim of offsetting the lack of TCM factors in existing models. Methods Patients with stage I-III CRC after radical resection were included as the model data set. The data were randomly divided into a training set and an internal verification set at a ratio of 7:3 by the "set aside" (hold-out) method. The average performance index and 95% confidence interval of the model were calculated over 100 repetitions. Eight factors were used as predictors of Western medicine. Two types of models were constructed by taking "whether to accept TCM intervention" and "different TCM syndrome types" as TCM predictors. The model was constructed by four ML methods: logistic regression, random forest, Extreme Gradient Boosting (XGBoost) and support vector machine (SVM). The predicted target was whether R&M would occur within 3 years and 5 years after radical surgery. The area under curve (AUC) value and decision curve analysis (DCA) curve were used to evaluate accuracy and utility of the model. Results The model data set consisted of 558 patients, of which 317 received TCM intervention after radical resection. The model based on the four ML methods with the TCM factor of "whether to accept TCM intervention" showed good ability in predicting R&M within 3 years and 5 years (AUC value > 0.75), and XGBoost was the best method. The DCA indicated that when the R&M probability in patients was at a certain threshold, the models provided additional clinical benefits. When predicting the R&M probability within 3 years and 5 years in the model with TCM factors of "different TCM syndrome types", the four methods all showed certain predictive ability (AUC value > 0.70). With the exception of the model constructed by SVM, the other methods provided additional clinical benefits within a certain probability threshold. Conclusion The prognostic model based on ML methods shows good accuracy and clinical utility. It can quantify the degree of influence of TCM factors on R&M and provide some value for clinical decision-making.
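The evaluation protocol described above (a random 7:3 split repeated 100 times, reporting a mean and a 95% interval) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function `repeated_holdout`, its parameters, and the `evaluate` callback are hypothetical, and the interval is taken as the empirical 2.5th to 97.5th percentile of the repeated scores, which may differ from how the study computed its confidence interval.

```python
import random
import statistics


def repeated_holdout(records, evaluate, n_repeats=100, train_frac=0.7, seed=0):
    """Repeat a random train/test split and summarise the scores.

    records:  list of samples (any type)
    evaluate: hypothetical callback taking (train, test) and returning a
              numeric score such as an AUC
    Returns (mean_score, (lower, upper)), where the bounds are empirical
    percentiles of the repeated scores (an assumption of this sketch).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = list(records)
        rng.shuffle(shuffled)
        cut = int(round(train_frac * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(evaluate(train, test))
    scores.sort()
    lower = scores[int(0.025 * (n_repeats - 1))]
    upper = scores[int(round(0.975 * (n_repeats - 1)))]
    return statistics.fmean(scores), (lower, upper)


# Toy usage: the "score" is just the training fraction actually obtained,
# so every repetition of a 7:3 split of 10 records yields 0.7.
mean_score, interval = repeated_holdout(
    list(range(10)),
    lambda train, test: len(train) / (len(train) + len(test)),
)
```

In practice `evaluate` would fit a model on `train` and score it on `test`; fixing `seed` keeps the repeated splits reproducible.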
Affiliation(s)
- Mo Tang
  - Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, China
- Lihao Gao
  - Smart City Business Unit, Baidu Inc., Beijing, China
- Bin He
  - Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, China
- Yufei Yang
  - Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, China
7
Development and Validation of Novel Deep-Learning Models Using Multiple Data Types for Lung Cancer Survival. Cancers (Basel) 2022; 14:5562. [PMID: 36428655 PMCID: PMC9688689 DOI: 10.3390/cancers14225562] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2022] [Revised: 11/03/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022] Open
Abstract
A well-established lung-cancer-survival-prediction model that relies on multiple data types, multiple novel machine-learning algorithms, and external testing is absent in the literature. This study aims to address this gap and determine the critical factors of lung cancer survival. We selected non-small-cell lung cancer patients from a retrospective dataset of the Taipei Medical University Clinical Research Database and Taiwan Cancer Registry between January 2008 and December 2018. All patients were monitored from the index date of cancer diagnosis until the event of death. Variables, including demographics, comorbidities, medications, laboratories, and patient gene tests, were used. Nine machine-learning algorithms with various modes were used. The performance of the algorithms was measured by the area under the receiver operating characteristic curve (AUC). In total, 3714 patients were included. The best performance of the artificial neural network (ANN) model was achieved when integrating all variables with the AUC, accuracy, precision, recall, and F1-score of 0.89, 0.82, 0.91, 0.75, and 0.65, respectively. The most important features were cancer stage, cancer size, age of diagnosis, smoking, drinking status, EGFR gene, and body mass index. Overall, the ANN model improved predictive performance when integrating different data types.
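The abstract above ranks its algorithms by the area under the receiver operating characteristic curve (AUC). For readers unfamiliar with the metric, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death)
    scores: predicted scores, higher meaning more likely positive
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three of the four positive/negative pairs are
# ranked correctly, so the AUC is 3/4.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The quadratic pairwise loop is chosen for clarity; a rank-based formulation gives the same value more efficiently on large cohorts.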
8
Sedighi-Maman Z, Heath JJ. An Interpretable Two-Phase Modeling Approach for Lung Cancer Survivability Prediction. Sensors (Basel, Switzerland) 2022; 22:6783. [PMID: 36146145 PMCID: PMC9503480 DOI: 10.3390/s22186783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Revised: 08/28/2022] [Accepted: 09/05/2022] [Indexed: 06/16/2023]
Abstract
Although lung cancer survival status and survival length predictions have primarily been studied individually, a scheme that leverages both fields in an interpretable way for physicians remains elusive. We propose a two-phase data analytic framework that is capable of classifying survival status for 0.5-, 1-, 1.5-, 2-, 2.5-, and 3-year time-points (phase I) and predicting the number of survival months within 3 years (phase II) using recent Surveillance, Epidemiology, and End Results data from 2010 to 2017. In this study, we employ three analytical models (general linear model, extreme gradient boosting, and artificial neural networks), five data balancing techniques (synthetic minority oversampling technique (SMOTE), relocating safe level SMOTE, borderline SMOTE, adaptive synthetic sampling, and majority weighted minority oversampling technique), two feature selection methods (least absolute shrinkage and selection operator (LASSO) and random forest), and the one-hot encoding approach. By implementing a comprehensive data preparation phase, we demonstrate that a computationally efficient and interpretable method such as GLM performs comparably to more complex models. Moreover, we quantify the effects of individual features in phase I and II by exploiting GLM coefficients. To the best of our knowledge, this study is the first to (a) implement a comprehensive data processing approach to develop performant, computationally efficient, and interpretable methods in comparison to black-box models, (b) visualize top factors impacting survival odds by utilizing the change in odds ratio, and (c) comprehensively explore short-term lung cancer survival using a two-phase approach.
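The abstract above mentions quantifying feature effects through GLM coefficients and the change in odds ratio. As a brief reminder of how the two relate, and assuming the phase I classifier is a GLM with a logit link (the abstract does not state the link function), the standard relationship is shown below; the symbols $p$, $x_j$, and $\beta_j$ are introduced here for illustration only.

```latex
% Logistic GLM: log-odds of survival as a linear function of the features
\log\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j

% Odds ratio for a one-unit increase in feature x_j, other features held fixed
\mathrm{OR}_j = \frac{\mathrm{odds}(x_j + 1)}{\mathrm{odds}(x_j)} = e^{\beta_j}
```

Under this assumption, a positive coefficient corresponds to an odds ratio above 1 (higher odds of the modelled outcome), and a negative coefficient to an odds ratio below 1.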
Affiliation(s)
- Zahra Sedighi-Maman
  - Robert B. Willumstad School of Business, Adelphi University, Garden City, NY 11530, USA
- Jonathan J. Heath
  - McDonough School of Business, Georgetown University, Washington, DC 20057, USA
9
Lei H, Li X, Ma W, Hong N, Liu C, Zhou W, Zhou H, Gong M, Wang Y, Wang G, Wu Y. Comparison of nomogram and machine-learning methods for predicting the survival of non-small cell lung cancer patients. Cancer Innovation 2022; 1:135-145. [PMID: 38090651 PMCID: PMC10686174 DOI: 10.1002/cai2.24] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/29/2022] [Revised: 05/28/2022] [Accepted: 06/29/2022] [Indexed: 10/15/2024]
Abstract
Background Most patients with advanced non-small cell lung cancer (NSCLC) have a poor prognosis. Predicting overall survival using clinical data would benefit cancer patients by allowing providers to design an optimum treatment plan. We compared the performance of nomograms with machine-learning models at predicting the overall survival of NSCLC patients. This comparison benefits the development and selection of models during the clinical decision-making process for NSCLC patients. Methods Multiple machine-learning models were used in a retrospective cohort of 6586 patients. First, we modeled and validated a nomogram to predict the overall survival of NSCLC patients. Subsequently, five machine-learning models (logistic regression, random forest, XGBoost, decision tree, and light gradient boosting machine) were used to predict survival status. Next, we evaluated the performance of the models. Finally, the machine-learning model with the highest accuracy was chosen for comparison with the nomogram at predicting survival status by observing a novel performance measure: time-dependent prediction accuracy. Results Among the five machine-learning models, the random forest model achieved the highest accuracy. In the comparison of time-dependent prediction accuracy over follow-up times ranging from 12 to 60 months, the prediction accuracies of both the nomogram and the machine-learning model changed as time varied. The nomogram reached a maximum prediction accuracy of 0.85 in the 60th month, and the random forest algorithm reached a maximum prediction accuracy of 0.74 in the 13th month. Conclusions Overall, the nomogram provided more reliable prognostic assessments of NSCLC patients than machine-learning models over our observation period. Although machine-learning methods have been widely adopted for predicting clinical prognoses in recent studies, the conventional nomogram was competitive. In real clinical applications, a comprehensive model that combines these two methods may demonstrate superior capabilities.
Affiliation(s)
- Haike Lei
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| | - Xiaosheng Li
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| | - Wuren Ma
- Digital Health China Technologies, Co., Ltd., Beijing, China
| | - Na Hong
- Digital Health China Technologies, Co., Ltd., Beijing, China
| | - Chun Liu
- Digital Health China Technologies, Co., Ltd., Beijing, China
| | - Wei Zhou
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| | - Hong Zhou
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| | - Mengchun Gong
- Digital Health China Technologies, Co., Ltd., Beijing, China
| | - Ying Wang
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| | - Guixue Wang
- MOE Key Lab for Biorheological Science and Technology, State and Local Joint Engineering Laboratory for Vascular Implants, College of Bioengineering, Chongqing University, Chongqing, China
| | - Yongzhong Wu
- Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, Chongqing University Cancer Hospital, Chongqing, China
| |
|
10
|
Prelaj A, Boeri M, Robuschi A, Ferrara R, Proto C, Lo Russo G, Galli G, De Toma A, Brambilla M, Occhipinti M, Manglaviti S, Beninato T, Bottiglieri A, Massa G, Zattarin E, Gallucci R, Galli EG, Ganzinelli M, Sozzi G, de Braud FGM, Garassino MC, Restelli M, Pedrocchi ALG, Trovo' F. Machine Learning Using Real-World and Translational Data to Improve Treatment Selection for NSCLC Patients Treated with Immunotherapy. Cancers (Basel) 2022; 14:435. [PMID: 35053597 PMCID: PMC8773718 DOI: 10.3390/cancers14020435] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/19/2021] [Revised: 01/05/2022] [Accepted: 01/12/2022] [Indexed: 02/01/2023] Open
Abstract
Simple Summary In this paper, the authors show that artificial intelligence (AI) and machine learning (ML) are useful approaches to integrate multifactorial data and helpful for personalized prediction. In detail, compared to PD-L1 for advanced non-small cell lung cancer (NSCLC), ML tools better predicted which patients would be responders (R) or non-responders (NR) to immunotherapy (IO). They were also able to indirectly foresee OS and PFS of R and NR patients. Given the high incidence of NSCLC, and the absence of reliable biomarkers to predict the response to IO other than PD-L1, the authors believe this research may be of great interest to anyone involved in thoracic oncology. Furthermore, given the growing interest from the scientific community in AI and ML, the authors believe that this manuscript could represent a fascinating topic to anyone who needs to exploit the enormous potential of these tools in the treatment of cancer. Abstract (1) Background: In advanced non-small cell lung cancer (aNSCLC), programmed death ligand 1 (PD-L1) remains the only biomarker for selecting candidate patients for immunotherapy (IO). This study aimed at using artificial intelligence (AI) and machine learning (ML) tools to improve response and efficacy predictions in aNSCLC patients treated with IO. (2) Methods: Real world data and the blood microRNA signature classifier (MSC) were used. Patients were divided into responders (R) and non-responders (NR) to determine if the overall survival of the patients was likely to be shorter or longer than 24 months from baseline IO. (3) Results: One hundred sixty-four out of 200 patients (i.e., only those with PD-L1 data available) were considered in the model; 73 (44.5%) were R and 91 (55.5%) NR. Overall, the best model was the logistic regression (LR) and included 5 features. The model predicting R/NR of patients achieved accuracy ACC = 0.756, F1 score F1 = 0.722, and area under the ROC curve AUC = 0.82.
LR was also the best-performing model in predicting patients with long survival (24 months OS), achieving ACC = 0.839, F1 = 0.908, and AUC = 0.87. (4) Conclusions: The results suggest that the integration of multifactorial data provided by ML techniques is a useful tool to select NSCLC patients as candidates for IO.
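The metrics quoted in this abstract (ACC, F1, AUC) can be computed from true labels and predicted scores roughly as in the minimal sketch below. The labels, scores, 0.5 threshold, and helper names are made-up illustrative assumptions, not data or code from the study; only the cohort counts (73 R and 91 NR out of 164) come from the abstract.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 score for binary labels (1 = responder, 0 = non-responder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return acc, f1


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

acc, f1 = accuracy_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)

# Class shares reported in the abstract: 73 R and 91 NR out of 164 patients.
share_r = round(100 * 73 / 164, 1)
share_nr = round(100 * 91 / 164, 1)
print(acc, f1, auc, share_r, share_nr)
```

AUC is computed here by pairwise comparison rather than by tracing the ROC curve; the two are equivalent, and the pairwise form keeps the sketch free of external dependencies.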
Affiliation(s)
- Arsela Prelaj
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy; (A.R.); (M.R.); (A.L.G.P.); (F.T.)
| | - Mattia Boeri
- Tumor Genomics Unit, Department of Research, Fondazione IRCCS Istituto Nazionale dei Tumori, 20133 Milan, Italy; (M.B.); (G.S.)
| | - Alessandro Robuschi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy; (A.R.); (M.R.); (A.L.G.P.); (F.T.)
| | - Roberto Ferrara
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Claudia Proto
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Giuseppe Lo Russo
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Giulia Galli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Alessandro De Toma
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Marta Brambilla
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Mario Occhipinti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Sara Manglaviti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Teresa Beninato
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Achille Bottiglieri
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Giacomo Massa
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Emma Zattarin
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Rosaria Gallucci
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Edoardo Gregorio Galli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Monica Ganzinelli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Gabriella Sozzi
- Tumor Genomics Unit, Department of Research, Fondazione IRCCS Istituto Nazionale dei Tumori, 20133 Milan, Italy; (M.B.); (G.S.)
| | - Filippo G. M. de Braud
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Marina Chiara Garassino
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, 20133 Milan, Italy; (R.F.); (C.P.); (G.L.R.); (G.G.); (A.D.T.); (M.B.); (M.O.); (S.M.); (T.B.); (A.B.); (G.M.); (E.Z.); (R.G.); (E.G.G.); (M.G.); (F.G.M.d.B.); (M.C.G.)
| | - Marcello Restelli
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy; (A.R.); (M.R.); (A.L.G.P.); (F.T.)
| | - Alessandra Laura Giulia Pedrocchi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy; (A.R.); (M.R.); (A.L.G.P.); (F.T.)
| | - Francesco Trovo'
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milan, Italy; (A.R.); (M.R.); (A.L.G.P.); (F.T.)
| |
|
11
|
Pancreatic Cancer Survival Prediction: A Survey of the State-of-the-Art. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:1188414. [PMID: 34630626 PMCID: PMC8497168 DOI: 10.1155/2021/1188414] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/28/2021] [Revised: 08/24/2021] [Accepted: 09/18/2021] [Indexed: 12/22/2022]
Abstract
Early detection of cancer increases the chances of survival. Some cancer types, like pancreatic cancer, are challenging to diagnose or detect early, and the stages have a fast progression rate. This paper presents the state-of-the-art techniques used in cancer survival prediction, suggesting how these techniques can be implemented in predicting the overall survival of pancreatic ductal adenocarcinoma (PDAC) patients. Because the data are bewildering and high in volume, recent studies highlight the importance of machine learning (ML) algorithms like support vector machines and convolutional neural networks. Studies predict that PDAC survival is within the limits of 41.7% at one year, 8.7% at three years, and 1.9% at five years. No significant correlation was found between the disease stages and the overall survival rate. The implementation of ML algorithms can improve our understanding of cancer progression. ML methods need an appropriate level of validation to be considered in everyday clinical practice. The objective of these techniques is to perform classification, prediction, and estimation. Accurate predictions give pathologists information on the patient's state, surgical treatment to be done, optimal use of resources, individualized therapy, drugs to prescribe, and better patient management.
|
12
|
Meng X, Hu J, Plant RE, Carpenter TE, Carey JR. Distinctive egg-laying patterns in terminal versus non-terminal periods in three fruit fly species. Exp Gerontol 2021; 145:111201. [PMID: 33316371 PMCID: PMC7855919 DOI: 10.1016/j.exger.2020.111201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2020] [Revised: 11/18/2020] [Accepted: 12/08/2020] [Indexed: 11/22/2022]
Abstract
The specific objective of this study was to use a logistic regression model for determining the degree to which egg-laying patterns of individual females at the end of life (i.e., terminal segments) in each of three different fruit fly species could be distinguished from the egg-laying patterns over a similar period in midlife (i.e., non-terminal segments). Extracting data from large-scale databases for 11-day terminal and 11-day non-terminal segments in the vinegar fly (Drosophila melanogaster), the Mexican fruit fly (Anastrepha ludens) and the Mediterranean fruit fly (Ceratitis capitata) and organizing the model's results in a 2 × 2 contingency table, we found that: (1) daily egg-laying patterns in fruit flies can be used to distinguish terminal from non-terminal periods; (2) the overall performance metrics such as precision, accuracy, false positives and true negatives depended heavily on species; (3) differentiating between terminal and non-terminal segments is more difficult when flies die at younger ages; and (4) among the three species the best performing metrics including accuracy and precision were those produced using data on D. melanogaster. We conclude that, although the reliability of the prediction of whether a segment occurred at the end of life is relatively high for most species, it does not follow that precisely predicting remaining life will also be highly reliable, since classifying an end-of-life period is a fundamentally different challenge than predicting an exact day of death.
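As a rough illustration of how the performance metrics named in this abstract follow from a 2 × 2 contingency table, the sketch below derives accuracy, precision, and the false positive and true negative rates from four cell counts. The counts and the function name are hypothetical assumptions chosen for the example; they are not results from the study.

```python
def contingency_metrics(tp, fp, fn, tn):
    """Summary metrics from a 2 x 2 table of predicted vs. actual segment type.

    tp: terminal segments classified as terminal
    fp: non-terminal segments classified as terminal
    fn: terminal segments classified as non-terminal
    tn: non-terminal segments classified as non-terminal
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "false_positive_rate": fp / (fp + tn),
        "true_negative_rate": tn / (fp + tn),
    }


# Hypothetical counts (assumption): 110 terminal and 90 non-terminal segments.
metrics = contingency_metrics(tp=80, fp=20, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```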
Affiliation(s)
- Xiang Meng
- Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou 510260, China; College of Life Science, Guangzhou University, Guangzhou 510006, China
| | - Junjie Hu
- College of Life Science, Guangzhou University, Guangzhou 510006, China
| | - Richard E Plant
- Department of Plant Sciences, University of California, Davis, 95616, USA; Department of Biological and Agricultural Engineering, University of California, Davis, 95616, USA
| | - Tim E Carpenter
- Department of Medicine and Epidemiology, University of California, Davis, 95616, USA
| | - James R Carey
- Department of Entomology, University of California, Davis, 95616, USA; Center for the Economics and Demography of Aging, University of California, Berkeley, 94720, USA
| |
|
13
|
Deng F, Shen L, Wang H, Zhang L. Classify multicategory outcome in patients with lung adenocarcinoma using clinical, transcriptomic and clinico-transcriptomic data: machine learning versus multinomial models. Am J Cancer Res 2020; 10:4624-4639. [PMID: 33415023 PMCID: PMC7783755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2020] [Accepted: 11/25/2020] [Indexed: 06/12/2023] Open
Abstract
Classification of multicategory survival-outcome is important for precision oncology. Machine learning (ML) algorithms have been used to accurately classify multi-category survival-outcome of some cancer-types, but not yet that of lung adenocarcinoma. Therefore, we compared the performances of 3 ML models (random forests, support vector machine [SVM], multilayer perceptron) and multinomial logistic regression (Mlogit) models for classifying 4-category survival-outcome of lung adenocarcinoma using the TCGA. The Mlogit model overall performed similarly to the SVM and multilayer perceptron models (micro-average area under curve=0.82), while the random forests model was inferior. Surprisingly, transcriptomic data alone and clinico-transcriptomic data appeared sufficient to accurately classify the 4-category survival-outcome in these patients, but no models using clinical data alone performed well. Notably, NDUFS5, P2RY2, PRPF18, CCL24, ZNF813, MYL6, FLJ41941, POU5F1B, and SUV420H1 were the top-ranked genes that were associated with alive without disease and inversely linked to other outcomes. Similarly, BDKRB2, TERC, DNAJA3, MRPL15, SLC16A13, CRHBP and ACSBG2 were associated with alive with progression and GAL3ST3, AD2, RAB41, HDC, and PLEKHG1 with dead with disease, respectively, while also inversely linked to other outcomes. These cross-linked genes may be used for risk-stratification and future treatment development.
Affiliation(s)
- Fei Deng
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
| | - Lanlan Shen
- Department of Pediatrics, Baylor College of Medicine, USDA/ARS Children’s Nutrition Research Center, Houston, TX, USA
| | - He Wang
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
| | - Lanjing Zhang
- Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA
- Department of Biological Sciences, Rutgers University, Newark, NJ, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
| |
|
14
|
Wang J, Deng F, Zeng F, Shanahan AJ, Li WV, Zhang L. Predicting long-term multicategory cause of death in patients with prostate cancer: random forest versus multinomial model. Am J Cancer Res 2020; 10:1344-1355. [PMID: 32509383 PMCID: PMC7269775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2020] [Accepted: 04/07/2020] [Indexed: 06/11/2023] Open
Abstract
The majority of patients with prostate cancer die of non-cancer causes of death (COD). It is thus important to accurately predict multi-category COD in these patients. Random forest (RF), a popular machine learning model, has been shown useful for predicting binary cancer-specific deaths. However, its accuracy for predicting multi-category COD in cancer patients is unclear. We included patients in Surveillance, Epidemiology, and End Results-18 cancer registry-program with prostate cancer diagnosed in 2004 (followed-up through 2016). They were randomly divided into training and testing sets with equal sizes. We evaluated prediction accuracies of RF and conventional statistical/multinomial models for 6-category COD by data-encoding types using the 2-fold cross-validation approach. Among 49,864 prostate cancer patients, 29,611 (59.4%) were alive at the end of follow-up, and 5,448 (10.9%) died of cardiovascular disease, 4,607 (9.2%) of prostate cancer, 3,681 (7.4%) of non-prostate cancer, 717 (1.4%) of infection, and 5,800 (11.6%) of other causes. We predicted 6-category COD among these patients with a mean accuracy of 59.1% (n=240, 95% CI, 58.7%-59.4%) in RF models with one-hot encoding, and 50.4% (95% CI, 49.7%-51.0%) in multinomial models. Tumor characteristics, prostate-specific antigen level, and diagnosis confirmation-method were important in RF and multinomial models. In RF models, no statistical differences were found between the accuracies of training versus cross-validation phases, and those of categorical versus one-hot encoding. We here report that RF models can outperform multinomial logistic models (absolute accuracy-difference, 8.7%) in predicting long-term 6-category COD among prostate cancer patients, while pathology diagnosis itself and tumor pathology remain important factors.
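The percentages and the absolute accuracy difference reported in this abstract can be re-derived from the stated counts with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative.

```python
# Cause-of-death counts as reported in the abstract (6 categories).
counts = {
    "alive": 29611,
    "cardiovascular disease": 5448,
    "prostate cancer": 4607,
    "non-prostate cancer": 3681,
    "infection": 717,
    "other causes": 5800,
}

total = sum(counts.values())  # should match the stated cohort of 49,864

# Share of each category, rounded to one decimal place as in the abstract.
shares = {cause: round(100 * n / total, 1) for cause, n in counts.items()}

# Mean accuracies (in percent) reported for the two model families.
rf_accuracy = 59.1
multinomial_accuracy = 50.4
diff = round(rf_accuracy - multinomial_accuracy, 1)

print(total)
print(shares)
print(diff)
```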
Affiliation(s)
- Jianwei Wang
- Department of Urology, Beijing Jishuitan Hospital, The Fourth Medical College of Peking University, Beijing, China
| | - Fei Deng
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
| | - Fuqing Zeng
- Department of Urology, Wuhan Union Hospital of Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | | | - Wei Vivian Li
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Piscataway, NJ, USA
| | - Lanjing Zhang
- Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA
- Department of Biological Sciences, Rutgers University, Newark, NJ, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
| |
|