1
Ding Q, Li C, Wang C, Ding Q. Construction and interpretation of weight-balanced enhanced machine learning models for predicting liver metastasis risk in colorectal cancer patients. Discov Oncol 2025; 16:164. [DOI: 10.1007/s12672-025-01871-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Accepted: 02/03/2025] [Indexed: 04/06/2025]

2
Ding Q, Li C, Wang C, Ding Q. Construction and interpretation of weight-balanced enhanced machine learning models for predicting liver metastasis risk in colorectal cancer patients. Discov Oncol 2025; 16:164. [PMID: 39937330 PMCID: PMC11822177 DOI: 10.1007/s12672-025-01871-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Accepted: 02/03/2025] [Indexed: 02/13/2025]
Abstract
BACKGROUND Colorectal cancer (CRC) is a major contributor to cancer-related mortality, with liver metastases developing in approximately 25% of affected individuals. The presence of liver metastasis significantly worsens patient prognosis. The objective of this study is to predict liver metastasis in CRC patients by developing machine learning (ML)-based models, thereby aiding clinicians in deciding on appropriate interventions.
METHODS A retrospective analysis was performed using the Surveillance, Epidemiology, and End Results (SEER) database, and CRC cases from 2010 to 2015 were extracted for downstream analysis. Logistic regression (LR), Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and LightGBM were applied to develop ML models to predict liver metastasis in CRC patients. To optimize the models, an improved weight-balancing algorithm was employed, enhancing the performance of the classifiers. The six models were evaluated with tenfold cross-validation, and the optimal model was selected based on a combination of performance metrics. Shapley additive explanation (SHAP) was used to interpret the best-performing ML models globally, locally, and interactively. To ensure the model's reliability and generalizability, an external validation cohort of CRC cases from 2018 to 2021, obtained from a separate SEER database, was used for external evaluation.
RESULTS In total, 50,062 patients with CRC were included in the analysis, of whom 5604 developed liver metastasis. Among the six models evaluated, the CatBoost model performed best, with the highest AUC of 0.8844. The CatBoost model also outperformed the others in terms of recall (0.8060) and F1-score (0.6736). SHAP-based summary and force plots were used to interpret the CatBoost model. The interpretability analysis revealed that elevated carcinoembryonic antigen (CEA) levels, systemic therapy, N and T stages, and chemotherapy performed were the most significant indicators for predicting liver metastasis according to the optimal model. Furthermore, systemic therapy was suggested to increase liver metastasis risk in N0 stage patients, while it appeared to be beneficial in patients with lymph node metastasis. Preoperative radiation therapy was found to be more effective than postoperative radiation therapy. Validation using an external cohort of CRC cases from 2018 to 2021 further confirmed the robustness and stability of the CatBoost model, as its overall performance remained consistent with the internal validation results.
CONCLUSION Elevated levels of carcinoembryonic antigen (CEA) were identified as a crucial clinical predictor of liver metastasis in CRC patients. Furthermore, the administration of systemic therapy to patients without lymph node involvement was found to increase the risk of liver metastasis. In terms of radiation therapy, preoperative radiation appears to be more efficacious than postoperative radiation in controlling the risk of liver metastasis. This finding underscores the importance of optimizing treatment strategies based on the specific clinical context and patient characteristics.
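The class imbalance reported in this abstract (5604 metastasis cases among 50,062 patients) can be made concrete with a short worked example. The abstract does not describe the authors' "improved weight-balancing algorithm", so the sketch below substitutes the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), purely as an assumption. It also derives the precision implied by the reported recall and F1-score from the F1 definition; the function names are hypothetical.

```python
def balanced_class_weights(counts):
    """Common 'balanced' heuristic: n_samples / (n_classes * class_count).

    This is a stand-in, not the paper's improved weight-balancing algorithm,
    which the abstract does not specify.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for precision P."""
    return f1 * recall / (2 * recall - f1)


# Cohort sizes as reported in the abstract.
total, with_metastasis = 50_062, 5_604
counts = {
    "no_metastasis": total - with_metastasis,
    "metastasis": with_metastasis,
}

weights = balanced_class_weights(counts)
prevalence = with_metastasis / total
# Recall and F1 are the CatBoost values reported in the abstract.
implied_precision = precision_from_f1(f1=0.6736, recall=0.8060)

print(f"prevalence of liver metastasis: {prevalence:.3f}")
for label, w in weights.items():
    print(f"weight[{label}] = {w:.3f}")
print(f"precision implied by recall and F1: {implied_precision:.3f}")
```

Under this heuristic each class contributes the same total weight (half the sample count), which is why the minority class receives the larger per-sample weight.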
Affiliation(s)
- Qunzhe Ding
  - School of Information Management, Wuhan University, Wuhan, Hubei, 430072, People's Republic of China
- Chenyang Li
  - Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Clinical Medical Research Center of Hepatic Surgery at Hubei Province, Wuhan, Hubei, China
  - Hubei Key Laboratory of Hepato-Pancreatic-Biliary Diseases, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Chendong Wang
  - Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - Clinical Medical Research Center of Hepatic Surgery at Hubei Province, Wuhan, Hubei, China
  - Hubei Key Laboratory of Hepato-Pancreatic-Biliary Diseases, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Qunzhe Ding
  - School of Information Management, Wuhan University, Wuhan, Hubei, 430072, People's Republic of China

3
Bukhari I, Li M, Li G, Xu J, Zheng P, Chu X. Pinpointing the integration of artificial intelligence in liver cancer immune microenvironment. Front Immunol 2024; 15:1520398. [PMID: 39759506 PMCID: PMC11695355 DOI: 10.3389/fimmu.2024.1520398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Accepted: 12/02/2024] [Indexed: 01/07/2025]
Abstract
Liver cancer remains one of the most formidable challenges in modern medicine, characterized by high incidence and mortality rates. Emerging evidence underscores the critical roles of the immune microenvironment in tumor initiation, development, prognosis, and therapeutic responsiveness. However, the composition of the immune microenvironment of liver cancer (LC-IME) and its association with clinicopathological significance remain unelucidated. In this review, we present recent developments in the use of artificial intelligence (AI) for studying the immune microenvironment of liver cancer, focusing on the deciphering of complex high-throughput data. Additionally, we discuss the current challenges of data harmonization and algorithm interpretability for studying LC-IME.
Affiliation(s)
- Ihtisham Bukhari
  - Department of Oncology, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Marshall B. J. Medical Research Center, Zhengzhou University, Zhengzhou, Henan, China
- Mengxue Li
  - Marshall B. J. Medical Research Center, Zhengzhou University, Zhengzhou, Henan, China
- Guangyuan Li
  - Department of Oncology, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Jixuan Xu
  - Department of Gastrointestinal & Thyroid Surgery, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Pengyuan Zheng
  - Marshall B. J. Medical Research Center, Zhengzhou University, Zhengzhou, Henan, China
- Xiufeng Chu
  - Department of Oncology, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Marshall B. J. Medical Research Center, Zhengzhou University, Zhengzhou, Henan, China

4
Yan Z, Wu Y, Chen Y, Xu J, Zhang X, Yin Q. A clinical prediction model for distant metastases of pediatric neuroblastoma: an analysis based on the SEER database. Front Pediatr 2024; 12:1417818. [PMID: 39363969 PMCID: PMC11447546 DOI: 10.3389/fped.2024.1417818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 09/03/2024] [Indexed: 10/05/2024]
Abstract
Background Patients with distant metastases from neuroblastoma (NB) usually have a poorer prognosis, and early diagnosis is essential to prevent distant metastases. The aim was to develop a machine-learning model for predicting the risk of distant metastasis in patients with neuroblastoma to aid clinical diagnosis and treatment decisions.
Methods We built a predictive model using data from the Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2018 on 1,542 patients with neuroblastoma. Seven machine-learning methods were employed to forecast the likelihood of neuroblastoma distant metastases. Univariate and multivariate logistic regression analyses were used to identify independent risk factors for building the machine learning models. The area under the receiver operating characteristic curve (AUC), precision-recall (PR) curves, decision curve analysis (DCA), and calibration curves were then used to assess model performance. To further explain the optimal model, the Shapley additive explanations (SHAP) method was applied. Ultimately, the best model was used to create an online calculator that estimates the likelihood of neuroblastoma distant metastases.
Results The study included 1,542 patients with neuroblastoma. Multivariate logistic regression analysis showed that age, histology, tumor size, tumor grade, primary site, surgery, chemotherapy, and radiotherapy were independent risk factors for distant metastasis of neuroblastoma (P < 0.05). Logistic regression (LR) was found to be the optimal algorithm among the seven constructed, with the highest AUC values of 0.835 and 0.850 in the training and validation sets, respectively. Finally, we used the logistic regression model to build a web-based calculator for distant metastasis of neuroblastoma.
Conclusion The study developed and validated a machine learning model based on clinical and pathological information for predicting the risk of distant metastasis in patients with neuroblastoma, which may help physicians make clinical decisions.
Affiliation(s)
- Zhiwei Yan
  - Department of Paediatric Surgery, Affiliated Hospital of Nantong University, Medical School of Nantong University, Nantong, China
- Yumeng Wu
  - Cancer Research Center Nantong, Affiliated Tumor Hospital of Nantong University, Nantong, China
- Yuehua Chen
  - Department of Pediatric Surgery, Affiliated Hospital of Nantong University, Nantong, China
- Jian Xu
  - Department of Medical Oncology, Nantong Second Peoples Affiliated Hospital of Nantong University, Nantong, Jiangsu, China
- Xiubing Zhang
  - Department of Medical Oncology, Nantong Second Peoples Affiliated Hospital of Nantong University, Nantong, Jiangsu, China
- Qiyou Yin
  - Department of Pediatric Surgery, Affiliated Hospital of Nantong University, Nantong, China

5
Li L, Sun M, Wang J, Wan S. Multi-omics based artificial intelligence for cancer research. Adv Cancer Res 2024; 163:303-356. [PMID: 39271266 DOI: 10.1016/bs.acr.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
With significant advancements in next-generation sequencing technologies, large amounts of multi-omics data, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, have been accumulated, offering an unprecedented opportunity to explore the heterogeneity and complexity of cancer across various molecular levels and scales. One of the promising aspects of multi-omics lies in its capacity to offer a holistic view of the biological networks and pathways underpinning cancer, facilitating a deeper understanding of its development, progression, and response to treatment. However, the exponential growth of data generated by multi-omics studies presents significant analytical challenges. Processing, analyzing, integrating, and interpreting these multi-omics datasets to extract meaningful insights is an ambitious task that stands at the forefront of current cancer research. The application of artificial intelligence (AI) has emerged as a powerful solution to these challenges, demonstrating exceptional capabilities in deciphering complex patterns and extracting valuable information from large-scale, intricate omics datasets. This review delves into the synergy of AI and multi-omics, highlighting its revolutionary impact on oncology. We dissect how this confluence is reshaping the landscape of cancer research and clinical practice, particularly in the realms of early detection, diagnosis, prognosis, treatment, and pathology. Additionally, we elaborate on the latest AI methods for multi-omics integration to provide comprehensive insight into the complex biological mechanisms and inherent heterogeneity of cancer. Finally, we discuss the current challenges of data harmonization, algorithm interpretability, and ethical considerations. Addressing these challenges necessitates multidisciplinary collaboration, paving the way for more precise, personalized, and effective treatments for cancer patients.
Affiliation(s)
- Lusheng Li
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
- Mengtao Sun
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
- Jieqiong Wang
  - Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE, United States
- Shibiao Wan
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States