1
Asadi F, Rahimi M, Ramezanghorbani N, Almasi S. Comparing the Effectiveness of Artificial Intelligence Models in Predicting Ovarian Cancer Survival: A Systematic Review. Cancer Rep (Hoboken) 2025; 8:e70138. [PMID: 40103563 PMCID: PMC11920737 DOI: 10.1002/cnr2.70138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 12/23/2024] [Accepted: 01/27/2025] [Indexed: 03/20/2025] Open
Abstract
BACKGROUND This systematic review investigates the use of machine learning (ML) algorithms in predicting survival outcomes for ovarian cancer (OC) patients. Key prognostic endpoints, including overall survival (OS), recurrence-free survival (RFS), progression-free survival (PFS), and treatment response prediction (TRP), are examined to evaluate the effectiveness of these algorithms and identify significant features that influence predictive accuracy. RECENT FINDINGS A thorough search of four major databases (PubMed, Scopus, Web of Science, and Cochrane) resulted in 2400 articles published within the last decade, with 32 studies meeting the inclusion criteria. Notably, most publications emerged after 2021. Commonly used algorithms for survival prediction included random forest, support vector machines, logistic regression, XGBoost, and various deep learning models. Evaluation metrics such as area under the curve (AUC) (18 studies), concordance index (C-index) (11 studies), and accuracy (11 studies) were frequently employed. Age at diagnosis, tumor stage, CA-125 levels, and treatment-related factors were consistently highlighted as significant predictors, emphasizing their relevance in OC prognosis. CONCLUSION ML models demonstrate considerable potential for predicting OC survival outcomes; however, challenges persist regarding model accuracy and interpretability. Incorporating diverse data types, such as clinical, imaging, and molecular datasets, holds promise for enhancing predictive capabilities. Future advancements will depend on integrating heterogeneous data sources with multimodal ML approaches, which are crucial for improving prognostic precision in OC.
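The concordance index (C-index) named in this abstract can be illustrated with a minimal sketch. This is Harrell's pairwise formulation under simplifying assumptions (pairs with tied follow-up times are skipped); the function name and the toy data are illustrative and are not taken from the review.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times:       observed follow-up times
    events:      1 if the event (e.g. death) was observed, 0 if censored
    risk_scores: higher score = higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is comparable only when
        # the shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored at t=6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.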
Affiliation(s)
- Farkhondeh Asadi
  - Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Milad Rahimi
  - Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Nahid Ramezanghorbani
  - Department of Development & Coordination Scientific Information and Publications, Deputy of Research & Technology, Ministry of Health & Medical Education, Tehran, Iran
- Sohrab Almasi
  - Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
2
Sun X, Nong M, Meng F, Sun X, Jiang L, Li Z, Zhang P. Architecting the metabolic reprogramming survival risk framework in LUAD through single-cell landscape analysis: three-stage ensemble learning with genetic algorithm optimization. J Transl Med 2024; 22:353. [PMID: 38622716 PMCID: PMC11017668 DOI: 10.1186/s12967-024-05138-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 03/27/2024] [Indexed: 04/17/2024] Open
Abstract
Recent studies have increasingly revealed the connection between metabolic reprogramming and tumor progression. However, the specific impact of metabolic reprogramming on inter-patient heterogeneity and prognosis in lung adenocarcinoma (LUAD) still requires further exploration. Here, we introduced a cellular hierarchy framework according to a malignant and metabolic gene set, named malignant & metabolism reprogramming (MMR), to reanalyze 178,739 single-cell reference profiles. Furthermore, we proposed a three-stage ensemble learning pipeline, aided by genetic algorithm (GA), for survival prediction across 9 LUAD cohorts (n = 2066). Throughout the pipeline of developing the three stage-MMR (3 S-MMR) score, double training sets were implemented to avoid over-fitting; the gene-pairing method was utilized to remove batch effect; GA was harnessed to pinpoint the optimal basic learner combination. The novel 3 S-MMR score reflects various aspects of LUAD biology, provides new insights into precision medicine for patients, and may serve as a generalizable predictor of prognosis and immunotherapy response. To facilitate the clinical adoption of the 3 S-MMR score, we developed an easy-to-use web tool for risk scoring as well as therapy stratification in LUAD patients. In summary, we have proposed and validated an ensemble learning model pipeline within the framework of metabolic reprogramming, offering potential insights for LUAD treatment and an effective approach for developing prognostic models for other diseases.
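The abstract does not describe how the gene-pairing step works. One common way to make expression features robust to batch effects is to replace raw values with within-sample comparisons between two genes; the sketch below illustrates that general idea only and should not be read as the authors' implementation. Gene names and values are hypothetical.

```python
from itertools import combinations


def gene_pair_features(expression, genes):
    """Turn one sample's expression values into binary gene-pair features.

    expression: dict mapping gene name -> expression value for ONE sample
    genes:      ordered list of gene names to pair up
    Returns {(gene_a, gene_b): 1 if expr[a] > expr[b] else 0}.
    """
    return {
        (a, b): int(expression[a] > expression[b])
        for a, b in combinations(genes, 2)
    }


# Hypothetical gene names and values, for illustration only.
sample = {"G1": 5.0, "G2": 2.0, "G3": 7.5}
# A monotone per-sample shift (scale by 10, add 3) keeps every ordering,
# so the pair features are unchanged.
shifted = {gene: 10 * value + 3 for gene, value in sample.items()}

genes = ["G1", "G2", "G3"]
print(gene_pair_features(sample, genes))
print(gene_pair_features(sample, genes) == gene_pair_features(shifted, genes))
```

Because each feature depends only on the relative order of two genes within a sample, any monotone distortion applied to the whole sample leaves the features intact.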
Affiliation(s)
- Xinti Sun
  - Department of Cardiothoracic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Minyu Nong
  - School of Clinical Medicine, Youjiang Medical University for Nationalities, Baise, Guangxi, China
- Fei Meng
  - Department of Cardiothoracic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Xiaojuan Sun
  - Department of Oncology, Qingdao University Affiliated Hospital, Qingdao, Shandong, China
- Lihe Jiang
  - School of Clinical Medicine, Youjiang Medical University for Nationalities, Baise, Guangxi, China
- Zihao Li
  - Department of Cardiothoracic Surgery, Tianjin Medical University General Hospital, Tianjin, China
- Peng Zhang
  - Department of Cardiothoracic Surgery, Tianjin Medical University General Hospital, Tianjin, China
3
Nunez JJ, Leung B, Ho C, Ng RT, Bates AT. Predicting which patients with cancer will see a psychiatrist or counsellor from their initial oncology consultation document using natural language processing. Communications Medicine 2024; 4:69. [PMID: 38589545 PMCID: PMC11001970 DOI: 10.1038/s43856-024-00495-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 03/28/2024] [Indexed: 04/10/2024] Open
Abstract
BACKGROUND Patients with cancer often have unmet psychosocial needs. Early detection of who requires referral to a counsellor or psychiatrist may improve their care. This work used natural language processing to predict which patients will see a counsellor or psychiatrist from a patient's initial oncology consultation document. We believe this is the first use of artificial intelligence to predict psychiatric outcomes from non-psychiatric medical documents. METHODS This retrospective prognostic study used data from 47,625 patients at BC Cancer. We analyzed initial oncology consultation documents using traditional and neural language models to predict whether patients would see a counsellor or psychiatrist in the 12 months following their initial oncology consultation. RESULTS Here, we show our best models achieved a balanced accuracy (receiver-operating-characteristic area-under-curve) of 73.1% (0.824) for predicting seeing a psychiatrist, and 71.0% (0.784) for seeing a counsellor. Different words and phrases are important for predicting each outcome. CONCLUSION These results suggest natural language processing can be used to predict psychosocial needs of patients with cancer from their initial oncology consultation document. Future research could extend this work to predict the psychosocial needs of medical patients in other settings.
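Balanced accuracy, the headline metric in this abstract, is the mean of sensitivity and specificity, which makes it less misleading than plain accuracy when one outcome is rare. A minimal sketch follows, with made-up labels rather than study data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced example: 2 positives, 8 negatives.
y_true = [1, 1] + [0] * 8
always_negative = [0] * 10
print(balanced_accuracy(y_true, always_negative))
```

In this toy case a model that always predicts the majority class reaches 80% plain accuracy but only 0.5 balanced accuracy.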
Affiliation(s)
- John-Jose Nunez
  - BC Cancer, Vancouver, BC, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Raymond T Ng
  - Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
- Alan T Bates
  - BC Cancer, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
4
Nguyen E, Cui Z, Kokaraki G, Carlson J, Liu Y. Transferable and Interpretable Treatment Effectiveness Prediction for Ovarian Cancer via Multimodal Deep Learning. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:550-558. [PMID: 38222355 PMCID: PMC10785847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Ovarian cancer, a potentially life-threatening disease, is often difficult to treat. There is a critical need for innovations that can assist in improved therapy selection. Although deep learning models are showing promising results, they are employed as a "black-box" and require enormous amounts of data. Therefore, we explore the transferable and interpretable prediction of treatment effectiveness for ovarian cancer patients. Unlike existing works focusing on histopathology images, we propose a multimodal deep learning framework which takes into account not only large histopathology images, but also clinical variables to increase the scope of the data. The results demonstrate that the proposed models achieve high prediction accuracy and interpretability, and can also be transferred to other cancer datasets without significant loss of performance.
Affiliation(s)
- Emily Nguyen
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Zijun Cui
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Georgia Kokaraki
  - Keck School of Medicine, University of Southern California, Los Angeles, CA, U.S.A
- Joseph Carlson
  - Keck School of Medicine, University of Southern California, Los Angeles, CA, U.S.A
- Yan Liu
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
5
Xu L, Guo C, Liu M. A weighted distance-based dynamic ensemble regression framework for gastric cancer survival time prediction. Artif Intell Med 2024; 147:102740. [PMID: 38184344 DOI: 10.1016/j.artmed.2023.102740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 10/28/2023] [Accepted: 11/28/2023] [Indexed: 01/08/2024]
Abstract
Accurate prediction of gastric cancer patient survival time is essential for clinical decision-making. However, unified static models lack specificity and flexibility in predictions owing to the varying survival outcomes among gastric cancer patients. We address these problems by using an ensemble learning approach and adaptively assigning greater weights to similar patients to make more targeted predictions when predicting an individual's survival time. We treat these problems as regression problems and introduce a weighted dynamic ensemble regression framework. To better identify similar patients, we devise a method to measure patient similarity, considering the diverse impacts of features. Subsequently, we use this measure to design both a weighted K-means clustering method and a fuzzy K-means sampling technique to group patients and train corresponding base regressors. To achieve more targeted predictions, we calculate the weight of each base regressor based on the similarity between the patient to be predicted and the patient clusters, culminating in the integration of the results. The model is validated on a dataset of 7791 patients, outperforming other models in terms of three evaluation metrics, namely, the root mean square error, mean absolute error, and the coefficient of determination. The weighted dynamic ensemble regression strategy can improve the baseline model by 1.75%, 2.12%, and 13.45% in terms of the three respective metrics while also mitigating the imbalanced survival time distribution issue. This enhanced performance has been statistically validated, even when tested on six public datasets with different sizes. By considering feature variations, patients with distinct survival profiles can be effectively differentiated, and the model predictive performance can be enhanced. The results generated by our proposed model can be invaluable in guiding decisions related to treatment plans and resource allocation. 
Furthermore, the model has the potential for broader applications in prognosis for other types of cancers or similar regression problems in various domains.
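The core idea in this abstract, weighting each cluster's base regressor by how similar the new patient is to that cluster, can be sketched as follows. This is a simplified illustration and not the authors' method: it assumes plain Euclidean distance to cluster centroids and inverse-distance weights, whereas the paper describes a feature-weighted similarity measure. The three evaluation metrics named in the abstract are included. All names and numbers are hypothetical.

```python
import math


def ensemble_predict(patient, centroids, regressors, eps=1e-9):
    """Combine per-cluster regressors, weighting each by closeness of the
    patient to that cluster's centroid (inverse-distance weights)."""
    weights = [1.0 / (math.dist(patient, c) + eps) for c in centroids]
    total = sum(weights)
    return sum(w * reg(patient) for w, reg in zip(weights, regressors)) / total


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical setup: two patient clusters, each with a constant
# "regressor" predicting survival time in months.
centroids = [(0.0, 0.0), (10.0, 0.0)]
regressors = [lambda x: 12.0, lambda x: 36.0]
print(ensemble_predict((5.0, 0.0), centroids, regressors))
print(ensemble_predict((1.0, 0.0), centroids, regressors))
```

A patient equidistant from both clusters receives the plain average of the two predictions, while a patient close to one cluster is dominated by that cluster's regressor.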
Affiliation(s)
- Liangchen Xu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Chonghui Guo
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Mucan Liu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
6
Pereira TF, Aranha VJ, Waldvogel BC, da Costa AM, Tavares Guerreiro Fregnani JH. Deterministic linkage for improving follow-up time in a Brazilian population-based cancer registry. Sci Rep 2023; 13:4816. [PMID: 36964184 PMCID: PMC10039007 DOI: 10.1038/s41598-023-31303-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 03/09/2023] [Indexed: 03/26/2023] Open
Abstract
Population-based cancer registries (PBCR) are the primary source of cancer incidence and survival statistics. The loss to follow-up of these patients is concerning since it reduces the reliability of any statistical analysis. Linkage techniques have been increasingly used to improve data quality in various information systems. Linkage was performed between the database of the PBCR-Barretos and the mortality database of the state of São Paulo. To evaluate the improvement in patient follow-up time, the registry data were compared before and after linkage. Three analyses were performed: a comparative analysis of the absolute number of deaths, a comparative analysis of the follow-up time of patients, and a survival analysis. After linkage, there was an increase of 813 deaths. An extension of patient follow-up time was observed for most types of tumours. Comparing the survival analyses at both time points also showed a decrease in survival probabilities for all tumour types. Deterministic linkage is effective in updating the vital status of registered patients, improving patient follow-up time, and maintaining good quality data from PBCRs, consequently producing more reliable rates, as seen for the survival analyses.
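Deterministic linkage means two records are linked only when a chosen set of identifying fields matches exactly. The abstract does not state which fields were used, so the sketch below assumes a hypothetical key of normalized name, birth date, and mother's name; field names and records are invented for illustration.

```python
import unicodedata


def normalize(text):
    """Uppercase, strip accents and collapse whitespace so that trivial
    spelling variants produce the same key."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def link_key(record):
    # Hypothetical linkage key; the real study may use different fields.
    return (
        normalize(record["name"]),
        record["birth_date"],
        normalize(record["mother_name"]),
    )


def update_vital_status(registry, deaths):
    """Return copies of registry records, marking exact key matches as dead."""
    death_by_key = {link_key(d): d["death_date"] for d in deaths}
    updated = []
    for patient in registry:
        death_date = death_by_key.get(link_key(patient))
        if death_date is not None:
            updated.append({**patient, "vital_status": "dead",
                            "last_follow_up": death_date})
        else:
            updated.append(dict(patient))
    return updated


# Invented records for illustration only.
registry = [
    {"name": "José da Silva", "birth_date": "1950-03-02",
     "mother_name": "Maria da Silva", "vital_status": "alive",
     "last_follow_up": "2015-06-01"},
    {"name": "Ana Souza", "birth_date": "1961-11-20",
     "mother_name": "Rita Souza", "vital_status": "alive",
     "last_follow_up": "2016-01-15"},
]
deaths = [
    {"name": "JOSE  DA SILVA", "birth_date": "1950-03-02",
     "mother_name": "MARIA DA SILVA", "death_date": "2018-09-30"},
]
linked = update_vital_status(registry, deaths)
print(linked[0]["vital_status"], linked[0]["last_follow_up"])
```

Unlike probabilistic linkage, this approach never links records that differ in any key field after normalization, which keeps false matches low at the cost of missing records with genuine data-entry differences.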
Affiliation(s)
- Talita Fernanda Pereira
  - Post Graduate Program of the Education and Research Institute, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo, 14784-400, Brazil
  - Based-Population Cancer Registry of Barretos Region, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo, 14784-400, Brazil
- Allini Mafra da Costa
  - Post Graduate Program of the Education and Research Institute, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo, 14784-400, Brazil
  - Based-Population Cancer Registry of Barretos Region, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo, 14784-400, Brazil
  - Department of Precision Health, Luxembourg Institute of Health, 1445, Strassen, Luxembourg
- José Humberto Tavares Guerreiro Fregnani
  - Post Graduate Program of the Education and Research Institute, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo, 14784-400, Brazil
  - A.C. Camargo Cancer Center, São Paulo, 01525-001, Brazil
7
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
8
Nunez JJ, Leung B, Ho C, Bates AT, Ng RT. Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085 PMCID: PMC9972192 DOI: 10.1001/jamanetworkopen.2023.0813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 03/01/2023] Open
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
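The receiver operating characteristic AUC reported in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal pairwise sketch of that definition follows; it is quadratic in the number of cases and uses toy scores, not study data.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked
    correctly, counting tied scores as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: one positive (score 0.4) is ranked below one negative (0.6).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

For large datasets a rank-based computation is preferable, but the pairwise form makes the interpretation explicit.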
Affiliation(s)
- John-Jose Nunez
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Cheryl Ho
  - BC Cancer, Vancouver, British Columbia, Canada
- Alan T. Bates
  - BC Cancer, Vancouver, British Columbia, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond T. Ng
  - Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
9
Adeoye J, Akinshipo A, Koohi-Moghadam M, Thomson P, Su YX. Construction of machine learning-based models for cancer outcomes in low and lower-middle income countries: A scoping review. Front Oncol 2022; 12:976168. [PMID: 36531037 PMCID: PMC9751812 DOI: 10.3389/fonc.2022.976168] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/23/2022] [Accepted: 11/14/2022] [Indexed: 01/31/2025] Open
Abstract
Background The impact and utility of machine learning (ML)-based prediction tools for cancer outcomes including assistive diagnosis, risk stratification, and adjunctive decision-making have been largely described and realized in the high income and upper-middle-income countries. However, statistical projections have estimated higher cancer incidence and mortality risks in low and lower-middle-income countries (LLMICs). Therefore, this review aimed to evaluate the utilization, model construction methods, and degree of implementation of ML-based models for cancer outcomes in LLMICs. Methods PubMed/Medline, Scopus, and Web of Science databases were searched and articles describing the use of ML-based models for cancer among local populations in LLMICs between 2002 and 2022 were included. A total of 140 articles from 22,516 citations that met the eligibility criteria were included in this study. Results ML-based models from LLMICs were more often based on traditional ML algorithms than on deep or deep hybrid learning. We found that the construction of ML-based models was skewed to particular LLMICs such as India, Iran, Pakistan, and Egypt with a paucity of applications in sub-Saharan Africa. Moreover, models for breast, head and neck, and brain cancer outcomes were frequently explored. Many models were deemed suboptimal according to the Prediction model Risk of Bias Assessment tool (PROBAST) due to sample size constraints and technical flaws in ML modeling even though their performance accuracy ranged from 0.65 to 1.00. While the development and internal validation were described for all models included (n=137), only 4.4% (6/137) have been validated in independent cohorts and 0.7% (1/137) have been assessed for clinical impact and efficacy. Conclusion Overall, the application of ML for modeling cancer outcomes in LLMICs is increasing. However, model development is largely unsatisfactory.
We recommend model retraining using larger sample sizes, intensified external validation practices, and increased impact assessment studies using randomized controlled trial designs. Systematic review registration https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=308345, identifier CRD42022308345.
Affiliation(s)
- John Adeoye
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
  - Oral Cancer Research Theme, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
- Abdulwarith Akinshipo
  - Department of Oral and Maxillofacial Pathology and Biology, Faculty of Dentistry, University of Lagos, Lagos, Nigeria
- Mohamad Koohi-Moghadam
  - Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
  - Clinical Artificial Intelligence Research Theme, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
- Peter Thomson
  - College of Medicine and Dentistry, James Cook University, Cairns, Queensland, Australia
- Yu-Xiong Su
  - Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
  - Oral Cancer Research Theme, Faculty of Dentistry, University of Hong Kong, Hong Kong, Hong Kong, SAR, China
10
P D, C G. A systematic review on machine learning and deep learning techniques in cancer survival prediction. Progress in Biophysics and Molecular Biology 2022; 174:62-71. [PMID: 35933043 DOI: 10.1016/j.pbiomolbio.2022.07.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/07/2022] [Revised: 07/13/2022] [Accepted: 07/19/2022] [Indexed: 06/15/2023]
Abstract
Cancer is a disease characterised by the unusual and uncontrollable growth of body cells. This usually happens asymptomatically and spreads to other parts of the body. The major problem in treating cancer is that its progress is not monitored once it is diagnosed. Disease progress, or prognosis, can be assessed through survival analysis. Survival analysis is the branch of statistics that deals with predicting the time until an event occurs. In cancer prognosis, the quantity of interest can be the patient's survival time from the onset of the disease or the time to recurrence of the disease after treatment. This study aims to bring out the machine learning and deep learning models involved in providing the prognosis to cancer patients.
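As a concrete instance of the survival analysis described here, the Kaplan-Meier estimator computes the probability of surviving past each observed event time while accounting for censored patients. The abstract does not single out this estimator; it is shown as a standard example, with made-up follow-up data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each patient
    events: 1 if the event (death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # events + censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: one patient censored at t=2 and one at t=4.
for time, prob in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(time, round(prob, 3))
```

Censored patients contribute to the at-risk count until they leave the study, which is what distinguishes this estimate from simply dividing survivors by the initial cohort size.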
Affiliation(s)
- Deepa P
  - School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- Gunavathi C
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
11
Zhu S, Kong W, Zhu J, Huang L, Wang S, Bi S, Xie Z. The genetic algorithm-aided three-stage ensemble learning method identified a robust survival risk score in patients with glioma. Brief Bioinform 2022; 23:6694808. [DOI: 10.1093/bib/bbac344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2022] [Revised: 07/14/2022] [Accepted: 07/25/2022] [Indexed: 02/07/2023] Open
Abstract
Ensemble learning is a kind of machine learning method which can integrate multiple basic learners together and achieve higher accuracy. Recently, single machine learning methods have been established to predict survival for patients with cancer. However, a robust ensemble learning model with high accuracy for identifying high-risk patients was still lacking. To achieve this, we proposed a novel genetic algorithm-aided three-stage ensemble learning method (3S score) for survival prediction. During the process of constructing the 3S score, double training sets were used to avoid over-fitting; the gene-pairing method was applied to reduce batch effect; a genetic algorithm was employed to select the best basic learner combination. When used to predict the survival state of glioma patients, this model achieved the highest C-index (0.697) as well as the highest areas under the receiver operating characteristic curve (ROC-AUCs) (first year = 0.705, third year = 0.825 and fifth year = 0.839) in the combined test set (n = 1191), compared with 12 other baseline models. Furthermore, the 3S score can distinguish survival significantly in eight of the nine independent test cohorts (P < 0.05), achieving significant improvement of ROC-AUCs. Notably, ablation experiments demonstrated that the gene-pairing method, double training sets and genetic algorithm ensure the robustness and effectiveness of the 3S score. The performance exploration on pan-cancer showed that the 3S score has excellent ability on survival prediction in five kinds of cancers, which was verified by Cox regression, survival curves and ROC curves together. To enable its clinical adoption, we implemented the 3S score and two other clinical factors as an easy-to-use web tool for risk scoring and therapy stratification in glioma patients.
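Selecting a combination of base learners with a genetic algorithm can be sketched by encoding each candidate ensemble as a bitmask and evolving a population of such masks. The abstract gives no details of the encoding, operators, or fitness function, so everything below (truncation selection, single-point crossover, bit-flip mutation, elitism, and the toy fitness) is an assumption for illustration rather than the authors' procedure.

```python
import random


def genetic_select(n_learners, fitness, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Search for a good on/off mask over base learners (n_learners >= 2).

    fitness: callable taking a tuple of 0/1 bits and returning a score
             to maximise (in practice, e.g. a validation C-index).
    """
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_learners))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = [ranked[0]]                   # elitism: keep the best mask
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_learners)   # single-point crossover
            child = mum[:cut] + dad[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best


# Toy stand-in for a validation score: hypothetical per-learner gains,
# minus a small penalty per selected learner to discourage large ensembles.
GAINS = [0.30, -0.10, 0.25, 0.05, -0.20, 0.15]


def toy_fitness(mask):
    return sum(g for g, bit in zip(GAINS, mask) if bit) - 0.02 * sum(mask)


best_mask = genetic_select(len(GAINS), toy_fitness)
print(best_mask, round(toy_fitness(best_mask), 3))
```

In a real pipeline the fitness call would train or score the candidate ensemble on held-out data, which is usually the dominant cost, so caching fitness values per mask is a sensible addition.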
Affiliation(s)
- Sujie Zhu
  - Institute of Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Weikaixin Kong
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
  - Institute Sanqu Technology (Hangzhou) Co., Ltd., Hangzhou, China
- Jie Zhu
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
- Liting Huang
  - Institute of Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Shixin Wang
  - Institute of Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Suzhen Bi
  - Institute of Translational Medicine, The Affiliated Hospital of Qingdao University, College of Medicine, Qingdao University, Qingdao, China
- Zhengwei Xie
  - Peking University International Cancer Institute and Department of Pharmacology, School of Basic Medical Sciences, Peking University, Beijing, China
12
Body Weight Is a Valid Predictor of the Long-Term Prognosis of Cervical Cancer. Computational and Mathematical Methods in Medicine 2022; 2022:5613350. [PMID: 35720030 PMCID: PMC9200589 DOI: 10.1155/2022/5613350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 05/21/2022] [Accepted: 05/25/2022] [Indexed: 11/17/2022]
Abstract
Objective To identify and validate effective clinical predictors for the long-term prognosis of patients with cervical cancer. Methods Cervical cancer patients were retrieved from the TCGA database, and patients' clinical data were collected and analyzed for the predictive value of long-term prognosis. In the other branch of the study, patients with cervical cancer admitted to our hospital between January 1, 2016, and December 31, 2016, were retrieved and followed up for prognosis analysis. Results In the database patient cohort of our study, 607 cases with cervical cancer were analyzed. Aneuploidy score (p = 0.012), Buffa hypoxia score (p = 0.013), histologic grade (p = 0.01), fraction genome altered > 0.4 (p < 0.001), weight > 60 kg (p < 0.001), height > 160 cm (p = 0.047), BMI < 18.5 (p = 0.023), Winter hypoxia score (p = 0.002), and adjuvant postoperative radiotherapy were good predictors for disease-free survival (DFS), while aneuploidy score (p = 0.001), MSI sensor score > 0.5 (p = 0.035), person neoplasm status (p < 0.001), race (p = 0.006), Ragnum hypoxia score (p = 0.012), weight (p < 0.001), height (p < 0.001), and BMI < 18.5 (p = 0.04) were good predictors for overall survival (OS). In the admitted patient cohort, age over 60 years at the time of diagnosis was the only clinical factor influencing long-term DFS (p = 0.004). TNM stage above III (p = 0.004), body weight > 70 kg (p < 0.001), and the presence of another cancer (p < 0.001) were the clinical factors influencing long-term OS. Conclusions Clinical factors, especially those common to both cohorts, could be used to indicate the long-term prognosis of cervical cancer.
13
Clinical Characteristics in the Prediction of Posttreatment Survival of Patients with Ovarian Cancer. Disease Markers 2022; 2022:3321014. [PMID: 35571616 PMCID: PMC9098309 DOI: 10.1155/2022/3321014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2022] [Accepted: 04/15/2022] [Indexed: 12/14/2022]
Abstract
Objective To determine the efficacy of clinical characteristics in the prediction of prognosis in patients with ovarian cancer. Methods Clinical data were collected from 3 datasets from TCGA database, including 1680 cases of ovarian serous cystadenocarcinoma, and were analyzed. Patients with ovarian cancer admitted to our hospital in 2016 were retrieved and followed up for prognosis analysis. Results From the datasets, for patients > 75 years old at the time of diagnosis, histologic grade and mutation count were good predictors for disease-free survival, while for patients > 50 years old at the time of diagnosis, histologic grade, race, fraction genome altered, and mutation count were good predictors for overall survival. In the patients (n = 38) retrieved from our hospital, the longest dimension of lesion (cm) and body weight at admission were good predictors for overall survival. Conclusions Those clinical factors, together with the two predictive equations, could be used to comprehensively predict the long-term prognosis of patients with ovarian cancer.
14
Kaur I, Doja M, Ahmad T. Data Mining and Machine Learning in Cancer Survival Research: An Overview and Future Recommendations. J Biomed Inform 2022; 128:104026. [DOI: 10.1016/j.jbi.2022.104026] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/13/2021] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 12/29/2022]