1
|
Li S, Yi H, Leng Q, Wu Y, Mao Y. New perspectives on cancer clinical research in the era of big data and machine learning. Surg Oncol 2024; 52:102009. [PMID: 38215544 DOI: 10.1016/j.suronc.2023.102009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 10/16/2023] [Indexed: 01/14/2024]
Abstract
In the 21st century, the development of medical science has entered the era of big data, and machine learning has become an essential tool for mining medical big data. The establishment of the SEER database has provided a wealth of epidemiological data for cancer clinical research, and the number of studies based on SEER and machine learning has been growing in recent years. This article reviews recent research based on SEER and machine learning and finds that the current focus of such studies is primarily on the development and validation of models using machine learning algorithms, with the main directions being lymph node metastasis prediction, distant metastasis prediction, and prognosis-related research. Compared to traditional models, machine learning algorithms have the advantage of stronger adaptability, but also suffer from disadvantages such as overfitting and poor interpretability, which need to be weighed in practical applications. At present, machine learning algorithms, as the foundation of artificial intelligence, have just begun to emerge in the field of cancer clinical research. The future development of oncology will enter a more precise era of cancer research, characterized by larger data, higher dimensions, and more frequent information exchange. Machine learning is bound to shine brightly in this field.
Affiliation(s)
- Shujun Li
  - Department of Hematology, Xiangya Hospital, Central South University, Changsha, 410008, China; National Clinical Research Center for Geriatric Diseases (Xiangya Hospital), China; Hunan Hematology Oncology Clinical Medical Research Center, China
- Hang Yi
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Qihao Leng
  - Xiangya School of Medicine, Central South University, Changsha, 410013, Hunan Province, China
- You Wu
  - Institute for Hospital Management, School of Medicine, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing, China; Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, 21205, USA
- Yousheng Mao
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
|
2
|
Galadima H, Anson-Dwamena R, Johnson A, Bello G, Adunlin G, Blando J. Machine Learning as a Tool for Early Detection: A Focus on Late-Stage Colorectal Cancer across Socioeconomic Spectrums. Cancers (Basel) 2024; 16:540. [PMID: 38339293 PMCID: PMC10854986 DOI: 10.3390/cancers16030540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024] Open
Abstract
PURPOSE To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities. METHODS An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities. RESULTS Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, year of diagnosis, age, proximity to superfund sites, and primary payer. Spatio-temporal clusters highlighted geographic areas with a statistically significant high probability of late-stage diagnoses, emphasizing the need for targeted healthcare interventions. CONCLUSIONS This research underlines the potential of ML in enhancing the prognostic predictions in oncology, particularly in CRC. The gradient boosting model, with its robust performance, holds promise for deployment in healthcare systems to aid early detection and formulate localized cancer prevention strategies. The study's methodology demonstrates a significant step toward utilizing AI in public health to mitigate disparities and improve cancer care outcomes.
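The abstract above ranks models by AUC-ROC. As a minimal sketch of what that metric measures (not the authors' pipeline), the AUC can be computed as the fraction of positive/negative case pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = late-stage diagnosis, 0 = early-stage diagnosis.
labels = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.7, 0.6, 0.4, 0.65]  # predicted probabilities, model A
model_b = [0.6, 0.5, 0.4, 0.7, 0.3, 0.8]   # predicted probabilities, model B

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"AUC model A: {auc_a:.3f}, AUC model B: {auc_b:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for registry-scale data.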
Affiliation(s)
- Hadiza Galadima
  - School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
- Rexford Anson-Dwamena
  - School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
- Ashley Johnson
  - School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
- Ghalib Bello
  - Department of Environmental Medicine & Public Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Georges Adunlin
  - Department of Pharmaceutical, Social and Administrative Sciences, Samford University, Birmingham, AL 35229, USA
- James Blando
  - School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
|
3
|
Miller HA, Tran A, LyBarger KS, Frieboes HB. A clinical marker-based modeling framework to preoperatively predict lymph node and vascular space involvement in endometrial cancer patients. Eur J Surg Oncol 2024; 50:107309. [PMID: 38056021 DOI: 10.1016/j.ejso.2023.107309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/31/2023] [Accepted: 11/27/2023] [Indexed: 12/08/2023]
Abstract
INTRODUCTION Endometrial cancer (EC) has high mortality at advanced stages. Poor prognostic factors include grade 3 tumors, deep myometrial invasion, lymph node metastasis (LNM), and lymphovascular space invasion (LVSI). Preoperative knowledge of patients at higher risk of lymph node involvement, when such involvement is not suspected, would benefit surgery planning and patient prognosis. This study implements an ensemble machine learning approach that evaluates Cancer Antigen 125 (CA125) along with histologic type, preoperative grade, and age to predict LVSI, LNM and stage in EC patients. METHODS A retrospective chart review spanning January 2000 to January 2015 at a regional hospital was performed. Women 18 years or older with a diagnosis of EC and preoperative or within one-week CA125 measurement were included (n = 842). An ensemble machine learning approach was implemented based on a stacked generalization technique to evaluate CA125 in combination with histologic type, preoperative grade, and age as predictors, and LVSI, LNM and disease stage as outcomes. RESULTS The ensemble approach predicted LNM and LVSI in EC patients with AUROC_TEST of 0.857 and 0.750, respectively, and predicted disease stage with AUROC_TEST of 0.665. The approach achieved AUROC_TEST for LVSI and LNM of 0.750 and 0.643 for grade 1 patients, and of 0.689 and 0.952 for grade 2 patients, respectively. CONCLUSION An ensemble machine learning approach offers the potential to preoperatively predict LVSI, LNM and stage in EC patients with adequate accuracy based on CA125, histologic type, preoperative grade, and age.
Affiliation(s)
- Hunter A Miller
  - Department of Bioengineering, University of Louisville, Louisville, KY, USA
- Anh Tran
  - Department of Biochemical Engineering, University of California Davis, Davis, CA, USA
- K Shawn LyBarger
  - Sarah Cannon Cancer Institute, HCA MidAmerica, Kansas City, MO, USA; Department of Surgical Oncology, University of Missouri, Kansas City, MO, USA
- Hermann B Frieboes
  - Department of Bioengineering, University of Louisville, Louisville, KY, USA; Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, USA; UofL Health - Brown Cancer Center, University of Louisville, Louisville, KY, USA; Center for Predictive Medicine, University of Louisville, Louisville, KY, USA
|
4
|
Piedimonte S, Rosa G, Gerstl B, Sopocado M, Coronel A, Lleno S, Vicus D. Evaluating the use of machine learning in endometrial cancer: a systematic review. Int J Gynecol Cancer 2023; 33:1383-1393. [PMID: 37666535 DOI: 10.1136/ijgc-2023-004622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023] Open
Abstract
OBJECTIVE To review the literature on machine learning in endometrial cancer, report the most commonly used algorithms, and compare performance with traditional prediction models. METHODS This is a systematic review of the literature from January 1985 to March 2021 on the use of machine learning in endometrial cancer. An extensive search of electronic databases was conducted. Four independent reviewers screened studies initially by title then full text. Quality was assessed using the MINORS (Methodological Index for Non-Randomized Studies) criteria. P values were derived using Pearson's χ² test in JMP 15.0. RESULTS Among 4295 articles screened, 30 studies on machine learning in endometrial cancer were included. The most frequent applications were in patient datasets (33.3%, n=10), pre-operative diagnostics (30%, n=9), genomics (23.3%, n=7), and serum biomarkers (13.3%, n=4). The most commonly used models were neural networks (n=10, 33.3%) and support vector machine (n=6, 20%). The number of publications on machine learning in endometrial cancer increased from 1 in 2010 to 29 in 2021. Eight studies compared machine learning with traditional statistics. Among patient dataset studies, two machine learning models (20%) performed similarly to logistic regression (accuracy: 0.85 vs 0.82, p=0.16). Machine learning algorithms performed similarly to detect endometrial cancer based on MRI (accuracy: 0.87 vs 0.82, p=0.24) while outperforming traditional methods in predicting extra-uterine disease in one serum biomarker study (accuracy: 0.81 vs 0.61). For survival outcomes, one study compared machine learning with Kaplan-Meier and reported no difference in concordance index (83.8% vs 83.1%). CONCLUSION Although machine learning is an innovative and emerging technology, performance is similar to that of traditional regression models in endometrial cancer. More studies are needed to assess its role in endometrial cancer. PROSPERO REGISTRATION NUMBER CRD42021269565.
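The review above compares accuracies with Pearson's chi-square test (run in JMP by the authors). The sketch below shows how such a test works on a 2x2 table of correct and incorrect classifications, without continuity correction. The counts are hypothetical (100 cases per method) and are not taken from the review, so the resulting p-value is not expected to match the reported ones; `chi2_2x2` is an illustrative name.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom, no
    continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function reduces to
    # erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = method, columns = (correct, incorrect).
# Machine learning: 85 correct, 15 incorrect; regression: 82 correct, 18 incorrect.
stat, p_value = chi2_2x2(85, 15, 82, 18)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

With equal sample sizes, a three-point accuracy gap on 100 cases per arm is far from significant, which illustrates why small comparative studies rarely separate ML from regression.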
Affiliation(s)
- Sabrina Piedimonte
  - Department of Gynecologic Oncology, University of Toronto, Toronto, Ontario, Canada
- Brigitte Gerstl
  - The Rosa Institute, Sydney, New South Wales, Australia
  - The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
- Mars Sopocado
  - The Rosa Institute, Sydney, New South Wales, Australia
- Ana Coronel
  - The Rosa Institute, Sydney, New South Wales, Australia
- Danielle Vicus
  - Department of Gynecologic Oncology, University of Toronto, Toronto, Ontario, Canada
  - Department of Gynecologic Oncology, Sunnybrook Health Sciences, Toronto, Ontario, Canada
|
5
|
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
|
6
|
Roškar L, Pušić M, Roškar I, Kokol M, Pirš B, Smrkolj Š, Rižner TL. Models including preoperative plasma levels of angiogenic factors, leptin and IL-8 as potential biomarkers of endometrial cancer. Front Oncol 2022; 12:972131. [PMID: 36505829 PMCID: PMC9730274 DOI: 10.3389/fonc.2022.972131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Accepted: 10/26/2022] [Indexed: 11/27/2022] Open
Abstract
Background The diversity of endometrial cancer (EC) dictates the need for precise early diagnosis and pre-operative stratification to select treatment options appropriately. Non-invasive biomarkers invaluably assist clinicians in managing patients in daily clinical practice. Currently, there are no validated diagnostic or prognostic biomarkers for EC that could accurately predict the presence and extent of the disease. Methods Our study analyzed 202 patients, of whom 91 were diagnosed with EC and 111 were control patients with benign gynecological disease. Using Luminex xMAP™ multiplexing technology, we measured the pre-operative plasma concentrations of six previously selected angiogenic factors (AFs): leptin, IL-8, sTie-2, follistatin, neuropilin-1, and G-CSF. Besides basic statistical methods, we used a machine-learning algorithm to create a robust diagnostic model based on the plasma concentration of tested angiogenic factors. Results The plasma levels of leptin were significantly higher in EC patients than in control patients. Leptin was higher in type 1 EC patients versus control patients, and IL-8 was higher in type 2 EC versus control patients, particularly in poorly differentiated endometrioid EC grade 3. IL-8 plasma levels were significantly higher in EC patients with lymphovascular or myometrial invasion. Among univariate models, the model based on leptin reached the best results on both training and test datasets. A combination of age, IL-8, leptin and G-CSF was determined as the most important feature combination for the multivariate model, with ROC AUC 0.94 on the training dataset and 0.81 on the test dataset. The model utilizing a combination of all six AFs, BMI and age reached a ROC AUC of 0.89 on both the training and test datasets, strongly indicating the capability for predicting the risk of EC even on unseen data.
Conclusion According to our results, measuring plasma concentrations of angiogenic factors could, provided they are confirmed in a multicentre validation study, represent an important supplementary diagnostic tool for early detection and prognostic characterization of EC, which could guide the decision-making regarding the extent of treatment.
Affiliation(s)
- Luka Roškar
  - Department of Gynaecology and Obstetrics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
  - Division of Gynaecology and Obstetrics, General Hospital Murska Sobota, Murska Sobota, Slovenia
- Maja Pušić
  - Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Irena Roškar
  - Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Marko Kokol
  - Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
  - Semantika Research, Semantika d.o.o., Maribor, Slovenia
- Boštjan Pirš
  - Division of Gynaecology and Obstetrics, University Medical Centre, Ljubljana, Slovenia
- Špela Smrkolj
  - Department of Gynaecology and Obstetrics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
  - Division of Gynaecology and Obstetrics, University Medical Centre, Ljubljana, Slovenia
- Tea Lanišnik Rižner
  - Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- *Correspondence: Špela Smrkolj; Tea Lanišnik Rižner
|
7
|
Prediction of endometrial cancer recurrence by using a novel machine learning algorithm: An Israeli gynecologic oncology group study. J Gynecol Obstet Hum Reprod 2022; 51:102466. [DOI: 10.1016/j.jogoh.2022.102466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 06/04/2022] [Accepted: 08/25/2022] [Indexed: 11/23/2022]
|
8
|
Fiste O, Liontos M, Zagouri F, Stamatakos G, Dimopoulos MA. Machine learning applications in gynecological cancer: A critical review. Crit Rev Oncol Hematol 2022; 179:103808. [PMID: 36087852 DOI: 10.1016/j.critrevonc.2022.103808] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/02/2022] [Revised: 08/18/2022] [Accepted: 09/05/2022] [Indexed: 11/30/2022] Open
Abstract
Machine learning (ML) is a branch of computer science that generates predictive models from exposure to raw training data, without being explicitly programmed. Over the last few years, ML has gained attention within the field of oncology, with considerable strides across the diagnostic, predictive, and prognostic spectrum of malignancies, and also as a catalyst of cancer research. In this review, we discuss the state of ML applications in gynecologic oncology and systematically address major technical and ethical concerns regarding their translation into real-world medical practice. Undoubtedly, advances in ML will enable the analysis of large, rather complex, datasets for improved, cost-effective, and efficient clinical decisions.
Affiliation(s)
- Oraianthi Fiste
  - Department of Clinical Therapeutics, School of Medicine, National and Kapodistrian University of Athens, Alexandra Hospital, 80 Vasilissis Sophias, 11528 Athens, Greece
- Michalis Liontos
  - Department of Clinical Therapeutics, School of Medicine, National and Kapodistrian University of Athens, Alexandra Hospital, 80 Vasilissis Sophias, 11528 Athens, Greece
- Flora Zagouri
  - Department of Clinical Therapeutics, School of Medicine, National and Kapodistrian University of Athens, Alexandra Hospital, 80 Vasilissis Sophias, 11528 Athens, Greece
- Georgios Stamatakos
  - In Silico Oncology and In Silico Medicine Group, Institute of Communication and Computer Systems, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
- Meletios Athanasios Dimopoulos
  - Department of Clinical Therapeutics, School of Medicine, National and Kapodistrian University of Athens, Alexandra Hospital, 80 Vasilissis Sophias, 11528 Athens, Greece
|
9
|
Bhardwaj V, Sharma A, Parambath SV, Gul I, Zhang X, Lobie PE, Qin P, Pandey V. Machine Learning for Endometrial Cancer Prediction and Prognostication. Front Oncol 2022; 12:852746. [PMID: 35965548 PMCID: PMC9365068 DOI: 10.3389/fonc.2022.852746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 06/14/2022] [Indexed: 11/13/2022] Open
Abstract
Endometrial cancer (EC) is a prevalent uterine cancer that remains a major contributor to cancer-associated morbidity and mortality. EC diagnosed at advanced stages shows a poor therapeutic response. The clinically utilized EC diagnostic approaches are costly, time-consuming, and not readily available to all patients. The rapid growth in computational biology has attracted substantial research attention from both data scientists and oncologists, leading to the development of rapid and cost-effective computer-aided cancer surveillance systems. Machine learning (ML), a subcategory of artificial intelligence, provides opportunities for drug discovery, early cancer diagnosis, effective treatment, and choice of treatment modalities. The application of ML approaches in EC diagnosis, therapies, and prognosis may be particularly relevant. Considering the significance of customized treatment and the growing trend of using ML approaches in cancer prediction and monitoring, a critical survey of ML utility in EC may provide impetus for research in EC and help oncologists, molecular biologists, biomedical engineers, and bioinformaticians further collaborative research in EC. In this review, an overview of EC along with risk factors and diagnostic methods is discussed, followed by a comprehensive analysis of the potential ML modalities for prevention, screening, detection, and prognosis of EC patients.
Affiliation(s)
- Vipul Bhardwaj
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Arundhiti Sharma
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Ijaz Gul
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Xi Zhang
  - Shenzhen Bay Laboratory, Shenzhen, China
- Peter E. Lobie
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Shenzhen Bay Laboratory, Shenzhen, China
- Peiwu Qin
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Vijay Pandey
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - *Correspondence: Vijay Pandey
|
10
|
Piedimonte S, Feigenberg T, Drysdale E, Kwon J, Gotlieb WH, Cormier B, Plante M, Lau S, Helpman L, Renaud MC, May T, Vicus D. Predicting recurrence and recurrence-free survival in high-grade endometrial cancer using machine learning. J Surg Oncol 2022; 126:1096-1103. [PMID: 35819161 DOI: 10.1002/jso.27008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 06/06/2022] [Accepted: 06/29/2022] [Indexed: 11/11/2022]
Abstract
OBJECTIVE To develop machine-learning models to predict recurrence and time-to-recurrence in high-grade endometrial cancer (HGEC) following surgery and tailored adjuvant treatment. METHODS Data were retrospectively collected across eight Canadian centers including 1237 patients. Four models were trained to predict recurrence: random forests, boosted trees, and two neural networks. Receiver operating characteristic curves were used to select the best model based on the highest area under the curve (AUC). For time to recurrence, we compared random forests and Least Absolute Shrinkage and Selection Operator (LASSO) model to Cox proportional hazards. RESULTS The random forest was the best model to predict recurrence in HGEC; the AUCs were 85.2%, 74.1%, and 71.8% in the training, validation, and test sets, respectively. The top five predictors were: stage, uterus height, specimen weight, adjuvant chemotherapy, and preoperative histology. Performance increased to 77% and 80% when stratified by Stage III and IV, respectively. For time to recurrence, there was no difference between the LASSO and Cox proportional hazards models (c-index 71%). The random forest had a c-index of 60.5%. CONCLUSIONS A bootstrap random forest model may be a more accurate technique to predict recurrence in HGEC using multiple clinicopathologic factors. For time to recurrence, machine-learning methods performed similarly to the Cox proportional hazards model.
Affiliation(s)
- Sabrina Piedimonte
  - Division of Gynecologic Oncology, University of Toronto, Toronto, Ontario, Canada
- Erik Drysdale
  - Genetics and Genome Biology, AI in Medicine, SickKids, Toronto, Ontario, Canada
- Janice Kwon
  - Vancouver Coastal Health, Vancouver, British Columbia, Canada
- Beatrice Cormier
  - Centre Hospitalier Universitaire de Montreal, Montreal, Quebec, Canada
- Marie Plante
  - Centre Hospitalier Universitaire de Quebec, Quebec City, Quebec, Canada
- Susie Lau
  - Jewish General Hospital, Montreal, Quebec, Canada
- Taymaa May
  - University Health Network, Toronto, Ontario, Canada
- Danielle Vicus
  - Sunnybrook Health Sciences Center, Toronto, Ontario, Canada
|
11
|
Liu X, Wu Y, Liu P, Zhang X. Developing a validated nomogram for predicting ovarian metastasis in endometrial cancer patients: a retrospective research. Arch Gynecol Obstet 2021; 305:719-729. [PMID: 34495379 DOI: 10.1007/s00404-021-06214-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/18/2020] [Accepted: 08/24/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVE To explore risk factors and develop a prediction model for ovarian metastasis in endometrial cancer (EC), and to provide a reference for clinical ovarian preservation. METHODS We conducted a retrospective observational study enrolling 1496 EC patients who received complete staging surgery at Qilu Hospital of Shandong University from 2012 to 2018. These patients were randomly divided into two cohorts: training cohort (n = 1046) and validation cohort (n = 448). A nomogram prediction model was developed based on univariate, least absolute shrinkage and selection operator (Lasso), and multivariate logistic regression. The nomogram's performance was then evaluated in three respects: discrimination, calibration, and clinical utility. RESULTS Parametrium invasion, lymph node metastasis, and oviduct metastasis were retained in the final nomogram prediction model. The AUC of the model was 0.85 in the training cohort and 0.72 in the validation cohort. The model was also well calibrated and had good clinical utility. At threshold probabilities of 20% to 80%, the nomogram increased the net benefit by 0 to 13.6 per 100 patients compared with surgery for all patients upon validation. CONCLUSIONS We developed a nomogram with good performance for predicting ovarian metastasis in EC patients, which may help clinicians identify premenopausal EC patients who are appropriate candidates for ovarian preservation.
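The clinical-utility result above is stated as net benefit relative to surgery for all patients, which is the quantity used in decision curve analysis. The sketch below uses the standard definition, net benefit = TP/n - (FP/n) x pt/(1 - pt) at threshold probability pt, and is not the authors' code. The function names, labels, and predicted probabilities are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk is at or above
    `threshold` (standard decision-curve definition)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = ovarian metastasis present, 0 = absent.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
probs = [0.8, 0.1, 0.3, 0.6, 0.2, 0.05, 0.4, 0.15, 0.25, 0.1]
pt = 0.2

nb_model = net_benefit(labels, probs, pt)
nb_all = net_benefit_treat_all(labels, pt)
gain_per_100 = 100 * (nb_model - nb_all)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, "
      f"gain per 100 patients: {gain_per_100:.1f}")
```

Repeating the calculation over a grid of thresholds (for example 0.20 to 0.80) gives the decision curve from which a range such as the one quoted in the abstract is read.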
Affiliation(s)
- Xiaodie Liu
  - Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
  - Department of Obstetrics and Gynecology, China-Japan Friendship Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100029, China
- Yaohai Wu
  - Department of Urology, Qilu Hospital of Shandong University, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
  - Department of Urology, Seventh Affiliated Hospital, Sun Yat-Sen University, Shenzhen, 518107, People's Republic of China
- Peishu Liu
  - Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Xiaolei Zhang
  - Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
|
12
|
Grimley PM, Liu Z, Darcy KM, Hueman MT, Wang H, Sheng L, Henson DE, Chen D. A prognostic system for epithelial ovarian carcinomas using machine learning. Acta Obstet Gynecol Scand 2021; 100:1511-1519. [PMID: 33665831 PMCID: PMC8360140 DOI: 10.1111/aogs.14137] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/04/2020] [Accepted: 03/01/2021] [Indexed: 11/28/2022]
Abstract
Introduction Integrating additional factors into the International Federation of Gynecology and Obstetrics (FIGO) staging system is needed for accurate patient classification and survival prediction. In this study, we tested machine learning as a novel tool for incorporating additional prognostic parameters into the conventional FIGO staging system for stratifying patients with epithelial ovarian carcinomas and evaluating their survival. Material and methods Cancer‐specific survival data for epithelial ovarian carcinomas were extracted from the Surveillance, Epidemiology, and End Results (SEER) program. Two datasets were constructed based upon the year of diagnosis. Dataset 1 (39 514 cases) was limited to primary tumor (T), regional lymph nodes (N) and distant metastasis (M). Dataset 2 (25 291 cases) included additional parameters of age at diagnosis (A) and histologic type and grade (H). The Ensemble Algorithm for Clustering Cancer Data (EACCD) was applied to generate prognostic groups with depiction in dendrograms. C‐indices provided dendrogram cutoffs and comparisons of prediction accuracy. Results Dataset 1 was stratified into nine epithelial ovarian carcinoma prognostic groups, contrasting with 10 groups from FIGO methodology. The EACCD grouping had a slightly higher accuracy in survival prediction than FIGO staging (C‐index = 0.7391 vs 0.7371, increase in C‐index = 0.0020, 95% confidence interval [CI] 0.0012–0.0027, p = 1.8 × 10⁻⁷). Nevertheless, there remained a strong inter‐system association between EACCD and FIGO (rank correlation = 0.9480, p = 6.1 × 10⁻¹⁵). Analysis of Dataset 2 demonstrated that A and H could be smoothly integrated with the T, N and M criteria. Survival data were stratified into nine prognostic groups with an even higher prediction accuracy (C‐index = 0.7605) than when using only T, N and M.
Conclusions EACCD was successfully applied to integrate A and H with T, N and M for stratification and survival prediction of epithelial ovarian carcinoma patients. Additional factors could be advantageously incorporated to test the prognostic impact of emerging diagnostic or therapeutic advances.
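The abstract compares staging systems by their C-index. As a rough illustration of what that statistic measures, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name, the convention that a higher score means a worse prognosis, and the choice to skip pairs with tied follow-up times are all assumptions made here for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, not the study's code).

    times       -- observed follow-up times
    events      -- 1 if the death was observed, 0 if censored
    risk_scores -- higher value = worse predicted prognosis
                   (e.g. the rank of a prognostic group)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is usable only if the shorter time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from SEER.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_groups = [2, 2, 1, 1]   # two coarse prognostic groups
toy_c = concordance_index(toy_times, toy_events, toy_groups)
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to a perfect ordering, which gives a sense of scale for the small differences between systems reported in the abstract.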
Affiliation(s)
- Philip M Grimley
- Department of Pathology, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Zhenqiu Liu
- Department of Public Health Sciences, Penn State Cancer Institute, Hershey, PA, USA
- Kathleen M Darcy
- Department of Obstetrics & Gynecology, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Matthew T Hueman
- Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Huan Wang
- Department of Biostatistics, George Washington University, Washington, DC, USA
- Li Sheng
- Department of Mathematics, Drexel University, Philadelphia, PA, USA
- Donald E Henson
- Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Dechang Chen
- Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
13
Hueman M, Wang H, Liu Z, Henson D, Nguyen C, Park D, Sheng L, Chen D. Expanding TNM for lung cancer through machine learning. Thorac Cancer 2021; 12:1423-1430. [PMID: 33713568 PMCID: PMC8088955 DOI: 10.1111/1759-7714.13926] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/11/2021] [Revised: 02/20/2021] [Accepted: 02/21/2021] [Indexed: 01/05/2023]
Abstract
Background: Expanding the tumor, lymph node, metastasis (TNM) staging system by accommodating new prognostic and predictive factors for cancer will improve patient stratification and survival prediction. Here, we introduce machine learning for incorporating additional prognostic factors into the conventional TNM for stratifying patients with lung cancer and evaluating survival.
Methods: Data were extracted from SEER. A total of 77 953 patients were analyzed using factors including primary tumor (T), regional lymph node (N), distant metastasis (M), age, and histology type. Ensemble algorithm for clustering cancer data (EACCD) and C-index were applied to generate prognostic groups and expand the current staging system.
Results: With T, N, and M, EACCD stratified patients into 11 groups, resulting in a significantly higher accuracy in survival prediction than the 10 AJCC stages (C-index = 0.7346 vs. 0.7247, increase in C-index = 0.0099, 95% CI: 0.0091–0.0106, p-value = 9.2 × 10⁻¹⁴⁷). There nevertheless remained a strong association between the EACCD grouping and AJCC staging (rank correlation = 0.9289; p-value = 6.7 × 10⁻²²). A further analysis demonstrated that age and histological tumor could be integrated with the TNM. Data were stratified into 12 prognostic groups with an even higher prediction accuracy (C-index = 0.7468 vs. 0.7247, increase in C-index = 0.0221, 95% CI: 0.0212–0.0231, p-value < 5 × 10⁻³²⁴).
Conclusions: EACCD can be successfully applied to integrate additional factors with T, N, M for lung cancer patients.
Affiliation(s)
- Matthew Hueman
- Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Huan Wang
- Department of Biostatistics, George Washington University, Washington, District of Columbia, USA
- Zhenqiu Liu
- Department of Public Health Sciences, Penn State Cancer Institute, Hershey, Pennsylvania, USA
- Donald Henson
- Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
- Cuong Nguyen
- Department of Pathology, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
- Dean Park
- Department of Hematology-Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Li Sheng
- Department of Mathematics, Drexel University, Philadelphia, Pennsylvania, USA
- Dechang Chen
- Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
14
Shi S, Lei G, Yang L, Zhang C, Fang Z, Li J, Wang G. Using Machine Learning to Predict Postoperative Liver Dysfunction After Aortic Arch Surgery. J Cardiothorac Vasc Anesth 2021; 35:2330-2335. [PMID: 33745835 DOI: 10.1053/j.jvca.2021.02.046] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/18/2020] [Revised: 02/06/2021] [Accepted: 02/16/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVES: The study compared machine-learning models with traditional logistic regression for predicting liver outcomes after aortic arch surgery.
DESIGN: Retrospective review from January 2013 to May 2017.
SETTING: Fuwai Hospital.
PARTICIPANTS: The study comprised 672 consecutive patients who had undergone aortic arch surgery.
MEASUREMENTS AND MAIN RESULTS: Three machine-learning methods were compared with logistic regression with regard to the prediction of postoperative liver dysfunction (PLD) after aortic arch surgery. The perioperative characteristics, including the patients' baseline medical condition and intraoperative data, were analyzed. The performance of the models was assessed using the area under the receiver operating characteristic curve. Naïve Bayes had the best discriminative ability for the prediction of PLD (area under the receiver operating characteristic curve = 0.77) compared with random forest (0.76), support vector machine (0.73), and logistic regression (0.72). The primary endpoint of PLD was observed in 185 patients (27.5%). The cardiopulmonary bypass time, long surgery time, long aortic clamp time, high preoperative bilirubin value, and low rectal temperature were strongly associated with the development of PLD after aortic arch surgery.
CONCLUSION: The machine-learning method of naïve Bayes predicts PLD after aortic arch surgery significantly better than traditional logistic regression.
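The models in this abstract are ranked by area under the receiver operating characteristic curve. As a minimal sketch of how that figure can be computed, the function below uses the rank-based (Mann-Whitney) formulation: the share of positive/negative pairs in which the positive case gets the higher score. The function name and the toy values are assumptions for illustration only and do not reproduce the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    is scored higher than a randomly chosen negative case.

    labels -- 1 for the outcome (e.g. PLD), 0 otherwise
    scores -- model-predicted risk, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions, not taken from the study.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)   # 3 of 4 pairs ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a sort-based version would be preferable for large datasets.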
Affiliation(s)
- Sheng Shi
- Department of Anesthesiology, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Guiyu Lei
- Department of Anesthesiology, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Lijing Yang
- Department of Anesthesiology, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Congya Zhang
- Department of Anesthesiology, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zhongrong Fang
- Department of Anesthesiology, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jun Li
- Department of Anesthesiology, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Guyan Wang
- Department of Anesthesiology, Beijing Tongren Hospital, Capital Medical University, Beijing, China