1. Yang Y, Yang Z, Lyu Z, Ouyang K, Wang J, Wu D, Li Y. Pathological-Features-Modified TNM Staging System Improves Prognostic Accuracy for Rectal Cancer. Dis Colon Rectum 2024; 67:645-654. [PMID: 38147435 DOI: 10.1097/dcr.0000000000003034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
BACKGROUND Variations in survival outcomes are observed in the eighth edition of the American Joint Committee on Cancer TNM staging system. OBJECTIVE Machine learning ensemble methods were used to develop and evaluate the effectiveness of a pathological-features-modified TNM staging system in predicting survival for patients with rectal cancer by use of commonly reported pathological features, such as histological grade, tumor deposits, and perineural invasion, to improve the prognostic accuracy. DESIGN This was a retrospective population-based study. SETTINGS Data were assessed from the database of the Surveillance, Epidemiology, and End Results Program. PATIENTS The study cohort comprised 14,468 patients with rectal cancer diagnosed between 2010 and 2015. The development cohort included those who underwent surgery as the primary treatment, whereas patients who received neoadjuvant therapy were assigned to the validation cohort. MAIN OUTCOME MEASURES The primary outcome measures included cumulative rectal cancer survival, adjusted HRs, and both calibration and discrimination statistics to evaluate model performance and internal validation. RESULTS Multivariable Cox regression analysis identified all 3 pathological features as prognostic factors, after which patients were categorized into 4 pathological groups based on the number of pathological features (ie, 0, 1, 2, and 3). Distinct survival differences were observed among the groups, especially among patients with stage III rectal cancer. The proposed pathological-features-modified TNM staging outperformed the TNM staging in both the development and validation cohorts. LIMITATIONS The study was retrospective in design and lacked external validation. CONCLUSIONS The proposed pathological-features-modified TNM staging could complement the current TNM staging by improving the accuracy of survival estimation of patients with rectal cancer.
A Spanish translation of this abstract accompanies the article (translation: Dr. Francisco M. Abarca-Rendon).
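The grouping step described in the abstract (counting how many of the three pathological features are present) can be illustrated with a minimal sketch. The field names `high_grade`, `tumor_deposits`, and `perineural_invasion`, and the example patients, are hypothetical; the abstract does not state how each feature was coded, so this is only an illustration of the counting idea, not the authors' implementation.

```python
from collections import Counter

# Hypothetical field names for the three features named in the abstract.
ADVERSE_FEATURES = ("high_grade", "tumor_deposits", "perineural_invasion")


def pathological_group(patient):
    """Return how many of the three adverse features are present (0 to 3)."""
    return sum(1 for name in ADVERSE_FEATURES if patient.get(name, False))


# Made-up example records, not data from the study.
patients = [
    {"id": "A", "high_grade": False, "tumor_deposits": False, "perineural_invasion": False},
    {"id": "B", "high_grade": True, "tumor_deposits": False, "perineural_invasion": True},
    {"id": "C", "high_grade": True, "tumor_deposits": True, "perineural_invasion": True},
]

groups = {p["id"]: pathological_group(p) for p in patients}
group_sizes = Counter(groups.values())
print(groups)  # {'A': 0, 'B': 2, 'C': 3}
```

In a staging context, the resulting group (0 to 3) would be combined with the TNM stage; how the two are combined is not specified in the abstract.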
Affiliation(s)
- Yuesheng Yang
  - Shantou University Medical College, Shantou, People's Republic of China
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Zifeng Yang
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Zejian Lyu
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Kaibo Ouyang
  - Shantou University Medical College, Shantou, People's Republic of China
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Junjiang Wang
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Deqing Wu
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
- Yong Li
  - Department of Gastrointestinal Surgery, Department of General Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, People's Republic of China
2. Bhardwaj V, Sharma A, Parambath SV, Gul I, Zhang X, Lobie PE, Qin P, Pandey V. Machine Learning for Endometrial Cancer Prediction and Prognostication. Front Oncol 2022; 12:852746. [PMID: 35965548 PMCID: PMC9365068 DOI: 10.3389/fonc.2022.852746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 06/14/2022] [Indexed: 11/13/2022]
Abstract
Endometrial cancer (EC) is a prevalent uterine cancer that remains a major contributor to cancer-associated morbidity and mortality. EC diagnosed at advanced stages shows a poor therapeutic response. The clinically utilized EC diagnostic approaches are costly, time-consuming, and not readily available to all patients. The rapid growth in computational biology has attracted substantial research attention from both data scientists and oncologists, leading to the development of rapid and cost-effective computer-aided cancer surveillance systems. Machine learning (ML), a subcategory of artificial intelligence, provides opportunities for drug discovery, early cancer diagnosis, effective treatment, and choice of treatment modalities. The application of ML approaches in EC diagnosis, therapies, and prognosis may be particularly relevant. Considering the significance of customized treatment and the growing trend of using ML approaches in cancer prediction and monitoring, a critical survey of ML utility in EC may provide impetus for research in EC and assist oncologists, molecular biologists, biomedical engineers, and bioinformaticians in furthering collaborative research in EC. In this review, an overview of EC along with risk factors and diagnostic methods is discussed, followed by a comprehensive analysis of the potential ML modalities for prevention, screening, detection, and prognosis of EC patients.
Affiliation(s)
- Vipul Bhardwaj
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Arundhiti Sharma
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Ijaz Gul
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Xi Zhang
  - Shenzhen Bay Laboratory, Shenzhen, China
- Peter E. Lobie
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Shenzhen Bay Laboratory, Shenzhen, China
- Peiwu Qin
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
- Vijay Pandey
  - Tsinghua Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  - *Correspondence: Vijay Pandey
3. Yang CQ, Wang H, Liu Z, Hueman MT, Bhaskaran A, Henson DE, Sheng L, Chen D. Integrating additional factors into the TNM staging for cutaneous melanoma by machine learning. PLoS One 2021; 16:e0257949. [PMID: 34591891 PMCID: PMC8483349 DOI: 10.1371/journal.pone.0257949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/06/2021] [Accepted: 09/14/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Integrating additional factors into the TNM staging system is needed for more accurate risk classification and survival prediction for patients with cutaneous melanoma. In the present study, we introduce machine learning as a novel tool that incorporates additional prognostic factors to improve the current TNM staging system. METHODS AND FINDINGS Cancer-specific survival data for cutaneous melanoma with at least 5 years of follow-up were extracted from the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute and split into the training set (40,781 cases) and validation set (5,390 cases). Five factors were studied: the primary tumor (T), regional lymph nodes (N), distant metastasis (M), age (A), and sex (S). The Ensemble Algorithm for Clustering Cancer Data (EACCD) was applied to the training set to generate prognostic groups. Utilizing only T, N, and M, a basic prognostic system was built where patients were stratified into 10 prognostic groups with well-separated survival curves, similar to the 10 AJCC stages. These 10 groups had a significantly higher accuracy in survival prediction than the 10 AJCC stages (C-index = 0.7682 vs 0.7643; increase in C-index = 0.0039, 95% CI = (0.0032, 0.0047); p-value = 7.2×10^-23). Nevertheless, a positive association remained between the EACCD grouping and the AJCC staging (Spearman's rank correlation coefficient = 0.8316; p-value = 4.5×10^-13). With additional information from A and S, a more advanced prognostic system was established using the training data that stratified patients into 10 groups and further improved the prediction accuracy (C-index = 0.7865 vs 0.7643; increase in C-index = 0.0222, 95% CI = (0.0191, 0.0254); p-value = 8.8×10^-43). Both internal validation using the training set and temporal validation using the validation set showed good stratification and a high predictive accuracy of the prognostic systems.
CONCLUSIONS The EACCD allows additional factors to be integrated into the TNM to create a prognostic system that improves patient stratification and survival prediction for cutaneous melanoma. This integration separates favorable from unfavorable clinical outcomes for patients and improves both cohort selection for clinical trials and treatment management.
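The C-index comparisons reported in the abstract rest on counting concordant patient pairs. The following is a minimal sketch of Harrell's concordance index for right-censored data, using prognostic group rank as the risk score. The function name, the tie handling (ties in score count as 0.5), and the toy data are assumptions for illustration; the abstract does not say which C-index variant or software the authors used.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. It is concordant when
    the earlier-failing patient has the higher risk score; tied scores
    count as half-concordant (an assumed convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (not from the study): follow-up time, event flag (1 = death
# from cancer, 0 = censored), and prognostic group rank (higher = worse).
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
group_rank = [3, 2, 2, 1, 1]

c = harrell_c_index(times, events, group_rank)
print(f"C-index = {c:.3f}")  # C-index = 0.875
```

Comparing two staging systems then amounts to evaluating this statistic twice on the same patients, once with each system's group rank as the score.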
Affiliation(s)
- Charles Q. Yang
  - Department of Surgery, Walter Reed National Military Medical Center, Bethesda, MD, United States of America
- Huan Wang
  - Department of Biostatistics, The George Washington University, Washington, DC, United States of America
- Zhenqiu Liu
  - Department of Public Health Sciences, Penn State Cancer Institute, Hershey, PA, United States of America
- Matthew T. Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD, United States of America
- Aadya Bhaskaran
  - Department of Quantitative Theory and Methods, Emory University, Atlanta, GA, United States of America
- Donald E. Henson
  - Deceased; formerly with the Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, United States of America
- Li Sheng
  - Department of Mathematics, Drexel University, Philadelphia, PA, United States of America
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, United States of America
4. Grimley PM, Liu Z, Darcy KM, Hueman MT, Wang H, Sheng L, Henson DE, Chen D. A prognostic system for epithelial ovarian carcinomas using machine learning. Acta Obstet Gynecol Scand 2021; 100:1511-1519. [PMID: 33665831 PMCID: PMC8360140 DOI: 10.1111/aogs.14137] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/04/2020] [Accepted: 03/01/2021] [Indexed: 11/28/2022]
Abstract
Introduction Integrating additional factors into the International Federation of Gynecology and Obstetrics (FIGO) staging system is needed for accurate patient classification and survival prediction. In this study, we tested machine learning as a novel tool for incorporating additional prognostic parameters into the conventional FIGO staging system for stratifying patients with epithelial ovarian carcinomas and evaluating their survival. Material and methods Cancer‐specific survival data for epithelial ovarian carcinomas were extracted from the Surveillance, Epidemiology, and End Results (SEER) program. Two datasets were constructed based upon the year of diagnosis. Dataset 1 (39 514 cases) was limited to primary tumor (T), regional lymph nodes (N) and distant metastasis (M). Dataset 2 (25 291 cases) included additional parameters of age at diagnosis (A) and histologic type and grade (H). The Ensemble Algorithm for Clustering Cancer Data (EACCD) was applied to generate prognostic groups with depiction in dendrograms. C‐indices provided dendrogram cutoffs and comparisons of prediction accuracy. Results Dataset 1 was stratified into nine epithelial ovarian carcinoma prognostic groups, contrasting with 10 groups from FIGO methodology. The EACCD grouping had a slightly higher accuracy in survival prediction than FIGO staging (C‐index = 0.7391 vs 0.7371, increase in C‐index = 0.0020, 95% confidence interval [CI] 0.0012–0.0027, p = 1.8 × 10^-7). Nevertheless, there remained a strong inter‐system association between EACCD and FIGO (rank correlation = 0.9480, p = 6.1 × 10^-15). Analysis of Dataset 2 demonstrated that A and H could be smoothly integrated with the T, N and M criteria. Survival data were stratified into nine prognostic groups with an even higher prediction accuracy (C‐index = 0.7605) than when using only T, N and M.
Conclusions EACCD was successfully applied to integrate A and H with T, N and M for stratification and survival prediction of epithelial ovarian carcinoma patients. Additional factors could be advantageously incorporated to test the prognostic impact of emerging diagnostic or therapeutic advances.
Affiliation(s)
- Philip M Grimley
  - Department of Pathology, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Zhenqiu Liu
  - Department of Public Health Sciences, Penn State Cancer Institute, Hershey, PA, USA
- Kathleen M Darcy
  - Department of Obstetrics & Gynecology, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Matthew T Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Huan Wang
  - Department of Biostatistics, George Washington University, Washington, DC, USA
- Li Sheng
  - Department of Mathematics, Drexel University, Philadelphia, PA, USA
- Donald E Henson
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
5. Hueman M, Wang H, Liu Z, Henson D, Nguyen C, Park D, Sheng L, Chen D. Expanding TNM for lung cancer through machine learning. Thorac Cancer 2021; 12:1423-1430. [PMID: 33713568 PMCID: PMC8088955 DOI: 10.1111/1759-7714.13926] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/11/2021] [Revised: 02/20/2021] [Accepted: 02/21/2021] [Indexed: 01/05/2023]
Abstract
Background Expanding the tumor, lymph node, metastasis (TNM) staging system by accommodating new prognostic and predictive factors for cancer will improve patient stratification and survival prediction. Here, we introduce machine learning for incorporating additional prognostic factors into the conventional TNM for stratifying patients with lung cancer and evaluating survival. Methods Data were extracted from SEER. A total of 77 953 patients were analyzed using factors including primary tumor (T), regional lymph node (N), distant metastasis (M), age, and histology type. Ensemble algorithm for clustering cancer data (EACCD) and C‐index were applied to generate prognostic groups and expand the current staging system. Results With T, N, and M, EACCD stratified patients into 11 groups, resulting in a significantly higher accuracy in survival prediction than the 10 AJCC stages (C‐index = 0.7346 vs. 0.7247, increase in C‐index = 0.0099, 95% CI: 0.0091–0.0106, p‐value = 9.2 × 10^-147). There nevertheless remained a strong association between the EACCD grouping and AJCC staging (rank correlation = 0.9289; p‐value = 6.7 × 10^-22). A further analysis demonstrated that age and histological tumor type could be integrated with the TNM. Data were stratified into 12 prognostic groups with an even higher prediction accuracy (C‐index = 0.7468 vs. 0.7247, increase in C‐index = 0.0221, 95% CI: 0.0212–0.0231, p‐value < 5 × 10^-324). Conclusions EACCD can be successfully applied to integrate additional factors with T, N, M for lung cancer patients.
Affiliation(s)
- Matthew Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Huan Wang
  - Department of Biostatistics, George Washington University, Washington, District of Columbia, USA
- Zhenqiu Liu
  - Department of Public Health Sciences, Penn State Cancer Institute, Hershey, Pennsylvania, USA
- Donald Henson
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
- Cuong Nguyen
  - Department of Pathology, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
- Dean Park
  - Department of Hematology-Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Li Sheng
  - Department of Mathematics, Drexel University, Philadelphia, Pennsylvania, USA
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
6. Hueman M, Wang H, Henson D, Chen D. Expanding the TNM for cancers of the colon and rectum using machine learning: a demonstration. ESMO Open 2019; 4:e000518. [PMID: 31275615 PMCID: PMC6579577 DOI: 10.1136/esmoopen-2019-000518] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/25/2019] [Revised: 04/23/2019] [Accepted: 04/24/2019] [Indexed: 12/28/2022]
Abstract
Objective The American Joint Committee on Cancer (AJCC) system for staging cancers of the colon and rectum includes depth of tumour penetration, number of positive lymph nodes and presence or absence of metastasis. Using machine learning, we demonstrate that these factors can be integrated with age, carcinoembryonic antigen (CEA) interpretation and tumour location, to form prognostic systems that expand the tumour, lymph node, metastasis (TNM) staging system. Methods Two datasets on colon and rectal cancers were extracted from the Surveillance, Epidemiology and End Results Programme of the National Cancer Institute. Dataset 1 included three factors (tumour, lymph nodes and metastasis). Dataset 2 contained six factors (tumour, lymph nodes, metastasis, age, CEA interpretation and tumour location). The Ensemble Algorithm for Clustering Cancer Data (EACCD) and the C-index were applied to generate prognostic groups. Results The EACCD prognostic system based on dataset 1 stratified patients into 10 risk groups, analogous to the 10 stages of the AJCC staging system. There was a strong inter-system association between EACCD grouping and AJCC staging (Spearman’s rank correlation=0.9046, p value=1.6×10^-17). However, the EACCD system had a significantly higher survival prediction accuracy than the AJCC system (C-index=0.7802 and 0.7695, respectively for the EACCD system and AJCC system, p value=4.9×10^-91). Adding age, or CEA interpretation, or location improved the prediction accuracy of the prognostic system involving tumour, lymph nodes and metastasis. The EACCD prognostic system based on dataset 2 and all six factors stratified patients into 10 groups with the highest survival prediction accuracy (C-index=0.7914). Conclusions The EACCD can integrate multiple factors to stratify patients with colon or rectal cancer into risk groups that predict survival with a high accuracy.
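The inter-system association quoted in the abstract is a Spearman rank correlation between two ordinal groupings. A minimal sketch of that statistic follows, computed as the Pearson correlation of average ranks so that tied group labels are handled. The function names and the five-item example are hypothetical, and the abstract does not state whether the correlation was computed per patient or per TNM combination, so the unit of analysis here is an assumption.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average position of the tie block
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ordinal positions of five TNM combinations in two systems.
ajcc_stage_rank = [1, 2, 3, 4, 5]
eaccd_group_rank = [2, 1, 3, 4, 5]

rho = spearman(ajcc_stage_rank, eaccd_group_rank)
print(round(rho, 4))  # 0.9
```

A value near 1 indicates that the two systems order cases similarly, which is compatible with one of them still having a higher C-index.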
Affiliation(s)
- Matthew Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Huan Wang
  - Biostatistics, George Washington University, Washington, District of Columbia, USA
- Donald Henson
  - Department of Preventive Medicine & Biostatistics, Uniformed Services University, Bethesda, Maryland, USA
- Dechang Chen
  - PMB, Uniformed Services University, Bethesda, Maryland, USA
7. Yang CQ, Gardiner L, Wang H, Hueman MT, Chen D. Creating Prognostic Systems for Well-Differentiated Thyroid Cancer Using Machine Learning. Front Endocrinol (Lausanne) 2019; 10:288. [PMID: 31139148 PMCID: PMC6517862 DOI: 10.3389/fendo.2019.00288] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/02/2018] [Accepted: 04/23/2019] [Indexed: 11/17/2022]
Abstract
Updates to staging models are needed to reflect a greater understanding of tumor behavior and clinical outcomes for well-differentiated thyroid carcinomas. We used a machine learning algorithm and disease-specific survival data of differentiated thyroid carcinoma from the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute to integrate clinical factors to improve prognostic accuracy. The concordance statistic (C-index) was used to cut dendrograms resulting from the learning process to generate prognostic groups. We created one computational prognostic model (7 prognostic groups with C-index = 0.8583) based on tumor size (T), regional lymph nodes (N), status of distant metastasis (M), and age to mirror the contemporary American Joint Committee on Cancer (AJCC) staging system (C-index = 0.8387). We showed that adding histologic type (papillary and follicular) improved the survival prediction of the model. We also showed that 55 is the best cutoff of age in the model, consistent with the changes from the most recent 8th edition staging manual from AJCC. The demonstrated approach has the potential to create prognostic systems permitting data driven and real time analysis that can aid decision-making in patient management and prognostication.
Affiliation(s)
- Charles Q. Yang
  - Department of Otolaryngology, Walter Reed National Military Medical Center, Bethesda, MD, United States
- Lauren Gardiner
  - Class of 2020, Virginia Commonwealth University School of Medicine, Richmond, VA, United States
- Huan Wang
  - Department of Biostatistics, The George Washington University, Washington, DC, United States
- Matthew T. Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD, United States
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, United States
  - *Correspondence: Dechang Chen
8. Hueman MT, Wang H, Yang CQ, Sheng L, Henson DE, Schwartz AM, Chen D. Creating prognostic systems for cancer patients: A demonstration using breast cancer. Cancer Med 2018; 7:3611-3621. [PMID: 29968970 PMCID: PMC6089151 DOI: 10.1002/cam4.1629] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 04/16/2018] [Revised: 05/31/2018] [Accepted: 06/01/2018] [Indexed: 11/11/2022]
Abstract
Integrating additional prognostic factors into the tumor, lymph node, metastasis staging system improves the relative stratification of cancer patients and enhances the accuracy in planning their treatment options and predicting clinical outcomes. We describe a novel approach to build prognostic systems for cancer patients that can admit any number of prognostic factors. In the approach, an unsupervised learning algorithm was used to create dendrograms and the C‐index was used to cut dendrograms to generate prognostic groups. Breast cancer data from the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute were used for demonstration. Two relative prognostic systems were created for breast cancer. One system (7 prognostic groups with C‐index = 0.7295) was based on tumor size, regional lymph nodes, and no distant metastasis. The other system (7 prognostic groups with C‐index = 0.7458) was based on tumor size, regional lymph nodes, no distant metastasis, grade, estrogen receptor, progesterone receptor, and age. The dendrograms showed a relationship between survival and prognostic factors. The proposed approach is able to create prognostic systems that have a good accuracy in survival prediction and provide a manageable number of prognostic groups. The prognostic systems have the potential to permit a thorough database analysis of all information relevant to decision‐making in patient management and prognosis.
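The idea of using the C-index to cut a dendrogram can be sketched in a heavily simplified form. The sketch below is not EACCD: it assumes the dendrogram leaves (factor combinations) are already ordered from best to worst prognosis, that each merge joins two adjacent blocks, and that the cut is the smallest number of groups whose C-index is within a chosen tolerance of the best. The names `groups_for_k` and `choose_cut`, the selection rule, the tolerance, and all data are hypothetical.

```python
def c_index(times, events, scores):
    """Harrell's C with tied scores counted as 0.5 (assumed convention)."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


def groups_for_k(n_leaves, merge_order, k):
    """Cut the dendrogram into k groups.

    merge_order[m] is the index of the boundary (between leaf b and b+1)
    removed by the m-th merge. Keeping the last k-1 merged boundaries
    corresponds to stopping the merging when k groups remain.
    """
    kept = set(merge_order[len(merge_order) - (k - 1):])
    assignment, group = [], 0
    for leaf in range(n_leaves):
        assignment.append(group)
        if leaf in kept:  # the boundary after this leaf survives the cut
            group += 1
    return assignment


def choose_cut(leaf_of_patient, times, events, n_leaves, merge_order, tolerance):
    """Return the smallest k whose C-index is within tolerance of the best."""
    scores_by_k = {}
    for k in range(1, n_leaves + 1):
        leaf_to_group = groups_for_k(n_leaves, merge_order, k)
        patient_groups = [leaf_to_group[leaf] for leaf in leaf_of_patient]
        scores_by_k[k] = c_index(times, events, patient_groups)
    best = max(scores_by_k.values())
    chosen = min(k for k, c in scores_by_k.items() if best - c <= tolerance)
    return chosen, scores_by_k


# Toy data: 8 patients spread over 4 ordered leaves (not real SEER data).
leaf_of_patient = [0, 0, 1, 1, 2, 2, 3, 3]
times = [12, 11, 10, 9, 6, 5, 2, 1]
events = [0, 1, 1, 0, 1, 1, 1, 1]
merge_order = [0, 2, 1]  # first merge leaves 0-1, then 2-3, then the two blocks

chosen_k, scores_by_k = choose_cut(leaf_of_patient, times, events, 4, merge_order, 0.05)
print(chosen_k)  # 3
```

The trade-off this illustrates is the one the abstract points to: fewer groups are easier to manage, while more groups usually raise the C-index slightly, so the cut balances the two.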
Affiliation(s)
- Mathew T Hueman
  - Department of Surgical Oncology, John P. Murtha Cancer Center, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Huan Wang
  - Department of Biostatistics, The George Washington University, Washington, DC, USA
- Charles Q Yang
  - Department of Surgery, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Li Sheng
  - Department of Mathematics, Drexel University, Philadelphia, PA, USA
- Donald E Henson
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Arnold M Schwartz
  - Department of Pathology, School of Medicine and Health Sciences, The George Washington University, Washington, DC, USA
  - Department of Environmental and Occupational Health, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
9. An Algorithm for Creating Prognostic Systems for Cancer. J Med Syst 2016; 40:160. [DOI: 10.1007/s10916-016-0518-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/22/2015] [Accepted: 05/03/2016] [Indexed: 10/21/2022]
10. Zhang Z, Huang K, Gu C, Zhao L, Wang N, Wang X, Zhao D, Zhang C, Lu Y, Meng Y. Molecular Subtyping of Serous Ovarian Cancer Based on Multi-omics Data. Sci Rep 2016; 6:26001. [PMID: 27184229 PMCID: PMC4868982 DOI: 10.1038/srep26001] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 01/11/2016] [Accepted: 04/25/2016] [Indexed: 01/22/2023]
Abstract
Classification of ovarian cancer by morphologic features has a limited effect on serous ovarian cancer (SOC) treatment and prognosis. Here, we proposed a new system for SOC subtyping based on the molecular categories from the Cancer Genome Atlas project. We analyzed the DNA methylation, protein, microRNA, and gene expression of 1203 samples from 599 serous ovarian cancer patients. These samples were divided into nine subtypes based on RNA-seq data, and each subtype was found to be associated with the activation and/or suppression of the following four biological processes: immunoactivity, hormone metabolic, mesenchymal development and the MAPK signaling pathway. We also identified four DNA methylation, two protein expression, six microRNA sequencing and four pathway subtypes. By integrating the subtyping results across different omics platforms, we found that most RNA-seq subtypes overlapped with one or two subtypes from other omics data. Our study sheds light on the molecular mechanisms of SOC and provides a new perspective for the more accurate stratification of its subtypes.
Affiliation(s)
- Zhe Zhang
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
- Ke Huang
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
- Chenglei Gu
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
- Luyang Zhao
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
- Nan Wang
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
- Xiaolei Wang
  - Beijing Institute of Health Service and Medical Information, Beijing 100850, China
- Dongsheng Zhao
  - Beijing Institute of Health Service and Medical Information, Beijing 100850, China
- Chenggang Zhang
  - Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Cognitive and Mental Health Research Center, Beijing 100850, China
- Yiming Lu
  - Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Cognitive and Mental Health Research Center, Beijing 100850, China
- Yuanguang Meng
  - Department of Gynecologic Oncology, Chinese PLA General Hospital, Beijing 100853, China
11. Chen D, Hueman MT, Henson DE, Schwartz AM. An algorithm for expanding the TNM staging system. Future Oncol 2016; 12:1015-24. [PMID: 26904925 DOI: 10.2217/fon.16.5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/21/2022]
Abstract
AIM We describe a new method to expand the tumor, lymph node, metastasis (TNM) staging system using a clustering algorithm. Cases of breast cancer were used for demonstration. MATERIALS & METHODS An unsupervised ensemble-learning algorithm was used to create dendrograms. Cutting the dendrograms produced prognostic systems. RESULTS Prognostic systems contained groups of patients with similar outcomes. The prognostic systems based on tumor size and lymph node status recapitulated the general structure of the TNM for breast cancer. The prognostic systems based on tumor size, lymph node status, histologic grade and estrogen receptor status revealed a more detailed stratification of patients when grade and estrogen receptor status were added. CONCLUSION Prognostic systems from cutting the dendrogram have the potential to improve and expand the TNM.
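As a rough illustration of what "cutting the dendrogram" means, the sketch below groups patient categories by a single outcome summary and stops merging once a chosen number of groups remains. It is not the ensemble algorithm described in the abstract: plain average-linkage agglomerative clustering is swapped in as a stand-in, and the category names, the 5-year survival rates in `rates`, and the function name `cut_dendrogram` are all hypothetical values chosen for demonstration.

```python
from itertools import combinations

# Hypothetical 5-year survival rates per T/N category (illustrative numbers only,
# not taken from the cited study).
rates = {
    "T1N0": 0.95, "T2N0": 0.91, "T1N1": 0.84, "T2N1": 0.78,
    "T3N0": 0.76, "T3N1": 0.60, "T4N1": 0.40,
}


def cut_dendrogram(points, k):
    """Average-linkage agglomerative clustering, stopped when k clusters remain.

    For a monotone linkage such as average linkage, stopping the merges at k
    clusters gives the same groups as building the full dendrogram and cutting
    it at the level that yields k branches.
    """
    clusters = [[label] for label in points]

    def dist(a, b):
        # Mean absolute difference in survival rate between two clusters.
        total = sum(abs(points[x] - points[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: dist(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in cut_dendrogram(rates, 4):
        print(group)
```

Choosing a smaller `k` corresponds to cutting higher in the tree and yields a coarser prognostic system; a larger `k` yields a finer one.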
Affiliation(s)
- Dechang Chen
  - Department of Preventive Medicine & Biostatistics, The Uniformed Services University of the Health Sciences, 4301 Jones Bridge Rd, Bethesda, MD 20814, USA
- Matthew T Hueman
  - Surgical Oncology, John P Murtha Cancer Center, Walter Reed National Military Medical Center, 8901 Wisconsin Ave., Bethesda, MD 20889, USA
- Donald E Henson
  - Department of Preventive Medicine & Biostatistics, The Uniformed Services University of the Health Sciences, 4301 Jones Bridge Rd, Bethesda, MD 20814, USA
  - Department of Surgery, The Uniformed Services University of the Health Sciences, 4301 Jones Bridge Rd, Bethesda, MD 20814, USA
- Arnold M Schwartz
  - Department of Pathology, The George Washington University Medical Center, Washington, DC 20037, USA
  - Department of Surgery, The George Washington University Medical Center, Washington, DC 20037, USA
12. A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients. J Biomed Inform 2015; 58:49-59. [PMID: 26423562 DOI: 10.1016/j.jbi.2015.09.012] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 02/16/2015] [Revised: 08/13/2015] [Accepted: 09/20/2015] [Indexed: 11/23/2022]
Abstract
Liver cancer is the sixth most frequently diagnosed cancer and, particularly, Hepatocellular Carcinoma (HCC) represents more than 90% of primary liver cancers. Clinicians assess each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient, given the biological variability among individuals. Over the years, and for the particular case of Hepatocellular Carcinoma, some research studies have been developing strategies for assisting clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from the clinical data. However, these studies have some limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers the heterogeneity between patients nor the presence of missing data, a common drawback in healthcare contexts. In this work, a real complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures of this work are based on data imputation considering appropriate distance metrics for both heterogeneous and missing data (HEOM) and clustering studies to assess the underlying patient groups in the studied dataset (K-means). The final approach is applied in order to diminish the impact of underlying patient profiles with reduced sizes on survival prediction. It is based on K-means clustering and the SMOTE algorithm to build a representative dataset and use it as training example for different machine learning procedures (logistic regression and neural networks). 
The results are evaluated in terms of survival prediction and compared across baseline approaches that do not consider clustering and/or oversampling using the Friedman rank test. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
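The general idea of cluster-based oversampling can be sketched as follows: cluster the patients first, then balance the outcome classes inside each cluster by interpolating between existing minority samples. This is only a simplified illustration under several assumptions, not the authors' pipeline. It uses Euclidean distance on complete numeric features (the abstract mentions HEOM for heterogeneous and missing data), interpolates toward the single nearest same-class neighbor rather than one of several neighbors as in standard SMOTE, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Basic Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def interpolate(samples, n_new, rng):
    """SMOTE-like step: new points on the segment between a sample and its nearest neighbor."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        a = samples[i]
        others = samples[:i] + samples[i + 1:]
        if not others:
            synthetic.append(a)  # a lone sample can only be duplicated
            continue
        b = min(others, key=lambda s: sq_dist(a, s))
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


def cluster_oversample(X, y, k, seed=0):
    """Balance the classes inside each k-means cluster; returns augmented copies of X and y."""
    rng = random.Random(seed)
    labels = kmeans(X, k, seed=seed)
    X_new, y_new = list(X), list(y)
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        counts = Counter(y[i] for i in idx)
        if not counts:
            continue
        target = max(counts.values())
        for cls, n in counts.items():
            if n < target:
                members = [X[i] for i in idx if y[i] == cls]
                extra = interpolate(members, target - n, rng)
                X_new.extend(extra)
                y_new.extend([cls] * len(extra))
    return X_new, y_new


# Toy data: two well-separated patient groups with imbalanced outcomes (hypothetical).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3), (5.4, 5.0)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

if __name__ == "__main__":
    X_bal, y_bal = cluster_oversample(X, y, k=2)
    print(Counter(y_bal))
```

Balancing within clusters rather than over the whole dataset keeps synthetic patients close to the profile of the subgroup they were generated from, which is the motivation the abstract gives for combining clustering with oversampling.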
|