Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang X, Li Y, Akinyemiju T, Ojesina AI, Buckhaults P, Liu N, Xu B, Yi N. Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach. Genetics 2017;205:89-100. [PMID: 28049703 PMCID: PMC5223526 DOI: 10.1534/genetics.116.189191] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 03/14/2016] [Accepted: 10/31/2016] [Indexed: 12/11/2022] Open

Cited by Other Article(s)

Shen J, Wang S, Sun H, Huang J, Bai L, Wang X, Dong Y, Tang Z. A novel non-negative Bayesian stacking modeling method for Cancer survival prediction using high-dimensional omics data. BMC Med Res Methodol 2024;24:105. [PMID: 38702624 PMCID: PMC11067084 DOI: 10.1186/s12874-024-02232-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 04/23/2024] [Indexed: 05/06/2024] Open

Abstract

BACKGROUND

Survival prediction using high-dimensional molecular data is a hot topic in the field of genomics and precision medicine, especially for cancer studies. Considering that carcinogenesis has a pathway-based pathogenesis, developing models using such group structures is a closer mimic of disease progression and prognosis. Many approaches can be used to integrate group information; however, most of them are single-model methods, which may account for unstable prediction.

METHODS

We introduced a novel survival stacking method that uses group structure information to improve the robustness of cancer survival prediction in the context of high-dimensional omics data. With a super learner, survival stacking combines the predictions from multiple sub-models that are independently trained using the features in pre-grouped biological pathways. In addition to a non-negative linear combination of sub-models, we extended the super learner to a non-negative Bayesian hierarchical generalized linear model and an artificial neural network. We compared the proposed modeling strategy with the widely used penalized survival method Lasso Cox and several group penalized methods, e.g., group Lasso Cox, via a simulation study and a real-world data application.

RESULTS

The proposed survival stacking method showed superior and robust performance in terms of discrimination compared with single-model methods on high-noise simulated data and real-world data. The non-negative Bayesian stacking method can identify important biological signal pathways and genes that are associated with the prognosis of cancer.

CONCLUSIONS

This study proposed a novel survival stacking strategy that incorporates biological group information into cancer prognosis models. Additionally, this study extended the super learner to a non-negative Bayesian model and an ANN, enriching the combination of sub-models. The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.

Affiliation(s)

Junjie Shen Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Shuo Wang Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, 79085, Freiburg, Germany
Hao Sun Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Jie Huang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Lu Bai Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Xichao Wang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Yongfei Dong Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China
Zaixiang Tang Department of Biostatistics, School of Public Health, Jiangsu Key Laboratory of Preventive and Translational Medicine for Major Chronic Non-communicable Diseases, MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Medical College of Soochow University, Suzhou, Jiangsu, 215123, People's Republic of China.

Shen J, Wang S, Dong Y, Sun H, Wang X, Tang Z. A non-negative spike-and-slab lasso generalized linear stacking prediction modeling method for high-dimensional omics data. BMC Bioinformatics 2024;25:119. [PMID: 38509499 PMCID: PMC10953151 DOI: 10.1186/s12859-024-05741-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 03/11/2024] [Indexed: 03/22/2024] Open

Pu J, Yu H, Guo Y. A Novel Strategy to Identify Prognosis-Relevant Gene Sets in Cancers. Genes (Basel) 2022;13:862. [PMID: 35627247 PMCID: PMC9141699 DOI: 10.3390/genes13050862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 05/06/2022] [Accepted: 05/09/2022] [Indexed: 11/16/2022] Open

Shen Z, Jin Y, Sun Q, Zhang S, Chen X, Hu L, He C, Wang Y, Liu Q, Zhang H, Liu X, Wang L, Jiao J, Miao Y, Gu W, Wang F, Wang C, Shi Y, Ye J, Zhu T, Sun C, Song X, Xu L, Yan D, Sun H, Cao J, Li D, Li Z, Wang Z, Huang S, Xu K, Sang W. A Novel Prognostic Index Model for Adult Hemophagocytic Lymphohistiocytosis: A Multicenter Retrospective Analysis in China. Front Immunol 2022;13:829878. [PMID: 35251016 PMCID: PMC8894441 DOI: 10.3389/fimmu.2022.829878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 01/28/2022] [Indexed: 11/13/2022] Open

Affiliation(s)

Ziyuan Shen Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
Yingliang Jin Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
Qian Sun Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Shuo Zhang Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Xi Chen Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
Lingling Hu Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Chenlu He Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
Ying Wang Department of Personnel, Suqian First Hospital, Suqian, China
Qinhua Liu Department of Hematology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
Hao Zhang Department of Hematology, The Affiliated Hospital of Jining Medical University, Jining, China
Xin Liu Department of Hematology, The Affiliated Hospital of Jining Medical University, Jining, China
Ling Wang Department of Hematology, Taian Central Hospital, Taian, China
Jun Jiao Department of Hematology, Taian Central Hospital, Taian, China
Yuqing Miao Department of Hematology, Yancheng First People’s Hospital, Yancheng, China
Weiying Gu Department of Hematology, The First People’s Hospital of Changzhou, Changzhou, China
Fei Wang Department of Hematology, The First People’s Hospital of Changzhou, Changzhou, China
Chunling Wang Department of Hematology, Huai’an First People’s Hospital, Huai’an, China
Yuye Shi Department of Hematology, Huai’an First People’s Hospital, Huai’an, China
Jingjing Ye Department of Hematology, Qilu Hospital of Shandong University, Jinan, China
Taigang Zhu Department of Hematology, The General Hospital of Wanbei Coal-Electric Group, Suzhou, China
Cai Sun Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Xuguang Song Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Linyan Xu Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Dongmei Yan Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Haiying Sun Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Jiang Cao Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Depeng Li Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Zhenyu Li Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
Zhao Wang Department of Hematology, Beijing Friendship Hospital, Capital Medical University, Beijing, China
Shuiping Huang Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China (corresponding author)
Kailin Xu Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China (corresponding author)
Wei Sang Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China (corresponding author)

Novel application of survival models for predicting microbial community transitions with variable selection for eDNA. Appl Environ Microbiol 2022;88:e0214621. [PMID: 35138931 DOI: 10.1128/aem.02146-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open

Abstract

Survival analysis is a prolific statistical tool in medicine for inferring risk and time to disease-related events. However, it is under-utilized in microbiome research to predict microbial community mediated events, partly due to the sparsity and high-dimensional nature of the data. We advance the application of Cox proportional hazards (Cox PH) survival models to environmental DNA (eDNA) data with feature selection suitable for filtering irrelevant and redundant taxonomic variables. Selection methods are compared in terms of false positives, sensitivity, and survival estimation accuracy in simulation and in a real data setting to forecast harmful cyanobacterial blooms. A novel extension of a method for selecting microbial biomarkers with survival data (SuRFCox) reliably outperforms other methods. We determine Cox PH models with SuRFCox selected predictors are more robust to varied signal, noise, and data correlation structure. SuRFCox also yields the most accurate and consistent prediction of blooms according to cross-validated testing by year over eight different bloom seasons. Identification of common biomarkers among validated survival forecasts over changing conditions has clear biological significance. Survival models with such biomarkers inform risk assessment and provide insight into the causes of critical community transitions.

IMPORTANCE

In this paper, we report on a novel approach of selecting microorganisms for model-based prediction of the time to critical microbially-modulated events (e.g., harmful algal blooms, clinical outcomes, community shifts, etc.). Our novel method for identifying biomarkers from large, dynamic communities of microbes has broad utility to environmental and ecological impact risk assessment and public health. Results will also promote theoretical and practical advancements relevant to the biology of specific organisms.

To address the unique challenge posed by diverse environmental conditions and sparse microbes, we developed a novel method of selecting predictors for modelling time-to-event data. Competing methods for selecting predictors are rigorously compared to determine which is the most accurate and generalizable. Model forecasts are applied to show suitable predictors can precisely quantify the risk over time of biological events like harmful cyanobacterial blooms.

Saad M, He S, Thorstad W, Gay H, Barnett D, Zhao Y, Ruan S, Wang X, Li H. Learning-based Cancer Treatment Outcome Prognosis using Multimodal Biomarkers. IEEE TRANSACTIONS ON RADIATION AND PLASMA MEDICAL SCIENCES 2022;6:231-244. [PMID: 35520102 PMCID: PMC9066560 DOI: 10.1109/trpms.2021.3104297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]

Chu J, Sun NA, Hu W, Chen X, Yi N, Shen Y. The Application of Bayesian Methods in Cancer Prognosis and Prediction. Cancer Genomics Proteomics 2022;19:1-11. [PMID: 34949654 PMCID: PMC8717957 DOI: 10.21873/cgp.20298] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/18/2021] [Revised: 11/24/2021] [Accepted: 11/30/2021] [Indexed: 11/10/2022] Open

Shen J, Liu J, Li H, Bai L, Du Z, Geng R, Cao J, Sun P, Tang Z. Explore association of genes in PDL1/PD1 pathway to radiotherapy survival benefit based on interaction model strategy. Radiat Oncol 2021;16:223. [PMID: 34794456 PMCID: PMC8600865 DOI: 10.1186/s13014-021-01951-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/06/2021] [Accepted: 11/08/2021] [Indexed: 02/25/2023] Open

Affiliation(s)

Junjie Shen Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China
Jingfang Liu Department of Gynaecology and Obstetrics, The First Affiliated Hospital of Soochow University, Suzhou, 215123, China
Huijun Li Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China
Lu Bai Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China
Zixuan Du Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China
Ruirui Geng Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China
Jianping Cao School of Radiation Medicine and Protection and Collaborative Innovation Center of Radiation Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou, 215006, China
Peng Sun Department of Otolaryngology, The First Affiliated Hospital of Soochow University, Suzhou, 215123, China
Zaixiang Tang Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, 215123, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, 215123, China

Zheng X, Amos CI, Frost HR. Pan-cancer evaluation of gene expression and somatic alteration data for cancer prognosis prediction. BMC Cancer 2021;21:1053. [PMID: 34563154 PMCID: PMC8467202 DOI: 10.1186/s12885-021-08796-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Accepted: 08/16/2021] [Indexed: 02/04/2023] Open

Abstract

BACKGROUND

Over the past decades, approaches for diagnosing and treating cancer have seen significant improvement. However, the variability of patient and tumor characteristics has limited progress on methods for prognosis prediction. The development of high-throughput omics technologies now provides multiple approaches for characterizing tumors. Although a large number of published studies have focused on integration of multi-omics data and use of pathway-level models for cancer prognosis prediction, there still exists a gap of knowledge regarding the prognostic landscape across multi-omics data for multiple cancer types using both gene-level and pathway-level predictors.

METHODS

In this study, we systematically evaluated three commonly available types of omics data (gene expression, copy number variation and somatic point mutation) covering both DNA-level and RNA-level features. We evaluated the landscape of predictive performance of these three omics modalities for 33 cancer types in the TCGA using a Lasso or Group Lasso-penalized Cox model and either gene- or pathway-level predictors.

RESULTS

We constructed the prognostic landscape using three types of omics data for 33 cancer types on both the gene and pathway levels. Based on this landscape, we found that predictive performance is cancer type dependent and we also highlighted the cancer types and omics modalities that support the most accurate prognostic models. In general, models estimated on gene expression data provide the best predictive performance on either gene or pathway level and adding copy number variation or somatic point mutation data to gene expression data does not improve predictive performance, with some exceptional cohorts including low grade glioma and thyroid cancer. In general, pathway-level models have better interpretative performance, higher stability and smaller model size across multiple cancer types and omics data types relative to gene-level models.

CONCLUSIONS

Based on this landscape and comprehensive comparison, models estimated on gene expression data provide the best predictive performance at either the gene or pathway level. Pathway-level models have better interpretative performance, higher stability and smaller model size relative to gene-level models.

A potential prognostic prediction model of colon adenocarcinoma with recurrence based on prognostic lncRNA signatures. Hum Genomics 2020;14:24. [PMID: 32522293 PMCID: PMC7288433 DOI: 10.1186/s40246-020-00270-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/09/2019] [Accepted: 05/13/2020] [Indexed: 12/18/2022] Open

Abstract

BACKGROUND

Colon adenocarcinoma (COAD) is a common gastrointestinal malignancy with a high mortality rate and poor prognosis due to delayed diagnosis. This study aimed to construct a prognostic prediction model for patients with COAD recurrence.

METHODS

Differentially expressed RNAs (DERs) between recurrence and non-recurrence COAD samples were identified based on expression profile data from the NCBI Gene Expression Omnibus (GEO) repository and The Cancer Genome Atlas (TCGA) database. Then, a classifier discriminating recurrent COAD was established using the SVM-RFE algorithm, and the receiver operating characteristic curve was used to assess the predictive power of the classifier. The prognostic prediction model was then constructed based on univariate and multivariate Cox regression analysis, and Kaplan-Meier survival curve analysis was used to evaluate this model. Finally, the co-expression network of DElncRNAs and DEmRNAs was constructed, followed by GO and KEGG pathway enrichment analysis.

RESULTS

A total of 54 optimized signature DElncRNAs were screened and an SVM classifier was constructed, which distinguished recurrence and non-recurrence COAD samples with high accuracy. Furthermore, six independent prognostic lncRNA signatures (LINC00852, ZNF667-AS1, FOXP1-IT1, LINC01560, TAF1A-AS1, and LINC00174) in COAD patients with recurrence were screened, and the prognostic prediction model for recurrent COAD was constructed, which showed relatively satisfactory predictive ability in both the training dataset and the validation dataset. In addition, the DEmRNAs in the co-expression network were mainly enriched in glycan biosynthesis, cardiac muscle contraction, and colorectal cancer.

CONCLUSIONS

Our study revealed that six lncRNA signatures acted as independent prognostic biomarkers for patients with COAD recurrence.

Zhao Z, Li Y, Wu Y, Chen R. Deep learning-based model for predicting progression in patients with head and neck squamous cell carcinoma. Cancer Biomark 2020;27:19-28. [PMID: 31658045 DOI: 10.3233/cbm-190380] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/14/2022]

Wu G, Zhang M. A novel risk score model based on eight genes and a nomogram for predicting overall survival of patients with osteosarcoma. BMC Cancer 2020;20:456. [PMID: 32448271 PMCID: PMC7245838 DOI: 10.1186/s12885-020-06741-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/18/2019] [Accepted: 03/12/2020] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

This study aims to identify a predictive model to predict survival outcomes of osteosarcoma (OS) patients.

METHODS

An RNA sequencing dataset (the training set) and a microarray dataset (the validation set) were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, respectively. Differentially expressed genes (DEGs) between metastatic and non-metastatic OS samples were identified in the training set. Prognosis-related DEGs were screened and optimized by support vector machine (SVM) recursive feature elimination. An SVM classifier was built to classify metastatic and non-metastatic OS samples. Independent prognostic genes were extracted by multivariate regression analysis to build a risk score model, followed by performance evaluation in the two datasets by Kaplan-Meier (KM) analysis. Independent clinical prognostic indicators were identified, followed by nomogram analysis. Finally, functional analyses of survival-related genes were conducted.

RESULT

In total, 345 DEGs and 45 prognosis-related genes were screened. An SVM classifier could distinguish metastatic and non-metastatic OS samples. An eight-gene signature was an independent prognostic marker and was used to construct a risk score model. The risk score model could separate OS samples into high- and low-risk groups in the two datasets (training set: log-rank p < 0.01, C-index = 0.805; validation set: log-rank p < 0.01, C-index = 0.797). Tumor metastasis and RS model status were independent prognostic factors, and the nomogram model exhibited accurate survival prediction for OS. Additionally, functional analyses of survival-related genes indicated they were closely associated with immune responses and the cytokine-cytokine receptor interaction pathway.

CONCLUSION

An eight-gene predictive model and nomogram were developed to predict OS prognosis.

Zheng X, Amos CI, Frost HR. Comparison of pathway and gene-level models for cancer prognosis prediction. BMC Bioinformatics 2020;21:76. [PMID: 32111152 PMCID: PMC7048092 DOI: 10.1186/s12859-020-3423-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/24/2019] [Accepted: 02/17/2020] [Indexed: 02/08/2023] Open

Abstract

BACKGROUND

Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biological pathways. While many studies have used pathway-level predictors for cancer survival analysis, a comprehensive comparison of pathway-level and gene-level prognostic models has not been performed. To address this gap, we characterized the performance of penalized Cox proportional hazard models built using either pathway- or gene-level predictors for the cancers profiled in The Cancer Genome Atlas (TCGA) and pathways from the Molecular Signatures Database (MSigDB).

RESULTS

When analyzing TCGA data, we found that pathway-level models are more parsimonious, more robust, more computationally efficient and easier to interpret than gene-level models with similar predictive performance. For example, both pathway-level and gene-level models have an average Cox concordance index of ~0.85 for the TCGA glioma cohort; however, the gene-level model has twice as many predictors on average, its predictor composition is less stable across cross-validation folds, and estimation takes 40 times as long as for the pathway-level model. When the complex correlation structure of the data is broken by permutation, the pathway-level model has greater predictive performance while still retaining superior interpretative power, robustness, parsimony and computational efficiency relative to the gene-level models. For example, the average concordance index of the pathway-level model increases to 0.88 while that of the gene-level model falls to 0.56 for the TCGA glioma cohort using survival times simulated from uncorrelated gene expression data.

CONCLUSION

The results of this study show that when the correlations among gene expression values are low, pathway-level analyses can yield better predictive performance, greater interpretative power, more robust models and less computational cost relative to a gene-level model. When correlations among genes are high, a pathway-level analysis provides equivalent predictive power compared to a gene-level analysis while retaining the advantages of interpretability, robustness and computational efficiency.

Yi B, Tang C, Tao Y, Zhao Z. Definition of a novel vascular invasion-associated multi-gene signature for predicting survival in patients with hepatocellular carcinoma. Oncol Lett 2020;19:147-158. [PMID: 31897125 PMCID: PMC6923904 DOI: 10.3892/ol.2019.11072] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/05/2019] [Accepted: 09/11/2019] [Indexed: 12/12/2022] Open

Fang J. Tightly integrated genomic and epigenomic data mining using tensor decomposition. Bioinformatics 2019;35:112-118. [PMID: 29939222 DOI: 10.1093/bioinformatics/bty513] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/17/2018] [Accepted: 06/21/2018] [Indexed: 12/12/2022] Open

Quinn TP, Lee SC, Venkatesh S, Nguyen T. Improving the classification of neuropsychiatric conditions using gene ontology terms as features. Am J Med Genet B Neuropsychiatr Genet 2019;180:508-518. [PMID: 31025483 DOI: 10.1002/ajmg.b.32727] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/22/2018] [Revised: 02/14/2019] [Accepted: 03/08/2019] [Indexed: 11/11/2022]

Cheerla A, Gevaert O. Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics 2019;35:i446-i454. [PMID: 31510656 PMCID: PMC6612862 DOI: 10.1093/bioinformatics/btz342] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Indexed: 12/22/2022] Open

Dereli O, Oğuz C, Gönen M. Path2Surv: Pathway/gene set-based survival analysis using multiple kernel learning. Bioinformatics 2019;35:5137-5145. [DOI: 10.1093/bioinformatics/btz446] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/02/2018] [Revised: 05/17/2019] [Accepted: 05/25/2019] [Indexed: 12/18/2022] Open

Zhang Y, Yang W, Li D, Yang JY, Guan R, Yang MQ. Toward the precision breast cancer survival prediction utilizing combined whole genome-wide expression and somatic mutation analysis. BMC Med Genomics 2018;11:104. [PMID: 30454048 DOI: 10.1109/bibm.2017.8217762] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 05/23/2023] Open

Abstract

BACKGROUND

Breast cancer is the most common type of invasive cancer in women. It accounts for approximately 18% of all cancer deaths worldwide. It is well known that somatic mutation plays an essential role in cancer development. Hence, we propose that a prognostic prediction model that integrates somatic mutations with gene expression can improve survival prediction for cancer patients and also reveal the genetic mutations associated with survival.

METHOD

Differential expression analysis was used to identify breast cancer-related genes. A genetic algorithm (GA) and univariate Cox regression analysis were applied to select survival-related genes. DAVID was used for enrichment analysis on the somatically mutated gene set. The performance of the survival predictors was assessed with a Cox regression model and the concordance index (C-index).

RESULTS

We investigated the genome-wide gene expression profile and somatic mutations of 1091 breast invasive carcinoma cases from The Cancer Genome Atlas (TCGA). We identified 118 genes with high hazard ratios as breast cancer survival risk gene candidates (log rank p < 0.0001 and c-index = 0.636). Multiple breast cancer survival-related genes were found in this gene set, including FOXR2, FOXD1, MTNR1B and SDC1. Further genetic algorithm (GA) analysis revealed an optimal gene set consisting of 88 genes with a higher c-index (log rank p < 0.0001 and c-index = 0.656). We validated this gene set on an independent breast cancer data set and achieved a similar performance (log rank p < 0.0001 and c-index = 0.614). Moreover, we revealed 25 functional annotations, 15 gene ontology terms and 14 pathways that were significantly enriched in the genes that showed distinct mutation patterns in the different survival risk groups. These functional gene sets were used as new features for the survival prediction model. In particular, our results suggested that the Fanconi anemia pathway had an important role in breast cancer prognosis.

CONCLUSIONS

Our study indicated that the expression levels of the gene signatures remain effective indicators for breast cancer survival prediction. Combining the gene expression information with other types of features derived from somatic mutations can further improve the performance of survival prediction. The pathways that our study found to be associated with survival risk can be further investigated for improving cancer patient survival.

Zhang Y, Yang W, Li D, Yang JY, Guan R, Yang MQ. Toward the precision breast cancer survival prediction utilizing combined whole genome-wide expression and somatic mutation analysis. BMC Med Genomics 2018;11:104. [PMID: 30454048 PMCID: PMC6245494 DOI: 10.1186/s12920-018-0419-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/17/2022] Open

Abstract

Electronic supplementary material

The online version of this article (10.1186/s12920-018-0419-x) contains supplementary material, which is available to authorized users.

Zhang X, Li B, Han H, Song S, Xu H, Yi Z, Hong Y, Zhuang W, Yi N. Pathway-structured predictive modeling for multi-level drug response in multiple myeloma. Bioinformatics 2018;34:3609-3615. [PMID: 29850860 PMCID: PMC6198861 DOI: 10.1093/bioinformatics/bty436] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/28/2018] [Revised: 05/08/2018] [Accepted: 05/24/2018] [Indexed: 11/12/2022] Open

Abstract

Motivation

Molecular analyses suggest that myeloma is composed of distinct sub-types that have different molecular pathologies and various response rates to certain treatments. Drug responses in multiple myeloma (MM) are usually recorded as a multi-level ordinal outcome. One of the goals of drug response studies is to predict, with high probability, which response category a patient belongs to based on clinical and molecular features. However, as most genes have small effects, gene-based models may provide limited predictive accuracy. Methods for predicting multi-level ordinal drug responses by incorporating biological pathways are therefore desired but have not yet been developed.

Results

We propose a pathway-structured method for predicting multi-level ordinal responses using a two-stage approach. We first develop hierarchical ordinal logistic models and an efficient quasi-Newton algorithm for jointly analyzing numerous correlated variables. Our two-stage approach first obtains the linear predictor (called the pathway score) for each pathway by fitting all predictors within each pathway using the hierarchical ordinal logistic approach, and then combines the pathway scores as new predictors to build a predictive model. We applied the proposed method to two publicly available datasets for predicting multi-level ordinal drug responses in MM using large-scale gene expression data and pathway information. Our results show that our approach not only significantly improved the predictive performance compared with the corresponding gene-based model but also allowed us to identify biologically relevant pathways.

Availability and implementation

The proposed approach has been implemented in our R package BhGLM, which is freely available from the public GitHub repository https://github.com/abbyyan3/BhGLM.

Wang JH, Chen YH. Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait. BMC Bioinformatics 2018;19:335. [PMID: 30241463 PMCID: PMC6150983 DOI: 10.1186/s12859-018-2372-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/29/2018] [Accepted: 09/12/2018] [Indexed: 01/29/2023] Open

Ozturk K, Dow M, Carlin DE, Bejar R, Carter H. The Emerging Potential for Network Analysis to Inform Precision Cancer Medicine. J Mol Biol 2018;430:2875-2899. [PMID: 29908887 PMCID: PMC6097914 DOI: 10.1016/j.jmb.2018.06.016] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 03/12/2018] [Revised: 05/30/2018] [Accepted: 06/06/2018] [Indexed: 12/19/2022]

Yi N, Tang Z, Zhang X, Guo B. BhGLM: Bayesian hierarchical GLMs and survival models, with applications to genomics and epidemiology. Bioinformatics 2018;35:1419-1421. [PMID: 30219850 PMCID: PMC7963076 DOI: 10.1093/bioinformatics/bty803] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 07/25/2018] [Revised: 09/05/2018] [Accepted: 09/12/2018] [Indexed: 01/31/2023] Open

Deng K, Zhang F, Song W, Zhao W, Rong Z, Cai Y, Xu H, Lu M, Wang W, Li A, Hou Y, Li Z, Li K. Identification of pathway-based recurrence-associated signatures in optimally debulked patients with serous ovarian cancer. J Cell Biochem 2018;119:8564-8573. [PMID: 30126000 DOI: 10.1002/jcb.27098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2017] [Accepted: 04/26/2018] [Indexed: 11/06/2022]

Nedungadi P, Iyer A, Gutjahr G, Bhaskar J, Pillai AB. Data-Driven Methods for Advancing Precision Oncology. Curr Pharmacol Rep 2018;4:145-156. [PMID: 33520605 PMCID: PMC7845924 DOI: 10.1007/s40495-018-0127-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]

Abstract

PURPOSE OF REVIEW

This article discusses the advances, methods, challenges, and future directions of data-driven methods in advancing precision oncology for biomedical research, drug discovery, clinical research, and practice.

RECENT FINDINGS

Precision oncology provides individually tailored cancer treatment by considering an individual's genetic makeup, clinical, environmental, social, and lifestyle information. Challenges include voluminous, heterogeneous, and disparate data generated by different technologies with multiple modalities such as Omics, electronic health records, clinical registries and repositories, medical imaging, demographics, wearables, and sensors. Statistical and machine learning methods have been continuously adapting to the ever-increasing size and complexity of data. Precision Oncology supportive analytics have improved turnaround time in biomarker discovery and time-to-application of new and repurposed drugs. Precision oncology additionally seeks to identify target patient populations based on genomic alterations that are sensitive or resistant to conventional or experimental treatments. Predictive models have been developed for cancer progression and survivorship, drug sensitivity and resistance, and identification of the most suitable combination treatments for individual patient scenarios. In the future, clinical decision support systems need to be revamped to better incorporate knowledge from precision oncology, thus enabling clinical practitioners to provide precision cancer care.

SUMMARY

Open Omics datasets, machine learning algorithms, and predictive models have enabled the advancement of precision oncology. Clinical decision support systems with integrated electronic health record and Omics data are needed to provide data-driven recommendations to assist clinicians in disease prevention, early identification, and individualized treatment. Additionally, as cancer is a constantly evolving disorder, clinical decision systems will need to be continually updated based on more recent knowledge and datasets.

Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep Learning-Based Multi-Omics Integration Robustly Predicts Survival in Liver Cancer. Clin Cancer Res 2018;24:1248-1259. [PMID: 28982688 PMCID: PMC6050171 DOI: 10.1158/1078-0432.ccr-17-0853] [Citation(s) in RCA: 490] [Impact Index Per Article: 81.7] [Received: 03/23/2017] [Revised: 06/18/2017] [Accepted: 10/02/2017] [Indexed: 02/07/2023]

Kuznetsov VA, Tang Z, Ivshina AV. Identification of common oncogenic and early developmental pathways in the ovarian carcinomas controlling by distinct prognostically significant microRNA subsets. BMC Genomics 2017;18:692. [PMID: 28984201 PMCID: PMC5629558 DOI: 10.1186/s12864-017-4027-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022] Open

Abstract

Background

High-grade serous ovarian carcinoma (HG-SOC) is the dominant tumor histologic type in epithelial ovarian cancers, exhibiting highly aberrant microRNA expression profiles and diverse pathways that collectively determine the disease aggressiveness and clinical outcomes. However, the functional relationships between microRNAs, the common pathways controlled by the microRNAs and their prognostic and therapeutic significance remain poorly understood.

Methods

We investigated the gene expression patterns of microRNAs in the tumors of 582 HG-SOC patients to identify prognosis signatures and pathways controlled by tumor miRNAs. We developed a variable selection and prognostic method that robustly selects small subsets of predictive features (e.g., expressed microRNAs) that collectively serve as biomarkers for a cancer risk and progression stratification system, interconnecting these features with common cancer-related pathways.

Results

Across different cohorts, our meta-analysis revealed two robust and unbiased miRNA-based prognostic classifiers. Each classifier reproducibly discriminates HG-SOC patients into high-confidence low-, intermediate- or high-prognostic risk subgroups with essentially different 5-year overall survival rates of 51.6-85%, 20-38.1%, and 0-10%, respectively. Significant correlations of the risk subgroup’s stratification with chemotherapy treatment response were observed. We predicted specific target genes involved in nine cancer-related and two oocyte maturation pathways (neurotrophin and progesterone-mediated oocyte maturation), where each gene can be controlled by more than one miRNA species of the distinct miRNA HG-SOC prognostic classifiers.

Conclusions

We identified robust and reproducible miRNA-based prognostic subsets of the HG-SOC classifiers. The miRNAs of these classifiers could control nine oncogenic and two developmental pathways, highlighting common underlying pathologic mechanisms and prospective targets for the further development of a personalized prognosis assay(s) and the development of miRNA-interconnected pathway-centric and multi-agent therapeutic intervention.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-017-4027-5) contains supplementary material, which is available to authorized users.
