1
Vanmathi P, Jose D. An ensemble-based serial cascaded attention network and improved variational auto encoder for breast cancer prognosis prediction using data. Comput Methods Biomech Biomed Engin 2024; 27:98-115. [PMID: 38006210 DOI: 10.1080/10255842.2023.2280883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 11/02/2023] [Indexed: 11/26/2023]
Abstract
Breast cancer is one of the most common types of cancer in women and causes a large number of deaths worldwide. Early recognition lessens its impact: it can persuade patients to receive surgical therapy, which significantly improves the chance of recovery, whereas late recognition can lead to death. Machine learning techniques use this information to find relationships within it and to assess predictions for new cases. An accurate predictive framework for breast cancer is therefore urgently needed. To accomplish this objective, an adaptive ensemble model is proposed for breast cancer prognosis prediction using data. In the initial stage, the raw data are fetched from benchmark datasets, followed by data cleaning and preprocessing. Subsequently, the pre-processed data are fed into the Improved Variational Autoencoder (IVAE), where the deep features are extracted. Finally, the resultant features are given as input to the Ensemble-based Serial Cascaded Attention Network (ESCANet), which is built with a Deep Temporal Convolution Network (DTCN), Bi-directional Long Short-Term Memory (BiLSTM), and a Recurrent Neural Network (RNN). The effectiveness of the model is validated and compared with conventional methodologies. The results indicate that the proposed methodology achieves strong results and thus increases the system's efficiency.
Affiliation(s)
- P Vanmathi
- Full time Research Scholar, Department of ECE, KCG College of Technology, Karapakkam, Chennai, Tamil Nadu, India
- Deepa Jose
- Professor, Department of ECE, KCG College of Technology, Karapakkam, Chennai, Tamil Nadu, India
2
Lin W, Saner NJ, Weng X, Caruana NJ, Botella J, Kuang J, Lee MJC, Jamnick NA, Pitchford NW, Garnham A, Bartlett JD, Chen H, Bishop DJ. The Effect of Sleep Restriction, With or Without Exercise, on Skeletal Muscle Transcriptomic Profiles in Healthy Young Males. Front Endocrinol (Lausanne) 2022; 13:863224. [PMID: 35937838 PMCID: PMC9355502 DOI: 10.3389/fendo.2022.863224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/26/2022] [Accepted: 06/22/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Inadequate sleep is associated with many detrimental health effects, including increased risk of developing insulin resistance and type 2 diabetes. These effects have been associated with changes to the skeletal muscle transcriptome, although this has not been characterised in response to a period of sleep restriction. Exercise induces a beneficial transcriptional response within skeletal muscle that may counteract some of the negative effects associated with sleep restriction. We hypothesised that sleep restriction would down-regulate transcriptional pathways associated with glucose metabolism, but that performing exercise would mitigate these effects. METHODS 20 healthy young males were allocated to one of three experimental groups: a Normal Sleep (NS) group (8 h time in bed per night (TIB), for five nights (11 pm - 7 am)), a Sleep Restriction (SR) group (4 h TIB, for five nights (3 am - 7 am)), and a Sleep Restriction and Exercise group (SR+EX) (4 h TIB, for five nights (3 am - 7 am) and three high-intensity interval exercise (HIIE) sessions (performed at 10 am)). RNA sequencing was performed on muscle samples collected pre- and post-intervention. Our data was then compared to skeletal muscle transcriptomic data previously reported following sleep deprivation (24 h without sleep). RESULTS Gene set enrichment analysis (GSEA) indicated there was an increased enrichment of inflammatory and immune response related pathways in the SR group post-intervention. However, in the SR+EX group the direction of enrichment in these same pathways occurred in the opposite directions. Despite this, there were no significant changes at the individual gene level from pre- to post-intervention. A set of genes previously shown to be decreased with sleep deprivation was also decreased in the SR group, but increased in the SR+EX group. 
CONCLUSION The alterations to inflammatory and immune related pathways in skeletal muscle, following five nights of sleep restriction, provide insight regarding the transcriptional changes that underpin the detrimental effects associated with sleep loss. Performing three sessions of HIIE during sleep restriction attenuated some of these transcriptional changes. Overall, the transcriptional alterations observed with a moderate period of sleep restriction were less evident than previously reported changes following a period of sleep deprivation.
Affiliation(s)
- Wentao Lin
- College of Exercise and Health, Guangzhou Sport University, Guangzhou, China
- Nicholas J. Saner
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
- Human Integrative Physiology, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Xiquan Weng
- College of Exercise and Health, Guangzhou Sport University, Guangzhou, China
- Nikeisha J. Caruana
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
- Department of Biochemistry and Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, Australia
- Javier Botella
- Department of Biochemistry and Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, Australia
- Jujiao Kuang
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
- Matthew J-C. Lee
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
- Nicholas A. Jamnick
- Metabolic Research Unit, Institute for Mental and Physical Health and Clinical Translation, School of Medicine, Deakin University, Geelong, VIC, Australia
- Nathan W. Pitchford
- School of Health Sciences, University of Tasmania, Launceston, TAS, Australia
- Andrew Garnham
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
- Hao Chen
- College of Exercise and Health, Guangzhou Sport University, Guangzhou, China
- *Correspondence: Hao Chen; David J. Bishop
- David J. Bishop
- Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
3
Transcriptome study of receptive endometrium in overweight and obese women shows important expression differences in immune response and inflammatory pathways in women who do not conceive. PLoS One 2021; 16:e0261873. [PMID: 34941965 PMCID: PMC8699967 DOI: 10.1371/journal.pone.0261873] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/15/2021] [Accepted: 12/12/2021] [Indexed: 12/15/2022] Open
Abstract
Obesity and being overweight are growing worldwide health problems that also affect women of reproductive age. They impair women’s fertility and are associated with lower IVF success rates. The mechanism by which increased body weight disrupts fertility has not yet been established. One possibility is that it affects the process of embryo implantation at the endometrial level. The purpose of our study was to determine the differences in enriched biological pathways in the endometrium of overweight and obese women undergoing IVF procedures. For this purpose, 14 patients (5 pregnant, 9 non-pregnant) were included in the study. Endometrial samples were obtained during the window of implantation and RNA sequencing was performed. There were no differences in general patient and IVF cycle characteristics between pregnant and non-pregnant women. In the endometrial samples of women who did not conceive, pathways related to the immune response, inflammation, and reactive oxygen species production were over-expressed. Our findings show that the reason for implantation failure in overweight and obese women could lie in the excessive immune and inflammatory response at the endometrial level.
4
Zhu J, Zhao L, Hu Y, Cui G, Luo A, Bao C, Han Y, Zhou T, Lu W, Wang J, Black SM, Tang H. Hypoxia-Inducible Factor 2-Alpha Mediated Gene Sets Differentiate Pulmonary Arterial Hypertension. Front Cell Dev Biol 2021; 9:701247. [PMID: 34422822 PMCID: PMC8375387 DOI: 10.3389/fcell.2021.701247] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/27/2021] [Accepted: 07/16/2021] [Indexed: 12/02/2022] Open
Abstract
OBJECTIVES HIF2α is of vital importance in the regulation of endothelial dysfunction, cell proliferation, migration, and pulmonary vascular remodeling in pulmonary hypertension. Our previous studies demonstrated that conditional and inducible deletion of HIF2α in mouse lung endothelial cells, dramatically protected the mice against vascular remodeling and the development of pulmonary arterial hypertension (PAH). Here, we provide a novel transcriptome insight into the impact of HIF2α in PAH pathogenesis and the potential to use HIF2α-mediated gene sets to differentiate PAH human subjects. METHODS Using transcriptome data, we first tapped the value of the difference in gene expression profile between wild type (WT) and Hif2a knockdown (KD) cell lines. We considered the deregulated genes between WT and Hif2a-KD cells as HIF2α influenced genes. By examining the lung tissue transcriptome data set with nine controls and eight PAH patients, we evaluated the HIF2α regulatory network in PAH pathogenesis to further determine the identification ability of HIF2α-mediated gene sets in human PAH subjects. On the other hand, using peripheral blood mononuclear cells (PBMCs) transcriptome data from PAH patients and healthy controls, we further validated the potential of the HIF2α-mediated PBMC gene sets as a possible diagnostic tool for PAH. To verify the ability of HIF2α-mediated gene sets for the identification of PAH, endothelial cell-specific Phd2 knockout mice with spontaneous pulmonary hypertension were used for reverse validation experiments. RESULTS 19 identified GO biological process terms were significantly correlated with the genes down-regulated in Hif2a-KD cells, all of which are strongly related to the PAH pathogenesis. We further assessed the discriminative power of these HIF2α-mediated gene sets in PAH human subjects. 
We found that the expression profile of the HIF2α-mediated gene sets in lung tissues and PBMCs were differentiated both between controls and PAH patients. Further, a significant positive correlation was observed between hypoxia and Phd2 deficiency mediated gene set expression profiles. As expected, 7 of the 19 significantly down-regulated GO terms in Hif2a-KD cells were found to overlap with the up-regulated GO gene sets in Phd2 EC-/- mice compared to WT controls, suggesting opposing effects of HIF2α and PHD2 on PAH pathogenesis. CONCLUSION HIF2α-mediated gene sets may be used to differentiate pulmonary arterial hypertension.
Affiliation(s)
- Jinsheng Zhu
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Li Zhao
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Yadan Hu
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Guoqi Cui
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Ang Luo
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Changlei Bao
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- Ying Han
- Department of Physiology, Nanjing Medical University, Nanjing, China
- Tong Zhou
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, NV, United States
- Wenju Lu
- State Key Laboratory of Respiratory Disease, Guangdong Key Laboratory of Vascular Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Jian Wang
- State Key Laboratory of Respiratory Disease, Guangdong Key Laboratory of Vascular Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Stephen M. Black
- Department of Cellular Biology and Pharmacology, Herbert Wertheim College of Medicine, Miami, FL, United States
- Department of Environmental Health Sciences, Robert Stempel College of Public Health and Social Work, Miami, FL, United States
- Center for Translational Science, Florida International University, Port St. Lucie, FL, United States
- Haiyang Tang
- College of Veterinary Medicine, Northwest A&F University, Xianyang, China
- State Key Laboratory of Respiratory Disease, Guangdong Key Laboratory of Vascular Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
5
Rahem SM, Epsi NJ, Coffman FD, Mitrofanova A. Genome-wide analysis of therapeutic response uncovers molecular pathways governing tamoxifen resistance in ER+ breast cancer. EBioMedicine 2020; 61:103047. [PMID: 33099086 PMCID: PMC7585053 DOI: 10.1016/j.ebiom.2020.103047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/06/2020] [Revised: 09/02/2020] [Accepted: 09/18/2020] [Indexed: 01/10/2023] Open
Abstract
Background Prioritization of breast cancer patients based on the risk of resistance to tamoxifen plays a significant role in personalized therapeutic planning and improving disease course and outcomes. Methods In this work, we demonstrate that a genome-wide pathway-centric computational framework elucidates molecular pathways as markers of tamoxifen resistance in ER+ breast cancer patients. In particular, we associated activity levels of molecular pathways with a wide spectrum of response to tamoxifen, which defined markers of tamoxifen resistance in patients with ER+ breast cancer. Findings We identified five biological pathways as markers of tamoxifen failure and demonstrated their ability to predict the risk of tamoxifen resistance in two independent patient cohorts (Test cohort1: log-rank p-value = 0.02, adjusted HR = 3.11; Test cohort2: log-rank p-value = 0.01, adjusted HR = 4.24). We have shown that these pathways are not markers of aggressiveness and outperform known markers of tamoxifen response. Furthermore, for adoption into clinic, we derived a list of pathway read-out genes and their associated scoring system, which assigns a risk of tamoxifen resistance for new incoming patients. Interpretation We propose that the identified pathways and their read-out genes can be utilized to prioritize patients who would benefit from tamoxifen treatment and patients at risk of tamoxifen resistance that should be offered alternative regimens. Funding This work was supported by the Rutgers SHP Dean's research grant, Rutgers start-up funds, Libyan Ministry of Higher Education and Scientific Research, and Katrina Kehlet Graduate Award from The NJ Chapter of the Healthcare Information Management Systems Society.
Affiliation(s)
- Sarra M Rahem
- Department of Biomedical and Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, USA
- Nusrat J Epsi
- Department of Biomedical and Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, USA
- Frederick D Coffman
- Department of Biomedical and Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, USA; Department of Physician Assistant Studies and Practice, USA; Department of Pathology & Laboratory Medicine, New Jersey Medical School, Newark, New Jersey 07107, USA
- Antonina Mitrofanova
- Department of Biomedical and Health Informatics, Rutgers School of Health Professions, Rutgers Biomedical and Health Sciences, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08901, USA.
6
Adnan N, Lei C, Ruan J. Robust edge-based biomarker discovery improves prediction of breast cancer metastasis. BMC Bioinformatics 2020; 21:359. [PMID: 32998692 PMCID: PMC7526355 DOI: 10.1186/s12859-020-03692-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/15/2022] Open
Abstract
Background The abundance of molecular profiling of breast cancer tissues entailed active research on molecular marker-based early diagnosis of metastasis. Recently there is a surging interest in combining gene expression with gene networks such as protein-protein interaction (PPI) network, gene co-expression (CE) network and pathway information to identify robust and accurate biomarkers for metastasis prediction, reflecting the common belief that cancer is a systems biology disease. However, controversy exists in the literature regarding whether network markers are indeed better features than genes alone for predicting as well as understanding metastasis. We believe much of the existing results may have been biased by the overly complicated prediction algorithms, unfair evaluation, and lack of rigorous statistics. In this study, we propose a simple approach to use network edges as features, based on two types of networks respectively, and compared their prediction power using three classification algorithms and rigorous statistical procedure on one of the largest datasets available. To detect biomarkers that are significant for the prediction and to compare the robustness of different feature types, we propose an unbiased and novel procedure to measure feature importance that eliminates the potential bias from factors such as different sample size, number of features, as well as class distribution. Results Experimental results reveal that edge-based feature types consistently outperformed gene-based feature type in random forest and logistic regression models under all performance evaluation metrics, while the prediction accuracy of edge-based support vector machine (SVM) model was poorer, due to the larger number of edge features compared to gene features and the lack of feature selection in SVM model. 
Experimental results also show that edge features are much more robust than gene features and the top biomarkers from edge feature types are statistically more significantly enriched in the biological processes that are well known to be related to breast cancer metastasis. Conclusions Overall, this study validates the utility of edge features as biomarkers but also highlights the importance of carefully designed experimental procedures in order to achieve statistically reliable comparison results.
Affiliation(s)
- Nahim Adnan
- Department of Computer Science, The University of Texas at San Antonio, One UTSA Circle, San Antonio, 78249, TX, USA
- Chengwei Lei
- Department of Computer & Electrical Engineering/Computer Science, California State University, Bakersfield, 9001 Stockdale Highway, Bakersfield, 93311, CA, USA
- Jianhua Ruan
- Department of Computer Science, The University of Texas at San Antonio, One UTSA Circle, San Antonio, 78249, TX, USA.
7
Momenzadeh M, Sehhati M, Rabbani H. Using hidden Markov model to predict recurrence of breast cancer based on sequential patterns in gene expression profiles. J Biomed Inform 2020; 111:103570. [PMID: 32961308 DOI: 10.1016/j.jbi.2020.103570] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/06/2020] [Revised: 09/06/2020] [Accepted: 09/10/2020] [Indexed: 12/16/2022]
Abstract
A new approach is presented to predict breast cancer recurrence through gene expression profiles using hidden Markov models (HMM). In this regard, 322 genes were selected from 44 published gene lists related to breast cancer prognosis. Afterwards, using gene set enrichment analysis, 922 gene sets were found from subsets of genes with the same biological meaning. In order to extract the sequential patterns from gene expression data, we ranked the gene sets using appropriate criteria and used an HMM in which the ranked gene sets were considered as observation sequences and the hidden states represented the priority of gene sets for discriminating between expression profiles. In this experiment, seven publicly available microarray datasets, including 1271 breast tumor samples, were used to classify cancer patients into two groups according to risk of recurrence. Our experiments indicated better performance and greater robustness of the proposed model compared with other widely used classification methods.
Affiliation(s)
- Mohammadreza Momenzadeh
- Department of Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mohammadreza Sehhati
- Department of Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran; Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran; Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
- Hossein Rabbani
- Department of Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran; Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
8
Kaspi A, Ziemann M. mitch: multi-contrast pathway enrichment for multi-omics and single-cell profiling data. BMC Genomics 2020; 21:447. [PMID: 32600408 PMCID: PMC7325150 DOI: 10.1186/s12864-020-06856-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 12/17/2019] [Accepted: 06/19/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Inference of biological pathway activity via gene set enrichment analysis is frequently used in the interpretation of clinical and other omics data. With the proliferation of new omics profiling approaches and ever-growing size of data sets generated, there is a lack of tools available to perform and visualise gene set enrichments in analyses involving multiple contrasts. RESULTS To address this, we developed mitch, an R package for multi-contrast gene set enrichment analysis. It uses a rank-MANOVA statistical approach to identify sets of genes that exhibit joint enrichment across multiple contrasts. Its unique visualisation features enable the exploration of enrichments in up to 20 contrasts. We demonstrate the utility of mitch with case studies spanning multi-contrast RNA expression profiling, integrative multi-omics, tool benchmarking and single-cell RNA sequencing. Using simulated data we show that mitch has similar accuracy to state of the art tools for single-contrast enrichment analysis, and superior accuracy in identifying multi-contrast enrichments. CONCLUSION mitch is a versatile tool for rapidly and accurately identifying and visualising gene set enrichments in multi-contrast omics data. Mitch is available from Bioconductor ( https://bioconductor.org/packages/mitch ).
Affiliation(s)
- Antony Kaspi
- Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Australia
- Department of Medical Biology, University of Melbourne, 1G Royal Parade, Parkville, VIC, 3052, Australia
- Mark Ziemann
- School of Life and Environmental Sciences, Deakin University, Geelong, Australia.
9
Adnan N, Liu Z, Huang THM, Ruan J. Comparative evaluation of network features for the prediction of breast cancer metastasis. BMC Med Genomics 2020; 13:40. [PMID: 32241278 PMCID: PMC7119280 DOI: 10.1186/s12920-020-0676-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022] Open
Abstract
Background Discovering a highly accurate and robust gene signature for the prediction of breast cancer metastasis from gene expression profiling of primary tumors is one of the most challenging tasks to reduce the number of deaths in women. Due to the limited success of gene-based features in achieving satisfactory prediction accuracy, many methodologies have been proposed in recent years to develop network-based features by integrating network information with gene expression. However, evaluation results have been too inconsistent to confirm the effectiveness of network-based features, because of the many confounding factors involved in the classification model learning process, such as data normalization, dimension reduction, and feature selection. An unbiased comparative evaluation is essential for uncovering the strength of network-based features. Methods In this study, we compared several types of network-based features obtained using different mathematical operators (Mean, Maximum, Minimum, Median, Variance) on a geneset (i.e., a gene and its neighbors in the network) in a protein-protein interaction network and a gene co-expression network for their ability to predict breast cancer metastasis using gene expression data from more than 10 patient cohorts. Results While network-based features are usually statistically more significant than gene-based features, a consistent improvement of prediction performance using network-based features requires a substantial number of patients in the dataset. Contrary to many previous reports, no evidence was found to support the robustness of network-based features and we argue some of the robustness may be due to the inherent bias associated with node degree in the network. In addition, different types of network features seem to cover different pathways and are complementary to each other. 
Consequently, an ensemble classifier combining different network features was proposed and was found to significantly outperform classifiers based on gene-based feature or any single type of network-based features. Conclusions Network-based features and their combination show promise for improving the prediction of breast cancer metastasis but may require a large amount of training data. Robustness claim of network-based features needs to be re-examined with network node degree and other confounding factors in consideration.
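The operator-based network features described in the Methods can be illustrated with a minimal sketch. The gene names, expression values, and network below are hypothetical toy data, and the use of population variance is an assumption; the abstract does not specify these details, so this is not the authors' implementation.

```python
from statistics import mean, median, pvariance

# Hypothetical toy data: one patient's expression values and an undirected gene network.
expression = {"A": 2.0, "B": 4.0, "C": 6.0, "D": 1.0}
network = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}

# The five operators named in the abstract (population variance is an assumption).
OPERATORS = {
    "mean": mean,
    "max": max,
    "min": min,
    "median": median,
    "variance": pvariance,
}

def network_features(expression, network, operator):
    """Aggregate each gene's expression with that of its direct network neighbors."""
    aggregate = OPERATORS[operator]
    features = {}
    for gene, neighbors in network.items():
        geneset = [gene] + list(neighbors)  # a gene and its neighbors
        values = [expression[g] for g in geneset if g in expression]
        features[gene] = aggregate(values)
    return features

mean_features = network_features(expression, network, "mean")
max_features = network_features(expression, network, "max")
var_features = network_features(expression, network, "variance")
```

Each operator yields one feature per gene, so the five operators give five alternative feature sets that a downstream classifier or ensemble could consume.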
Affiliation(s)
- Nahim Adnan
- Department of Computer Science, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249, USA
- Zhijie Liu
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78230, USA
- Tim H M Huang
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78230, USA
- Jianhua Ruan
- Department of Computer Science, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249, USA
- Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78230, USA.
10
Comparison of GeneChip, nCounter, and Real-Time PCR-Based Gene Expressions Predicting Locoregional Tumor Control after Primary and Postoperative Radiochemotherapy in Head and Neck Squamous Cell Carcinoma. J Mol Diagn 2020; 22:801-810. [PMID: 32247864 DOI: 10.1016/j.jmoldx.2020.03.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/19/2019] [Revised: 02/21/2020] [Accepted: 03/10/2020] [Indexed: 02/07/2023] Open
Abstract
This article compares the expression and applicability of biomarkers, from single genes and gene signatures, identified in patients with locally advanced head and neck squamous cell carcinoma using the GeneChip Human Transcriptome Array 2.0, nCounter, and real-time PCR analyses. Two multicenter, retrospective cohorts of patients with head and neck squamous cell carcinoma from the German Cancer Consortium Radiation Oncology Group who received postoperative radiochemotherapy or primary radiochemotherapy were considered. Real-time PCR was performed for a limited number of 38 genes of the cohort who received postoperative radiochemotherapy only. Correlations between the methods were evaluated by the Spearman rank correlation coefficient. Patients were stratified based on the expression of putative cancer stem cell markers, hypoxia-associated gene signatures, and a previously developed seven-gene signature. Locoregional tumor control was compared between these patient subgroups using log-rank tests. Gene expressions obtained from nCounter analyses were moderately correlated to GeneChip analyses (median ρ = approximately 0.68). A higher correlation was obtained between nCounter analyses and real-time PCR (median ρ = 0.84). Significant associations with locoregional tumor control were observed for most of the considered biomarkers evaluated by GeneChip and nCounter analyses. In general, all applied biomarkers (single genes and gene signatures) classified approximately 70% to 85% of the patients similarly. Overall, gene signatures seem to be more robust and had a better transferability among different measurement methods.
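The between-platform comparison above relies on the Spearman rank correlation coefficient. As a rough illustration, the sketch below computes it as the Pearson correlation of average ranks; the expression values are invented for the example and are not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical expression of one gene in five samples on two platforms.
platform_a = [5.1, 7.3, 6.0, 9.2, 8.4]
platform_b = [4.8, 7.9, 6.5, 8.1, 9.0]
rho = spearman_rho(platform_a, platform_b)
```

Because only ranks enter the calculation, the coefficient is insensitive to the different measurement scales of the platforms, which is presumably why a rank-based measure suits this kind of comparison.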
11
Cui ZJ, Zhou XH, Zhang HY. DNA Methylation Module Network-Based Prognosis and Molecular Typing of Cancer. Genes (Basel) 2019; 10:genes10080571. [PMID: 31357729 PMCID: PMC6722866 DOI: 10.3390/genes10080571] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/07/2019] [Revised: 07/11/2019] [Accepted: 07/26/2019] [Indexed: 12/25/2022] Open
Abstract
Achieving cancer prognosis and molecular typing is critical for cancer treatment. Previous studies have identified some gene signatures for the prognosis and typing of cancer based on gene expression data. Some studies have shown that DNA methylation is associated with cancer development, progression, and metastasis. In addition, DNA methylation data are more stable than gene expression data in cancer prognosis. Therefore, in this work, we focused on DNA methylation data. Some prior studies have shown that gene modules are more reliable in cancer prognosis than are gene signatures and that gene modules are not isolated. However, few studies have considered cross-talk among the gene modules, which may allow some important gene modules for cancer to be overlooked. Therefore, we constructed a gene co-methylation network based on the DNA methylation data of cancer patients, and detected the gene modules in the co-methylation network. Then, by permutation testing, cross-talk between every two modules was identified; thus, the module network was generated. Next, the core gene modules in the module network of cancer were identified using the K-shell method, and these core gene modules were used as features to study the prognosis and molecular typing of cancer. Our method was applied to three types of cancer (breast invasive carcinoma, skin cutaneous melanoma, and uterine corpus endometrial carcinoma). Based on the core gene modules identified from the constructed DNA methylation module networks, we can not only distinguish the prognosis of cancer patients but also perform molecular typing of cancer. These results indicate that our method has important application value for the diagnosis of cancer and may reveal potential carcinogenic mechanisms.
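The K-shell step mentioned above can be sketched as a standard k-shell decomposition that repeatedly peels off low-degree nodes. The module names and edges are hypothetical, the adjacency is assumed to be symmetric, and taking the highest shell as the "core" is an assumption rather than the authors' stated selection rule.

```python
def k_shell(adjacency):
    """Return {node: shell index} for an undirected graph with symmetric adjacency."""
    # Mutable copy so that pruning does not alter the caller's graph.
    graph = {node: set(neighbors) for node, neighbors in adjacency.items()}
    shell = {}
    k = 0
    while graph:
        # Nodes whose remaining degree is at most k fall into shell k.
        peel = [node for node, neighbors in graph.items() if len(neighbors) <= k]
        if not peel:
            k += 1
            continue
        for node in peel:
            shell[node] = k
            for neighbor in graph[node]:
                graph[neighbor].discard(node)  # remove the edge from the other side
            del graph[node]
    return shell

# Hypothetical module network: a triangle (M1, M2, M3) with a chain M1 - M4 - M5.
module_network = {
    "M1": ["M2", "M3", "M4"],
    "M2": ["M1", "M3"],
    "M3": ["M1", "M2"],
    "M4": ["M1", "M5"],
    "M5": ["M4"],
}
shells = k_shell(module_network)
top_shell = max(shells.values())
core_modules = sorted(m for m, s in shells.items() if s == top_shell)
```

In this toy network the triangle forms the innermost shell, while the chain modules are peeled away first.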
Affiliation(s)
- Ze-Jia Cui
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xiong-Hui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Hong-Yu Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China

12
Li JN, Zhong R, Zhou XH. Prediction of Bone Metastasis in Breast Cancer Based on Minimal Driver Gene Set in Gene Dependency Network. Genes (Basel) 2019; 10:E466. [PMID: 31213036 PMCID: PMC6627827 DOI: 10.3390/genes10060466] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/27/2019] [Revised: 06/02/2019] [Accepted: 06/14/2019] [Indexed: 12/21/2022]
Abstract
Bone is the most frequent site of breast cancer metastasis, so predicting bone metastasis of breast cancer is essential. In this work, we constructed a gene dependency network based on the hypothesis that the relation between one gene and the risk of bone metastasis might be affected by another gene. Then, based on structural controllability theory, we mined the driver gene set that can control the whole gene dependency network, and the signature genes were selected from this set. Survival analysis showed that the signature could distinguish the bone metastasis risks of cancer patients in the test data set and an independent data set. In addition, we used the signature genes to construct a centroid classifier. The results showed that our method is effective and performed better than published methods.
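A centroid classifier of the kind mentioned here can be sketched as follows. This is an illustration only: it assumes Euclidean distance to per-class mean expression vectors, which the abstract does not specify, and the expression values and class labels are made up.

```python
from math import dist

def fit_centroids(samples, labels):
    """Average the expression vectors of each class into one centroid per class."""
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, vector):
    """Assign the class whose centroid is nearest (Euclidean distance, an assumption)."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))

# Hypothetical expression of two signature genes in four training patients.
train_x = [[2.0, 1.0], [4.0, 3.0], [0.0, 0.0], [1.0, 0.0]]
train_y = ["metastasis", "metastasis", "no_metastasis", "no_metastasis"]

centroids = fit_centroids(train_x, train_y)
print(centroids)
print(predict(centroids, [2.5, 2.0]))
print(predict(centroids, [0.2, 0.1]))
```

A correlation-based distance could be swapped into `predict` without changing the rest of the sketch.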
Affiliation(s)
- Jia-Nuo Li
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Rui Zhong
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xiong-Hui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China

13
Gao YC, Zhou XH, Zhang W. An Ensemble Strategy to Predict Prognosis in Ovarian Cancer Based on Gene Modules. Front Genet 2019; 10:366. [PMID: 31068972 PMCID: PMC6491874 DOI: 10.3389/fgene.2019.00366] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/11/2019] [Accepted: 04/05/2019] [Indexed: 12/15/2022]
Abstract
Due to the high heterogeneity and complexity of cancer, predicting the prognosis of cancer patients remains a challenge. In this work, we used a clustering algorithm to divide patients into different subtypes in order to reduce the heterogeneity of the cancer patients in each subtype. We hypothesized that the gene co-expression network may reveal relationships among genes, that some communities in the network could influence the prognosis of cancer patients, and that all the prognosis-related communities together could fully reveal the prognosis of cancer patients. To predict the prognosis for cancer patients in each subtype, we adopted an ensemble classifier based on the gene co-expression network of the corresponding subtype. Using the gene expression data of ovarian cancer patients in TCGA (The Cancer Genome Atlas), three subtypes were identified. Survival analysis showed that patients in different subtypes had different survival risks. Three ensemble classifiers were constructed for each subtype. Leave-one-out and independent validation showed that our method outperformed control and literature methods. Furthermore, the functional annotation of the communities in each subtype showed that some communities were cancer-related. Finally, we found that current drug targets partially support our method.
Affiliation(s)
- Xiong-Hui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Wen Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China

14
Zhou XH, Chu XY, Xue G, Xiong JH, Zhang HY. Identifying cancer prognostic modules by module network analysis. BMC Bioinformatics 2019; 20:85. [PMID: 30777030 PMCID: PMC6380061 DOI: 10.1186/s12859-019-2674-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/09/2017] [Accepted: 02/08/2019] [Indexed: 02/08/2023]
Abstract
Background: The identification of prognostic genes that can distinguish the prognostic risks of cancer patients remains a significant challenge. Previous works have shown that functional gene sets are more reliable for this task than gene signatures. However, few works have considered the cross-talk among functional gene sets, which may result in neglecting important prognostic gene sets for cancer. Results: Here, we proposed a new method that considers both the interactions among modules and the prognostic correlation of the modules to identify prognostic modules in cancers. First, dense sub-networks in the gene co-expression network of cancer patients were detected. Second, cross-talk between every two modules was identified by a permutation test, thus generating the module network. Third, the prognostic correlation of each module was evaluated by the resampling method. Then, the GeneRank algorithm, which takes the module network and the prognostic correlations of all the modules as input, was applied to prioritize the prognostic modules. Finally, the selected modules were validated by survival analysis in various data sets. Our method was applied to three kinds of cancer, and the results show that it succeeded in identifying prognostic modules in all three. In addition, our method outperformed state-of-the-art methods. Furthermore, the selected modules were significantly enriched with known cancer-related genes and drug targets of cancer, which may indicate that the genes involved in the modules may be drug targets for therapy. Conclusions: We proposed a useful method to identify key modules in cancer prognosis, and our prognostic genes may be good candidates for drug targets.
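The permutation test for cross-talk between two modules might look roughly like the sketch below. The abstract gives no details of the test, so everything here is an assumption: cross-talk is measured as the number of network edges running between the two modules, and the null distribution comes from random, disjoint gene sets of the same sizes. The gene names and function names are hypothetical.

```python
import random

def count_cross_edges(edges, set_a, set_b):
    """Count edges with one endpoint in set_a and the other in set_b."""
    return sum(
        1
        for u, v in edges
        if (u in set_a and v in set_b) or (u in set_b and v in set_a)
    )

def crosstalk_p_value(edges, genes, module_a, module_b, n_perm=1000, seed=0):
    """Return (observed cross-edge count, empirical permutation p-value)."""
    rng = random.Random(seed)
    observed = count_cross_edges(edges, set(module_a), set(module_b))
    size_a, size_b = len(module_a), len(module_b)
    hits = 0
    for _ in range(n_perm):
        # Draw two disjoint random gene sets matching the module sizes.
        drawn = rng.sample(genes, size_a + size_b)
        null_count = count_cross_edges(edges, set(drawn[:size_a]), set(drawn[size_a:]))
        if null_count >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical network: 12 genes, with every gene of module A linked to every gene of module B.
genes = [f"g{i}" for i in range(12)]
module_a, module_b = ["g0", "g1"], ["g2", "g3"]
edges = [(a, b) for a in module_a for b in module_b]

observed, p_value = crosstalk_p_value(edges, genes, module_a, module_b)
print(observed, p_value)
```

Module pairs with a small p-value would then be connected by an edge in the module network.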
Affiliation(s)
- Xiong-Hui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Xin-Yi Chu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Gang Xue
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Jiang-Hui Xiong
  - State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, People's Republic of China
  - Lab of Epigenetics and Health Tracking Technology, Space Institute of Southern China, Shenzhen, People's Republic of China
- Hong-Yu Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China

15
Zhang X, Li B, Han H, Song S, Xu H, Yi Z, Hong Y, Zhuang W, Yi N. Pathway-structured predictive modeling for multi-level drug response in multiple myeloma. Bioinformatics 2018; 34:3609-3615. [PMID: 29850860 PMCID: PMC6198861 DOI: 10.1093/bioinformatics/bty436] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/28/2018] [Revised: 05/08/2018] [Accepted: 05/24/2018] [Indexed: 11/12/2022]
Abstract
Motivation: Molecular analyses suggest that myeloma is composed of distinct sub-types that have different molecular pathologies and various response rates to certain treatments. Drug responses in multiple myeloma (MM) are usually recorded as a multi-level ordinal outcome. One of the goals of drug response studies is to predict, with high probability, which response category a patient belongs to based on their clinical and molecular features. However, as most genes have small effects, gene-based models may provide limited predictive accuracy. Methods for predicting multi-level ordinal drug responses by incorporating biological pathways are therefore desirable but have not yet been developed. Results: We propose a pathway-structured method for predicting multi-level ordinal responses using a two-stage approach. We first develop hierarchical ordinal logistic models and an efficient quasi-Newton algorithm for jointly analyzing numerous correlated variables. Our two-stage approach first obtains the linear predictor (called the pathway score) for each pathway by fitting all predictors within each pathway using the hierarchical ordinal logistic approach, and then combines the pathway scores as new predictors to build a predictive model. We applied the proposed method to two publicly available datasets for predicting multi-level ordinal drug responses in MM using large-scale gene expression data and pathway information. Our results show that our approach not only significantly improved the predictive performance compared with the corresponding gene-based model but also allowed us to identify biologically relevant pathways. Availability and implementation: The proposed approach has been implemented in our R package BhGLM, which is freely available from the public GitHub repository https://github.com/abbyyan3/BhGLM.
Affiliation(s)
- Xinyan Zhang
  - Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA
- Bingzong Li
  - Department of Hematology, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Huiying Han
  - Department of Cell Biology, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Sha Song
  - Department of Cell Biology, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Hongxia Xu
  - Department of Cell Biology, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Zixuan Yi
  - School of Medicine, Eastern Virginia Medical School, Norfolk, VA, USA
- Yating Hong
  - Department of Hematology, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Wenzhuo Zhuang
  - Department of Cell Biology, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Nengjun Yi
  - Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA

16
Zheng F, Wei L, Zhao L, Ni F. Pathway Network Analysis of Complex Diseases Based on Multiple Biological Networks. BioMed Research International 2018; 2018:5670210. [PMID: 30151386 PMCID: PMC6091292 DOI: 10.1155/2018/5670210] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/10/2017] [Revised: 02/06/2018] [Accepted: 03/11/2018] [Indexed: 12/14/2022]
Abstract
Biological pathways play important roles in the development of complex diseases such as cancers, which are multifactorial and usually caused by multiple gene mutations or pathway disorders. Analyzing pathways by combining multiple types of high-throughput data, such as genomics and proteomics, has become one of the most important issues in understanding the mechanisms of complex diseases. In this paper, we propose a method for constructing the pathway network of gene phenotype and identifying disease pathogenesis pathways through analysis of the constructed network. Constructing the network involves, first, calculating the similarity between genes from gene expression data combined with phenotypic mutual information and GO ontology information; second, calculating the correlation between pathways based on the similarity between differential genes and constructing the pathway network; and, finally, mining critical pathways to identify diseases. Experimental results on the Breast Cancer Dataset show that our method performs better. In addition, testing on an alternative dataset showed that the key pathways we found were more accurate and reliable as biological markers of disease. These results show that our proposed method is effective.
Affiliation(s)
- Fang Zheng
  - College of Informatics, Huazhong Agricultural University, Wuhan 430079, China
- Le Wei
  - College of Informatics, Huazhong Agricultural University, Wuhan 430079, China
- Liang Zhao
  - College of Informatics, Huazhong Agricultural University, Wuhan 430079, China
- FuChuan Ni
  - College of Informatics, Huazhong Agricultural University, Wuhan 430079, China

17
Liang R, Wang M, Zheng G, Zhu H, Zhi Y, Sun Z. A comprehensive analysis of prognosis prediction models based on pathway-level, gene-level and clinical information for glioblastoma. Int J Mol Med 2018; 42:1837-1846. [PMID: 30015853 PMCID: PMC6108889 DOI: 10.3892/ijmm.2018.3765] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/19/2018] [Accepted: 06/21/2018] [Indexed: 11/23/2022]
Abstract
The present study aimed to develop a pathway-based prognosis prediction model for glioblastoma (GBM). Univariate and multivariate Cox regression analyses were used to identify prognosis-related genes and clinical factors using mRNA-seq data of GBM samples from The Cancer Genome Atlas (TCGA) database. The expression matrix of prognosis-related genes was transformed into pathway deregulation scores (PDS) based on the Gene Set Enrichment Analysis (GSEA) repository using Pathifier software. With the PDS as input, an L1-penalized Cox proportional hazards (PH) model was used to identify prognostic pathways. A prognosis prediction model based on these prognostic pathways was then constructed to classify patients in the TCGA set and in each of the three validation sets into two risk groups. The survival difference between these risk groups was analyzed using Kaplan-Meier survival analysis and the log-rank test. In addition, a gene-based prognostic model was constructed using the Cox-PH model. A model combining the prognostic pathways with clinical factors was also evaluated. In total, 148 genes were found to be associated with prognosis. The Cox-PH model identified 13 prognostic pathways. A prognostic model based on the 13 pathways was then constructed and was shown to successfully differentiate overall survival in the TCGA set and in three independent sets. However, the gene-based prognosis model was validated in only two of the three independent sets. Furthermore, the model combining pathways and clinical factors exhibited better predictive results than the pathway-based model. In conclusion, the present study suggests a promising prognosis prediction model of 13 pathways for GBM, which may be superior to the gene-level information-based prognostic model.
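The Kaplan-Meier comparison of risk groups described above rests on the product-limit estimator, which can be sketched in a few lines. The follow-up times and event indicators below are invented for illustration; the function name is hypothetical and this is not the authors' code.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, survival probability).

    times  : follow-up time of each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving  # both deaths and censored patients leave the risk set
    return curve

# Hypothetical risk group: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare.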
Affiliation(s)
- Ruqing Liang
  - Department of Neurology, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
- Meng Wang
  - Department of Oncology, Jining First People's Hospital, Jining, Shandong 272011, P.R. China
- Guizhi Zheng
  - College of Integrated Chinese and Western Medicine, Jining Medical University, Jining, Shandong 272067, P.R. China
- Hua Zhu
  - Department of Oncology, Jining First People's Hospital, Jining, Shandong 272011, P.R. China
- Yaqin Zhi
  - Department of Oncology, Jining First People's Hospital, Jining, Shandong 272011, P.R. China
- Zongwen Sun
  - Department of Oncology, Jining First People's Hospital, Jining, Shandong 272011, P.R. China

18
Pietrosemoli N, Mella S, Yennek S, Baghdadi MB, Sakai H, Sambasivan R, Pala F, Di Girolamo D, Tajbakhsh S. Comparison of multiple transcriptomes exposes unified and divergent features of quiescent and activated skeletal muscle stem cells. Skelet Muscle 2017; 7:28. [PMID: 29273087 PMCID: PMC5741941 DOI: 10.1186/s13395-017-0144-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 08/03/2017] [Accepted: 11/29/2017] [Indexed: 01/17/2023]
Abstract
BACKGROUND: Skeletal muscle satellite (stem) cells are quiescent in adult mice and can undergo multiple rounds of proliferation and self-renewal following muscle injury. Several labs have profiled transcripts of myogenic cells during developmental and adult myogenesis with the aim of identifying quiescence markers. Here, we focused on the quiescent cell state and generated new transcriptome profiles that include subfractionations of adult satellite cell populations, and an artificially induced prenatal quiescent state, to identify core signatures for quiescent and proliferating cells. METHODS: Comparison of available data posed challenges related to the inherent diversity of datasets and biological conditions. We developed a standardized workflow to homogenize the normalization, filtering, and quality control steps for the analysis of gene expression profiles, allowing the identification of up- and down-regulated genes and the subsequent gene set enrichment analysis. To share the analytical pipeline of this work, we developed Sherpa, an interactive Shiny server that allows multi-scale comparisons for extraction of desired gene sets from the analyzed datasets. This tool is adaptable to cell populations in other contexts and tissues. RESULTS: A multi-scale analysis comprising eight datasets of quiescent satellite cells had 207 and 542 genes commonly up- and down-regulated, respectively. Shared up-regulated gene sets include an over-representation of the TNFα pathway via NF-κB signaling, Il6-Jak-Stat3 signaling, and the apical surface processes, while shared down-regulated gene sets exhibited an over-representation of Myc and E2F targets and genes associated with the G2M checkpoint and oxidative phosphorylation. However, virtually all datasets contained genes that are associated with activation or cell cycle entry, such as the immediate early stress response genes Fos and Jun. An empirical examination of fixed and isolated satellite cells showed that these and other genes were absent in vivo, but activated during procedural isolation of cells. CONCLUSIONS: Through the systematic comparison and individual analysis of diverse transcriptomic profiles, we identified genes that were consistently differentially expressed among the different datasets and shared underlying biological processes key to the quiescent cell state. Our findings provide impetus to define and distinguish transcripts associated with true in vivo quiescence from those of first-responding genes induced by disruption of the stem cell niche.
Affiliation(s)
- Natalia Pietrosemoli
  - Bioinformatics and Biostatistics Hub, C3BI, USR 3756 IP CNRS, Institut Pasteur, 75015 Paris, France
- Sébastien Mella
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France
- Siham Yennek
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France
  - Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, University of Copenhagen, 3B Blegdamsvej, DK-2200 Copenhagen N, Denmark
- Meryem B. Baghdadi
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France
- Hiroshi Sakai
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France
- Ramkumar Sambasivan
  - Institute for Stem Cell Biology and Regenerative Medicine, GKVK PO, Bellary Road, Bengaluru, 560065 India
- Francesca Pala
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France
- Daniela Di Girolamo
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - Dipartimento di Medicina Clinica e Chirurgia, Università degli Studi di Napoli Federico II, Via S. Pansini 5, 80131 Naples, Italy
- Shahragim Tajbakhsh
  - Stem Cells and Development, Department of Developmental and Stem Cell Biology, Institut Pasteur, 75015 Paris, France
  - CNRS UMR 3738, Institut Pasteur, 75015 Paris, France

19
Gulluni F, Martini M, De Santis MC, Campa CC, Ghigo A, Margaria JP, Ciraolo E, Franco I, Ala U, Annaratone L, Disalvatore D, Bertalot G, Viale G, Noatynska A, Compagno M, Sigismund S, Montemurro F, Thelen M, Fan F, Meraldi P, Marchiò C, Pece S, Sapino A, Chiarle R, Di Fiore PP, Hirsch E. Mitotic Spindle Assembly and Genomic Stability in Breast Cancer Require PI3K-C2α Scaffolding Function. Cancer Cell 2017; 32:444-459.e7. [PMID: 29017056 DOI: 10.1016/j.ccell.2017.09.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Received: 04/06/2017] [Revised: 07/25/2017] [Accepted: 09/05/2017] [Indexed: 12/11/2022]
Abstract
Proper organization of the mitotic spindle is key to genetic stability, but molecular components of inter-microtubule bridges that crosslink kinetochore fibers (K-fibers) are still largely unknown. Here we identify a kinase-independent function of class II phosphoinositide 3-OH kinase α (PI3K-C2α) acting as limiting scaffold protein organizing clathrin and TACC3 complex crosslinking K-fibers. Downregulation of PI3K-C2α causes spindle alterations, delayed anaphase onset, and aneuploidy, indicating that PI3K-C2α expression is required for genomic stability. Reduced abundance of PI3K-C2α in breast cancer models initially impairs tumor growth but later leads to the convergent evolution of fast-growing clones with mitotic checkpoint defects. As a consequence of altered spindle, loss of PI3K-C2α increases sensitivity to taxane-based therapy in pre-clinical models and in neoadjuvant settings.
Affiliation(s)
- Federico Gulluni
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Miriam Martini
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Maria Chiara De Santis
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Carlo Cosimo Campa
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Alessandra Ghigo
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Jean Piero Margaria
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Elisa Ciraolo
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Irene Franco
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Ugo Ala
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy
- Laura Annaratone
  - Department of Medical Sciences, University of Torino, Turin, Italy; Pathology Unit, Department of Laboratory Medicine, Azienda Ospedaliera Universitaria Città della Salute e della Scienza di Torino, Turin, Italy
- Davide Disalvatore
  - IFOM, The FIRC Institute for Molecular Oncology Foundation, Milan, Italy
- Giovanni Bertalot
  - Program of Molecular Medicine, IEO, European Institute of Oncology, Milan, Italy
- Giuseppe Viale
  - Division of Pathology, European Institute of Oncology, Milan, Italy; Department of Oncology and Hemato-oncology (DIPO), University of Milan, Milan, Italy
- Anna Noatynska
  - Department of Cell Physiology and Metabolism, University of Geneva, Geneva, Switzerland
- Mara Compagno
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy; Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Sara Sigismund
  - IFOM, The FIRC Institute for Molecular Oncology Foundation, Milan, Italy
- Filippo Montemurro
  - Unit of Investigative Oncology, Candiolo Cancer Institute - FPO, IRCCS, Candiolo (TO), Italy
- Marcus Thelen
  - Institute for Research in Biomedicine, Università della Svizzera Italiana, Bellinzona, Switzerland
- Fan Fan
  - Department of Biological Science and Bioengineering, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China
- Patrick Meraldi
  - Department of Cell Physiology and Metabolism, University of Geneva, Geneva, Switzerland
- Caterina Marchiò
  - Department of Medical Sciences, University of Torino, Turin, Italy; Pathology Unit, Department of Laboratory Medicine, Azienda Ospedaliera Universitaria Città della Salute e della Scienza di Torino, Turin, Italy
- Salvatore Pece
  - Program of Molecular Medicine, IEO, European Institute of Oncology, Milan, Italy; Department of Oncology and Hemato-oncology (DIPO), University of Milan, Milan, Italy
- Anna Sapino
  - Department of Medical Sciences, University of Torino, Turin, Italy; Unit of Pathology, Candiolo Cancer Institute - FPO, IRCCS, Candiolo (TO), Italy
- Roberto Chiarle
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy; Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Pier Paolo Di Fiore
  - IFOM, The FIRC Institute for Molecular Oncology Foundation, Milan, Italy; Program of Molecular Medicine, IEO, European Institute of Oncology, Milan, Italy; Department of Oncology and Hemato-oncology (DIPO), University of Milan, Milan, Italy
- Emilio Hirsch
  - Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Torino, Turin 10126, Italy

20
Zhang Q, Li J, Wang D, Wang Y. Finding disagreement pathway signatures and constructing an ensemble model for cancer classification. Sci Rep 2017; 7:10044. [PMID: 28855608 PMCID: PMC5577098 DOI: 10.1038/s41598-017-10258-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/17/2017] [Accepted: 08/07/2017] [Indexed: 12/02/2022]
Abstract
With advances in high-throughput molecular profiling techniques, molecular-level cancer classification has become a relatively routine research procedure. However, in gene expression studies the number of genes typically far exceeds the number of samples. Existing gene selection methods are mostly based on statistics and machine learning and overlook relevant biological principles or knowledge when working with biological data. Here, we propose a robust ensemble learning paradigm, which incorporates information from multiple pathways, for cancer classification. We compare the proposed method with other methods, such as Elastic SCAD and PPDMF, and estimate the classification performance. The results show that the proposed method achieves higher and more robust performance on most metrics. We further investigate the biological mechanism of the ensemble feature genes. The results demonstrate that the ensemble feature genes are associated with drug targets/clinically-relevant cancer. In addition, some core biological pathways and biological processes underlying clinically relevant phenotypes are identified by functional annotation. Overall, our research can provide a new perspective for the further study of molecular activities and manifestations of cancer.
Affiliation(s)
- Qiaosheng Zhang
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin, 150001, P.R. China
  - Heilongjiang Bayi Agricultural University, College of Science, Daqing, 163319, P.R. China
- Jie Li
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin, 150001, P.R. China
- Dong Wang
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin, 150001, P.R. China
- Yadong Wang
  - Harbin Institute of Technology, School of Computer Science and Technology, Harbin, 150001, P.R. China

21
Zhang X, Li Y, Akinyemiju T, Ojesina AI, Buckhaults P, Liu N, Xu B, Yi N. Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach. Genetics 2017; 205:89-100. [PMID: 28049703 PMCID: PMC5223526 DOI: 10.1534/genetics.116.189191] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 03/14/2016] [Accepted: 10/31/2016] [Indexed: 12/11/2022]
Abstract
Heterogeneity in terms of tumor characteristics, prognosis, and survival among cancer patients has been a persistent problem for many decades. Currently, prognosis and outcome predictions are made based on clinical factors and/or by incorporating molecular profiling data. However, inaccurate prognosis and prediction may result from using only clinical or molecular information directly. One of the main shortcomings of past studies is the failure to incorporate prior biological information into the predictive model, given strong evidence of the pathway-based genetic nature of cancer, i.e., the potential for oncogenes to be grouped into pathways based on biological functions such as cell survival, proliferation, and metastatic dissemination. To address this problem, we propose a two-stage approach to incorporate pathway information into prognostic modeling using large-scale gene expression data. In the first stage, we fit all predictors within each pathway using the penalized Cox model and the Bayesian hierarchical Cox model. In the second stage, we combine the cross-validated prognostic scores of all pathways obtained in the first stage as new predictors to build an integrated prognostic model for prediction. We apply the proposed method to analyze two independent breast and ovarian cancer datasets from The Cancer Genome Atlas (TCGA), predicting overall survival using large-scale gene expression profiling data. The results from both datasets show that the proposed approach not only improves survival prediction compared with alternative analyses that ignore the pathway information, but also identifies significant biological pathways.
Affiliation(s)
- Xinyan Zhang
- Department of Biostatistics, University of Alabama at Birmingham, Alabama 35294
| | - Yan Li
- Department of Biostatistics, University of Alabama at Birmingham, Alabama 35294
| | - Tomi Akinyemiju
- Department of Epidemiology, University of Alabama at Birmingham, Alabama 35294
| | - Akinyemi I Ojesina
- Department of Epidemiology, University of Alabama at Birmingham, Alabama 35294
| | - Phillip Buckhaults
- Department of Drug Discovery and Biomedical Sciences, The South Carolina College of Pharmacy, The University of South Carolina, Columbia, South Carolina 29208
| | - Nianjun Liu
- Department of Epidemiology and Biostatistics, School of Public Health, Indiana University, Bloomington, Indiana 47405
| | - Bo Xu
- Department of Oncology, Southern Research Institute, Birmingham, Alabama 35205
| | - Nengjun Yi
- Department of Biostatistics, University of Alabama at Birmingham, Alabama 35294
|
22
|
Cava C, Colaprico A, Bertoli G, Bontempi G, Mauri G, Castiglioni I. How interacting pathways are regulated by miRNAs in breast cancer subtypes. BMC Bioinformatics 2016; 17:348. [PMID: 28185585 PMCID: PMC5123339 DOI: 10.1186/s12859-016-1196-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 02/08/2023]
Abstract
BACKGROUND An important challenge in cancer biology is to understand the complex aspects of the disease. It is increasingly evident that genes are not isolated from each other, and understanding how different genes are related to each other could explain the biological mechanisms that cause disease. Biological pathways are important tools for revealing gene interactions and for reducing the large number of genes to be studied by partitioning them into smaller paths. Furthermore, recent scientific evidence has shown that a combination of pathways, rather than a single element of a pathway or a single pathway, could be responsible for pathological changes in a cell. RESULTS In this paper we develop a new method that can reveal miRNAs able to regulate, in a coordinated way, networks of gene pathways. We applied the method to subtypes of breast cancer. The basic idea is the identification of pathways significantly enriched with differentially expressed genes among the different breast cancer subtypes and normal tissue. Looking at the pairs of pathways that were found to be functionally related, we created a network of dependent pathways and we focused on identifying miRNAs that could act as miRNA drivers in a coordinated regulation process. CONCLUSIONS Our approach enables the identification of miRNAs that could have an important role in the development of breast cancer.
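The abstract does not name the test used to call a pathway "significantly enriched" with differentially expressed genes. One common choice, shown here purely as an assumption, is a one-sided hypergeometric (over-representation) test. The function name `enrichment_p_value` and the gene counts are invented for illustration.

```python
from math import comb


def enrichment_p_value(n_universe, n_de, n_pathway, n_overlap):
    """One-sided hypergeometric test: probability of seeing at least
    n_overlap differentially expressed (DE) genes in a pathway of size
    n_pathway when n_de of the n_universe genes are DE."""
    max_overlap = min(n_de, n_pathway)
    total = comb(n_universe, n_pathway)
    tail = sum(
        comb(n_de, k) * comb(n_universe - n_de, n_pathway - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (invented): 10 genes measured, 5 of them DE, pathway of 5 genes.
p_all_de = enrichment_p_value(10, 5, 5, 5)   # every pathway gene is DE
p_none_de = enrichment_p_value(10, 5, 5, 0)  # no constraint on the overlap
```

With many pathways tested at once, the resulting p-values would normally be corrected for multiple testing before any pathway is declared enriched.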
Affiliation(s)
- Claudia Cava
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
| | - Antonio Colaprico
- Interuniversity Institute of Bioinformatics in Brussels (IB), Brussels, Belgium
- Machine Learning Group, ULB, Brussels, Belgium
| | - Gloria Bertoli
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
| | - Gianluca Bontempi
- Interuniversity Institute of Bioinformatics in Brussels (IB), Brussels, Belgium
- Machine Learning Group, ULB, Brussels, Belgium
| | - Giancarlo Mauri
- Department of Informatics, Systems and Communications, University of Milan–Bicocca, Milan, Italy
| | - Isabella Castiglioni
- Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Milan, Italy
|
23
|
Differentially Expressed Genes and Signature Pathways of Human Prostate Cancer. PLoS One 2015; 10:e0145322. [PMID: 26683658 PMCID: PMC4687717 DOI: 10.1371/journal.pone.0145322] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 10/14/2015] [Accepted: 12/02/2015] [Indexed: 11/30/2022]
Abstract
Genomic technologies including microarrays and next-generation sequencing have enabled the generation of molecular signatures of prostate cancer. Lists of differentially expressed genes between malignant and non-malignant states are thought to be fertile sources of putative prostate cancer biomarkers. However such lists of differentially expressed genes can be highly variable for multiple reasons. As such, looking at differential expression in the context of gene sets and pathways has been more robust. Using next-generation genome sequencing data from The Cancer Genome Atlas, differential gene expression between age- and stage- matched human prostate tumors and non-malignant samples was assessed and used to craft a pathway signature of prostate cancer. Up- and down-regulated genes were assigned to pathways composed of curated groups of related genes from multiple databases. The significance of these pathways was then evaluated according to the number of differentially expressed genes found in the pathway and their position within the pathway using Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis. The “transforming growth factor-beta signaling” and “Ran regulation of mitotic spindle formation” pathways were strongly associated with prostate cancer. Several other significant pathways confirm reported findings from microarray data that suggest actin cytoskeleton regulation, cell cycle, mitogen-activated protein kinase signaling, and calcium signaling are also altered in prostate cancer. Thus we have demonstrated feasibility of pathway analysis and identified an underexplored area (Ran) for investigation in prostate cancer pathogenesis.
|
24
|
Sehhati M, Mehridehnavi A, Rabbani H, Pourhossein M. Stable Gene Signature Selection for Prediction of Breast Cancer Recurrence Using Joint Mutual Information. IEEE/ACM Trans Comput Biol Bioinform 2015; 12:1440-1448. [PMID: 26671813 DOI: 10.1109/tcbb.2015.2407407] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
In this study, a gene selection technique was proposed to select a robust gene signature from microarray data for prediction of breast cancer recurrence. To this end, a hybrid scoring criterion was designed as a linear combination of scores determined in the mutual information (MI) domain and in the protein-protein interaction (PPI) network. The MI-based score represents the complementary information between the selected genes for outcome prediction, whereas the PPI-based score is built from the number of connections between the selected genes in the PPI network. All genes were scored with the proposed function in a hybrid forward-backward gene-set selection process to select the optimum biomarker set from the gene expression microarray data. The accuracy and stability of the finally selected biomarkers were evaluated by using five-fold cross-validation (CV) to classify available data on breast cancer patients into two cohorts of poor and good prognosis. The results showed an appealing improvement in cross-dataset accuracy in comparison with similar studies whenever a primary signature, selected from one dataset, was applied to predict survival in other independent datasets. Moreover, the proposed method demonstrated 58-92 percent overlap between 50-gene signatures that were selected individually from seven independent datasets.
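The hybrid criterion can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not taken from the paper: each gene has a fixed, precomputed MI score (the joint mutual information in the paper depends on the genes already selected), the PPI score is the fraction of already-selected genes a candidate is connected to, only the forward step of the forward-backward search is shown, and the weight `alpha` and all gene names are hypothetical.

```python
def hybrid_score(gene, selected, mi_score, ppi_edges, alpha=0.5):
    """Linear combination of an MI-based score and a PPI-based score."""
    links = sum(1 for s in selected if frozenset((gene, s)) in ppi_edges)
    ppi_score = links / len(selected) if selected else 0.0
    return alpha * mi_score[gene] + (1 - alpha) * ppi_score


def forward_select(candidates, mi_score, ppi_edges, k, alpha=0.5):
    """Greedy forward step: repeatedly add the best-scoring remaining gene."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda g: hybrid_score(g, selected, mi_score, ppi_edges, alpha))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy inputs (invented): gene B is weaker on MI but interacts with gene A.
mi_score = {"A": 0.9, "B": 0.5, "C": 0.6}
ppi_edges = {frozenset(("A", "B"))}

with_ppi = forward_select(["A", "B", "C"], mi_score, ppi_edges, k=2, alpha=0.5)
mi_only = forward_select(["A", "B", "C"], mi_score, ppi_edges, k=2, alpha=1.0)
```

In the toy data the network term changes the second pick: with `alpha=0.5` the interacting gene B is preferred over C, while with `alpha=1.0` (MI only) C is chosen.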
|
25
|
Holec M, Kuželka O, Železný F. Novel gene sets improve set-level classification of prokaryotic gene expression data. BMC Bioinformatics 2015; 16:348. [PMID: 26511329 PMCID: PMC4625461 DOI: 10.1186/s12859-015-0786-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/28/2015] [Accepted: 10/20/2015] [Indexed: 01/23/2023]
Abstract
Background Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets, typically defined on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. Methods We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. Results The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Conclusion Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data.
The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
Affiliation(s)
- Matěj Holec
- Faculty of Electrical Engineering, Czech Technical University, Technická 2, Prague, 166 27, Czech Republic.
| | - Ondřej Kuželka
- Faculty of Electrical Engineering, Czech Technical University, Technická 2, Prague, 166 27, Czech Republic; School of Computer Science and Informatics, Cardiff University, Queen's Buildings, 5 The Parade, Roath, Cardiff, CF24 3AA, UK.
| | - Filip Železný
- Faculty of Electrical Engineering, Czech Technical University, Technická 2, Prague, 166 27, Czech Republic.
|
26
|
|
27
|
Bonsang-Kitzis H, Sadacca B, Hamy-Petit AS, Moarii M, Pinheiro A, Laurent C, Reyal F. Biological network-driven gene selection identifies a stromal immune module as a key determinant of triple-negative breast carcinoma prognosis. Oncoimmunology 2015; 5:e1061176. [PMID: 26942074 DOI: 10.1080/2162402x.2015.1061176] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 03/03/2015] [Revised: 06/02/2015] [Accepted: 06/08/2015] [Indexed: 12/31/2022]
Abstract
Triple-negative breast cancer (TNBC) is a heterogeneous group of aggressive breast cancers for which no targeted treatment is available. Robust tools for TNBC classification are required, to improve the prediction of prognosis and to develop novel therapeutic interventions. We analyzed 3,247 primary human breast cancer samples from 21 publicly available datasets, using a five-step method: (1) selection of TNBC samples by bimodal filtering on ER-HER2 and PR, (2) normalization of the selected TNBC samples, (3) selection of the most variant genes, (4) identification of gene clusters and biological gene selection within gene clusters on the basis of String© database connections and gene-expression correlations, (5) summarization of each gene cluster in a metagene. We then assessed the ability of these metagenes to predict prognosis on an external public dataset (METABRIC). Our analysis of gene expression (GE) in 557 TNBCs from 21 public datasets identified a six-metagene signature (167 genes) in which the metagenes were enriched in different gene ontologies. The gene clusters were named as follows: Immunity1, Immunity2, Proliferation/DNA damage, AR-like, Matrix/Invasion1 and Matrix2, respectively. This signature was particularly robust for the identification of TNBC subtypes across many datasets (n = 1,125 samples), despite technology differences (Affymetrix© A, Plus2 and Illumina©). Weak Immunity2 metagene expression was associated with a poor prognosis (disease-specific survival; HR = 2.68 [1.59-4.52], p = 0.0002). The six-metagene signature (167 genes) was validated over 1,125 TNBC samples. The Immunity2 metagene had strong prognostic value. These findings open up interesting possibilities for the development of new therapeutic interventions.
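Step (5), summarizing each gene cluster in a metagene, can be sketched as follows. The abstract does not say which summary statistic was used; the per-sample mean over the genes of a cluster is an assumption made here for illustration, and the gene names and expression values are invented.

```python
from statistics import mean


def metagene_scores(expression, clusters):
    """Collapse each gene cluster into one value per sample.
    expression maps gene -> per-sample values; clusters maps name -> genes.
    The mean is an assumed summary, not necessarily the authors' choice."""
    n_samples = len(next(iter(expression.values())))
    return {
        name: [mean(expression[g][i] for g in genes) for i in range(n_samples)]
        for name, genes in clusters.items()
    }


# Toy expression matrix (invented): three genes, two samples.
expression = {
    "g1": [1.0, 3.0],
    "g2": [3.0, 5.0],
    "g3": [10.0, 20.0],
}
clusters = {"Immunity2": ["g1", "g2"], "Matrix2": ["g3"]}

metagenes = metagene_scores(expression, clusters)
```

Each sample is then described by one value per cluster (six in the signature above) instead of one value per gene, which is what makes the signature comparable across platforms.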
Affiliation(s)
- H Bonsang-Kitzis
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France; Department of Surgery; Institut Curie; Paris, France
| | - B Sadacca
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France; Laboratoire de Mathématiques et Modélisation d'Evry, Université d'Évry Val d'Essonne; UMR CNRS 8071, ENSIIE, USC INRA, France
| | - A S Hamy-Petit
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France
| | - M Moarii
- Mines Paristech; PSL-Research University; CBIO-Centre for Computational Biology; Mines ParisTech; Fontainebleau, France; U900, INSERM; Institut Curie; Paris, France
| | - A Pinheiro
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France
| | - C Laurent
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France
| | - F Reyal
- Residual Tumor & Response to Treatment Laboratory; RT2Lab; Translational Research Department; Institut Curie; Paris, France; U932 Immunity and Cancer; INSERM; Institut Curie; Paris, France; Department of Surgery; Institut Curie; Paris, France
|
28
|
Seok J, Davis RW, Xiao W. A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data. PLoS One 2015; 10:e0122103. [PMID: 25933378 PMCID: PMC4416884 DOI: 10.1371/journal.pone.0122103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/04/2015] [Accepted: 02/21/2015] [Indexed: 12/04/2022]
Abstract
Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn’t been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.
Affiliation(s)
- Junhee Seok
- School of Electrical Engineering, Korea University, Seoul 136-713, Republic of Korea
| | - Ronald W. Davis
- Stanford Genome Technology Center, Palo Alto, California, United States of America
| | - Wenzhong Xiao
- Stanford Genome Technology Center, Palo Alto, California, United States of America
- Massachusetts General Hospital and Shriners Hospital for Children, Boston, Massachusetts, United States of America
|
29
|
Geman D, Ochs M, Price ND, Tomasetti C, Younes L. An argument for mechanism-based statistical inference in cancer. Hum Genet 2015; 134:479-95. [PMID: 25381197 PMCID: PMC4612627 DOI: 10.1007/s00439-014-1501-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 04/21/2014] [Accepted: 10/14/2014] [Indexed: 01/07/2023]
Abstract
Cancer is perhaps the prototypical systems disease, and as such has been the focus of extensive study in quantitative systems biology. However, translating these programs into personalized clinical care remains elusive and incomplete. In this perspective, we argue that realizing this agenda—in particular, predicting disease phenotypes, progression and treatment response for individuals—requires going well beyond standard computational and bioinformatics tools and algorithms. It entails designing global mathematical models over network-scale configurations of genomic states and molecular concentrations, and learning the model parameters from limited available samples of high-dimensional and integrative omics data. As such, any plausible design should accommodate: biological mechanism, necessary for both feasible learning and interpretable decision making; stochasticity, to deal with uncertainty and observed variation at many scales; and a capacity for statistical inference at the patient level. This program, which requires a close, sustained collaboration between mathematicians and biologists, is illustrated in several contexts, including learning biomarkers, metabolism, cell signaling, network inference and tumorigenesis.
Affiliation(s)
- Donald Geman
- Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, 21210, USA
|
30
|
Qin G, Zhao XM. A survey on computational approaches to identifying disease biomarkers based on molecular networks. J Theor Biol 2014; 362:9-16. [DOI: 10.1016/j.jtbi.2014.06.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 01/28/2014] [Revised: 06/03/2014] [Accepted: 06/04/2014] [Indexed: 11/29/2022]
|
31
|
Zeng T, Zhang W, Yu X, Liu X, Li M, Liu R, Chen L. Edge biomarkers for classification and prediction of phenotypes. Sci China Life Sci 2014; 57:1103-14. [PMID: 25326072 DOI: 10.1007/s11427-014-4757-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 06/09/2014] [Accepted: 08/07/2014] [Indexed: 12/19/2022]
Abstract
In general, a disease manifests not from malfunction of individual molecules but from failure of the relevant system or network, which can be considered as a set of interactions or edges among molecules. Thus, instead of individual molecules, networks or edges are stable forms to reliably characterize complex diseases. This paper reviews both traditional node biomarkers and edge biomarkers, which have been newly proposed. These biomarkers are classified in terms of their contained information. In particular, we show that edge and network biomarkers provide novel ways of stably and reliably diagnosing the disease state of a sample. First, we categorize the biomarkers based on the information used in the learning and prediction steps. We then briefly introduce conventional node biomarkers, or molecular biomarkers without network information, and their computational approaches. The main focus of this paper is edge and network biomarkers, which exploit network information to improve the accuracy of diagnosis and prognosis. Moreover, by extracting both network and dynamic information from the data, we can develop dynamical network and edge biomarkers. These biomarkers not only diagnose the immediate pre-disease state but also detect the critical molecules or networks by which the biological system progresses from the healthy to the disease state. The identified critical molecules can be used as drug targets, and the critical state indicates the critical point of disease control. The paper also discusses representative biomarker-based methods.
Affiliation(s)
- Tao Zeng
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
|
32
|
Huang S, Yee C, Ching T, Yu H, Garmire LX. A novel model to combine clinical and pathway-based transcriptomic information for the prognosis prediction of breast cancer. PLoS Comput Biol 2014; 10:e1003851. [PMID: 25233347 PMCID: PMC4168973 DOI: 10.1371/journal.pcbi.1003851] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 03/17/2014] [Accepted: 08/08/2014] [Indexed: 01/19/2023]
Abstract
Breast cancer is the most common malignancy in women worldwide. With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed for more personalized treatment and disease management. Towards this goal, we have developed a novel computational model for breast cancer prognosis by combining the Pathway Deregulation Score (PDS) based pathifier algorithm, Cox regression and the L1-LASSO penalization method. We trained the model on a set of 236 patients with gene expression data and clinical information, and validated the performance on three diversified testing data sets of 606 patients. To evaluate the performance of the model, we conducted survival analysis of the dichotomized groups, and compared the areas under the curve based on the binary classification. The resulting prognosis genomic model is composed of fifteen pathways (e.g. P53 pathway) that had previously reported cancer relevance, and it successfully differentiated relapse in the training set (log rank p-value = 6.25e-12) and three testing data sets (log rank p-value<0.0005). Moreover, the pathway-based genomic models consistently performed better than gene-based models on all four data sets. We also find strong evidence that combining genomic information with clinical information improved the p-values of prognosis prediction by at least three orders of magnitude in comparison to using either genomic or clinical information alone. In summary, we propose a novel prognosis model that harnesses the pathway-based dysregulation as well as valuable clinical information. The selected pathways in our prognosis model are promising targets for therapeutic intervention.
With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed early on for more personalized treatment and management. Towards this goal, we propose in this study a novel pathway-based prognosis prediction model, which emphasizes individualized pathway-based risk measurement using the pathway dysregulation score (PDS). In combination with L1-LASSO penalized feature selection and the Cox proportional hazards regression model, we have identified fifteen cancer-relevant pathways using the pathway-based genomic model that successfully differentiated relapse in the training set as well as three diversified test sets. Moreover, given the debate over whether higher-order representative features, such as GO sets, pathways and network modules, are superior to gene-level features in genomic models, we demonstrate that pathway-based genomic models consistently performed better than gene-based models in all four data sets. Last but not least, we show strong evidence that models that combine genomic information with clinical information improve the prognosis prediction significantly, in comparison to models that use either genomic or clinical information alone.
Affiliation(s)
- Sijia Huang
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii, United States of America
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
| | - Cameron Yee
- Neurobiology Program of Biology Department, University of Washington, Seattle, Washington, United States of America
| | - Travers Ching
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii, United States of America
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
| | - Herbert Yu
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
| | - Lana X. Garmire
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii, United States of America
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
|
33
|
Vadnais C, Shooshtarizadeh P, Rajadurai CV, Lesurf R, Hulea L, Davoudi S, Cadieux C, Hallett M, Park M, Nepveu A. Autocrine Activation of the Wnt/β-Catenin Pathway by CUX1 and GLIS1 in Breast Cancers. Biol Open 2014; 3:937-46. [PMID: 25217618 PMCID: PMC4197442 DOI: 10.1242/bio.20148193] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/05/2023]
Abstract
Autocrine activation of the Wnt/β-catenin pathway occurs in several cancers, notably in breast tumors, and is associated with higher expression of various Wnt ligands. Using various inhibitors of the FZD/LRP receptor complex, we demonstrate that some adenosquamous carcinomas that develop in MMTV-CUX1 transgenic mice represent a model for autocrine activation of the Wnt/β-catenin pathway. By comparing expression profiles of laser-capture microdissected mammary tumors, we identify Glis1 as a transcription factor that is highly expressed in the subset of tumors with elevated Wnt gene expression. Analysis of human cancer datasets confirms that elevated WNT gene expression is associated with high levels of CUX1 and GLIS1 and correlates with genes of the epithelial-to-mesenchymal transition (EMT) signature: VIM, SNAI1 and TWIST1 are elevated whereas CDH1 and OCLN are decreased. Co-expression experiments demonstrate that CUX1 and GLIS1 cooperate to stimulate TCF/β-catenin transcriptional activity and to enhance cell migration and invasion. Altogether, these results provide additional evidence for the role of GLIS1 in reprogramming gene expression and suggest a hierarchical model for transcriptional regulation of the Wnt/β-catenin pathway and the epithelial-to-mesenchymal transition.
Affiliation(s)
- Charles Vadnais
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
| | | | - Charles V Rajadurai
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
| | - Robert Lesurf
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada; McGill Centre for Bioinformatics, McGill University, Montreal, QC H3G 0B1, Canada
| | - Laura Hulea
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
| | - Sayeh Davoudi
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada
| | - Chantal Cadieux
- Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada
| | - Michael Hallett
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; McGill Centre for Bioinformatics, McGill University, Montreal, QC H3G 0B1, Canada
| | - Morag Park
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada; Department of Medicine, McGill University, Montreal, QC H3A 1A1, Canada; Department of Oncology, McGill University, Montreal, QC H2W 1S6, Canada
| | - Alain Nepveu
- Goodman Cancer Research Centre, McGill University, Montreal, QC H3A 1A3, Canada; Department of Biochemistry, McGill University, Montreal, QC H3G 1Y6, Canada; Department of Medicine, McGill University, Montreal, QC H3A 1A1, Canada; Department of Oncology, McGill University, Montreal, QC H2W 1S6, Canada
|
34
|
Zhou X, Liu J. A computational model to predict bone metastasis in breast cancer by integrating the dysregulated pathways. BMC Cancer 2014; 14:618. [PMID: 25163697 PMCID: PMC4161863 DOI: 10.1186/1471-2407-14-618] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 02/15/2014] [Accepted: 08/20/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND Although much research focuses on cancer prognosis or on predicting cancer metastasis, predicting the risk of cancer metastasizing to a specific organ such as bone remains a major challenge, and little work has been published for this purpose. METHODS In this work, we propose a Dysregulated Pathway Based prediction Model (DPBM) built on a merged data set with 855 samples. First, we use a bootstrapping strategy to select bone metastasis related genes. Based on the selected genes, we then detect the dysregulated pathways involved in the process of bone metastasis via enrichment analysis. We then use the discriminative genes in each dysregulated pathway, called dysregulated genes, to construct a sub-model that forecasts the risk of bone metastasis. Finally, we combine all sub-models into an ensemble model (DPBM) to predict the risk of bone metastasis. RESULTS We validated DPBM on the training, test and independent sets separately, and the results show that DPBM can significantly distinguish the bone metastasis risks of patients (with p-values of 3.82E-10, 0.00007 and 0.0003 on the three sets, respectively). Moreover, the dysregulated genes generally have higher topological coefficients (degree and betweenness centrality) in the PPI network, which means that they may play critical roles in the biological functions. Further functional analysis of these genes demonstrates that the immune system seems to play an important role in bone-specific metastasis of breast cancer. CONCLUSIONS Each of the dysregulated pathways enriched with bone metastasis related genes may uncover one critical aspect influencing the bone metastasis of breast cancer; thus the ensemble strategy can help to give a comprehensive view of the bone metastasis mechanism. Therefore, the constructed DPBM is robust and able to significantly distinguish the bone metastasis risks of patients in both the test set and the independent set. Moreover, the dysregulated genes in the dysregulated pathways tend to play critical roles in the biological process of bone metastasis of breast cancer.
Affiliation(s)
- Juan Liu
- School of Computer, Wuhan University, Wuhan, P.R. China.
35
Kim D, Joung JG, Sohn KA, Shin H, Park YR, Ritchie MD, Kim JH. Knowledge boosting: a graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction. J Am Med Inform Assoc 2014; 22:109-20. [PMID: 25002459 PMCID: PMC4433357 DOI: 10.1136/amiajnl-2013-002481] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Indexed: 02/05/2023]
Abstract
Objective Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Methods Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Results Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Conclusions Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies.
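The AUC values quoted in this abstract measure how often a model ranks a positive case above a negative one. The following is a minimal sketch of that calculation in plain Python, assuming binary labels and one score per sample; the function name `auc` and the example scores are illustrative and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    positive sample scores higher than a randomly chosen negative sample.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for three positive and three negative samples.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
print(auc(example_scores, example_labels))  # 8 of 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a rank-based formula gives the same value for larger cohorts.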
Affiliation(s)
- Dokyoon Kim
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea; Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Je-Gun Joung
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea; Translational Bioinformatics Lab (TBL), Samsung Genome Institute (SGI), Samsung Medical Center, Seoul, Korea
- Kyung-Ah Sohn
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea; Department of Information and Computer Engineering, Ajou University, Suwon, Korea
- Hyunjung Shin
- Department of Industrial and Information Systems Engineering, Ajou University, Suwon, Korea
- Yu Rang Park
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea; Department of Biomedical Informatics, Asan Medical Center, Seoul, Korea
- Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Pennsylvania State University, University Park, Pennsylvania, USA
- Ju Han Kim
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea; Systems Biomedical Informatics Research Center, Seoul National University, Seoul, Korea
36
Gardeux V, Achour I, Li J, Maienschein-Cline M, Li H, Pesce L, Parinandi G, Bahroos N, Winn R, Foster I, Garcia JGN, Lussier YA. 'N-of-1-pathways' unveils personal deregulated mechanisms from a single pair of RNA-Seq samples: towards precision medicine. J Am Med Inform Assoc 2014; 21:1015-25. [PMID: 25301808 PMCID: PMC4215042 DOI: 10.1136/amiajnl-2013-002519] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 01/25/2023]
Abstract
Background The emergence of precision medicine allowed the incorporation of individual molecular data into patient care. Indeed, DNA sequencing predicts somatic mutations in individual patients. However, these genetic features overlook dynamic epigenetic and phenotypic response to therapy. Meanwhile, accurate personal transcriptome interpretation remains an unmet challenge. Further, N-of-1 (single-subject) efficacy trials are increasingly pursued, but are underpowered for molecular marker discovery. Method ‘N-of-1-pathways’ is a global framework relying on three principles: (i) the statistical universe is a single patient; (ii) significance is derived from geneset/biomodules powered by paired samples from the same patient; and (iii) similarity between genesets/biomodules assesses commonality and differences, within-study and cross-studies. Thus, patient gene-level profiles are transformed into deregulated pathways. From RNA-Seq of 55 lung adenocarcinoma patients, N-of-1-pathways predicts the deregulated pathways of each patient. Results Cross-patient N-of-1-pathways obtains comparable results with conventional genesets enrichment analysis (GSEA) and differentially expressed gene (DEG) enrichment, validated in three external evaluations. Moreover, heatmap and star plots highlight both individual and shared mechanisms ranging from molecular to organ-systems levels (eg, DNA repair, signaling, immune response). Patients were ranked based on the similarity of their deregulated mechanisms to those of an independent gold standard, generating unsupervised clusters of diametric extreme survival phenotypes (p=0.03). Conclusions The N-of-1-pathways framework provides a robust statistical and relevant biological interpretation of individual disease-free survival that is often overlooked in conventional cross-patient studies. It enables mechanism-level classifiers with smaller cohorts as well as N-of-1 studies. Software http://lussierlab.org/publications/N-of-1-pathways
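Principle (ii) above, deriving significance for a geneset from paired samples of one patient, can be sketched as follows. This is only an illustration under stated assumptions: it uses an exact two-sided sign test on per-gene differences as a simple stand-in, which is not necessarily the statistic used by the authors, and the gene identifiers, geneset names and expression values are hypothetical.

```python
from math import comb


def sign_test_p(diffs):
    """Exact two-sided sign test: are positive and negative differences balanced?"""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    k = sum(1 for d in nonzero if d > 0)
    tail = min(k, n - k)
    # Two-sided p-value from the Binomial(n, 0.5) distribution, capped at 1.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def geneset_p_values(baseline, case, genesets):
    """One p-value per geneset from a single pair of samples of the same patient."""
    result = {}
    for name, genes in genesets.items():
        diffs = [case[g] - baseline[g] for g in genes if g in case and g in baseline]
        result[name] = sign_test_p(diffs)
    return result


# Hypothetical paired expression values (for example normal vs tumor tissue).
baseline = {f"g{i}": 1.0 for i in range(12)}
case = {f"g{i}": (2.0 if i < 8 else 1.0 + 0.1 * (-1) ** i) for i in range(12)}
genesets = {
    "consistently_up": [f"g{i}" for i in range(8)],
    "mixed": [f"g{i}" for i in range(8, 12)],
}
print(geneset_p_values(baseline, case, genesets))
```

Because the patient serves as their own control, no cohort is needed to obtain a p-value; the statistical power comes from the number of genes in each geneset.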
Affiliation(s)
- Vincent Gardeux
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA; Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Department of Informatics, School of Engineering, EISTI (École Internationale des Sciences du Traitement de l'Information), Cergy-Pontoise, France; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Ikbel Achour
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA; Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Jianrong Li
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA; Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Mark Maienschein-Cline
- Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Haiquan Li
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA; Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Lorenzo Pesce
- Computation Institute, Argonne National Laboratory & University of Chicago, Chicago, Illinois, USA
- Gurunadh Parinandi
- Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Neil Bahroos
- Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA
- Robert Winn
- Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; University of Illinois Cancer Center, Chicago, Illinois, USA
- Ian Foster
- Computation Institute, Argonne National Laboratory & University of Chicago, Chicago, Illinois, USA; Department of Computer Science, University of Chicago, Chicago, Illinois, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Chicago, Illinois, USA
- Joe G N Garcia
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA
- Yves A Lussier
- Department of Medicine, Bio5 Institute, UA Cancer Center, University of Arizona, Tucson, Arizona, USA; Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA; Institute for Translational Health Informatics, University of Illinois at Chicago, Chicago, Illinois, USA; Computation Institute, Argonne National Laboratory & University of Chicago, Chicago, Illinois, USA; University of Illinois Cancer Center, Chicago, Illinois, USA; Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, USA; Department of Biopharmaceutical Science, College of Pharmacy, University of Illinois at Chicago, Illinois, USA; Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, USA; Department of Pharmacology, University of Illinois at Chicago, Chicago, Illinois, USA
37
Saini A, Hou J, Zhou W. Breast cancer prognosis risk estimation using integrated gene expression and clinical data. Biomed Res Int 2014; 2014:459203. [PMID: 24949450 PMCID: PMC4052785 DOI: 10.1155/2014/459203] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/02/2013] [Revised: 01/11/2014] [Accepted: 03/02/2014] [Indexed: 01/20/2023]
Abstract
BACKGROUND Novel prognostic markers are needed so that newly diagnosed breast cancer patients do not undergo unnecessary therapy. Various studies based on microarray gene expression datasets have generated gene signatures to predict prognosis outcomes, while ignoring the large amount of information contained in established clinical markers. Moreover, small sample sizes in individual microarray datasets remain a bottleneck in generating robust gene signatures, so the resulting signatures show limited predictive power. The aim of this study is to achieve high classification accuracy first for the good prognosis group and then for the poor prognosis group. METHODS We propose a novel algorithm called the IPRE (integrated prognosis risk estimation) algorithm. We used integrated microarray datasets from multiple studies to increase the sample size (∼2,700 samples). The IPRE algorithm consists of a virtual chromosome for the extraction of a prognostic gene signature of 79 genes, and a multivariate logistic regression model that incorporates clinical data along with expression data to generate a risk score formula that categorizes breast cancer patients into two prognosis groups. RESULTS The evaluation on two testing datasets showed that the IPRE algorithm achieved classification accuracies of 82% and 87%, which were far higher than those of existing algorithms.
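The risk score step described above, a multivariate logistic regression over expression and clinical variables, can be sketched as below. This is a minimal illustration and not the IPRE formula itself: the feature names, coefficients, intercept and the 0.5 cutoff are hypothetical placeholders, since the abstract does not report fitted values.

```python
from math import exp


def risk_score(features, coefficients, intercept):
    """Logistic risk score in (0, 1) from a weighted sum of features."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + exp(-z))


def prognosis_group(score, cutoff=0.5):
    """Assign one of two prognosis groups; the cutoff is an assumed value."""
    return "poor" if score >= cutoff else "good"


# Hypothetical coefficients: one expression-derived score plus two clinical markers.
coefficients = {"signature_score": 1.2, "tumor_size_cm": 0.4, "node_positive": 0.9}
intercept = -2.0
patient = {"signature_score": 0.5, "tumor_size_cm": 2.0, "node_positive": 1}

score = risk_score(patient, coefficients, intercept)
print(round(score, 3), prognosis_group(score))
```

In practice the coefficients would be estimated from training data, and the cutoff would be chosen to balance accuracy between the good and poor prognosis groups.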
Affiliation(s)
- Ashish Saini
- School of Information Technology, Deakin University, 221 Burwood Highway, Melbourne, VIC 3125, Australia
- Jingyu Hou
- School of Information Technology, Deakin University, 221 Burwood Highway, Melbourne, VIC 3125, Australia
- Wanlei Zhou
- School of Information Technology, Deakin University, 221 Burwood Highway, Melbourne, VIC 3125, Australia
38
Gardeux V, Arslan AD, Achour I, Ho TT, Beck WT, Lussier YA. Concordance of deregulated mechanisms unveiled in underpowered experiments: PTBP1 knockdown case study. BMC Med Genomics 2014; 7 Suppl 1:S1. [PMID: 25079003 PMCID: PMC4101571 DOI: 10.1186/1755-8794-7-s1-s1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
Background Genome-wide transcriptome profiling generated by microarray and RNA-Seq often provides deregulated genes or pathways applicable only to larger cohorts. On the other hand, individualized interpretation of transcriptomes is increasingly pursued to improve diagnosis, prognosis, and patient treatment processes. Yet, robust and accurate methods based on a single paired sample remain an unmet challenge. Methods "N-of-1-pathways" translates gene expression data profiles into mechanism-level profiles on single pairs of samples (one p-value per geneset). It relies on three principles: i) the statistical universe is a single paired sample, which serves as its own control; ii) statistics can be derived from multiple gene expression measures that share common biological mechanisms assimilated to genesets; iii) a semantic similarity metric takes into account inter-mechanism relationships to better assess commonality and differences, within and across study samples (e.g. patients, cell lines, tissues), which helps the interpretation of the underpinning biology. Results In the context of underpowered experiments, N-of-1-pathways predictions perform better than or comparably to those of GSEA and Differentially Expressed Genes enrichment (DEG enrichment), within and across datasets. N-of-1-pathways uncovered concordant PTBP1-dependent mechanisms across datasets (odds ratios ≥ 13, p-values ≤ 1 × 10⁻⁵), such as RNA splicing and cell cycle. In addition, it unveils tissue-specific mechanisms of alternatively transcribed PTBP1-dependent genesets. Furthermore, we demonstrate that GSEA and DEG enrichment preclude accurate analysis on single paired samples. Conclusions N-of-1-pathways enables robust and biologically relevant mechanism-level classifiers with small cohorts and single paired samples, surpassing conventional methods. Further, it identifies unique sample/patient mechanisms, a requirement for precision medicine.
39
Callari M, Musella V, Di Buduo E, Sensi M, Miodini P, Dugo M, Orlandi R, Agresti R, Paolini B, Carcangiu ML, Cappelletti V, Daidone MG. Subtype-dependent prognostic relevance of an interferon-induced pathway metagene in node-negative breast cancer. Mol Oncol 2014; 8:1278-89. [PMID: 24853384 DOI: 10.1016/j.molonc.2014.04.010] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 09/24/2013] [Revised: 04/09/2014] [Accepted: 04/14/2014] [Indexed: 12/31/2022]
Abstract
The majority of gene expression signatures developed to predict the likelihood of relapse in breast cancer (BC) patients assign a high risk score to patients with Estrogen Receptor (ER) negative or highly proliferating tumors. We aimed to identify a signature of differentially expressed (DE) metagenes, rather than single DE genes, associated with distant metastases beyond classical risk factors. We used 105 gene expression profiles from consecutive BCs to identify metagenes whose prognostic role was defined on an independent series of 92 ESR1+/ERBB2- node-negative BCs (42 cases developing metastases within 5 years from diagnosis and 50 cases metastasis-free for more than 5 years, comparable for age, tumor size, ER status and surgery). Findings were validated on publicly available datasets of 684 node-negative BCs including all the subtypes. Only a metagene containing interferon-induced genes (IFN metagene) proved to be predictive of distant metastasis in our series of patients with ESR1+/ERBB2- tumors (P = 0.029), and this finding was validated on 457 ESR1+/ERBB2- BCs from public datasets (P = 0.0424). Conversely, the IFN metagene was associated with a low risk of metastasis in 104 ERBB2+ tumors (P = 0.0099), whereas it did not significantly affect prognosis in 123 ESR1-/ERBB2- tumors (P = 0.2235). A complex prognostic interaction was revealed in ESR1+/ERBB2- and ERBB2+ tumors when the association between the IFN metagene and a T-cell metagene was considered. The study confirms the importance of analyzing prognostic variables separately within BC subtypes, highlights the advantages of using metagenes rather than genes, and finally identifies, in node-negative ESR1+/ERBB2- BCs, the unfavorable role of high IFN metagene expression.
Affiliation(s)
- Maurizio Callari
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Valeria Musella
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Eleonora Di Buduo
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Marialuisa Sensi
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Patrizia Miodini
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Matteo Dugo
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Rosaria Orlandi
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy
- Roberto Agresti
- Department of Surgery, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Venezian, 1, 20133 Milan, Italy
- Biagio Paolini
- Department of Pathology, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Venezian, 1, 20133 Milan, Italy
- Maria Luisa Carcangiu
- Department of Pathology, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Venezian, 1, 20133 Milan, Italy
- Vera Cappelletti
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy.
- Maria Grazia Daidone
- Department of Experimental Oncology and Molecular Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo, 42, 20133 Milan, Italy.
40
41
Soneson C, Fontes M. Incorporation of gene exchangeabilities improves the reproducibility of gene set rankings. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2012.07.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
42
Sariyar M, Hoffmann I, Binder H. Combining techniques for screening and evaluating interaction terms on high-dimensional time-to-event data. BMC Bioinformatics 2014; 15:58. [PMID: 24571520 PMCID: PMC3945780 DOI: 10.1186/1471-2105-15-58] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/22/2013] [Accepted: 01/28/2014] [Indexed: 11/23/2022]
Abstract
Background Molecular data, e.g. arising from microarray technology, is often used for predicting survival probabilities of patients. For multivariate risk prediction models on such high-dimensional data, there are established techniques that combine parameter estimation and variable selection. One big challenge is to incorporate interactions into such prediction models. In this feasibility study, we present building blocks for evaluating and incorporating interaction terms in high-dimensional time-to-event settings, especially for settings in which it is computationally too expensive to check all possible interactions. Results We use a boosting technique for estimation of effects and the following building blocks for pre-selecting interactions: (1) resampling, (2) random forests and (3) orthogonalization as a data pre-processing step. In a simulation study, the strategy that uses all building blocks is able to detect true main effects and interactions with high sensitivity in different kinds of scenarios. The main challenge is posed by interactions composed of variables that do not represent main effects, but our findings are also promising in this regard. Results on real world data illustrate that effect sizes of interactions frequently may not be large enough to improve prediction performance, even though the interactions are potentially of biological relevance. Conclusion Screening interactions through random forests is feasible and useful when one is interested in finding relevant two-way interactions. The other building blocks also contribute considerably to an enhanced pre-selection of interactions. We determined the limits of interaction detection in terms of necessary effect sizes. Our study emphasizes the importance of making full use of existing methods in addition to establishing new ones.
Affiliation(s)
- Murat Sariyar
- Institute of Medical Biostatistics, Epidemiology and Informatics, Medical Center of the Johannes Gutenberg University, Mainz 55131, Germany.
43
Pyatnitskiy M, Mazo I, Shkrob M, Schwartz E, Kotelnikova E. Clustering gene expression regulators: new approach to disease subtyping. PLoS One 2014; 9:e84955. [PMID: 24416320 PMCID: PMC3887006 DOI: 10.1371/journal.pone.0084955] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 08/22/2013] [Accepted: 11/20/2013] [Indexed: 12/29/2022]
Abstract
One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and downstream genes connected by relations extracted from a global literature-derived regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are used for the final patient grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with a complementary level of understanding of the pathway-level biology behind a disease by identifying significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms) that was not revealed using standard expression profile clustering. In another experiment we were able to suggest clusters of regulators responsible for colorectal carcinoma vs adenoma discrimination and to identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. The obtained clusters of regulators make it possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for an individual patient.
Affiliation(s)
- Mikhail Pyatnitskiy
- Institute of Biomedical Chemistry, RAMS, Moscow, Russia
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Ilya Mazo
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Maria Shkrob
- Elsevier Inc, Rockville, Maryland, United States of America
- Elena Schwartz
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Ekaterina Kotelnikova
- Ariadne Diagnostics LLC, Rockville, Maryland, United States of America
- Institute for Information Transmission Problems, RAS, Moscow, Russia
44
Staiger C, Cadot S, Györffy B, Wessels LFA, Klau GW. Current composite-feature classification methods do not outperform simple single-genes classifiers in breast cancer prognosis. Front Genet 2013; 4:289. [PMID: 24391662 PMCID: PMC3870302 DOI: 10.3389/fgene.2013.00289] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 09/15/2013] [Accepted: 11/28/2013] [Indexed: 01/21/2023]
Abstract
Integrating gene expression data with secondary data such as pathway or protein-protein interaction data has been proposed as a promising approach for improved outcome prediction of cancer patients. Methods employing this approach usually aggregate the expression of genes into new composite features, while the secondary data guide this aggregation. Previous studies were limited to a few data sets with a small number of patients. Moreover, each study used different data and evaluation procedures. This makes it difficult to objectively assess the gain in classification performance. Here we introduce the Amsterdam Classification Evaluation Suite (ACES). ACES is a Python package to objectively evaluate classification and feature-selection methods and contains methods for pooling and normalizing Affymetrix microarrays from different studies. It is simple to use and therefore facilitates the comparison of new approaches to best-in-class approaches. In addition to the methods described in our earlier study (Staiger et al., 2012), we have included two prominent prognostic gene signatures specific for breast cancer outcome, one more composite feature selection method and two network-based gene ranking methods. Employing the evaluation pipeline, we show that current composite-feature classification methods do not outperform simple single-genes classifiers in predicting outcome in breast cancer. Furthermore, we find that composite features are also no more stable across different data sets. Most strikingly, we observe that prediction performance is not affected when features are extracted from randomized PPI networks.
Affiliation(s)
- Christine Staiger
- Life Sciences, Centrum Wiskunde & Informatica, Amsterdam, Netherlands; Computational Cancer Biology, Division of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, Netherlands
- Sidney Cadot
- Computational Cancer Biology, Division of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, Netherlands
- Balázs Györffy
- Research Laboratory of Pediatrics and Nephrology, Hungarian Academy of Sciences, Budapest, Hungary
- Lodewyk F A Wessels
- Computational Cancer Biology, Division of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, Netherlands; Cancer Systems Biology Center, Netherlands Cancer Institute, Amsterdam, Netherlands; Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, TU Delft, Delft, Netherlands
- Gunnar W Klau
- Life Sciences, Centrum Wiskunde & Informatica, Amsterdam, Netherlands; Operations Research and Bioinformatics, Faculty of Sciences, VU University Amsterdam, Amsterdam, Netherlands
45
Jaffe AE, Storey JD, Ji H, Leek JT. Gene set bagging for estimating the probability a statistically significant result will replicate. BMC Bioinformatics 2013; 14:360. [PMID: 24330332 PMCID: PMC3890500 DOI: 10.1186/1471-2105-14-360] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 02/13/2013] [Accepted: 11/11/2013] [Indexed: 11/11/2022]
Abstract
Background Significance analysis plays a major role in identifying and ranking genes, transcription factor binding sites, DNA methylation regions, and other high-throughput features associated with illness. We propose a new approach, called gene set bagging, for measuring the probability that a gene set replicates in future studies. Gene set bagging involves resampling the original high-throughput data, performing gene-set analysis on the resampled data, and confirming that biological categories replicate in the bagged samples. Results Using both simulated and publicly-available genomics data, we demonstrate that significant categories in a gene set enrichment analysis may be unstable when subjected to resampling. We show our method estimates the replication probability (R), the probability that a gene set will replicate as a significant result in future studies, and show in simulations that this method reflects replication better than each set’s p-value. Conclusions Our results suggest that gene lists based on p-values are not necessarily stable, and therefore additional steps like gene set bagging may improve biological inference on gene sets.
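The resampling procedure described above can be sketched as follows: draw bootstrap resamples of the samples, rerun a gene-set analysis on each, and report the fraction of resamples in which a set is significant as its replication probability R. This is a toy illustration under stated assumptions, not the authors' implementation: the gene-set analysis here is a simple top-k hypergeometric enrichment test, resampling is done within each outcome group, and all names, thresholds and data are hypothetical.

```python
import random
from math import comb


def hypergeom_upper_p(N, K, n, k):
    """P(overlap >= k) when n hit genes are drawn from N genes, K of which are in the set."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / comb(N, n)


def significant_sets(expr, labels, genesets, top_k, alpha):
    """Toy gene-set analysis: take the top-k genes by absolute mean difference, then test enrichment."""
    def mean_diff(values):
        a = [v for v, y in zip(values, labels) if y == 1]
        b = [v for v, y in zip(values, labels) if y == 0]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    ranked = sorted(expr, key=lambda g: mean_diff(expr[g]), reverse=True)
    hits = set(ranked[:top_k])
    result = set()
    for name, genes in genesets.items():
        members = set(genes) & set(expr)
        p = hypergeom_upper_p(len(expr), len(members), top_k, len(members & hits))
        if p <= alpha:
            result.add(name)
    return result


def replication_probability(expr, labels, genesets, n_boot=200, top_k=5, alpha=0.05, seed=0):
    """Fraction of bootstrap resamples in which each gene set is called significant."""
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    counts = {name: 0 for name in genesets}
    for _ in range(n_boot):
        # Resample with replacement inside each group so both groups stay non-empty.
        idx = [rng.choice(cases) for _ in cases] + [rng.choice(controls) for _ in controls]
        boot_expr = {g: [v[i] for i in idx] for g, v in expr.items()}
        boot_labels = [labels[i] for i in idx]
        for name in significant_sets(boot_expr, boot_labels, genesets, top_k, alpha):
            counts[name] += 1
    return {name: c / n_boot for name, c in counts.items()}


# Hypothetical data: 20 genes, 5 cases and 5 controls; genes g0..g4 are shifted in cases.
labels = [1] * 5 + [0] * 5
expr = {}
for g in range(20):
    base = [0.1 * ((7 * g + 3 * j) % 5) for j in range(10)]
    shift = 5.0 if g < 5 else 0.0
    expr[f"g{g}"] = [b + shift * y for b, y in zip(base, labels)]
genesets = {
    "shifted_set": [f"g{g}" for g in range(5)],
    "null_set": [f"g{g}" for g in range(10, 15)],
}
R = replication_probability(expr, labels, genesets)
print(R)
```

In this toy data the shifted set is significant in every resample and the null set in none. Real data typically give intermediate values of R, which is where the estimate becomes informative beyond a single p-value.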
Affiliation(s)
- Jeffrey T Leek
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore MD 21205, USA.
46
Zeng T, Sun SY, Wang Y, Zhu H, Chen L. Network biomarkers reveal dysfunctional gene regulations during disease progression. FEBS J 2013; 280:5682-95. [PMID: 24107168 DOI: 10.1111/febs.12536] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Received: 03/01/2013] [Revised: 08/30/2013] [Accepted: 09/09/2013] [Indexed: 12/13/2022]
Abstract
Extensive studies have been conducted on gene biomarkers by exploring the increasingly accumulated gene expression and sequence data generated from high-throughput technology. Here, we briefly report on the state-of-the-art research and application of biomarkers from single genes (i.e. gene biomarkers) to gene sets (i.e. group or set biomarkers), gene networks (i.e. network biomarkers) and dynamical gene networks (i.e. dynamical network biomarkers). In particular, differential and dynamical network biomarkers are used as representative examples to demonstrate their effectiveness in both detecting early signals for complex diseases and revealing essential mechanisms on disease initiation and progression at a network level.
Affiliation(s)
- Tao Zeng
- Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
47
Zhou X, Liu J, Ye X, Wang W, Xiong J. Ensemble classifier based on context specific miRNA regulation modules: a new method for cancer outcome prediction. BMC Bioinformatics 2013; 14 Suppl 12:S6. [PMID: 24268063 PMCID: PMC3848894 DOI: 10.1186/1471-2105-14-s12-s6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023] Open
Abstract
Background Many classifiers constructed from selected gene markers have been proposed to predict the prognosis of patients with breast cancer. However, few of them have been applied in clinical practice because they generalize poorly: markers selected by one method differ greatly from those obtained by another method, and thus such markers often lack discriminative capability in other data sets. Methods In this work, a new ensemble classifier based on context-specific miRNA regulation modules is proposed to predict the metastasis risk of cancer patients. First, we defined all of the miRNAs that regulate the same context as a module containing the miRNAs and their regulated context, and applied the CoMi (Context-specific miRNA activity) score to quantify a miRNA's effect in a particular context; then the miRNA regulation modules with discriminative ability were detected, and each of them was used to build a separate weak classifier; finally, using a majority voting strategy, we integrated all weak classifiers into an ensemble classifier that was applied to predict the prognosis of cancer patients. Results On cohorts containing over 1,000 samples, the proposed ensemble classifier was superior to three other classifiers based on miRNA expression profiles, mRNA expression profiles and CoMi activity patterns, respectively. Notably, our method outperforms representative existing works. Moreover, the modules detected from different data sets show great stability (p-value of 6.40e-08). To investigate the biological significance of the selected modules, we carried out case studies, and the results suggest that our method helps to reveal latent mechanisms in breast cancer metastasis.
Conclusions One context-specific miRNA regulation module can uncover one critical biological process and the miRNAs involved in it that are related to cancer outcome, and several modules together can help to study the biological mechanism of cancer metastasis. Thus, the classifier built by ensembling multiple classifiers, each based on a different context-specific miRNA regulation module, showed promising performance in terms of both prediction accuracy and generalization.
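The final integration step of this method, one weak classifier per module combined by majority vote, can be illustrated with a short sketch. The weak classifiers here are hypothetical threshold rules on a per-module score; the abstract does not say how the weak classifiers are built or how the CoMi score is computed, so `make_threshold_classifier`, the thresholds, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def make_threshold_classifier(module_index, threshold):
    """Hypothetical weak classifier: predicts 1 (high risk) when the score
    of a single module exceeds a fixed threshold, otherwise 0."""
    def classify(sample):
        return 1 if sample[module_index] > threshold else 0
    return classify


def majority_vote(classifiers, sample):
    """Combine the weak classifiers by simple majority.
    Ties are resolved toward 0 (an assumption, not taken from the paper)."""
    votes = Counter(clf(sample) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


# One weak classifier per module; a sample is a list of per-module scores.
ensemble = [make_threshold_classifier(i, 0.5) for i in range(3)]
high_risk = majority_vote(ensemble, [0.9, 0.8, 0.1])  # two of three vote 1
low_risk = majority_vote(ensemble, [0.9, 0.1, 0.1])   # one of three votes 1
```

Because each weak classifier sees only one module, a module that is noisy in a given cohort can be outvoted by the others, which is one plausible reading of why the ensemble is reported to generalize better.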
|
48
|
Chen JL, Hsu A, Yang X, Li J, Lee Y, Parinandi G, Li H, Lussier YA. Curation-free biomodules mechanisms in prostate cancer predict recurrent disease. BMC Med Genomics 2013; 6 Suppl 2:S4. [PMID: 23819917 PMCID: PMC3654873 DOI: 10.1186/1755-8794-6-s2-s4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 04/08/2023] Open
Abstract
Motivation Gene expression-based prostate cancer gene signatures of poor prognosis are hampered by lack of gene feature reproducibility and a lack of understandability of their function. Molecular pathway-level mechanisms are intrinsically more stable and more robust than an individual gene. The Functional Analysis of Individual Microarray Expression (FAIME) we developed allows distinctive sample-level pathway measurements with utility for correlation with continuous phenotypes (e.g. survival). Further, we and others have previously demonstrated that pathway-level classifiers can be as accurate as gene-level classifiers using curated genesets that may implicitly comprise ascertainment biases (e.g. KEGG, GO). Here, we hypothesized that transformation of individual prostate cancer patient gene expression to pathway-level mechanisms derived from automated high throughput analyses of genomic datasets may also permit personalized pathway analysis and improve prognosis of recurrent disease. Results Via FAIME, three independent prostate gene expression arrays with both normal and tumor samples were transformed into two distinct types of molecular pathway mechanisms: (i) the curated Gene Ontology (GO) and (ii) dynamic expression activity networks of cancer (Cancer Modules). FAIME-derived mechanisms for tumorigenesis were then identified and compared. Curated GO and computationally generated "Cancer Module" mechanisms overlap significantly and are enriched for known oncogenic deregulations and highlight potential areas of investigation. We further show in two independent datasets that these pathway-level tumorigenesis mechanisms can identify men who are more likely to develop recurrent prostate cancer (log-rank p = 0.019). Conclusion Curation-free biomodules classification derived from congruent gene expression activation breaks from the paradigm of recapitulating the known curated pathway mechanism universe.
Affiliation(s)
- James L Chen
- Center for Biomed Informatics and Department of Medicine, The University of Chicago, Chicago, IL, USA
|
49
|
Determining epithelial contribution to in vivo mesenchymal tumour expression signature using species-specific microarray profiling analysis of xenografts. Genet Res (Camb) 2013; 95:14-29. [PMID: 23497823 DOI: 10.1017/s0016672313000013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022] Open
Abstract
Gene expression profiling using microarrays and xenograft transplants of human cancer cell lines are both popular tools to investigate human cancer. However, the undefined degree of cross hybridization between the mouse and human genomes hinders the use of microarrays to characterize gene expression of both the host and the cancer cell within the xenograft. Since an increasingly recognized aspect of cancer is the host response (or cancer-stroma interaction), we describe here a bioinformatic manipulation of the Affymetrix profiling that allows interrogation of the gene expression of both the mouse host and the human tumour. Evidence of microenvironmental regulation of epithelial mesenchymal transition of the tumour component in vivo is resolved against a background of mesenchymal gene expression. This tool could allow deeper insight to the mechanism of action of anti-cancer drugs, as typically novel drug efficacy is being tested in xenograft systems.
|
50
|
Perez-Rathke A, Li H, Lussier YA. Interpreting personal transcriptomes: personalized mechanism-scale profiling of RNA-seq data. Pacific Symposium on Biocomputing 2013:159-170. [PMID: 23424121 PMCID: PMC3595401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Despite thousands of reported studies unveiling gene-level signatures for complex diseases, few of these techniques work at the single-sample level with explicit underpinning of biological mechanisms. This presents both a critical dilemma in the field of personalized medicine and a plethora of opportunities for analysis of RNA-seq data. In this study, we hypothesize that the "Functional Analysis of Individual Microarray Expression" (FAIME) method we developed could be smoothly extended to RNA-seq data and unveil intrinsic underlying mechanism signatures across different scales of biological data for the same complex disease. Using publicly available RNA-seq data for gastric cancer, we confirmed the effectiveness of this method (i) to translate each sample transcriptome to pathway-scale scores, (ii) to predict deregulated pathways in gastric cancer against gold standards (FDR < 5%, Precision = 75%, Recall = 92%), and (iii) to predict phenotypes in an independent dataset and expression platform (RNA-seq vs microarrays, Fisher Exact Test p < 10^-6). Measuring at a single-sample level, FAIME could differentiate cancer samples from normal ones; furthermore, it achieved comparable performance in identifying differentially expressed pathways relative to state-of-the-art cross-sample methods. These results motivate future work on mechanism-level biomarker discovery predictive of diagnoses, treatment, and therapy.
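Step (i), turning one sample's transcriptome into a pathway-scale score, can be illustrated with a simple rank-based stand-in. The abstract does not give the FAIME formula, so the sketch below is not FAIME: it uses a plain difference of mean ranks, and the function name `pathway_score` and its normalization are assumptions made for this example.

```python
def pathway_score(sample_expr, pathway_genes):
    """Single-sample pathway score (illustrative stand-in, not FAIME).

    Genes are ranked by expression within the one sample (rank 1 = lowest).
    The score is the mean rank of pathway genes minus the mean rank of all
    other genes, divided by the number of genes so it lies in (-1, 1).
    """
    ordered = sorted(sample_expr, key=sample_expr.get)
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    inside = [rank[g] for g in sample_expr if g in pathway_genes]
    outside = [rank[g] for g in sample_expr if g not in pathway_genes]
    if not inside or not outside:
        raise ValueError("pathway must contain some, but not all, measured genes")
    mean_in = sum(inside) / len(inside)
    mean_out = sum(outside) / len(outside)
    return (mean_in - mean_out) / len(sample_expr)


# One sample, four genes: A and B are highly expressed, C and D are not.
sample = {"A": 10.0, "B": 9.0, "C": 1.0, "D": 2.0}
up_score = pathway_score(sample, {"A", "B"})    # 0.5
down_score = pathway_score(sample, {"C", "D"})  # -0.5
```

Because the score depends only on ranks within a single sample, it needs no reference cohort, which is the property the abstract emphasizes for single-sample analysis across platforms.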
Affiliation(s)
- Alan Perez-Rathke
- Department of Medicine, University of Illinois at Chicago, Chicago, IL 60612, USA.
|