1
Xi J, Deng Z, Liu Y, Wang Q, Shi W. Integrating multi-type aberrations from DNA and RNA through dynamic mapping gene space for subtype-specific breast cancer driver discovery. PeerJ 2023; 11:e14843. [PMID: 36755866 PMCID: PMC9901305 DOI: 10.7717/peerj.14843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 01/11/2023] [Indexed: 02/05/2023]
Abstract
Driver event discovery is crucial for breast cancer diagnosis and therapy. In particular, discovering the subtype-specificity of drivers can promote personalized biomarker discovery and precision treatment of cancer patients. Still, most existing computational driver discovery studies mainly exploit information from DNA aberrations and gene interactions. Notably, cancer driver events can arise not only from DNA aberrations but also from RNA alterations, yet integrating multi-type aberrations from both DNA and RNA is still a challenging task for breast cancer driver discovery. On the one hand, the data formats of different aberration types differ from each other, known as data format incompatibility. On the other hand, different types of aberrations demonstrate distinct patterns across samples, known as aberration type heterogeneity. To promote the integrated analysis of subtype-specific breast cancer drivers, we design a "splicing-and-fusing" framework to address the issues of data format incompatibility and aberration type heterogeneity simultaneously. To overcome the data format incompatibility, the "splicing-step" employs a knowledge graph structure to connect multi-type aberrations from the DNA and RNA data into a unified form. To tackle the aberration type heterogeneity, the "fusing-step" adopts a dynamic mapping gene space integration approach to represent the multi-type information by vectorized profiles. The experiments demonstrate the advantages of our approach in both the integration of multi-type aberrations from DNA and RNA and the discovery of subtype-specific breast cancer drivers. In summary, our "splicing-and-fusing" framework, with knowledge graph connection and dynamic mapping gene space fusion of multi-type aberration data from DNA and RNA, can successfully discover potential breast cancer drivers with an indication of subtype-specificity.
Affiliation(s)
- Jianing Xi
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Zhen Deng
  - School of Basic Medical Sciences, Guangzhou Medical University, Guangzhou, China
- Yang Liu
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Qian Wang
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Wen Shi
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
2
Solayappan M, Azlan A, Khor KZ, Yik MY, Khan M, Yusoff NM, Moses EJ. Utilization of CRISPR-Mediated Tools for Studying Functional Genomics in Hematological Malignancies: An Overview on the Current Perspectives, Challenges, and Clinical Implications. Front Genet 2022; 12:767298. [PMID: 35154242 PMCID: PMC8834884 DOI: 10.3389/fgene.2021.767298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Accepted: 11/17/2021] [Indexed: 11/26/2022]
Abstract
Hematological malignancies (HM) are a group of neoplastic diseases that are usually heterogeneous in nature due to complex underlying genetic aberrations, in which collaborating mutations enable cells to evade the checkpoints that normally safeguard them against DNA damage and other disruptions of healthy cell growth. Research on chromosomal structural rearrangements and alterations, gene mutations, and functionality is currently being carried out to understand the genomics of these abnormalities. It is also becoming more evident that cross talk between the functional changes in transcription and proteins gives the disease its characteristics, although specific mutations may induce unique phenotypes. Functional genomics is vital in this respect, as it measures the complete genetic change in cancerous cells and seeks to integrate the dynamic changes in these networks to elucidate various cancer phenotypes. The advent of CRISPR technology has provided an abundance of benefits, as this versatile technology enables DNA editing in the genome. The CRISPR-Cas9 system is a precise genome editing tool, and it has revolutionized methodologies in the field of hematology. Currently, various CRISPR systems are used to perform robust site-specific gene editing to study HM. Furthermore, experimental approaches based on CRISPR technology have created promising tools for developing effective hematological therapeutics. Therefore, this review focuses on diverse applications of CRISPR-based gene-editing tools in HM and their potential future trajectory. Collectively, this review demonstrates the key roles of the different CRISPR systems being used in HM, and the literature represents a critical step toward further understanding the biology of HM and the development of potential therapeutic approaches.
Affiliation(s)
- Maheswaran Solayappan
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
  - Department of Biotechnology, Faculty of Applied Sciences, AIMST University, Bedong, Malaysia
- Adam Azlan
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
  - *Correspondence: Emmanuel Jairaj Moses,
- Kang Zi Khor
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Mot Yee Yik
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Matiullah Khan
  - Department of Pathology, Faculty of Medicine, AIMST University, Bedong, Malaysia
- Narazah Mohd Yusoff
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Emmanuel Jairaj Moses
  - Regenerative Medicine Sciences Cluster, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
  - *Correspondence: Emmanuel Jairaj Moses,
3
Kim SY, Song HK, Lee SK, Kim SG, Woo HG, Yang J, Noh HJ, Kim YS, Moon A. Sex-Biased Molecular Signature for Overall Survival of Liver Cancer Patients. Biomol Ther (Seoul) 2020; 28:491-502. [PMID: 33077700 PMCID: PMC7585639 DOI: 10.4062/biomolther.2020.157] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/09/2020] [Revised: 09/18/2020] [Accepted: 09/18/2020] [Indexed: 12/31/2022]
Abstract
Sex/gender disparity has been shown in the incidence and prognosis of many types of diseases, probably due to differences in genes, physiological conditions such as hormones, and lifestyle between the sexes. The mortality and survival rates of many cancers, especially liver cancer, differ between men and women. Due to the pronounced sex/gender disparity, considering sex/gender may be necessary for the diagnosis and treatment of liver cancer. By analyzing research articles through a PubMed literature search, the present review identified 12 genes which showed practical relevance to cancer and sex disparities. Among the 12 sex-specific genes, 7 genes (BAP1, CTNNB1, FOXA1, GSTO1, GSTP1, IL6, and SRPK1) showed sex-biased function in liver cancer. Here we summarize previous findings on cancer molecular signatures, including our own analysis, and show that the sex-biased molecular signature CTNNB1High, IL6High, RHOAHigh and GLIPR1Low may serve as a female-specific index for prediction and evaluation of overall survival (OS) in liver cancer patients. This review suggests a potential role for sex-biased molecular signatures in liver cancer, providing useful information for diagnosis and prediction of disease progression based on gender.
Affiliation(s)
- Sun Young Kim
  - Department of Chemistry, College of Natural Sciences, Duksung Women's University, Seoul 01369, Republic of Korea
- Hye Kyung Song
  - Department of Chemistry, College of Natural Sciences, Duksung Women's University, Seoul 01369, Republic of Korea
- Suk Kyeong Lee
  - Department of Medical Life Sciences, Department of Biomedicine & Health Sciences, College of Medicine, The Catholic University of Korea, Seoul 06649, Republic of Korea
- Sang Geon Kim
  - College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University_Seoul, Goyang 10326, Republic of Korea
- Hyun Goo Woo
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea; Department of Biomedical Science, Graduate School, Ajou University, Suwon 16499, Republic of Korea
- Jieun Yang
  - Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea; Department of Biomedical Science, Graduate School, Ajou University, Suwon 16499, Republic of Korea
- Hyun-Jin Noh
  - Department of Biomedical Science, Graduate School, Ajou University, Suwon 16499, Republic of Korea; Department of Biochemistry, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- You-Sun Kim
  - Department of Biomedical Science, Graduate School, Ajou University, Suwon 16499, Republic of Korea; Department of Biochemistry, Ajou University School of Medicine, Suwon 16499, Republic of Korea
- Aree Moon
  - Duksung Innovative Drug Center, College of Pharmacy, Duksung Women's University, Seoul 01369, Republic of Korea
4
Hristov BH, Chazelle B, Singh M. uKIN Combines New and Prior Information with Guided Network Propagation to Accurately Identify Disease Genes. Cell Syst 2020; 10:470-479.e3. [PMID: 32684276 PMCID: PMC7821437 DOI: 10.1016/j.cels.2020.05.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/01/2020] [Revised: 04/24/2020] [Accepted: 05/19/2020] [Indexed: 12/23/2022]
Abstract
Protein interaction networks provide a powerful framework for identifying genes causal for complex genetic diseases. Here, we introduce a general framework, uKIN, that uses prior knowledge of disease-associated genes to guide, within known protein-protein interaction networks, random walks that are initiated from newly identified candidate genes. In large-scale testing across 24 cancer types, we demonstrate that our network propagation approach for integrating both prior and new information not only better identifies cancer driver genes than using either source of information alone but also readily outperforms other state-of-the-art network-based approaches. We also apply our approach to genome-wide association data to identify genes functionally relevant for several complex diseases. Overall, our work suggests that guided network propagation approaches that utilize both prior and new data are a powerful means to identify disease genes. uKIN is freely available for download at: https://github.com/Singh-Lab/uKIN.
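The network propagation idea in this abstract can be illustrated with a plain random walk with restart on a toy graph. This is only a sketch of the general technique, not uKIN's guided walk: the function name, the five-node network, the seed choice and the restart probability are all assumptions made for illustration, and every node is assumed to have at least one neighbour.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return visit probabilities for a walk that restarts at the seed nodes.

    adjacency: dict mapping each node to a list of neighbours (undirected,
               every node assumed to have at least one neighbour).
    seeds:     set of nodes where the walk starts and restarts.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a path A - B - C - D - E, with A as the only seed.
network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = random_walk_with_restart(network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes are then ranked by their visit probability; nodes close to the seeds in the network tend to score higher than distant ones.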
Affiliation(s)
- Borislav H Hristov
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Bernard Chazelle
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
5
Zhang W, Zeng Y, Wang L, Liu Y, Cheng YN. An Effective Graph Clustering Method to Identify Cancer Driver Modules. Front Bioeng Biotechnol 2020; 8:271. [PMID: 32318558 PMCID: PMC7154174 DOI: 10.3389/fbioe.2020.00271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/17/2020] [Accepted: 03/16/2020] [Indexed: 12/15/2022]
Abstract
Identifying the molecular modules that drive cancer progression can greatly deepen the understanding of cancer mechanisms and provide useful information for targeted therapies. Most methods currently addressing this issue primarily use mutual exclusivity without making full use of the extra layer of module property. In this paper, we propose MCLCluster to identify cancer driver modules, which uses somatic mutation data, Cancer Cell Fraction (CCF) data, a gene functional interaction network and a protein-protein interaction (PPI) network to derive the module property from mutual exclusivity, connectivity in the PPI network and functional similarity of genes. We take three measures to ensure the effectiveness of our algorithm. First, we use CCF data to choose stronger signals and more confident mutations. Second, the weighted gene functional interaction network is used to quantify the gene functional similarity in the PPI network. Third, a Markov-based graph clustering method is used to extract the candidate modules. MCLCluster is tested on two TCGA datasets (GBM and BRCA), and identifies several well-known oncogenic driver modules and some modules with functionally associated driver genes. In addition, we compare it with Multi-Dendrix, FSME Cluster and RME on a simulated dataset with background noise and passenger rate, where MCLCluster outperforms all of these methods.
Affiliation(s)
- Wei Zhang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China; Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
- Yifu Zeng
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China; Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- Yue Liu
  - College of Computer Science and Electronics Engineering, Hunan University, Changsha, China
- Yi-Nan Cheng
  - College of Science, Southern University of Science and Technology, Shenzhen, China
6
Abstract
A key goal of cancer systems biology is to use big data to elucidate the molecular networks by which cancer develops. However, to date there has been no systematic evaluation of how far these efforts have progressed. In this Analysis, we survey six major systems biology approaches for mapping and modelling cancer pathways with attention to how well their resulting network maps cover and enhance current knowledge. Our sample of 2,070 systems biology maps captures all literature-curated cancer pathways with significant enrichment, although the strong tendency is for these maps to recover isolated mechanisms rather than entire integrated processes. Systems biology maps also identify previously underappreciated functions, such as a potential role for human papillomavirus-induced chromosomal alterations in ovarian tumorigenesis, and they add new genes to known cancer pathways, such as those related to metabolism, Hippo signalling and immunity. Notably, we find that many cancer networks have been provided only in journal figures and not for programmatic access, underscoring the need to deposit network maps in community databases to ensure they can be readily accessed. Finally, few of these findings have yet been clinically translated, leaving ample opportunity for future translational studies. Periodic surveys of cancer pathway maps, such as the one reported here, are critical to assess progress in the field and identify underserved areas of methodology and cancer biology.
Affiliation(s)
- Brent M Kuenzi
  - Division of Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Trey Ideker
  - Division of Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
7
Saberi Ansar E, Eslahchii C, Rahimi M, Geranpayeh L, Ebrahimi M, Aghdam R, Kerdivel G. Significant random signatures reveals new biomarker for breast cancer. BMC Med Genomics 2019; 12:160. [PMID: 31703592 PMCID: PMC6842262 DOI: 10.1186/s12920-019-0609-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/09/2019] [Accepted: 10/24/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND In 2012, Venet et al. proposed that, at least in the case of breast cancer, most published signatures are not significantly more associated with outcome than randomly generated signatures. They suggested that the nominal p-value is not a good estimator of the significance of a signature. Therefore, one can reasonably postulate that some information might be present in such significant random signatures. METHODS In this research, we first show that, using an empirical p-value, these published signatures are more significant than their nominal p-values. In other words, the proposed empirical p-value can be considered a complementary criterion to the nominal p-value for distinguishing random signatures from significant ones. Secondly, we develop a novel computational method to extract the information that is embedded within significant random signatures. In our method, a score is assigned to each gene based on the number of times it appears in significant random signatures. Then, these scores are diffused through a protein-protein interaction network and a permutation procedure is used to determine the genes with significant scores. The genes with significant scores are considered as the set of significant genes. RESULTS First, we applied our method to the breast cancer dataset NKI to obtain a set of significant genes in breast cancer considering significant random signatures. Secondly, the prognostic performance of the computed set of significant genes is evaluated using DMFS and RFS datasets. We have observed that the top ranked genes from this set can successfully separate patients with poor prognosis from those with good prognosis. Finally, we investigated the expression pattern of TAT, the first gene reported in our set, in malignant breast cancer vs. adjacent normal tissue and mammospheres.
CONCLUSION Applying the method, we found a set of significant genes in breast cancer, including TAT, a gene that has never been reported as an important gene in breast cancer. Our results show that the expression of TAT is repressed in tumors suggesting that this gene could act as a tumor suppressor in breast cancer and could be used as a new biomarker.
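Two steps of the method described above, counting how often each gene appears in significant random signatures and comparing an observed score with a null distribution, can be sketched as follows. The abstract does not state the exact formulas, so this uses a common permutation-style empirical p-value with a +1 correction as an assumption; the function names and the toy gene labels are hypothetical.

```python
def gene_frequency_scores(significant_signatures):
    """Count how many significant random signatures contain each gene."""
    counts = {}
    for signature in significant_signatures:
        for gene in signature:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as the observed score.

    The +1 in numerator and denominator keeps the estimate above zero
    when no null score reaches the observed value.
    """
    at_least_as_large = sum(1 for s in null_scores if s >= observed)
    return (at_least_as_large + 1) / (len(null_scores) + 1)


# Hypothetical toy input: three random signatures that passed a significance filter.
signatures = [{"g1", "g2"}, {"g2", "g3"}, {"g2"}]
scores = gene_frequency_scores(signatures)

# Hypothetical null scores for one gene, e.g. from permuted labels.
p = empirical_p_value(observed=3.0, null_scores=[1.0, 2.0, 3.0, 4.0])
```

In the described pipeline the frequency scores would additionally be diffused over a protein-protein interaction network before the permutation test; that step is omitted here.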
Affiliation(s)
- Elnaz Saberi Ansar
  - Curie Institute, INSERM U830, Translational Research Department, PSL Research University, Paris, 75005 France
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Changiz Eslahchii
  - Department of Computer Sciences, Faculty of Mathematical Sciences, Shahid-Beheshti University, GC, Tehran, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Mahsa Rahimi
  - Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Lobat Geranpayeh
  - Department of Surgery, Sina Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Marzieh Ebrahimi
  - Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Rosa Aghdam
  - Department of Computer Sciences, Faculty of Mathematical Sciences, Shahid-Beheshti University, GC, Tehran, Iran
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Gwenneg Kerdivel
  - Institut Cochin, Department Development, Reproduction, Inserm U1016, CNRS, UMR 8104, Université Paris Descartes UMR-S1016, Paris, 75014 France
8
Zhang W, Wang SL. A Novel Method for Identifying the Potential Cancer Driver Genes Based on Molecular Data Integration. Biochem Genet 2019; 58:16-39. [PMID: 31115714 DOI: 10.1007/s10528-019-09924-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/06/2019] [Accepted: 05/02/2019] [Indexed: 12/17/2022]
Abstract
The identification of the cancer driver genes is essential for personalized therapy. The mutation frequency of most driver genes is in the middle (2-20%) or even lower range, which makes it difficult to find the driver genes with low-frequency mutations. Other forms of genomic aberrations, such as copy number variations (CNVs) and epigenetic changes, may also reflect cancer progression. In this work, a method for identifying the potential cancer driver genes (iPDG) based on molecular data integration is proposed. DNA copy number variation, somatic mutation, and gene expression data of matched cancer samples are integrated. In combination with the method of iKEEG, the "key genes" of cancer are identified, and the change in their expression levels is used for auxiliary evaluation of whether the mutated genes are potential drivers. For a mutated gene, the concept of mutational effect is defined, which takes into account the effects of copy number variation, mutation gene itself, and its neighbor genes. The method mainly includes two steps: the first step is data preprocessing. First, DNA copy number variation and somatic mutation data are integrated. Then, the integrated data are mapped to a given interaction network, and the diffusion kernel is used to form the mutation effect matrix. The second step is to obtain the key genes by using the iKGGE method, and construct the connection matrix by means of the gene expression data of the key genes and mutation impact matrix of the mutated genes. Experiments on TCGA breast cancer and Glioblastoma multiforme datasets demonstrate that iPDG is effective not only to identify the known cancer driver genes but also to discover the rare potential driver genes. When measured by functional enrichment analysis, we find that these genes are clearly associated with these two types of cancers.
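The preprocessing step, mapping aberrations onto an interaction network and spreading them with a diffusion kernel, can be sketched with the standard heat kernel K = exp(-beta * L), where L = D - A is the graph Laplacian. Whether iPDG uses exactly this kernel is not stated in the abstract, so the kernel form, the value of `beta`, the truncated Taylor expansion, and the three-gene toy network are all assumptions for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(adjacency, beta=0.5, terms=30):
    """Approximate K = exp(-beta * L) by a truncated Taylor series, with L = D - A."""
    n = len(adjacency)
    # Graph Laplacian: node degree on the diagonal, minus the adjacency matrix.
    lap = [[(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
            for j in range(n)] for i in range(n)]
    m = [[-beta * lap[i][j] for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]  # holds m^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


# Hypothetical toy network: gene0 - gene1 - gene2 (a path), only gene0 is altered.
adjacency = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
aberration = [1.0, 0.0, 0.0]  # e.g. mutation or CNV indicator per gene

kernel = diffusion_kernel(adjacency)
# Diffused effect per gene: each gene receives a share of its neighbours' aberrations.
effect = [sum(kernel[i][j] * aberration[j] for j in range(3)) for i in range(3)]
```

After diffusion, the direct neighbour of the altered gene receives a larger share of the signal than the gene two steps away, which is how a rarely mutated gene can still be scored through its network context.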
Affiliation(s)
- Wei Zhang
  - College of Computer Science and Electronics Engineering, Hunan University, Changsha, 410082, Hunan, China
- Shu-Lin Wang
  - College of Computer Science and Electronics Engineering, Hunan University, Changsha, 410082, Hunan, China
9
Ozturk K, Dow M, Carlin DE, Bejar R, Carter H. The Emerging Potential for Network Analysis to Inform Precision Cancer Medicine. J Mol Biol 2018; 430:2875-2899. [PMID: 29908887 PMCID: PMC6097914 DOI: 10.1016/j.jmb.2018.06.016] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 03/12/2018] [Revised: 05/30/2018] [Accepted: 06/06/2018] [Indexed: 12/19/2022]
Abstract
Precision cancer medicine promises to tailor clinical decisions to patients using genomic information. Indeed, successes of drugs targeting genetic alterations in tumors, such as imatinib that targets BCR-ABL in chronic myelogenous leukemia, have demonstrated the power of this approach. However, biological systems are complex, and patients may differ not only by the specific genetic alterations in their tumor, but also by more subtle interactions among such alterations. Systems biology and more specifically, network analysis, provides a framework for advancing precision medicine beyond clinical actionability of individual mutations. Here we discuss applications of network analysis to study tumor biology, early methods for N-of-1 tumor genome analysis, and the path for such tools to the clinic.
Affiliation(s)
- Kivilcim Ozturk
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
- Michelle Dow
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
- Daniel E Carlin
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA
- Rafael Bejar
  - Moores Cancer Center, Division of Hematology and Oncology, University of California San Diego, La Jolla, CA 92093, USA
- Hannah Carter
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA; Moores Cancer Center and Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA; CIFAR, MaRS Centre, West Tower, 661 University Ave., Suite 505, Toronto, ON M5G 1M1, Canada
10
Hou Y, Gao B, Li G, Su Z. MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration. Adv Sci (Weinh) 2018; 5:1800640. [PMID: 30250803 PMCID: PMC6145398 DOI: 10.1002/advs.201800640] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 04/26/2018] [Revised: 06/14/2018] [Indexed: 05/05/2023]
Abstract
Identification of a few cancer driver mutation genes from a much larger number of passenger mutation genes in cancer samples remains a highly challenging task. Here, a novel method for distinguishing the driver genes from the passenger genes by effective integration of somatic mutation data and molecular interaction data using a maximal mutational impact function (MaxMIF) is presented. When evaluated on six somatic mutation datasets of Pan-Cancer and 19 datasets of different cancer types from TCGA, MaxMIF almost always significantly outperforms all the existing state-of-the-art methods in terms of predictive accuracy, sensitivity, and specificity. It recovers about 30% more known cancer genes in 500 top-ranked candidate genes than the best among the other tools evaluated. MaxMIF is also highly robust to data perturbation. Intriguingly, MaxMIF is able to identify potential cancer driver genes, with strong experimental data support. Therefore, MaxMIF can be very useful for identifying or prioritizing cancer driver genes in the increasing number of available cancer genomic data.
Affiliation(s)
- Yingnan Hou
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
- Bo Gao
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
- Guojun Li
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
  - Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
- Zhengchang Su
  - Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
11
Xi J, Wang M, Li A. Discovering mutated driver genes through a robust and sparse co-regularized matrix factorization framework with prior information from mRNA expression patterns and interaction network. BMC Bioinformatics 2018; 19:214. [PMID: 29871594 PMCID: PMC5989443 DOI: 10.1186/s12859-018-2218-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/24/2017] [Accepted: 05/24/2018] [Indexed: 12/18/2022]
Abstract
BACKGROUND Discovery of mutated driver genes is one of the primary objectives in studying tumorigenesis. To discover relatively infrequently mutated driver genes from somatic mutation data, many existing methods incorporate an interaction network as prior information. However, these existing network-based methods do not exploit prior information from mRNA expression patterns, which has also been shown to be highly informative about cancer progression. RESULTS To incorporate prior information from both the interaction network and mRNA expression, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Furthermore, our framework also applies Frobenius norm regularization to overcome overfitting. A sparsity-inducing penalty is employed to obtain sparse scores in the gene representations, of which the top scored genes are selected as driver candidates. Evaluation experiments with known benchmarking genes indicate that the performance of our method benefits from the two types of prior information. Our method also outperforms the existing network-based methods, and detects some driver genes that are not predicted by the competing methods. CONCLUSIONS In summary, our proposed method can improve the performance of driver gene discovery by effectively incorporating prior information from the interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
Affiliation(s)
- Jianing Xi
  - School of Information Science and Technology, University of Science and Technology of China, Huangshan Road, Hefei, 230027 China
- Minghui Wang
  - School of Information Science and Technology, University of Science and Technology of China, Huangshan Road, Hefei, 230027 China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Huangshan Road, Hefei, 230027 China
- Ao Li
  - School of Information Science and Technology, University of Science and Technology of China, Huangshan Road, Hefei, 230027 China
  - Centers for Biomedical Engineering, University of Science and Technology of China, Huangshan Road, Hefei, 230027 China
12
Xi J, Li A, Wang M. A novel unsupervised learning model for detecting driver genes from pan-cancer data through matrix tri-factorization framework with pairwise similarities constraints. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.03.026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/21/2023]
13
Girnius N, Edwards YJ, Garlick DS, Davis RJ. The cJUN NH2-terminal kinase (JNK) signaling pathway promotes genome stability and prevents tumor initiation. eLife 2018; 7:36389. [PMID: 29856313 PMCID: PMC5984035 DOI: 10.7554/elife.36389] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 03/04/2018] [Accepted: 05/09/2018] [Indexed: 12/19/2022]
Abstract
Breast cancer is the most commonly diagnosed malignancy in women. Analysis of breast cancer genomic DNA indicates frequent loss-of-function mutations in components of the cJUN NH2-terminal kinase (JNK) signaling pathway. Since JNK signaling can promote cell proliferation by activating the AP1 transcription factor, this apparent association of reduced JNK signaling with tumor development was unexpected. We examined the effect of JNK deficiency in the murine breast epithelium. Loss of JNK signaling caused genomic instability and the development of breast cancer. Moreover, JNK deficiency caused widespread early neoplasia and rapid tumor formation in a murine model of breast cancer. This tumor suppressive function was not mediated by a role of JNK in the growth of established tumors, but by a requirement of JNK to prevent tumor initiation. Together, these data identify JNK pathway defects as ‘driver’ mutations that promote genome instability and tumor initiation. As cells in our body grow and divide, their DNA can experience changes or damage. Most of these ‘mutations’ are harmless, or quickly fixed by the body. Yet, sometimes a mutation can trigger a chain of genetic events that drives the cells to multiply uncontrollably, which leads to tumors. Identifying these ‘driver mutations’ is complex, but key to understanding how cancers start and can be fought. Breast cancer is the most common type of cancer diagnosed in women worldwide. Large studies have focused on sequencing the DNA of cancerous breast cells to try to identify the mutations that started the cancer. Results show that, in these cells, a biological mechanism called the JNK signaling pathway is often inactivated because mutations affect the molecules that take part in this process. Like a chain reaction, the proteins of the JNK pathway act on each other until the last one, called JNK, gets switched on. This protein then goes on to participate in a number of cellular processes such as DNA repair. 
Is it possible that mutations in this pathway actually drive cancer, and if so, how? Girnius et al. addressed these questions by inactivating the JNK pathway in the breast cells of mice. Over the next year and a half, the JNK-deficient animals were more likely to get breast cancer than normal mice. Further experiments showed that, in breast cells, the JNK protein prevented tumors from appearing. However, once the tumors were present, it was less effective at stopping them from growing. The DNA of the breast cancer cells with no JNK protein also contained more genetic changes and mistakes. This suggests that the JNK signaling pathway helps to keep the genetic information ‘healthy’. This may be because, normally, the JNK protein activates processes that fix DNA mutations. Taken together, the results presented by Girnius et al. show that genetic changes which inactivate the JNK pathway can drive the development of breast cancer. Certain anti-cancer drugs kill cancerous cells by damaging their DNA. Breast tumor cells with inactive JNK pathways are less able to repair their genetic information, and so these drugs could potentially work well on them. Future experiments will be needed to test this hypothesis.
Affiliation(s)
- Nomeda Girnius
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States
| | - Yvonne Jk Edwards
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States
| | - David S Garlick
- Histo-Scientific Research Laboratories, Mount Jackson, United States
| | - Roger J Davis
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States; Howard Hughes Medical Institute, University of Massachusetts Medical School, Worcester, United States
| |
|
14
|
Hristov BH, Singh M. Network-Based Coverage of Mutational Profiles Reveals Cancer Genes. Cell Syst 2017; 5:221-229.e4. [PMID: 28957656 PMCID: PMC5997485 DOI: 10.1016/j.cels.2017.09.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/01/2017] [Revised: 08/28/2017] [Accepted: 09/06/2017] [Indexed: 12/21/2022]
Abstract
A central goal in cancer genomics is to identify the somatic alterations that underpin tumor initiation and progression. While commonly mutated cancer genes are readily identifiable, those that are rarely mutated across samples are difficult to distinguish from the large numbers of other infrequently mutated genes. We introduce a method, nCOP, that considers per-individual mutational profiles within the context of protein-protein interaction networks in order to identify small connected subnetworks of genes that, while not individually frequently mutated, comprise pathways that are altered across (i.e., "cover") a large fraction of individuals. By analyzing 6,038 samples across 24 different cancer types, we demonstrate that nCOP is highly effective in identifying cancer genes, including those with low mutation frequencies. Overall, our work demonstrates that combining per-individual mutational information with interaction networks is a powerful approach for tackling the mutational heterogeneity observed across cancers.
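The idea of a small connected subnetwork whose genes together "cover" many individuals can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not nCOP's actual optimization: the function name `greedy_connected_cover`, the seed rule, the `max_genes` cap, and the toy network are all hypothetical.

```python
def greedy_connected_cover(network, mutated_in, max_genes=3):
    """Grow a connected gene set that covers as many patients as possible.

    network:    dict mapping gene -> set of neighbouring genes
    mutated_in: dict mapping gene -> set of patient ids with a mutation in it
    """
    # Seed with the gene mutated in the most patients (ties: alphabetical).
    seed = max(sorted(network), key=lambda g: len(mutated_in.get(g, set())))
    chosen = [seed]
    covered = set(mutated_in.get(seed, set()))
    while len(chosen) < max_genes:
        # Only neighbours of the current set are eligible, so it stays connected.
        frontier = sorted({n for g in chosen for n in network[g]} - set(chosen))
        if not frontier:
            break
        best = max(frontier, key=lambda g: len(mutated_in.get(g, set()) - covered))
        gain = mutated_in.get(best, set()) - covered
        if not gain:
            break  # no neighbour covers any new patient
        chosen.append(best)
        covered |= gain
    return chosen, covered

# Toy data: gene E is mutated but isolated, so it can never join the subnetwork.
network = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}, "D": {"B"}, "E": set()}
mutated_in = {"A": {1, 2}, "B": {3}, "C": {2}, "D": {4, 5}, "E": {6}}

chosen, covered = greedy_connected_cover(network, mutated_in)
print(chosen, sorted(covered))
```

Note how gene B, mutated in only one patient, is selected because it connects A to D; this is the kind of rarely mutated gene that a per-gene frequency ranking would overlook.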
Affiliation(s)
- Borislav H Hristov
- Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Mona Singh
- Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| |
|
15
|
A novel network regularized matrix decomposition method to detect mutated cancer genes in tumour samples with inter-patient heterogeneity. Sci Rep 2017; 7:2855. [PMID: 28588243 PMCID: PMC5460199 DOI: 10.1038/s41598-017-03141-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/27/2017] [Accepted: 04/20/2017] [Indexed: 01/01/2023] Open
Abstract
Inter-patient heterogeneity is a major challenge for mutated cancer genes detection which is crucial to advance cancer diagnostics and therapeutics. To detect mutated cancer genes in heterogeneous tumour samples, a prominent strategy is to determine whether the genes are recurrently mutated in their interaction network context. However, recent studies show that some cancer genes in different perturbed pathways are mutated in different subsets of samples. Subsequently, these genes may not display significant mutational recurrence and thus remain undiscovered even in consideration of network information. We develop a novel method called mCGfinder to efficiently detect mutated cancer genes in tumour samples with inter-patient heterogeneity. Based on matrix decomposition framework incorporated with gene interaction network information, mCGfinder can successfully measure the significance of mutational recurrence of genes in a subset of samples. When applying mCGfinder on TCGA somatic mutation datasets of five types of cancers, we find that the genes detected by mCGfinder are significantly enriched for known cancer genes, and yield substantially smaller p-values than other existing methods. All the results demonstrate that mCGfinder is an efficient method in detecting mutated cancer genes.
|
16
|
Le Morvan M, Zinovyev A, Vert JP. NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis. PLoS Comput Biol 2017; 13:e1005573. [PMID: 28650955 PMCID: PMC5507468 DOI: 10.1371/journal.pcbi.1005573] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 12/13/2016] [Revised: 07/11/2017] [Accepted: 05/15/2017] [Indexed: 01/01/2023] Open
Abstract
Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is however challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of somatic mutations prognostic power which has been overlooked by previous studies because of the sparse and binary nature of mutations.
Affiliation(s)
- Marine Le Morvan
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, 75006 Paris, France
- Institut Curie, 75248 Paris Cedex 5, France
- INSERM, U900, 75248 Paris Cedex 5, France
| | - Andrei Zinovyev
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, 75006 Paris, France
- Institut Curie, 75248 Paris Cedex 5, France
- INSERM, U900, 75248 Paris Cedex 5, France
| | - Jean-Philippe Vert
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, 75006 Paris, France
- Institut Curie, 75248 Paris Cedex 5, France
- INSERM, U900, 75248 Paris Cedex 5, France
- Department of Mathematics and Applications, Ecole normale supérieure, CNRS, PSL Research University, 75005 Paris, France
| |
|
17
|
Jang K, Kim K, Cho A, Lee I, Choi JK. Network perturbation by recurrent regulatory variants in cancer. PLoS Comput Biol 2017; 13:e1005449. [PMID: 28333928 PMCID: PMC5383347 DOI: 10.1371/journal.pcbi.1005449] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/17/2016] [Revised: 04/06/2017] [Accepted: 03/10/2017] [Indexed: 12/12/2022] Open
Abstract
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. Identifying driver variants is a current challenge facing cancer genomics. A well-established and robust method for this is to find recurrence in large cohorts of samples. Recurrence patterns of amino acid-changing variants can reveal oncogenes and tumor suppressor genes. However, such single-gene approaches have limitations because of rare variants. Therefore, recurrently affected protein complexes, network modules, or signaling pathways have been identified based on network-level recurrence. Here we dissect chromatin interactome to identify cis-regulatory variants that show high gene-level recurrence. We then employ the gene regulatory network and protein interactome to characterize putative cancer genes with cis-regulatory variant recurrence. 
These genes were located at critical positions in the regulatory network. By contrast, they are at the circumference in the protein interactome; instead, they form a network module with coding cancer genes located at hub positions. Furthermore, the coding and regulatory variants associated via these interactions showed exclusive and complementary occurrence patterns across tumor samples. Therefore, we suggest that transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes.
Affiliation(s)
- Kiwon Jang
- Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
| | - Kwoneel Kim
- Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
| | - Ara Cho
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Republic of Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Republic of Korea
| | - Jung Kyoon Choi
- Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea
| |
|
18
|
Xi J, Wang M, Li A. Discovering potential driver genes through an integrated model of somatic mutation profiles and gene functional information. Molecular BioSystems 2017; 13:2135-2144. [DOI: 10.1039/c7mb00303j] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 02/01/2023]
Abstract
An integrated approach to identify driver genes based on information of somatic mutations, the interaction network and Gene Ontology similarity.
Affiliation(s)
- Jianing Xi
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
| | - Minghui Wang
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
- Centers for Biomedical Engineering
| | - Ao Li
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH 230027, People’s Republic of China
- Centers for Biomedical Engineering
| |
|
19
|
Dimitrakopoulos CM, Beerenwinkel N. Computational approaches for the identification of cancer genes and pathways. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2016; 9. [PMID: 27863091 PMCID: PMC5215607 DOI: 10.1002/wsbm.1364] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 02/28/2016] [Revised: 07/26/2016] [Accepted: 08/23/2016] [Indexed: 12/27/2022]
Abstract
High‐throughput DNA sequencing techniques enable large‐scale measurement of somatic mutations in tumors. Cancer genomics research aims at identifying all cancer‐related genes and solid interpretation of their contribution to cancer initiation and development. However, this venture is characterized by various challenges, such as the high number of neutral passenger mutations and the complexity of the biological networks affected by driver mutations. Based on biological pathway and network information, sophisticated computational methods have been developed to facilitate the detection of cancer driver mutations and pathways. They can be categorized into (1) methods using known pathways from public databases, (2) network‐based methods, and (3) methods learning cancer pathways de novo. Methods in the first two categories use and integrate different types of data, such as biological pathways, protein interaction networks, and gene expression measurements. The third category consists of de novo methods that detect combinatorial patterns of somatic mutations across tumor samples, such as mutual exclusivity and co‐occurrence. In this review, we discuss recent advances, current limitations, and future challenges of these approaches for detecting cancer genes and pathways. We also discuss the most important current resources of cancer‐related genes. WIREs Syst Biol Med 2017, 9:e1364. doi: 10.1002/wsbm.1364
Affiliation(s)
- Christos M Dimitrakopoulos
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
20
|
Cho A, Shim JE, Kim E, Supek F, Lehner B, Lee I. MUFFINN: cancer gene discovery via network analysis of somatic mutation data. Genome Biol 2016; 17:129. [PMID: 27333808 PMCID: PMC4918128 DOI: 10.1186/s13059-016-0989-x] [Citation(s) in RCA: 93] [Impact Index Per Article: 11.6] [Received: 11/17/2015] [Accepted: 05/24/2016] [Indexed: 12/21/2022] Open
Abstract
A major challenge for distinguishing cancer-causing driver mutations from inconsequential passenger mutations is the long-tail of infrequently mutated genes in cancer genomes. Here, we present and evaluate a method for prioritizing cancer genes accounting not only for mutations in individual genes but also in their neighbors in functional networks, MUFFINN (MUtations For Functional Impact on Network Neighbors). This pathway-centric method shows high sensitivity compared with gene-centric analyses of mutation data. Notably, only a marginal decrease in performance is observed when using 10 % of TCGA patient samples, suggesting the method may potentiate cancer genome projects with small patient populations.
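Scoring a gene by its own mutations together with those of its network neighbors can be sketched in a few lines. The combination rule below (own count plus the count of the most mutated direct neighbor) is an assumption chosen for illustration; the abstract does not specify MUFFINN's scoring function, and the gene names and counts are made-up toy values.

```python
def neighbor_aware_scores(network, mutation_count):
    """Score each gene by its own mutations plus its most mutated neighbour.

    network:        dict mapping gene -> set of direct neighbours
    mutation_count: dict mapping gene -> number of mutated samples
    """
    scores = {}
    for gene, neighbors in network.items():
        own = mutation_count.get(gene, 0)
        # default=0 handles genes with no neighbours in the network.
        best_neighbor = max((mutation_count.get(n, 0) for n in neighbors), default=0)
        scores[gene] = own + best_neighbor
    return scores

# Toy functional network and mutation counts (illustrative only).
network = {
    "TP53": {"MDM2", "ATM"},
    "MDM2": {"TP53"},
    "ATM": {"TP53"},
    "GENE_X": set(),
}
mutation_count = {"TP53": 30, "MDM2": 2, "ATM": 5, "GENE_X": 4}

scores = neighbor_aware_scores(network, mutation_count)
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, scores[gene])
```

In this toy case MDM2 is mutated less often than GENE_X, yet it scores far higher because it neighbors a frequently mutated gene, which is the effect a pathway-centric ranking is meant to capture.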
Affiliation(s)
- Ara Cho
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Jung Eun Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Eiru Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Fran Supek
- EMBL-CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), 08003, Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain; Division of Electronics, Rudjer Boskovic Institute, 10000, Zagreb, Croatia
| | - Ben Lehner
- EMBL-CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), 08003, Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| |
|
21
|
De Maeyer D, Weytjens B, De Raedt L, Marchal K. Network-Based Analysis of eQTL Data to Prioritize Driver Mutations. Genome Biol Evol 2016; 8:481-94. [PMID: 26802430 PMCID: PMC4825419 DOI: 10.1093/gbe/evw010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022] Open
Abstract
In clonal systems, interpreting driver genes in terms of molecular networks helps understanding how these drivers elicit an adaptive phenotype. Obtaining such a network-based understanding depends on the correct identification of driver genes. In clonal systems, independent evolved lines can acquire a similar adaptive phenotype by affecting the same molecular pathways, a phenomenon referred to as parallelism at the molecular pathway level. This implies that successful driver identification depends on interpreting mutated genes in terms of molecular networks. Driver identification and obtaining a network-based understanding of the adaptive phenotype are thus confounded problems that ideally should be solved simultaneously. In this study, a network-based eQTL method is presented that solves both the driver identification and the network-based interpretation problem. As input the method uses coupled genotype-expression phenotype data (eQTL data) of independently evolved lines with similar adaptive phenotypes and an organism-specific genome-wide interaction network. The search for mutational consistency at pathway level is defined as a subnetwork inference problem, which consists of inferring a subnetwork from the genome-wide interaction network that best connects the genes containing mutations to differentially expressed genes. Based on their connectivity with the differentially expressed genes, mutated genes are prioritized as driver genes. Based on semisynthetic data and two publicly available data sets, we illustrate the potential of the network-based eQTL method to prioritize driver genes and to gain insights in the molecular mechanisms underlying an adaptive phenotype. The method is available at http://bioinformatics.intec.ugent.be/phenetic_eqtl/index.html
Affiliation(s)
- Dries De Maeyer
- Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium; Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium; Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
| | - Bram Weytjens
- Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium; Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium; Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
| | - Luc De Raedt
- Department of Computer Science, KU Leuven, Celestijnenlaan 200A, B-3001 Leuven, Belgium
| | - Kathleen Marchal
- Department of Information Technology (INTEC, iMINDS), UGent, 9052 Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Gent, Belgium; Bioinformatics Institute Ghent, Technologiepark 927, 9052 Ghent, Belgium; Department of Genetics, University of Pretoria, Hatfield Campus, Pretoria 0028, South Africa; Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium
| |
|
22
|
Kang H, Cho KH, Zhang XD, Zeng T, Chen L. Inferring Sequential Order of Somatic Mutations during Tumorgenesis based on Markov Chain Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:1094-1103. [PMID: 26451822 DOI: 10.1109/tcbb.2015.2424408] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Tumors are developed and worsen with the accumulated mutations on DNA sequences during tumorigenesis. Identifying the temporal order of gene mutations in cancer initiation and development is a challenging topic. It not only provides a new insight into the study of tumorigenesis at the level of genome sequences but also is an effective tool for early diagnosis of tumors and preventive medicine. In this paper, we develop a novel method to accurately estimate the sequential order of gene mutations during tumorigenesis from genome sequencing data based on Markov chain model as TOMC (Temporal Order based on Markov Chain), and also provide a new criterion to further infer the order of samples or patients, which can characterize the severity or stage of the disease. We applied our method to the analysis of tumors based on several high-throughput datasets. Specifically, first, we revealed that tumor suppressor genes (TSG) tend to be mutated ahead of oncogenes, which are considered as important events for key functional loss and gain during tumorigenesis. Second, the comparisons of various methods demonstrated that our approach has clear advantages over the existing methods due to the consideration on the effect of mutation dependence among genes, such as co-mutation. Third and most important, our method is able to deduce the ordinal sequence of patients or samples to quantitatively characterize their severity of tumors. Therefore, our work provides a new way to quantitatively understand the development and progression of tumorigenesis based on high throughput sequencing data.
|
23
|
Melak T, Gakkhar S. Maximum flow approach to prioritize potential drug targets of Mycobacterium tuberculosis H37Rv from protein-protein interaction network. Clin Transl Med 2015; 4:61. [PMID: 26061871 PMCID: PMC4467812 DOI: 10.1186/s40169-015-0061-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/23/2015] [Accepted: 06/02/2015] [Indexed: 01/26/2023] Open
Abstract
Background: Despite the implementation of several strategies, tuberculosis (TB) remains a serious global public health problem, causing millions of infections and deaths every year. This is mainly due to the emergence of drug-resistant varieties of TB. The current treatment strategies for drug-resistant TB are of longer duration, are more expensive and have side effects. This highlights the importance of identifying and prioritizing targets for new drugs. This study was carried out to prioritize potential drug targets of Mycobacterium tuberculosis H37Rv based on their flow to resistance genes. Methods: The weighted proteome interaction network of the pathogen was constructed using a dataset from the STRING database. Only the subset of the dataset with interactions that have a combined score ≥770 was considered. A maximum flow approach was used to prioritize potential drug targets. The potential drug targets were obtained through comparative genome and network centrality analysis. The curated set of resistance genes was retrieved from the literature. A detailed literature review and additional assessment of the method were also carried out for validation. Results: A list of 537 proteins that are essential to the pathogen and non-homologous with human proteins was obtained from the comparative genome analysis. Through network centrality measures, 131 of them were found within the close neighborhood of the centre of gravity of the proteome network. These proteins were further prioritized based on their maximum flow value to resistance genes, and they are proposed as reliable drug targets of the pathogen. Proteins which interact with the host were also identified in order to understand the infection mechanism.
Conclusion: Potential drug targets of Mycobacterium tuberculosis H37Rv were successfully prioritized based on their flow to resistance genes of existing drugs. This is believed to increase the druggability of the targets, since inhibition of a protein that has a maximum flow to resistance genes is more likely to disrupt the communication to these genes. A purposely selected literature review of the top 14 proteins showed that many of them had been proposed as drug targets of the pathogen.
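The two steps named in the Methods (keeping only interactions with a combined score ≥770, then computing maximum flow from a candidate protein to a resistance gene) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: using the combined score directly as edge capacity, treating each interaction as two directed arcs, the Edmonds-Karp algorithm, and the protein names `P1`, `P2`, `P3`, `R` are all assumptions.

```python
from collections import defaultdict, deque

def max_flow(edges, source, sink, min_score=770):
    """Edmonds-Karp maximum flow on an undirected weighted interaction network.

    edges: iterable of (protein_a, protein_b, combined_score) tuples.
    Edges scoring below min_score are discarded before the flow is computed.
    """
    if source == sink:
        return 0
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, score in edges:
        if score >= min_score:
            cap[u][v] += score  # undirected edge -> one arc per direction
            cap[v][u] += score
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Push flow and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

# Toy network: candidate target P1, resistance gene R (made-up scores).
edges = [
    ("P1", "P2", 900), ("P2", "R", 800),
    ("P1", "P3", 950), ("P3", "R", 780),
    ("P1", "R", 500),   # below the 770 cutoff, so it is ignored by default
]
print(max_flow(edges, "P1", "R"))
```

Candidates could then be ranked by this value computed against each resistance gene; how the study aggregates flows across several resistance genes is not stated in the abstract.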
Affiliation(s)
- Tilahun Melak
- Department of Computer Science, Dilla University, Gedeo, Ethiopia
| | | |
|
24
|
Babaei S, Mahfouz A, Hulsman M, Lelieveldt BPF, de Ridder J, Reinders M. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex. PLoS Comput Biol 2015; 11:e1004221. [PMID: 25965262 PMCID: PMC4429121 DOI: 10.1371/journal.pcbi.1004221] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 09/25/2014] [Accepted: 03/03/2015] [Indexed: 01/08/2023] Open
Abstract
The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). Regulatory elements can target genes over large genomic distances through long-range chromatin interactions. These interactions arise as a result of the three-dimensional (3D) conformation of chromosomes in the cell nucleus. This 3D conformation can also result in the co-localization of co-regulated genes. To investigate this, we asked whether genome-wide chromatin interactions can predict co-expression patterns of genes. To address this question, we characterized 3D interactions between genes, captured by Hi-C measurements, by a network, termed chromatin interaction network (CIN). 
We applied scale-aware topological measures to the network to comprehensively characterize the chromatin interactions at different scales, ranging from direct interaction between gene pairs to chromatin compartment interactions. We then used multi-scale chromatin interactions to predict spatial co-expression patterns in the mouse cortex. The results show that the prediction performance improves when scale-aware topological measures of the multi-resolution chromatin interaction network are used.
Affiliation(s)
- Sepideh Babaei
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Ahmed Mahfouz
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Marc Hulsman
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Boudewijn P. F. Lelieveldt
- Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands
- Jeroen de Ridder
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- * E-mail: (JDR); (MR)
- Marcel Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- * E-mail: (JDR); (MR)
|
25
|
Pon JR, Marra MA. Driver and Passenger Mutations in Cancer. Annual Review of Pathology: Mechanisms of Disease 2015; 10:25-50. [DOI: 10.1146/annurev-pathol-012414-040312] [Citation(s) in RCA: 216] [Impact Index Per Article: 24.0] [Indexed: 12/28/2022]
Affiliation(s)
- Julia R. Pon
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada V5Z 1L3
- Marco A. Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada V5Z 1L3
- Department of Medical Genetics, University of British Columbia, Vancouver, Canada V6T 1Z4
|
26
|
Merid SK, Goranskaya D, Alexeyenko A. Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis. BMC Bioinformatics 2014; 15:308. [PMID: 25236784 PMCID: PMC4262241 DOI: 10.1186/1471-2105-15-308] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 05/07/2014] [Accepted: 09/02/2014] [Indexed: 01/09/2023] Open
Abstract
Background: In somatic cancer genomes, delineating genuine driver mutations against a background of multiple passenger events is a challenging task. The difficulty of determining function from sequence data and the low frequency of mutations are increasingly hindering the search for novel, less common cancer drivers. The accumulation of extensive amounts of data on somatic point and copy number alterations necessitates the development of systematic methods for driver mutation analysis.
Results: We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. It probabilistically evaluates 1) functional network links between different mutations in the same genome and 2) links between individual mutations and known cancer pathways. In addition, it can employ correlations of mutation patterns in pairs of genes. The method was used to analyze genomic alterations in two TCGA datasets, one for glioblastoma multiforme and another for ovarian carcinoma, which were generated using different approaches to mutation profiling. The proportions of drivers among the reported de novo point mutations in these cancers were estimated to be 57.8% and 16.8%, respectively. Both sets also included extended chromosomal regions with synchronous duplications or losses of multiple genes. We identified putative copy number driver events within many such segments. Finally, we summarized seemingly disparate mutations and discovered a functional network of collagen modifications in the glioblastoma. In order to select the most efficient network for use with this method, we used a novel, ROC curve-based procedure for benchmarking different network versions by their ability to recover pathway membership.
Conclusions: The results of our network-based procedure were in good agreement with published gold standard sets of cancer genes and were shown to complement and expand frequency-based driver analyses. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations. The software used in this work and the global network of functional couplings are publicly available at http://research.scilifelab.se/andrej_alexeyenko/downloads.html.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-308) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Andrey Alexeyenko
- Department of Microbiology, Tumour and Cell Biology, Bioinformatics Infrastructure for Life Sciences, Science for Life Laboratory, Karolinska Institutet, 17177 Stockholm, Sweden
|