1
Mina M, Iyer A, Ciriello G. Epistasis and evolutionary dependencies in human cancers. Curr Opin Genet Dev 2022; 77:101989. [PMID: 36182742 DOI: 10.1016/j.gde.2022.101989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 08/29/2022] [Accepted: 08/31/2022] [Indexed: 01/27/2023]
Abstract
Cancer evolution is driven by the concerted action of multiple molecular alterations, which emerge and are selected during tumor progression. An alteration is selected when it provides an advantage to the tumor cell. However, the advantage provided by a specific alteration depends on the tumor lineage, cell epigenetic state, and presence of additional alterations. In this case, we say that an evolutionary dependency exists between an alteration and what influences its selection. Epistatic interactions between altered genes lead to evolutionary dependencies (EDs), by favoring or vetoing specific combinations of events. Large-scale cancer genomics studies have discovered examples of such dependencies, and showed that they influence tumor progression, disease phenotypes, and therapeutic response. In the past decade, several algorithmic approaches have been proposed to infer EDs from large-scale genomics datasets. These methods adopt diverse strategies to address common challenges and shed new light on cancer evolutionary trajectories. Here, we review these efforts starting from a simple conceptualization of the problem, presenting the tackled and still unmet needs in the field, and discussing the implications of EDs in cancer biology and precision oncology.
Affiliation(s)
- Marco Mina
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Cancer Center Leman, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Arvind Iyer
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Cancer Center Leman, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Giovanni Ciriello
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Cancer Center Leman, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
2
Zeng Z, Mao C, Vo A, Li X, Nugent JO, Khan SA, Clare SE, Luo Y. Deep learning for cancer type classification and driver gene identification. BMC Bioinformatics 2021; 22:491. [PMID: 34689757 PMCID: PMC8543824 DOI: 10.1186/s12859-021-04400-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 12/12/2022]
Abstract
Background: Genetic information is becoming more readily available and is increasingly being used to predict patient cancer types as well as their subtypes. Most classification methods thus far utilize somatic mutations as independent features for classification and are limited by study power. We aim to develop a novel method to effectively explore the landscape of genetic variants, including germline variants and small insertions and deletions, for cancer type prediction.
Results: We proposed DeepCues, a deep learning model that utilizes convolutional neural networks to unbiasedly derive features from raw cancer DNA sequencing data for disease classification and relevant gene discovery. Using raw whole-exome sequencing as features, germline variants and somatic mutations, including insertions and deletions, were interactively amalgamated for feature generation and cancer prediction. We applied DeepCues to a dataset from TCGA to classify seven different types of major cancers and obtained an overall accuracy of 77.6%. We compared DeepCues to conventional methods and demonstrated a significant overall improvement (p < 0.001). Strikingly, the top 20 breast cancer relevant genes identified using DeepCues had a 40% overlap with the top 20 known breast cancer driver genes.
Conclusion: Our results support DeepCues as a novel method to improve the representational resolution of DNA sequencing data and its power in deriving features from raw sequences for cancer type prediction, as well as discovering new cancer relevant genes.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04400-4.
Affiliation(s)
- Zexian Zeng
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Chengsheng Mao
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA
- Andy Vo
  - Committee on Developmental Biology and Regenerative Medicine, The University of Chicago, Chicago, IL, USA
- Janna Ore Nugent
  - Research Computing Services, Northwestern University, Chicago, IL, USA
- Seema A Khan
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, NMH/Prentice Women's Hospital Room 4-420 250 E Superior, Chicago, IL, 60611, USA.
- Susan E Clare
  - Department of Surgery, Feinberg School of Medicine, Northwestern University, Robert H Lurie Medical Research Center Room 4-113 250 E Superior, Chicago, IL, 60611, USA.
- Yuan Luo
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive Room 11-189, Chicago, IL, 60611, USA.
3
Fedrizzi T, Ciani Y, Lorenzin F, Cantore T, Gasperini P, Demichelis F. Fast mutual exclusivity algorithm nominates potential synthetic lethal gene pairs through brute force matrix product computations. Comput Struct Biotechnol J 2021; 19:4394-4403. [PMID: 34429855 PMCID: PMC8369001 DOI: 10.1016/j.csbj.2021.08.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/12/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/12/2022]
Abstract
Mutual exclusivity analysis of genomic aberrations contributes to the exploration of potential synthetic lethal (SL) relationships, thus guiding the nomination of specific cancer cell vulnerabilities. When multiple classes of genomic aberrations and large cohorts of patients are interrogated, exhaustive genome-wide analyses are not computationally feasible with commonly used approaches. Here we present Fast Mutual Exclusivity (FaME), an algorithm based on matrix multiplication that employs a logarithm-based implementation of Fisher's exact test to achieve fast computation of genome-wide mutual exclusivity tests; we show that brute force testing for mutual exclusivity of hundreds of millions of aberration combinations can be performed in a few minutes. We applied FaME to allele-specific data from whole exome experiments of 27 TCGA study cohorts, detecting both mutual exclusivity of point mutations and allele-specific copy number signals that span sets of contiguous cytobands. We next focused on a case study involving the loss of tumor suppressors and druggable genes while exploiting an integrated analysis of both public cell line loss-of-function screen data and patients' transcriptomic profiles. The FaME algorithm implementation as well as the allele-specific analysis output are publicly available at https://github.com/demichelislab/FaME.
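The idea described in this abstract (co-occurrence counts from a matrix product, followed by a log-space Fisher's exact test) can be illustrated with a minimal sketch. This is not the FaME implementation; the function names `cooccurrence_counts`, `log_choose`, and `mutual_exclusivity_pvalue` and the toy data are hypothetical, and pure Python stands in for optimized matrix routines.

```python
import math

def cooccurrence_counts(matrix):
    """Return M * M^T for a binary genes x samples matrix (list of lists).

    Entry [i][j] is the number of samples altered in both gene i and gene j;
    the diagonal holds each gene's own alteration count.
    """
    n = len(matrix)
    return [[sum(a * b for a, b in zip(matrix[i], matrix[j])) for j in range(n)]
            for i in range(n)]

def log_choose(n, k):
    """Logarithm of the binomial coefficient, computed via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def mutual_exclusivity_pvalue(both, n_a, n_b, n_samples):
    """One-sided Fisher's exact test for mutual exclusivity.

    Probability of seeing at most `both` co-altered samples under the
    hypergeometric null, with all terms evaluated in log space.
    """
    lowest = max(0, n_a + n_b - n_samples)
    log_denominator = log_choose(n_samples, n_b)
    p = 0.0
    for k in range(lowest, both + 1):
        p += math.exp(log_choose(n_a, k)
                      + log_choose(n_samples - n_a, n_b - k)
                      - log_denominator)
    return min(p, 1.0)

# Toy data (assumption): two genes altered in disjoint halves of 8 samples.
alterations = [
    [1, 1, 1, 1, 0, 0, 0, 0],  # gene A
    [0, 0, 0, 0, 1, 1, 1, 1],  # gene B
]
counts = cooccurrence_counts(alterations)
p = mutual_exclusivity_pvalue(counts[0][1], counts[0][0], counts[1][1], 8)
print(p)  # 1/70, about 0.0143
```

Because every pairwise 2x2 table follows from one matrix product plus the diagonal, the test can be run over all gene pairs without revisiting the per-sample data.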
Affiliation(s)
- Tarcisio Fedrizzi
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Yari Ciani
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Francesca Lorenzin
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Thomas Cantore
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Paola Gasperini
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Francesca Demichelis
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Al-Saud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA
  - The Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
4
Völkel G, Laban S, Fürstberger A, Kühlwein SD, Ikonomi N, Hoffmann TK, Brunner C, Neuberg DS, Gaidzik V, Döhner H, Kraus JM, Kestler HA. Analysis, identification and visualization of subgroups in genomics. Brief Bioinform 2020; 22:5909009. [PMID: 32954413 PMCID: PMC8138884 DOI: 10.1093/bib/bbaa217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/04/2020] [Revised: 08/14/2020] [Accepted: 08/17/2020] [Indexed: 12/22/2022]
Abstract
Motivation: Cancer is a complex and heterogeneous disease involving multiple somatic mutations that accumulate during its progression. In the past years, the wide availability of genomic data from patients’ samples opened new perspectives in the analysis of gene mutations and alterations. Hence, visualizing and further identifying genes mutated in massive sets of patients are nowadays a critical task that sheds light on more personalized intervention approaches.
Results: Here, we extensively review existing tools for visualization and analysis of alteration data. We compare different approaches to study mutual exclusivity and sample coverage in large-scale omics data. We complement our review with the standalone software AVAtar (‘analysis and visualization of alteration data’) that integrates diverse aspects known from different tools into a comprehensive platform. AVAtar supplements customizable alteration plots by a multi-objective evolutionary algorithm for subset identification and provides an innovative and user-friendly interface for the evaluation of concurrent solutions. A use case from personalized medicine demonstrates its unique features showing an application on vaccination target selection.
Availability: AVAtar is available at: https://github.com/sysbio-bioinf/avatar
Contact: hans.kestler@uni-ulm.de, phone: +49 (0) 731 500 24 500, fax: +49 (0) 731 500 24 502
Affiliation(s)
- Thomas K Hoffmann
  - Department of Otorhinolaryngology, Head and Neck Surgery, Ulm University Medical Center, Germany
- Cornelia Brunner
  - Department of Otorhinolaryngology, Head and Neck Surgery, Ulm University Medical Center, Germany
- Donna S Neuberg
  - Department of Biostatistics, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Verena Gaidzik
  - Department of Internal Medicine III, Ulm University Medical Center, Germany
- Hartmut Döhner
  - Department of Internal Medicine III, Ulm University Medical Center, Germany
5
Zeng Z, Vo AH, Mao C, Clare SE, Khan SA, Luo Y. Cancer classification and pathway discovery using non-negative matrix factorization. J Biomed Inform 2019; 96:103247. [PMID: 31271844 PMCID: PMC6697569 DOI: 10.1016/j.jbi.2019.103247] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/06/2019] [Revised: 04/23/2019] [Accepted: 07/01/2019] [Indexed: 02/08/2023]
Abstract
OBJECTIVES: Extracting genetic information from a full range of sequencing data is important for understanding disease. We propose a novel method to effectively explore the landscape of genetic mutations and aggregate them to predict cancer type.
DESIGN: We applied non-smooth non-negative matrix factorization (nsNMF) and support vector machine (SVM) to utilize the full range of sequencing data, aiming to better aggregate genetic mutations and improve their power to predict disease type. More specifically, we introduce a novel classifier to distinguish cancer types using somatic mutations obtained from whole-exome sequencing data. Mutations were identified from multiple cancers and scored using SIFT, PP2, and CADD, and collapsed at the individual gene level. nsNMF was then applied to reduce dimensionality and obtain coefficient and basis matrices. A feature matrix was derived from the obtained matrices to train a classifier for cancer type classification with the SVM model.
RESULTS: We have demonstrated that the classifier was able to distinguish four cancer types with reasonable accuracy. In five-fold cross-validations using mutation counts as features, the average prediction accuracy was 80% (SEM = 0.1%), significantly outperforming baselines and outperforming models using mutation scores as features.
CONCLUSION: Using the factor matrices derived from the nsNMF, we identified multiple genes and pathways that are significantly associated with each cancer type. This study presents a generic and complete pipeline to study the associations between somatic mutations and cancers. The proposed method can be adapted to other studies for disease status classification and pathway discovery.
Affiliation(s)
- Zexian Zeng
  - Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Andy H Vo
  - Committee on Developmental Biology and Regenerative Medicine, The University of Chicago, Chicago, IL, USA
- Chengsheng Mao
  - Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Susan E Clare
  - Department of Surgery, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA.
- Seema A Khan
  - Department of Surgery, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA.
- Yuan Luo
  - Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA.
6
Deng Y, Luo S, Deng C, Luo T, Yin W, Zhang H, Zhang Y, Zhang X, Lan Y, Ping Y, Xiao Y, Li X. Identifying mutual exclusivity across cancer genomes: computational approaches to discover genetic interaction and reveal tumor vulnerability. Brief Bioinform 2019; 20:254-266. [PMID: 28968730 DOI: 10.1093/bib/bbx109] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 05/23/2017] [Indexed: 02/06/2023]
Abstract
Systematic sequencing of cancer genomes has revealed prevalent heterogeneity, with patients harboring various combinatorial patterns of genetic alteration. In particular, groups of genes exhibiting mutually exclusive alteration patterns are widespread across cancers, covering a broad spectrum of crucial cancer pathways. Considerable recent evidence shows that mutual exclusivity reflects alternative functions in tumor initiation and progression, or suggests adverse effects of concurrent alterations. Given its importance, numerous computational approaches have been proposed to study mutual exclusivity using genomic profiles alone, or by integrating networks and phenotypes. Some of them have been routinely used to explore genetic associations, which leads to a deeper understanding of carcinogenic mechanisms and reveals unexpected tumor vulnerabilities. Here, we present an overview of mutual exclusivity from the perspective of the cancer genome. We describe the common hypothesis underlying mutual exclusivity, summarize the strategies for the identification of significant mutually exclusive patterns, compare the performance of representative algorithms on simulated data sets and discuss their common confounders.
Affiliation(s)
- Yulan Deng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Shangyi Luo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Chunyu Deng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Tao Luo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Wenkang Yin
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Hongyi Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yong Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Xinxin Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yujia Lan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yanyan Ping
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Yun Xiao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
7
Zhang J, Zhang S. The Discovery of Mutated Driver Pathways in Cancer: Models and Algorithms. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:988-998. [PMID: 28113329 DOI: 10.1109/tcbb.2016.2640963] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 06/06/2023]
Abstract
The pathogenesis of cancer in humans is still poorly understood. With the rapid development of high-throughput sequencing technologies, huge volumes of cancer genomics data have been generated. Deciphering these data poses great opportunities and challenges to computational biologists. One such key challenge is to distinguish driver mutations, genes, and pathways from passenger ones. Mutual exclusivity of gene mutations (each patient has no more than one mutation in the gene set) has been observed in various cancer types and thus has been used as an important property of a driver gene set or pathway. In this article, we aim to review the recent development of computational models and algorithms for discovering driver pathways or modules in cancer, with a focus on mutual exclusivity-based ones.
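The strict definition of mutual exclusivity quoted in this abstract (no patient carries more than one mutation in the gene set) can be checked directly. The sketch below is illustrative only: the function name `gene_set_stats`, the toy cohort, and the coverage fraction reported alongside exclusivity are assumptions, not any published model.

```python
def gene_set_stats(mutations, gene_set):
    """Return (coverage, exclusive) for a candidate gene set.

    mutations: dict mapping a sample ID to the set of genes mutated in it.
    coverage:  fraction of samples with at least one mutated gene in the set.
    exclusive: True if no sample has more than one mutated gene in the set.
    """
    covered = 0
    exclusive = True
    for genes in mutations.values():
        hits = len(genes & gene_set)
        if hits >= 1:
            covered += 1
        if hits > 1:
            exclusive = False
    return covered / len(mutations), exclusive

# Toy cohort (assumption): four samples with hypothetical mutation calls.
cohort = {
    "s1": {"TP53"},
    "s2": {"KRAS"},
    "s3": {"TP53", "KRAS"},
    "s4": set(),
}
print(gene_set_stats(cohort, {"TP53", "KRAS"}))  # (0.75, False)
print(gene_set_stats(cohort, {"TP53", "EGFR"}))  # (0.5, True)
```

Real cohorts rarely satisfy the strict criterion, so methods in this area typically score approximate exclusivity; the all-or-nothing flag here is only the literal reading of the definition.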
8
Zhang L, Liu Y, Wang M, Wu Z, Li N, Zhang J, Yang C. EZH2-, CHD4-, and IDH-linked epigenetic perturbation and its association with survival in glioma patients. J Mol Cell Biol 2017; 9:477-488. [PMID: 29272522 PMCID: PMC5907834 DOI: 10.1093/jmcb/mjx056] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 08/30/2017] [Revised: 11/12/2017] [Accepted: 12/18/2017] [Indexed: 12/13/2022]
Abstract
Glioma is a complex disease with limited treatment options. Recent advances have identified isocitrate dehydrogenase (IDH) mutations in up to 80% of lower grade gliomas (LGG) and in 76% of secondary glioblastomas (GBM). IDH mutations are also seen in 10%-20% of acute myeloid leukemia (AML). In AML, it was determined that mutations of IDH and other genes involved in epigenetic regulation are early events, emerging at the pre-leukemic stem cell (pre-LSC) stage, whereas mutations in genes propagating oncogenic signals are late events in leukemia. IDH mutations are also early events in glioma, occurring before TP53 mutation, 1p/19q deletion, etc. Despite these advances in glioma research, studies into other molecular alterations have lagged considerably. In this study, we analyzed currently available databases. We identified EZH2, KMT2C, and CHD4 as important genes in glioma in addition to the known gene IDH1/2. We also showed that genomic alterations of PIK3CA, CDKN2A, CDK4, FIP1L1, or FUBP1 collaborate with IDH mutations to negatively affect patients' survival in LGG. In LGG patients with TP53 mutations or IDH1/2 mutations, additional genomic alterations of EZH2, KMT2C, and CHD4, individually or in combination, were associated with markedly shorter disease-free survival than in patients without such alterations. Alterations of EZH2, KMT2C, and CHD4 at the genetic or protein level could perturb the epigenetic program, leading to malignant transformation in glioma. By reviewing current literature on both AML and glioma and performing bioinformatics analysis on available datasets, we developed a hypothetical model of tumorigenesis from premalignant stem cells to glioma.
Affiliation(s)
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
  - College of Computer and Information Science, Southwest University, Chongqing, China
- Ying Liu
  - The Vivian Smith Department of Neurosurgery, Center for Stem Cell and Regenerative Medicine, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Mengning Wang
  - Harvard Stem Cell Institute, Harvard University, Cambridge, MA, USA
- Zhenhai Wu
  - Department of Neurosurgery, ShouGuang People’s Hospital, Shandong, China
- Na Li
  - College of Computer and Information Science, Southwest University, Chongqing, China
- Jinsong Zhang
  - Pharmacological & Physiological Science, School of Medicine, Saint Louis University, St. Louis, MO, USA
- Chuanwei Yang
  - Breast Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
9
Newton Y, Novak AM, Swatloski T, McColl DC, Chopra S, Graim K, Weinstein AS, Baertsch R, Salama SR, Ellrott K, Chopra M, Goldstein TC, Haussler D, Morozova O, Stuart JM. TumorMap: Exploring the Molecular Similarities of Cancer Samples in an Interactive Portal. Cancer Res 2017; 77:e111-e114. [PMID: 29092953 DOI: 10.1158/0008-5472.can-17-0580] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 03/03/2017] [Revised: 06/14/2017] [Accepted: 08/07/2017] [Indexed: 01/15/2023]
Abstract
Vast amounts of molecular data are being collected on tumor samples, which provide unique opportunities for discovering trends within and between cancer subtypes. Such cross-cancer analyses require computational methods that enable intuitive and interactive browsing of thousands of samples based on their molecular similarity. We created a portal called TumorMap to assist in exploration and statistical interrogation of high-dimensional complex "omics" data in an interactive and easily interpretable way. In the TumorMap, samples are arranged on a hexagonal grid based on their similarity to one another in the original genomic space and are rendered with Google's Map technology. While the important feature of this public portal is the ability for the users to build maps from their own data, we pre-built genomic maps from several previously published projects. We demonstrate the utility of this portal by presenting results obtained from The Cancer Genome Atlas project data. Cancer Res; 77(21); e111-4. ©2017 AACR.
Affiliation(s)
- Yulia Newton
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Adam M Novak
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Teresa Swatloski
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Duncan C McColl
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Sahil Chopra
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California; Stanford University, Stanford, California
- Kiley Graim
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Alana S Weinstein
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Robert Baertsch
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Sofie R Salama
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Kyle Ellrott
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California; Oregon Health and Science University, Portland, Oregon
- Manu Chopra
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California; Pacific Collegiate School, Santa Cruz, California
- Theodore C Goldstein
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California; Hematology-oncology Department, University of California, San Francisco, California
- David Haussler
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Olena Morozova
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California
- Joshua M Stuart
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, California.
10
Zhang J, Zhang S. Discovery of cancer common and specific driver gene sets. Nucleic Acids Res 2017; 45:e86. [PMID: 28168295 PMCID: PMC5449640 DOI: 10.1093/nar/gkx089] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 12/07/2016] [Revised: 01/20/2017] [Accepted: 01/31/2017] [Indexed: 12/31/2022]
Abstract
Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found.
Affiliation(s)
- Junhua Zhang
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Shihua Zhang
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematics Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
11
Wu H, Gao L, Kasabov NK. Network-Based Method for Inferring Cancer Progression at the Pathway Level from Cross-Sectional Mutation Data. IEEE/ACM Trans Comput Biol Bioinform 2016; 13:1036-1044. [PMID: 26915128 DOI: 10.1109/tcbb.2016.2520934] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several temporally ordered samples from one patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a Network-based method (NetInf) to Infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with the addition of different levels of noise. To verify the performance of NetInf, we apply it to somatic mutation data from three real cancer studies with large numbers of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without assigning the number of pathways, and it also provides new insights into the temporal order of somatic mutations at the pathway level rather than at the gene level.
12
Wang J, Cazzato E, Ladewig E, Frattini V, Rosenbloom DIS, Zairis S, Abate F, Liu Z, Elliott O, Shin YJ, Lee JK, Lee IH, Park WY, Eoli M, Blumberg AJ, Lasorella A, Nam DH, Finocchiaro G, Iavarone A, Rabadan R. Clonal evolution of glioblastoma under therapy. Nat Genet 2016; 48:768-76. [PMID: 27270107 DOI: 10.1038/ng.3590] [Citation(s) in RCA: 520] [Impact Index Per Article: 65.0] [Received: 01/15/2016] [Accepted: 05/16/2016] [Indexed: 02/08/2023]
Abstract
Glioblastoma (GBM) is the most common and aggressive primary brain tumor. To better understand how GBM evolves, we analyzed longitudinal genomic and transcriptomic data from 114 patients. The analysis shows a highly branched evolutionary pattern in which 63% of patients experience expression-based subtype changes. The branching pattern, together with estimates of evolutionary rate, suggests that relapse-associated clones typically existed years before diagnosis. Fifteen percent of tumors present hypermutation at relapse in highly expressed genes, with a clear mutational signature. We find that 11% of recurrent tumors harbor mutations in LTBP4, which encodes a protein binding to TGF-β. Silencing LTBP4 in GBM cells leads to suppression of TGF-β activity and decreased cell proliferation. In recurrent GBM with wild-type IDH1, high LTBP4 expression is associated with worse prognosis, highlighting the TGF-β pathway as a potential therapeutic target in GBM.
Affiliation(s)
- Jiguang Wang
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Emanuela Cazzato
  - Fondazione IRCCS Istituto Neurologico Besta, Unit of Molecular Neuro-Oncology, Milan, Italy
- Erik Ladewig
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Veronique Frattini
  - Institute for Cancer Genetics, Columbia University, New York, New York, USA
- Daniel I S Rosenbloom
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Sakellarios Zairis
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Francesco Abate
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Zhaoqi Liu
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Oliver Elliott
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Yong-Jae Shin
  - Department of Neurosurgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jin-Ku Lee
  - Department of Neurosurgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- In-Hee Lee
  - Department of Neurosurgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Woong-Yang Park
  - Samsung Genome Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Marica Eoli
  - Fondazione IRCCS Istituto Neurologico Besta, Unit of Molecular Neuro-Oncology, Milan, Italy
- Anna Lasorella
  - Institute for Cancer Genetics, Columbia University, New York, New York, USA; Department of Pediatrics, Columbia University, New York, New York, USA; Department of Pathology, Columbia University, New York, New York, USA
- Do-Hyun Nam
  - Department of Neurosurgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea; Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Gaetano Finocchiaro
  - Fondazione IRCCS Istituto Neurologico Besta, Unit of Molecular Neuro-Oncology, Milan, Italy
- Antonio Iavarone
  - Institute for Cancer Genetics, Columbia University, New York, New York, USA; Department of Pathology, Columbia University, New York, New York, USA; Department of Neurology, Columbia University, New York, New York, USA
- Raul Rabadan
  - Department of Systems Biology, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
|
13
|
The application of information theory for the research of aging and aging-related diseases. Prog Neurobiol 2016; 157:158-173. [PMID: 27004830 DOI: 10.1016/j.pneurobio.2016.03.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/18/2015] [Revised: 03/13/2016] [Accepted: 03/19/2016] [Indexed: 11/23/2022]
Abstract
This article reviews the application of information-theoretical analysis, employing measures of entropy and mutual information, for the study of aging and aging-related diseases. The research of aging and aging-related diseases is particularly suitable for the application of information theory methods, as aging processes and related diseases are multi-parametric, with continuous parameters coexisting alongside discrete parameters, and with the relations between the parameters being as a rule non-linear. Information theory provides unique analytical capabilities for the solution of such problems, with unique advantages over common linear biostatistics. Among the age-related diseases, information theory has been used in the study of neurodegenerative diseases (particularly using EEG time series for diagnosis and prediction), cancer (particularly for establishing individual and combined cancer biomarkers), diabetes (mainly utilizing mutual information to characterize the diseased and aging states), and heart disease (mainly for the analysis of heart rate variability). Few works have employed information theory for the analysis of general aging processes and frailty, as underlying determinants and possible early preclinical diagnostic measures for aging-related diseases. Generally, the use of information-theoretical analysis permits not only establishing the (non-linear) correlations between diagnostic or therapeutic parameters of interest, but may also provide a theoretical insight into the nature of aging and related diseases by establishing the measures of variability, adaptation, regulation or homeostasis, within a system of interest. It may be hoped that the increased use of such measures in research may considerably increase diagnostic and therapeutic capabilities and the fundamental theoretical mathematical understanding of aging and disease.
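As a minimal sketch of the two measures this abstract names, the following estimates Shannon entropy and mutual information from discrete samples using plug-in frequency counts. The function names and the toy data are illustrative assumptions, not taken from the reviewed article. The example is chosen so that the linear covariance between the two variables is zero while the mutual information is clearly positive, which is the kind of non-linear relation the abstract refers to.

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy H(X) in bits, estimated from observed frequencies."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def covariance(xs, ys):
    """Population covariance, used here as a simple linear-association measure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n

# Hypothetical toy data: y is fully determined by x, but non-linearly (y = x^2).
x = [-2, -1, 0, 1, 2] * 20
y = [v * v for v in x]

mi = mutual_information(x, y)   # equals H(Y) here, because y is a function of x
cov = covariance(x, y)          # zero: no linear association is detected

print(f"mutual information: {mi:.4f} bits, covariance: {cov:.4f}")
```

Plug-in estimates like these are biased for small samples and need binning for continuous parameters, so this is only meant to show the definitions at work.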
|