1
Rossi N, Gigante N, Vitacolonna N, Piazza C. Inferring Markov Chains to Describe Convergent Tumor Evolution With CIMICE. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:106-119. [PMID: 38015671 DOI: 10.1109/tcbb.2023.3337258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
The field of tumor phylogenetics focuses on studying the differences within cancer cell populations. Considerable effort within the scientific community goes into building cancer progression models that try to explain the heterogeneity of such diseases. These models are highly dependent on the kind of data used for their construction; therefore, as the experimental technologies evolve, it is of major importance to exploit their peculiarities. In this work we describe a cancer progression model based on Single Cell DNA Sequencing data. When constructing the model, we focus on tailoring the formalism to the specificity of the data. We operate by defining a minimal set of assumptions needed to reconstruct a flexible DAG structured model, capable of identifying progression beyond the limitation of the infinite site assumption. Our proposal is conservative in the sense that we aim to neither discard nor infer knowledge which is not represented in the data. We provide simulations and analytical results to show the features of our model, test it on real data, and show how it can be integrated with other approaches to cope with input noise. Moreover, our framework can be exploited to produce simulated data that follows our theoretical assumptions. Finally, we provide an open source R implementation of our approach, called CIMICE, that is publicly available on BioConductor.
2
Alfaro-Murillo JA, Townsend JP. Pairwise and higher-order epistatic effects among somatic cancer mutations across oncogenesis. Math Biosci 2023; 366:109091. [PMID: 37996064 PMCID: PMC10847963 DOI: 10.1016/j.mbs.2023.109091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 09/21/2023] [Accepted: 10/20/2023] [Indexed: 11/25/2023]
Abstract
Cancer occurs as a consequence of multiple somatic mutations that lead to uncontrolled cell growth. Mutual exclusivity and co-occurrence of mutations imply, but do not prove, that mutations exert synergistic or antagonistic epistatic effects on oncogenesis. Knowledge of these interactions, and the consequent trajectories of mutation and selection that lead to cancer, has been a longstanding goal within the cancer research community. Recent research has revealed mutation rates and scaled selection coefficients for specific recurrent variants across many cancer types. However, there are no current methods to quantify the strength of selection incorporating pairwise and higher-order epistatic effects on selection within the trajectory of likely cancer genotypes. Therefore, we have developed a continuous-time Markov chain model that enables the estimation of mutation origination and fixation (flux), dependent on somatic cancer genotype. Coupling this continuous-time Markov chain model with a deconvolution approach provides estimates of underlying mutation rates and selection across the trajectory of oncogenesis. We demonstrate computation of fluxes and selection coefficients in a somatic evolutionary model for the four most frequently variant driver genes (TP53, LRP1B, KRAS and STK11) from 565 cases of lung adenocarcinoma. Our analysis reveals multiple antagonistic epistatic effects that reduce the possible routes of oncogenesis, and inform cancer research regarding viable trajectories of somatic evolution whose progression could be forestalled by precision medicine. Synergistic epistatic effects are also identified, most notably in the somatic genotype TP53 LRP1B for mutations in the KRAS gene, and in somatic genotypes containing KRAS or TP53 mutations for mutations in the STK11 gene. Large positive fluxes of KRAS variants were driven by large selection coefficients, whereas the flux toward LRP1B mutations was substantially aided by a large mutation rate for this gene. The approach enables inference of the most likely routes of site-specific variant evolution and estimation of the strength of selection operating on each step along the route, a key component of what we need to know to develop and implement personalized cancer therapies.
Affiliation(s)
- Jorge A Alfaro-Murillo
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States of America
- Jeffrey P Townsend
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States of America; Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, United States of America; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States of America.
3
Chen J. Timed hazard networks: Incorporating temporal difference for oncogenetic analysis. PLoS One 2023; 18:e0283004. [PMID: 36928529 PMCID: PMC10019724 DOI: 10.1371/journal.pone.0283004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 03/01/2023] [Indexed: 03/18/2023] Open
Abstract
Oncogenetic graphical models are crucial for understanding cancer progression by analyzing the accumulation of genetic events. These models are used to identify statistical dependencies and temporal order of genetic events, which helps design targeted therapies. However, existing algorithms do not account for temporal differences between samples in oncogenetic analysis. This paper introduces Timed Hazard Networks (TimedHN), a new statistical model that uses temporal differences to improve accuracy and reliability. TimedHN models the accumulation process as a continuous-time Markov chain and includes an efficient gradient computation algorithm for optimization. Our simulation experiments demonstrate that TimedHN outperforms current state-of-the-art graph reconstruction methods. We also compare TimedHN with existing methods on a luminal breast cancer dataset, highlighting its potential utility. The Matlab implementation and data are available at https://github.com/puar-playground/TimedHN.
Affiliation(s)
- Jian Chen
  - Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, United States of America
4
Diaz-Colunga J, Diaz-Uriarte R. Conditional prediction of consecutive tumor evolution using cancer progression models: What genotype comes next? PLoS Comput Biol 2021; 17:e1009055. [PMID: 34932572 PMCID: PMC8730404 DOI: 10.1371/journal.pcbi.1009055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2021] [Revised: 01/05/2022] [Accepted: 11/25/2021] [Indexed: 12/13/2022] Open
Abstract
Accurate prediction of tumor progression is key for adaptive therapy and precision medicine. Cancer progression models (CPMs) can be used to infer dependencies in mutation accumulation from cross-sectional data and provide predictions of tumor progression paths. However, their performance when predicting complete evolutionary trajectories is limited by violations of assumptions and the size of available data sets. Instead of predicting full tumor progression paths, here we focus on short-term predictions, more relevant for diagnostic and therapeutic purposes. We examine whether five distinct CPMs can be used to answer the question "Given that a genotype with n mutations has been observed, what genotype with n + 1 mutations is next in the path of tumor progression?" or, shortly, "What genotype comes next?". Using simulated data we find that under specific combinations of genotype and fitness landscape characteristics CPMs can provide predictions of short-term evolution that closely match the true probabilities, and that some genotype characteristics can be much more relevant than global features. Application of these methods to 25 cancer data sets shows that their use is hampered by a lack of information needed to make principled decisions about method choice. Fruitful use of these methods for short-term predictions requires adapting method's use to local genotype characteristics and obtaining reliable indicators of performance; it will also be necessary to clarify the interpretation of the method's results when key assumptions do not hold.
Affiliation(s)
- Juan Diaz-Colunga
  - Department of Biochemistry, School of Medicine, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas ‘Alberto Sols’ (UAM-CSIC), Madrid, Spain
  - Department of Ecology & Evolutionary Biology and Microbial Sciences Institute, Yale University, New Haven, Connecticut, United States of America
- Ramon Diaz-Uriarte
  - Department of Biochemistry, School of Medicine, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas ‘Alberto Sols’ (UAM-CSIC), Madrid, Spain
5
Li L, Shao M, He X, Ren S, Tian T. Risk of lung cancer due to external environmental factor and epidemiological data analysis. Mathematical Biosciences and Engineering: MBE 2021; 18:6079-6094. [PMID: 34517524 DOI: 10.3934/mbe.2021304] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 04/30/2023]
Abstract
Lung cancer is the cancer with the fastest-growing incidence and mortality worldwide and is an extremely serious threat to human life and health. Evidence reveals that external environmental factors, such as smoking and radiation exposure, are the key drivers of lung cancer. Therefore, it is urgent to explain, experimentally and theoretically, the mechanism of lung cancer risk due to external environmental factors. However, how external environmental factors affect lung cancer risk is still an open issue. In this paper, we summarize the main mathematical models involving gene mutations for cancers, and review the application of these models to analyze the mechanism of lung cancer and the risk of lung cancer due to external environmental exposure. In addition, we apply the model described and epidemiological data to analyze the influence of external environmental factors on lung cancer risk. The result indicates that radiation can cause a significant increase in the mutation rate of cells, in particular the mutation in the stability gene that leads to genomic instability. These studies not only offer insights into the relationship between external environmental factors and human lung cancer risk, but also provide theoretical guidance for the prevention and control of lung cancer.
Affiliation(s)
- Lingling Li
  - School of Science, Xi'an Polytechnic University, Xi'an 710048, China
- Mengyao Shao
  - School of Science, Xi'an Polytechnic University, Xi'an 710048, China
- Xingshi He
  - School of Science, Xi'an Polytechnic University, Xi'an 710048, China
- Shanjing Ren
  - School of Mathematics and Big Data, GuiZhou Education University, Guiyang 550018, China
- Tianhai Tian
  - School of Mathematical Science, Monash University, Melbourne Vic 3800, Australia
6
Schill R, Solbrig S, Wettig T, Spang R. Modelling cancer progression using Mutual Hazard Networks. Bioinformatics 2020; 36:241-249. [PMID: 31250881 PMCID: PMC6956791 DOI: 10.1093/bioinformatics/btz513] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/12/2018] [Revised: 03/29/2019] [Accepted: 06/25/2019] [Indexed: 12/26/2022] Open
Abstract
MOTIVATION: Cancer progresses by accumulating genomic events, such as mutations and copy number alterations, whose chronological order is key to understanding the disease but difficult to observe. Instead, cancer progression models use co-occurrence patterns in cross-sectional data to infer epistatic interactions between events and thereby uncover their most likely order of occurrence. State-of-the-art progression models, however, are limited by mathematical tractability and only allow events to interact in directed acyclic graphs, to promote but not inhibit subsequent events, or to be mutually exclusive in distinct groups that cannot overlap.
RESULTS: Here we propose Mutual Hazard Networks (MHN), a new Machine Learning algorithm to infer cyclic progression models from cross-sectional data. MHN model events by their spontaneous rate of fixation and by multiplicative effects they exert on the rates of successive events. MHN compared favourably to acyclic models in cross-validated model fit on four datasets tested. In application to the glioblastoma dataset from The Cancer Genome Atlas, MHN proposed a novel interaction in line with consecutive biopsies: IDH1 mutations are early events that promote subsequent fixation of TP53 mutations.
AVAILABILITY AND IMPLEMENTATION: Implementation and data are available at https://github.com/RudiSchill/MHN.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rudolf Schill
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, Regensburg 93040, Germany
- Stefan Solbrig
  - Department of Physics, University of Regensburg, Regensburg 93040, Germany
- Tilo Wettig
  - Department of Physics, University of Regensburg, Regensburg 93040, Germany
- Rainer Spang
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, Regensburg 93040, Germany
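The rate structure described in the Mutual Hazard Networks abstract above (each event has a spontaneous rate, and events already present multiply the rates of later events) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the matrix layout (`theta[i][i]` as the spontaneous rate of event `i`, `theta[i][j]` as the multiplicative effect of event `j` on event `i`), the function name `event_rate`, and the numbers in `EXAMPLE_THETA` are all hypothetical and are not taken from the MHN repository.

```python
from typing import Sequence

# Hypothetical 2-event parameter matrix (illustrative numbers only).
# Diagonal: assumed spontaneous rates. Off-diagonal theta[i][j]: assumed
# multiplicative effect of event j, once present, on the rate of event i.
EXAMPLE_THETA = [
    [1.0, 2.0],  # event 0: base rate 1.0, doubled if event 1 is present
    [0.5, 3.0],  # event 1: base rate 3.0, halved if event 0 is present
]


def event_rate(theta: Sequence[Sequence[float]],
               state: Sequence[int],
               i: int) -> float:
    """Rate at which event i occurs from the genotype `state`.

    `state[j]` is 1 if event j has already occurred and 0 otherwise.
    An event that is already present cannot occur again, so its rate is 0.
    """
    if state[i]:
        return 0.0
    rate = theta[i][i]  # spontaneous rate of event i
    for j, present in enumerate(state):
        if present and j != i:
            rate *= theta[i][j]  # effect of an earlier event j on event i
    return rate
```

In this sketch a factor above 1 promotes a later event and a factor below 1 inhibits it, so both kinds of interaction are represented by one matrix, without an acyclicity constraint.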
7
Mazaya M, Trinh HC, Kwon YK. Effects of ordered mutations on dynamics in signaling networks. BMC Med Genomics 2020; 13:13. [PMID: 32075651 PMCID: PMC7032007 DOI: 10.1186/s12920-019-0651-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/06/2019] [Accepted: 12/19/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Many previous clinical studies have found that accumulated sequential mutations are statistically related to tumorigenesis. However, they are limited in fully elucidating the significance of the ordered-mutation because they did not focus on the network dynamics. Therefore, there is a pressing need to investigate the dynamics characteristics induced by ordered-mutations.
METHODS: To quantify the ordered-mutation-inducing dynamics, we defined the mutation-sensitivity and the order-specificity that represent if the network is sensitive against a double knockout mutation and if mutation-sensitivity is specific to the mutation order, respectively, using a Boolean network model.
RESULTS: Through intensive investigations, we found that a signaling network is more sensitive when a double-mutation occurs in the direction order inducing a longer path and a smaller number of paths than in the reverse order. In addition, feedback loops involving a gene pair decreased both the mutation-sensitivity and the order-specificity. Next, we investigated relationships of functionally important genes with ordered-mutation-inducing dynamics. The network is more sensitive to mutations subject to drug-targets, whereas it is less specific to the mutation order. Both the sensitivity and specificity are increased when different-drug-targeted genes are mutated. Further, we found that tumor suppressors can efficiently suppress the amplification of oncogenes when the former are mutated earlier than the latter.
CONCLUSION: Taken together, our results help to understand the importance of the order of mutations with respect to the dynamical effects in complex biological systems.
Affiliation(s)
- Maulida Mazaya
  - School of IT Convergence, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan, 44610, Republic of Korea
- Hung-Cuong Trinh
  - Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
- Yung-Keun Kwon
  - School of IT Convergence, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan, 44610, Republic of Korea.
8
Diaz-Uriarte R, Vasallo C. Every which way? On predicting tumor evolution using cancer progression models. PLoS Comput Biol 2019; 15:e1007246. [PMID: 31374072 PMCID: PMC6693785 DOI: 10.1371/journal.pcbi.1007246] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/28/2018] [Revised: 08/14/2019] [Accepted: 07/05/2019] [Indexed: 11/18/2022] Open
Abstract
Successful prediction of the likely paths of tumor progression is valuable for diagnostic, prognostic, and treatment purposes. Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations and thus CPMs encode the paths of tumor progression. Here we analyze the performance of four CPMs to examine whether they can be used to predict the true distribution of paths of tumor progression and to estimate evolutionary unpredictability. Employing simulations we show that if fitness landscapes are single peaked (have a single fitness maximum) there is good agreement between true and predicted distributions of paths of tumor progression when sample sizes are large, but performance is poor with the currently common much smaller sample sizes. Under multi-peaked fitness landscapes (i.e., those with multiple fitness maxima), performance is poor and improves only slightly with sample size. In all cases, detection regime (when tumors are sampled) is a key determinant of performance. Estimates of evolutionary unpredictability from the best performing CPM, among the four examined, tend to overestimate the true unpredictability and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of twenty-two cancer data sets shows low evolutionary unpredictability for several of the data sets. But most of the predictions of paths of tumor progression are very unreliable, and unreliability increases with the number of features analyzed. Our results indicate that CPMs could be valuable tools for predicting cancer progression but that, currently, obtaining useful predictions of paths of tumor progression from CPMs is dubious, and emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer.
Affiliation(s)
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
- Claudia Vasallo
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
9
Khakabimamaghani S, Malikic S, Tang J, Ding D, Morin R, Chindelevitch L, Ester M. Collaborative intra-tumor heterogeneity detection. Bioinformatics 2019; 35:i379-i388. [PMID: 31510674 PMCID: PMC6612880 DOI: 10.1093/bioinformatics/btz355] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/11/2023] Open
Abstract
MOTIVATION: Despite the remarkable advances in sequencing and computational techniques, noise in the data and complexity of the underlying biological mechanisms render deconvolution of the phylogenetic relationships between cancer mutations difficult. Besides that, the majority of the existing datasets consist of bulk sequencing data of single tumor sample of an individual. Accurate inference of the phylogenetic order of mutations is particularly challenging in these cases and the existing methods are faced with several theoretical limitations. To overcome these limitations, new methods are required for integrating and harnessing the full potential of the existing data.
RESULTS: We introduce a method called Hintra for intra-tumor heterogeneity detection. Hintra integrates sequencing data for a cohort of tumors and infers tumor phylogeny for each individual based on the evolutionary information shared between different tumors. Through an iterative process, Hintra learns the repeating evolutionary patterns and uses this information for resolving the phylogenetic ambiguities of individual tumors. The results of synthetic experiments show an improved performance compared to two state-of-the-art methods. The experimental results with a recent Breast Cancer dataset are consistent with the existing knowledge and provide potentially interesting findings.
AVAILABILITY AND IMPLEMENTATION: The source code for Hintra is available at https://github.com/sahandk/HINTRA.
Affiliation(s)
- Salem Malikic
  - School of Computing Science, Simon Fraser University, Burnaby, BC
- Jeffrey Tang
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC
- Dujian Ding
  - School of Computing Science, Simon Fraser University, Burnaby, BC
- Ryan Morin
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC
- Martin Ester
  - School of Computing Science, Simon Fraser University, Burnaby, BC
  - Vancouver Prostate Centre, Vancouver, BC, Canada
10
Abstract
Large-scale genomic data highlight the complexity and diversity of the molecular changes that drive cancer progression. Statistical analysis of cancer data from different tissues can guide drug repositioning as well as the design of targeted treatments. Here, we develop an improved Bayesian network model for tumour mutational profiles and apply it to 8198 patient samples across 22 cancer types from TCGA. For each cancer type, we identify the interactions between mutated genes, capturing signatures beyond mere mutational frequencies. When comparing mutation networks, we find genes which interact both within and across cancer types. To detach cancer classification from the tissue type we perform de novo clustering of the pancancer mutational profiles based on the Bayesian network models. We find 22 novel clusters which significantly improve survival prediction beyond clinical information. The models highlight key gene interactions for each cluster potentially allowing genomic stratification for clinical trials and identifying drug targets.
Tumour heterogeneity hinders translation of large-scale genomic data into the clinic. Here the authors develop a method for the stratification of cancer patients based on the molecular gene status, including genetic interactions, rather than clinico-histological data, and apply it to TCGA data for over 8000 cases across 22 cancer types.
11
Diaz-Uriarte R. Cancer progression models and fitness landscapes: a many-to-many relationship. Bioinformatics 2018; 34:836-844. [PMID: 29048486 PMCID: PMC6031050 DOI: 10.1093/bioinformatics/btx663] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/04/2017] [Accepted: 10/17/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation: The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to identify these constraints, and return Directed Acyclic Graphs (DAGs) of restrictions where arrows indicate dependencies or constraints. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes (e.g. those with reciprocal sign epistasis) cannot be represented by CPMs.
Results: Using simulated data under 500 fitness landscapes, I show that CPMs' performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime and fitness landscape features, in ways that depend on CPM method. Using three cancer datasets, I show that these problems strongly affect the analysis of empirical data: fitness landscapes that are widely different from each other produce data similar to the empirically observed ones and lead to DAGs that infer very different restrictions. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs.
Availability and implementation: Code available from Supplementary Material.
Contact: ramon.diaz@iib.uam.es.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid 28029, Spain
12
Wilkins JF, Cannataro VL, Shuch B, Townsend JP. Analysis of mutation, selection, and epistasis: an informed approach to cancer clinical trials. Oncotarget 2018; 9:22243-22253. [PMID: 29854275 PMCID: PMC5976461 DOI: 10.18632/oncotarget.25155] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/19/2018] [Accepted: 04/02/2018] [Indexed: 12/30/2022] Open
Abstract
Currently, drug development efforts and clinical trials to test them are often prioritized by targeting genes with high frequencies of somatic variants among tumors. However, differences in oncogenic mutation rate, not necessarily the effect the variant has on tumor growth, contribute enormously to somatic variant frequency. We argue that decoupling the contributions of mutation and cancer lineage selection to the frequency of somatic variants among tumors is critical to understanding, and predicting, the therapeutic potential of different interventions. To provide an indicator of that strength of selection and therapeutic potential, the frequency at which we observe a given variant across patients must be modulated by our expectation given the mutation rate and target size. Additionally, antagonistic and synergistic epistasis among mutations also impacts the potential therapeutic benefit of targeted drug development. Quantitative approaches should be fostered that use the known genetic architectures of cancer types, decouple mutation rate, and provide rigorous guidance regarding investment in targeted drug development. By integrating evolutionary principles and detailed mechanistic knowledge into those approaches, we can maximize our ability to identify those targeted therapies most likely to yield substantial clinical benefit.
Affiliation(s)
- Brian Shuch
  - Department of Urology, Yale School of Medicine, New Haven, CT, USA
  - Department of Radiology, Yale School of Medicine, New Haven, CT, USA
- Jeffrey P. Townsend
  - Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT, USA
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
13
Pang S, Sun Y, Wu L, Yang L, Zhao YL, Wang Z, Li Y. Reconstruction of kidney renal clear cell carcinoma evolution across pathological stages. Sci Rep 2018; 8:3339. [PMID: 29463849 PMCID: PMC5820260 DOI: 10.1038/s41598-018-20321-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/09/2017] [Accepted: 01/16/2018] [Indexed: 01/02/2023] Open
Abstract
Although numerous studies on kidney renal clear cell carcinoma (KIRC) have been carried out, the dynamic process of tumor formation is not yet clear. Inadequate attention has been paid to the evolutionary paths among somatic mutations and their clinical implications. As the tumor initiation and evolution of KIRC were primarily associated with SNVs, we reconstructed an evolutionary process of KIRC using cross-sectional SNVs in different pathological stages. KIRC driver genes appeared early in the evolutionary tree, and the genes with moderate mutation frequency showed a pattern of stage-by-stage expansion. Although the individual gene mutations were not necessarily associated with survival outcome, evolutionary paths such as VHL-PBRM1 and FMN2-PCLO could indicate stage-specific prognosis. Our results suggest that, besides mutation frequency, the evolutionary relationship among the mutated genes could help identify novel drivers and biomarkers for clinical utility.
Collapse
Affiliation(s)
- Shichao Pang
- Department of Statistics, School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yidi Sun
- Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P.R. China
- CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 YueYang Road, Shanghai, 200031, China
- University of Chinese Academy of Sciences, Shanghai, 200031, China
| | - Leilei Wu
- Department of Bioinformatics and Biostatistics, MOE LSB and LSC, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Liguang Yang
- Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P.R. China
- University of Chinese Academy of Sciences, Shanghai, 200031, China
| | - Yi-Lei Zhao
- Department of Bioinformatics and Biostatistics, MOE LSB and LSC, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
| | - Zhen Wang
- Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P.R. China.
| | - Yixue Li
- Department of Bioinformatics and Biostatistics, MOE LSB and LSC, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P.R. China.
- CAS Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 YueYang Road, Shanghai, 200031, China.
- University of Chinese Academy of Sciences, Shanghai, 200031, China.
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P.R. China.
- Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, P.R. China.
|
14
|
Abstract
Rapid advances in high-throughput sequencing and a growing realization of the importance of evolutionary theory to cancer genomics have led to a proliferation of phylogenetic studies of tumour progression. These studies have yielded not only new insights but also a plethora of experimental approaches, sometimes reaching conflicting or poorly supported conclusions. Here, we consider this body of work in light of the key computational principles underpinning phylogenetic inference, with the goal of providing practical guidance on the design and analysis of scientifically rigorous tumour phylogeny studies. We survey the range of methods and tools available to the researcher, their key applications, and the various unsolved problems, closing with a perspective on the prospects and broader implications of this field.
Affiliation(s)
- Russell Schwartz
- Department of Biological Sciences and Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15217, USA
| | - Alejandro A Schäffer
- Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20892, USA
|
15
|
Li X, Huang H, Guan Y, Gong Y, He CY, Yi X, Qi M, Chen ZY. Whole-exome sequencing predicted cancer epitope trees of 23 early cervical cancers in Chinese women. Cancer Med 2016; 6:207-219. [PMID: 27998038 PMCID: PMC5269563 DOI: 10.1002/cam4.953] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/01/2016] [Revised: 10/05/2016] [Accepted: 10/07/2016] [Indexed: 12/18/2022]
Abstract
Emerging evidence suggests that the heterogeneity of cancer limits the efficacy of immunotherapy. To search for optimal therapeutic targets for enhancing this efficacy, we used whole‐exome sequencing data of 23 early cervical tumors from Chinese women to investigate the hierarchical structure of the somatic mutations and the neo‐epitopes. The putative neo‐epitopes were predicted based on the mutant peptides’ strong binding with major histocompatibility complex class I molecules. We found that each tumor carried an average of 117 mutations and 61 putative neo‐epitopes. Each patient displayed a unique phylogenetic tree in which almost all subclones harbored neo‐epitopes, highlighting the importance of the individual neo‐epitope tree in determining immunotherapeutic targets. The alterations in FBXW7 and PIK3CA, or other members of the significantly altered ubiquitin‐mediated proteolysis and extracellular matrix receptor interaction related pathways, were proposed as the earliest changes triggering the malignant progression. The neo‐epitopes involved in these pathways, and located at the top of the hierarchy tree, might become the optimal candidates for therapeutic targets, possessing the potential to mediate T‐cell killing of the descendant cells. These findings expand our understanding of the early stage of cervical carcinogenesis and offer an important approach to optimizing immunotherapeutic target selection.
Affiliation(s)
- Xia Li
- The Laboratory for Gene and Cell Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Research Center for Biomedical Information Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Hailiang Huang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts; Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | | | - Yuhua Gong
- Geneplus-Beijing, Beijing, 102206, China
| | - Cheng-Yi He
- The Laboratory for Gene and Cell Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Xin Yi
- Geneplus-Beijing, Beijing, 102206, China
| | - Ming Qi
- BGI-Shenzhen, Shenzhen, China; School of Basic Medical Sciences, Center for Genetic and Genomic Medicine, Zhejiang University Medical School 1st Affiliated Hospital James Watson Institute of Genome Sciences, Hangzhou, Zhejiang, China; Department of Pathology and Laboratory Medicine, University of Rochester Medical Center, Rochester, New York
| | - Zhi-Ying Chen
- The Laboratory for Gene and Cell Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
|
16
|
Zhang H, Deng Y, Zhang Y, Ping Y, Zhao H, Pang L, Zhang X, Wang L, Xu C, Xiao Y, Li X. Cooperative genomic alteration network reveals molecular classification across 12 major cancer types. Nucleic Acids Res 2016; 45:567-582. [PMID: 27899621 PMCID: PMC5314758 DOI: 10.1093/nar/gkw1087] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/28/2016] [Revised: 10/18/2016] [Accepted: 10/27/2016] [Indexed: 11/22/2022]
Abstract
The accumulation of somatic genomic alterations that enables cells to gradually acquire a growth advantage contributes to tumor development. This implies that cooperative genomic alterations are widespread in the accumulation process. Here, we propose a computational method, HCOC, that simultaneously considers genetic context and downstream functional effects on cancer hallmarks to uncover somatic cooperative events in human cancers. Applying our method to 12 TCGA cancer types, we identified a total of 1199 cooperative events with high heterogeneity across human cancers, and then constructed a pan-cancer cooperative alteration network. These cooperative events are associated with genomic alterations of some high-confidence cancer drivers, and can trigger the dysfunction of hallmark-associated pathways in a co-defect manner rather than through single alterations. We found that these cooperative events can be used to produce a prognostic classification that provides information complementary to tissue of origin. In a further case study of glioblastoma, using the 23 cooperative events identified, we stratified patients into molecularly relevant subtypes with a prognostic significance independent of the Glioma-CpG Island Methylator Phenotype (GCIMP). In summary, our method can be effectively used to discover cancer-driving cooperative events that can be valuable clinical markers for patient stratification.
Affiliation(s)
- Hongyi Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Yulan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Yong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Yanyan Ping
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Hongying Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Lin Pang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Li Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Chaohan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China
|
17
|
Algorithmic methods to infer the evolutionary trajectories in cancer progression. Proc Natl Acad Sci U S A 2016; 113:E4025-34. [PMID: 27357673 DOI: 10.1073/pnas.1520213113] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Indexed: 12/13/2022]
Abstract
The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the "selective advantage" relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc's ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses.
|
18
|
Li X. Dynamic changes of driver genes' mutations across clinical stages in nine cancer types. Cancer Med 2016; 5:1556-65. [PMID: 26992457 PMCID: PMC4944883 DOI: 10.1002/cam4.704] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/10/2016] [Revised: 02/14/2016] [Accepted: 02/23/2016] [Indexed: 12/31/2022]
Abstract
Driver genes play critical roles in tumorigenesis, and the number of identified driver genes has reached a plateau. However, how they act during different stages of cancer development is poorly understood. We investigated 138 driver genes’ mutation changes across clinical stages using 3,477 cases in nine cancer types from The Cancer Genome Atlas (TCGA) and constructed their temporal order relationships. We also examined the codon changes for the widely mutated TP53 and PIK3CA across tumor stages. Combinations of one to three driver genes specifically dominated in each cancer. Across the clinical stages, we categorized three patterns in the behavior of driver genes’ mutation changes in the nine cancer types: mutations recurring in all stages and triggering other mutations; certain mutations being lost while other mutations emerged; and mutations dominating across all stages while other mutations gradually appeared or disappeared. We observed that different codon changes dominated in different stages, and found that mutations recurring in hotspot regions of the coding sequence may be the core factor in driver genes’ tumorigenesis. Our results highlight the dynamic changes of oncogenic roles across clinical stages and suggest that diagnostic decision making should differ according to the clinical stage of the patient.
Affiliation(s)
- Xia Li
- Laboratory for Gene and Cell Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, China
|
19
|
Ramazzotti D, Caravagna G, Olde Loohuis L, Graudenzi A, Korsunsky I, Mauri G, Antoniotti M, Mishra B. CAPRI: efficient inference of cancer progression models from cross-sectional data. Bioinformatics 2015; 31:3016-26. [DOI: 10.1093/bioinformatics/btv296] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 01/19/2015] [Accepted: 05/04/2015] [Indexed: 12/27/2022]
|