1
Luo XG, Kuipers J, Beerenwinkel N. Joint inference of exclusivity patterns and recurrent trajectories from tumor mutation trees. Nat Commun 2023; 14:3676. [PMID: 37344522 DOI: 10.1038/s41467-023-39400-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 06/12/2023] [Indexed: 06/23/2023]
Abstract
Cancer progression is an evolutionary process shaped by both deterministic and stochastic forces. Multi-region and single-cell sequencing of tumors enable high-resolution reconstruction of the mutational history of each tumor and highlight the extensive diversity across tumors and patients. Resolving the interactions among mutations and recovering recurrent evolutionary processes may offer greater opportunities for successful therapeutic strategies. To this end, we present a novel probabilistic framework, called TreeMHN, for the joint inference of exclusivity patterns and recurrent trajectories from a cohort of intra-tumor phylogenetic trees. Through simulations, we show that TreeMHN outperforms existing alternatives that can only focus on one aspect of the task. By analyzing datasets of blood, lung, and breast cancers, we find the most likely evolutionary trajectories and mutational patterns, consistent with and enriching our current understanding of tumorigenesis. Moreover, TreeMHN facilitates the prediction of tumor evolution and provides probabilistic measures on the next mutational events given a tumor tree, a prerequisite for evolution-guided treatment strategies.
Affiliation(s)
- Xiang Ge Luo
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Mattenstrasse 26, 4058, Basel, Switzerland
2
Georg P, Grasedyck L, Klever M, Schill R, Spang R, Wettig T. Low-rank tensor methods for Markov chains with applications to tumor progression models. J Math Biol 2023; 86:7. [PMID: 36460900 PMCID: PMC9718722 DOI: 10.1007/s00285-022-01846-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Revised: 09/19/2022] [Accepted: 11/22/2022] [Indexed: 12/05/2022]
Abstract
Cancer progression can be described by continuous-time Markov chains whose state space grows exponentially in the number of somatic mutations. The age of a tumor at diagnosis is typically unknown. Therefore, the quantity of interest is the time-marginal distribution over all possible genotypes of tumors, defined as the transient distribution integrated over an exponentially distributed observation time. It can be obtained as the solution of a large linear system. However, the sheer size of this system renders classical solvers infeasible. We consider Markov chains whose transition rates are separable functions, allowing for an efficient low-rank tensor representation of the linear system's operator. Thus we can reduce the computational complexity from exponential to linear. We derive a convergent iterative method using low-rank formats whose result satisfies the normalization constraint of a distribution. We also perform numerical experiments illustrating that the marginal distribution is well approximated with low rank.
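The linear system mentioned in this abstract can be written out under a common convention. As a sketch, assume the generator $Q$ acts on column probability vectors (each column sums to zero), the chain starts from a distribution $p_0$ (for example, all mass on the mutation-free genotype), and the observation time is $T \sim \mathrm{Exp}(1)$. The unit rate and all symbols here are assumptions for illustration and are not taken from the abstract.

```latex
\begin{aligned}
p &= \int_0^\infty e^{-t}\, e^{tQ}\, p_0 \,\mathrm{d}t
   = \left( \int_0^\infty e^{-t (I - Q)} \,\mathrm{d}t \right) p_0
   = (I - Q)^{-1} p_0 , \\
&\Longleftrightarrow \quad (I - Q)\, p = p_0 , \\
\mathbf{1}^{\top} p &= \mathbf{1}^{\top} (I - Q)\, p = \mathbf{1}^{\top} p_0 = 1
\quad \text{since } \mathbf{1}^{\top} Q = 0 .
\end{aligned}
```

With $n$ binary mutation events the vector $p$ has $2^n$ entries, which is why a dense solve becomes infeasible and a structured (low-rank) representation of $I - Q$ is attractive.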
Affiliation(s)
- Peter Georg
  - Department of Physics, University of Regensburg, 93040 Regensburg, Germany
- Lars Grasedyck
  - Institute for Geometry and Applied Mathematics, RWTH Aachen University, 52062 Aachen, Germany
- Maren Klever
  - Institute for Geometry and Applied Mathematics, RWTH Aachen University, 52062 Aachen, Germany
- Rudolf Schill
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, 93040 Regensburg, Germany
- Rainer Spang
  - Department of Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg, 93040 Regensburg, Germany
- Tilo Wettig
  - Department of Physics, University of Regensburg, 93040 Regensburg, Germany
3
ToMExO: A probabilistic tree-structured model for cancer progression. PLoS Comput Biol 2022; 18:e1010732. [DOI: 10.1371/journal.pcbi.1010732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 12/15/2022] [Accepted: 11/14/2022] [Indexed: 12/12/2022]
Abstract
Identifying the interrelations among cancer driver genes and the patterns in which the driver genes get mutated is critical for understanding cancer. In this paper, we study cross-sectional data from cohorts of tumors to identify the cancer-type (or subtype) specific process in which the cancer driver genes accumulate critical mutations. We model this mutation accumulation process using a tree, where each node includes a driver gene or a set of driver genes. A mutation in each node enables its children to have a chance of mutating. This model simultaneously explains the mutual exclusivity patterns observed in mutations in specific cancer genes (by its nodes) and the temporal order of events (by its edges). We introduce a computationally efficient dynamic programming procedure for calculating the likelihood of our noisy datasets and use it to build our Markov Chain Monte Carlo (MCMC) inference algorithm, ToMExO. Together with a set of engineered MCMC moves, our fast likelihood calculations enable us to work with datasets with hundreds of genes and thousands of tumors, which cannot be dealt with using available cancer progression analysis methods. We demonstrate our method’s performance on several synthetic datasets covering various scenarios for cancer progression dynamics. Then, a comparison against two state-of-the-art methods on a moderate-size biological dataset shows the merits of our algorithm in identifying significant and valid patterns. Finally, we present our analyses of several large biological datasets, including colorectal cancer, glioblastoma, and pancreatic cancer. In all the analyses, we validate the results using a set of method-independent metrics testing the causality and significance of the relations identified by ToMExO or competing methods.
4
Biron-Lattes M, Bouchard-Côté A, Campbell T. Pseudo-marginal inference for CTMCs on infinite spaces via monotonic likelihood approximations. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2118750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
5
Larsen JR, Martin MR, Martin JD, Hicks JB, Kuhn P. Modeling the onset of symptoms of COVID-19: Effects of SARS-CoV-2 variant. PLoS Comput Biol 2021; 17:e1009629. [PMID: 34914688 PMCID: PMC8675677 DOI: 10.1371/journal.pcbi.1009629] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/11/2021] [Accepted: 11/10/2021] [Indexed: 12/28/2022]
Abstract
Identifying order of symptom onset of infectious diseases might aid in differentiating symptomatic infections earlier in a population thereby enabling non-pharmaceutical interventions and reducing disease spread. Previously, we developed a mathematical model predicting the order of symptoms based on data from the initial outbreak of SARS-CoV-2 in China using symptom occurrence at diagnosis and found that the order of COVID-19 symptoms differed from that of other infectious diseases including influenza. Whether this order of COVID-19 symptoms holds in the USA under changing conditions is unclear. Here, we use modeling to predict the order of symptoms using data from both the initial outbreaks in China and in the USA. Whereas patients in China were more likely to have fever before cough and then nausea/vomiting before diarrhea, patients in the USA were more likely to have cough before fever and then diarrhea before nausea/vomiting. Given that the D614G SARS-CoV-2 variant that rapidly spread from Europe to predominate in the USA during the first wave of the outbreak was not present in the initial China outbreak, we hypothesized that this mutation might affect symptom order. Supporting this notion, we found that as SARS-CoV-2 in Japan shifted from the original Wuhan reference strain to the D614G variant, symptom order shifted to the USA pattern. Google Trends analyses supported these findings, while weather, age, and comorbidities did not affect our model's predictions of symptom order. These findings indicate that symptom order can change with mutation in viral disease and raise the possibility that D614G variant is more transmissible because infected people are more likely to cough in public before being incapacitated with fever.
Affiliation(s)
- Joseph R. Larsen
  - Quantitative and Computational Biology, Department of Biological Science, University of Southern California, Los Angeles, California, United States of America
  - Convergent Science Institute in Cancer, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, California, United States of America
- Margaret R. Martin
  - Department of Computer Science, Tufts University, Medford, Massachusetts, United States of America
- John D. Martin
  - Materia Therapeutics, Las Vegas, Nevada, United States of America
- James B. Hicks
  - Convergent Science Institute in Cancer, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, California, United States of America
- Peter Kuhn
  - Convergent Science Institute in Cancer, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, California, United States of America
6
Comparing mutational pathways to lopinavir resistance in HIV-1 subtypes B versus C. PLoS Comput Biol 2021; 17:e1008363. [PMID: 34491984 PMCID: PMC8448360 DOI: 10.1371/journal.pcbi.1008363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/18/2020] [Revised: 09/17/2021] [Accepted: 08/09/2021] [Indexed: 11/19/2022]
Abstract
Although combination antiretroviral therapies seem to be effective at controlling HIV-1 infections regardless of the viral subtype, there is increasing evidence for subtype-specific drug resistance mutations. The order and rates at which resistance mutations accumulate in different subtypes also remain poorly understood. Most of this knowledge is derived from studies of subtype B genotypes, despite not being the most abundant subtype worldwide. Here, we present a methodology for the comparison of mutational networks in different HIV-1 subtypes, based on Hidden Conjunctive Bayesian Networks (H-CBN), a probabilistic model for inferring mutational networks from cross-sectional genotype data. We introduce a Monte Carlo sampling scheme for learning H-CBN models for a larger number of resistance mutations and develop a statistical test to assess differences in the inferred mutational networks between two groups. We apply this method to infer the temporal progression of mutations conferring resistance to the protease inhibitor lopinavir in a large cross-sectional cohort of HIV-1 subtype C genotypes from South Africa, as well as to a data set of subtype B genotypes obtained from the Stanford HIV Drug Resistance Database and the Swiss HIV Cohort Study. We find strong support for different initial mutational events in the protease, namely at residue 46 in subtype B and at residue 82 in subtype C. The inferred mutational networks for subtype B versus C are significantly different sharing only five constraints on the order of accumulating mutations with mutation at residue 54 as the parental event. The results also suggest that mutations can accumulate along various alternative paths within subtypes, as opposed to a unique total temporal ordering. Beyond HIV drug resistance, the statistical methodology is applicable more generally for the comparison of inferred mutational networks between any two groups. 
There is a disparity in the distribution of infections by HIV-1 subtype in the world. Subtype B is predominant in America, Australia and western and central Europe, and most therapeutic strategies are based on research and clinical studies on this subtype. However, non-B subtypes represent the majority of global HIV-1 infections; e.g., subtype C alone accounts for nearly half of all HIV-1 infections. We present a statistical framework enabling the comparison of patterns of accumulating mutations in different HIV-1 subtypes. Specifically, we compare the temporal ordering of lopinavir resistance mutations in HIV-1 subtypes B versus C. To this end, we combine the Hidden Conjunctive Bayesian Network (H-CBN) model with an approximate inference scheme enabling comparisons of larger networks. We show that the development of resistance to lopinavir differs significantly between subtypes B and C, such that findings based on subtype B sequences cannot always be applied to subtype C. The described methodology is suitable for comparing different subgroups in the context of other evolutionary processes.
7
HyperTraPS: Inferring Probabilistic Patterns of Trait Acquisition in Evolutionary and Disease Progression Pathways. Cell Syst 2020; 10:39-51.e10. [PMID: 31786211 DOI: 10.1016/j.cels.2019.10.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/03/2018] [Revised: 08/23/2019] [Accepted: 10/26/2019] [Indexed: 01/15/2023]
Abstract
The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalizable statistical platform to infer the dynamic pathways by which many, potentially interacting, traits are acquired or lost over time. We use HyperTraPS (hypercubic transition path sampling) to efficiently learn progression pathways from cross-sectional, longitudinal, or phylogenetically linked data, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. This Bayesian approach allows inclusion of prior knowledge, quantifies uncertainty in pathway structure, and allows predictions, such as which symptom a patient will acquire next. We provide visualization tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
8
Wang M, Yu T, Liu J, Chen L, Stromberg AJ, Villano JL, Arnold SM, Liu C, Wang C. A probabilistic method for leveraging functional annotations to enhance estimation of the temporal order of pathway mutations during carcinogenesis. BMC Bioinformatics 2019; 20:620. [PMID: 31791231 PMCID: PMC6889196 DOI: 10.1186/s12859-019-3218-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2019] [Accepted: 11/12/2019] [Indexed: 12/02/2022]
Abstract
BACKGROUND Cancer arises through accumulation of somatically acquired genetic mutations. An important question is to delineate the temporal order of somatic mutations during carcinogenesis, which contributes to better understanding of cancer biology and facilitates identification of new therapeutic targets. Although a number of statistical and computational methods have been proposed to estimate the temporal order of mutations, they do not account for the differences in the functional impacts of mutations and thus are likely to be obscured by the presence of passenger mutations that do not contribute to cancer progression. In addition, many methods infer the order of mutations at the gene level, which have limited power due to the low mutation rate in most genes. RESULTS In this paper, we develop a Probabilistic Approach for estimating the Temporal Order of Pathway mutations by leveraging functional Annotations of mutations (PATOPA). PATOPA infers the order of mutations at the pathway level, wherein it uses a probabilistic method to characterize the likelihood of mutational events from different pathways occurring in a certain order. The functional impact of each mutation is incorporated to weigh more on a mutation that is more integral to tumor development. A maximum likelihood method is used to estimate parameters and infer the probability of one pathway being mutated prior to another. Simulation studies and analysis of whole exome sequencing data from The Cancer Genome Atlas (TCGA) demonstrate that PATOPA is able to accurately estimate the temporal order of pathway mutations and provides new biological insights on carcinogenesis of colorectal and lung cancers. CONCLUSIONS PATOPA provides a useful tool to estimate temporal order of mutations at the pathway level while leveraging functional annotations of mutations.
Affiliation(s)
- Menghan Wang
  - Department of Statistics, University of Kentucky, Lexington, USA
- Tianxin Yu
  - Department of Molecular & Cellular Biology, Roswell Park Comprehensive Cancer Center, Buffalo, USA
- Jinpeng Liu
  - Markey Cancer Center, University of Kentucky, Lexington, USA
- Li Chen
  - Markey Cancer Center, University of Kentucky, Lexington, USA
  - Department of Biostatistics, University of Kentucky, Lexington, USA
- John L. Villano
  - Markey Cancer Center, University of Kentucky, Lexington, USA
  - Department of Internal Medicine, University of Kentucky, Lexington, USA
- Susanne M. Arnold
  - Markey Cancer Center, University of Kentucky, Lexington, USA
  - Department of Internal Medicine, University of Kentucky, Lexington, USA
- Chunming Liu
  - Markey Cancer Center, University of Kentucky, Lexington, USA
  - Department of Molecular & Cellular Biochemistry, University of Kentucky, Lexington, USA
- Chi Wang
  - Markey Cancer Center, University of Kentucky, Lexington, USA
  - Department of Biostatistics, University of Kentucky, Lexington, USA
9
Khakabimamaghani S, Ding D, Snow O, Ester M. Uncovering the subtype-specific temporal order of cancer pathway dysregulation. PLoS Comput Biol 2019; 15:e1007451. [PMID: 31710622 PMCID: PMC6872169 DOI: 10.1371/journal.pcbi.1007451] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/07/2019] [Revised: 11/21/2019] [Accepted: 09/30/2019] [Indexed: 12/20/2022]
Abstract
Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and their dysregulation order is key to understanding and treating cancer. However, the heterogeneity of mutations between different individuals makes this challenging and requires that cancer progression is studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes and pathways and order of dysregulation of the pathways within each subtype. Experiments with synthetic data indicate the robustness of SPM to problem specifics including noise compared to an existing method. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show the consistency of SPM's results with the existing knowledge and its superiority to an existing method in certain cases. The implementation of our method is available at https://github.com/Dalton386/SPM.
Affiliation(s)
- Dujian Ding
  - School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Oliver Snow
  - School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Martin Ester
  - School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
10
Abstract
MOTIVATION How predictable is the evolution of cancer? This fundamental question is of immense relevance for the diagnosis, prognosis and treatment of cancer. Evolutionary biologists have approached the question of predictability based on the underlying fitness landscape. However, empirical fitness landscapes of tumor cells are impossible to determine in vivo. Thus, in order to quantify the predictability of cancer evolution, alternative approaches are required that circumvent the need for fitness landscapes. RESULTS We developed a computational method based on conjunctive Bayesian networks (CBNs) to quantify the predictability of cancer evolution directly from mutational data, without the need for measuring or estimating fitness. Using simulated data derived from >200 different fitness landscapes, we show that our CBN-based notion of evolutionary predictability strongly correlates with the classical notion of predictability based on fitness landscapes under the strong selection weak mutation assumption. The statistical framework enables robust and scalable quantification of evolutionary predictability. We applied our approach to driver mutation data from the TCGA and the MSK-IMPACT clinical cohorts to systematically compare the predictability of 15 different cancer types. We found that cancer evolution is remarkably predictable as only a small fraction of evolutionary trajectories are feasible during cancer progression. AVAILABILITY AND IMPLEMENTATION https://github.com/cbg-ethz/predictability_of_cancer_evolution. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
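To illustrate the closing claim that order constraints leave only a small fraction of trajectories feasible, the sketch below counts the mutation orderings compatible with a small partial order. This is not the paper's predictability measure; the function name, the four-event poset, and the brute-force enumeration are all assumptions chosen for illustration.

```python
from itertools import permutations
from math import factorial


def feasible_orderings(n_events, order_constraints):
    """Count orderings of events 0..n_events-1 that respect every
    (earlier, later) pair in order_constraints (brute force, small n only)."""
    count = 0
    for perm in permutations(range(n_events)):
        position = {event: idx for idx, event in enumerate(perm)}
        if all(position[a] < position[b] for a, b in order_constraints):
            count += 1
    return count


# Hypothetical poset on 4 events: 0 precedes 1 and 2; both 1 and 2 precede 3.
constraints = [(0, 1), (0, 2), (1, 3), (2, 3)]
n_feasible = feasible_orderings(4, constraints)
fraction = n_feasible / factorial(4)  # share of all 4! orderings that are feasible
```

Under these constraints only the orderings 0,1,2,3 and 0,2,1,3 remain, so the feasible share shrinks quickly as constraints are added.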
Affiliation(s)
- Sayed-Rzgar Hosseini
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
- Florian Markowetz
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
11
Hainke K, Szugat S, Fried R, Rahnenführer J. Variable selection for disease progression models: methods for oncogenetic trees and application to cancer and HIV. BMC Bioinformatics 2017; 18:358. [PMID: 28764644 PMCID: PMC5539896 DOI: 10.1186/s12859-017-1762-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/16/2016] [Accepted: 07/14/2017] [Indexed: 12/12/2022]
Abstract
Background Disease progression models are important for understanding the critical steps during the development of diseases. The models are imbedded in a statistical framework to deal with random variations due to biology and the sampling process when observing only a finite population. Conditional probabilities are used to describe dependencies between events that characterise the critical steps in the disease process. Many different model classes have been proposed in the literature, from simple path models to complex Bayesian networks. A popular and easy to understand but yet flexible model class are oncogenetic trees. These have been applied to describe the accumulation of genetic aberrations in cancer and HIV data. However, the number of potentially relevant aberrations is often by far larger than the maximal number of events that can be used for reliably estimating the progression models. Still, there are only a few approaches to variable selection, which have not yet been investigated in detail. Results We fill this gap and propose specifically for oncogenetic trees ten variable selection methods, some of these being completely new. We compare them in an extensive simulation study and on real data from cancer and HIV. It turns out that the preselection of events by clique identification algorithms performs best. Here, events are selected if they belong to the largest or the maximum weight subgraph in which all pairs of vertices are connected. Conclusions The variable selection method of identifying cliques finds both the important frequent events and those related to disease pathways. Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1762-1) contains supplementary material, which is available to authorized users.
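The clique-based preselection described above ("the largest ... subgraph in which all pairs of vertices are connected") can be sketched for a toy graph. How the paper builds its event graph and weights is not stated in the abstract, so the vertex labels, the edge list, and the brute-force search below are assumptions for illustration only.

```python
from itertools import combinations


def largest_clique(vertices, edges):
    """Return one largest set of vertices that are pairwise connected.
    Brute force over subsets, so only suitable for small graphs."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                return set(subset)
    return set()


# Hypothetical event graph: A, B, C form a triangle; D and E hang off it.
events = ["A", "B", "C", "D", "E"]
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
selected = largest_clique(events, links)
```

The events in `selected` would then be the ones passed on to tree estimation; a maximum weight variant would score each clique by summed vertex weights instead of size.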
Affiliation(s)
- Katrin Hainke
  - Department of Statistics, TU Dortmund University, Dortmund, 44221, Germany
- Sebastian Szugat
  - Department of Statistics, TU Dortmund University, Dortmund, 44221, Germany
- Roland Fried
  - Department of Statistics, TU Dortmund University, Dortmund, 44221, Germany
- Jörg Rahnenführer
  - Department of Statistics, TU Dortmund University, Dortmund, 44221, Germany
12
Montazeri H, Kuipers J, Kouyos R, Böni J, Yerly S, Klimkait T, Aubert V, Günthard HF, Beerenwinkel N. Large-scale inference of conjunctive Bayesian networks. Bioinformatics 2017; 32:i727-i735. [PMID: 27587695 DOI: 10.1093/bioinformatics/btw459] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]
Abstract
UNLABELLED The continuous time conjunctive Bayesian network (CT-CBN) is a graphical model for analyzing the waiting time process of the accumulation of genetic changes (mutations). CT-CBN models have been successfully used in several biological applications such as HIV drug resistance development and genetic progression of cancer. However, current approaches for parameter estimation and network structure learning of CBNs can only deal with a small number of mutations (<20). Here, we address this limitation by presenting an efficient and accurate approximate inference algorithm using a Monte Carlo expectation-maximization algorithm based on importance sampling. The new method can now be used for a large number of mutations, up to one thousand, an increase by two orders of magnitude. In simulation studies, we present the accuracy as well as the running time efficiency of the new inference method and compare it with a MLE method, expectation-maximization, and discrete time CBN model, i.e. a first-order approximation of the CT-CBN model. We also study the application of the new model on HIV drug resistance datasets for the combination therapy with zidovudine plus lamivudine (AZT + 3TC) as well as under no treatment, both extracted from the Swiss HIV Cohort Study database. AVAILABILITY AND IMPLEMENTATION The proposed method is implemented as an R package available at https://github.com/cbg-ethz/MC-CBN CONTACT: niko.beerenwinkel@bsse.ethz.ch SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
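The waiting time process behind the CT-CBN can be sketched as a forward simulation. The sketch assumes the usual formulation in which a mutation can start waiting only after all of its parent mutations have occurred, its own waiting time is exponential, and the genotype is read off at an independent exponential sampling time. The function name, the four-mutation poset, and the rate values are hypothetical, and this is not the Monte Carlo EM inference algorithm of the paper.

```python
import random


def sample_ctcbn_genotype(parents, rates, sampling_rate, rng):
    """Draw one binary genotype from a CT-CBN-style waiting time process.

    parents[i] lists the mutations that must occur before mutation i.
    Assumes mutation indices are topologically ordered (parents < child).
    """
    occurrence_time = {}
    for i in sorted(parents):
        # Mutation i starts waiting once its last parent has occurred.
        start = max((occurrence_time[j] for j in parents[i]), default=0.0)
        occurrence_time[i] = start + rng.expovariate(rates[i])
    t_sample = rng.expovariate(sampling_rate)  # unknown diagnosis time
    return tuple(int(occurrence_time[i] <= t_sample) for i in sorted(parents))


# Hypothetical poset and rates: 0 precedes 1 and 2; both precede 3.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2]}
rates = {0: 1.0, 1: 0.5, 2: 2.0, 3: 1.0}
rng = random.Random(0)
genotypes = [sample_ctcbn_genotype(parents, rates, 1.0, rng) for _ in range(500)]
```

Every sampled genotype is compatible with the poset by construction: a mutation is present only if all of its parents are present.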
Affiliation(s)
- Hesam Montazeri
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Roger Kouyos
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
  - Institute of Medical Virology
- Jürg Böni
  - Swiss National Center for Retroviruses, Institute of Medical Virology, University of Zurich, Zurich 8057, Switzerland
- Sabine Yerly
  - Laboratory of Virology, Division of Infectious Diseases, Geneva University Hospital, Geneva, Switzerland
- Thomas Klimkait
  - Department of Biomedicine-Petersplatz, University of Basel, Basel, Switzerland
- Vincent Aubert
  - Division of Immunology and Allergy, University Hospital Lausanne, Lausanne, Switzerland
- Huldrych F Günthard
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
  - Institute of Medical Virology
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
13
Cristea S, Kuipers J, Beerenwinkel N. pathTiMEx: Joint Inference of Mutually Exclusive Cancer Pathways and Their Progression Dynamics. J Comput Biol 2017; 24:603-615. [DOI: 10.1089/cmb.2016.0171] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 12/25/2022]
Affiliation(s)
- Simona Cristea
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - The Swiss Institute of Bioinformatics, Basel, Switzerland
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - The Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - The Swiss Institute of Bioinformatics, Basel, Switzerland
14
Montazeri H, Günthard HF, Yang WL, Kouyos R, Beerenwinkel N. Estimating the dynamics and dependencies of accumulating mutations with applications to HIV drug resistance. Biostatistics 2015; 16:713-26. [PMID: 25979750 DOI: 10.1093/biostatistics/kxv019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/01/2014] [Accepted: 03/13/2015] [Indexed: 12/14/2022]
Abstract
We introduce a new model called the observed time conjunctive Bayesian network (OT-CBN) that describes the accumulation of genetic events (mutations) under partial temporal ordering constraints. Unlike other CBN models, the OT-CBN model uses sampling time points of genotypes in addition to genotypes themselves to estimate model parameters. We developed an expectation-maximization algorithm to obtain approximate maximum likelihood estimates by accounting for this additional information. In a simulation study, we show that the OT-CBN model outperforms the continuous time CBN (CT-CBN) (Beerenwinkel and Sullivant, 2009. Markov models for accumulating mutations. Biometrika 96: (3), 645-661), which does not take into account individual sampling times for parameter estimation. We also show superiority of the OT-CBN model on several datasets of HIV drug resistance mutations extracted from the Swiss HIV Cohort Study database.
Affiliation(s)
- Hesam Montazeri
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland and SIB Swiss Institute of Bioinformatics, Basel 4058, Switzerland
| | - Huldrych F Günthard
- Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich 8091, Switzerland; Institute of Medical Virology, University of Zurich, Zurich 8057, Switzerland
| | - Wan-Lin Yang
- Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich 8091, Switzerland; Institute of Medical Virology, University of Zurich, Zurich 8057, Switzerland
| | - Roger Kouyos
- Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, University of Zurich, Zurich 8091, Switzerland; Institute of Medical Virology, University of Zurich, Zurich 8057, Switzerland
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
15
|
Raphael BJ, Vandin F. Simultaneous inference of cancer pathways and tumor progression from cross-sectional mutation data. J Comput Biol 2015; 22:510-27. [PMID: 25785493 DOI: 10.1089/cmb.2014.0161] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 01/02/2023] Open
Abstract
Recent cancer sequencing studies provide a wealth of somatic mutation data from a large number of patients. One of the most intriguing and challenging questions arising from this data is to determine whether the temporal order of somatic mutations in a cancer follows any common progression. Since we usually obtain only one sample from a patient, such inferences are commonly made from cross-sectional data from different patients. This analysis is complicated by the extensive variation in the somatic mutations across different patients, variation that is reduced by examining combinations of mutations in various pathways. Thus far, methods to reconstruct tumor progression at the pathway level have restricted attention to known, a priori defined pathways. In this work we show how to simultaneously infer pathways and the temporal order of their mutations from cross-sectional data, leveraging on the exclusivity property of driver mutations within a pathway. We define the pathway linear progression model, and derive a combinatorial formulation for the problem of finding the optimal model from mutation data. We show that with enough samples the optimal solution to this problem uniquely identifies the correct model with high probability even when errors are present in the mutation data. We then formulate the problem as an integer linear program (ILP), which allows the analysis of datasets from recent studies with large numbers of samples. We use our algorithm to analyze somatic mutation data from three cancer studies, including two studies from The Cancer Genome Atlas (TCGA) on large number of samples on colorectal cancer and glioblastoma. The models reconstructed with our method capture most of the current knowledge of the progression of somatic mutations in these cancer types, while also providing new insights on the tumor progression at the pathway level.
Affiliation(s)
- Benjamin J Raphael
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island
| | - Fabio Vandin
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island; Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|
16
|
Beerenwinkel N, Schwarz RF, Gerstung M, Markowetz F. Cancer evolution: mathematical models and computational inference. Syst Biol 2015; 64:e1-25. [PMID: 25293804 PMCID: PMC4265145 DOI: 10.1093/sysbio/syu081] [Citation(s) in RCA: 201] [Impact Index Per Article: 22.3] [Received: 09/27/2013] [Accepted: 09/26/2014] [Indexed: 12/12/2022] Open
Abstract
Cancer is a somatic evolutionary process characterized by the accumulation of mutations, which contribute to tumor growth, clinical progression, immune escape, and drug resistance development. Evolutionary theory can be used to analyze the dynamics of tumor cell populations and to make inference about the evolutionary history of a tumor from molecular data. We review recent approaches to modeling the evolution of cancer, including population dynamics models of tumor initiation and progression, phylogenetic methods to model the evolutionary relationship between tumor subclones, and probabilistic graphical models to describe dependencies among mutations. Evolutionary modeling helps to understand how tumors arise and will also play an increasingly important prognostic role in predicting disease progression and the outcome of medical interventions, such as targeted therapy.
Affiliation(s)
- Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, CB20RE, United Kingdom
| | - Roland F Schwarz
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, CB20RE, United Kingdom
| | - Moritz Gerstung
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, CB20RE, United Kingdom
| | - Florian Markowetz
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, CB20RE, United Kingdom
|
17
|
Gopalakrishnan S, Montazeri H, Menz S, Beerenwinkel N, Huisinga W. Estimating HIV-1 fitness characteristics from cross-sectional genotype data. PLoS Comput Biol 2014; 10:e1003886. [PMID: 25375675 PMCID: PMC4222584 DOI: 10.1371/journal.pcbi.1003886] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/28/2014] [Accepted: 08/26/2014] [Indexed: 12/31/2022] Open
Abstract
Despite the success of highly active antiretroviral therapy (HAART) in the management of human immunodeficiency virus (HIV)-1 infection, virological failure due to drug resistance development remains a major challenge. Resistant mutants display reduced drug susceptibilities, but in the absence of drug, they generally have a lower fitness than the wild type, owing to a mutation-incurred cost. The interaction between these fitness costs and drug resistance dictates the appearance of mutants and influences viral suppression and therapeutic success. Assessing in vivo viral fitness is a challenging task and yet one that has significant clinical relevance. Here, we present a new computational modelling approach for estimating viral fitness that relies on common sparse cross-sectional clinical data by combining statistical approaches to learn drug-specific mutational pathways and resistance factors with viral dynamics models to represent the host-virus interaction and actions of drug mechanistically. We estimate in vivo fitness characteristics of mutant genotypes for two antiretroviral drugs, the reverse transcriptase inhibitor zidovudine (ZDV) and the protease inhibitor indinavir (IDV). Well-known features of HIV-1 fitness landscapes are recovered, both in the absence and presence of drugs. We quantify the complex interplay between fitness costs and resistance by computing selective advantages for different mutants. Our approach extends naturally to multiple drugs and we illustrate this by simulating a dual therapy with ZDV and IDV to assess therapy failure. The combined statistical and dynamical modelling approach may help in dissecting the effects of fitness costs and resistance with the ultimate aim of assisting the choice of salvage therapies after treatment failure. Mutations conferring drug resistance represent major threats to the therapeutic success of highly active antiretroviral therapy (HAART) against human immunodeficiency virus (HIV)-1 infection. 
Viral mutants differ in their fitness and assessing viral fitness is a challenging task. In this article, we estimate drug-specific mutational pathways by learning from clinical data using statistical techniques and incorporate these into mathematical models of in vivo viral infection dynamics. This approach enables us to estimate mutant fitness characteristics. We illustrate our method by predicting fitness characteristics of mutant genotypes for two different antiretroviral therapies with the drugs zidovudine and indinavir. We recover several established features of mutant fitnesses and quantify fitness characteristics both in the absence and presence of drugs. Our model extends naturally to multiple drugs and we illustrate this by simulating a dual therapy with ZDV and IDV to assess therapy failure. Additionally, our modelling approach relies only on cross-sectional clinical data. We believe that such an approach is a highly valuable tool in assisting the choice of salvage therapies after treatment failure.
Affiliation(s)
- Sathej Gopalakrishnan
- Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Graduate Research Training Program PharMetrX: Pharmacometrics & Computational Disease Modelling, Free University of Berlin and University of Potsdam, Berlin/Potsdam, Germany
| | - Hesam Montazeri
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Stephan Menz
- Institute of Mathematics, University of Potsdam, Potsdam, Germany
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Wilhelm Huisinga
- Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Institute of Mathematics, University of Potsdam, Potsdam, Germany
|
18
|
Ozawa T, Riester M, Cheng YK, Huse JT, Squatrito M, Helmy K, Charles N, Michor F, Holland EC. Most human non-GCIMP glioblastoma subtypes evolve from a common proneural-like precursor glioma. Cancer Cell 2014; 26:288-300. [PMID: 25117714 PMCID: PMC4143139 DOI: 10.1016/j.ccr.2014.06.005] [Citation(s) in RCA: 288] [Impact Index Per Article: 28.8] [Received: 10/27/2013] [Revised: 02/20/2014] [Accepted: 06/11/2014] [Indexed: 01/16/2023]
Abstract
To understand the relationships between the non-GCIMP glioblastoma (GBM) subgroups, we performed mathematical modeling to predict the temporal sequence of driver events during tumorigenesis. The most common order of evolutionary events is 1) chromosome (chr) 7 gain and chr10 loss, followed by 2) CDKN2A loss and/or TP53 mutation, and 3) alterations canonical for specific subtypes. We then developed a computational methodology to identify drivers of broad copy number changes, identifying PDGFA (chr7) and PTEN (chr10) as driving initial nondisjunction events. These predictions were validated using mouse modeling, showing that PDGFA is sufficient to induce proneural-like gliomas and that additional NF1 loss converts proneural to the mesenchymal subtype. Our findings suggest that most non-GCIMP mesenchymal GBMs arise as, and evolve from, a proneural-like precursor.
Affiliation(s)
- Tatsuya Ozawa
- Division of Human Biology and Solid Tumor Translational Research, Fred Hutchinson Cancer Research Center, Department of Neurosurgery and Alvord Brain Tumor Center, University of Washington, Seattle, WA 98109, USA
| | - Markus Riester
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02215, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02215, USA
| | - Yu-Kang Cheng
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02215, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02215, USA
| | - Jason T Huse
- Department of Pathology and Human Oncology, Pathogenesis Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
| | - Massimo Squatrito
- Cancer Cell Biology Programme, Spanish National Cancer Research Centre, Madrid 28029, Spain
| | - Karim Helmy
- Department of Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
| | - Nikki Charles
- Department of Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
| | - Franziska Michor
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02215, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA 02215, USA.
| | - Eric C Holland
- Division of Human Biology and Solid Tumor Translational Research, Fred Hutchinson Cancer Research Center, Department of Neurosurgery and Alvord Brain Tumor Center, University of Washington, Seattle, WA 98109, USA.
|
19
|
Dümcke S, Bräuer J, Anchang B, Spang R, Beerenwinkel N, Tresch A. Exact likelihood computation in Boolean networks with probabilistic time delays, and its application in signal network reconstruction. Bioinformatics 2013; 30:414-9. [PMID: 24292937 DOI: 10.1093/bioinformatics/btt696] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023]
Abstract
MOTIVATION: For biological pathways, it is common to measure a gene expression time series after various knockdowns of genes that are putatively involved in the process of interest. These interventional time-resolved data are most suitable for the elucidation of dynamic causal relationships in signaling networks. Even with this kind of data it is still a major and largely unsolved challenge to infer the topology and interaction logic of the underlying regulatory network. RESULTS: In this work, we present a novel model-based approach involving Boolean networks to reconstruct small to medium-sized regulatory networks. In particular, we solve the problem of exact likelihood computation in Boolean networks with probabilistic exponential time delays. Simulations demonstrate the high accuracy of our approach. We apply our method to data of Ivanova et al. (2006), where RNA interference knockdown experiments were used to build a network of the key regulatory genes governing mouse stem cell maintenance and differentiation. In contrast to previous analyses of that data set, our method can identify feedback loops and provides new insights into the interplay of some master regulators in embryonic stem cell development. AVAILABILITY AND IMPLEMENTATION: The algorithm is implemented in the statistical language R. Code and documentation are available at Bioinformatics online. CONTACT: duemcke@mpipz.mpg.de or tresch@mpipz.mpg.de SUPPLEMENTARY INFORMATION: Supplementary Materials are available at Bioinformatics online.
Affiliation(s)
- Sebastian Dümcke
- Institute for Genetics, University of Cologne, 50674 Cologne, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Gene Center, Department of Biochemistry, Ludwig-Maximilians University, 81379 Munich, Germany, Institute for Functional Genomics, 93053 Regensburg, Germany, Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, CA 94305-5488, USA and ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26 4058 Basel, Switzerland
|
20
|
Hainke K, Rahnenführer J, Fried R. Cumulative disease progression models for cross-sectional data: a review and comparison. Biom J 2012; 54:617-40. [PMID: 22886685 DOI: 10.1002/bimj.201100186] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 09/05/2011] [Revised: 04/19/2012] [Accepted: 05/25/2012] [Indexed: 11/06/2022]
Abstract
A better understanding of disease progression is beneficial for early diagnosis and appropriate individual therapy. Many different approaches for statistical modelling of cumulative disease progression have been proposed in the literature, including simple path models up to complex restricted Bayesian networks. Important fields of application are diseases such as cancer and HIV. Tumour progression is measured by means of chromosome aberrations, whereas people infected with HIV develop drug resistances because of genetic changes of the HI-virus. These two very different diseases have typical courses of disease progression, which can be modelled partly by consecutive and partly by independent steps. This paper gives an overview of the different progression models and points out their advantages and drawbacks. Different models are compared via simulations to analyse how they work if some of their assumptions are violated. In a simulation study, we evaluate how models perform in terms of fitting induced multivariate probability distributions and topological relationships. We often find that the true model class used for generating data is outperformed by either a less or a more complex model class. The more flexible conjunctive Bayesian networks can be used to fit oncogenetic trees, whereas mixtures of oncogenetic trees with three tree components can be well fitted by mixture models with only two tree components.
Affiliation(s)
- Katrin Hainke
- Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.
|
21
|
Doherty KM, Nakka P, King BM, Rhee SY, Holmes SP, Shafer RW, Radhakrishnan ML. A multifaceted analysis of HIV-1 protease multidrug resistance phenotypes. BMC Bioinformatics 2011; 12:477. [PMID: 22172090 PMCID: PMC3305535 DOI: 10.1186/1471-2105-12-477] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/27/2011] [Accepted: 12/15/2011] [Indexed: 12/19/2022] Open
Abstract
Background: Great strides have been made in the effective treatment of HIV-1 with the development of second-generation protease inhibitors (PIs) that are effective against historically multi-PI-resistant HIV-1 variants. Nevertheless, mutation patterns that confer decreasing susceptibility to available PIs continue to arise within the population. Understanding the phenotypic and genotypic patterns responsible for multi-PI resistance is necessary for developing PIs that are active against clinically-relevant PI-resistant HIV-1 variants. Results: In this work, we use globally optimal integer programming-based clustering techniques to elucidate multi-PI phenotypic resistance patterns using a data set of 398 HIV-1 protease sequences that have each been phenotyped for susceptibility toward the nine clinically-approved HIV-1 PIs. We validate the information content of the clusters by evaluating their ability to predict the level of decreased susceptibility to each of the available PIs using a cross validation procedure. We demonstrate the finding that as a result of phenotypic cross resistance, the considered clinical HIV-1 protease isolates are confined to ~6% or less of the clinically-relevant phenotypic space. Clustering and feature selection methods are used to find representative sequences and mutations for major resistance phenotypes to elucidate their genotypic signatures. We show that phenotypic similarity does not imply genotypic similarity, that different PI-resistance mutation patterns can give rise to HIV-1 isolates with similar phenotypic profiles.
Conclusion: Rather than characterizing HIV-1 susceptibility toward each PI individually, our study offers a unique perspective on the phenomenon of PI class resistance by uncovering major multidrug-resistant phenotypic patterns and their often diverse genotypic determinants, providing a methodology that can be applied to understand clinically-relevant phenotypic patterns to aid in the design of novel inhibitors that target other rapidly evolving molecular targets as well.
|
22
|
Longerich T, Mueller MM, Breuhahn K, Schirmacher P, Benner A, Heiss C. Oncogenetic tree modeling of human hepatocarcinogenesis. Int J Cancer 2011; 130:575-83. [PMID: 21400513 DOI: 10.1002/ijc.26063] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 09/17/2010] [Accepted: 02/17/2011] [Indexed: 12/30/2022]
Abstract
Classical comparative genomic hybridization (CGH) has been used to identify recurrent genomic alterations in human HCC. As hepatocarcinogenesis is considered as a stepwise process, we applied oncogenetic tree modeling on all available classical CGH data to determine occurrence of genetic alterations over time. Nine losses (1p, 4q, 6q, 8p, 9p, 13q, 16p, 16q and 17p) and ten gains (1q, 5p, 6p, 7p, 7q, 8q, 17q, 20p, 20q and Xq) of genomic information were used to build the oncogenetic tree model. Whereas gains of 1q and 8q together with losses of 8p formed a cluster that represents early etiology-independent alterations, the associations of gains at 6q and 17q as well as losses of 6p and 9p were observed during tumor progression. HBV-induced HCCs had significantly more chromosomal aberrations compared to HBV-negative tumors. Losses of 1p, 4q and 13q were associated with HBV-induced HCCs, whereas virus-negative HCCs showed an association of gains at 5p, 7, 20q and Xq. Using five aberrations that were significantly associated with tumor dedifferentiation a robust progression model of stepwise human hepatocarcinogenesis (gain 1q → gain 8q → loss 4q → loss 16q → loss 13q) was developed. In silico analysis revealed that protumorigenic candidate genes have been identified for each recurrently altered hotspot. Thus, oncogenic candidate genes that are coded on chromosome arms 1q and 8q are promising targets for the prevention of malignant transformation and the development of biomarkers for the early diagnosis of human HCC that may significantly improve the treatment options and thus prognosis of HCC patients.
Affiliation(s)
- Thomas Longerich
- Institute of Pathology, University Hospital Heidelberg, Heidelberg, Germany.
|
23
|
Gerstung M, Baudis M, Moch H, Beerenwinkel N. Quantifying cancer progression with conjunctive Bayesian networks. Bioinformatics 2009; 25:2809-15. [PMID: 19692554 PMCID: PMC2781752 DOI: 10.1093/bioinformatics/btp505] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.9] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Cancer is an evolutionary process characterized by accumulating mutations. However, the precise timing and the order of genetic alterations that drive tumor progression remain enigmatic. RESULTS: We present a specific probabilistic graphical model for the accumulation of mutations and their interdependencies. The Bayesian network models cancer progression by an explicit unobservable accumulation process in time that is separated from the observable but error-prone detection of mutations. Model parameters are estimated by an Expectation-Maximization algorithm and the underlying interaction graph is obtained by a simulated annealing procedure. Applying this method to cytogenetic data for different cancer types, we find multiple complex oncogenetic pathways deviating substantially from simplified models, such as linear pathways or trees. We further demonstrate how the inferred progression dynamics can be used to improve genetics-based survival predictions which could support diagnostics and prognosis. AVAILABILITY: The software package ct-cbn is available under a GPL license on the web site cbg.ethz.ch/software/ct-cbn CONTACT: moritz.gerstung@bsse.ethz.ch.
Affiliation(s)
- Moritz Gerstung
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.
|