Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Aguse N, Qi Y, El-Kebir M. Summarizing the solution space in tumor phylogeny inference by multiple consensus trees. Bioinformatics 2019;35:i408-i416. [PMID: 31510657 PMCID: PMC6612807 DOI: 10.1093/bioinformatics/btz312] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 01/17/2023]

Cited by Other Article(s)

Vasei H, Foroughmand-Araabi MH, Daneshgar A. Weighted centroid trees: a general approach to summarize phylogenies in single-labeled tumor mutation tree inference. Bioinformatics 2024;40:btae120. [PMID: 38984735 DOI: 10.1093/bioinformatics/btae120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 02/19/2024] [Accepted: 07/09/2024] [Indexed: 07/11/2024]

Sashittal P, Chen V, Pasarkar A, Raphael BJ. Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data. Bioinformatics 2024;40:i218-i227. [PMID: 38940122 PMCID: PMC11211840 DOI: 10.1093/bioinformatics/btae231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]

Abstract

MOTIVATION

Eukaryotic cells contain organelles called mitochondria that have their own genome. Most cells contain thousands of mitochondria which replicate, even in nondividing cells, by means of a relatively error-prone process resulting in somatic mutations in their genome. Because of the higher mutation rate compared to the nuclear genome, mitochondrial mutations have been used to track cellular lineage, particularly using single-cell sequencing that measures mitochondrial mutations in individual cells. However, existing methods to infer the cell lineage tree from mitochondrial mutations do not model "heteroplasmy," which is the presence of multiple mitochondrial clones with distinct sets of mutations in an individual cell. Single-cell sequencing data thus provide a mixture of the mitochondrial clones in individual cells, with the ancestral relationships between these clones described by a mitochondrial clone tree. While deconvolution of somatic mutations from a mixture of evolutionarily related genomes has been extensively studied in the context of bulk sequencing of cancer tumor samples, the problem of mitochondrial deconvolution has the additional constraint that the mitochondrial clone tree must be concordant with the cell lineage tree.

RESULTS

We formalize the problem of inferring a concordant pair of a mitochondrial clone tree and a cell lineage tree from single-cell sequencing data as the Nested Perfect Phylogeny Mixture (NPPM) problem. We derive a combinatorial characterization of the solutions to the NPPM problem, and formulate an algorithm, MERLIN, to solve this problem exactly using a mixed integer linear program. We show on simulated data that MERLIN outperforms existing methods, which model neither mitochondrial heteroplasmy nor the concordance between the mitochondrial clone tree and the cell lineage tree. We use MERLIN to analyze single-cell whole-genome sequencing data of 5220 cells of a gastric cancer cell line and show that MERLIN infers a more biologically plausible cell lineage tree and mitochondrial clone tree than existing methods do.

AVAILABILITY AND IMPLEMENTATION

https://github.com/raphael-group/MERLIN.

Qi Y, El-Kebir M. Consensus Tree Under the Ancestor-Descendant Distance is NP-Hard. J Comput Biol 2024;31:58-70. [PMID: 38010616 DOI: 10.1089/cmb.2023.0262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]

Guang Z, Smith-Erb M, Oesper L. A weighted distance-based approach for deriving consensus tumor evolutionary trees. Bioinformatics 2023;39:i204-i212. [PMID: 37387177 DOI: 10.1093/bioinformatics/btad230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]

Llabrés M, Rosselló F, Valiente G. The Generalized Robinson-Foulds Distance for Phylogenetic Trees. J Comput Biol 2021;28:1181-1195. [PMID: 34714118 PMCID: PMC8742253 DOI: 10.1089/cmb.2021.0342] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]

Utro F, Levovitz C, Rhrissorrakrai K, Parida L. A common methodological phylogenomics framework for intra-patient heteroplasmies to infer SARS-CoV-2 sublineages and tumor clones. BMC Genomics 2021;22:518. [PMID: 34789161 PMCID: PMC8596094 DOI: 10.1186/s12864-021-07660-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/25/2021] [Accepted: 04/28/2021] [Indexed: 01/07/2023]

Silva AS, Wilkinson M. On Defining and Finding Islands of Trees and Mitigating Large Island Bias. Syst Biol 2021;70:1282-1294. [PMID: 33749752 PMCID: PMC8513764 DOI: 10.1093/sysbio/syab015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/03/2020] [Accepted: 02/24/2021] [Indexed: 11/12/2022]

Ciccolella S, Bernardini G, Denti L, Bonizzoni P, Previtali M, Della Vedova G. Triplet-based similarity score for fully multilabeled trees with poly-occurring labels. Bioinformatics 2021;37:178-184. [PMID: 32730595 PMCID: PMC8055217 DOI: 10.1093/bioinformatics/btaa676] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/29/2020] [Revised: 06/29/2020] [Accepted: 07/22/2020] [Indexed: 01/06/2023]

Hodzic E, Shrestha R, Malikic S, Collins CC, Litchfield K, Turajlic S, Sahinalp SC. Identification of conserved evolutionary trajectories in tumors. Bioinformatics 2021;36:i427-i435. [PMID: 32657374 DOI: 10.1093/bioinformatics/btaa453] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]

Christensen S, Kim J, Chia N, Koyejo O, El-Kebir M. Detecting evolutionary patterns of cancers using consensus trees. Bioinformatics 2021;36:i684-i691. [PMID: 33381820 DOI: 10.1093/bioinformatics/btaa801] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]

Sundermann LK, Wintersinger J, Rätsch G, Stoye J, Morris Q. Reconstructing tumor evolutionary histories and clone trees in polynomial-time with SubMARine. PLoS Comput Biol 2021;17:e1008400. [PMID: 33465079 PMCID: PMC7845980 DOI: 10.1371/journal.pcbi.1008400] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/27/2020] [Revised: 01/29/2021] [Accepted: 09/22/2020] [Indexed: 11/18/2022]

Abstract

Tumors contain multiple subpopulations of genetically distinct cancer cells. Reconstructing their evolutionary history can improve our understanding of how cancers develop and respond to treatment. Subclonal reconstruction methods cluster mutations into groups that co-occur within the same subpopulations, estimate the frequency of cells belonging to each subpopulation, and infer the ancestral relationships among the subpopulations by constructing a clone tree. However, often multiple clone trees are consistent with the data and current methods do not efficiently capture this uncertainty; nor can these methods scale to clone trees with a large number of subclonal populations.

Here, we formalize the notion of a partially-defined clone tree (partial clone tree for short) that defines a subset of the pairwise ancestral relationships in a clone tree, thereby implicitly representing the set of all clone trees that have these defined pairwise relationships. Also, we introduce a special partial clone tree, the Maximally-Constrained Ancestral Reconstruction (MAR), which summarizes all clone trees fitting the input data equally well. Finally, we extend commonly used clone tree validity conditions to apply to partial clone trees and describe SubMARine, a polynomial-time algorithm producing the subMAR, which approximates the MAR and guarantees that its defined relationships are a subset of those present in the MAR. We also extend SubMARine to work with subclonal copy number aberrations and define equivalence constraints for this purpose. Further, we extend SubMARine to permit noise in the estimates of the subclonal frequencies while retaining its validity conditions and guarantees. In contrast to other clone tree reconstruction methods, SubMARine runs in time and space that scale polynomially in the number of subclones.

We show through extensive noise-free simulation, a large lung cancer dataset and a prostate cancer dataset that the subMAR equals the MAR in all cases where only a single clone tree exists and that it is a perfect match to the MAR in most of the other cases. Notably, SubMARine runs in less than 70 seconds on a single thread with less than 1 GB of memory on all datasets presented in this paper, including ones with 50 nodes in a clone tree. On the real-world data, SubMARine almost perfectly recovers the previously reported trees and identifies minor errors made in the expert-driven reconstructions of those trees.

The freely-available open-source code implementing SubMARine can be downloaded at https://github.com/morrislab/submarine.

Cancer cells accumulate mutations over time and consist of genetically distinct subpopulations. Their evolutionary history (as represented by tumor phylogenies) can be inferred from bulk cancer genome sequencing data. Current tumor phylogeny reconstruction methods have two main issues: they are slow, and they do not efficiently represent uncertainty in the reconstruction.

To address these issues, we developed SubMARine, a fast algorithm that summarizes all valid phylogenies in an intuitive format. SubMARine solved all reconstruction problems in this manuscript in less than 70 seconds, orders of magnitude faster than other methods. These reconstruction problems included those with up to 50 subclones, problems that are too large for other algorithms to even attempt. SubMARine achieves these results because, unlike other algorithms, it performs its reconstruction by identifying an upper bound on the solution set of trees and the amount of noise in the estimates of the subclonal frequencies. In the vast majority of cases we checked, i.e., an extensive noise-free simulation, a lung cancer dataset and a prostate cancer dataset, this upper bound is tight: when only a single solution exists, SubMARine converges to it every time. When multiple solutions exist, our algorithm correctly recovers the uncertain relationships in 71% of cases.

In addition to solving these two major challenges, we introduce some useful new concepts for and open research problems in the field of tumor phylogeny reconstruction. Specifically, we formalize the concept of a partial clone tree which provides a set of constraints on the solution set of clone trees; and provide a complete set of conditions under which a partial clone tree is valid. These conditions guarantee that all trees in the solution set satisfy the constraints implied by the partial clone tree.
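The idea that a partial clone tree stands for the whole set of clone trees sharing its defined ancestral relationships can be illustrated with a small brute-force sketch. This is not SubMARine: it enumerates every rooted tree, so it runs in exponential time, and it ignores the subclonal-frequency validity conditions. The function names, the choice of clone 0 as the root, and the encoding of constraints as (ancestor, descendant) pairs are assumptions made for illustration only.

```python
from itertools import product


def is_tree(parent):
    """Check that every clone reaches the root (parent None) without a cycle."""
    for start in parent:
        seen = {start}
        node = start
        while parent[node] is not None:
            node = parent[node]
            if node in seen:
                return False
            seen.add(node)
    return True


def ancestors(parent, node):
    """Return the set of proper ancestors of `node` in a parent map."""
    result = set()
    while parent[node] is not None:
        node = parent[node]
        result.add(node)
    return result


def enumerate_clone_trees(n):
    """Yield parent maps of all trees on clones 0..n-1 rooted at clone 0."""
    for choice in product(range(n), repeat=n - 1):
        parent = {0: None}
        parent.update(zip(range(1, n), choice))
        if is_tree(parent):
            yield parent


def consistent_trees(n, required=(), forbidden=()):
    """Yield the clone trees that agree with a set of pairwise constraints.

    required:  (a, d) pairs where a must be an ancestor of d.
    forbidden: (a, d) pairs where a must not be an ancestor of d.
    Pairs listed in neither set are left undefined, as in a partial clone tree.
    """
    for parent in enumerate_clone_trees(n):
        ok_required = all(a in ancestors(parent, d) for a, d in required)
        ok_forbidden = all(a not in ancestors(parent, d) for a, d in forbidden)
        if ok_required and ok_forbidden:
            yield parent
```

With three clones there are three possible trees rooted at clone 0. Requiring clone 1 to be an ancestor of clone 2 leaves exactly one of them, while leaving that relationship undefined keeps all three in the represented set.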

Tsyvina V, Zelikovsky A, Snir S, Skums P. Inference of mutability landscapes of tumors from single cell sequencing data. PLoS Comput Biol 2020;16:e1008454. [PMID: 33253159 PMCID: PMC7728263 DOI: 10.1371/journal.pcbi.1008454] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/28/2020] [Revised: 12/10/2020] [Accepted: 10/20/2020] [Indexed: 11/18/2022]

Weber LL, Aguse N, Chia N, El-Kebir M. PhyDOSE: Design of follow-up single-cell sequencing experiments of tumors. PLoS Comput Biol 2020;16:e1008240. [PMID: 33001973 PMCID: PMC7553321 DOI: 10.1371/journal.pcbi.1008240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2020] [Revised: 10/13/2020] [Accepted: 08/12/2020] [Indexed: 01/07/2023]

Abstract

The combination of bulk and single-cell DNA sequencing data of the same tumor enables the inference of high-fidelity phylogenies that form the input to many important downstream analyses in cancer genomics. While many studies simultaneously perform bulk and single-cell sequencing, some studies have analyzed initial bulk data to identify which mutations to target in a follow-up single-cell sequencing experiment, thereby decreasing cost. Bulk data provide an additional untapped source of valuable information, composed of candidate phylogenies and associated clonal prevalence. Here, we introduce PhyDOSE, a method that uses this information to strategically optimize the design of follow-up single cell experiments. Underpinning our method is the observation that only a small number of clones uniquely distinguish one candidate tree from all other trees. We incorporate distinguishing features into a probabilistic model that infers the number of cells to sequence so as to confidently reconstruct the phylogeny of the tumor. We validate PhyDOSE using simulations and a retrospective analysis of a leukemia patient, concluding that PhyDOSE's computed number of cells resolves tree ambiguity even in the presence of typical single-cell sequencing errors. We also conduct a retrospective analysis on an acute myeloid leukemia cohort, demonstrating the potential to achieve similar results with a significant reduction in the number of cells sequenced. In a prospective analysis, we demonstrate the advantage of selecting cells to sequence across multiple biopsies and that only a small number of cells suffice to disambiguate the solution space of trees in a recent lung cancer cohort. In summary, PhyDOSE proposes cost-efficient single-cell sequencing experiments that yield high-fidelity phylogenies, which will improve downstream analyses aimed at deepening our understanding of cancer biology.
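The question of how many cells to sequence can be made concrete with a deliberately simplified calculation. The sketch below is not PhyDOSE's model: it assumes a single distinguishing clone with known prevalence f, cells sampled independently, and no sequencing errors, and asks for the smallest k with 1 - (1 - f)^k >= gamma. The function name and the parameters `prevalence` and `confidence` are hypothetical.

```python
import math


def min_cells_to_observe(prevalence, confidence):
    """Smallest number of cells k such that at least one sampled cell belongs
    to a clone of the given prevalence with probability >= confidence.

    Assumes independent sampling and error-free sequencing (a simplification).
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # P(no cell from the clone among k) = (1 - prevalence) ** k <= 1 - confidence
    k = math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))
    return max(k, 1)
```

Under these assumptions, a clone at 10% prevalence needs 29 cells for 95% confidence, and the required number grows quickly as the prevalence drops, which is why targeting a few distinguishing clones matters for cost.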

Sashittal P, El-Kebir M. Sampling and summarizing transmission trees with multi-strain infections. Bioinformatics 2020;36:i362-i370. [PMID: 32657399 PMCID: PMC7355255 DOI: 10.1093/bioinformatics/btaa438] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]