Cited by Other Article(s)

Berling L, Collienne L, Gavryushkin A. Estimating the mean in the space of ranked phylogenetic trees. Bioinformatics 2024;40:btae514. [PMID: 39177090 PMCID: PMC11364146 DOI: 10.1093/bioinformatics/btae514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 05/16/2024] [Accepted: 08/21/2024] [Indexed: 08/24/2024] Open

Abstract

MOTIVATION

Reconstructing evolutionary histories of biological entities, such as genes, cells, organisms, populations, and species, from phenotypic and molecular sequencing data is central to many biological, palaeontological, and biomedical disciplines. Typically, due to uncertainties and incompleteness in data, the true evolutionary history (phylogeny) is challenging to estimate. Statistical modelling approaches address this problem by introducing and studying probability distributions over all possible evolutionary histories, but can also introduce uncertainties due to misspecification. In practice, computational methods are deployed to learn those distributions, typically by sampling them. This approach, however, is fundamentally challenging as it requires designing and implementing various statistical methods over a space of phylogenetic trees (or treespace). Although the problem of developing statistics over a treespace has received substantial attention in the literature and numerous breakthroughs have been made, it remains largely unsolved. The challenge of solving this problem is twofold: a treespace has nontrivial, often counter-intuitive geometry, implying that much of classical Euclidean statistics does not immediately apply; and many parametrizations of treespace with promising statistical properties are computationally hard, so they cannot be used in data analyses. As a result, there is no single conventional method for estimating even the most fundamental statistics over any treespace, such as mean and variance, and various heuristics are used in practice. Despite the existence of numerous tree summary methods to approximate means of probability distributions over a treespace based on its geometry, and the theoretical promise of this idea, none of these attempts has resulted in a practical method for summarizing tree samples.

RESULTS

In this paper, we present a tree summary method along with useful properties of our chosen treespace while focusing on its impact on phylogenetic analyses of real datasets. We perform an extensive benchmark study and demonstrate that our method outperforms the currently most popular methods with respect to a number of important 'quality' statistics. Further, we apply our method to three empirical datasets ranging from cancer evolution to linguistics and find novel insights into the corresponding evolutionary problems in all of them. We hence conclude that this treespace is a promising candidate to serve as a foundation for developing statistics over phylogenetic trees analytically, as well as new computational tools for evolutionary data analyses.

AVAILABILITY AND IMPLEMENTATION

An implementation is available at https://github.com/bioDS/Centroid-Code.

Teichman S, Lee MD, Willis AD. Analyzing microbial evolution through gene and genome phylogenies. Biostatistics 2024;25:786-800. [PMID: 37897441 PMCID: PMC11247178 DOI: 10.1093/biostatistics/kxad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 08/15/2023] [Accepted: 08/27/2023] [Indexed: 10/30/2023] Open

Smith MR. Robust Analysis of Phylogenetic Tree Space. Syst Biol 2022;71:1255-1270. [PMID: 34963003 PMCID: PMC9366458 DOI: 10.1093/sysbio/syab100] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Revised: 12/03/2021] [Accepted: 12/23/2021] [Indexed: 11/13/2022] Open

Wu X, Zhu H. Association testing for binary trees-A Markov branching process approach. Stat Med 2022;41:2557-2573. [PMID: 35262202 PMCID: PMC9311163 DOI: 10.1002/sim.9370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Revised: 01/28/2022] [Accepted: 02/22/2022] [Indexed: 11/29/2022]

Baczyński J, Sauquet H, Spalik K. Exceptional evolutionary lability of flower-like inflorescences (pseudanthia) in Apiaceae subfamily Apioideae. American Journal of Botany 2022;109:437-455. [PMID: 35112711 PMCID: PMC9310750 DOI: 10.1002/ajb2.1819] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/27/2021] [Revised: 12/19/2021] [Accepted: 12/22/2021] [Indexed: 06/14/2023]

Abstract

PREMISE

Pseudanthia are widespread and have long been postulated to be a key innovation responsible for some of the angiosperm radiations. The aim of our study was to analyze macroevolutionary patterns of these flower-like inflorescences and their potential correlation with diversification rates in Apiaceae subfamily Apioideae. In particular, we were interested in investigating the evolvability of pseudanthia and evaluating their potential association with changes in the size of floral display.

METHODS

The framework for our analyses consisted of a time-calibrated phylogeny of 1734 representatives of Apioideae and a morphological matrix of inflorescence traits encoded for 847 species. Macroevolutionary patterns in pseudanthia were inferred using Markov models of discrete character evolution and stochastic character mapping, and a principal component analysis was used to visualize correlations in inflorescence architecture. The interdependence between net diversification rates and the occurrence of pseudocorollas was analyzed with trait-independent and trait-dependent approaches.

RESULTS

Pseudanthia evolved in 10 major clades of Apioideae with at least 36 independent origins and 46 reversals. The morphospace analysis recovered differences in color and compactness between floral and hyperfloral pseudanthia. A correlation between pseudocorollas and size of inflorescence was also strongly supported. Contrary to our predictions, pseudanthia are not responsible for variation in diversification rates identified in this subfamily.

CONCLUSIONS

Our results suggest that pseudocorollas evolve as an answer to the trade-off between enlargement of floral display and costs associated with production of additional flowers. The high evolvability and architectural differences in apioid pseudanthia may be explained on the basis of adaptive wandering and evolutionary developmental biology.

Information geometry for phylogenetic trees. J Math Biol 2021;82:19. [PMID: 33590321 PMCID: PMC7884381 DOI: 10.1007/s00285-021-01553-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/28/2020] [Revised: 10/12/2020] [Accepted: 10/21/2020] [Indexed: 11/12/2022]

Page R, Yoshida R, Zhang L. Tropical principal component analysis on the space of phylogenetic trees. Bioinformatics 2020;36:4590-4598. [PMID: 32516398 DOI: 10.1093/bioinformatics/btaa564] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/27/2019] [Revised: 05/29/2020] [Accepted: 06/03/2020] [Indexed: 11/13/2022] Open

Random walks and Brownian motion on cubical complexes. Stoch Process Their Appl 2020. [DOI: 10.1016/j.spa.2019.06.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]

Brown DG, Owen M. Mean and Variance of Phylogenetic Trees. Syst Biol 2019;69:139-154. [DOI: 10.1093/sysbio/syz041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/02/2017] [Revised: 05/13/2019] [Accepted: 05/24/2019] [Indexed: 11/13/2022] Open

Schötz C. Convergence rates for the generalized Fréchet mean via the quadruple inequality. Electron J Stat 2019. [DOI: 10.1214/19-ejs1618] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]

Willis A. Confidence Sets for Phylogenetic Trees. J Am Stat Assoc 2018. [DOI: 10.1080/01621459.2017.1395342] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/18/2022]

Dinh V, Tung Ho LS, Suchard MA, Matsen FA. Consistency and convergence rate of phylogenetic inference via regularization. Ann Stat 2018;46:1481-1512. [PMID: 30344357 DOI: 10.1214/17-aos1592] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]

Abstract

It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct "gene tree." Although the gene tree may deviate from the "species tree" due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that they agree. A common statistical approach in these situations is to develop a likelihood penalty to incorporate such additional information. Recent studies using simulation and empirical data suggest that a likelihood penalty quantifying concordance with a species tree can significantly improve the accuracy of gene tree reconstruction compared to using sequence data alone. However, the consistency of such an approach has not yet been established, nor have convergence rates been bounded. Because phylogenetics is a non-standard inference problem, the standard theory does not apply. In this paper, we propose a penalized maximum likelihood estimator for gene tree reconstruction, where the penalty is the square of the Billera-Holmes-Vogtmann geodesic distance from the gene tree to the species tree. We prove that this method is consistent, and derive its convergence rate for estimating the discrete gene tree structure and continuous edge lengths (representing the amount of evolution that has occurred on that branch) simultaneously. We find that the regularized estimator is "adaptive fast converging," meaning that it can reconstruct all edges of length greater than any given threshold from gene sequences of polynomial length. Our method does not require the species tree to be known exactly; in fact, our asymptotic theory holds for any such guide tree.
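As a rough illustration of the estimator this abstract describes, the penalized objective might be written as below. The notation is an assumption introduced here, not taken from the paper: $\ell_k(T)$ is the log-likelihood of a gene tree $T$ (topology and edge lengths) given $k$ sequence sites, $T_0$ is the guide (species) tree, $d_{\mathrm{BHV}}$ is the Billera-Holmes-Vogtmann geodesic distance, and $\alpha_k > 0$ is a hypothetical regularization weight. The $1/k$ normalization of the likelihood is likewise an assumption.

```latex
\hat{T}_k \;=\; \operatorname*{arg\,max}_{T}
  \left\{ \frac{1}{k}\,\ell_k(T) \;-\; \alpha_k \, d_{\mathrm{BHV}}(T, T_0)^{2} \right\}
```

In this sketch the penalty is zero when $T = T_0$ and grows quadratically as the gene tree moves away from the guide tree along a geodesic, which is how the estimator favors concordance unless the sequence data provide evidence to the contrary.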

Willis A, Bell R. Uncertainty in Phylogenetic Tree Estimates. J Comput Graph Stat 2018. [DOI: 10.1080/10618600.2017.1391697] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]

Liebscher V. New Gromov-Inspired Metrics on Phylogenetic Tree Space. Bull Math Biol 2018;80:493-518. [DOI: 10.1007/s11538-017-0385-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2017] [Accepted: 12/19/2017] [Indexed: 11/29/2022]

Jombart T, Kendall M, Almagro-Garcia J, Colijn C. treespace: Statistical exploration of landscapes of phylogenetic trees. Mol Ecol Resour 2017;17:1385-1392. [PMID: 28374552 PMCID: PMC5724650 DOI: 10.1111/1755-0998.12676] [Citation(s) in RCA: 92] [Impact Index Per Article: 13.1] [Received: 11/22/2016] [Revised: 03/17/2017] [Accepted: 03/21/2017] [Indexed: 01/01/2023]

Nye TMW, Tang X, Weyenberg G, Yoshida R. Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees. Biometrika 2017;104:901-922. [PMID: 29422694 PMCID: PMC5793493 DOI: 10.1093/biomet/asx047] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/10/2016] [Indexed: 11/13/2022] Open

Abstract

Evolutionary relationships are represented by phylogenetic trees, and a phylogenetic analysis of gene sequences typically produces a collection of these trees, one for each gene in the analysis. Analysis of samples of trees is difficult due to the multi-dimensionality of the space of possible trees. In Euclidean spaces, principal component analysis is a popular method of reducing high-dimensional data to a low-dimensional representation that preserves much of the sample’s structure. However, the space of all phylogenetic trees on a fixed set of species does not form a Euclidean vector space, and methods adapted to tree space are needed. Previous work introduced the notion of a principal geodesic in this space, analogous to the first principal component. Here we propose a geometric object for tree space similar to the $k$th principal component in Euclidean space: the locus of the weighted Fréchet mean of $k+1$ vertex trees when the weights vary over the $k$-simplex. We establish some basic properties of these objects, in particular showing that they have dimension $k$, and propose algorithms for projection onto these surfaces and for finding the principal locus associated with a sample of trees. Simulation studies demonstrate that these algorithms perform well, and analyses of two datasets, containing Apicomplexa and African coelacanth genomes respectively, reveal important structure from the second principal components.
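The object named in this abstract can be sketched with the standard definition of a weighted Fréchet mean. The symbols below are assumptions chosen for illustration, not the paper's notation: $v_0, \dots, v_k$ are the $k+1$ vertex trees, $d$ is a geodesic distance on tree space (the abstract does not name the metric), and $\Delta^k$ is the $k$-simplex of weights.

```latex
\Delta^{k} = \Bigl\{ w \in \mathbb{R}^{k+1} : w_i \ge 0,\ \sum_{i=0}^{k} w_i = 1 \Bigr\},
\qquad
\mu(w) = \operatorname*{arg\,min}_{x} \sum_{i=0}^{k} w_i \, d(x, v_i)^{2},
\qquad
\Pi(v_0, \dots, v_k) = \bigl\{ \mu(w) : w \in \Delta^{k} \bigr\}
```

In Euclidean space $\mu(w)$ reduces to the weighted average $\sum_i w_i v_i$, so $\Pi$ is the simplex spanned by the vertices. That is the sense in which the locus plays the role of a $k$-dimensional principal component.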

Groisser D, Jung S, Schwartzman A. Geometric foundations for scaling-rotation statistics on symmetric positive definite matrices: Minimal smooth scaling-rotation curves in low dimensions. Electron J Stat 2017. [DOI: 10.1214/17-ejs1250] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]

St. John K. Review Paper: The Shape of Phylogenetic Treespace. Syst Biol 2017;66:e83-e94. [PMID: 28173538 PMCID: PMC5837343 DOI: 10.1093/sysbio/syw025] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/20/2015] [Revised: 12/16/2015] [Accepted: 03/22/2016] [Indexed: 11/23/2022] Open

Barden D, Le H, Owen M. Limiting behaviour of Fréchet means in the space of phylogenetic trees. Ann I Stat Math 2016. [DOI: 10.1007/s10463-016-0582-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]

Lu N, Miao H. Clustering Tree-Structured Data on Manifold. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016;38:1956-1968. [PMID: 26660696 PMCID: PMC5027669 DOI: 10.1109/tpami.2015.2505282] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]

Gavryushkin A, Drummond AJ. The space of ultrametric phylogenetic trees. J Theor Biol 2016;403:197-208. [PMID: 27188249 DOI: 10.1016/j.jtbi.2016.05.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 09/28/2015] [Revised: 03/17/2016] [Accepted: 05/01/2016] [Indexed: 10/21/2022]

Bendich P, Marron JS, Miller E, Pieloch A, Skwerer S. Persistent Homology Analysis of Brain Artery Trees. Ann Appl Stat 2016;10:198-218. [PMID: 27642379 PMCID: PMC5026243 DOI: 10.1214/15-aoas886] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Indexed: 11/19/2022]

Consistency of a phylogenetic tree maximum likelihood estimator. J Stat Plan Inference 2015. [DOI: 10.1016/j.jspi.2015.01.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 02/03/2023]

Benner P, Bačák M, Bourguignon PY. Point estimates in phylogenetic reconstructions. Bioinformatics 2015;30:i534-40. [PMID: 25161244 PMCID: PMC4147914 DOI: 10.1093/bioinformatics/btu461] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open

Quantification and Visualization of Variation in Anatomical Trees. Association for Women in Mathematics Series 2015. [DOI: 10.1007/978-3-319-16348-2_5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]

Huckemann S, Mattingly J, Miller E, Nolen J. Sticky central limit theorems at isolated hyperbolic planar singularities. Electron J Probab 2015. [DOI: 10.1214/ejp.v20-3887] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]

Nye TMW. An Algorithm for Constructing Principal Geodesics in Phylogenetic Treespace. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014;11:304-315. [PMID: 26355778 DOI: 10.1109/tcbb.2014.2309599] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/05/2023]

Feragen A, Lo P, de Bruijne M, Nielsen M, Lauze F. Toward a theory of statistical tree-shape analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013;35:2008-2021. [PMID: 23267202 DOI: 10.1109/tpami.2012.265] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 06/01/2023]

Matsen FA, Evans SN. Edge principal components and squash clustering: using the special structure of phylogenetic placement data for sample comparison. PLoS One 2013;8:e56859. [PMID: 23505415 PMCID: PMC3594297 DOI: 10.1371/journal.pone.0056859] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Received: 08/10/2012] [Accepted: 01/16/2013] [Indexed: 01/30/2023] Open

Feragen A, Owen M, Petersen J, Wille MMW, Thomsen LH, Dirksen A, de Bruijne M. Tree-space statistics and approximations for large-scale analysis of anatomical trees. Information Processing in Medical Imaging : Proceedings of the ... Conference 2013;23:74-85. [PMID: 24683959 DOI: 10.1007/978-3-642-38868-2_7] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/01/2023]

Geometric Tree Kernels: Classification of COPD from Airway Tree Geometry. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-3-642-38868-2_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3]

Ponciano JM, Burleigh JG, Braun EL, Taper ML. Assessing parameter identifiability in phylogenetic models using data cloning. Syst Biol 2012;61:955-72. [PMID: 22649181 PMCID: PMC3478565 DOI: 10.1093/sysbio/sys055] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 11/17/2011] [Revised: 02/02/2012] [Accepted: 05/25/2012] [Indexed: 11/14/2022] Open

Aydın B, Pataki G, Wang H, Ladha A, Bullitt E, Marron JS. New Approaches to Principal Component Analysis for Trees. Statistics in Biosciences 2012. [DOI: 10.1007/s12561-012-9055-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]