Reference Citation Analysis

For: Bayzid MS, Hunt T, Warnow T. Disk covering methods improve phylogenomic analyses. BMC Genomics 2014;15 Suppl 6:S7. [PMID: 25572610 PMCID: PMC4239662 DOI: 10.1186/1471-2164-15-s6-s7] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022] Open

Cited by Other Article(s)

Le T, Sy A, Molloy EK, Zhang Q, Rao S, Warnow T. Using Constrained-INC for Large-Scale Gene Tree and Species Tree Estimation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021;18:2-15. [PMID: 32750844 DOI: 10.1109/tcbb.2020.2990867] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]

Bhattacharjee A, Bayzid MS. Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices. BMC Genomics 2020;21:497. [PMID: 32689946 PMCID: PMC7370488 DOI: 10.1186/s12864-020-06892-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/27/2020] [Accepted: 07/07/2020] [Indexed: 02/08/2023] Open

Smirnov V, Warnow T. Unblended disjoint tree merging using GTM improves species tree estimation. BMC Genomics 2020;21:235. [PMID: 32299343 PMCID: PMC7161100 DOI: 10.1186/s12864-020-6605-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022] Open

Islam M, Sarker K, Das T, Reaz R, Bayzid MS. STELAR: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency. BMC Genomics 2020;21:136. [PMID: 32039704 PMCID: PMC7011378 DOI: 10.1186/s12864-020-6519-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/22/2019] [Accepted: 01/20/2020] [Indexed: 12/14/2022] Open

Data, time and money: evaluating the best compromise for inferring molecular phylogenies of non-model animal taxa. Mol Phylogenet Evol 2020;142:106660. [DOI: 10.1016/j.ympev.2019.106660] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/26/2019] [Revised: 10/17/2019] [Accepted: 10/17/2019] [Indexed: 12/15/2022]

Molloy EK, Warnow T. Statistically consistent divide-and-conquer pipelines for phylogeny estimation using NJMerge. Algorithms Mol Biol 2019;14:14. [PMID: 31360216 PMCID: PMC6642500 DOI: 10.1186/s13015-019-0151-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/05/2019] [Accepted: 06/13/2019] [Indexed: 12/26/2022] Open

Abstract

BACKGROUND

Divide-and-conquer methods, which divide the species set into overlapping subsets, construct a tree on each subset, and then combine the subset trees using a supertree method, provide a key algorithmic framework for boosting the scalability of phylogeny estimation methods to large datasets. Yet the use of supertree methods, which typically attempt to solve NP-hard optimization problems, limits the scalability of such approaches.

RESULTS

In this paper, we introduce a divide-and-conquer approach that does not require supertree estimation: we divide the species set into pairwise disjoint subsets, construct a tree on each subset using a base method, and then combine the subset trees using a distance matrix. For this merger step, we present a new method, called NJMerge, which is a polynomial-time extension of Neighbor Joining (NJ); thus, NJMerge can be viewed either as a method for improving traditional NJ or as a method for scaling the base method to larger datasets. We prove that NJMerge can be used to create divide-and-conquer pipelines that are statistically consistent under some models of evolution. We also report the results of an extensive simulation study evaluating NJMerge on multi-locus datasets with up to 1000 species. We found that NJMerge sometimes improved the accuracy of traditional NJ and substantially reduced the running time of three popular species tree methods (ASTRAL-III, SVDquartets, and "concatenation" using RAxML) without sacrificing accuracy. Finally, although NJMerge can fail to return a tree, in our experiments, NJMerge failed on only 11 out of 2560 test cases.

CONCLUSIONS

Theoretical and empirical results suggest that NJMerge is a valuable technique for large-scale phylogeny estimation, especially when computational resources are limited. NJMerge is freely available on GitHub (http://github.com/ekmolloy/njmerge).
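
The abstract above describes NJMerge as an extension of Neighbor Joining (NJ) but does not spell out the merge step. As a point of reference only, the following minimal Python sketch shows the pair-selection criterion of classic NJ, the method being extended. It is not NJMerge and is not taken from the authors' code; the function name nj_pick_pair and the small example matrix are illustrative assumptions.

```python
from itertools import combinations


def nj_pick_pair(dist):
    """Return the pair (i, j) minimizing the classic NJ Q-criterion.

    dist is a symmetric n x n distance matrix given as a list of lists.
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    This is plain Neighbor Joining, not NJMerge: no subset-tree
    constraints are checked before a pair is accepted.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i, j in combinations(range(n), 2):
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if q < best_q:  # strict comparison keeps the first minimum on ties
            best_pair, best_q = (i, j), q
    return best_pair, best_q


if __name__ == "__main__":
    # Hypothetical 5-taxon distance matrix used only for illustration.
    example = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(nj_pick_pair(example))
```

In full NJ the selected pair is replaced by a new internal node, the matrix is updated, and the step repeats until the tree is resolved. How NJMerge uses the subset trees during this process is described in the paper itself, not here.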

Eiserhardt WL, Antonelli A, Bennett DJ, Botigué LR, Burleigh JG, Dodsworth S, Enquist BJ, Forest F, Kim JT, Kozlov AM, Leitch IJ, Maitner BS, Mirarab S, Piel WH, Pérez-Escobar OA, Pokorny L, Rahbek C, Sandel B, Smith SA, Stamatakis A, Vos RA, Warnow T, Baker WJ. A roadmap for global synthesis of the plant tree of life. American Journal of Botany 2018;105:614-622. [PMID: 29603138 DOI: 10.1002/ajb2.1041] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 10/13/2017] [Accepted: 11/08/2017] [Indexed: 06/08/2023]

Affiliation(s)

Wolf L Eiserhardt: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK; Department of Bioscience, Aarhus University, Ny Munkegade 116, 8000, Aarhus C, Denmark
Alexandre Antonelli: Gothenburg Global Biodiversity Centre, Box 461, 405 30, Gothenburg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 405 30, Gothenburg, Sweden; Gothenburg Botanical Garden, Carl Skottsbergs Gata 22B, SE-413 19, Gothenburg, Sweden
Dominic J Bennett: Gothenburg Global Biodiversity Centre, Box 461, 405 30, Gothenburg, Sweden; Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, 405 30, Gothenburg, Sweden; Gothenburg Botanical Garden, Carl Skottsbergs Gata 22B, SE-413 19, Gothenburg, Sweden
Laura R Botigué: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
J Gordon Burleigh: Department of Biology, University of Florida, Florida, 32611, USA
Steven Dodsworth: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Brian J Enquist: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA; The Santa Fe Institute, Santa Fe, NM, 87501, USA
Félix Forest: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Jan T Kim: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Alexey M Kozlov: Scientific Computing Group, Heidelberg Institute for Theoretical Studies, 69118, Heidelberg, Germany
Ilia J Leitch: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Brian S Maitner: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
Siavash Mirarab: Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, 92093, USA
William H Piel: Yale-NUS College, 16 College Avenue West, Singapore, 138527, Republic of Singapore
Oscar A Pérez-Escobar: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Lisa Pokorny: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK
Carsten Rahbek: Center for Macroecology, Evolution and Climate, University of Copenhagen, Universitetsparken 15, DK-2100, Copenhagen O, Denmark; Imperial College London, Silwood Park, Buckhurst Road, Ascot, Berkshire, SL5 7PY, UK
Brody Sandel: Department of Biology, Santa Clara University, Santa Clara, CA, 95053, USA
Stephen A Smith: Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
Alexandros Stamatakis: Scientific Computing Group, Heidelberg Institute for Theoretical Studies, 69118, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, 76128, Karlsruhe, Germany
Rutger A Vos: Naturalis Biodiversity Center, P.O. Box 9517, 2300RA, Leiden, The Netherlands; Institute of Biology Leiden, P.O. Box 9505, 2300RA, Leiden, The Netherlands
Tandy Warnow: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
William J Baker: Royal Botanic Gardens, Kew, TW9 3AE, Richmond, Surrey, UK

Zou Q, Wan S, Zeng X, Ma ZS. Reconstructing evolutionary trees in parallel for massive sequences. BMC Systems Biology 2017;11:100. [PMID: 29297337 PMCID: PMC5751538 DOI: 10.1186/s12918-017-0476-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022]

Vachaspati P, Warnow T. FastRFS: fast and accurate Robinson-Foulds Supertrees using constrained exact optimization. Bioinformatics 2017;33:631-639. [PMID: 27663499 PMCID: PMC5870905 DOI: 10.1093/bioinformatics/btw600] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/30/2016] [Accepted: 09/12/2016] [Indexed: 01/22/2023] Open

Molloy EK, Warnow T. To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods. Syst Biol 2017;67:285-303. [DOI: 10.1093/sysbio/syx077] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Received: 12/08/2016] [Accepted: 09/13/2017] [Indexed: 01/27/2023] Open

Bhattacharyya S, Mukherjee J. IDXL: Species Tree Inference Using Internode Distance and Excess Gene Leaf Count. J Mol Evol 2017;85:57-78. [PMID: 28835989 DOI: 10.1007/s00239-017-9807-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/22/2016] [Accepted: 08/09/2017] [Indexed: 11/28/2022]

Mirarab S, Warnow T. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 2015;31:i44-52. [PMID: 26072508 PMCID: PMC4765870 DOI: 10.1093/bioinformatics/btv234] [Citation(s) in RCA: 578] [Impact Index Per Article: 64.2] [Indexed: 12/19/2022] Open

Chou J, Gupta A, Yaduvanshi S, Davidson R, Nute M, Mirarab S, Warnow T. A comparative study of SVDquartets and other coalescent-based species tree estimation methods. BMC Genomics 2015;16 Suppl 10:S2. [PMID: 26449249 PMCID: PMC4602346 DOI: 10.1186/1471-2164-16-s10-s2] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Species tree estimation is challenging in the presence of incomplete lineage sorting (ILS), which can make gene trees different from the species tree. Because ILS is expected to occur and the standard concatenation approach can return incorrect trees with high support in the presence of ILS, "coalescent-based" summary methods (which first estimate gene trees and then combine gene trees into a species tree) have been developed that have theoretical guarantees of robustness to arbitrarily high amounts of ILS. Some studies have suggested that summary methods should only be used on "c-genes" (i.e., recombination-free loci) that can be extremely short (sometimes fewer than 100 sites). However, gene trees estimated on short alignments can have high estimation error, and summary methods tend to have high error on short c-genes. To address this problem, Chifman and Kubatko introduced SVDquartets, a new coalescent-based method. SVDquartets takes multi-locus unlinked single-site data, infers the quartet trees for all subsets of four species, and then combines the set of quartet trees into a species tree using a quartet amalgamation heuristic. Yet, the relative accuracy of SVDquartets to leading coalescent-based methods has not been assessed.

RESULTS

We compared SVDquartets to two leading coalescent-based methods (ASTRAL-2 and NJst), and to concatenation using maximum likelihood. We used a collection of simulated datasets, varying ILS levels, numbers of taxa, and number of sites per locus. Although SVDquartets was sometimes more accurate than ASTRAL-2 and NJst, most often the best results were obtained using ASTRAL-2, even on the shortest gene sequence alignments we explored (with only 10 sites per locus). Finally, concatenation was the most accurate of all methods under low ILS conditions.

CONCLUSIONS

ASTRAL-2 generally had the best accuracy under higher ILS conditions, and concatenation had the best accuracy under the lowest ILS conditions. However, SVDquartets was competitive with the best methods under conditions with low ILS and small numbers of sites per locus. The good performance under many conditions of ASTRAL-2 in comparison to SVDquartets is surprising given the known vulnerability of ASTRAL-2 and similar methods to short gene sequences.
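
The abstract above notes that SVDquartets infers a quartet tree for every subset of four species. To give a sense of the size of that search space, here is a minimal Python sketch that enumerates the four-species subsets and the three possible unrooted topologies for each. It does not implement SVDquartets (there is no scoring of topologies and no amalgamation step); the function names and species labels are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def all_quartets(species):
    """Yield every subset of four species as a tuple."""
    return combinations(species, 4)


def quartet_topologies(a, b, c, d):
    """Return the three unrooted topologies on four taxa: ab|cd, ac|bd, ad|bc."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


if __name__ == "__main__":
    species = ["A", "B", "C", "D", "E"]  # hypothetical labels
    for quartet in all_quartets(species):
        print(quartet, quartet_topologies(*quartet))
    # The number of quartets is "n choose 4", so it grows on the order of n^4.
    print(comb(len(species), 4))
```

A quartet-based method has to choose one of the three topologies for each subset (or for a sample of subsets) before combining them, which is why the number of subsets matters for running time.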

Vachaspati P, Warnow T. ASTRID: Accurate Species TRees from Internode Distances. BMC Genomics 2015;16 Suppl 10:S3. [PMID: 26449326 PMCID: PMC4602181 DOI: 10.1186/1471-2164-16-s10-s3] [Citation(s) in RCA: 129] [Impact Index Per Article: 14.3] [Indexed: 12/15/2022] Open

Roch S, Warnow T. On the Robustness to Gene Tree Estimation Error (or lack thereof) of Coalescent-Based Species Tree Methods. Syst Biol 2015;64:663-76. [PMID: 25813358 DOI: 10.1093/sysbio/syv016] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 12/11/2014] [Accepted: 03/20/2015] [Indexed: 11/13/2022] Open

Zimmermann T, Mirarab S, Warnow T. BBCA: Improving the scalability of *BEAST using random binning. BMC Genomics 2014;15 Suppl 6:S11. [PMID: 25572469 PMCID: PMC4239591 DOI: 10.1186/1471-2164-15-s6-s11] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 11/21/2022] Open