Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Izquierdo-Carrasco F, Cazes J, Smith SA, Stamatakis A. PUmPER: phylogenies updated perpetually. Bioinformatics 2014;30:1476-7. [PMID: 24478338 PMCID: PMC4016711 DOI: 10.1093/bioinformatics/btu053] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022]



Cited by Other Article(s)

Kramer AM, Thornlow B, Ye C, De Maio N, McBroome J, Hinrichs AS, Lanfear R, Turakhia Y, Corbett-Detig R. Online Phylogenetics with matOptimize Produces Equivalent Trees and is Dramatically More Efficient for Large SARS-CoV-2 Phylogenies than de novo and Maximum-Likelihood Implementations. Syst Biol 2023;72:1039-1051. [PMID: 37232476 PMCID: PMC10627557 DOI: 10.1093/sysbio/syad031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 05/14/2023] [Accepted: 06/22/2023] [Indexed: 05/27/2023]

Abstract

Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 data sets do not fit this mold. There are currently over 14 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) and pseudo-ML methods may be more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger data sets. Here, we evaluate the performance of de novo and online phylogenetic approaches, as well as ML, pseudo-ML, and MP frameworks for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimization with UShER and matOptimize produces equivalent SARS-CoV-2 phylogenies to some of the most popular ML and pseudo-ML inference tools. 
MP optimization with UShER and matOptimize is thousands of times faster than presently available implementations of ML, and online phylogenetics is faster than de novo inference. Our results therefore suggest that parsimony-based methods like UShER and matOptimize represent an accurate and more practical alternative to established ML implementations for large SARS-CoV-2 phylogenies and could be successfully applied to other similar data sets with particularly dense sampling and short branch lengths.


Thornlow B, Kramer A, Ye C, De Maio N, McBroome J, Hinrichs AS, Lanfear R, Turakhia Y, Corbett-Detig R. Online Phylogenetics using Parsimony Produces Slightly Better Trees and is Dramatically More Efficient for Large SARS-CoV-2 Phylogenies than de novo and Maximum-Likelihood Approaches. bioRxiv 2022:2021.12.02.471004. [PMID: 35611334 PMCID: PMC9128781 DOI: 10.1101/2021.12.02.471004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]

Abstract

Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 datasets do not fit this mould. There are currently over 10 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) methods are more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger datasets. Here, we evaluate the performance of de novo and online phylogenetic approaches, and ML and MP frameworks, for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimizations produce more accurate SARS-CoV-2 phylogenies than do ML optimizations. 
Since MP is thousands of times faster than presently available implementations of ML, and online phylogenetics is faster than de novo inference, we propose that, in the context of comprehensive genomic epidemiology of SARS-CoV-2, MP online phylogenetics approaches should be favored.


Sánchez-Reyes LL, Kandziora M, McTavish EJ. Physcraper: a Python package for continually updated phylogenetic trees using the Open Tree of Life. BMC Bioinformatics 2021;22:355. [PMID: 34187366 PMCID: PMC8244228 DOI: 10.1186/s12859-021-04274-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/08/2021] [Accepted: 06/16/2021] [Indexed: 11/10/2022]

Hu D, Liu B, Wang L, Reeves PR. Living Trees: High-Quality Reproducible and Reusable Construction of Bacterial Phylogenetic Trees. Mol Biol Evol 2020;37:563-575. [PMID: 31633785 DOI: 10.1093/molbev/msz241] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/08/2023]

Gill MS, Lemey P, Suchard MA, Rambaut A, Baele G. Online Bayesian Phylodynamic Inference in BEAST with Application to Epidemic Reconstruction. Mol Biol Evol 2020;37:1832-1842. [PMID: 32101295 PMCID: PMC7253210 DOI: 10.1093/molbev/msaa047] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/15/2022]

Fang Y, Liu C, Lin J, Li X, Alavian KN, Yang Y, Niu Y. PhySpeTree: an automated pipeline for reconstructing phylogenetic species trees. BMC Evol Biol 2019;19:219. [PMID: 31791235 PMCID: PMC6889546 DOI: 10.1186/s12862-019-1541-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/19/2019] [Accepted: 11/13/2019] [Indexed: 02/05/2023]

Fourment M, Claywell BC, Dinh V, McCoy C, Matsen IV FA, Darling AE. Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals. Syst Biol 2018;67:490-502. [PMID: 29186587 PMCID: PMC5920299 DOI: 10.1093/sysbio/syx090] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/02/2017] [Accepted: 11/20/2017] [Indexed: 11/14/2022]

del Campo J, Kolisko M, Boscaro V, Santoferrara LF, Nenarokov S, Massana R, Guillou L, Simpson A, Berney C, de Vargas C, Brown MW, Keeling PJ, Wegener Parfrey L. EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution. PLoS Biol 2018;16:e2005849. [PMID: 30222734 PMCID: PMC6160240 DOI: 10.1371/journal.pbio.2005849] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Revised: 09/27/2018] [Indexed: 01/03/2023]

Abstract

Environmental sequencing has greatly expanded our knowledge of micro-eukaryotic diversity and ecology by revealing previously unknown lineages and their distribution. However, the value of these data is critically dependent on the quality of the reference databases used to assign an identity to environmental sequences. Existing databases contain errors and struggle to keep pace with rapidly changing eukaryotic taxonomy, the influx of novel diversity, and computational challenges related to assembling the high-quality alignments and trees needed for accurate characterization of lineage diversity. EukRef (eukref.org) is an ongoing community-driven initiative that addresses these challenges by bringing together taxonomists with expertise spanning the eukaryotic tree of life and microbial ecologists, who use environmental sequence data to develop reliable reference databases across the diversity of microbial eukaryotes. EukRef organizes and facilitates rigorous mining and annotation of sequence data by providing protocols, guidelines, and tools. The EukRef pipeline and tools allow users interested in a particular group of microbial eukaryotes to retrieve all sequences belonging to that group from International Nucleotide Sequence Database Collaboration (INSDC) (GenBank, the European Nucleotide Archive [ENA], or the DNA DataBank of Japan [DDBJ]), to place those sequences in a phylogenetic tree, and to curate taxonomic and environmental information for the group. We provide guidelines to facilitate the process and to standardize taxonomic annotations. The final outputs of this process are (1) a reference tree and alignment, (2) a reference sequence database, including taxonomic and environmental information, and (3) a list of putative chimeras and other artifactual sequences. These products will be useful for the broad community as they become publicly available (at eukref.org) and are shared with existing reference databases.


Affiliation(s)

Javier del Campo: Department of Marine Biology and Oceanography, Institut de Ciències del Mar, CSIC, Barcelona, Catalonia, Spain; Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
Martin Kolisko: Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
Vittorio Boscaro: Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
Luciana F. Santoferrara: Departments of Marine Sciences & Ecology and Evolutionary Biology, University of Connecticut, Storrs, United States of America
Serafim Nenarokov: Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
Ramon Massana: Department of Marine Biology and Oceanography, Institut de Ciències del Mar, CSIC, Barcelona, Catalonia, Spain
Laure Guillou: Sorbonne Université, CNRS, Station Biologique de Roscoff, UMR7144, Roscoff, France
Alastair Simpson: Department of Biology, and Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
Cedric Berney: Sorbonne Université, CNRS, Station Biologique de Roscoff, UMR7144, Roscoff, France
Colomban de Vargas: Sorbonne Université, CNRS, Station Biologique de Roscoff, UMR7144, Roscoff, France
Matthew W. Brown: Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, United States of America
Patrick J. Keeling: Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
Laura Wegener Parfrey: Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada; Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada


Borstein SR, O’Meara BC. AnnotationBustR: an R package to extract subsequences from GenBank annotations. PeerJ 2018;6:e5179. [PMID: 30002984 PMCID: PMC6034590 DOI: 10.7717/peerj.5179] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/12/2017] [Accepted: 06/18/2018] [Indexed: 02/01/2023]

Modha S, Thanki AS, Cotmore SF, Davison AJ, Hughes J. ViCTree: an automated framework for taxonomic classification from protein sequences. Bioinformatics 2018;34:2195-2200. [PMID: 29474519 PMCID: PMC6022645 DOI: 10.1093/bioinformatics/bty099] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/27/2017] [Revised: 01/08/2018] [Accepted: 02/20/2018] [Indexed: 11/14/2022]

Dinh V, Darling AE, Matsen IV FA. Online Bayesian Phylogenetic Inference: Theoretical Foundations via Sequential Monte Carlo. Syst Biol 2018;67:503-517. [PMID: 29244177 PMCID: PMC5920340 DOI: 10.1093/sysbio/syx087] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/02/2017] [Revised: 11/08/2017] [Accepted: 11/09/2017] [Indexed: 11/29/2022]

Smith SA, Brown JW. Constructing a broadly inclusive seed plant phylogeny. Am J Bot 2018;105:302-314. [PMID: 29746720 DOI: 10.1002/ajb2.1019] [Citation(s) in RCA: 363] [Impact Index Per Article: 60.5] [Received: 08/08/2017] [Accepted: 10/19/2017] [Indexed: 05/03/2023]

Antonelli A, Hettling H, Condamine FL, Vos K, Nilsson RH, Sanderson MJ, Sauquet H, Scharn R, Silvestro D, Töpel M, Bacon CD, Oxelman B, Vos RA. Toward a Self-Updating Platform for Estimating Rates of Speciation and Migration, Ages, and Relationships of Taxa. Syst Biol 2017;66:152-166. [PMID: 27616324 PMCID: PMC5410925 DOI: 10.1093/sysbio/syw066] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/30/2015] [Accepted: 07/19/2016] [Indexed: 01/06/2023]

Affiliation(s)

Alexandre Antonelli: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; Gothenburg Botanical Garden, Carl Skottsbergs Gata 22A, SE-41319 Göteborg, Sweden
Hannes Hettling: Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands
Fabien L Condamine: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; CNRS, UMR 5554 Institut des Sciences de l'Evolution (Université de Montpellier), Place Eugène Bataillon, 34095 Montpellier, France
Karin Vos: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
R Henrik Nilsson: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
Michael J Sanderson: Department of Ecology and Evolutionary Biology, University of Arizona, 1041 E. Lowell, Tucson, AZ 85721, USA
Hervé Sauquet: Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France
Ruud Scharn: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
Daniele Silvestro: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
Mats Töpel: Swedish Bioinformatics Infrastructure for Life Sciences, Department of Biological and Environmental Sciences, University of Gothenburg, Box 463, SE-405 30 Göteborg, Sweden; Department of Marine Sciences, University of Gothenburg, Box 460, SE-405 30 Göteborg, Sweden
Christine D Bacon: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
Bengt Oxelman: Department of Biological and Environmental Sciences, University of Gothenburg, Box 461, SE-405 30 Göteborg, Sweden
Rutger A Vos: Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands


Matsen FA. Phylogenetics and the human microbiome. Syst Biol 2015;64:e26-41. [PMID: 25102857 PMCID: PMC4265140 DOI: 10.1093/sysbio/syu053] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 10/21/2013] [Accepted: 07/24/2014] [Indexed: 01/04/2023]