Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ané C, Sanderson M. Missing the Forest for the Trees: Phylogenetic Compression and Its Implications for Inferring Complex Evolutionary Histories. Syst Biol 2005;54:146-57. [PMID: 15805016 DOI: 10.1080/10635150590905984] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 10/25/2022] Open

Cited by Other Article(s)

McBroome J, Thornlow B, Hinrichs AS, Kramer A, De Maio N, Goldman N, Haussler D, Corbett-Detig R, Turakhia Y. A Daily-Updated Database and Tools for Comprehensive SARS-CoV-2 Mutation-Annotated Trees. Mol Biol Evol 2021;38:5819-5824. [PMID: 34469548 PMCID: PMC8662617 DOI: 10.1093/molbev/msab264] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Indexed: 12/17/2022] Open

Hübner L, Kozlov AM, Hespe D, Sanders P, Stamatakis A. Exploring parallel MPI fault tolerance mechanisms for phylogenetic inference with RAxML-NG. Bioinformatics 2021;37:4056-4063. [PMID: 34037680 PMCID: PMC9502163 DOI: 10.1093/bioinformatics/btab399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2021] [Revised: 05/10/2021] [Accepted: 05/25/2021] [Indexed: 11/18/2022] Open

Ralph P, Thornton K, Kelleher J. Efficiently Summarizing Relationships in Large Samples: A General Duality Between Statistics of Genealogies and Genomes. Genetics 2020;215:779-797. [PMID: 32357960 PMCID: PMC7337078 DOI: 10.1534/genetics.120.303253] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 09/23/2019] [Accepted: 04/28/2020] [Indexed: 12/11/2022] Open

Abstract

As a genetic mutation is passed down across generations, it distinguishes those genomes that have inherited it from those that have not, providing a glimpse of the genealogical tree relating the genomes to each other at that site. Statistical summaries of genetic variation therefore also describe the underlying genealogies. We use this correspondence to define a general framework that efficiently computes single-site population genetic statistics using the succinct tree sequence encoding of genealogies and genome sequence. The general approach accumulates sample weights within the genealogical tree at each position on the genome, which are then combined using a summary function; different statistics result from different choices of weight and function. Results can be reported in three ways: by site, which corresponds to statistics calculated as usual from genome sequence; by branch, which gives the expected value of the dual site statistic under the infinite sites model of mutation, and by node, which summarizes the contribution of each ancestor to these statistics. We use the framework to implement many currently defined statistics of genome sequence (making the statistics' relationship to the underlying genealogical trees concrete and explicit), as well as the corresponding branch statistics of tree shape. We evaluate computational performance using simulated data, and show that calculating statistics from tree sequences using this general framework is several orders of magnitude more efficient than optimized matrix-based methods in terms of both run time and memory requirements. We also explore how well the duality between site and branch statistics holds in practice on trees inferred from the 1000 Genomes Project data set, and discuss ways in which deviations may encode interesting biological signals.
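The weight-and-summary-function idea described in this abstract can be illustrated with a small sketch. This is not the authors' implementation and uses no library API. The toy tree, the mutation placement, and all function names are hypothetical and were chosen only for illustration. Each sample carries a weight of 1, the weights are accumulated up a single genealogical tree, and the summary function f(x) = x(n - x) / C(n, 2) is applied at each mutation to give the mean number of pairwise differences (a site statistic). A direct computation from the genotypes is included as a cross-check.

```python
from itertools import combinations

# Hypothetical toy genealogy (child -> parent); samples are nodes 0..3, root is 6.
PARENT = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6}
SAMPLES = [0, 1, 2, 3]
# Hypothetical mutations, each sitting on the branch above the listed node.
MUTATION_NODES = [4, 0, 5, 5]


def samples_below(parent, samples):
    """Accumulate a weight of 1 per sample up the tree: count[u] = samples under u."""
    count = {}
    for s in samples:
        u = s
        while u is not None:
            count[u] = count.get(u, 0) + 1
            u = parent.get(u)
    return count


def diversity_from_tree(parent, samples, mutation_nodes):
    """Sum the summary function x * (n - x) / C(n, 2) over all mutations."""
    n = len(samples)
    count = samples_below(parent, samples)
    pairs = n * (n - 1) / 2
    return sum(count[u] * (n - count[u]) for u in mutation_nodes) / pairs


def diversity_from_genotypes(parent, samples, mutation_nodes):
    """Cross-check: build genotypes explicitly and average pairwise differences."""
    def carries(sample, node):
        u = sample
        while u is not None:
            if u == node:
                return True
            u = parent.get(u)
        return False

    geno = {s: [carries(s, m) for m in mutation_nodes] for s in samples}
    diffs = [
        sum(a != b for a, b in zip(geno[i], geno[j]))
        for i, j in combinations(samples, 2)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print(diversity_from_tree(PARENT, SAMPLES, MUTATION_NODES))
    print(diversity_from_genotypes(PARENT, SAMPLES, MUTATION_NODES))
```

The two functions agree on this toy input. The tree-based version never builds the genotype matrix, which is the kind of saving the abstract attributes to working on genealogies rather than on sequence data, although this sketch handles only one tree and makes no claim about performance.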

Inferring whole-genome histories in large population datasets. Nat Genet 2019;51:1330-1338. [PMID: 31477934 PMCID: PMC6726478 DOI: 10.1038/s41588-019-0483-y] [Citation(s) in RCA: 121] [Impact Index Per Article: 24.2] [Received: 02/20/2019] [Accepted: 07/15/2019] [Indexed: 01/01/2023]

Sherwin WB, Chao A, Jost L, Smouse PE. Information Theory Broadens the Spectrum of Molecular Ecology and Evolution. Trends Ecol Evol 2017;32:948-963. [PMID: 29126564 DOI: 10.1016/j.tree.2017.09.012] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 03/06/2017] [Revised: 09/22/2017] [Accepted: 09/26/2017] [Indexed: 01/18/2023]

Sherwin WB. Genes are information, so information theory is coming to the aid of evolutionary biology. Mol Ecol Resour 2016;15:1259-61. [PMID: 26452559 DOI: 10.1111/1755-0998.12458] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/03/2015] [Accepted: 08/17/2015] [Indexed: 11/28/2022]

Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics. Mol Phylogenet Evol 2016;94:447-62. [DOI: 10.1016/j.ympev.2015.10.027] [Citation(s) in RCA: 265] [Impact Index Per Article: 33.1] [Indexed: 01/19/2023]

Cohen AR, Vitányi PM. Normalized Compression Distance of Multisets with Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 2015;37:1602-14. [PMID: 26352998 PMCID: PMC4566858 DOI: 10.1109/tpami.2014.2375175] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 05/23/2023]

Stenz NWM, Larget B, Baum DA, Ané C. Exploring Tree-Like and Non-Tree-Like Patterns Using Genome Sequences: An Example Using the Inbreeding Plant Species Arabidopsis thaliana (L.) Heynh. Syst Biol 2015;64:809-23. [DOI: 10.1093/sysbio/syv039] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 11/11/2014] [Accepted: 06/04/2015] [Indexed: 11/14/2022] Open

McTavish EJ, Hinchliff CE, Allman JF, Brown JW, Cranston KA, Holder MT, Rees JA, Smith SA. Phylesystem: a git-based data store for community-curated phylogenetic estimates. Bioinformatics 2015;31:2794-800. [PMID: 25940563 PMCID: PMC4547614 DOI: 10.1093/bioinformatics/btv276] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 01/15/2015] [Accepted: 04/27/2015] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct.

RESULTS

Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git's version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the 'phylesystem-api', which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements.

AVAILABILITY AND IMPLEMENTATION

Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator. Code for that tool is available from https://github.com/OpenTreeOfLife/opentree.

CONTACT

mtholder@gmail.com.

McMahon MM, Deepak A, Fernández-Baca D, Boss D, Sanderson MJ. STBase: one million species trees for comparative biology. PLoS One 2015;10:e0117987. [PMID: 25679219 PMCID: PMC4332655 DOI: 10.1371/journal.pone.0117987] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/13/2014] [Accepted: 01/05/2015] [Indexed: 11/29/2022] Open

Abstract

Comprehensively sampled phylogenetic trees provide the most compelling foundations for strong inferences in comparative evolutionary biology. Mismatches are common, however, between the taxa for which comparative data are available and the taxa sampled by published phylogenetic analyses. Moreover, many published phylogenies are gene trees, which cannot always be adapted immediately for species level comparisons because of discordance, gene duplication, and other confounding biological processes. A new database, STBase, lets comparative biologists quickly retrieve species level phylogenetic hypotheses in response to a query list of species names. The database consists of 1 million single- and multi-locus data sets, each with a confidence set of 1000 putative species trees, computed from GenBank sequence data for 413,000 eukaryotic taxa. Two bodies of theoretical work are leveraged to aid in the assembly of multi-locus concatenated data sets for species tree construction. First, multiply labeled gene trees are pruned to conflict-free singly-labeled species-level trees that can be combined between loci. Second, impacts of missing data in multi-locus data sets are ameliorated by assembling only decisive data sets. Data sets overlapping with the user's query are ranked using a scheme that depends on user-provided weights for tree quality and for taxonomic overlap of the tree with the query. Retrieval times are independent of the size of the database, typically a few seconds. Tree quality is assessed by a real-time evaluation of bootstrap support on just the overlapping subtree. Associated sequence alignments, tree files and metadata can be downloaded for subsequent analysis. STBase provides a tool for comparative biologists interested in exploiting the most relevant sequence data available for the taxa of interest. It may also serve as a prototype for future species tree oriented databases and as a resource for assembly of larger species phylogenies from precomputed trees.

Morales-Cazan A, Albert JS. Monophyly of Heterandriini (Teleostei: Poeciliidae) revisited: a critical review of the data. Neotropical Ichthyology 2012. [DOI: 10.1590/s1679-62252012000100003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]

Abstract

The systematics and taxonomy of poeciliid fishes (guppies and allies) remain poorly understood despite the relative importance of these species as model systems in the biological sciences. This study focuses on testing the monophyly of the nominal poeciliine tribe Heterandriini and the genus Heterandria, through examination of the morphological characters on which the current classification is based. These characters include aspects of body shape (morphometrics), scale and fin-ray counts (meristics), pigmentation, the cephalic laterosensory system, and osteological features of the neurocranium, oral jaws and suspensorium, branchial basket, pectoral girdle, and the gonopodium and its supports. A Maximum Parsimony analysis was conducted of 150 characters coded for 56 poeciliid and outgroup species, including 22 of 45 heterandriin species (following the account in Parenti & Rauchenberger, 1989), or seven of nine heterandriin species (following the account in Lucinda & Reis, 2005). Multistate characters were analyzed as both unordered and ordered, and iterative a posteriori weighting was used to improve tree resolution. Tree topologies obtained from these analyses support the monophyly of the Middle American species of "Heterandria," which, based on available phylogenetic information, are herein reassigned to the genus Pseudoxiphophorus. None of the characters used in previous studies to characterize the nominal taxon Heterandriini are found to be unambiguously diagnostic. Some of these characters are shared with species in other poeciliid tribes, and others are reversed within the Heterandriini. These results support the hypothesis that Pseudoxiphophorus is monophyletic, and that this clade is not the closest relative of H. formosa (the type species) from southeastern North America. Available morphological data are not sufficient to assess the phylogenetic relationships of H. formosa with respect to other members of the Heterandriini. The results further suggest that most tribe-level taxa of the Poeciliinae are not monophyletic, and that further work remains to resolve the evolutionary relationships of this group.

Escobar JS, Scornavacca C, Cenci A, Guilhaumon C, Santoni S, Douzery EJP, Ranwez V, Glémin S, David J. Multigenic phylogeny and analysis of tree incongruences in Triticeae (Poaceae). BMC Evol Biol 2011;11:181. [PMID: 21702931 PMCID: PMC3142523 DOI: 10.1186/1471-2148-11-181] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 01/25/2011] [Accepted: 06/24/2011] [Indexed: 11/30/2022] Open

Abstract

Background

Introgressive events (e.g., hybridization, gene flow, horizontal gene transfer) and incomplete lineage sorting of ancestral polymorphisms are a challenge for phylogenetic analyses since different genes may exhibit conflicting genealogical histories. Grasses of the Triticeae tribe provide a particularly striking example of incongruence among gene trees. Previous phylogenies, mostly inferred with one gene, are in conflict for several taxon positions. Therefore, obtaining a resolved picture of relationships among genera and species of this tribe has been a challenging task. Here, we obtain the most comprehensive molecular dataset to date in Triticeae, including one chloroplastic and 26 nuclear genes. We aim to test whether it is possible to infer phylogenetic relationships in the face of (potentially) large-scale introgressive events and/or incomplete lineage sorting; to identify parts of the evolutionary history that have not evolved in a tree-like manner; and to decipher the biological causes of gene-tree conflicts in this tribe.

Results

We obtain resolved phylogenetic hypotheses using the supermatrix and Bayesian Concordance Factors (BCF) approaches despite numerous incongruences among gene trees. These phylogenies suggest the existence of 4-5 major clades within Triticeae, with Psathyrostachys and Hordeum being the deepest genera. In addition, we construct a multigenic network that highlights parts of the Triticeae history that have not evolved in a tree-like manner. Dasypyrum, Heteranthelium and genera of clade V, grouping Secale, Taeniatherum, Triticum and Aegilops, have evolved in a reticulated manner. Their relationships are thus better represented by the multigenic network than by the supermatrix or BCF trees. Notably, we demonstrate that gene-tree incongruences increase with genetic distance and are greater in telomeric than centromeric genes. Together, our results suggest that recombination is the main factor decoupling gene trees from multigenic trees.

Conclusions

Our study is the first to propose a comprehensive, multigenic phylogeny of Triticeae. It clarifies several aspects of the relationships among genera and species of this tribe, and pinpoints biological groups with likely reticulate evolution. Importantly, this study extends previous results obtained in Drosophila by demonstrating that recombination can exacerbate gene-tree conflicts in phylogenetic reconstructions.

Ané C. Detecting phylogenetic breakpoints and discordance from genome-wide alignments for species tree reconstruction. Genome Biol Evol 2011;3:246-58. [PMID: 21362638 PMCID: PMC3070431 DOI: 10.1093/gbe/evr013] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022] Open

Entropy and Information Approaches to Genetic Diversity and its Expression: Genomic Geography. Entropy 2010. [DOI: 10.3390/e12071765] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Indexed: 01/03/2023]

White MA, Ané C, Dewey CN, Larget BR, Payseur BA. Fine-scale phylogenetic discordance across the house mouse genome. PLoS Genet 2009;5:e1000729. [PMID: 19936022 PMCID: PMC2770633 DOI: 10.1371/journal.pgen.1000729] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Received: 04/08/2009] [Accepted: 10/19/2009] [Indexed: 11/18/2022] Open

Baum DA. Species as ranked taxa. Syst Biol 2009;58:74-86. [PMID: 20525569 DOI: 10.1093/sysbio/syp011] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Indexed: 11/14/2022] Open

Abstract

Because species names play an important role in scientific communication, it is more important that species be understood to be taxa than that they be equated with functional ecological or evolutionary entities. Although most biologists would agree that taxa are composed of organisms that share a unique common history, 2 major challenges remain in developing a species-as-taxa concept. First, grouping: in the face of genealogical discordance at all levels in the taxonomic hierarchy, how can we understand the nature of taxa? Second, ranking: what criteria should be used to designate certain taxa in a nested series as being species? The grouping problem can be solved by viewing taxa as exclusive groups of organisms: sets of organisms that form a clade for a plurality of the genome (more than any conflicting set). However, no single objective criterion of species rank can be proposed. Instead, the species rank should be assigned by practitioners based on the semisubjective application of a set of species-ranking criteria. Although these criteria can be designed to yield species taxa that approximately match the ecological, evolutionary, and morphological entities that taxonomists have traditionally associated with the species rank, such a correspondence cannot be enforced without undermining the assumption that species are taxa. The challenge and art of monography is to use genealogical and other kinds of data to assign all organisms to one and only one species-ranked taxon. Various implications of the species-as-ranked-taxa view are discussed, including the synchronic nature of taxa, fossil species, the treatment of hybrids, and species nomenclature. I conclude that, although challenges remain, adopting the view that species are ranked taxa will facilitate a much-needed revolution in taxonomy that will allow it to better serve the biodiversity informatic needs of the 21st century.

Treangen TJ, Darling AE, Achaz G, Ragan MA, Messeguer X, Rocha EPC. A novel heuristic for local multiple alignment of interspersed DNA repeats. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009;6:180-189. [PMID: 19407343 DOI: 10.1109/tcbb.2009.9] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/27/2023]

Chen D, Burleigh GJ, Fernández-Baca D. Spectral partitioning of phylogenetic data sets based on compatibility. Syst Biol 2007;56:623-32. [PMID: 17654366 DOI: 10.1080/10635150701499571] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022] Open

Darling AE, Treangen TJ, Zhang L, Kuiken C, Messeguer X, Perna NT. Procrastination Leads to Efficient Filtration for Local Multiple Alignment. Lecture Notes in Computer Science 2006. [DOI: 10.1007/11851561_12] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 01/28/2023]