Cited by Other Article(s)

Wong GKS, Soltis DE, Leebens-Mack J, Wickett NJ, Barker MS, Van de Peer Y, Graham SW, Melkonian M. Sequencing and Analyzing the Transcriptomes of a Thousand Species Across the Tree of Life for Green Plants. Annu Rev Plant Biol 2020;71:741-765. [PMID: 31851546 DOI: 10.1146/annurev-arplant-042916-041040] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 05/23/2023]

Jones MG, Khodaverdian A, Quinn JJ, Chan MM, Hussmann JA, Wang R, Xu C, Weissman JS, Yosef N. Inference of single-cell phylogenies from lineage tracing data using Cassiopeia. Genome Biol 2020;21:92. [PMID: 32290857 PMCID: PMC7155257 DOI: 10.1186/s13059-020-02000-8] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 10/30/2019] [Accepted: 03/13/2020] [Indexed: 12/14/2022]

Affiliation(s)

Matthew G Jones: Biological and Medical Informatics Graduate Program, University of California San Francisco, San Francisco, CA, USA; Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA; Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
Alex Khodaverdian: Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
Jeffrey J Quinn: Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Center for RNA Systems Biology, University of California San Francisco, San Francisco, CA, USA
Michelle M Chan: Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Center for RNA Systems Biology, University of California San Francisco, San Francisco, CA, USA
Jeffrey A Hussmann: Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Center for RNA Systems Biology, University of California San Francisco, San Francisco, CA, USA; University of California, San Francisco, Department of Microbiology and Immunology, San Francisco, California, USA
Robert Wang: Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
Chenling Xu: Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
Jonathan S Weissman: Howard Hughes Medical Institute, University of California San Francisco, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Center for RNA Systems Biology, University of California San Francisco, San Francisco, CA, USA
Nir Yosef: Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA; Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA; Ragon Institute of Massachusetts General Hospital - MIT and Harvard, Cambridge, MA, USA; Chan Zuckerberg Biohub Investigator, San Francisco, CA, USA

Prasanna AN, Gerber D, Kijpornyongpan T, Aime MC, Doyle VP, Nagy LG. Model Choice, Missing Data, and Taxon Sampling Impact Phylogenomic Inference of Deep Basidiomycota Relationships. Syst Biol 2020;69:17-37. [PMID: 31062852 DOI: 10.1093/sysbio/syz029] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/19/2017] [Revised: 04/21/2019] [Accepted: 04/26/2019] [Indexed: 11/12/2022]

Abstract

Resolving deep divergences in the tree of life is challenging even for analyses of genome-scale phylogenetic data sets. Relationships between Basidiomycota subphyla, the rusts and allies (Pucciniomycotina), smuts and allies (Ustilaginomycotina), and mushroom-forming fungi and allies (Agaricomycotina) were found particularly recalcitrant both to traditional multigene and genome-scale phylogenetics. Here, we address basal Basidiomycota relationships using concatenated and gene tree-based analyses of various phylogenomic data sets to examine the contribution of several potential sources of bias. We evaluate the contribution of biological causes (hard polytomy, incomplete lineage sorting) versus unmodeled evolutionary processes and factors that exacerbate their effects (e.g., fast-evolving sites and long-branch taxa) to inferences of basal Basidiomycota relationships. Bayesian Markov Chain Monte Carlo and likelihood mapping analyses reject the hard polytomy with confidence. In concatenated analyses, fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and coalescent analyses. On the contrary, the most conserved data subsets grouped rusts and allies with mushroom-forming fungi, although this relationship proved labile, sensitive to model choice, to different data subsets and to missing data. Excluding putative long-branch taxa, genes with high proportions of missing data and/or with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play. 
While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for any resolution of rusts, smuts, and mushroom-forming fungi, suggesting that the true Basidiomycota tree might be in a part of tree space that is difficult to access using both concatenation and gene tree-based approaches. Inference-based assessments of absolute model fit strongly reject best-fit models for the vast majority of genes, indicating a poor fit of even the most commonly used models. While this is consistent with previous assessments of site-homogenous models of amino acid evolution, this does not appear to be the sole source of confounding signal. Our analyses suggest that topologies uniting smuts with mushroom-forming fungi can arise as a result of inappropriate modeling of amino acid sites that might be prone to systematic bias. We speculate that improved models of sequence evolution could shed more light on basal splits in the Basidiomycota, which, for now, remain unresolved despite the use of whole genome data.

Wang Q, Li H. Phylogeny of the superfamily Gelechioidea (Lepidoptera: Obtectomera), with an exploratory application on geometric morphometrics. Zool Scr 2020. [DOI: 10.1111/zsc.12407] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]

Hyun DY, Sebastin R, Lee KJ, Lee GA, Shin MJ, Kim SH, Lee JR, Cho GT. Genotyping-by-Sequencing Derived Single Nucleotide Polymorphisms Provide the First Well-Resolved Phylogeny for the Genus Triticum (Poaceae). Front Plant Sci 2020;11:688. [PMID: 32625218 PMCID: PMC7311657 DOI: 10.3389/fpls.2020.00688] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/10/2020] [Accepted: 04/30/2020] [Indexed: 05/17/2023]

Abstract

Wheat (Triticum spp.) has been an important staple food crop for mankind since the beginning of agriculture. The genus Triticum L. is composed of diploid, tetraploid, and hexaploid species, the majority of which have not yet been clearly discriminated; hence their phylogeny and classification remain unresolved. Genotyping-by-sequencing (GBS) is an easy and affordable method for generating genome-wide single nucleotide polymorphism (SNP) markers. In this study, we used GBS to obtain SNPs covering all seven chromosomes from 283 accessions of Triticum-related genera. After filtering low-quality and redundant SNPs based on haplotype information, the GBS assay provided 14,188 high-quality SNPs that were distributed across the A (71%), B (26%), and D (2.4%) genomes. Cluster analysis and discriminant analysis of principal components (DAPC) allowed us to distinguish six distinct groups that matched well with Triticum species complexity. We constructed a Bayesian phylogenetic tree using 14,188 SNPs, in which 17 Triticum species and subspecies were discriminated. Dendrogram analysis revealed that the polyploid wheat species could be divided into groups according to the presence of A, B, D, and G genomes with strong nodal support and provided new insight into the evolution of spelt wheat. A total of 2,692 species-specific SNPs were identified to discriminate the common (T. aestivum) and durum (T. turgidum) wheat cultivars and landraces. In principal component analysis grouping, the two wheat species formed individual clusters and the SNPs were able to distinguish up to nine groups of 10 subspecies. This study demonstrated that GBS-derived SNPs could be used efficiently in genebank management to classify Triticum species and subspecies that are very difficult to distinguish by their morphological characters.

Du Y, Wu S, Edwards SV, Liu L. The effect of alignment uncertainty, substitution models and priors in building and dating the mammal tree of life. BMC Evol Biol 2019;19:203. [PMID: 31694538 PMCID: PMC6833305 DOI: 10.1186/s12862-019-1534-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/07/2019] [Accepted: 10/21/2019] [Indexed: 11/29/2022]

Abstract

BACKGROUND

The flood of genomic data to help build and date the tree of life requires automation at several critical junctures, most importantly during sequence assembly and alignment. It is widely appreciated that automated alignment protocols can yield inaccuracies, but the relative impact of various sources of error on phylogenomic analysis is not yet known. This study employs an updated mammal data set of 5162 coding loci sampled from 90 species to evaluate the effects of alignment uncertainty, substitution models, and fossil priors on gene tree, species tree, and divergence time estimation. Additionally, a novel coalescent likelihood ratio test is introduced for comparing competing species trees against a given set of gene trees.

RESULTS

The aligned DNA sequences of 5162 loci from 90 species were trimmed and filtered using trimAl and two filtering protocols. The final dataset contains four sets of alignments: before trimming, after trimming, filtered by a recently proposed pipeline, and further filtered by comparing ML gene trees for each locus with the concatenation tree. Our analyses suggest that the average discordance among the coalescent trees is significantly smaller than that among the concatenation trees estimated from the four sets of alignments or with different substitution models. There is no significant difference among the divergence times estimated with different substitution models. However, the divergence dates estimated from the alignments after trimming are more recent than those estimated from the alignments before trimming.

CONCLUSIONS

Our results highlight that alignment uncertainty of the updated mammal data set and the choice of substitution models have little impact on tree topologies yielded by coalescent methods for species tree estimation, whereas they are more influential on the trees made by concatenation. Given the choice of calibration scheme and clock models, divergence time estimates are robust to the choice of substitution models, but removing alignments deemed problematic by trimming algorithms can lead to more recent dates. Although the fossil prior is important in divergence time estimation, Bayesian estimates of divergence times in this data set are driven primarily by the sequence data.

Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS. Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 2019;139:106539. [DOI: 10.1016/j.ympev.2019.106539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/03/2018] [Revised: 06/10/2019] [Accepted: 06/17/2019] [Indexed: 12/26/2022]

Bos KI, Kühnert D, Herbig A, Esquivel-Gomez LR, Andrades Valtueña A, Barquera R, Giffin K, Kumar Lankapalli A, Nelson EA, Sabin S, Spyrou MA, Krause J. Paleomicrobiology: Diagnosis and Evolution of Ancient Pathogens. Annu Rev Microbiol 2019;73:639-666. [PMID: 31283430 DOI: 10.1146/annurev-micro-090817-062436] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]

Tao Q, Tamura K, Battistuzzi FU, Kumar S. A Machine Learning Method for Detecting Autocorrelation of Evolutionary Rates in Large Phylogenies. Mol Biol Evol 2019;36:811-824. [PMID: 30689923 PMCID: PMC6804408 DOI: 10.1093/molbev/msz014] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/24/2022]

Johnston PR, Quijada L, Smith CA, Baral HO, Hosoya T, Baschien C, Pärtel K, Zhuang WY, Haelewaters D, Park D, Carl S, López-Giráldez F, Wang Z, Townsend JP. A multigene phylogeny toward a new phylogenetic classification of Leotiomycetes. IMA Fungus 2019;10:1. [PMID: 32647610 PMCID: PMC7325659 DOI: 10.1186/s43008-019-0002-x] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 04/24/2019] [Accepted: 04/30/2019] [Indexed: 12/31/2022]

Abstract

Fungi in the class Leotiomycetes are ecologically diverse, including mycorrhizas, endophytes of roots and leaves, plant pathogens, aquatic and aero-aquatic hyphomycetes, mammalian pathogens, and saprobes. These fungi are commonly detected in cultures from diseased tissue and from environmental DNA extracts. The identification of specimens from such character-poor samples increasingly relies on DNA sequencing. However, the current classification of Leotiomycetes is still largely based on morphologically defined taxa, especially at higher taxonomic levels. Consequently, the formal Leotiomycetes classification is frequently poorly congruent with the relationships suggested by DNA sequencing studies. Previous class-wide phylogenies of Leotiomycetes have been based on ribosomal DNA markers, with most of the published multi-gene studies being focussed on particular genera or families. In this paper we collate data available from specimens representing both sexual and asexual morphs from across the genetic breadth of the class, with a focus on generic type species, to present a phylogeny based on up to 15 concatenated genes across 279 specimens. Included in the dataset are genes that were extracted from 72 of the genomes available for the class, including 10 new genomes released with this study. To test the statistical support for the deepest branches in the phylogeny, an additional phylogeny based on 3156 genes from 51 selected genomes is also presented. To fill some of the taxonomic gaps in the 15-gene phylogeny, we further present an ITS gene tree, particularly targeting ex-type specimens of generic type species. A small number of novel taxa are proposed: Marthamycetales ord. nov., and Drepanopezizaceae and Mniaeciaceae fams. nov. The formal taxonomic changes are limited in part because of the ad hoc nature of taxon and specimen selection, based purely on the availability of data. 
The phylogeny constitutes a framework for enabling future taxonomically targeted studies using deliberate specimen selection. Such studies will ideally include designation of epitypes for the type species of those genera for which DNA is not able to be extracted from the original type specimen, and consideration of morphological characters whenever genetically defined clades are recognized as formal taxa within a classification.

Olofsson JK, Cantera I, Van de Paer C, Hong-Wa C, Zedane L, Dunning LT, Alberti A, Christin PA, Besnard G. Phylogenomics using low-depth whole genome sequencing: A case study with the olive tribe. Mol Ecol Resour 2019;19:877-892. [PMID: 30934146 DOI: 10.1111/1755-0998.13016] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 06/05/2018] [Revised: 03/19/2019] [Accepted: 03/25/2019] [Indexed: 12/20/2022]

Effects of missing data and data type on phylotranscriptomic analysis of stony corals (Cnidaria: Anthozoa: Scleractinia). Mol Phylogenet Evol 2019;134:12-23. [DOI: 10.1016/j.ympev.2019.01.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/29/2018] [Revised: 01/11/2019] [Accepted: 01/17/2019] [Indexed: 01/28/2023]

Shin S, Clarke DJ, Lemmon AR, Moriarty Lemmon E, Aitken AL, Haddad S, Farrell BD, Marvaldi AE, Oberprieler RG, McKenna DD. Phylogenomic Data Yield New and Robust Insights into the Phylogeny and Evolution of Weevils. Mol Biol Evol 2019;35:823-836. [PMID: 29294021 DOI: 10.1093/molbev/msx324] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Indexed: 01/01/2023]

Parks MB, Wickett NJ, Alverson AJ. Signal, Uncertainty, and Conflict in Phylogenomic Data for a Diverse Lineage of Microbial Eukaryotes (Diatoms, Bacillariophyta). Mol Biol Evol 2019;35:80-93. [PMID: 29040712 PMCID: PMC5850769 DOI: 10.1093/molbev/msx268] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 02/06/2023]

Montingelli GG, Grazziotin FG, Battilana J, Murphy RW, Zhang Y, Zaher H. Higher-level phylogenetic affinities of the Neotropical genus Mastigodryas Amaral, 1934 (Serpentes: Colubridae), species-group definition and description of a new genus for Mastigodryas bifossatus. J Zool Syst Evol Res 2019. [DOI: 10.1111/jzs.12262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/01/2022]

White DM, Islam MB, Mason-Gamer RJ. Phylogenetic inference in section Archerythroxylum informs taxonomy, biogeography, and the domestication of coca (Erythroxylum species). Am J Bot 2019;106:154-165. [PMID: 30629286 DOI: 10.1002/ajb2.1224] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/14/2018] [Accepted: 10/19/2018] [Indexed: 05/12/2023]

Liu L, Anderson C, Pearl D, Edwards SV. Modern Phylogenomics: Building Phylogenetic Trees Using the Multispecies Coalescent Model. Methods Mol Biol 2019;1910:211-239. [PMID: 31278666 DOI: 10.1007/978-1-4939-9074-0_7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 06/09/2023]

Abstract

The multispecies coalescent (MSC) model provides a compelling framework for building phylogenetic trees from multilocus DNA sequence data. The pure MSC is best thought of as a special case of so-called "multispecies network coalescent" models, in which gene flow is allowed among branches of the tree, whereas MSC methods assume there is no gene flow between diverging species. Early implementations of the MSC, such as "parsimony" or "democratic vote" approaches to combining information from multiple gene trees, as well as concatenation, in which DNA sequences from multiple gene trees are combined into a single "supergene," were quickly shown to be inconsistent in some regions of tree space, in so far as they converged on the incorrect species tree as more gene trees and sequence data were accumulated. The anomaly zone, a region of tree space in which the most frequent gene tree is different from the species tree, is one such region where many so-called "coalescent" methods are inconsistent. Second-generation implementations of the MSC employed Bayesian or likelihood models; these are consistent in all regions of gene tree space, but Bayesian methods in particular are incapable of handling the large phylogenomic data sets currently available. Two-step methods, such as MP-EST and ASTRAL, in which gene trees are first estimated and then combined to estimate an overarching species tree, are currently popular in part because they can handle large phylogenomic data sets. These methods are consistent in the anomaly zone but can sometimes provide inappropriate measures of tree support or apportion error and signal in the data inappropriately. MP-EST in particular employs a likelihood model which can be conveniently manipulated to perform statistical tests of competing species trees, incorporating the likelihood of the collected gene trees on each species tree in a likelihood ratio test. 
Such tests provide a useful alternative to the multilocus bootstrap, which only indirectly tests the appropriateness of competing species trees. We illustrate these tests and implementations of the MSC with examples and suggest that MSC methods are a useful class of models effectively using information from multiple loci to build phylogenetic trees.

Carlsen MM, Fér T, Schmickl R, Leong-Škorničková J, Newman M, Kress WJ. Resolving the rapid plant radiation of early diverging lineages in the tropical Zingiberales: Pushing the limits of genomic data. Mol Phylogenet Evol 2018;128:55-68. [DOI: 10.1016/j.ympev.2018.07.020] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/20/2017] [Revised: 07/23/2018] [Accepted: 07/26/2018] [Indexed: 01/09/2023]

Sayyari E, Whitfield JB, Mirarab S. Fragmentary Gene Sequences Negatively Impact Gene Tree and Species Tree Reconstruction. Mol Biol Evol 2018;34:3279-3291. [PMID: 29029241 DOI: 10.1093/molbev/msx261] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Indexed: 12/21/2022]

Gates DJ, Pilson D, Smith SD. Filtering of target sequence capture individuals facilitates species tree construction in the plant subtribe Iochrominae (Solanaceae). Mol Phylogenet Evol 2018;123:26-34. [DOI: 10.1016/j.ympev.2018.02.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/13/2017] [Revised: 01/30/2018] [Accepted: 02/01/2018] [Indexed: 10/18/2022]

Nute M, Chou J, Molloy EK, Warnow T. The performance of coalescent-based species tree estimation methods under models of missing data. BMC Genomics 2018;19:286. [PMID: 29745854 PMCID: PMC5998899 DOI: 10.1186/s12864-018-4619-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 11/25/2022]

Dobrin BH, Zwickl DJ, Sanderson MJ. The prevalence of terraced treescapes in analyses of phylogenetic data sets. BMC Evol Biol 2018;18:46. [PMID: 29618314 PMCID: PMC5885316 DOI: 10.1186/s12862-018-1162-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/29/2017] [Accepted: 03/22/2018] [Indexed: 11/21/2022]

Abstract

BACKGROUND

The pattern of data availability in a phylogenetic data set may lead to the formation of terraces, collections of equally optimal trees. Terraces can arise in tree space if trees are scored with parsimony or with partitioned, edge-unlinked maximum likelihood. Theory predicts that terraces can be large, but their prevalence in contemporary data sets has never been surveyed. We selected 26 data sets and phylogenetic trees reported in recent literature and investigated the terraces to which the trees would belong, under a common set of inference assumptions. We examined terrace size as a function of the sampling properties of the data sets, including taxon coverage density (the proportion of taxon-by-gene positions with any data present) and a measure of gene sampling "sufficiency". We evaluated each data set in relation to the theoretical minimum gene sampling depth needed to reduce terrace size to a single tree, and explored the impact of the terraces found in replicate trees in bootstrap methods.
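The taxon coverage density defined above (the proportion of taxon-by-gene positions with any data present) can be illustrated with a minimal Python sketch. The function name, the dictionary layout, and the toy data are assumptions made for illustration; they are not taken from the study.

```python
from typing import Dict, Set


def taxon_coverage_density(presence: Dict[str, Set[str]]) -> float:
    """Return the fraction of taxon-by-gene cells that contain any data.

    `presence` maps each taxon name to the set of genes sampled for it
    (a hypothetical input layout chosen for this sketch).
    """
    # The gene universe is every gene sampled for at least one taxon.
    genes = set().union(*presence.values()) if presence else set()
    total_cells = len(presence) * len(genes)
    if total_cells == 0:
        raise ValueError("need at least one taxon and one gene")
    filled_cells = sum(len(sampled) for sampled in presence.values())
    return filled_cells / total_cells


# Toy matrix: 3 taxa x 4 genes, with 9 of the 12 cells filled.
example = {
    "taxon_A": {"g1", "g2", "g3", "g4"},
    "taxon_B": {"g1", "g2", "g3"},
    "taxon_C": {"g1", "g4"},
}
print(taxon_coverage_density(example))
```

In this toy matrix 9 of 12 cells are filled, so the density is 0.75, which would fall below the 0.90 level the abstract associates with terraces.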

RESULTS

Terraces were identified in nearly all data sets with taxon coverage densities < 0.90. They were not found, however, in high-coverage-density (i.e., ≥ 0.94) transcriptomic and genomic data sets. The terraces could be very large, and size varied inversely with taxon coverage density and with gene sampling sufficiency. Few data sets achieved a theoretical minimum gene sampling depth needed to reduce terrace size to a single tree. Terraces found during bootstrap resampling reduced overall support.

CONCLUSIONS

If certain inference assumptions apply, trees estimated from empirical data sets often belong to large terraces of equally optimal trees. Terrace size correlates with data set sampling properties. Data sets seldom include enough genes to reduce terrace size to one tree. When bootstrap replicate trees lie on a terrace, statistical support for phylogenetic hypotheses may be reduced. Although some of the published analyses surveyed were conducted with edge-linked inference models (which do not induce terraces), unlinked models have been used and advocated. The present study describes the potential impact of that inference assumption on phylogenetic inference in the context of the kinds of multigene data sets now widely assembled for large-scale tree construction.

Brower AVZ, Garzón-Orduña IJ. Missing data, clade support and "reticulation": the molecular systematics of Heliconius and related genera (Lepidoptera: Nymphalidae) re-examined. Cladistics 2018;34:151-166. [PMID: 34645081 DOI: 10.1111/cla.12198] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Accepted: 03/03/2017] [Indexed: 11/30/2022]

Christensen S, Molloy EK, Vachaspati P, Warnow T. OCTAL: Optimal Completion of gene trees in polynomial time. Algorithms Mol Biol 2018;13:6. [PMID: 29568323 PMCID: PMC5853121 DOI: 10.1186/s13015-018-0124-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/25/2017] [Accepted: 03/06/2018] [Indexed: 12/16/2022]

Abstract

Background

For a combination of reasons (including data generation protocols, approaches to taxon and gene sampling, and gene birth and loss), estimated gene trees are often incomplete, meaning that they do not contain all of the species of interest. As incomplete gene trees can impact downstream analyses, accurate completion of gene trees is desirable.

Results

We introduce the Optimal Tree Completion problem, a general optimization problem that involves completing an unrooted binary tree (i.e., adding missing leaves) so as to minimize its distance from a reference tree on a superset of the leaves. We present OCTAL, an algorithm that finds an optimal solution to this problem when the distance between trees is defined using the Robinson–Foulds (RF) distance, and we prove that OCTAL runs in $O(n^2)$ time, where $n$ is the total number of species. We report on a simulation study in which gene trees can differ from the species tree due to incomplete lineage sorting, and estimated gene trees are completed using OCTAL with a reference tree based on a species tree estimated from the multi-locus dataset. OCTAL produces completed gene trees that are closer to the true gene trees than an existing heuristic approach in ASTRAL-II, but the accuracy of a completed gene tree computed by OCTAL depends on how topologically similar the reference tree (typically an estimated species tree) is to the true gene tree.
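To make the distance being minimized concrete, here is a minimal Python sketch of the Robinson–Foulds distance between two unrooted trees on the same leaf set. It is not OCTAL and does not complete trees; it only counts the non-trivial bipartitions found in one tree but not the other. The nested-tuple tree encoding and all function names are assumptions made for this illustration.

```python
def leaves(tree):
    """Return the set of leaf labels; a leaf is a string, a node is a tuple."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial bipartitions (splits) induced by internal edges.

    Each split is stored as the side that does NOT contain a fixed anchor
    leaf, so the same split always gets the same representation.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    splits = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - below if anchor in below else below
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        return below

    walk(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Count splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Two five-leaf trees that disagree on where B and C attach.
t1 = (("A", "B"), "C", ("D", "E"))
t2 = (("A", "C"), "B", ("D", "E"))
print(rf_distance(t1, t2))
```

Both example trees share the split separating D and E from the rest, and each has one split the other lacks, so the sketch reports a distance of 2.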

Conclusions

OCTAL is a useful technique for adding missing taxa to incomplete gene trees and provides good accuracy under a wide range of model conditions. However, results show that OCTAL’s accuracy can be reduced when incomplete lineage sorting is high, as the reference tree can be far from the true gene tree. Hence, this study suggests that OCTAL would benefit from using other types of reference trees instead of species trees when there are large topological distances between true gene trees and species trees.

Electronic supplementary material

The online version of this article (10.1186/s13015-018-0124-5) contains supplementary material, which is available to authorized users.

Posada D. Phylogenomics for Systematic Biology. Syst Biol 2018;65:353-6. [PMID: 27129844 DOI: 10.1093/sysbio/syw027] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Indexed: 11/14/2022]

Rodriguez J, Jones TH, Sierwald P, Marek PE, Shear WA, Brewer MS, Kocot KM, Bond JE. Step-wise evolution of complex chemical defenses in millipedes: a phylogenomic approach. Sci Rep 2018;8:3209. [PMID: 29453332 PMCID: PMC5816663 DOI: 10.1038/s41598-018-19996-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 09/06/2017] [Accepted: 01/11/2018] [Indexed: 11/19/2022]

Blom MPK, Bragg JG, Potter S, Moritz C. Accounting for Uncertainty in Gene Tree Estimation: Summary-Coalescent Species Tree Inference in a Challenging Radiation of Australian Lizards. Syst Biol 2018;66:352-366. [PMID: 28039387 DOI: 10.1093/sysbio/syw089] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/14/2015] [Accepted: 09/27/2016] [Indexed: 11/12/2022]

Abstract

Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should, therefore, examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic data sets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ~2000 exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate.
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic data sets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference. [Coalescence; concatenation; Cryptoblepharus; exon capture; gene tree; phylogenomics; species tree.]
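
The screening design described in this abstract (rank loci by gene tree resolution, then analyse locus subsets of increasing size) could be sketched roughly as follows. This is an illustration only, not the authors' pipeline. The abstract does not say how resolution was scored, so `resolution` assumes it is the share of possible internal branches whose support reaches a hypothetical cutoff (`min_support`). All function names, the cutoff, and the example values are invented.

```python
from typing import Dict, List


def resolution(supports: List[float], n_taxa: int, min_support: float = 70.0) -> float:
    """Share of the n_taxa - 3 internal branches of an unrooted binary tree
    whose support is at least min_support (an assumed definition)."""
    max_internal = n_taxa - 3
    if max_internal <= 0:
        raise ValueError("need at least 4 taxa")
    resolved = sum(1 for s in supports if s >= min_support)
    return resolved / max_internal


def nested_subsets(scores: Dict[str, float], step: int) -> List[List[str]]:
    """Rank loci from highest to lowest resolution and return nested subsets
    of increasing size (and therefore decreasing average resolution)."""
    if step < 1:
        raise ValueError("step must be positive")
    # Ties are broken by locus name so the ranking is deterministic.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return [ranked[:k] for k in range(step, len(ranked) + step, step)]


# Hypothetical support values for three loci sampled for six taxa.
supports_by_locus = {
    "locusA": [95.0, 80.0, 40.0],
    "locusB": [100.0, 90.0, 85.0],
    "locusC": [30.0, 20.0, 10.0],
}
scores = {name: resolution(s, n_taxa=6) for name, s in supports_by_locus.items()}
subsets = nested_subsets(scores, step=1)
```

Each subset would then be passed to a species tree method of choice, and nodal support compared across subsets.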

Tibiriçá Y, Pola M, Cervera JL. Systematics of the genus Halgerda Bergh, 1880 (Heterobranchia : Nudibranchia) of Mozambique with descriptions of six new species. INVERTEBR SYST 2018. [DOI: 10.1071/is17095] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Damerau M, Freese M, Hanel R. Multi-gene phylogeny of jacks and pompanos (Carangidae), including placement of monotypic vadigo Campogramma glaycos. JOURNAL OF FISH BIOLOGY 2018;92:190-202. [PMID: 29193148 DOI: 10.1111/jfb.13509] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/04/2017] [Accepted: 10/27/2017] [Indexed: 06/07/2023]

Mallo D, Posada D. Multilocus inference of species trees and DNA barcoding. Philos Trans R Soc Lond B Biol Sci 2017;371:rstb.2015.0335. [PMID: 27481787 PMCID: PMC4971187 DOI: 10.1098/rstb.2015.0335] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/10/2016] [Indexed: 11/30/2022] Open

Loss of color terms not demonstrated. Proc Natl Acad Sci U S A 2017;114:E8131. [PMID: 28912353 DOI: 10.1073/pnas.1714007114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Molloy EK, Warnow T. To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods. Syst Biol 2017;67:285-303. [DOI: 10.1093/sysbio/syx077] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2016] [Accepted: 09/13/2017] [Indexed: 01/27/2023] Open

Kates HR, Soltis PS, Soltis DE. Evolutionary and domestication history of Cucurbita (pumpkin and squash) species inferred from 44 nuclear loci. Mol Phylogenet Evol 2017;111:98-109. [PMID: 28288944 DOI: 10.1016/j.ympev.2017.03.002] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2016] [Revised: 02/28/2017] [Accepted: 03/01/2017] [Indexed: 11/28/2022]

Li X, Jang TS, Temsch EM, Kato H, Takayama K, Schneeweiss GM. Molecular and karyological data confirm that the enigmatic genus Platypholis from Bonin-Islands (SE Japan) is phylogenetically nested within Orobanche (Orobanchaceae). JOURNAL OF PLANT RESEARCH 2017;130:273-280. [PMID: 28004281 PMCID: PMC5318490 DOI: 10.1007/s10265-016-0888-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/09/2016] [Accepted: 10/26/2016] [Indexed: 05/17/2023]

Li X, Hao B, Pan D, Schneeweiss GM. Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes. FRONTIERS IN PLANT SCIENCE 2017;8:1973. [PMID: 29218053 PMCID: PMC5704539 DOI: 10.3389/fpls.2017.01973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2017] [Accepted: 11/01/2017] [Indexed: 05/02/2023]

Abstract

Phylogenomic approaches, employing next-generation sequencing (NGS) techniques, have revolutionized systematic and evolutionary biology. Target enrichment is an efficient and cost-effective method in phylogenomics and is becoming increasingly popular. Depending on availability and quality of reference data as well as on biological features of the study system, (semi-)automated identification of suitable markers will require specific bioinformatic pipelines. Here, we established a highly flexible bioinformatic pipeline, BaitsFinder, to identify putative orthologous single copy genes (SCGs) and to construct bait sequences in a single workflow. Additionally, this pipeline has been constructed to be able to cope with challenging data sets, such as the nutritionally heterogeneous plant family Orobanchaceae. To this end, we used transcriptome data of differing quality available for four Orobanchaceae species and, as reference, SCG data from monkeyflower (Erythranthe guttata, syn. Mimulus g.; 1,915 genes) and tomato (Solanum lycopersicum; 391 genes). Depending on whether gaps were permitted in initial blast searches of the four Orobanchaceae species against the reference, our pipeline identified 1,307 and 981 SCGs with average length of 994 bp and 775 bp, respectively. Automated bait sequence construction (using 2× tiling) resulted in 38,170 and 21,856 bait sequences, respectively. In comparison to the recently published MarkerMiner 1.0 pipeline, BaitsFinder identified about 1.6 times as many SCGs (of at least 900 bp length). Skipping steps specific to analyses of Orobanchaceae, BaitsFinder was successfully used in a group of non-parasitic plants (three Asteraceae species and, as reference, SCG data from Arabidopsis thaliana based on previously compiled SCGs). Thus, BaitsFinder is expected to be broadly applicable in groups where only transcriptomes or partial genome data of differing quality are available.
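
The "2× tiling" mentioned in this abstract means that consecutive baits overlap so that most positions of a target are covered by two baits. A minimal sketch of that idea follows. It is not BaitsFinder's code: the function name `tile_baits`, the default bait length of 120 bp, and the choice to add one extra bait flush with the end of the target are all assumptions made for illustration, and the abstract does not state the bait length actually used.

```python
from typing import List


def tile_baits(seq: str, bait_len: int = 120, tiling: int = 2) -> List[str]:
    """Cut seq into overlapping baits of bait_len bases.

    With tiling=2 the start positions advance by bait_len // 2, so interior
    positions are covered by two baits.
    """
    if tiling < 1 or bait_len % tiling != 0:
        raise ValueError("bait_len must be a positive multiple of tiling")
    if len(seq) < bait_len:
        return []  # target too short to carry a full-length bait
    step = bait_len // tiling
    last_start = len(seq) - bait_len
    starts = list(range(0, last_start + 1, step))
    # Assumed behaviour: add a final bait flush with the end of the target
    # so that trailing bases are not left uncovered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [seq[s:s + bait_len] for s in starts]


# Toy example with a short bait length so the overlap is easy to see.
target = "ACGTACGTACGTACGTACGTAC"  # 22 bases
baits = tile_baits(target, bait_len=8, tiling=2)
```

In the toy example the baits start at positions 0, 4, 8 and 12, plus one end-anchored bait at position 14.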

Shen XX, Zhou X, Kominek J, Kurtzman CP, Hittinger CT, Rokas A. Reconstructing the Backbone of the Saccharomycotina Yeast Phylogeny Using Genome-Scale Data. G3 (BETHESDA, MD.) 2016;6:3927-3939. [PMID: 27672114 PMCID: PMC5144963 DOI: 10.1534/g3.116.034744] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/18/2016] [Accepted: 09/21/2016] [Indexed: 01/20/2023]

Abstract

Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. However, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.

Zhao L, Li X, Zhang N, Zhang SD, Yi TS, Ma H, Guo ZH, Li DZ. Phylogenomic analyses of large-scale nuclear genes provide new insights into the evolutionary relationships within the rosids. Mol Phylogenet Evol 2016;105:166-176. [DOI: 10.1016/j.ympev.2016.06.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2015] [Revised: 06/06/2016] [Accepted: 06/27/2016] [Indexed: 12/28/2022]

Arbizu CI, Ellison SL, Senalik D, Simon PW, Spooner DM. Genotyping-by-sequencing provides the discriminating power to investigate the subspecies of Daucus carota (Apiaceae). BMC Evol Biol 2016;16:234. [PMID: 27793080 PMCID: PMC5084430 DOI: 10.1186/s12862-016-0806-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2016] [Accepted: 10/14/2016] [Indexed: 12/05/2022] Open

Abstract

BACKGROUND

The majority of the subspecies of Daucus carota have not yet been discriminated clearly by various molecular or morphological methods and hence their phylogeny and classification remain unresolved. Recent studies using 94 nuclear orthologs and morphological characters, and studies employing other molecular approaches, were unable to clearly distinguish many of the subspecies. Fertile intercrosses among traditionally recognized subspecies are well documented. We here explore the utility of single nucleotide polymorphisms (SNPs) generated by genotyping-by-sequencing (GBS) to serve as an effective molecular method to discriminate the subspecies of the D. carota complex.

RESULTS

We used GBS to obtain SNPs covering all nine Daucus carota chromosomes from 162 accessions of Daucus and two related genera. To study Daucus phylogeny, we scored a total of 10,814 or 38,920 SNPs with a maximum of 10 or 30 % missing data, respectively. To investigate the subspecies of D. carota, we employed two data sets including 150 accessions: (i) rate of missing data 10 % with a total of 18,565 SNPs, and (ii) rate of missing data 30 %, totaling 43,713 SNPs. Consistent with prior results, the topology of both data sets separated species with 2n = 18 chromosomes from all other species. Our results place all cultivated carrots (D. carota subsp. sativus) in a single clade. The wild members of D. carota from central Asia were on a clade with eastern members of subsp. sativus. The other subspecies of D. carota were in four clades associated with geographic groups: (1) the Balkan Peninsula and the Middle East, (2) North America and Europe, (3) North Africa exclusive of Morocco, and (4) the Iberian Peninsula and Morocco. Daucus carota subsp. maximus was discriminated, but neither it nor subsp. gummifer (defined in a broad sense) is monophyletic.

CONCLUSIONS

Our study suggests that (1) the morphotypes identified as D. carota subspecies gummifer (as currently broadly circumscribed), all confined to areas near the Atlantic Ocean and the western Mediterranean Sea, have separate origins from sympatric members of other subspecies of D. carota, (2) D. carota subsp. maximus, on two clades with some accessions of subsp. carota, can be distinguished from each other but only with poor morphological support, (3) D. carota subsp. capillifolius, well distinguished morphologically, is an apospecies relative to North African populations of D. carota subsp. carota, (4) the eastern cultivated carrots have origins closer to wild carrots from central Asia than to western cultivated carrots, and (5) large SNP data sets are suitable for species-level phylogenetic studies in Daucus.
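
The two data sets described in the Results differ only in how much missing data a SNP may have (at most 10 % or 30 %). A minimal sketch of such a filter follows. It is not the authors' workflow: the data layout (one list of genotype calls per SNP, with `None` marking a missing call), the function names, and the toy genotypes are assumptions made for illustration.

```python
from typing import Dict, List, Optional

Call = Optional[str]  # a genotype call, or None when the call is missing


def missing_rate(calls: List[Call]) -> float:
    """Fraction of accessions with no genotype call at this SNP."""
    if not calls:
        raise ValueError("SNP has no accessions")
    return sum(call is None for call in calls) / len(calls)


def filter_snps(genotypes: Dict[str, List[Call]], max_missing: float) -> List[str]:
    """Return the SNP identifiers whose missing rate is at most max_missing."""
    return [snp for snp, calls in genotypes.items()
            if missing_rate(calls) <= max_missing]


# Toy matrix: four SNPs scored in ten accessions (values are invented).
genotypes = {
    "snp1": ["A"] * 10,               # 0 % missing
    "snp2": ["C"] * 9 + [None],       # 10 % missing
    "snp3": ["G"] * 7 + [None] * 3,   # 30 % missing
    "snp4": ["T"] * 5 + [None] * 5,   # 50 % missing
}
strict = filter_snps(genotypes, max_missing=0.10)
relaxed = filter_snps(genotypes, max_missing=0.30)
```

As in the abstract, the relaxed threshold retains more SNPs than the strict one, at the cost of more missing data per SNP.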

Edwards SV. Phylogenomic subsampling: a brief review. ZOOL SCR 2016. [DOI: 10.1111/zsc.12210] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]

Gatesy J, Meredith RW, Janecka JE, Simmons MP, Murphy WJ, Springer MS. Resolution of a concatenation/coalescence kerfuffle: partitioned coalescence support and a robust family‐level tree for Mammalia. Cladistics 2016;33:295-332. [DOI: 10.1111/cla.12170] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/30/2016] [Indexed: 12/14/2022] Open

Wu HY, Wang YH, Xie Q, Ke YL, Bu WJ. Molecular classification based on apomorphic amino acids (Arthropoda, Hexapoda): Integrative taxonomy in the era of phylogenomics. Sci Rep 2016;6:28308. [PMID: 27312960 PMCID: PMC4911608 DOI: 10.1038/srep28308] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2016] [Accepted: 05/31/2016] [Indexed: 11/10/2022] Open