151
Corcoran P, Anderson JL, Jacobson DJ, Sun Y, Ni P, Lascoux M, Johannesson H. Introgression maintains the genetic integrity of the mating-type determining chromosome of the fungus Neurospora tetrasperma. Genome Res 2016; 26:486-98. [PMID: 26893460 PMCID: PMC4817772 DOI: 10.1101/gr.197244.115] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 07/21/2015] [Accepted: 02/16/2016] [Indexed: 01/01/2023]
Abstract
Genome evolution is driven by a complex interplay of factors, including selection, recombination, and introgression. The regions determining sexual identity are particularly dynamic parts of eukaryotic genomes that are prone to molecular degeneration associated with suppressed recombination. In the fungus Neurospora tetrasperma, it has been proposed that this molecular degeneration is counteracted by the introgression of nondegenerated DNA from closely related species. In this study, we used comparative and population genomic analyses of 92 genomes from eight phylogenetically and reproductively isolated lineages of N. tetrasperma, and its three closest relatives, to investigate the factors shaping the evolutionary history of the genomes. We found that suppressed recombination extends across at least 6 Mbp (~63%) of the mating-type (mat) chromosome in N. tetrasperma and is associated with decreased genetic diversity, which is likely the result primarily of selection at linked sites. Furthermore, analyses of molecular evolution revealed an increased mutational load in this region, relative to recombining regions. However, comparative genomic and phylogenetic analyses indicate that the mat chromosomes are temporarily regenerated via introgression from sister species; six of eight lineages show introgression into one of their mat chromosomes, with multiple Neurospora species acting as donors. The introgressed tracts have been fixed within lineages, suggesting that they confer an adaptive advantage in natural populations, and our analyses support the presence of selective sweeps in at least one lineage. Thus, these data strongly support the previously hypothesized role of introgression as a mechanism for the maintenance of mating-type determining chromosomal regions.
Affiliation(s)
- Pádraic Corcoran
  - Department of Organismal Biology, Uppsala University, 752 36 Uppsala, Sweden; Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, United Kingdom
- Jennifer L Anderson
  - Department of Organismal Biology, Uppsala University, 752 36 Uppsala, Sweden
- David J Jacobson
  - Department of Organismal Biology, Uppsala University, 752 36 Uppsala, Sweden
- Yu Sun
  - Department of Cell and Molecular Biology, Uppsala University, 752 36 Uppsala, Sweden
- Martin Lascoux
  - Department of Ecology and Genetics, Science for Life Laboratory, Uppsala University, 752 36 Uppsala, Sweden
- Hanna Johannesson
  - Department of Organismal Biology, Uppsala University, 752 36 Uppsala, Sweden
152
Schierwater B, Holland PWH, Miller DJ, Stadler PF, Wiegmann BM, Wörheide G, Wray GA, DeSalle R. Never Ending Analysis of a Century Old Evolutionary Debate: “Unringing” the Urmetazoon Bell. Front Ecol Evol 2016. [DOI: 10.3389/fevo.2016.00005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022] Open
153
Massatti R, Reznicek AA, Knowles LL. Utilizing RADseq data for phylogenetic analysis of challenging taxonomic groups: A case study in Carex sect. Racemosae. Am J Bot 2016; 103:337-347. [PMID: 26851268 DOI: 10.3732/ajb.1500315] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/05/2015] [Accepted: 12/29/2015] [Indexed: 06/05/2023]
Abstract
PREMISE OF THE STUDY: Relationships among closely related and recently diverged taxa can be especially difficult to resolve. Here we use both Sanger sequencing and next-generation RADseq data sets to estimate phylogenetic relationships among species of Carex section Racemosae (Cyperaceae), a clade largely restricted to high latitudes and elevations. Interest in relationships among these taxa derives from questions about the species' biogeographic histories and possible links between diversification and Pleistocene glaciations.
METHODS: A combination of approaches and molecular markers was used to estimate relationships among Carex species within sect. Racemosae and taxa from closely related sections. Nuclear and chloroplast loci generated by Sanger sequencing were analyzed with *BEAST, and SNP data from RADseq loci were analyzed as a concatenated data set using maximum likelihood and as independent loci using SVDquartets.
KEY RESULTS: Sanger sequencing data sets resolved relationships among taxa at intermediate phylogenetic depths (albeit with low levels of support). Only the RADseq data resolved relationships with strong support at all phylogenetic depths. Moreover, different methods and data partitions of the RADseq data resulted in nearly identical topologies. Carex sect. Racemosae is a strongly supported clade, although a handful of species were found to group with closely related sections. Herbarium specimens up to 35 yr old successfully produced informative RADseq data.
CONCLUSIONS: Despite the short read lengths of RADseq data, they nevertheless resolved relationships that Sanger sequencing data did not. Resolution of the phylogenetic relationships among recently and rapidly diversifying taxa within sect. Racemosae clades suggests a role for the Pleistocene glaciations in clade diversification.
Affiliation(s)
- Rob Massatti
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 48109-1079 USA
- Anton A Reznicek
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 48109-1079 USA
- L Lacey Knowles
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 48109-1079 USA
154
Ogilvie HA, Heled J, Xie D, Drummond AJ. Computational Performance and Statistical Accuracy of *BEAST and Comparisons with Other Methods. Syst Biol 2016; 65:381-96. [PMID: 26821913 PMCID: PMC4851174 DOI: 10.1093/sysbio/syv118] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 06/09/2015] [Accepted: 12/07/2015] [Indexed: 01/02/2023] Open
Abstract
Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.
Affiliation(s)
- Huw A Ogilvie
  - Evolution, Ecology and Genetics, Research School of Biology, The Australian National University, Canberra, Australia
- Joseph Heled
  - Department of Computer Science, University of Auckland, Auckland, New Zealand; Allan Wilson Centre for Molecular Ecology and Evolution, University of Auckland, Auckland, New Zealand
- Dong Xie
  - Department of Computer Science, University of Auckland, Auckland, New Zealand; Allan Wilson Centre for Molecular Ecology and Evolution, University of Auckland, Auckland, New Zealand
- Alexei J Drummond
  - Department of Computer Science, University of Auckland, Auckland, New Zealand; Allan Wilson Centre for Molecular Ecology and Evolution, University of Auckland, Auckland, New Zealand
155
Rivers DM, Darwell CT, Althoff DM. Phylogenetic analysis of RAD-seq data: examining the influence of gene genealogy conflict on analysis of concatenated data. Cladistics 2016; 32:672-681. [DOI: 10.1111/cla.12149] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Accepted: 11/19/2015] [Indexed: 01/15/2023] Open
Affiliation(s)
- David M. Rivers
  - Department of Biology; Syracuse University; 107 College Place Syracuse NY 13244 USA
- Clive T. Darwell
  - Department of Biology; Syracuse University; 107 College Place Syracuse NY 13244 USA
- David M. Althoff
  - Department of Biology; Syracuse University; 107 College Place Syracuse NY 13244 USA
156
Simmons MP, Sloan DB, Gatesy J. The effects of subsampling gene trees on coalescent methods applied to ancient divergences. Mol Phylogenet Evol 2016; 97:76-89. [PMID: 26768112 DOI: 10.1016/j.ympev.2015.12.013] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 10/09/2015] [Revised: 12/03/2015] [Accepted: 12/20/2015] [Indexed: 10/22/2022]
Abstract
Gene-tree-estimation error is a major concern for coalescent methods of phylogenetic inference. We sampled eight empirical studies of ancient lineages with diverse numbers of taxa and genes for which the original authors applied one or more coalescent methods. We found that the average pairwise congruence among gene trees varied greatly both between studies and also often within a study. We recommend that presenting plots of pairwise congruence among gene trees in a dataset be treated as a standard practice for empirical coalescent studies so that readers can readily assess the extent and distribution of incongruence among gene trees. ASTRAL-based coalescent analyses generally outperformed MP-EST and STAR with respect to both internal consistency (congruence between analyses of subsamples of genes with the complete dataset of all genes) and congruence with the concatenation-based topology. We evaluated the approach of subsampling gene trees that are, on average, more congruent with other gene trees as a method to reduce artifacts caused by gene-tree-estimation errors on coalescent analyses. We suggest that this method is well suited to testing whether gene-tree-estimation error is a primary cause of incongruence between concatenation- and coalescent-based results, to reconciling conflicting phylogenetic results based on different coalescent methods, and to identifying genes affected by artifacts that may then be targeted for reciprocal illumination. We provide scripts that automate the process of calculating pairwise gene-tree incongruence and subsampling trees while accounting for differential taxon sampling among genes. Finally, we assert that multiple tree-search replicates should be implemented as a standard practice for empirical coalescent studies that apply MP-EST.
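The pairwise-congruence summary that this abstract recommends can be illustrated with a minimal sketch. It is not the authors' script. It assumes rooted gene trees written as nested Python tuples over the same set of taxa, scores two trees by the fraction of non-trivial clades they share, and ignores the differential taxon sampling that the published scripts account for. The function names `clades`, `congruence`, and `mean_pairwise_congruence` are hypothetical.

```python
from itertools import combinations


def clades(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):  # a leaf is a plain taxon name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_taxa = leaves(tree)
    found.discard(all_taxa)  # the root clade is uninformative
    return found


def congruence(tree_a, tree_b):
    """Share of clades common to both trees (1.0 = identical, 0.0 = no shared clade)."""
    a, b = clades(tree_a), clades(tree_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_congruence(trees):
    """Average congruence of each tree with every other tree in the list."""
    if len(trees) < 2:
        raise ValueError("need at least two gene trees")
    totals = [0.0] * len(trees)
    for i, j in combinations(range(len(trees)), 2):
        score = congruence(trees[i], trees[j])
        totals[i] += score
        totals[j] += score
    return [total / (len(trees) - 1) for total in totals]


# Toy input (assumed): three gene trees on taxa A-D, the second one discordant.
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
    ((("A", "B"), "C"), "D"),
]
scores = mean_pairwise_congruence(gene_trees)
print(scores)
```

In this toy input the second tree has the lowest mean congruence, so it is the kind of gene tree a congruence-based subsampling step would set aside first.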
Affiliation(s)
- Mark P Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- Daniel B Sloan
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- John Gatesy
  - Department of Biology, University of California, Riverside, CA 92521, USA
157
Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics. Mol Phylogenet Evol 2016; 94:447-62. [DOI: 10.1016/j.ympev.2015.10.027] [Citation(s) in RCA: 265] [Impact Index Per Article: 33.1] [Indexed: 01/19/2023]
158
Springer MS, Gatesy J. The gene tree delusion. Mol Phylogenet Evol 2016; 94:1-33. [DOI: 10.1016/j.ympev.2015.07.018] [Citation(s) in RCA: 145] [Impact Index Per Article: 18.1] [Received: 03/19/2015] [Revised: 06/04/2015] [Accepted: 07/22/2015] [Indexed: 10/23/2022]
159
Dufort MJ. An augmented supermatrix phylogeny of the avian family Picidae reveals uncertainty deep in the family tree. Mol Phylogenet Evol 2016; 94:313-26. [DOI: 10.1016/j.ympev.2015.08.025] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 02/05/2015] [Revised: 08/22/2015] [Accepted: 08/28/2015] [Indexed: 10/23/2022]
160
Schmickl R, Liston A, Zeisek V, Oberlander K, Weitemier K, Straub SCK, Cronn RC, Dreyer LL, Suda J. Phylogenetic marker development for target enrichment from transcriptome and genome skim data: the pipeline and its application in southern African Oxalis (Oxalidaceae). Mol Ecol Resour 2015; 16:1124-35. [DOI: 10.1111/1755-0998.12487] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 07/24/2015] [Revised: 10/06/2015] [Accepted: 11/05/2015] [Indexed: 01/08/2023]
Affiliation(s)
- Roswitha Schmickl
  - Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
- Aaron Liston
  - Department of Botany and Plant Pathology; Oregon State University; 2082 Cordley Hall Corvallis OR 97331 USA
- Vojtěch Zeisek
  - Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
  - Department of Botany; Faculty of Science; Charles University in Prague; Benátská 2 128 01 Prague Czech Republic
- Kenneth Oberlander
  - Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
  - Department of Conservation Ecology and Entomology; Stellenbosch University; Private Bag X1 Matieland 7602 South Africa
- Kevin Weitemier
  - Department of Botany and Plant Pathology; Oregon State University; 2082 Cordley Hall Corvallis OR 97331 USA
- Shannon C. K. Straub
  - Department of Biology; Hobart and William Smith Colleges; 213 Eaton Hall Geneva NY 14456 USA
- Richard C. Cronn
  - USDA Forest Service; Pacific Northwest Research Station; 3200 SW Jefferson Way Corvallis OR 97331 USA
- Léanne L. Dreyer
  - Department of Botany and Zoology; Stellenbosch University; Private Bag X1 Matieland 7602 South Africa
- Jan Suda
  - Institute of Botany; The Czech Academy of Sciences; Zámek 1 252 43 Průhonice Czech Republic
  - Department of Botany; Faculty of Science; Charles University in Prague; Benátská 2 128 01 Prague Czech Republic
161
Richart CH, Hayashi CY, Hedin M. Phylogenomic analyses resolve an ancient trichotomy at the base of Ischyropsalidoidea (Arachnida, Opiliones) despite high levels of gene tree conflict and unequal minority resolution frequencies. Mol Phylogenet Evol 2015; 95:171-82. [PMID: 26691642 DOI: 10.1016/j.ympev.2015.11.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 03/13/2015] [Revised: 09/16/2015] [Accepted: 11/13/2015] [Indexed: 11/19/2022]
Abstract
Phylogenetic resolution of ancient rapid radiations has remained problematic despite major advances in statistical approaches and DNA sequencing technologies. Here we report on a combined phylogenetic approach utilizing transcriptome data in conjunction with Sanger sequence data to investigate a tandem of ancient divergences in the harvestmen superfamily Ischyropsalidoidea (Arachnida, Opiliones, Dyspnoi). We rely on Sanger sequences to resolve nodes within and between closely related genera, and use RNA-seq data from a subset of taxa to resolve a short and ancient internal branch. We use several analytical approaches to explore this succession of ancient diversification events, including concatenated and coalescent-based analyses and maximum likelihood gene trees for each locus. We evaluate the robustness of phylogenetic inferences using a randomized locus sub-sampling approach, and find congruence across these methods despite considerable incongruence across gene trees. Incongruent gene trees are not recovered in frequencies expected from a simple multispecies coalescent model, and we reject incomplete lineage sorting as the sole contributor to gene tree conflict. Using these approaches we attain robust support for higher-level phylogenetic relationships within Ischyropsalidoidea.
Affiliation(s)
- Casey H Richart
  - Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182, USA; Department of Biology, University of California, Riverside, CA 92521, USA
- Cheryl Y Hayashi
  - Department of Biology, University of California, Riverside, CA 92521, USA
- Marshal Hedin
  - Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182, USA
162
Wang Y, Zhou X, Yang D, Rokas A. A Genome-Scale Investigation of Incongruence in Culicidae Mosquitoes. Genome Biol Evol 2015; 7:3463-71. [PMID: 26608059 PMCID: PMC4700963 DOI: 10.1093/gbe/evv235] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022] Open
Abstract
Comparison of individual gene trees in several recent phylogenomic studies from diverse lineages has revealed a surprising amount of topological conflict or incongruence, but we still know relatively little about its distribution across the tree of life. To further our understanding of incongruence, the factors that contribute to it and how it can be ameliorated, we examined its distribution in a clade of 20 Culicidae mosquito species through the reconstruction and analysis of the phylogenetic histories of 2,007 groups of orthologous genes. Levels of incongruence were generally low, the three exceptions being the internodes concerned with the branching of Anopheles christyi, with the branching of the subgenus Anopheles as well as the already reported incongruence within the Anopheles gambiae species complex. Two of these incongruence events (A. gambiae species complex and A. christyi) are likely due to biological factors, whereas the third (subgenus Anopheles) is likely due to analytical factors. Similar to previous studies, the use of genes or internodes with high bootstrap support or internode certainty values, both of which were positively correlated with gene alignment length, substantially reduced the observed incongruence. However, the clade support values of the internodes concerned with the branching of the subgenus Anopheles as well as within the A. gambiae species complex remained very low. Based on these results, we infer that the prevalence of incongruence in Culicidae mosquitoes is generally low, that it likely stems from both analytical and biological factors, and that it can be ameliorated through the selection of genes with strong phylogenetic signal. More generally, selection of genes with strong phylogenetic signal may be a general empirical solution for reducing incongruence and increasing the robustness of inference in phylogenomic studies.
Affiliation(s)
- Yuyu Wang
  - Department of Entomology, China Agricultural University, Beijing, China; Department of Biological Sciences, Vanderbilt University
- Xiaofan Zhou
  - Department of Biological Sciences, Vanderbilt University
- Ding Yang
  - Department of Entomology, China Agricultural University, Beijing, China
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University
163
Singhal S, Leffler EM, Sannareddy K, Turner I, Venn O, Hooper DM, Strand AI, Li Q, Raney B, Balakrishnan CN, Griffith SC, McVean G, Przeworski M. Stable recombination hotspots in birds. Science 2015; 350:928-32. [PMID: 26586757 PMCID: PMC4864528 DOI: 10.1126/science.aad0843] [Citation(s) in RCA: 198] [Impact Index Per Article: 22.0] [Indexed: 12/23/2022]
Abstract
The DNA-binding protein PRDM9 has a critical role in specifying meiotic recombination hotspots in mice and apes, but it appears to be absent from other vertebrate species, including birds. To study the evolution and determinants of recombination in species lacking the gene that encodes PRDM9, we inferred fine-scale genetic maps from population resequencing data for two bird species: the zebra finch, Taeniopygia guttata, and the long-tailed finch, Poephila acuticauda. We found that both species have recombination hotspots, which are enriched near functional genomic elements. Unlike in mice and apes, most hotspots are shared between the two species, and their conservation seems to extend over tens of millions of years. These observations suggest that in the absence of PRDM9, recombination targets functional features that both enable access to the genome and constrain its evolution.
Affiliation(s)
- Sonal Singhal
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA; Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Ellen M Leffler
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Keerthi Sannareddy
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Isaac Turner
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Oliver Venn
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Daniel M Hooper
  - Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
- Alva I Strand
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Qiye Li
  - China National Genebank, BGI-Shenzhen, Shenzhen 518083, China
- Brian Raney
  - Center for Biomolecular Science and Engineering, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
- Simon C Griffith
  - Department of Biological Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Gil McVean
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Molly Przeworski
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA; Department of Systems Biology, Columbia University, New York, NY 10032, USA
164
Xi Z, Liu L, Davis CC. The Impact of Missing Data on Species Tree Estimation. Mol Biol Evol 2015; 33:838-60. [DOI: 10.1093/molbev/msv266] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Indexed: 01/03/2023] Open
165
Fernandes NM, Paiva TDS, da Silva-Neto ID, Schlegel M, Schrago CG. Expanded phylogenetic analyses of the class Heterotrichea (Ciliophora, Postciliodesmatophora) using five molecular markers and morphological data. Mol Phylogenet Evol 2015; 95:229-46. [PMID: 26549427 DOI: 10.1016/j.ympev.2015.10.030] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 07/31/2015] [Revised: 10/30/2015] [Accepted: 10/31/2015] [Indexed: 11/25/2022]
Abstract
Most studies of the molecular evolution of Heterotrichea have been based solely on the 18S-rDNA gene, and their results were inconsistent with morphological classification. Because of the limitations of single locus phylogenies and the recurring problem of lack of resolution of deeper nodes found in previous studies, we present hypotheses of the evolution of internal groups of the class Heterotrichea based on multi-loci analyses (18S-rDNA, 28S-rDNA, ITS1-5.8S-ITS2 region, COI and alpha-tubulin) and morphological data. Phylogenetic trees from protein coding gene data are presented for Heterotrichea for the first time. Phylogenetic analyses included Bayesian inference, maximum likelihood, and maximum parsimony methods, and optimal trees were statistically compared to alternative topologies from the literature. Additionally, the Bayesian concordance approach (BCA algorithm) was used to assess the concordance factor between topologies obtained from isolated analyses. Because different loci may evolve at different rates, resulting in different gene topologies, we also estimated a species tree for Heterotrichea using the STAR coalescence-based method. The results show that: (1) single gene trees are inconsistent regarding the position of some heterotrichean families; (2) the concatenation of all data in a total-evidence tree improved the resolution of deep nodes among the heterotrichean families and genera; (3) the coalescent-based species tree is consistent with phylogenies based on the 18S-rDNA gene and shows Spirostomidae as the stem group of Heterotrichea; (4) however, the total-evidence tree suggests that the large Heterotrichea cluster is divided into nine lineages in which Peritromidae diverges at the base of the Heterotrichea tree.
Affiliation(s)
- Noemi M Fernandes
  - Laboratório de Biologia Evolutiva Teórica e Aplicada, Departamento de Genética, Universidade Federal do Rio de Janeiro, Brazil
- Thiago da Silva Paiva
  - Laboratório de Protistologia, Departamento de Zoologia, Universidade Federal do Rio de Janeiro, Brazil; Laboratório de Biologia Molecular "Francisco Mauro Salzano", Instituto de Ciências Biológicas, Universidade Federal do Pará, Brazil
- Inácio D da Silva-Neto
  - Laboratório de Protistologia, Departamento de Zoologia, Universidade Federal do Rio de Janeiro, Brazil
- Martin Schlegel
  - Molecular Evolution and Animal Systematics, Institute of Biology, University of Leipzig, Germany
- Carlos G Schrago
  - Laboratório de Biologia Evolutiva Teórica e Aplicada, Departamento de Genética, Universidade Federal do Rio de Janeiro, Brazil
166
De Maio N, Schrempf D, Kosiol C. PoMo: An Allele Frequency-Based Approach for Species Tree Estimation. Syst Biol 2015; 64:1018-31. [PMID: 26209413 PMCID: PMC4604832 DOI: 10.1093/sysbio/syv048] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 03/21/2014] [Accepted: 06/11/2015] [Indexed: 11/24/2022] Open
Abstract
Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches.
Affiliation(s)
- Nicola De Maio
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien 1210, Austria; Vienna Graduate School of Population Genetics, Wien, Austria; Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 7BN, UK
- Dominik Schrempf
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien 1210, Austria; Vienna Graduate School of Population Genetics, Wien, Austria
- Carolin Kosiol
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien 1210, Austria
167
Konowalik K, Wagner F, Tomasello S, Vogt R, Oberprieler C. Detecting reticulate relationships among diploid Leucanthemum Mill. (Compositae, Anthemideae) taxa using multilocus species tree reconstruction methods and AFLP fingerprinting. Mol Phylogenet Evol 2015; 92:308-28. [DOI: 10.1016/j.ympev.2015.06.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/22/2015] [Revised: 05/29/2015] [Accepted: 06/02/2015] [Indexed: 12/23/2022]
168
Nater A, Burri R, Kawakami T, Smeds L, Ellegren H. Resolving Evolutionary Relationships in Closely Related Species with Whole-Genome Sequencing Data. Syst Biol 2015; 64:1000-17. [PMID: 26187295 PMCID: PMC4604831 DOI: 10.1093/sysbio/syv045] [Citation(s) in RCA: 74] [Impact Index Per Article: 8.2] [Received: 01/14/2015] [Accepted: 06/24/2015] [Indexed: 01/25/2023] Open
Abstract
Using genetic data to resolve the evolutionary relationships of species is of major interest in evolutionary and systematic biology. However, reconstructing the sequence of speciation events, the so-called species tree, in closely related and potentially hybridizing species is very challenging. Processes such as incomplete lineage sorting and interspecific gene flow result in local gene genealogies that differ in their topology from the species tree, and analyses of few loci with a single sequence per species are likely to produce conflicting or even misleading results. To study these phenomena on a full phylogenomic scale, we use whole-genome sequence data from 200 individuals of four black-and-white flycatcher species with so far unresolved phylogenetic relationships to infer gene tree topologies and visualize genome-wide patterns of gene tree incongruence. Using phylogenetic analysis in nonoverlapping 10-kb windows, we show that gene tree topologies are extremely diverse and change on a very small physical scale. Moreover, we find strong evidence for gene flow among flycatcher species, with distinct patterns of reduced introgression on the Z chromosome. To resolve species relationships on the background of widespread gene tree incongruence, we used four complementary coalescent-based methods for species tree reconstruction, including complex modeling approaches that incorporate post-divergence gene flow among species. This allowed us to infer the most likely species tree with high confidence. Based on this finding, we show that regions of reduced effective population size, which have been suggested as particularly useful for species tree inference, can produce positively misleading species tree topologies. 
Our findings disclose the pitfalls of using loci potentially under selection as phylogenetic markers and highlight the potential of modeling approaches to disentangle species relationships in systems with large effective population sizes and post-divergence gene flow.
Affiliation(s)
- Alexander Nater
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Reto Burri
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Takeshi Kawakami
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Linnéa Smeds
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
169
|
Ruane S, Raxworthy CJ, Lemmon AR, Lemmon EM, Burbrink FT. Comparing species tree estimation with large anchored phylogenomic and small Sanger-sequenced molecular datasets: an empirical study on Malagasy pseudoxyrhophiine snakes. BMC Evol Biol 2015; 15:221. [PMID: 26459325 PMCID: PMC4603904 DOI: 10.1186/s12862-015-0503-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 07/27/2015] [Accepted: 10/01/2015] [Indexed: 11/15/2022]
Abstract
BACKGROUND Using molecular data generated by high throughput next generation sequencing (NGS) platforms to infer phylogeny is becoming common as costs go down and the ability to capture loci from across the genome goes up. While there is a general consensus that greater numbers of independent loci should result in more robust phylogenetic estimates, few studies have compared phylogenies resulting from smaller datasets for commonly used genetic markers with the large datasets captured using NGS. Here, we determine how a 5-locus Sanger dataset compares with a 377-locus anchored genomics dataset for understanding the evolutionary history of the pseudoxyrhophiine snake radiation centered in Madagascar. The Pseudoxyrhophiinae comprise ~86 % of Madagascar's serpent diversity, yet they are poorly known with respect to ecology, behavior, and systematics. Using the 377-locus NGS dataset and the summary statistics species-tree methods STAR and MP-EST, we estimated a well-supported species tree that provides new insights concerning intergeneric relationships for the pseudoxyrhophiines. We also compared how these and other methods performed with respect to estimating tree topology using datasets with varying numbers of loci. METHODS Using Sanger sequencing and an anchored phylogenomics approach, we sequenced datasets comprised of 5 and 377 loci, respectively, for 23 pseudoxyrhophiine taxa. For each dataset, we estimated phylogenies using both gene-tree (concatenation) and species-tree (STAR, MP-EST) approaches. We determined the similarity of resulting tree topologies from the different datasets using Robinson-Foulds distances. 
In addition, we examined how subsets of these data performed compared to the complete Sanger and anchored datasets for phylogenetic accuracy using the same tree inference methodologies, as well as the program *BEAST to determine if a full coalescent model for species tree estimation could generate robust results with fewer loci compared to the summary statistics species tree approaches. We also examined the individual gene trees in comparison to the 377-locus species tree using the program MetaTree. RESULTS Using the full anchored dataset under a variety of methods gave us the same, well-supported phylogeny for pseudoxyrhophiines. The African pseudoxyrhophiine Duberria is the sister taxon to the Malagasy pseudoxyrhophiine genera, providing evidence for a monophyletic radiation in Madagascar. In addition, within Madagascar, the two major clades inferred correspond largely to the aglyphous and opisthoglyphous genera, suggesting that feeding specializations associated with tooth venom delivery may have played a major role in the early diversification of this radiation. The comparison of tree topologies from the concatenated and species-tree methods using different datasets indicated the 5-locus dataset cannot be used to infer a correct phylogeny for the pseudoxyrhophiines under any method tested here and that summary statistics methods require 50 or more loci to consistently recover the species-tree inferred using the complete anchored dataset. However, as few as 15 loci may infer the correct topology when using the full coalescent species tree method *BEAST. MetaTree analyses of each gene tree from the Sanger and anchored datasets found that none of the individual gene trees matched the 377-locus species tree, and that no gene trees were identical with respect to topology.
CONCLUSIONS Our results suggest that ≥50 loci may be necessary to confidently infer phylogenies when using summary species-tree methods, but that the coalescent-based method *BEAST consistently recovers the same topology using only 15 loci. These results reinforce that datasets with small numbers of markers may result in misleading topologies, and further, that the method of inference used to generate a phylogeny also has a major influence on the number of loci necessary to infer robust species trees.
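The Robinson-Foulds comparison mentioned in the methods can be illustrated with a minimal Python sketch. It assumes trees are supplied as nested tuples of taxon labels and are treated as unrooted; the function names and the toy five-taxon trees are hypothetical, and the abstract does not say which software the authors used for this step.

```python
def leaf_set(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits of a tree, treating it as unrooted."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed taxon used to pick one canonical side
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        # Store the side of the split that does not contain the anchor taxon.
        side = all_leaves - below if anchor in below else below
        # Skip trivial splits (a single leaf, or nothing, on one side).
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count the splits found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same taxon set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Hypothetical example trees that differ in the placement of B and C.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_a, tree_b))  # 2
```

A distance of 0 means the two topologies are identical; larger values mean more splits are unique to one tree, which is how topologies from different datasets or methods can be ranked by similarity.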
Affiliation(s)
- Sara Ruane
  - Department of Herpetology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
- Christopher J Raxworthy
  - Department of Herpetology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
- Alan R Lemmon
  - Department of Biology, Florida State University, 319 Stadium Drive, P.O. Box 3064295, Tallahassee, FL, 32306-4295, USA.
- Emily Moriarty Lemmon
  - Department of Biology, Florida State University, 319 Stadium Drive, P.O. Box 3064295, Tallahassee, FL, 32306-4295, USA.
- Frank T Burbrink
  - Department of Herpetology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
  - Biology Department, College of Staten Island/CUNY, 2800 Victory Boulevard, Staten Island, NY, 10314, USA.
|
170
|
A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature 2015; 526:569-73. [DOI: 10.1038/nature15697] [Citation(s) in RCA: 1067] [Impact Index Per Article: 118.6] [Received: 05/03/2015] [Accepted: 09/09/2015] [Indexed: 12/20/2022]
|
171
|
Davidson R, Vachaspati P, Mirarab S, Warnow T. Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC Genomics 2015; 16 Suppl 10:S1. [PMID: 26450506 PMCID: PMC4603753 DOI: 10.1186/1471-2164-16-s10-s1] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 11/17/2022]
Abstract
BACKGROUND Species tree estimation is challenged by gene tree heterogeneity resulting from biological processes such as duplication and loss, hybridization, incomplete lineage sorting (ILS), and horizontal gene transfer (HGT). Mathematical theory about reconstructing species trees in the presence of HGT alone or ILS alone suggests that quartet-based species tree methods (known to be statistically consistent under ILS, or under bounded amounts of HGT) might be effective techniques for estimating species trees when both HGT and ILS are present. RESULTS We evaluated several publicly available coalescent-based methods and concatenation under maximum likelihood on simulated datasets with moderate ILS and varying levels of HGT. Our study shows that two quartet-based species tree estimation methods (ASTRAL-2 and weighted Quartets MaxCut) are both highly accurate, even on datasets with high rates of HGT. In contrast, although NJst and concatenation using maximum likelihood are highly accurate under low HGT, they are less robust to high HGT rates. CONCLUSION Our study shows that quartet-based species-tree estimation methods can be highly accurate under the presence of both HGT and ILS. The study suggests the possibility that some quartet-based methods might be statistically consistent under phylogenomic models of gene tree heterogeneity with both HGT and ILS.
Affiliation(s)
- Ruth Davidson
  - Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green Street, 61801 Urbana, IL, USA
- Pranjal Vachaspati
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, 61801 Urbana, IL, USA
- Siavash Mirarab
  - Department of Computer Science, University of Texas at Austin, 2317 Speedway, Stop D9500, 78712 Austin, TX, USA
  - Department of Electrical and Computer Engineering, University of California at San Diego, 9500 Gilman Drive, 92093, La Jolla, CA, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, 61801 Urbana, IL, USA
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, 1270 Digital Computer Laboratory, MC-278, 61801 Urbana, IL, USA
|
172
|
Abstract
BACKGROUND Incomplete lineage sorting (ILS), modelled by the multi-species coalescent (MSC), is known to create discordance between gene trees and species trees, and lead to inaccurate species tree estimations unless appropriate methods are used to estimate the species tree. While many statistically consistent methods have been developed to estimate the species tree in the presence of ILS, only ASTRAL-2 and NJst have been shown to have good accuracy on large datasets. Yet, NJst is generally slower and less accurate than ASTRAL-2, and cannot run on some datasets. RESULTS We have redesigned NJst to enable it to run on all datasets, and we have expanded its design space so that it can be used with different distance-based tree estimation methods. The resultant method, ASTRID, is statistically consistent under the MSC model, and has accuracy that is competitive with ASTRAL-2. Furthermore, ASTRID is much faster than ASTRAL-2, completing in minutes on some datasets for which ASTRAL-2 used hours. CONCLUSIONS ASTRID is a new coalescent-based method for species tree estimation that is competitive with the best current method in terms of accuracy, while being much faster. ASTRID is available in open source form on github.
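The distance-based idea behind NJst-style methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the ASTRID code: each gene tree is assumed to be an unrooted tree given as an adjacency dictionary, the distance between two taxa is taken to be the number of internal nodes on the path joining them, and distances are averaged over the gene trees that contain both taxa. All function names and the two toy quartet trees are hypothetical.

```python
from collections import deque


def internode_distances(adjacency):
    """Map each leaf pair to the number of internal nodes on the path between them."""
    leaves = sorted(node for node, nbrs in adjacency.items() if len(nbrs) == 1)
    result = {}
    for start in leaves:
        # Breadth-first search gives the number of edges from `start` to every node.
        edges = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in edges:
                    edges[nxt] = edges[node] + 1
                    queue.append(nxt)
        for other in leaves:
            if start < other:
                # A path with k edges passes through k - 1 internal nodes.
                result[(start, other)] = edges[other] - 1
    return result


def average_internode_matrix(gene_trees):
    """Average the pairwise distances over the gene trees that contain each pair."""
    totals, counts = {}, {}
    for tree in gene_trees:
        for pair, dist in internode_distances(tree).items():
            totals[pair] = totals.get(pair, 0) + dist
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Hypothetical gene trees: AB|CD and AC|BD, with internal nodes "u" and "v".
quartet_1 = {"A": ["u"], "B": ["u"], "C": ["v"], "D": ["v"],
             "u": ["A", "B", "v"], "v": ["C", "D", "u"]}
quartet_2 = {"A": ["u"], "C": ["u"], "B": ["v"], "D": ["v"],
             "u": ["A", "C", "v"], "v": ["B", "D", "u"]}
averages = average_internode_matrix([quartet_1, quartet_2])
print(averages[("A", "B")])  # 1.5
```

The resulting matrix would then be handed to a distance-based tree builder, which is the step the abstract describes as open to different estimation methods; that step is not shown here.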
Affiliation(s)
- Pranjal Vachaspati
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N. Goodwin Avenue, Urbana, IL, 61801 USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N. Goodwin Avenue, Urbana, IL, 61801 USA
|
173
|
Simmons MP, Gatesy J. Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms. Mol Phylogenet Evol 2015; 91:98-122. [DOI: 10.1016/j.ympev.2015.05.011] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 02/12/2015] [Revised: 05/01/2015] [Accepted: 05/14/2015] [Indexed: 11/24/2022]
|
174
|
Lartillot N. Probabilistic models of eukaryotic evolution: time for integration. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140338. [PMID: 26323768 PMCID: PMC4571576 DOI: 10.1098/rstb.2014.0338] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Accepted: 06/03/2015] [Indexed: 11/12/2022]
Abstract
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments.
Affiliation(s)
- Nicolas Lartillot
  - Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard Lyon 1, F-69622 Villeurbanne Cedex, France
|
175
|
Streicher JW, Schulte JA, Wiens JJ. How Should Genes and Taxa be Sampled for Phylogenomic Analyses with Missing Data? An Empirical Study in Iguanian Lizards. Syst Biol 2015; 65:128-45. [PMID: 26330450 DOI: 10.1093/sysbio/syv058] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 11/14/2014] [Accepted: 08/04/2015] [Indexed: 11/12/2022]
Abstract
Targeted sequence capture is becoming a widespread tool for generating large phylogenomic data sets to address difficult phylogenetic problems. However, this methodology often generates data sets in which increasing the number of taxa and loci increases amounts of missing data. Thus, a fundamental (but still unresolved) question is whether sampling should be designed to maximize sampling of taxa or genes, or to minimize the inclusion of missing data cells. Here, we explore this question for an ancient, rapid radiation of lizards, the pleurodont iguanians. Pleurodonts include many well-known clades (e.g., anoles, basilisks, iguanas, and spiny lizards) but relationships among families have proven difficult to resolve strongly and consistently using traditional sequencing approaches. We generated up to 4921 ultraconserved elements with sampling strategies including 16, 29, and 44 taxa, from 1179 to approximately 2.4 million characters per matrix and approximately 30% to 60% total missing data. We then compared mean branch support for interfamilial relationships under these 15 different sampling strategies for both concatenated (maximum likelihood) and species tree (NJst) approaches (after showing that mean branch support appears to be related to accuracy). We found that both approaches had the highest support when including loci with up to 50% missing taxa (matrices with ~40-55% missing data overall). Thus, our results show that simply excluding all missing data may be highly problematic as the primary guiding principle for the inclusion or exclusion of taxa and genes. The optimal strategy was somewhat different for each approach, a pattern that has not been shown previously. For concatenated analyses, branch support was maximized when including many taxa (44) but fewer characters (1.1 million). For species-tree analyses, branch support was maximized with minimal taxon sampling (16) but many loci (4789 of 4921). 
We also show that the choice of these sampling strategies can be critically important for phylogenomic analyses, since some strategies lead to demonstrably incorrect inferences (using the same method) that have strong statistical support. Our preferred estimate provides strong support for most interfamilial relationships in this important but phylogenetically challenging group.
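The locus-inclusion rule discussed in this abstract (keeping loci with up to 50% missing taxa) can be written as a short Python filter. This is a sketch under assumptions: the input is a dictionary from locus name to the taxa recovered for that locus, and the function name, locus names and taxon labels are hypothetical rather than taken from the study.

```python
def filter_loci(presence, all_taxa, max_missing=0.5):
    """Keep loci whose share of missing taxa does not exceed max_missing."""
    all_taxa = set(all_taxa)
    kept = {}
    for locus, taxa_present in presence.items():
        recovered = len(set(taxa_present) & all_taxa)
        missing = 1 - recovered / len(all_taxa)
        if missing <= max_missing:
            kept[locus] = taxa_present
    return kept


# Hypothetical matrix: four taxa, three loci with different completeness.
taxa = ["anole", "basilisk", "iguana", "spiny_lizard"]
presence = {
    "uce1": ["anole", "basilisk", "iguana", "spiny_lizard"],  # 0% missing
    "uce2": ["anole", "iguana"],                              # 50% missing
    "uce3": ["basilisk"],                                     # 75% missing
}
kept = filter_loci(presence, taxa, max_missing=0.5)
print(sorted(kept))  # ['uce1', 'uce2']
```

Varying `max_missing` and the taxon list is one way to generate the kind of alternative sampling strategies that the study compares.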
Affiliation(s)
- Jeffrey W Streicher
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA; Department of Life Sciences, The Natural History Museum, London SW7 5BD, UK
- James A Schulte
  - Department of Biology, Clarkson University, Potsdam, NY 13699, USA
- John J Wiens
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
|
176
|
López-Torres S, Schillaci MA, Silcox MT. Life history of the most complete fossil primate skeleton: exploring growth models for Darwinius. Royal Society Open Science 2015; 2:150340. [PMID: 26473056 PMCID: PMC4593690 DOI: 10.1098/rsos.150340] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/13/2015] [Accepted: 08/11/2015] [Indexed: 06/05/2023]
Abstract
Darwinius is an adapoid primate from the Eocene of Germany, and its only known specimen represents the most complete fossil primate ever found. Its describers hypothesized a close relationship to Anthropoidea, and using a Saimiri model estimated its age at death. This study reconstructs the ancestral permanent dental eruption sequences for basal Euprimates, Haplorhini, Anthropoidea, and stem and crown Strepsirrhini. The results show that the ancestral sequences for the basal euprimate, haplorhine and stem strepsirrhine are identical, and similar to that of Darwinius. However, Darwinius differs from anthropoids by exhibiting early development of the lower third molars relative to the lower third and fourth premolars. The eruption of the lower second premolar marks the point of interruption of the sequence in Darwinius. The anthropoid Saimiri as a model is therefore problematic because it exhibits a delayed eruption of P2. Here, an alternative strepsirrhine model based on Eulemur and Varecia is presented. Our proposed model shows an older age at death than previously suggested (1.05-1.14 years), while the range for adult weight is entirely below the range proposed previously. This alternative model is more consistent with hypotheses supporting a stronger relationship between adapoids and strepsirrhines.
Affiliation(s)
- Sergi López-Torres
  - Department of Anthropology, University of Toronto Scarborough, 1265 Military Trail, Toronto, Ontario, Canada M1C 1A4
|
177
|
Bragg JG, Potter S, Bi K, Moritz C. Exon capture phylogenomics: efficacy across scales of divergence. Mol Ecol Resour 2015. [DOI: 10.1111/1755-0998.12449] [Citation(s) in RCA: 109] [Impact Index Per Article: 12.1] [Indexed: 11/28/2022]
Affiliation(s)
- Jason G. Bragg
  - Research School of Biology and Centre for Biodiversity Analysis; Australian National University; Acton ACT Australia
- Sally Potter
  - Research School of Biology and Centre for Biodiversity Analysis; Australian National University; Acton ACT Australia
- Ke Bi
  - Museum of Vertebrate Zoology; University of California, Berkeley; CA USA
  - Computational Genomics Resource Laboratory (CGRL); California Institute for Quantitative Biosciences (QB3); University of California, Berkeley; Berkeley CA 94720-3102 USA
- Craig Moritz
  - Research School of Biology and Centre for Biodiversity Analysis; Australian National University; Acton ACT Australia
|
178
|
Zhou X, Sun F, Xu S, Yang G, Li M. The position of tree shrews in the mammalian tree: Comparing multi-gene analyses with phylogenomic results leaves monophyly of Euarchonta doubtful. Integr Zool 2015; 10:186-98. [PMID: 25311886 DOI: 10.1111/1749-4877.12116] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/27/2022]
Abstract
The well-accepted Euarchonta grandorder is a pruned version of Archonta nested within the Euarchontoglires (or Supraprimates) clade. At present, it includes tree shrews (Scandentia), flying lemurs (Dermoptera) and primates (Primates). Here, a phylogenomic dataset containing 1912 exons from 22 representative mammals was compiled to investigate the phylogenetic relationships within this group. Phylogenetic analyses and hypothesis testing suggested that tree shrews can be classified as a sister group to Primates or to Glires or even as a basal clade within Euarchontoglires. Further analyses of both modified and original previously published datasets found that the phylogenetic position of tree shrews is unstable. We also found that two of three exonic indels reported as synapomorphies of Euarchonta in a previous study do not unambiguously support the monophyly of such a clade. Therefore, the monophyly of both Euarchonta and Sundatheria (Dermoptera + Scandentia) is suspect. Molecular dating and divergence rate analyses suggested that the ancestor of Euarchontoglires experienced a rapid divergence, which may explain why the position of tree shrews remains unresolved even with whole-genome data.
Affiliation(s)
- Xuming Zhou
  - Key laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
|
179
|
DaCosta JM, Sorenson MD. ddRAD-seq phylogenetics based on nucleotide, indel, and presence-absence polymorphisms: Analyses of two avian genera with contrasting histories. Mol Phylogenet Evol 2015; 94:122-35. [PMID: 26279345 DOI: 10.1016/j.ympev.2015.07.026] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 02/09/2015] [Revised: 07/22/2015] [Accepted: 07/29/2015] [Indexed: 11/16/2022]
Abstract
Genotype-by-sequencing (GBS) methods have revolutionized the field of molecular ecology, but their application in molecular phylogenetics remains somewhat limited. In addition, most phylogenetic studies based on large GBS data sets have relied on analyses of concatenated data rather than species tree methods that explicitly account for genealogical stochasticity among loci. We explored the utility of "double-digest" restriction site-associated DNA sequencing (ddRAD-seq) for phylogenetic analyses of the Lagonosticta firefinches (family Estrildidae) and the Vidua brood parasitic finches (family Viduidae). As expected, the number of homologous loci shared among samples was negatively correlated with genetic distance due to the accumulation of restriction site polymorphisms. Nonetheless, for each genus, we obtained data sets of ∼3000 loci shared in common among all samples, including a more distantly related outgroup taxon. For all samples combined, we obtained >1000 homologous loci despite ∼20my divergence between estrildid and parasitic finches. In addition to nucleotide polymorphisms, the ddRAD-seq data yielded large sets of indel and locus presence-absence polymorphisms, all of which had higher consistency indices than mtDNA sequence data in the context of concatenated parsimony analyses. Species tree methods, using individual gene trees or single nucleotide polymorphisms as input, generated results broadly consistent with analyses of concatenated data, particularly for Lagonosticta, which appears to have a well resolved, bifurcating history. Results for Vidua were also generally consistent across methods and data sets, although nodal support and results from different species tree methods were more variable. Lower gene tree congruence in Vidua is likely the result of its unique evolutionary history, which includes rapid speciation by host shift and occasional hybridization and introgression due to incomplete reproductive isolation. 
We conclude that ddRAD-seq is a cost-effective method for generating robust phylogenetic data sets, particularly for analyses of closely related species and genera.
|
180
|
Chen MY, Liang D, Zhang P. Selecting Question-Specific Genes to Reduce Incongruence in Phylogenomics: A Case Study of Jawed Vertebrate Backbone Phylogeny. Syst Biol 2015; 64:1104-20. [PMID: 26276158 DOI: 10.1093/sysbio/syv059] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 01/12/2015] [Accepted: 08/10/2015] [Indexed: 11/13/2022]
Abstract
Incongruence between different phylogenomic analyses is the main challenge faced by phylogeneticists in the genomic era. To reduce incongruence, phylogenomic studies normally adopt some data filtering approaches, such as reducing missing data or using slowly evolving genes, to improve the signal quality of data. Here, we assembled a phylogenomic data set of 58 jawed vertebrate taxa and 4682 genes to investigate the backbone phylogeny of jawed vertebrates under both concatenation and coalescent-based frameworks. To evaluate the efficiency of extracting phylogenetic signals among different data filtering methods, we chose six highly intractable internodes within the backbone phylogeny of jawed vertebrates as our test questions. We found that our phylogenomic data set exhibits substantial conflicting signal among genes for these questions. Our analyses showed that non-specific data sets that are generated without bias toward specific questions are not sufficient to produce consistent results when there are several difficult nodes within a phylogeny. Moreover, phylogenetic accuracy based on non-specific data is considerably influenced by the size of data and the choice of tree inference methods. To address such incongruences, we selected genes that resolve a given internode but not the entire phylogeny. Notably, not only can this strategy yield correct relationships for the question, but it also reduces inconsistency associated with data sizes and inference methods. Our study highlights the importance of gene selection in phylogenomic analyses, suggesting that simply using a large amount of data cannot guarantee correct results. Constructing question-specific data sets may be more powerful for resolving problematic nodes.
Affiliation(s)
- Meng-Yun Chen
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Dan Liang
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Peng Zhang
  - State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
|
181
|
Ai B, Kang M. How Many Genes are Needed to Resolve Phylogenetic Incongruence? Evol Bioinform Online 2015; 11:185-8. [PMID: 26309387 PMCID: PMC4533847 DOI: 10.4137/ebo.s26047] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/18/2015] [Revised: 06/16/2015] [Accepted: 06/22/2015] [Indexed: 12/03/2022]
Abstract
The question of how many genes are needed to resolve phylogenetic incongruence has been investigated at various taxonomic levels, yet few studies have investigated the minimum required numbers of selected genes based on single-gene tree performance at the genus level or lower. We conducted resampling analyses by compiling transcriptome-based single-copy nuclear gene sequences of 11 species of Primulina (Gesneriaceae) to investigate the minimum numbers of both random and selected genes needed to resolve the phylogeny. Only 8 of the 26 selected genes were sufficient for full resolution, while 175 genes were needed if all 830 random genes were used. Our results provide a baseline for future sampling strategies of gene numbers in molecular phylogenetic studies of speciose taxa. Gene selection strategies based on single-gene tree performance are strongly recommended in phylogenetic analyses.
Affiliation(s)
- Bin Ai
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Ming Kang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
|
182
|
Smith SA, Moore MJ, Brown JW, Yang Y. Analysis of phylogenomic datasets reveals conflict, concordance, and gene duplications with examples from animals and plants. BMC Evol Biol 2015; 15:150. [PMID: 26239519 PMCID: PMC4524127 DOI: 10.1186/s12862-015-0423-0] [Citation(s) in RCA: 242] [Impact Index Per Article: 26.9] [Received: 03/23/2015] [Accepted: 06/25/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of transcriptomic and genomic datasets for phylogenetic reconstruction has become increasingly common as researchers attempt to resolve recalcitrant nodes with increasing amounts of data. The large size and complexity of these datasets introduce significant phylogenetic noise and conflict into subsequent analyses. The sources of conflict may include hybridization, incomplete lineage sorting, or horizontal gene transfer, and may vary across the phylogeny. For phylogenetic analysis, this noise and conflict has been accommodated in one of several ways: by binning gene regions into subsets to isolate consistent phylogenetic signal; by using gene-tree methods for reconstruction, where conflict is presumed to be explained by incomplete lineage sorting (ILS); or through concatenation, where noise is presumed to be the dominant source of conflict. The results provided herein emphasize that analysis of individual homologous gene regions can greatly improve our understanding of the underlying conflict within these datasets. RESULTS Here we examined two published transcriptomic datasets, the angiosperm group Caryophyllales and the aculeate Hymenoptera, for the presence of conflict, concordance, and gene duplications in individual homologs across the phylogeny. We found significant conflict throughout the phylogeny in both datasets and in particular along the backbone. While some nodes in each phylogeny showed patterns of conflict similar to what might be expected with ILS alone, the backbone nodes also exhibited low levels of phylogenetic signal. In addition, certain nodes, especially in the Caryophyllales, had highly elevated levels of strongly supported conflict that cannot be explained by ILS alone. CONCLUSION This study demonstrates that phylogenetic signal is highly variable in phylogenomic data sampled across related species and poses challenges when conducting species tree analyses on large genomic and transcriptomic datasets. 
Further insight into the conflict and processes underlying these complex datasets is necessary to improve and develop adequate models for sequence analysis and downstream applications. To aid this effort, we developed the open source software phyparts (https://bitbucket.org/blackrim/phyparts), which calculates unique, conflicting, and concordant bipartitions, maps gene duplications, and outputs summary statistics such as internode certainty (ICA) scores and node-specific counts of gene duplications.
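The notion of concordant versus conflicting bipartitions can be illustrated with a small Python sketch. It is not the phyparts implementation: it assumes each gene tree has already been reduced to a list of splits (the taxa on one side of an internal branch), and it uses the standard compatibility test in which two splits conflict only when all four pairwise intersections of their sides are non-empty. The function names and the toy data are hypothetical.

```python
from collections import Counter


def same_split(split_a, split_b, taxa):
    """True if the two splits divide the taxa in the same way."""
    a, b = frozenset(split_a), frozenset(split_b)
    return a == b or a == taxa - b


def compatible(split_a, split_b, taxa):
    """True if both splits could occur together on a single tree."""
    a, b = frozenset(split_a), frozenset(split_b)
    a_rest, b_rest = taxa - a, taxa - b
    # Conflict requires all four intersections to be non-empty.
    return not (a & b and a & b_rest and a_rest & b and a_rest & b_rest)


def classify(species_split, gene_tree_splits, taxa):
    """Label one gene tree relative to one species-tree split."""
    if any(same_split(species_split, g, taxa) for g in gene_tree_splits):
        return "concordant"
    if any(not compatible(species_split, g, taxa) for g in gene_tree_splits):
        return "conflicting"
    return "uninformative"


def tally(species_split, gene_trees, taxa):
    """Count gene trees in each category for one species-tree split."""
    return Counter(classify(species_split, splits, taxa) for splits in gene_trees)


# Hypothetical data: five taxa, one species-tree split, three gene trees.
taxa = frozenset("ABCDE")
species_split = {"A", "B"}
gene_trees = [
    [{"A", "B"}, {"D", "E"}],  # contains the split
    [{"A", "C"}, {"D", "E"}],  # contradicts it
    [{"D", "E"}],              # unresolved for this split
]
counts = tally(species_split, gene_trees, taxa)
print(counts["concordant"], counts["conflicting"], counts["uninformative"])  # 1 1 1
```

Repeating the tally for every internal branch of a species tree gives per-node counts of supporting and conflicting gene trees, which is the kind of summary the abstract describes.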
Affiliation(s)
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, S State St, Ann Arbor, 48109, MI, USA.
| | - Michael J Moore
- Department of Biology, Oberlin College, W Lorain St, Oberlin, 44074, OH, USA.
| | - Joseph W Brown
- Department of Ecology and Evolutionary Biology, University of Michigan, S State St, Ann Arbor, 48109, MI, USA.
| | - Ya Yang
- Department of Ecology and Evolutionary Biology, University of Michigan, S State St, Ann Arbor, 48109, MI, USA.
| |
|
183
|
Folk RA, Mandel JR, Freudenstein JV. A protocol for targeted enrichment of intron-containing sequence markers for recent radiations: A phylogenomic example from Heuchera (Saxifragaceae). APPLICATIONS IN PLANT SCIENCES 2015; 3:apps1500039. [PMID: 26312196 PMCID: PMC4542943 DOI: 10.3732/apps.1500039] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 04/06/2015] [Accepted: 07/09/2015] [Indexed: 05/18/2023]
Abstract
PREMISE OF THE STUDY Phylogenetic inference is moving to large multilocus data sets, yet there remains uncertainty in the choice of marker and sequencing method at low taxonomic levels. To address this gap, we present a method for enriching long loci spanning intron-exon boundaries in the genus Heuchera. METHODS Two hundred seventy-eight loci were designed using a splice-site prediction method combining transcriptomic and genomic data. Biotinylated probes were designed for enrichment of these loci. Reference-based assembly was performed using genomic references; additionally, chloroplast and mitochondrial genomes were used as references for off-target reads. The data were aligned and subjected to coalescent and concatenated phylogenetic analyses to demonstrate support for major relationships. RESULTS Complete or nearly complete (>99%) sequences were assembled from essentially all loci from all taxa. Aligned introns showed a fourfold increase in divergence relative to exons. Concatenated analysis gave decisive support to all nodes, and support was also high and relationships mostly similar in the coalescent analysis. Organellar phylogenies were also well-supported and conflicted with the nuclear signal. DISCUSSION Our approach shows promise for resolving a recent radiation. Enrichment for introns is highly successful with little or no sequencing dropout at low taxonomic levels despite higher substitution and indel frequencies, and should be exploited in studies of species complexes.
Affiliation(s)
- Ryan A. Folk
- Herbarium, The Ohio State University, Columbus, Ohio 43212 USA
| | - Jennifer R. Mandel
- Department of Biology, University of Memphis, Memphis, Tennessee 38152 USA
| | | |
|
184
|
Thomaz AT, Arcila D, Ortí G, Malabarba LR. Molecular phylogeny of the subfamily Stevardiinae Gill, 1858 (Characiformes: Characidae): classification and the evolution of reproductive traits. BMC Evol Biol 2015; 15:146. [PMID: 26195030 PMCID: PMC4509481 DOI: 10.1186/s12862-015-0403-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 01/13/2015] [Accepted: 06/02/2015] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND The subfamily Stevardiinae is a diverse and widely distributed clade of freshwater fishes from South and Central America, commonly known as "tetras" (Characidae). The group was named "clade A" when first proposed as a monophyletic unit of Characidae and later designated as a subfamily. Stevardiinae includes 48 genera and around 310 valid species, many of which have an inseminating reproductive strategy. No global hypothesis of relationships is available for this group, and currently many genera are listed as incertae sedis or are suspected to be non-monophyletic. RESULTS We present a molecular phylogeny with the largest number of stevardiine species analyzed so far, including 355 samples representing 153 putative species distributed in 32 genera, to test the group's monophyly and internal relationships. The phylogeny was inferred using DNA sequence data from seven gene fragments (mtDNA: 12S, 16S and COI; nuclear: RAG1, RAG2, MYH6 and PTR). The results support the Stevardiinae as a monophyletic group and a detailed hypothesis of the internal relationships for this subfamily. CONCLUSIONS A revised classification based on the molecular phylogeny is proposed that includes seven tribes and also defines monophyletic genera, including a resurrected genus Eretmobrycon, and new definitions for Diapoma, Hemibrycon, Bryconamericus sensu stricto, and Knodus sensu stricto, placing some small genera as junior synonyms. Inseminating species are distributed in several clades, suggesting that reproductive strategy is evolutionarily labile in this group of fishes.
Affiliation(s)
- Andréa T Thomaz
- Department of Ecology and Evolutionary Biology (EEB), University of Michigan, 1109 Geddes Ave., Ann Arbor, 48109, MI, USA.
- Departamento de Zoologia, Universidade Federal do Rio Grande do Sul (UFRGS), Av. Bento Gonçalves 9500, Porto Alegre, 90501-970, RS, Brazil.
| | - Dahiana Arcila
- Department of Biological Sciences, The George Washington University, 2023 G St. NW, Washington, DC, 20052, USA.
- Department of Vertebrate Zoology, National Museum of Natural History Smithsonian Institution, PO Box 37012, MRC 159, Washington, DC, 20013, USA.
| | - Guillermo Ortí
- Department of Biological Sciences, The George Washington University, 2023 G St. NW, Washington, DC, 20052, USA.
| | - Luiz R Malabarba
- Departamento de Zoologia, Universidade Federal do Rio Grande do Sul (UFRGS), Av. Bento Gonçalves 9500, Porto Alegre, 90501-970, RS, Brazil.
| |
|
185
|
Abstract
Hybrids between species are often sterile or inviable. This form of reproductive isolation is thought to evolve via the accumulation of mutations that interact to reduce fitness when combined in hybrids. Mathematical formulations of this "Dobzhansky-Muller model" predict an accelerating buildup of hybrid incompatibilities with divergence time (the "snowball effect"). Although the Dobzhansky-Muller model is widely accepted, the snowball effect has only been tested in two species groups. We evaluated evidence for the snowball effect in the evolution of hybrid male sterility among subspecies of house mice, a recently diverged group that shows partial reproductive isolation. We compared the history of subspecies divergence with patterns of quantitative trait loci (QTL) detected in F2 intercrosses between two pairs of subspecies (Mus musculus domesticus with M. m. musculus and M. m. domesticus with M. m. castaneus). We used a recently developed phylogenetic comparative method to statistically measure the fit of these data to the snowball prediction. To apply this method, QTL were partitioned as either shared or unshared in the two crosses. A heuristic partitioning based on the overlap of QTL confidence intervals produced unambiguous support for the snowball effect. An alternative approach combining data among crosses favored the snowball effect for the autosomes, but a linear accumulation of incompatibilities for the X chromosome. Reasoning that the X chromosome analyses are complicated by low mapping resolution, we conclude that hybrid male sterility loci have snowballed in house mice. Our study illustrates the power of comparative genetic mapping for understanding mechanisms of speciation.
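The contrast between a "snowball" and a linear accumulation of incompatibilities can be made concrete with a short Python sketch. It assumes the standard pairwise formulation of the Dobzhansky-Muller model, in which each of the K(K-1)/2 pairs among K substitutions is incompatible with a small probability p. That formula, the function names, and the parameter values are assumptions introduced here for illustration and are not taken from the abstract.

```python
def snowball_incompatibilities(k, p):
    """Expected incompatibilities if every pair of K substitutions can interact.

    Assumes the pairwise model: K * (K - 1) / 2 possible pairs, each
    incompatible with probability p (an illustrative assumption).
    """
    return p * k * (k - 1) / 2


def linear_incompatibilities(k, rate):
    """Expected incompatibilities if each substitution adds a fixed amount."""
    return rate * k


# Doubling divergence roughly quadruples the snowball expectation,
# but only doubles the linear one.
for k in (10, 20, 40):
    print(k, snowball_incompatibilities(k, 0.5), linear_incompatibilities(k, 0.5))
```

Distinguishing the two curves from data is the hard part; the study above does so by comparing shared and unshared QTL across crosses, which this sketch does not attempt.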
|
186
|
Xi Z, Liu L, Davis CC. Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased. Mol Phylogenet Evol 2015; 92:63-71. [PMID: 26115844 DOI: 10.1016/j.ympev.2015.06.009] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 01/27/2015] [Revised: 04/23/2015] [Accepted: 06/16/2015] [Indexed: 11/30/2022]
Abstract
The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014).
Affiliation(s)
- Zhenxiang Xi
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Liang Liu
- Department of Statistics, University of Georgia, Athens, GA 30602, USA; Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - Charles C Davis
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA.
| |
|
187
|
Bayzid MS, Mirarab S, Boussau B, Warnow T. Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses. PLoS One 2015; 10:e0129183. [PMID: 26086579 PMCID: PMC4472720 DOI: 10.1371/journal.pone.0129183] [Citation(s) in RCA: 84] [Impact Index Per Article: 9.3] [Received: 12/16/2014] [Accepted: 05/05/2015] [Indexed: 11/19/2022] [Open Access]
Abstract
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. 
Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning.
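The weighting step described above can be sketched in a few lines of Python. The sketch assumes that weighting is realized by repeating each re-calculated (supergene) tree once per gene in its bin before the trees are handed to a summary method. The function name `weight_by_bin_size`, the gene labels, and the Newick strings are hypothetical, and this is not the code from the linked repository.

```python
def weight_by_bin_size(bins, supergene_trees):
    """Repeat each bin's supergene tree once per gene in that bin.

    bins            -- list of lists of gene identifiers (one list per bin)
    supergene_trees -- list of Newick strings, one tree per bin, same order
    Returns the weighted list of trees to pass to a summary method.
    """
    if len(bins) != len(supergene_trees):
        raise ValueError("need exactly one supergene tree per bin")
    weighted = []
    for genes, tree in zip(bins, supergene_trees):
        # A bin of n genes contributes n copies of its tree.
        weighted.extend([tree] * len(genes))
    return weighted


# Toy input: a three-gene bin and a singleton bin.
bins = [["g1", "g2", "g3"], ["g4"]]
trees = ["((A,B),C);", "((A,C),B);"]
print(weight_by_bin_size(bins, trees))
```

Without the repetition, a bin of many genes would count no more than a singleton bin, so the distribution of input trees would no longer reflect the original set of loci.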
Affiliation(s)
| | - Siavash Mirarab
- Department of Computer Science, University of Texas at Austin, Austin, Texas, USA
| | - Bastien Boussau
- Laboratoire de Biométrie et Biologie Évolutive, Université de Lyon, France
| | - Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| |
|
188
|
Heyduk K, Trapnell DW, Barrett CF, Leebens-Mack J. Phylogenomic analyses of species relationships in the genus Sabal (Arecaceae) using targeted sequence capture. Biol J Linn Soc Lond 2015. [DOI: 10.1111/bij.12551] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Indexed: 01/05/2023]
Affiliation(s)
- Karolina Heyduk
- Department of Plant Biology; University of Georgia; Athens GA 30602 USA
| | | | - Craig F. Barrett
- Department of Biological Sciences; California State University; Los Angeles CA 90032 USA
| | - Jim Leebens-Mack
- Department of Plant Biology; University of Georgia; Athens GA 30602 USA
| |
|
189
|
Roy T, Cole LW, Chang TH, Lindqvist C. Untangling reticulate evolutionary relationships among New World and Hawaiian mints (Stachydeae, Lamiaceae). Mol Phylogenet Evol 2015; 89:46-62. [PMID: 25888973 DOI: 10.1016/j.ympev.2015.03.023] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/19/2014] [Revised: 03/24/2015] [Accepted: 03/26/2015] [Indexed: 02/05/2023]
Abstract
Polyploidy and hybridization usually result in novel genetic combinations, leading to complex, reticulate evolution and incongruence among gene trees, which in turn may show different phylogenetic histories than the underlying species tree. The largest tribe within the subfamily Lamioideae (Lamiaceae), Stachydeae, which includes the globally distributed Stachys and one of the largest Hawaiian angiosperm radiations, the endemic mints, is a widespread and taxonomically challenging lineage displaying a wide spectrum of morphological and chromosomal diversity. Previous molecular phylogenetic studies have shown that while the Hawaiian mints group with Mexican-South American Stachys based on chloroplast DNA sequence data, nuclear ribosomal DNA (nrDNA) sequences suggest that they are most closely related to temperate North American Stachys. Here, we have utilized five independently inherited, low-copy nuclear loci, and a variety of phylogenetic methods, including multi-locus coalescence-based tree reconstructions, to provide insight into the complex origins and evolutionary relationships between the New World Stachys and the Hawaiian mints. Our results demonstrate incongruence between individual gene trees, grouping the Hawaiian mints with both temperate North American and Meso-South American Stachys clades. However, our multi-locus coalescence tree is consistent with previous nrDNA results placing them within the temperate North American Stachys clade. Our results point toward a possible allopolyploid hybrid origin of the Hawaiian mints arising from temperate North American and Meso-South American ancestors, as well as a reticulate origin for South American Stachys. 
As such, our study is another significant step toward further understanding the putative parentage and the potential influence of hybridization and incomplete lineage sorting in giving rise to this insular plant lineage, which following colonization underwent rapid morphological and ecological diversification.
Affiliation(s)
- Tilottama Roy
- Department of Biological Sciences, University at Buffalo (SUNY), Buffalo, NY 14260, USA.
| | - Logan W Cole
- Department of Biological Sciences, University at Buffalo (SUNY), Buffalo, NY 14260, USA; Department of Biology, Indiana University, Bloomington, IN 47405, USA.
| | - Tien-Hao Chang
- Department of Biological Sciences, University at Buffalo (SUNY), Buffalo, NY 14260, USA.
| | - Charlotte Lindqvist
- Department of Biological Sciences, University at Buffalo (SUNY), Buffalo, NY 14260, USA.
| |
|
190
|
Liu L, Xi Z, Wu S, Davis CC, Edwards SV. Estimating phylogenetic trees from genome-scale data. Ann N Y Acad Sci 2015; 1360:36-53. [DOI: 10.1111/nyas.12747] [Citation(s) in RCA: 129] [Impact Index Per Article: 14.3] [Indexed: 11/30/2022]
Affiliation(s)
- Liang Liu
- Department of Statistics; University of Georgia; Athens Georgia
- Institute of Bioinformatics; University of Georgia; Athens Georgia
| | - Zhenxiang Xi
- Department of Organismic and Evolutionary Biology; Harvard University; Cambridge Massachusetts
| | - Shaoyuan Wu
- Department of Biochemistry and Molecular Biology & Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences; Tianjin Medical University; Tianjin China
| | - Charles C. Davis
- Department of Organismic and Evolutionary Biology; Harvard University; Cambridge Massachusetts
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology; Harvard University; Cambridge Massachusetts
| |
|
191
|
Brandley MC, Bragg JG, Singhal S, Chapple DG, Jennings CK, Lemmon AR, Lemmon EM, Thompson MB, Moritz C. Evaluating the performance of anchored hybrid enrichment at the tips of the tree of life: a phylogenetic analysis of Australian Eugongylus group scincid lizards. BMC Evol Biol 2015; 15:62. [PMID: 25880916 PMCID: PMC4434831 DOI: 10.1186/s12862-015-0318-0] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 10/01/2014] [Accepted: 02/24/2015] [Indexed: 01/31/2023] [Open Access]
Abstract
Background High-throughput sequencing using targeted enrichment and transcriptomic methods enables rapid construction of phylogenomic data sets incorporating hundreds to thousands of loci. These advances have enabled access to an unprecedented amount of nucleotide sequence data, but they also pose new questions. Given that the loci targeted for enrichment are often highly conserved, how informative are they at different taxonomic scales, especially at the intraspecific/phylogeographic scale? We investigate this question using Australian scincid lizards in the Eugongylus group (Squamata: Scincidae). We sequenced 415 anchored hybrid enriched (AHE) loci for 43 individuals and mined 1650 exons (1648 loci) from transcriptomes (transcriptome mining) from 11 individuals, including multiple phylogeographic lineages within several species of Carlia, Lampropholis, and Saproscincus skinks. We assessed the phylogenetic information content of these loci at the intergeneric, interspecific, and phylogeographic scales. As a further test of the utility at the phylogeographic scale, we used the AHE loci to infer lineage divergence parameters using coalescent models of isolation with migration. Results Phylogenetic analyses of both data sets inferred very strongly supported trees at all taxonomic levels. Further, AHE loci yielded estimates of divergence times between closely related lineages that were broadly consistent with previous population-level analyses. Conclusions Anchored-enriched loci are useful at the deep phylogeny and phylogeographic scales. Although overall phylogenetic support was high throughout the Australian Eugongylus group phylogeny, there were nonetheless some conflicting or unresolved relationships, especially regarding the placement of Pseudemoia, Cryptoblepharus, and the relationships amongst closely-related species of Tasmanian Niveoscincus skinks. 
Electronic supplementary material The online version of this article (doi:10.1186/s12862-015-0318-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Matthew C Brandley
- School of Biological Sciences, Heydon-Laurence Building A08, University of Sydney, Sydney, NSW, 2006, Australia; New York University - Sydney, The Rocks, NSW, 2000, Australia.
| | - Jason G Bragg
- Research School of Biology and Centre for Biodiversity Analysis, The Australian National University, Canberra, ACT 0200, Australia.
| | - Sonal Singhal
- Museum of Vertebrate Zoology, University of California, 3101 Valley Life Sciences Building, Berkeley, CA, 94720, USA; Department of Integrative Biology, University of California, 3060 Valley Life Sciences Building, Berkeley, CA, 94720, USA.
| | - David G Chapple
- School of Biological Sciences, Monash University, Clayton, Melbourne, VIC, 3800, Australia.
| | - Charlotte K Jennings
- Museum of Vertebrate Zoology, University of California, 3101 Valley Life Sciences Building, Berkeley, CA, 94720, USA; Department of Integrative Biology, University of California, 3060 Valley Life Sciences Building, Berkeley, CA, 94720, USA.
| | - Alan R Lemmon
- Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, FL, 32306, USA.
| | - Emily Moriarty Lemmon
- Department of Biological Science, Florida State University, 319 Stadium Drive, PO Box 3064295, Tallahassee, FL, 32306, USA.
| | - Michael B Thompson
- School of Biological Sciences, Heydon-Laurence Building A08, University of Sydney, Sydney, NSW, 2006, Australia.
| | - Craig Moritz
- Research School of Biology and Centre for Biodiversity Analysis, The Australian National University, Canberra, ACT 0200, Australia; The Commonwealth Scientific and Industrial Research Organization Ecosystem Sciences Division, GPO Box 1700, Canberra, ACT, 2601, Australia.
| |
|
192
|
Sharma PP, Fernández R, Esposito LA, González-Santillán E, Monod L. Phylogenomic resolution of scorpions reveals multilevel discordance with morphological phylogenetic signal. Proc Biol Sci 2015; 282:20142953. [PMID: 25716788 PMCID: PMC4375871 DOI: 10.1098/rspb.2014.2953] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 12/02/2014] [Accepted: 01/26/2015] [Indexed: 01/22/2023] [Open Access]
Abstract
Scorpions represent an iconic lineage of arthropods, historically renowned for their unique bauplan, ancient fossil record and venom potency. Yet, higher level relationships of scorpions, based exclusively on morphology, remain virtually untested, and no multilocus molecular phylogeny has been deployed heretofore towards assessing the basal tree topology. We applied a phylogenomic assessment to resolve scorpion phylogeny, for the first time, to our knowledge, sampling extensive molecular sequence data from all superfamilies and examining basal relationships with up to 5025 genes. Analyses of supermatrices as well as species tree approaches converged upon a robust basal topology of scorpions that is entirely at odds with traditional systematics and controverts previous understanding of scorpion evolutionary history. All analyses unanimously support a single origin of katoikogenic development, a form of parental investment wherein embryos are nurtured by direct connections to the parent's digestive system. Based on the phylogeny obtained herein, we propose the following systematic emendations: Caraboctonidae is transferred to Chactoidea (new superfamilial assignment); superfamily Bothriuroidea (revalidated) is resurrected and Bothriuridae transferred therein; and Chaerilida and Pseudochactida are synonymized with Buthida (new parvordinal synonymies).
Affiliation(s)
- Prashant P Sharma
- Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA
| | - Rosa Fernández
- Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
| | - Lauren A Esposito
- Essig Museum of Entomology, University of California at Berkeley, 130 Mulford Hall, Berkeley, CA 94720, USA
| | - Edmundo González-Santillán
- Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigaciones y de Estudios Avanzados del Instituto Politecnico Nacional, and Laboratorio de Aracnología, Departamento de Biología Comparada, Facultad de Ciencias, Universidad Nacional Autónoma de México, Coyoacán, C.P. 04510, México DF, México
| | - Lionel Monod
- Département des Arthropodes et d'Entomologie I, Muséum d'Histoire Naturelle de la Ville de Genève, Route de Malagnou 1, Genève 1208, Switzerland
| |
|
193
|
Resolving phylogenetic relationships of the recently radiated carnivorous plant genus Sarracenia using target enrichment. Mol Phylogenet Evol 2015; 85:76-87. [DOI: 10.1016/j.ympev.2015.01.015] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 10/05/2014] [Revised: 01/23/2015] [Accepted: 01/27/2015] [Indexed: 12/22/2022]
|
194
|
Roch S, Warnow T. On the Robustness to Gene Tree Estimation Error (or lack thereof) of Coalescent-Based Species Tree Methods. Syst Biol 2015; 64:663-76. [PMID: 25813358 DOI: 10.1093/sysbio/syv016] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 12/11/2014] [Accepted: 03/20/2015] [Indexed: 11/13/2022] [Open Access]
Abstract
The estimation of species trees using multiple loci has become increasingly common. Because different loci can have different phylogenetic histories (reflected in different gene tree topologies) for multiple biological causes, new approaches to species tree estimation have been developed that take gene tree heterogeneity into account. Among these multiple causes, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is potentially the most common cause of gene tree heterogeneity, and much of the focus of the recent literature has been on how to estimate species trees in the presence of ILS. Despite progress in developing statistically consistent techniques for estimating species trees when gene trees can differ due to ILS, there is substantial controversy in the systematics community as to whether to use the new coalescent-based methods or the traditional concatenation methods. One of the key issues that has been raised is understanding the impact of gene tree estimation error on coalescent-based methods that operate by combining gene trees. Here we explore the mathematical guarantees of coalescent-based methods when analyzing estimated rather than true gene trees. Our results provide some insight into the differences between the promise of coalescent-based methods in theory and their performance in practice.
Affiliation(s)
- Sebastien Roch
- Department of Mathematics, University of Wisconsin at Madison, 480 Lincoln Dr., Madison, Wisconsin, 53706, USA and Departments of Bioengineering and Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
| | - Tandy Warnow
- Department of Mathematics, University of Wisconsin at Madison, 480 Lincoln Dr., Madison, Wisconsin, 53706, USA and Departments of Bioengineering and Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
| |
|
195
|
Tonini J, Moore A, Stern D, Shcheglovitova M, Ortí G. Concatenation and Species Tree Methods Exhibit Statistically Indistinguishable Accuracy under a Range of Simulated Conditions. PLOS CURRENTS 2015; 7. [PMID: 25901289 PMCID: PMC4391732 DOI: 10.1371/currents.tol.34260cc27551a527b124ec5f6334b6be] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Indexed: 11/18/2022]
Abstract
Phylogeneticists have long understood that several biological processes can cause a gene tree to disagree with its species tree. In recent years, molecular phylogeneticists have increasingly foregone traditional supermatrix approaches in favor of species tree methods that account for one such source of error, incomplete lineage sorting (ILS). While gene tree-species tree discordance no doubt poses a significant challenge to phylogenetic inference with molecular data, researchers have only recently begun to systematically evaluate the relative accuracy of traditional and ILS-sensitive methods. Here, we report on simulations demonstrating that concatenation can perform as well as or better than methods that attempt to account for sources of error introduced by ILS. Based on these and similar results from other researchers, we argue that concatenation remains a useful component of the phylogeneticist’s toolbox and highlight that phylogeneticists should continue to make explicit comparisons of results produced by contemporaneous and classical methods.
Affiliation(s)
- João Tonini
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
| | - Andrew Moore
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
| | - David Stern
- Computational Biology Institute, Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
| | - Maryia Shcheglovitova
- Department of Geography & Environmental Systems, University of Maryland Baltimore County, Baltimore, MD, USA
| | - Guillermo Ortí
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
| |
|
196
|
Likelihood-based tree reconstruction on a concatenation of aligned sequence data sets can be statistically inconsistent. Theor Popul Biol 2015; 100C:56-62. [DOI: 10.1016/j.tpb.2014.12.005] [Citation(s) in RCA: 174] [Impact Index Per Article: 19.3] [Received: 09/06/2014] [Revised: 11/10/2014] [Accepted: 12/18/2014] [Indexed: 01/14/2023]
|
197
|
Dasarathy G, Nowak R, Roch S. Data Requirement for Phylogenetic Inference from Multiple Loci: A New Distance Method. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:422-432. [PMID: 26357228 DOI: 10.1109/tcbb.2014.2361685] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 06/05/2023]
Abstract
We consider the problem of estimating the evolutionary history of a set of species (phylogeny or species tree) from several genes. It is known that the evolutionary history of individual genes (gene trees) might be topologically distinct from each other and from the underlying species tree, possibly confounding phylogenetic analysis. A further complication in practice is that one has to estimate gene trees from molecular sequences of finite length. We provide the first full data-requirement analysis of a species tree reconstruction method that takes into account estimation errors at the gene level. Under that criterion, we also devise a novel reconstruction algorithm that provably improves over all previous methods in a regime of interest.
|
198
|
Tang L, Zou XH, Zhang LB, Ge S. Multilocus species tree analyses resolve the ancient radiation of the subtribe Zizaniinae (Poaceae). Mol Phylogenet Evol 2015; 84:232-9. [DOI: 10.1016/j.ympev.2015.01.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/29/2014] [Revised: 01/06/2015] [Accepted: 01/24/2015] [Indexed: 10/24/2022]
|
199
|
Joly S, Bryant D, Lockhart PJ. Flexible methods for estimating genetic distances from single nucleotide polymorphisms. Methods Ecol Evol 2015. [DOI: 10.1111/2041-210x.12343] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Affiliation(s)
- Simon Joly
- Institut de recherche en biologie végétale Montreal Botanical Garden 4101 Sherbrooke East Montreal QC H1X 2B2 Canada
| | - David Bryant
- Department of Mathematics and Statistics University of Otago P.O. Box 56, Dunedin 9054 New Zealand
| | - Peter J. Lockhart
- Institute of Fundamental Sciences Massey University Private Bag 11 222 Palmerston North New Zealand
| |
|
200
|
Crawford NG, Parham JF, Sellas AB, Faircloth BC, Glenn TC, Papenfuss TJ, Henderson JB, Hansen MH, Simison WB. A phylogenomic analysis of turtles. Mol Phylogenet Evol 2015; 83:250-7. [DOI: 10.1016/j.ympev.2014.10.021] [Citation(s) in RCA: 209] [Impact Index Per Article: 23.2] [Received: 07/11/2014] [Revised: 10/16/2014] [Accepted: 10/28/2014] [Indexed: 11/25/2022]
|