101
|
Nagy LG, Szöllősi G. Fungal Phylogeny in the Age of Genomics: Insights Into Phylogenetic Inference From Genome-Scale Datasets. Advances in Genetics 2017; 100:49-72. [DOI: 10.1016/bs.adgen.2017.09.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 01/26/2023]
|
102
|
Zhao L, Li X, Zhang N, Zhang SD, Yi TS, Ma H, Guo ZH, Li DZ. Phylogenomic analyses of large-scale nuclear genes provide new insights into the evolutionary relationships within the rosids. Mol Phylogenet Evol 2016; 105:166-176. [DOI: 10.1016/j.ympev.2016.06.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/28/2015] [Revised: 06/06/2016] [Accepted: 06/27/2016] [Indexed: 12/28/2022]
|
103
|
Sequence capture using RAD probes clarifies phylogenetic relationships and species boundaries in Primula sect. Auricula. Mol Phylogenet Evol 2016; 104:60-72. [DOI: 10.1016/j.ympev.2016.08.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 04/11/2016] [Revised: 07/27/2016] [Accepted: 08/04/2016] [Indexed: 11/21/2022]
|
104
|
Koyama T, Ito H, Fujisawa T, Ikeda H, Kakishima S, Cooley JR, Simon C, Yoshimura J, Sota T. Genomic divergence and lack of introgressive hybridization between two 13-year periodical cicadas support life cycle switching in the face of climate change. Mol Ecol 2016; 25:5543-5556. [PMID: 27661077 DOI: 10.1111/mec.13858] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/30/2016] [Revised: 09/13/2016] [Accepted: 09/18/2016] [Indexed: 01/15/2023]
Abstract
Life history evolution spurred by post-Pleistocene climatic change is hypothesized to be responsible for the present diversity in periodical cicadas (Magicicada), but the mechanism of life cycle change has been controversial. To understand the divergence process of 13-year and 17-year cicada life cycles, we studied genetic relationships between two synchronously emerging, parapatric 13-year periodical cicada species in the Decim group, Magicicada tredecim and M. neotredecim. The latter was hypothesized to be of hybrid origin or to have switched from a 17-year cycle via developmental plasticity. Phylogenetic analysis using restriction-site-associated DNA sequences for all Decim species and broods revealed that the 13-year M. tredecim lineage is genomically distinct from 17-year Magicicada septendecim but that 13-year M. neotredecim is not. We detected no significant introgression between M. tredecim and M. neotredecim/M. septendecim thus refuting the hypothesis that M. neotredecim are products of hybridization between M. tredecim and M. septendecim. Further, we found that introgressive hybridization is very rare or absent in the contact zone between the two 13-year species evidenced by segregation patterns in single nucleotide polymorphisms, mitochondrial lineage identity and head width and abdominal sternite colour phenotypes. Our study demonstrates that the two 13-year Decim species are of independent origin and nearly completely reproductively isolated. Combining our data with increasing observations of occasional life cycle change in part of a cohort (e.g. 4-year acceleration of emergence in 17-year species), we suggest a pivotal role for developmental plasticity in Magicicada life cycle evolution.
Affiliation(s)
- Takuya Koyama
  - Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto, 606-8502, Japan
- Hiromu Ito
  - Graduate School of Science and Technology, Shizuoka University, Hamamatsu, 432-8561, Japan; Department of International Health, Institute of Tropical Medicine, Nagasaki University, Nagasaki, 852-8523, Japan
- Tomochika Fujisawa
  - Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto, 606-8502, Japan
- Hiroshi Ikeda
  - Faculty of Agriculture and Life Science, Hirosaki University, Hirosaki, 036-8561, Japan
- Satoshi Kakishima
  - Graduate School of Science and Technology, Shizuoka University, Hamamatsu, 432-8561, Japan
- John R Cooley
  - College of Integrative Sciences, Wesleyan University, Middletown, CT, 06459, USA
- Chris Simon
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06268-3043, USA
- Jin Yoshimura
  - Graduate School of Science and Technology, Shizuoka University, Hamamatsu, 432-8561, Japan; Department of Environmental and Forest Biology, State University of New York College of Environmental Science and Forestry, Syracuse, NY, 13210, USA; Marine Biosystems Research Center, Chiba University, Uchiura, Kamogawa, Chiba, 299-5502, Japan
- Teiji Sota
  - Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto, 606-8502, Japan
|
105
|
Hamilton CA, Lemmon AR, Lemmon EM, Bond JE. Expanding anchored hybrid enrichment to resolve both deep and shallow relationships within the spider tree of life. BMC Evol Biol 2016; 16:212. [PMID: 27733110 PMCID: PMC5062932 DOI: 10.1186/s12862-016-0769-y] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 06/04/2016] [Accepted: 09/28/2016] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Despite considerable effort, progress in spider molecular systematics has lagged behind many other comparable arthropod groups, thereby hindering family-level resolution, classification, and testing of important macroevolutionary hypotheses. Recently, alternative targeted sequence capture techniques have provided molecular systematics a powerful tool for resolving relationships across the Tree of Life. One of these approaches, Anchored Hybrid Enrichment (AHE), is designed to recover hundreds of unique orthologous loci from across the genome, for resolving both shallow and deep-scale evolutionary relationships within non-model systems. Herein we present a modification of the AHE approach that expands its use for application in spiders, with a particular emphasis on the infraorder Mygalomorphae. RESULTS Our aim was to design a set of probes that effectively capture loci informative at a diversity of phylogenetic timescales. Following identification of putative arthropod-wide loci, we utilized homologous transcriptome sequences from 17 species across all spiders to identify exon boundaries. Conserved regions with variable flanking regions were then sought across the tick genome, three published araneomorph spider genomes, and raw genomic reads of two mygalomorph taxa. Following development of the 585 target loci in the Spider Probe Kit, we applied AHE across three taxonomic depths to evaluate performance: deep-level spider family relationships (33 taxa, 327 loci); family and generic relationships within the mygalomorph family Euctenizidae (25 taxa, 403 loci); and species relationships in the North American tarantula genus Aphonopelma (83 taxa, 581 loci). At the deepest level, all three major spider lineages (the Mesothelae, Mygalomorphae, and Araneomorphae) were supported with high bootstrap support. 
Strong support was also found throughout the Euctenizidae, including generic relationships within the family and species relationships within the genus Aptostichus. As in the Euctenizidae, virtually identical topologies were inferred with high support throughout Aphonopelma. CONCLUSIONS The Spider Probe Kit, the first implementation of AHE methodology in Class Arachnida, holds great promise for gathering the types and quantities of molecular data needed to accelerate an understanding of the spider Tree of Life by providing a mechanism whereby different researchers can confidently and effectively use the same loci for independent projects, yet allowing synthesis of data across independent research groups.
Affiliation(s)
- Chris A. Hamilton
  - Department of Biological Sciences, Auburn University & Auburn University Museum of Natural History, Auburn, AL USA
- Alan R. Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL USA
- Jason E. Bond
  - Department of Biological Sciences, Auburn University & Auburn University Museum of Natural History, Auburn, AL USA
|
106
|
Figueroa A, McKelvy AD, Grismer LL, Bell CD, Lailvaux SP. A Species-Level Phylogeny of Extant Snakes with Description of a New Colubrid Subfamily and Genus. PLoS One 2016; 11:e0161070. [PMID: 27603205 PMCID: PMC5014348 DOI: 10.1371/journal.pone.0161070] [Citation(s) in RCA: 159] [Impact Index Per Article: 19.9] [Received: 03/29/2016] [Accepted: 07/28/2016] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND With over 3,500 species encompassing a diverse range of morphologies and ecologies, snakes make up 36% of squamate diversity. Despite several attempts at estimating higher-level snake relationships and numerous assessments of generic- or species-level phylogenies, a large-scale species-level phylogeny solely focusing on snakes has not been completed. Here, we provide the largest-yet estimate of the snake tree of life using maximum likelihood on a supermatrix of 1745 taxa (1652 snake species + 7 outgroup taxa) and 9,523 base pairs from 10 loci (5 nuclear, 5 mitochondrial), including previously unsequenced genera (2) and species (61). RESULTS Increased taxon sampling resulted in a phylogeny with a new higher-level topology and corroborated many lower-level relationships, strengthened by high nodal support values (> 85%) down to the species level (73.69% of nodes). Although the majority of families and subfamilies were strongly supported as monophyletic with > 88% support values, some families and numerous genera were paraphyletic, primarily due to limited taxon and locus sampling leading to a sparse supermatrix and minimal sequence overlap between some closely related taxa. With all rogue taxa and incertae sedis species eliminated, higher-level relationships and support values remained relatively unchanged, except in five problematic clades. CONCLUSION Our analyses resulted in new topologies at higher and lower levels; resolved several previous topological issues; established novel paraphyletic affiliations; designated a new subfamily, Ahaetuliinae, for the genera Ahaetulla, Chrysopelea, Dendrelaphis, and Dryophiops; and appointed Hemerophis (Coluber) zebrinus to a new genus, Mopanveldophis. Although we provide insight into some distinguished problematic nodes, at the deeper phylogenetic scale, resolution of these nodes may require sampling of more slowly evolving nuclear genes.
Affiliation(s)
- Alex Figueroa
  - Department of Biological Sciences, University of New Orleans, New Orleans, LA, United States of America
- Alexander D. McKelvy
  - Department of Biology, The Graduate School and Center, City University of New York, New York, NY, United States of America
  - Department of Biology, 6S-143, College of Staten Island, 2800 Victory Boulevard, Staten Island, NY, United States of America
- L. Lee Grismer
  - Department of Biology, La Sierra University, 4500 Riverwalk Parkway, Riverside, CA, United States of America
- Charles D. Bell
  - Department of Biological Sciences, University of New Orleans, New Orleans, LA, United States of America
- Simon P. Lailvaux
  - Department of Biological Sciences, University of New Orleans, New Orleans, LA, United States of America
|
107
|
Shen XX, Salichos L, Rokas A. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference. Genome Biol Evol 2016; 8:2565-80. [PMID: 27492233 PMCID: PMC5010910 DOI: 10.1093/gbe/evw179] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Accepted: 07/25/2016] [Indexed: 12/13/2022] Open
Abstract
Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal, in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, along with the number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal and could be useful in guiding the choice of phylogenetic markers.
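As an illustration of the treeness/RCV property named in this abstract, the following is a minimal sketch and not the authors' code. Treeness follows the abstract's wording (the share of total tree length on internal branches). The RCV formula used here, the summed absolute deviation of each taxon's character counts from the across-taxon mean, divided by taxa times sites, is an assumption based on the commonly cited definition and is not given in the abstract. The function names and input shapes are hypothetical.

```python
from collections import Counter


def treeness(internal_branch_lengths, terminal_branch_lengths):
    """Proportion of total tree length that lies on internal branches."""
    internal = sum(internal_branch_lengths)
    total = internal + sum(terminal_branch_lengths)
    if total == 0:
        raise ValueError("tree has zero total length")
    return internal / total


def relative_composition_variability(alignment):
    """RCV for a list of equal-length aligned sequences (one per taxon).

    Assumed formula: sum over taxa and characters of
    |count in taxon - mean count across taxa| / (n_taxa * n_sites).
    Gaps and ambiguity codes are counted like any other character here,
    which is a simplification.
    """
    n_taxa = len(alignment)
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("sequences must have equal length")
    counts = [Counter(seq) for seq in alignment]
    characters = set().union(*counts)
    deviation = 0.0
    for ch in characters:
        mean_count = sum(c[ch] for c in counts) / n_taxa
        deviation += sum(abs(c[ch] - mean_count) for c in counts)
    return deviation / (n_taxa * n_sites)


# Toy usage: two internal and four terminal branches, two tiny alignments.
print(treeness([1.0, 1.0], [1.0, 1.0, 2.0, 2.0]))
print(relative_composition_variability(["AAAA", "CCCC"]))
```

The ratio treeness/RCV can then be formed by the caller, taking care that an alignment with identical composition in every taxon yields an RCV of zero.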
Affiliation(s)
- Xing-Xing Shen
  - Department of Biological Sciences, Vanderbilt University
- Leonidas Salichos
  - Department of Biological Sciences, Vanderbilt University; Department of Molecular Biophysics and Biochemistry, Yale University
- Antonis Rokas
  - Department of Biological Sciences, Vanderbilt University
|
108
|
Fernández R, Edgecombe GD, Giribet G. Exploring Phylogenetic Relationships within Myriapoda and the Effects of Matrix Composition and Occupancy on Phylogenomic Reconstruction. Syst Biol 2016; 65:871-89. [PMID: 27162151 PMCID: PMC4997009 DOI: 10.1093/sysbio/syw041] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Received: 12/06/2015] [Accepted: 04/28/2016] [Indexed: 11/14/2022] Open
Abstract
Myriapods, including the diverse and familiar centipedes and millipedes, are one of the dominant terrestrial arthropod groups. Although molecular evidence has shown that Myriapoda is monophyletic, its internal phylogeny remains contentious and understudied, especially when compared to those of Chelicerata and Hexapoda. Until now, efforts have focused on taxon sampling (e.g., by including a handful of genes from many species) or on maximizing matrix size (e.g., by including hundreds or thousands of genes in just a few species), but a phylogeny maximizing sampling at both levels remains elusive. In this study, we analyzed 40 Illumina transcriptomes representing 3 of the 4 myriapod classes (Diplopoda, Chilopoda, and Symphyla); 25 transcriptomes were newly sequenced to maximize representation at the ordinal level in Diplopoda and at the family level in Chilopoda. Ten supermatrices were constructed to explore the effect of several potential phylogenetic biases (e.g., rate of evolution, heterotachy) at 3 levels of gene occupancy per taxon (50%, 75%, and 90%). Analyses based on maximum likelihood and Bayesian mixture models retrieved monophyly of each myriapod class, and resulted in 2 alternative phylogenetic positions for Symphyla, as sister group to Diplopoda + Chilopoda, or closer to Diplopoda, the latter hypothesis having been traditionally supported by morphology. Within centipedes, all orders were well supported, but 2 deep nodes remained in conflict in the different analyses despite dense taxon sampling at the family level. Relationships among centipede orders in all analyses conducted with the most complete matrix (90% occupancy) are at odds not only with the sparser but more gene-rich supermatrices (75% and 50% supermatrices) and with the matrices optimizing phylogenetic informativeness or most conserved genes, but also with previous hypotheses based on morphology, development, or other molecular data sets. 
Our results indicate that a high percentage of ribosomal proteins in the most complete matrices, in conjunction with distance from the root, can act in concert to compromise the estimated relationships within the ingroup. We discuss the implications of these findings in the context of the ever more prevalent quest for completeness in phylogenomic studies.
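The occupancy thresholds mentioned in this abstract (50%, 75%, and 90%) can be illustrated with a small sketch. It assumes that occupancy means the fraction of sampled taxa in which a gene is present, and that a supermatrix at a given threshold keeps only genes meeting it. The function name and the dictionary layout are hypothetical and are not taken from the study.

```python
def filter_genes_by_occupancy(gene_to_taxa, n_taxa, min_occupancy):
    """Return the sorted names of genes present in at least
    `min_occupancy` (a fraction between 0 and 1) of the `n_taxa` taxa.

    `gene_to_taxa` maps a gene name to the set of taxa that have it.
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    return sorted(
        gene
        for gene, taxa in gene_to_taxa.items()
        if len(taxa) / n_taxa >= min_occupancy
    )


# Toy data set with four taxa and three genes (made-up names).
presence = {
    "geneA": {"t1", "t2", "t3", "t4"},
    "geneB": {"t1", "t2", "t3"},
    "geneC": {"t1", "t2"},
}
for threshold in (0.50, 0.75, 0.90):
    print(threshold, filter_genes_by_occupancy(presence, 4, threshold))
```

Stricter thresholds give more complete but smaller matrices, which is the trade-off between completeness and gene richness that the abstract examines.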
Affiliation(s)
- Rosa Fernández
  - Museum of Comparative Zoology & Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Gregory D Edgecombe
  - Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Gonzalo Giribet
  - Museum of Comparative Zoology & Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
|
109
|
Islands in the desert: Species delimitation and evolutionary history of Pseudotetracha tiger beetles (Coleoptera: Cicindelidae: Megacephalini) from Australian salt lakes. Mol Phylogenet Evol 2016; 101:279-285. [DOI: 10.1016/j.ympev.2016.05.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/12/2015] [Revised: 05/09/2016] [Accepted: 05/16/2016] [Indexed: 11/19/2022]
|
110
|
Abstract
Animals make up only a small fraction of the eukaryotic tree of life, yet, from our vantage point as members of the animal kingdom, the evolution of the bewildering diversity of animal forms is endlessly fascinating. In the century following the publication of Darwin's Origin of Species, hypotheses regarding the evolution of the major branches of the animal kingdom - their relationships to each other and the evolution of their body plans - were based on a consideration of the morphological and developmental characteristics of the different animal groups. This morphology-based approach had many successes, but important aspects of the evolutionary tree remained disputed. In the past three decades, molecular data, most obviously primary sequences of DNA and proteins, have provided an estimate of animal phylogeny largely independent of the morphological evolution we would ultimately like to understand. The molecular tree that has evolved over the past three decades has drastically altered our view of animal phylogeny, and many aspects of the tree are no longer contentious. The focus of molecular studies on relationships between animal groups means, however, that the discipline has become somewhat divorced from the underlying biology and from the morphological characteristics whose evolution we aim to understand. Here, we consider what we currently know of animal phylogeny, what aspects we are still uncertain about, and what our improved understanding of animal phylogeny can tell us about the evolution of the great diversity of animal life.
Affiliation(s)
- Maximilian J Telford
  - Department of Genetics, Evolution and Environment, University College London, WC1E 6BT, UK
- Graham E Budd
  - Department of Earth Sciences, Palaeobiology, Uppsala University, Villavägen 16, 75236 Uppsala, Sweden
- Hervé Philippe
  - Centre de Théorisation et de Modélisation de la Biodiversité, Station d'Ecologie Expérimentale du CNRS, USR CNRS 2936 Moulis, 09200, France; Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montréal, Québec, Canada
|
111
|
Irisarri I, Meyer A. The Identification of the Closest Living Relative(s) of Tetrapods: Phylogenomic Lessons for Resolving Short Ancient Internodes. Syst Biol 2016; 65:1057-1075. [PMID: 27425642 DOI: 10.1093/sysbio/syw057] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 11/23/2015] [Accepted: 06/08/2016] [Indexed: 01/08/2023] Open
Abstract
Identifying the closest living relative(s) of tetrapods is an important, yet still contested question in vertebrate phylogenetics. Three hypotheses are possible and ruling out alternatives has proven difficult even with large molecular data sets due to weak phylogenetic signal coupled with nonphylogenetic noise resulting from relatively rapid speciation events that occurred a long time ago (~400 Ma). Here, we revisit the identity of the closest living relative of land vertebrates from a phylogenomic perspective and include new genomic data for all extant lungfish genera. RNA-seq proves to be a great alternative to genomic sequencing, which currently is technically not feasible in lungfishes due to their huge (50-130 Gb) and repetitive genomes. We examined the most important sources of systematic error, namely long-branch attraction (LBA), compositional heterogeneity and distribution of missing data and applied different correction techniques. A multispecies coalescent approach is used to account for deep coalescence that might come from the short and deep internodes separating early sarcopterygian splits. Concatenation methods favored lungfishes as the closest living relatives of tetrapods with strong statistical support. Amino acid profile mixture models can unambiguously resolve this difficult internode thanks to their ability to avoid systematic error. We assessed the performance of different site-heterogeneous models and data partitioning and compared the ability of different strategies designed to overcome LBA, including taxon manipulation, reduction of among-lineage rate heterogeneity and removal of fast-evolving or compositionally heterogeneous positions. The identification of lungfish as sister group of tetrapods is robust regarding the effects of nonstationary composition and distribution of missing data. 
The multispecies coalescent method reconstructed strongly supported topologies that were congruent with concatenation, despite pervasive gene tree heterogeneity. We reject alternative topologies for early sarcopterygian relationships by increasing the signal-to-noise ratio in our alignments. The analytical pipeline outlined here combines probabilistic phylogenomic inference with methods for evaluating data quality, model adequacy, and assessing systematic error, and thus is likely to help resolve similarly difficult internodes in the tree of life. [Coalescence; coelacanth; compositional heterogeneity; gene tree; long-branch attraction; lungfish; missing data; model misspecification; phylogenomic; species tree; systematic error.].
Affiliation(s)
- Iker Irisarri
  - Laboratory for Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78464 Konstanz, Germany
- Axel Meyer
  - Laboratory for Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78464 Konstanz, Germany
|
112
|
MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation. mSystems 2016; 1:mSystems00020-16. [PMID: 27822531 PMCID: PMC5069763 DOI: 10.1128/msystems.00020-16] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 02/29/2016] [Accepted: 03/28/2016] [Indexed: 12/30/2022] Open
Abstract
Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer "palette" of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at: https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses. IMPORTANCE Taxonomic profiling is a challenging first step when analyzing a metagenomic sample. This work presents a method that facilitates fine-scale characterization of the presence, abundance, and evolutionary relatedness of organisms present in a given sample but absent from the training database. We calculate a "k-mer palette" which summarizes the information from all reads, not just those in conserved genes or containing taxon-specific markers. The compositions of palettes are easy to model, allowing rapid inference of community composition. In addition to providing strain-level information where applicable, our approach provides taxonomic profiles that are more accurate than those of competing methods. Author Video: An author video summary of this article is available.
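To make the idea of comparing k-mer content between a sample and a reference concrete, here is a minimal sketch. It is not the MetaPalette algorithm, which fits palettes with a model described in the paper. It only shows how k-mer sets can be extracted and what fraction of a reference's k-mers also occur in a set of reads. The function names are hypothetical, reverse complements are ignored, and the toy example uses k = 4 rather than the k = 30 or 50 quoted in the abstract.

```python
def kmer_set(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmer_fraction(reference, sample_reads, k):
    """Fraction of the reference's distinct k-mers found in any read.

    Simplification: strands are not canonicalised, so a read from the
    reverse strand would not match.
    """
    reference_kmers = kmer_set(reference, k)
    if not reference_kmers:
        return 0.0
    sample_kmers = set()
    for read in sample_reads:
        sample_kmers |= kmer_set(read, k)
    return len(reference_kmers & sample_kmers) / len(reference_kmers)


# Toy usage with a short made-up reference and a single read.
print(shared_kmer_fraction("ACGTACGT", ["ACGTA"], 4))
```

Long k-mers make chance matches between unrelated genomes unlikely, which is one reason a shared-k-mer fraction can carry information about relatedness.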
|
113
|
Rivera-Rivera CJ, Montoya-Burgos JI. LS³: A Method for Improving Phylogenomic Inferences When Evolutionary Rates Are Heterogeneous among Taxa. Mol Biol Evol 2016; 33:1625-34. [PMID: 26912812 PMCID: PMC4868118 DOI: 10.1093/molbev/msw043] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022] Open
Abstract
Phylogenetic inference artifacts can occur when sequence evolution deviates from assumptions made by the models used to analyze them. The combination of strong model assumption violations and highly heterogeneous lineage evolutionary rates can become problematic in phylogenetic inference, and lead to the well-described long-branch attraction (LBA) artifact. Here, we define an objective criterion for assessing lineage evolutionary rate heterogeneity among predefined lineages: the result of a likelihood ratio test between a model in which the lineages evolve at the same rate (homogeneous model) and a model in which different lineage rates are allowed (heterogeneous model). We implement this criterion in the algorithm Locus Specific Sequence Subsampling (LS³), aimed at reducing the effects of LBA in multi-gene datasets. For each gene, LS³ sequentially removes the fastest-evolving taxon of the ingroup and tests for lineage rate homogeneity until all lineages have uniform evolutionary rates. The sequences excluded from the homogeneously evolving taxon subset are flagged as potentially problematic. The software implementation provides the user with the possibility to remove the flagged sequences for generating a new concatenated alignment. We tested LS³ with simulations and two real datasets containing LBA artifacts: a nucleotide dataset regarding the position of Glires within mammals and an amino-acid dataset concerning the position of nematodes within bilaterians. The initially incorrect phylogenies were corrected in all cases upon removing data flagged by LS³.
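The per-gene loop described in this abstract (test for rate homogeneity, remove the fastest-evolving ingroup taxon, repeat) can be sketched schematically as follows. This is not the LS³ software. The `fit_models` callback, the fixed `critical_value`, and the `toy_fit` stand-in are all hypothetical; a real analysis would obtain the two log-likelihoods from a phylogenetic likelihood program and would choose the critical value from the chi-square distribution for the appropriate degrees of freedom.

```python
def ls3_like_subsample(taxa, fit_models, critical_value, min_taxa=3):
    """Remove the fastest taxon until a likelihood ratio test no longer
    rejects rate homogeneity.

    `fit_models(kept)` must return (lnL_homogeneous, lnL_heterogeneous,
    rates), where `rates` maps each kept taxon to its estimated rate.
    Returns (kept_taxa, flagged_taxa, homogeneous_reached).
    """
    kept = list(taxa)
    flagged = []
    while len(kept) >= min_taxa:
        lnl_hom, lnl_het, rates = fit_models(kept)
        # Likelihood ratio statistic for the nested pair of models.
        lrt = 2.0 * (lnl_het - lnl_hom)
        if lrt <= critical_value:
            return kept, flagged, True
        fastest = max(kept, key=lambda taxon: rates[taxon])
        kept.remove(fastest)
        flagged.append(fastest)
    return kept, flagged, False


# Toy stand-in for model fitting (NOT a real likelihood): the penalty on
# the homogeneous model grows with the spread of made-up taxon rates.
TOY_RATES = {"a": 1.0, "b": 1.1, "c": 0.9, "d": 5.0}


def toy_fit(kept):
    rates = {taxon: TOY_RATES[taxon] for taxon in kept}
    mean_rate = sum(rates.values()) / len(rates)
    spread = sum((r - mean_rate) ** 2 for r in rates.values())
    return -spread, 0.0, rates


# 3.84 is the 5% chi-square critical value for one degree of freedom.
print(ls3_like_subsample(["a", "b", "c", "d"], toy_fit, 3.84))
```

In this toy run the fast taxon "d" is flagged and the remaining three taxa pass the test. Keeping the critical value fixed across iterations is a simplification of this sketch.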
Affiliation(s)
- Carlos J Rivera-Rivera
  - Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland; Institute of Genetics and Genomics in Geneva (iGE3), Geneva, Switzerland
- Juan I Montoya-Burgos
  - Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland; Institute of Genetics and Genomics in Geneva (iGE3), Geneva, Switzerland
|
114
|
Cavalier-Smith T, Chao EE, Lewis R. 187-gene phylogeny of protozoan phylum Amoebozoa reveals a new class (Cutosea) of deep-branching, ultrastructurally unique, enveloped marine Lobosa and clarifies amoeba evolution. Mol Phylogenet Evol 2016; 99:275-296. [DOI: 10.1016/j.ympev.2016.03.023] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 09/16/2015] [Revised: 03/16/2016] [Accepted: 03/17/2016] [Indexed: 10/22/2022]
|
115
|
Lewis PO, Chen MH, Kuo L, Lewis LA, Fučíková K, Neupane S, Wang YB, Shi D. Estimating Bayesian Phylogenetic Information Content. Syst Biol 2016; 65:1009-1023. [PMID: 27155008 PMCID: PMC5066063 DOI: 10.1093/sysbio/syw042] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/04/2016] [Revised: 04/15/2016] [Accepted: 05/01/2016] [Indexed: 11/13/2022] Open
Abstract
Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information content. If the data have no information relevant to ranking tree topologies beyond the information supplied by the prior, the posterior and prior will be identical. Information in data discourages consideration of some hypotheses allowed by the prior, resulting in a posterior distribution that is more concentrated (has lower entropy) than the prior. We focus on measuring information about tree topology using marginal posterior distributions of tree topologies. We show that both the accuracy and the computational efficiency of topological information content estimation improve with use of the conditional clade distribution, which also allows topological information content to be partitioned by clade. We explore two important applications of our method: providing a compelling definition of saturation and detecting conflict among data partitions that can negatively affect analyses of concatenated data. [Bayesian; concatenation; conditional clade distribution; entropy; information; phylogenetics; saturation.].
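The comparison of prior and posterior entropy described in this abstract can be illustrated with a naive sketch. It assumes a uniform prior over unrooted binary topologies, whose number for n taxa is (2n-5)!!, and estimates the posterior from raw topology frequencies in a sample. The abstract reports that the conditional clade distribution gives better estimates; that refinement is not implemented here. Function names and the string labels used for topologies are hypothetical.

```python
import math
from collections import Counter


def num_unrooted_topologies(n_taxa):
    """Number of unrooted binary tree topologies, (2n-5)!!, for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


def entropy(probabilities):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def topology_information(sampled_topologies, n_taxa):
    """Return (information, fraction of maximum) for a posterior sample.

    Information is prior entropy minus posterior entropy, with the prior
    assumed uniform over all unrooted binary topologies.
    """
    if n_taxa < 4:
        raise ValueError("with fewer than 4 taxa there is one topology")
    counts = Counter(sampled_topologies)
    total = sum(counts.values())
    posterior_entropy = entropy(c / total for c in counts.values())
    prior_entropy = math.log(num_unrooted_topologies(n_taxa))
    information = prior_entropy - posterior_entropy
    return information, information / prior_entropy


# Toy usage: four taxa, posterior sample concentrated on one topology.
print(topology_information(["((a,b),(c,d))"] * 100, 4))
```

A sample spread evenly over all three four-taxon topologies would instead give information close to zero, matching the abstract's point that uninformative data leave the posterior identical to the prior.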
Affiliation(s)
- Paul O Lewis
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269, USA
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Unit 4120, Storrs, CT 06269, USA
- Lynn Kuo
  - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Unit 4120, Storrs, CT 06269, USA
- Louise A Lewis
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269, USA
- Karolina Fučíková
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269, USA
- Suman Neupane
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269, USA
- Yu-Bo Wang
  - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Unit 4120, Storrs, CT 06269, USA
- Daoyuan Shi
  - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Unit 4120, Storrs, CT 06269, USA
|
116
|
Takezaki N, Nishihara H. Resolving the Phylogenetic Position of Coelacanth: The Closest Relative Is Not Always the Most Appropriate Outgroup. Genome Biol Evol 2016; 8:1208-21. [PMID: 27026053 PMCID: PMC4860700 DOI: 10.1093/gbe/evw071] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022]
Abstract
Determining the phylogenetic relationship of two extant lineages of lobe-finned fish, coelacanths and lungfishes, and tetrapods is important for understanding the origin of tetrapods. We analyzed data sets from two previous studies along with a newly collected data set, each of which had varying numbers of species and genes and varying extent of missing sites. We found that in all the data sets the sister relationship of lungfish and tetrapods was constructed with the use of cartilaginous fish as the outgroup with a high degree of statistical support. In contrast, when ray-finned fish were used as the outgroup, which is taxonomically an immediate outgroup of lobe-finned fish and tetrapods, the sister relationship of coelacanth and tetrapods was supported most strongly, although the statistical support was weaker. Even though it is generally accepted that the closest relative is an appropriate outgroup, our analysis suggested that the large divergence of the ray-finned fish as indicated by their long branch lengths and different amino acid frequencies made them less suitable as an outgroup than cartilaginous fish.
Affiliation(s)
- Naoko Takezaki: Life Science Research Center, Kagawa University, Mikicho, Kitagun, Kagawa, Japan
- Hidenori Nishihara: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Nagatsuta-Cho, Midori-Ku, Yokohama, Kanagawa, Japan
|
117
|
Takahashi T, Sota T. A robust phylogeny among major lineages of the East African cichlids. Mol Phylogenet Evol 2016; 100:234-242. [PMID: 27068840 DOI: 10.1016/j.ympev.2016.04.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 12/10/2015] [Revised: 03/16/2016] [Accepted: 04/07/2016] [Indexed: 11/30/2022]
Abstract
The huge monophyletic group of the East African cichlid radiations (EAR) consists of thousands of species belonging to 12-14 tribes; the number of tribes differs among studies. Many studies have inferred phylogenies of EAR tribes using various genetic markers. However, these phylogenies partly contradict one another and can have weak statistical support. In this study, we conducted maximum-likelihood (ML) phylogenetic analyses using restriction site-associated DNA (RAD) sequences and propose a new robust phylogenetic hypothesis among Lake Tanganyika cichlid fishes, which cover most EAR tribes. Data matrices can vary in size and contents depending on the strategies used to process RAD sequences. Therefore, we prepared 23 data matrices with various processing strategies. The ML phylogenies inferred from 15 large matrices (2.0×10^6 to 1.1×10^7 base pairs) resolved every tribe as a monophyletic group with 100% bootstrap support and shared the same topology regarding relationships among the tribes. Most nodes among the tribes were supported by 100% bootstrap values, and the bootstrap support for the other node varied among the 15 ML trees from 70% to 100%. These robust ML trees differ partly in topology from those in earlier studies, and these phylogenetic relationships have important implications for the tribal classification of EAR.
Affiliation(s)
- Tetsumi Takahashi: Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502, Japan; National Institute of Genetics, Yata, Mishima, Shizuoka 411-8540, Japan
- Teiji Sota: Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502, Japan
|
118
|
Rose JP, Kriebel R, Sytsma KJ. Shape analysis of moss (Bryophyta) sporophytes: Insights into land plant evolution. American Journal of Botany 2016; 103:652-62. [PMID: 26944353 DOI: 10.3732/ajb.1500394] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/02/2015] [Accepted: 01/08/2016] [Indexed: 05/05/2023]
Abstract
PREMISE OF THE STUDY: The alternation of generations life cycle represents a key feature of land-plant evolution and has resulted in a diverse array of sporophyte forms and modifications in all groups of land plants. We test the hypothesis that evolution of sporangium (capsule) shape of the mosses, the second most diverse land-plant lineage, has been driven by differing physiological demands of life in diverse habitats. This study provides an important conceptual framework for analyzing the evolution of a single, homologous character in a continuous framework across a deep expanse of time, across all branches of the tree of life. METHODS: We reconstruct ancestral sporangium shape and ancestral habitat on the largest phylogeny of mosses to date, and use phylogenetic generalized least squares regression to test the association between habitat and sporangium shape. In addition, we examine the association between shifts in sporangium shape and species diversification. RESULTS: We demonstrate that sporangium shape is convergent, under natural selection, and associated with habitat type, and that many shifts in speciation rate are associated with shifts in sporangium shape. CONCLUSIONS: Our results suggest that natural selection in different microhabitats results in the diversity of sporangium shape found in mosses, and that many increasing shifts in speciation rate result in changes in sporangium shape across their 480 million year history. Our framework provides a way to examine if diversification shifts in other land plants are also associated with massive changes in sporophyte form, among other morphological traits.
Affiliation(s)
- Jeffrey P Rose: Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, Wisconsin 53706
- Ricardo Kriebel: Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, Wisconsin 53706
- Kenneth J Sytsma: Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, Madison, Wisconsin 53706
|
119
|
Truszkowski J, Goldman N. Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps. Syst Biol 2016; 65:328-33. [PMID: 26615177 PMCID: PMC4748752 DOI: 10.1093/sysbio/syv089] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 08/05/2015] [Accepted: 11/19/2015] [Indexed: 11/14/2022]
Abstract
We prove that maximum likelihood phylogenetic inference is consistent on gapped multiple sequence alignments (MSAs) as long as substitution rates across each edge are greater than zero, under mild assumptions on the structure of the alignment. Under these assumptions, maximum likelihood will asymptotically recover the tree with edge lengths corresponding to the mean number of substitutions per site on each edge. This refutes Warnow's recent suggestion (Warnow 2012) that maximum likelihood phylogenetic inference might be statistically inconsistent when gaps are treated as missing data, even if the MSA is correct. We also derive a simple new proof of maximum likelihood consistency of ungapped alignments.
Affiliation(s)
- Jakub Truszkowski: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, UK; Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK
- Nick Goldman: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
|
120
|
Williams AV, Miller JT, Small I, Nevill PG, Boykin LM. Integration of complete chloroplast genome sequences with small amplicon datasets improves phylogenetic resolution in Acacia. Mol Phylogenet Evol 2016; 96:1-8. [DOI: 10.1016/j.ympev.2015.11.021] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 10/01/2015] [Revised: 11/12/2015] [Accepted: 11/24/2015] [Indexed: 11/27/2022]
|
121
|
Garrison NL, Rodriguez J, Agnarsson I, Coddington JA, Griswold CE, Hamilton CA, Hedin M, Kocot KM, Ledford JM, Bond JE. Spider phylogenomics: untangling the Spider Tree of Life. PeerJ 2016; 4:e1719. [PMID: 26925338 PMCID: PMC4768681 DOI: 10.7717/peerj.1719] [Citation(s) in RCA: 173] [Impact Index Per Article: 21.6] [Received: 11/06/2015] [Accepted: 01/31/2016] [Indexed: 12/12/2022]
Abstract
Spiders (Order Araneae) are massively abundant generalist arthropod predators that are found in nearly every ecosystem on the planet and have persisted for over 380 million years. Spiders have long served as evolutionary models for studying complex mating and web spinning behaviors, key innovation and adaptive radiation hypotheses, and have been inspiration for important theories like sexual selection by female choice. Unfortunately, past major attempts to reconstruct spider phylogeny typically employing the "usual suspect" genes have been unable to produce a well-supported phylogenetic framework for the entire order. To further resolve spider evolutionary relationships we have assembled a transcriptome-based data set comprising 70 ingroup spider taxa. Using maximum likelihood and shortcut coalescence-based approaches, we analyze eight data sets, the largest of which contains 3,398 gene regions and 696,652 amino acid sites forming the largest phylogenomic analysis of spider relationships produced to date. Contrary to long held beliefs that the orb web is the crowning achievement of spider evolution, ancestral state reconstructions of web type support a phylogenetically ancient origin of the orb web, and diversification analyses show that the mostly ground-dwelling, web-less RTA clade diversified faster than orb weavers. Consistent with molecular dating estimates we report herein, this may reflect a major increase in biomass of non-flying insects during the Cretaceous Terrestrial Revolution 125-90 million years ago favoring diversification of spiders that feed on cursorial rather than flying prey. Our results also have major implications for our understanding of spider systematics. Phylogenomic analyses corroborate several well-accepted high level groupings: Opisthothele, Mygalomorphae, Atypoidina, Avicularoidea, Theraphosoidina, Araneomorphae, Entelegynae, Araneoidea, the RTA clade, Dionycha and the Lycosoidea. 
Alternatively, our results challenge the monophyly of Eresoidea, Orbiculariae, and Deinopoidea. The composition of the major paleocribellate and neocribellate clades, the basal divisions of Araneomorphae, appear to be falsified. Traditional Haplogynae is in need of revision, as our findings appear to support the newly conceived concept of Synspermiata. The sister pairing of filistatids with hypochilids implies that some peculiar features of each family may in fact be synapomorphic for the pair. Leptonetids now are seen as a possible sister group to the Entelegynae, illustrating possible intermediates in the evolution of the more complex entelegyne genitalic condition, spinning organs and respiratory organs.
Affiliation(s)
- Nicole L. Garrison: Department of Biological Sciences and Auburn University Museum of Natural History, Auburn University, Auburn, AL, United States
- Juanita Rodriguez: Department of Biological Sciences and Auburn University Museum of Natural History, Auburn University, Auburn, AL, United States
- Ingi Agnarsson: Department of Biology, University of Vermont, Burlington, VT, United States
- Jonathan A. Coddington: Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, United States
- Charles E. Griswold: Arachnology, California Academy of Sciences, San Francisco, CA, United States
- Christopher A. Hamilton: Department of Biological Sciences and Auburn University Museum of Natural History, Auburn University, Auburn, AL, United States
- Marshal Hedin: Department of Biology, San Diego State University, San Diego, CA, United States
- Kevin M. Kocot: Department of Biological Sciences and Alabama Museum of Natural History, University of Alabama-Tuscaloosa, Tuscaloosa, AL, United States
- Joel M. Ledford: Department of Plant Biology, University of California, Davis, Davis, CA, United States
- Jason E. Bond: Department of Biological Sciences and Auburn University Museum of Natural History, Auburn University, Auburn, AL, United States
|
122
|
Binet M, Gascuel O, Scornavacca C, Douzery EJP, Pardi F. Fast and accurate branch lengths estimation for phylogenomic trees. BMC Bioinformatics 2016; 17:23. [PMID: 26744021 PMCID: PMC4705742 DOI: 10.1186/s12859-015-0821-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/19/2015] [Accepted: 11/02/2015] [Indexed: 01/26/2023]
Abstract
Background: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet, part of the current methodology to reconstruct a phylogeny from genomic information (namely supertree methods) focuses on the topology or structure of the phylogenetic tree, rather than the evolutionary divergences associated with it. Moreover, accurate methods to estimate branch lengths (typically based on probabilistic analysis of a concatenated alignment) are limited by large demands in memory and computing time, and may become impractical when the data sets are too large. Results: Here, we present a novel phylogenomic distance-based method, named ERaBLE (Evolutionary Rates and Branch Length Estimation), to estimate the branch lengths of a given reference topology, and the relative evolutionary rates of the genes employed in the analysis. ERaBLE uses as input data a potentially very large collection of distance matrices, where each matrix is obtained from a different genomic region, either directly from its sequence alignment or indirectly from a gene tree inferred from the alignment. Our experiments show that ERaBLE is very fast and fairly accurate when compared to other possible approaches for the same tasks. Specifically, it efficiently and accurately deals with large data sets, such as the OrthoMaM v8 database, composed of 6,953 exons from up to 40 mammals. Conclusions: ERaBLE may be used as a complement to supertree methods, or it may provide an efficient alternative to maximum likelihood analysis of concatenated alignments, to estimate branch lengths from phylogenomic data sets.
Affiliation(s)
- Manuel Binet: Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France; Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Olivier Gascuel: Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
- Celine Scornavacca: Institut de Biologie Computationnelle, Montpellier, France; Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Emmanuel J P Douzery: Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Fabio Pardi: Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
|
123
|
Little J, Schmidt DJ, Cook BD, Page TJ, Hughes JM. Diversity and phylogeny of south-east Queensland Bathynellacea. Aust J Zool 2016. [DOI: 10.1071/zo16005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
Abstract
The crustacean order Bathynellacea is amongst the most diverse and widespread groups of subterranean aquatic fauna (stygofauna) in Australia. Interest in the diversity and biogeography of Australian Bathynellacea has grown markedly in recent years. However, relatively little information relating to this group has emerged from Queensland. The aim of this study was to investigate bathynellacean diversity and phylogeny in south-east Queensland. Relationships between the south-east Queensland fauna and their continental relatives were evaluated through the analysis of combined mitochondrial and nuclear DNA sequence data. Bathynellaceans were collected from alluvial groundwater systems in three catchments in south-east Queensland. This study revealed a diverse bathynellacean fauna with complex evolutionary relationships to related fauna elsewhere in Queensland, and on the wider Australian continent. The multifamily assemblage revealed here is likely to represent several new species, and at least one new genus within the Parabathynellidae. These taxa likely have relatively restricted geographic distributions. Interestingly, the south-east Queensland Bathynellacea appeared to be distantly related to their north-east Queensland counterparts. Although it was not possible to determine the generic identities of their closest relatives, the south-east Queensland Parabathynellidae appear to be most closely affiliated with southern and eastern Australian lineages. Together with previous survey data, the findings here suggest that there is likely to be considerable bathynellacean diversity in alluvial groundwater systems across the wider Queensland region. Further assessment of stygofauna distributions in south-east Queensland is necessary to understand the biological implications of significant groundwater use and development in the region.
|
124
|
Dufort MJ. An augmented supermatrix phylogeny of the avian family Picidae reveals uncertainty deep in the family tree. Mol Phylogenet Evol 2016; 94:313-26. [DOI: 10.1016/j.ympev.2015.08.025] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 02/05/2015] [Revised: 08/22/2015] [Accepted: 08/28/2015] [Indexed: 10/23/2022]
|
125
|
Zheng Y, Wiens JJ. Combining phylogenomic and supermatrix approaches, and a time-calibrated phylogeny for squamate reptiles (lizards and snakes) based on 52 genes and 4162 species. Mol Phylogenet Evol 2016; 94:537-547. [DOI: 10.1016/j.ympev.2015.10.009] [Citation(s) in RCA: 383] [Impact Index Per Article: 47.9] [Received: 05/26/2015] [Revised: 09/30/2015] [Accepted: 10/08/2015] [Indexed: 11/24/2022]
|
126
|
Hosner PA, Faircloth BC, Glenn TC, Braun EL, Kimball RT. Avoiding Missing Data Biases in Phylogenomic Inference: An Empirical Study in the Landfowl (Aves: Galliformes). Mol Biol Evol 2015; 33:1110-25. [PMID: 26715628 DOI: 10.1093/molbev/msv347] [Citation(s) in RCA: 124] [Impact Index Per Article: 13.8] [Indexed: 12/15/2022]
Abstract
Production of massive DNA sequence data sets is transforming phylogenetic inference, but best practices for analyzing such data sets are not well established. One uncertainty is robustness to missing data, particularly in coalescent frameworks. To understand the effects of increasing matrix size and loci at the cost of increasing missing data, we produced a 90 taxon, 2.2 megabase, 4,800 locus sequence matrix of landfowl using target capture of ultraconserved elements. We then compared phylogenies estimated with concatenated maximum likelihood, quartet-based methods executed on concatenated matrices and gene tree reconciliation methods, across five thresholds of missing data. Results of maximum likelihood and quartet analyses were similar, well resolved, and demonstrated increasing support with increasing matrix size and sparseness. Conversely, gene tree reconciliation produced unexpected relationships when we included all informative loci, with certain taxa placed toward the root compared with other approaches. Inspection of these taxa identified a prevalence of short average contigs, which potentially biased gene tree inference and caused erroneous results in gene tree reconciliation. This suggests that the more problematic missing data in gene tree-based analyses are partial sequences rather than entire missing sequences from locus alignments. Limiting gene tree reconciliation to the most informative loci solved this problem, producing well-supported topologies congruent with concatenation and quartet methods. Collectively, our analyses provide a well-resolved phylogeny of landfowl, including strong support for previously problematic relationships such as those among junglefowl (Gallus), and clarify the position of two enigmatic galliform genera (Lerwa, Melanoperdix) not sampled in previous molecular phylogenetic studies.
Affiliation(s)
- Brant C Faircloth: Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge
- Travis C Glenn: Department of Environmental Health Science, University of Georgia
|
127
|
Biogeography and divergent patterns of body size disparification in North American minnows. Mol Phylogenet Evol 2015. [DOI: 10.1016/j.ympev.2015.07.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/02/2023]
|
128
|
Cavalier-Smith T, Chao EE, Lewis R. Multiple origins of Heliozoa from flagellate ancestors: New cryptist subphylum Corbihelia, superclass Corbistoma, and monophyly of Haptista, Cryptista, Hacrobia and Chromista. Mol Phylogenet Evol 2015; 93:331-62. [DOI: 10.1016/j.ympev.2015.07.004] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Received: 04/07/2015] [Revised: 06/25/2015] [Accepted: 07/10/2015] [Indexed: 11/30/2022]
|
129
|
A RAD-based phylogenetics for Orestias fishes from Lake Titicaca. Mol Phylogenet Evol 2015; 93:307-17. [DOI: 10.1016/j.ympev.2015.08.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 04/24/2015] [Revised: 08/11/2015] [Accepted: 08/11/2015] [Indexed: 11/18/2022]
|
130
|
Borowiec ML, Lee EK, Chiu JC, Plachetzki DC. Extracting phylogenetic signal and accounting for bias in whole-genome data sets supports the Ctenophora as sister to remaining Metazoa. BMC Genomics 2015; 16:987. [PMID: 26596625 PMCID: PMC4657218 DOI: 10.1186/s12864-015-2146-4] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Received: 06/05/2015] [Accepted: 10/26/2015] [Indexed: 01/25/2023]
Abstract
BACKGROUND: Understanding the phylogenetic relationships among major lineages of multicellular animals (the Metazoa) is a prerequisite for studying the evolution of complex traits such as nervous systems, muscle tissue, or sensory organs. Transcriptome-based phylogenies have dramatically improved our understanding of metazoan relationships in recent years, although several important questions remain. The branching order near the base of the tree, in particular the placement of the poriferan (sponges, phylum Porifera) and ctenophore (comb jellies, phylum Ctenophora) lineages is one outstanding issue. Recent analyses have suggested that the comb jellies are sister to all remaining metazoan phyla including sponges. This finding is surprising because it suggests that neurons and other complex traits, present in ctenophores and eumetazoans but absent in sponges or placozoans, either evolved twice in Metazoa or were independently, secondarily lost in the lineages leading to sponges and placozoans. RESULTS: To address the question of basal metazoan relationships we assembled a novel dataset comprised of 1080 orthologous loci derived from 36 publicly available genomes representing major lineages of animals. From this large dataset we procured an optimized set of partitions with high phylogenetic signal for resolving metazoan relationships. This optimized data set is amenable to the most appropriate and computationally intensive analyses using site-heterogeneous models of sequence evolution. We also employed several strategies to examine the potential for long-branch attraction to bias our inferences. Our analyses strongly support the Ctenophora as the sister lineage to other Metazoa. We find no support for the traditional view uniting the ctenophores and Cnidaria. Our findings are supported by Bayesian comparisons of topological hypotheses and we find no evidence that they are biased by long-branch attraction.
CONCLUSIONS: Our study further clarifies relationships among early branching metazoan lineages. Our phylogeny supports the still-controversial position of ctenophores as sister group to all other metazoans. This study also provides a workflow and computational tools for minimizing systematic bias in genome-based phylogenetic analyses. Future studies of metazoan phylogeny will benefit from ongoing efforts to sequence the genomes of additional invertebrate taxa that will continue to inform our view of the relationships among the major lineages of animals.
Affiliation(s)
- Marek L Borowiec: Department of Entomology and Nematology, University of California, Davis, USA
- Ernest K Lee: Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, USA
- Joanna C Chiu: Department of Entomology and Nematology, University of California, Davis, USA
- David C Plachetzki: Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, USA
|
131
|
Xi Z, Liu L, Davis CC. The Impact of Missing Data on Species Tree Estimation. Mol Biol Evol 2015; 33:838-60. [DOI: 10.1093/molbev/msv266] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Indexed: 01/03/2023]
|
132
|
Murray GGR, Weinert LA, Rhule EL, Welch JJ. The Phylogeny of Rickettsia Using Different Evolutionary Signatures: How Tree-Like is Bacterial Evolution? Syst Biol 2015; 65:265-79. [PMID: 26559010 PMCID: PMC4748751 DOI: 10.1093/sysbio/syv084] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 05/11/2015] [Accepted: 11/04/2015] [Indexed: 11/14/2022]
Abstract
Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions using different portions of the genome. Here, we reconstruct the Rickettsia phylogeny using whole-genome data, including two new genomes from previously unsampled host groups. We find that a single topology, which is supported by multiple sources of phylogenetic signal, well describes the evolutionary history of the core genome. We do observe extensive incongruence between individual gene trees, but analyses of simulations over a single topology and interspersed partitions of sites show that this is more plausibly attributed to systematic error than to horizontal gene transfer. Some conflicting placements also result from phylogenetic analyses of accessory genome content (i.e., gene presence/absence), but we argue that these are also due to systematic error, stemming from convergent genome reduction, which cannot be accommodated by existing phylogenetic methods. Our results show that, even within a single genus, tests for gene exchange based on phylogenetic incongruence may be susceptible to false positives.
Affiliation(s)
- Gemma G R Murray: Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
- Lucy A Weinert: Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Emma L Rhule: Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
- John J Welch: Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
|
134
|
Phylogenetic analysis of the Australian rosella parrots (Platycercus) reveals discordance among molecules and plumage. Mol Phylogenet Evol 2015; 91:150-9. [DOI: 10.1016/j.ympev.2015.05.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 10/16/2014] [Revised: 05/12/2015] [Accepted: 05/13/2015] [Indexed: 12/25/2022]
|
135
|
De Novo Assembly and Characterization of Four Anthozoan (Phylum Cnidaria) Transcriptomes. G3: Genes, Genomes, Genetics 2015; 5:2441-52. [PMID: 26384772 PMCID: PMC4632063 DOI: 10.1534/g3.115.020164] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 01/15/2023]
Abstract
Many nonmodel species exemplify important biological questions but lack the sequence resources required to study the genes and genomic regions underlying traits of interest. Reef-building corals are famously sensitive to rising seawater temperatures, motivating ongoing research into their stress responses and long-term prospects in a changing climate. A comprehensive understanding of these processes will require extending beyond the sequenced coral genome (Acropora digitifera) to encompass diverse coral species and related anthozoans. Toward that end, we have assembled and annotated reference transcriptomes to develop catalogs of gene sequences for three scleractinian corals (Fungia scutaria, Montastraea cavernosa, Seriatopora hystrix) and a temperate anemone (Anthopleura elegantissima). High-throughput sequencing of cDNA libraries produced ~20-30 million reads per sample, and de novo assembly of these reads produced ~75,000-110,000 transcripts from each sample with size distributions (mean ~1.4 kb, N50 ~2 kb), comparable to the distribution of gene models from the coral genome (mean ~1.7 kb, N50 ~2.2 kb). Each assembly includes matches for more than half the gene models from A. digitifera (54-67%) and many reasonably complete transcripts (~5300-6700) spanning nearly the entire gene (ortholog hit ratios ≥0.75). The catalogs of gene sequences developed in this study made it possible to identify hundreds to thousands of orthologs across diverse scleractinian species and related taxa. We used these sequences for phylogenetic inference, recovering known relationships and demonstrating superior performance over phylogenetic trees constructed using single mitochondrial loci. The resources developed in this study provide gene sequences and genetic markers for several anthozoan species. To enhance the utility of these resources for the research community, we developed searchable databases enabling researchers to rapidly recover sequences for genes of interest. 
Our analysis of de novo assembly quality highlights metrics that we expect will be useful for evaluating the relative quality of other de novo transcriptome assemblies. The identification of orthologous sequences and phylogenetic reconstruction demonstrates the feasibility of these methods for clarifying the substantial uncertainties in the existing scleractinian phylogeny.
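The abstract above summarizes assembly size distributions by mean length and N50. As a minimal sketch of how N50 is conventionally computed (the function name `n50` and the example lengths are hypothetical and not taken from the study):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths in base pairs (illustration only).
transcript_lengths = [300, 500, 800, 1200, 2000, 2500]
mean_length = sum(transcript_lengths) / len(transcript_lengths)
print(f"mean = {mean_length:.1f} bp, N50 = {n50(transcript_lengths)} bp")
```

Because N50 weights long sequences more heavily than the mean does, reporting both, as the abstract does, gives a fuller picture of the length distribution.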
|
136
|
Streicher JW, Schulte JA, Wiens JJ. How Should Genes and Taxa be Sampled for Phylogenomic Analyses with Missing Data? An Empirical Study in Iguanian Lizards. Syst Biol 2015; 65:128-45. [PMID: 26330450 DOI: 10.1093/sysbio/syv058] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 11/14/2014] [Accepted: 08/04/2015] [Indexed: 11/12/2022]
Abstract
Targeted sequence capture is becoming a widespread tool for generating large phylogenomic data sets to address difficult phylogenetic problems. However, this methodology often generates data sets in which increasing the number of taxa and loci increases amounts of missing data. Thus, a fundamental (but still unresolved) question is whether sampling should be designed to maximize sampling of taxa or genes, or to minimize the inclusion of missing data cells. Here, we explore this question for an ancient, rapid radiation of lizards, the pleurodont iguanians. Pleurodonts include many well-known clades (e.g., anoles, basilisks, iguanas, and spiny lizards) but relationships among families have proven difficult to resolve strongly and consistently using traditional sequencing approaches. We generated up to 4921 ultraconserved elements with sampling strategies including 16, 29, and 44 taxa, from 1179 to approximately 2.4 million characters per matrix and approximately 30% to 60% total missing data. We then compared mean branch support for interfamilial relationships under these 15 different sampling strategies for both concatenated (maximum likelihood) and species tree (NJst) approaches (after showing that mean branch support appears to be related to accuracy). We found that both approaches had the highest support when including loci with up to 50% missing taxa (matrices with ~40-55% missing data overall). Thus, our results show that simply excluding all missing data may be highly problematic as the primary guiding principle for the inclusion or exclusion of taxa and genes. The optimal strategy was somewhat different for each approach, a pattern that has not been shown previously. For concatenated analyses, branch support was maximized when including many taxa (44) but fewer characters (1.1 million). For species-tree analyses, branch support was maximized with minimal taxon sampling (16) but many loci (4789 of 4921). 
We also show that the choice of these sampling strategies can be critically important for phylogenomic analyses, since some strategies lead to demonstrably incorrect inferences (using the same method) that have strong statistical support. Our preferred estimate provides strong support for most interfamilial relationships in this important but phylogenetically challenging group.
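The abstract above reports that support was highest when loci with up to 50% missing taxa were included. A minimal sketch of that kind of per-locus threshold filter follows; the data layout (a dict mapping locus name to a dict of taxon sequences), the function names, and the toy data are all assumptions made for illustration, not the authors' pipeline:

```python
def missing_taxon_fraction(locus_seqs, all_taxa):
    """Fraction of the sampled taxa that have no sequence for this locus."""
    present = sum(1 for taxon in all_taxa if taxon in locus_seqs)
    return 1 - present / len(all_taxa)


def filter_loci(loci, all_taxa, max_missing=0.5):
    """Keep loci whose fraction of missing taxa does not exceed max_missing."""
    return {
        name: seqs
        for name, seqs in loci.items()
        if missing_taxon_fraction(seqs, all_taxa) <= max_missing
    }


# Hypothetical toy data: four taxa, three loci with varying completeness.
taxa = ["A", "B", "C", "D"]
loci = {
    "uce1": {"A": "ACGT", "B": "ACGT", "C": "ACGA", "D": "ACGG"},
    "uce2": {"A": "TTGA", "B": "TTGC"},  # 50% of taxa missing
    "uce3": {"A": "GGCA"},               # 75% of taxa missing
}
kept = filter_loci(loci, taxa, max_missing=0.5)
print(sorted(kept))
```

Varying `max_missing` and rebuilding the matrix each time is one simple way to compare sampling strategies of the sort the study describes.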
Affiliation(s)
- Jeffrey W Streicher
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA; Department of Life Sciences, The Natural History Museum, London SW7 5BD, UK
- James A Schulte
- Department of Biology, Clarkson University, Potsdam, NY 13699, USA
- John J Wiens
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
|
137
|
Leaché AD, Linkem CW. Phylogenomics of Horned Lizards (Genus: Phrynosoma) Using Targeted Sequence Capture Data. COPEIA 2015. [DOI: 10.1643/ch-15-248] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]
|
138
|
Chen MY, Liang D, Zhang P. Selecting Question-Specific Genes to Reduce Incongruence in Phylogenomics: A Case Study of Jawed Vertebrate Backbone Phylogeny. Syst Biol 2015; 64:1104-20. [PMID: 26276158 DOI: 10.1093/sysbio/syv059] [Citation(s) in RCA: 89] [Impact Index Per Article: 9.9] [Received: 01/12/2015] [Accepted: 08/10/2015] [Indexed: 11/13/2022]
Abstract
Incongruence between different phylogenomic analyses is the main challenge faced by phylogeneticists in the genomic era. To reduce incongruence, phylogenomic studies normally adopt some data filtering approaches, such as reducing missing data or using slowly evolving genes, to improve the signal quality of data. Here, we assembled a phylogenomic data set of 58 jawed vertebrate taxa and 4682 genes to investigate the backbone phylogeny of jawed vertebrates under both concatenation and coalescent-based frameworks. To evaluate the efficiency of extracting phylogenetic signals among different data filtering methods, we chose six highly intractable internodes within the backbone phylogeny of jawed vertebrates as our test questions. We found that our phylogenomic data set exhibits substantial conflicting signal among genes for these questions. Our analyses showed that non-specific data sets that are generated without bias toward specific questions are not sufficient to produce consistent results when there are several difficult nodes within a phylogeny. Moreover, phylogenetic accuracy based on non-specific data is considerably influenced by the size of data and the choice of tree inference methods. To address such incongruences, we selected genes that resolve a given internode but not the entire phylogeny. Notably, not only can this strategy yield correct relationships for the question, but it also reduces inconsistency associated with data sizes and inference methods. Our study highlights the importance of gene selection in phylogenomic analyses, suggesting that simply using a large amount of data cannot guarantee correct results. Constructing question-specific data sets may be more powerful for resolving problematic nodes.
Affiliation(s)
- Meng-Yun Chen
- State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Dan Liang
- State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
- Peng Zhang
- State Key Laboratory of Biocontrol, College of Ecology and Evolution, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510006, China
|
139
|
McTavish EJ, Steel M, Holder MT. Twisted trees and inconsistency of tree estimation when gaps are treated as missing data - The impact of model mis-specification in distance corrections. Mol Phylogenet Evol 2015; 93:289-95. [PMID: 26256643 DOI: 10.1016/j.ympev.2015.07.027] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 04/28/2015] [Revised: 07/09/2015] [Accepted: 07/21/2015] [Indexed: 10/23/2022]
Abstract
Statistically consistent estimation of phylogenetic trees or gene trees is possible if pairwise sequence dissimilarities can be converted to a set of distances that are proportional to the true evolutionary distances. Susko et al. (2004) reported some strikingly broad results about the forms of inconsistency in tree estimation that can arise if corrected distances are not proportional to the true distances. They showed that if the corrected distance is a concave function of the true distance, then inconsistency due to long branch attraction will occur. If these functions are convex, then two "long branch repulsion" trees will be preferred over the true tree - though these two incorrect trees are expected to be tied as the preferred tree. Here we extend their results, and demonstrate the existence of a tree shape (which we refer to as a "twisted Farris-zone" tree) for which a single incorrect tree topology will be guaranteed to be preferred if the corrected distance function is convex. We also report that the standard practice of treating gaps in sequence alignments as missing data is sufficient to produce non-linear corrected distance functions if the substitution process is not independent of the insertion/deletion process. Taken together, these results imply inconsistent tree inference under mild conditions. For example, if some positions in a sequence are constrained to be free of substitutions and insertion/deletion events while the remaining sites evolve with independent substitutions and insertion/deletion events, then the distances obtained by treating gaps as missing data can support an incorrect tree topology even given an unlimited amount of data.
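To make concrete what "treating gaps as missing data" means when computing a corrected distance, here is a minimal sketch. It uses the Jukes-Cantor (JC69) correction as a stand-in, which is an assumption for illustration only; the abstract does not say which correction the authors analysed, and the function name and gap symbols are hypothetical:

```python
import math


def jc69_distance(seq_a, seq_b, missing_chars="-?N"):
    """Jukes-Cantor corrected distance, skipping any column that has a gap
    or missing character in either sequence (pairwise deletion)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing_chars or b in missing_chars:
            continue  # gap treated as missing data: column is ignored
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared  # observed proportion of differing sites
    if p >= 0.75:
        return math.inf  # correction is undefined at or beyond saturation
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned pair; the last two columns are skipped.
print(jc69_distance("ACGTACGT", "ACGAAC--"))
```

Only the ungapped columns contribute to `p`, so if gapped and ungapped sites evolve differently, the sites that are compared are not a random sample. That is the kind of situation the abstract warns can bias corrected distances.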
Affiliation(s)
- Emily Jane McTavish
- Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
- Mike Steel
- Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
- Mark T Holder
- Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
|
140
|
Blom MPK. EAPhy: A Flexible Tool for High-throughput Quality Filtering of Exon-alignments and Data Processing for Phylogenetic Methods. PLOS CURRENTS 2015; 7. [PMID: 26331095 PMCID: PMC4542277 DOI: 10.1371/currents.tol.75134257bd389c04bc1d26d42aa9089f] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/24/2023]
Abstract
Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.
Affiliation(s)
- Mozes P K Blom
- Department of Evolution, Ecology and Genetics, Australian National University, Acton, ACT, Australia
|
141
|
Tan G, Muffato M, Ledergerber C, Herrero J, Goldman N, Gil M, Dessimoz C. Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference. Syst Biol 2015; 64:778-91. [PMID: 26031838 PMCID: PMC4538881 DOI: 10.1093/sysbio/syv033] [Citation(s) in RCA: 143] [Impact Index Per Article: 15.9] [Received: 12/03/2014] [Accepted: 05/26/2015] [Indexed: 01/09/2023]
Abstract
Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms.
Affiliation(s)
- Ge Tan
- Department of Computer Science, ETH Zurich, Universitätstr. 6, 8092 Zurich, Switzerland, Department of Molecular Sciences, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK; MRC Clinical Sciences Centre, London W12 0NN, UK
- Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Christian Ledergerber
- Department of Computer Science, ETH Zurich, Universitätstr. 6, 8092 Zurich, Switzerland
- Javier Herrero
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK; University College London, Gower St, London WC1E 6BT, UK
- Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Manuel Gil
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstr. 190, 8057 Zurich, Switzerland; and Swiss Institute of Bioinformatics, Universitätstr. 6, 8092 Zurich, Switzerland
- Christophe Dessimoz
- University College London, Gower St, London WC1E 6BT, UK; and European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
|
142
|
Mans BJ, de Klerk D, Pienaar R, de Castro MH, Latif AA. Next-generation sequencing as means to retrieve tick systematic markers, with the focus on Nuttalliella namaqua (Ixodoidea: Nuttalliellidae). Ticks Tick Borne Dis 2015; 6:450-62. [DOI: 10.1016/j.ttbdis.2015.03.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/21/2014] [Revised: 03/06/2015] [Accepted: 03/08/2015] [Indexed: 10/23/2022]
|
143
|
Sanderson MJ, McMahon MM, Stamatakis A, Zwickl DJ, Steel M. Impacts of Terraces on Phylogenetic Inference. Syst Biol 2015; 64:709-26. [DOI: 10.1093/sysbio/syv024] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 09/26/2014] [Accepted: 04/15/2015] [Indexed: 11/14/2022]
|
144
|
Whelan NV, Kocot KM, Halanych KM. Employing Phylogenomics to Resolve the Relationships among Cnidarians, Ctenophores, Sponges, Placozoans, and Bilaterians. Integr Comp Biol 2015; 55:1084-95. [PMID: 25972566 DOI: 10.1093/icb/icv037] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022]
Abstract
Despite an explosion in the amount of sequence data, phylogenomics has failed to settle controversy regarding some critical nodes on the animal tree of life. Understanding relationships among Bilateria, Ctenophora, Cnidaria, Placozoa, and Porifera is essential for studying how complex traits such as neurons, muscles, and gastrulation have evolved. Recent studies have cast doubt on the historical viewpoint that sponges are sister to all other animal lineages with recent studies recovering ctenophores as sister. However, the ctenophore-sister hypothesis has been criticized as unrealistic and caused by systematic error. We review past phylogenomic studies and potential causes of systematic error in an effort to identify areas that can be improved in future studies. Increased sampling of taxa, less missing data, and a priori removal of sequences and taxa that may cause systematic error in phylogenomic inference will likely be the most fruitful areas of focus when assembling future datasets. Ultimately, we foresee metazoan relationships being resolved with higher support in the near future, and we caution against dismissing novel hypotheses merely because they conflict with historical viewpoints of animal evolution.
Affiliation(s)
- Nathan V Whelan
- Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
- Kevin M Kocot
- School of Biological Sciences, The University of Queensland, 325 Goddard Building, St Lucia, QLD 4101, Australia
- Kenneth M Halanych
- Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
|
145
|
Feuda R, Smith AB. Phylogenetic signal dissection identifies the root of starfishes. PLoS One 2015; 10:e0123331. [PMID: 25955729 PMCID: PMC4425436 DOI: 10.1371/journal.pone.0123331] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 12/04/2014] [Accepted: 02/20/2015] [Indexed: 11/19/2022]
Abstract
Relationships within the class Asteroidea have remained controversial for almost 100 years and, despite many attempts to resolve this problem using molecular data, no consensus has yet emerged. Using two nuclear genes and a taxon sampling covering the major asteroid clades we show that non-phylogenetic signal created by three factors (long branch attraction, compositional heterogeneity and the use of poorly fitting models of evolution) has confounded accurate estimation of phylogenetic relationships. To overcome the effect of this non-phylogenetic signal we analyse the data using non-homogeneous models, site stripping and the creation of subpartitions aimed to reduce or amplify the systematic error, and calculate Bayes Factor support for a selection of previously suggested topological arrangements of asteroid orders. We show that most of the previous alternative hypotheses are not supported in the most reliable data partitions, including the previously suggested placement of either Forcipulatida or Paxillosida as sister group to the other major branches. The best-supported solution places Velatida as the sister group to other asteroids, and the implications of this finding for the morphological evolution of asteroids are presented.
Affiliation(s)
- Roberto Feuda
- Division of Biology and Biological Engineering, California Institute of Technology Pasadena, California, United States of America
- Andrew B Smith
- Department of Earth Sciences, The Natural History Museum, London, United Kingdom
|
146
|
Sauquet H, Carrive L, Poullain N, Sannier J, Damerval C, Nadot S. Zygomorphy evolved from disymmetry in Fumarioideae (Papaveraceae, Ranunculales): new evidence from an expanded molecular phylogenetic framework. ANNALS OF BOTANY 2015; 115:895-914. [PMID: 25814061 PMCID: PMC4407061 DOI: 10.1093/aob/mcv020] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 12/02/2014] [Revised: 12/23/2014] [Accepted: 01/22/2015] [Indexed: 05/11/2023]
Abstract
BACKGROUND AND AIMS: Fumarioideae (20 genera, 593 species) is a clade of Papaveraceae (Ranunculales) characterized by flowers that are either disymmetric (i.e. two perpendicular planes of bilateral symmetry) or zygomorphic (i.e. one plane of bilateral symmetry). In contrast, the other subfamily of Papaveraceae, Papaveroideae (23 genera, 230 species), has actinomorphic flowers (i.e. more than two planes of symmetry). Understanding of the evolution of floral symmetry in this clade has so far been limited by the lack of a reliable phylogenetic framework. Pteridophyllum (one species) shares similarities with Fumarioideae but has actinomorphic flowers, and the relationships among Pteridophyllum, Papaveroideae and Fumarioideae have remained unclear. This study reassesses the evolution of floral symmetry in Papaveraceae based on new molecular phylogenetic analyses of the family.
METHODS: Maximum likelihood, Bayesian and maximum parsimony phylogenetic analyses of Papaveraceae were conducted using six plastid markers and one nuclear marker, sampling Pteridophyllum, 18 (90 %) genera and 73 species of Fumarioideae, 11 (48 %) genera and 11 species of Papaveroideae, and a wide selection of outgroup taxa. Floral characters recorded from the literature were then optimized onto phylogenetic trees to reconstruct ancestral states using parsimony, maximum likelihood and reversible-jump Bayesian approaches.
KEY RESULTS: Pteridophyllum is not nested in Fumarioideae. Fumarioideae are monophyletic and Hypecoum (18 species) is the sister group of the remaining genera. Relationships within the core Fumarioideae are well resolved and supported. Dactylicapnos and all zygomorphic genera form a well-supported clade nested among disymmetric taxa.
CONCLUSIONS: Disymmetry of the corolla is a synapomorphy of Fumarioideae and is strongly correlated with changes in the androecium and differentiation of middle and inner tepal shape (basal spurs on middle tepals). Zygomorphy subsequently evolved from disymmetry either once (with a reversal in Dactylicapnos) or twice (Capnoides, other zygomorphic Fumarioideae) and appears to be correlated with the loss of one nectar spur.
Affiliation(s)
- Hervé Sauquet
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
- Laetitia Carrive
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
- Noëlie Poullain
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
- Julie Sannier
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
- Catherine Damerval
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
- Sophie Nadot
- Université Paris-Sud, Laboratoire Écologie, Systématique, Évolution, CNRS UMR 8079, 91405 Orsay, France and CNRS, UMR 0320/UMR 8120 Génétique Quantitative et Evolution - Le Moulon, INRA/Université Paris-Sud/CNRS/AgroParisTech, Ferme du Moulon, 91190 Gif-sur-Yvette, France
|
147
|
Kozak KM, Wahlberg N, Neild AFE, Dasmahapatra KK, Mallet J, Jiggins CD. Multilocus species trees show the recent adaptive radiation of the mimetic Heliconius butterflies. Syst Biol 2015; 64:505-24. [PMID: 25634098 PMCID: PMC4395847 DOI: 10.1093/sysbio/syv007] [Citation(s) in RCA: 131] [Impact Index Per Article: 14.6] [Received: 04/08/2014] [Accepted: 01/23/2015] [Indexed: 11/25/2022]
Abstract
Müllerian mimicry among Neotropical Heliconiini butterflies is an excellent example of natural selection, associated with the diversification of a large continental-scale radiation. Some of the processes driving the evolution of mimicry rings are likely to generate incongruent phylogenetic signals across the assemblage, and thus pose a challenge for systematics. We use a data set of 22 mitochondrial and nuclear markers from 92% of species in the tribe, obtained by Sanger sequencing and de novo assembly of short read data, to re-examine the phylogeny of Heliconiini with both supermatrix and multispecies coalescent approaches, characterize the patterns of conflicting signal, and compare the performance of various methodological approaches to reflect the heterogeneity across the data. Despite the large extent of reticulate signal and strong conflict between markers, nearly identical topologies are consistently recovered by most of the analyses, although the supermatrix approach failed to reflect the underlying variation in the history of individual loci. However, the supermatrix represents a useful approximation where multiple rare species represented by short sequences can be incorporated easily. The first comprehensive, time-calibrated phylogeny of this group is used to test the hypotheses of a diversification rate increase driven by the dramatic environmental changes in the Neotropics over the past 23 myr, or changes caused by diversity-dependent effects on the rate of diversification. We find that the rate of diversification has increased on the branch leading to the presently most species-rich genus Heliconius, but the change occurred gradually and cannot be unequivocally attributed to a specific environmental driver. Our study provides a comprehensive comparison of philosophically distinct species tree reconstruction methods and provides insights into the diversification of an important insect radiation in the most biodiverse region of the planet.
Affiliation(s)
- Krzysztof M Kozak
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Niklas Wahlberg
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Andrew F E Neild
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Kanchon K Dasmahapatra
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- James Mallet
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Chris D Jiggins
- Butterfly Genetics Group, Department of Zoology, University of Cambridge, CB2 3EJ Cambridge, UK; Laboratory of Genetics, Department of Biology, University of Turku, 20014 Turku, Finland; Department of Entomology, The Natural History Museum, London SW7 5BD, UK; Department of Biology, University of York, YO10 5DD Heslington, York, UK; and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
|
148
|
Sigwart JD, Lindberg DR. Consensus and confusion in molluscan trees: evaluating morphological and molecular phylogenies. Syst Biol 2015; 64:384-95. [PMID: 25472575 PMCID: PMC4395843 DOI: 10.1093/sysbio/syu105] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 07/16/2013] [Accepted: 11/21/2014] [Indexed: 11/18/2022]
Abstract
Mollusks are the most morphologically disparate living animal phylum, they have diversified into all habitats, and have a deep fossil record. Monophyly and identity of their eight living classes is undisputed, but relationships between these groups and patterns of their early radiation have remained elusive. Arguments about traditional morphological phylogeny focus on a small number of topological concepts but often without regard to proximity of the individual classes. In contrast, molecular studies have proposed a number of radically different, inherently contradictory, and controversial sister relationships. Here, we assembled a data set of 42 unique published trees describing molluscan interrelationships. We used these data to ask several questions about the state of resolution of molluscan phylogeny compared with a null model of the variation possible in random trees constructed from a monophyletic assemblage of eight terminals. Although 27 different unique trees have been proposed from morphological inference, the majority of these are not statistically different from each other. Within the available molecular topologies, only four studies to date have included the deep sea class Monoplacophora; but 36.4% of all trees are not significantly different. We also present supertrees derived from two data partitions and three methods, including all available molecular molluscan phylogenies, which will form the basis for future hypothesis testing. The supertrees presented here were not constructed to provide yet another hypothesis of molluscan relationships, but rather to algorithmically evaluate the relationships present in the disparate published topologies. Based on the totality of available evidence, certain patterns of relatedness among constituent taxa become clear. The internodal distance is consistently short between a few taxon pairs, particularly supporting the relatedness of Monoplacophora and the chitons, Polyplacophora. Other taxon pairs are rarely or never found in close proximity, such as the vermiform Caudofoveata and Bivalvia. Our results have specific utility for guiding constructive research planning to better test relationships in Mollusca as well as other problematic groups. Taxa with consistently proximate relationships should be the focus of a combined approach in a concerted assessment of potential genetic and anatomical homology, whereas unequivocally distant taxa will make the most constructive choices for exemplar selection in higher level phylogenomic analyses.
Affiliation(s)
- Julia D Sigwart
- Marine Laboratory, Queen's University Belfast, BT22 1PF, Northern Ireland, UK; and Department of Integrative Biology, Museum of Paleontology and Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- David R Lindberg
- Marine Laboratory, Queen's University Belfast, BT22 1PF, Northern Ireland, UK; and Department of Integrative Biology, Museum of Paleontology and Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
|
149
|
Dos Remedios N, Lee PLM, Burke T, Székely T, Küpper C. North or south? Phylogenetic and biogeographic origins of a globally distributed avian clade. Mol Phylogenet Evol 2015; 89:151-9. [PMID: 25916188 DOI: 10.1016/j.ympev.2015.04.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 08/05/2014] [Revised: 04/10/2015] [Accepted: 04/14/2015] [Indexed: 11/16/2022]
Abstract
Establishing phylogenetic relationships within a clade can help to infer ancestral origins and indicate how widespread species reached their current biogeographic distributions. The small plovers, genus Charadrius, are cosmopolitan shorebirds, distributed across all continents except Antarctica. Here we present a global, species-level molecular phylogeny of this group based on four nuclear (ADH5, FIB7, MYO2 and RAG1) and two mitochondrial (COI and ND3) genes, and use the phylogeny to examine the biogeographic origin of the genus. A Bayesian multispecies coalescent approach identified two major clades (CRD I and CRD II) within the genus. Clade CRD I contains three species (Thinornis novaeseelandiae, Thinornis rubricollis and Eudromias morinellus), and CRD II one species (Anarhynchus frontalis), that were previously placed outside the Charadrius genus. In contrast to earlier work, ancestral area analyses using parsimony and Bayesian methods supported an origin of the Charadrius plovers in the Northern hemisphere. We propose that major radiations in this group were associated with shifts in the range of these ancestral plover species, leading to colonisation of the Southern hemisphere.
Affiliation(s)
- Natalie Dos Remedios
- Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, UK; NERC-Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
- Patricia L M Lee
- Centre for Integrative Ecology, School of Life and Environmental Sciences, Deakin University, Warrnambool, Victoria 3280, Australia; Department of Biosciences, College of Science, Swansea University, Singleton Park, Swansea SA2 8PP, Wales, UK
- Terry Burke
- NERC-Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
- Tamás Székely
- Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, UK
- Clemens Küpper
- NERC-Biomolecular Analysis Facility, Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
|
150
|
Molecular investigation of the phylogenetic position of the polar nudibranch Doridoxa (Mollusca, Gastropoda, Heterobranchia). Polar Biol 2015. [DOI: 10.1007/s00300-015-1700-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
|