1
Baños H, Susko E, Roger AJ. Is Over-parameterization a Problem for Profile Mixture Models? Syst Biol 2024; 73:53-75. [PMID: 37843172 PMCID: PMC11129589 DOI: 10.1093/sysbio/syad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 09/12/2023] [Accepted: 10/13/2023] [Indexed: 10/17/2023] Open
Abstract
Biochemical constraints on the admissible amino acids at specific sites in proteins lead to heterogeneity of the amino acid substitution process over sites in alignments. It is well known that phylogenetic models of protein sequence evolution that do not account for site heterogeneity are prone to long-branch attraction (LBA) artifacts. Profile mixture models were developed to model heterogeneity of preferred amino acids at sites via a finite distribution of site classes each with a distinct set of equilibrium amino acid frequencies. However, it is unknown whether the large number of parameters in such models associated with the many amino acid frequency vectors can adversely affect tree topology estimates because of over-parameterization. Here, we demonstrate theoretically that for long sequences, over-parameterization does not create problems for estimation with profile mixture models. Under mild conditions, tree, amino acid frequencies, and other model parameters converge to true values as sequence length increases, even when there are large numbers of components in the frequency profile distributions. Because large sample theory does not necessarily imply good behavior for shorter alignments, we explore the performance of these models with short alignments simulated with tree topologies that are prone to LBA artifacts. We find that over-parameterization is not a problem for complex profile mixture models even when there are many amino acid frequency vectors. In fact, simple models with few site classes behave poorly. Interestingly, we also found that misspecification of the amino acid frequency vectors does not lead to increased LBA artifacts as long as the estimated cumulative distribution function of the amino acid frequencies at sites adequately approximates the true one. In contrast, misspecification of the amino acid exchangeability rates can severely impair parameter estimation. Finally, we explore the effects of including in the profile mixture model an additional "F-class" representing the overall frequencies of amino acids in the data set. Surprisingly, the F-class does not help parameter estimation significantly and can decrease the probability of correct tree estimation, depending on the scenario, even though it tends to improve likelihood scores.
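For readers unfamiliar with the model class named in this abstract, the site likelihood of a profile mixture model can be sketched as follows. The notation is an assumption made for illustration and is not taken from the paper: $D_i$ is the alignment column at site $i$, $T$ the tree with branch lengths, $R$ the shared exchangeability matrix, $\pi^{(k)}$ the equilibrium frequency vector of site class $k$, and $w_k$ its weight.

```latex
L_i(T, R, w, \pi) \;=\; \sum_{k=1}^{K} w_k \, P\!\left(D_i \mid T, R, \pi^{(k)}\right),
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1,
\qquad
\ell(T, R, w, \pi) \;=\; \sum_{i=1}^{n} \log L_i(T, R, w, \pi).
```

Under this reading, the parameter count grows with the number of classes $K$ (each $\pi^{(k)}$ contributes 19 free frequencies), which is the over-parameterization concern the abstract addresses.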
Affiliation(s)
- Hector Baños
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
  - Institute for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
  - Institute for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
- Andrew J Roger
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
  - Institute for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada

2
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig marítim de la Barcelona 37-49, 08003 Barcelona, Spain

3
Out of chaos: Phylogenomics of Asian Sonerileae. Mol Phylogenet Evol 2022; 175:107581. [PMID: 35810973 DOI: 10.1016/j.ympev.2022.107581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Revised: 05/23/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
Sonerileae is a diverse Melastomataceae lineage comprising ca. 1000 species in 44 genera, with >70% of genera and species distributed in Asia. Asian Sonerileae are taxonomically intractable with obscure generic circumscriptions. The backbone phylogeny of this group remains poorly resolved, possibly due to complexity caused by rapid species radiation in early and middle Miocene, which hampers further systematic study. Here, we used genome resequencing data to reconstruct the phylogeny of Asian Sonerileae. Three parallel datasets, viz. single-copy ortholog (SCO), genomic SNPs, and whole plastome, were assembled from genome resequencing data of 205 species for this purpose. Based on these genome-scale data, we provided the first well resolved phylogeny of Asian Sonerileae, with 34 major clades identified and 74% of the interclade relationships consistently resolved by both SCO and genomic data. Meanwhile, widespread phylogenetic discordance was detected among SCO gene trees as well as species trees reconstructed using different tree estimation methods (concatenation/site-based coalescent method/summary method) or different datasets (SCO/genomic/plastome). We explored sources of discordance using multiple approaches and found that the observed discordance in Asian Sonerileae was mainly caused by a combination of biased distribution of missing data, random noise from uninformative genes, incomplete lineage sorting, and hybridization/introgression. Exploration of these sources can enable us to generate hypotheses for future testing, which is the first step towards understanding the evolution of Asian Sonerileae. We also detected high levels of homoplasy for some characters traditionally used in taxonomy, which explains current chaotic generic delimitations. The backbone phylogeny of Asian Sonerileae revealed in this study offers a solid basis for future taxonomic revision at the generic level.
4
Martinez-Gutierrez CA, Aylward FO. Phylogenetic Signal, Congruence, and Uncertainty across Bacteria and Archaea. Mol Biol Evol 2021; 38:5514-5527. [PMID: 34436605 PMCID: PMC8662615 DOI: 10.1093/molbev/msab254] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 12/18/2022] Open
Abstract
Reconstruction of the Tree of Life is a central goal in biology. Although numerous novel phyla of bacteria and archaea have recently been discovered, inconsistent phylogenetic relationships are routinely reported, and many inter-phylum and inter-domain evolutionary relationships remain unclear. Here, we benchmark different marker genes often used in constructing multidomain phylogenetic trees of bacteria and archaea and present a set of marker genes that perform best for multidomain trees constructed from concatenated alignments. We use recently-developed Tree Certainty metrics to assess the confidence of our results and to obviate the complications of traditional bootstrap-based metrics. Given the vastly disparate number of genomes available for different phyla of bacteria and archaea, we also assessed the impact of taxon sampling on multidomain tree construction. Our results demonstrate that biases between the representation of different taxonomic groups can dramatically impact the topology of resulting trees. Inspection of our highest-quality tree supports the division of most bacteria into Terrabacteria and Gracilicutes, with Thermatogota and Synergistota branching earlier from these superphyla. This tree also supports the inclusion of the Patescibacteria within the Terrabacteria as a sister group to the Chloroflexota instead of as a basal-branching lineage. For the Archaea, our tree supports three monophyletic lineages (DPANN, Euryarchaeota, and TACK/Asgard), although we note the basal placement of the DPANN may still represent an artifact caused by biased sequence composition. Our findings provide a robust and standardized framework for multidomain phylogenetic reconstruction that can be used to evaluate inter-phylum relationships and assess uncertainty in conflicting topologies of the Tree of Life.
Affiliation(s)
- Frank O Aylward
  - Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
  - Center for Emerging, Zoonotic, and Arthropod-borne Pathogens, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

5
Schrempf D, Lartillot N, Szöllősi G. Scalable Empirical Mixture Models That Account for Across-Site Compositional Heterogeneity. Mol Biol Evol 2021; 37:3616-3631. [PMID: 32877529 PMCID: PMC7743758 DOI: 10.1093/molbev/msaa145] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/19/2022] Open
Abstract
Biochemical demands constrain the range of amino acids acceptable at specific sites resulting in across-site compositional heterogeneity of the amino acid replacement process. Phylogenetic models that disregard this heterogeneity are prone to systematic errors, which can lead to severe long-branch attraction artifacts. State-of-the-art models accounting for across-site compositional heterogeneity include the CAT model, which is computationally expensive, and empirical distribution mixture models estimated via maximum likelihood (C10–C60 models). Here, we present a new, scalable method EDCluster for finding empirical distribution mixture models involving a simple cluster analysis. The cluster analysis utilizes specific coordinate transformations which allow the detection of specialized amino acid distributions either from curated databases or from the alignment at hand. We apply EDCluster to the HOGENOM and HSSP databases in order to provide universal distribution mixture (UDM) models comprising up to 4,096 components. Detailed analyses of the UDM models demonstrate the removal of various long-branch attraction artifacts and improved performance compared with the C10–C60 models. Ready-to-use implementations of the UDM models are provided for three established software packages (IQ-TREE, Phylobayes, and RevBayes).
Affiliation(s)
- Dominik Schrempf
  - Department of Biological Physics, Eötvös University, Budapest, Hungary
- Nicolas Lartillot
  - Laboratoire de Biométrie et Biologie Evolutive UMR 5558, CNRS, Université de Lyon, Villeurbanne, France
- Gergely Szöllősi
  - Department of Biological Physics, Eötvös University, Budapest, Hungary
  - ELTE-MTA "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary
  - Evolutionary Systems Research Group, Centre for Ecological Research, Hungarian Academy of Sciences, Tihany, Hungary

6
Bossert S, Murray EA, Pauly A, Chernyshov K, Brady SG, Danforth BN. Gene Tree Estimation Error with Ultraconserved Elements: An Empirical Study on Pseudapis Bees. Syst Biol 2020; 70:803-821. [PMID: 33367855 DOI: 10.1093/sysbio/syaa097] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/05/2019] [Revised: 11/18/2020] [Accepted: 12/02/2020] [Indexed: 11/12/2022] Open
Abstract
Summarizing individual gene trees to species phylogenies using two-step coalescent methods is now a standard strategy in the field of phylogenomics. However, practical implementations of summary methods suffer from gene tree estimation error, which is caused by various biological and analytical factors. Greatly understudied is the choice of gene tree inference method and downstream effects on species tree estimation for empirical data sets. To better understand the impact of this method choice on gene and species tree accuracy, we compare gene trees estimated through four widely used programs under different model-selection criteria: PhyloBayes, MrBayes, IQ-Tree, and RAxML. We study their performance in the phylogenomic framework of >800 ultraconserved elements from the bee subfamily Nomiinae (Halictidae). Our taxon sampling focuses on the genus Pseudapis, a distinct lineage with diverse morphological features, but contentious morphology-based taxonomic classifications and no molecular phylogenetic guidance. We approximate topological accuracy of gene trees by assessing their ability to recover two uncontroversial, monophyletic groups, and compare branch lengths of individual trees using the stemminess metric (the relative length of internal branches). We further examine different strategies of removing uninformative loci and the collapsing of weakly supported nodes into polytomies. We then summarize gene trees with ASTRAL and compare resulting species phylogenies, including comparisons to concatenation-based estimates. Gene trees obtained with the reversible jump model search in MrBayes were most concordant on average and all Bayesian methods yielded gene trees with better stemminess values. The only gene tree estimation approach whose ASTRAL summary trees consistently produced the most likely correct topology, however, was IQ-Tree with automated model designation (ModelFinder program). We discuss these findings and provide practical advice on gene tree estimation for summary methods. Lastly, we establish the first phylogeny-informed classification for Pseudapis s. l. and map the distribution of distinct morphological features of the group. [ASTRAL; Bees; concordance; gene tree estimation error; IQ-Tree; MrBayes; Nomiinae; PhyloBayes; RAxML; phylogenomics; stemminess].
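The stemminess metric mentioned in this abstract is glossed only as "the relative length of internal branches". A minimal sketch of one simple reading of that gloss follows, assuming stemminess is the summed internal branch length divided by total tree length. The function name and the list-based inputs are hypothetical, and the paper may use a different variant of the metric.

```python
from typing import Sequence


def stemminess(internal_lengths: Sequence[float],
               terminal_lengths: Sequence[float]) -> float:
    """Fraction of total tree length carried by internal branches.

    Assumption: this is one simple interpretation of "relative length of
    internal branches"; it is not taken from the cited study.
    """
    internal = sum(internal_lengths)
    total = internal + sum(terminal_lengths)
    if total <= 0:
        raise ValueError("total tree length must be positive")
    return internal / total


# Hypothetical gene tree: two internal branches, four terminal branches.
value = stemminess([0.1, 0.2], [0.3, 0.4, 0.5, 0.5])
print(f"stemminess = {value:.3f}")
```

Under this definition, values near 0 indicate a star-like tree dominated by terminal branches, and larger values indicate relatively longer internal branches.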
Affiliation(s)
- Silas Bossert
  - Department of Entomology, Cornell University, Comstock Hall, Ithaca, NY 14853, USA
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
  - Department of Entomology, Washington State University, Pullman, Washington 99164, USA
- Elizabeth A Murray
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
  - Department of Entomology, Washington State University, Pullman, Washington 99164, USA
- Alain Pauly
  - O.D. Taxonomy and Phylogeny, Royal Belgian Institute of Natural Sciences, Rue Vautier 29, 1000 Brussels, Belgium
- Kyrylo Chernyshov
  - College of Arts and Sciences, Cornell University, Ithaca, NY 14853, USA
- Seán G Brady
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
- Bryan N Danforth
  - Department of Entomology, Cornell University, Comstock Hall, Ithaca, NY 14853, USA

7
Yazaki E, Kume K, Shiratori T, Eglit Y, Tanifuji G, Harada R, Simpson AGB, Ishida KI, Hashimoto T, Inagaki Y. Barthelonids represent a deep-branching metamonad clade with mitochondrion-related organelles predicted to generate no ATP. Proc Biol Sci 2020; 287:20201538. [PMID: 32873198 PMCID: PMC7542792 DOI: 10.1098/rspb.2020.1538] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022] Open
Abstract
We here report the phylogenetic position of barthelonids, small anaerobic flagellates previously examined using light microscopy alone. Barthelona spp. were isolated from geographically distinct regions and we established five laboratory strains. Transcriptomic data generated from one Barthelona strain (PAP020) were used for large-scale, multi-gene phylogenetic (phylogenomic) analyses. Our analyses robustly placed strain PAP020 at the base of the Fornicata clade, indicating that barthelonids represent a deep-branching metamonad clade. Considering the anaerobic/microaerophilic nature of barthelonids and preliminary electron microscopy observations on strain PAP020, we suspected that barthelonids possess functionally and structurally reduced mitochondria (i.e. mitochondrion-related organelles or MROs). The metabolic pathways localized in the MRO of strain PAP020 were predicted based on its transcriptomic data and compared with those in the MROs of fornicates. We here propose that strain PAP020 is incapable of generating ATP in the MRO, as no mitochondrial/MRO enzymes involved in substrate-level phosphorylation were detected. Instead, we detected a putative cytosolic ATP-generating enzyme (acetyl-CoA synthetase), suggesting that strain PAP020 depends on ATP generated in the cytosol. We propose two separate losses of substrate-level phosphorylation from the MRO in the clade containing barthelonids and (other) fornicates.
Affiliation(s)
- Euki Yazaki
  - Interdisciplinary Theoretical and Mathematical Sciences (iTHEMS), RIKEN, Wako, Saitama, Japan
- Keitaro Kume
  - Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Takashi Shiratori
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Yana Eglit
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
- Goro Tanifuji
  - Department of Zoology, National Museum of Nature and Science, Ibaraki, Japan
- Ryo Harada
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Alastair G B Simpson
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada
- Ken-Ichiro Ishida
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Tetsuo Hashimoto
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Yuji Inagaki
  - Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan

8
Wong GKS, Soltis DE, Leebens-Mack J, Wickett NJ, Barker MS, Van de Peer Y, Graham SW, Melkonian M. Sequencing and Analyzing the Transcriptomes of a Thousand Species Across the Tree of Life for Green Plants. Annual Review of Plant Biology 2020; 71:741-765. [PMID: 31851546 DOI: 10.1146/annurev-arplant-042916-041040] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 05/23/2023]
Abstract
The 1,000 Plants (1KP) initiative was the first large-scale effort to collect next-generation sequencing (NGS) data across a phylogenetically representative sampling of species for a major clade of life, in this case the Viridiplantae, or green plants. As an international multidisciplinary consortium, we focused on plant evolution and its practical implications. Among the major outcomes were the inference of a reference species tree for green plants by phylotranscriptomic analysis of low-copy genes, a survey of paleopolyploidy (whole-genome duplications) across the Viridiplantae, the inferred evolutionary histories for many gene families and biological processes, the discovery of novel light-sensitive proteins for optogenetic studies in mammalian neuroscience, and elucidation of the genetic network for a complex trait (C4 photosynthesis). Altogether, 1KP demonstrated how value can be extracted from a phylodiverse sequencing data set, providing a template for future projects that aim to generate even more data, including complete de novo genomes, across the tree of life.
Affiliation(s)
- Gane Ka-Shu Wong
  - Department of Biological Sciences and Department of Medicine, University of Alberta, Edmonton, Alberta T6G 2E9, Canada
  - BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Douglas E Soltis
  - Florida Museum of Natural History, Gainesville, Florida 32611, USA
  - Department of Biology, University of Florida, Gainesville, Florida 32611, USA
- Jim Leebens-Mack
  - Department of Plant Biology, University of Georgia, Athens, Georgia 30602, USA
- Norman J Wickett
  - Negaunee Institute for Plant Conservation Science and Action, Chicago Botanic Garden, Glencoe, Illinois 60022, USA
- Michael S Barker
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, VIB Center for Plant Systems Biology, Ghent University, 9052 Ghent, Belgium
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0028, South Africa
- Sean W Graham
  - Department of Botany, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Michael Melkonian
  - Faculty of Biology, University of Duisburg-Essen, D-45141 Essen, Germany

9
Sousa F, Civáň P, Brazão J, Foster PG, Cox CJ. The mitochondrial phylogeny of land plants shows support for Setaphyta under composition-heterogeneous substitution models. PeerJ 2020; 8:e8995. [PMID: 32377448 PMCID: PMC7194085 DOI: 10.7717/peerj.8995] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/05/2019] [Accepted: 03/26/2020] [Indexed: 01/04/2023] Open
Abstract
Congruence among analyses of plant genomic data partitions (nuclear, chloroplast and mitochondrial) is a strong indicator of accuracy in plant molecular phylogenetics. Recent analyses of both nuclear and chloroplast genome data of land plants (embryophytes) have, controversially, been shown to support monophyly of both bryophytes (mosses, liverworts, and hornworts) and tracheophytes (lycopods, ferns, and seed plants), with mosses and liverworts forming the clade Setaphyta. However, relationships inferred from mitochondria are incongruent with these results, and typically indicate paraphyly of bryophytes with liverworts alone resolved as the earliest-branching land plant group. Here, we reconstruct the mitochondrial land plant phylogeny from a newly compiled data set. When among-lineage composition heterogeneity is accounted for in analyses of codon-degenerate nucleotide and amino acid data, the clade Setaphyta is recovered with high support, and hornworts are supported as the earliest-branching lineage of land plants. These new mitochondrial analyses demonstrate partial congruence with current hypotheses based on nuclear and chloroplast genome data, and provide further incentive for revision of how plants arose on land.
Affiliation(s)
- Filipe Sousa
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
- Peter Civáň
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
  - INRAE-Université Clermont-Auvergne, Clermont-Ferrand, France
- João Brazão
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal
- Peter G. Foster
  - Department of Life Sciences, Natural History Museum, London, United Kingdom
- Cymon J. Cox
  - Centro de Ciências do Mar, Universidade do Algarve, Faro, Portugal

10
Gribaldo S, Brochier-Armanet C. Evolutionary relationships between Archaea and eukaryotes. Nat Ecol Evol 2019; 4:20-21. [DOI: 10.1038/s41559-019-1073-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/17/2022]

11

12
Flandrois JP, Brochier-Armanet C, Briolay J, Abrouk D, Schwob G, Normand P, Fernandez MP. Taxonomic assignment of uncultured prokaryotes with long range PCR targeting the spectinomycin operon. Res Microbiol 2019; 170:280-287. [PMID: 31279085 DOI: 10.1016/j.resmic.2019.06.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/29/2018] [Revised: 05/02/2019] [Accepted: 06/25/2019] [Indexed: 11/28/2022]
Abstract
The taxonomic assignment of uncultured prokaryotes to known taxa is a major challenge in microbial systematics. This usually relies on the phylogenetic analysis of the ribosomal small subunit RNA or a few housekeeping genes. Recent works have disclosed ribosomal proteins as valuable markers for systematics and, due to the boom in complete genome sequencing, their use has become widespread. Yet, in the case of uncultured strains, for which complete genome sequences cannot be easily obtained, sequencing many markers is complicated and time consuming. Taking advantage of the organization of ribosomal protein coding genes in large gene clusters, we amplified a 32 kb conserved region encompassing the spectinomycin (spc) operon using long range PCR from isolated and from uncultured nodular endophytic Frankia strains. The phylogenetic analysis of the 27 ribosomal protein genes contained in this region provided a robust phylogenetic tree consistent with phylogenies based on a larger set of markers, indicating that this subset of ribosomal proteins contains enough phylogenetic signal to address systematic issues. This work shows that using long range PCR could break down the barrier preventing the use of ribosomal proteins as phylogenetic markers when complete genome sequences cannot be easily obtained.
Affiliation(s)
- Jean-Pierre Flandrois
  - Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Évolutive, F-69622, Villeurbanne, France
- Céline Brochier-Armanet
  - Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Évolutive, F-69622, Villeurbanne, France
- Jérôme Briolay
  - Université de Lyon, Université Lyon 1, DTAMB, Villeurbanne, France
- Danis Abrouk
  - Université de Lyon, Université Lyon 1, CNRS, UMR5557, INRA, UMR1418, Laboratoire d'Écologie Microbienne, Villeurbanne, France
- Guillaume Schwob
  - Université de Lyon, Université Lyon 1, CNRS, UMR5557, INRA, UMR1418, Laboratoire d'Écologie Microbienne, Villeurbanne, France
- Philippe Normand
  - Université de Lyon, Université Lyon 1, CNRS, UMR5557, INRA, UMR1418, Laboratoire d'Écologie Microbienne, Villeurbanne, France
- Maria P Fernandez
  - Université de Lyon, Université Lyon 1, CNRS, UMR5557, INRA, UMR1418, Laboratoire d'Écologie Microbienne, Villeurbanne, France

13
Song N, Zhang H, Zhao T. Insights into the phylogeny of Hemiptera from increased mitogenomic taxon sampling. Mol Phylogenet Evol 2019; 137:236-249. [PMID: 31121308 DOI: 10.1016/j.ympev.2019.05.009] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 12/18/2018] [Revised: 05/15/2019] [Accepted: 05/16/2019] [Indexed: 10/26/2022]
Abstract
Although reconstruction of the phylogeny of Hemiptera has progressed tremendously over the past two decades, some higher-level relationships remain poorly resolved. Here, we investigated the Hemiptera higher-level relationships using full mitochondrial genome data from 357 ingroup species, representing the most comprehensive sampling yet undertaken for reconstructing the phylogeny of this group. In this study, 92 mitochondrial genomes were newly determined. Various data treatment methods and substitution models were applied to tree reconstructions. Effects of compositional heterogeneity, rate heterogeneity, model adequacy and taxon sampling on support values and topological stability were explored. Phylogenetic analyses (1) confirmed the monophyly of Hemiptera under a site-heterogeneous model, (2) placed Sternorrhyncha as sister to all other Hemiptera, (3) recovered Coccoidea as the sister taxon of Aphidoidea, followed successively by Aleyrodoidea and Psylloidea, and (4) indicated that the grouping of Coleorrhyncha and Fulgoromorpha was the result of a long-branch attraction effect.
Affiliation(s)
- Nan Song
  - College of Plant Protection, Henan Agricultural University, Zhengzhou 450002, China
- Hao Zhang
  - Henan Vocational and Technological College of Communication, Zhengzhou 450015, China
- Te Zhao
  - College of Plant Protection, Henan Agricultural University, Zhengzhou 450002, China

14
de Sousa F, Foster PG, Donoghue PCJ, Schneider H, Cox CJ. Nuclear protein phylogenies support the monophyly of the three bryophyte groups (Bryophyta Schimp.). The New Phytologist 2019; 222:565-575. [PMID: 30411803 DOI: 10.1111/nph.15587] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 09/21/2018] [Accepted: 10/31/2018] [Indexed: 05/05/2023]
Abstract
Unraveling the phylogenetic relationships between the four major lineages of terrestrial plants (mosses, liverworts, hornworts, and vascular plants) is essential for an understanding of the evolution of traits specific to land plants, such as their complex life cycles, and the evolutionary development of stomata and vascular tissue. Well supported phylogenetic hypotheses resulting from different data and methods are often incongruent due to processes of nucleotide evolution that are difficult to model, for example substitutional saturation and composition heterogeneity. We reanalysed a large published dataset of nuclear data and modelled these processes using degenerate-codon recoding and tree-heterogeneous composition substitution models. Our analyses resolved bryophytes as a monophyletic group and showed that the nonmonophyly of the clade that is supported by the analysis of nuclear nucleotide data is due solely to fast-evolving synonymous substitutions. The current congruence among phylogenies of both nuclear and chloroplast analyses lent considerable support to the conclusion that the bryophytes are a monophyletic group. An initial split between bryophytes and vascular plants implies that the bryophyte life cycle (with a dominant gametophyte nurturing an unbranched sporophyte) may not be ancestral to all land plants and that stomata are likely to be a symplesiomorphy among embryophytes.
Affiliation(s)
- Filipe de Sousa
- Centro de Ciências do Mar, Universidade do Algarve, Gambelas, Faro, 8005-319, Portugal
| | - Peter G Foster
- Department of Life Sciences, Natural History Museum, London, SW7 5BD, UK
| | | | - Harald Schneider
- Department of Life Sciences, Natural History Museum, London, SW7 5BD, UK
- School of Earth Sciences, University of Bristol, Bristol, BS8 1TQ, UK
- Center of Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Yunnan, 666303, China
| | - Cymon J Cox
- Centro de Ciências do Mar, Universidade do Algarve, Gambelas, Faro, 8005-319, Portugal
|
15
|
Digging for the spiny rat and hutia phylogeny using a gene capture approach, with the description of a new mammal subfamily. Mol Phylogenet Evol 2019; 136:241-253. [PMID: 30885830 DOI: 10.1016/j.ympev.2019.03.007] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2018] [Revised: 03/13/2019] [Accepted: 03/14/2019] [Indexed: 02/07/2023]
Abstract
Next generation sequencing (NGS) and genomic database mining allow biologists to gather and select large molecular datasets well suited to address phylogenomics and molecular evolution questions. Here we applied this approach to a mammal family, the Echimyidae, for which generic relationships have been difficult to recover and often referred to as a star phylogeny. These South-American spiny rats represent a family of caviomorph rodents exhibiting a striking diversity of species and life history traits. Using a NGS exon capture protocol, we isolated and sequenced ca. 500 nuclear DNA exons for 35 species belonging to all major echimyid and capromyid clades. Exons were carefully selected to encompass as much diversity as possible in terms of rate of evolution, heterogeneity in the distribution of site-variation and nucleotide composition. Supermatrix inferences and coalescence-based approaches were subsequently applied to infer this family's phylogeny. The inferred topologies were the same for both approaches, and support was maximal for each node, entirely resolving the ambiguous relationships of previous analyses. Fast-evolving nuclear exons tended to yield more reliable phylogenies, as slower-evolving sequences were not informative enough to disentangle the short branches of the Echimyidae radiation. Based on this resolved phylogeny and on molecular and morphological evidence, we confirm the rank of the Caribbean hutias - formerly placed in the Capromyidae family - as Capromyinae, a clade nested within Echimyidae. We also name and define Carterodontinae, a new subfamily of Echimyidae, comprising the extant monotypic genus Carterodon from Brazil, which is the closest living relative of West Indies Capromyinae.
|
16
|
Affiliation(s)
- Samuel Abalde
- Departamento de Biodiversidad y Biología Evolutiva; Museo Nacional de Ciencias Naturales (MNCN-CSIC); Madrid Spain
| | - Manuel J. Tenorio
- Departamento CMIM y Q. Inorgánica-INBIO, Facultad de Ciencias; Universidad de Cádiz; Puerto Real Spain
| | - Juan E. Uribe
- Departamento de Biodiversidad y Biología Evolutiva; Museo Nacional de Ciencias Naturales (MNCN-CSIC); Madrid Spain
- Department of Invertebrate Zoology, Smithsonian Institution; National Museum of Natural History; Washington District of Columbia USA
- Grupo de Evolución, Sistemática y Ecología Molecular; Universidad del Magdalena; Santa Marta Colombia
| | - Rafael Zardoya
- Departamento de Biodiversidad y Biología Evolutiva; Museo Nacional de Ciencias Naturales (MNCN-CSIC); Madrid Spain
|
17
|
Hilton SK, Bloom JD. Modeling site-specific amino-acid preferences deepens phylogenetic estimates of viral sequence divergence. Virus Evol 2018; 4:vey033. [PMID: 30425841 PMCID: PMC6220371 DOI: 10.1093/ve/vey033] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
Abstract
Molecular phylogenetics is often used to estimate the time since the divergence of modern gene sequences. For highly diverged sequences, such phylogenetic techniques sometimes estimate surprisingly recent divergence times. In the case of viruses, independent evidence indicates that the estimates of deep divergence times from molecular phylogenetics are sometimes too recent. This discrepancy is caused in part by inadequate models of purifying selection leading to branch-length underestimation. Here we examine the effect on branch-length estimation of using models that incorporate experimental measurements of purifying selection. We find that models informed by experimentally measured site-specific amino-acid preferences estimate longer deep branches on phylogenies of influenza virus hemagglutinin. This lengthening of branches is due to more realistic stationary states of the models, and is mostly independent of the branch-length extension from modeling site-to-site variation in amino-acid substitution rate. The branch-length extension from experimentally informed site-specific models is similar to that achieved by other approaches that allow the stationary state to vary across sites. However, the improvements from all of these site-specific but time homogeneous and site independent models are limited by the fact that a protein’s amino-acid preferences gradually shift as it evolves. Overall, our work underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times—but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.
Affiliation(s)
- Sarah K Hilton
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center; Department of Genome Sciences, University of Washington, USA
| | - Jesse D Bloom
- Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center; Department of Genome Sciences, University of Washington, USA; Howard Hughes Medical Institute, Seattle, WA, USA
|
18
|
Roos C, Liedigk R, Thinh VN, Nadler T, Zinner D. The Hybrid Origin of the Indochinese Gray Langur Trachypithecus crepusculus. INT J PRIMATOL 2017. [DOI: 10.1007/s10764-017-0008-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
|
19
|
Miller JM, Chaudhary H, Marsee JD. Phylogenetic analysis predicts structural divergence for proteobacterial ClpC proteins. J Struct Biol 2017; 201:52-62. [PMID: 29129755 DOI: 10.1016/j.jsb.2017.11.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2017] [Revised: 11/06/2017] [Accepted: 11/08/2017] [Indexed: 12/29/2022]
Abstract
Regulated proteolysis is required in all organisms for the removal of misfolded or degradation-tagged protein substrates in cellular quality control pathways. The molecular machines that catalyze this process are known as ATP-dependent proteases with examples that include ClpAP and ClpCP. Clp/Hsp100 subunits form ring-structures that couple the energy of ATP binding and hydrolysis to protein unfolding and subsequent translocation of denatured protein into the compartmentalized ClpP protease for degradation. Copies of the clpA, clpC, clpE, clpK, and clpL genes are present in all characterized bacteria and their gene products are highly conserved in structure and function. However, the evolutionary relationship between these proteins remains unclear. Here we report a comprehensive phylogenetic analysis that suggests divergent evolution yielded ClpA from an ancestral ClpC protein and that ClpE/ClpL represent intermediates between ClpA/ClpC. This analysis also identifies a group of proteobacterial ClpC proteins that are likely not functional in regulated proteolysis. Our results strongly suggest that bacterial ClpC proteins should not be assumed to all function identically due to the structural differences identified here.
Affiliation(s)
- Justin M Miller
- Middle Tennessee State University, Department of Chemistry, 1301 East Main Street, Murfreesboro, TN 37132, United States.
| | - Hamza Chaudhary
- Middle Tennessee State University, Department of Chemistry, 1301 East Main Street, Murfreesboro, TN 37132, United States
| | - Justin D Marsee
- Middle Tennessee State University, Department of Chemistry, 1301 East Main Street, Murfreesboro, TN 37132, United States
|
20
|
Yahalomi D, Haddas-Sasson M, Rubinstein ND, Feldstein T, Diamant A, Huchon D. The Multipartite Mitochondrial Genome of Enteromyxum leei (Myxozoa): Eight Fast-Evolving Megacircles. Mol Biol Evol 2017; 34:1551-1556. [PMID: 28333349 DOI: 10.1093/molbev/msx072] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open
Abstract
Myxozoans are a large group of poorly characterized cnidarian parasites. To gain further insight into their evolution, we sequenced the mitochondrial (mt) genome of Enteromyxum leei and reevaluated the mt genome structure of Kudoa iwatai. Although the typical animal mt genome is a compact, 13-25 kb, circular chromosome, the mt genome of E. leei was found to be fragmented into eight circular chromosomes of ∼23 kb, making it the largest described animal mt genome. Each chromosome was found to harbor a large noncoding region (∼15 kb), nearly identical between chromosomes. The protein coding genes show an unusually high rate of sequence evolution and possess little similarity to their cnidarian homologs. Only five protein coding genes could be identified and no tRNA genes. Surprisingly, the mt genome of K. iwatai was also found to be composed of two chromosomes. These observations confirm the remarkable plasticity of myxozoan mt genomes.
Affiliation(s)
- Dayana Yahalomi
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Michal Haddas-Sasson
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Nimrod D Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Tamar Feldstein
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel; The Steinhardt Museum of Natural History and Israel National Center for Biodiversity Studies, Tel Aviv University, Tel Aviv, Israel
| | - Arik Diamant
- National Center for Mariculture, Israel Oceanographic and Limnological Research, Eilat, Israel
| | - Dorothée Huchon
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel; The Steinhardt Museum of Natural History and Israel National Center for Biodiversity Studies, Tel Aviv University, Tel Aviv, Israel
|
21
|
Kolařík M, Vohník M. When the ribosomal DNA does not tell the truth: The case of the taxonomic position of Kurtia argillacea, an ericoid mycorrhizal fungus residing among Hymenochaetales. Fungal Biol 2017; 122:1-18. [PMID: 29248111 DOI: 10.1016/j.funbio.2017.09.006] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2017] [Revised: 09/13/2017] [Accepted: 09/27/2017] [Indexed: 11/19/2022]
Abstract
The nuclear ribosomal DNA (nuc-rDNA) is widely used for the identification and phylogenetic reconstruction of Agaricomycetes. However, nuc-rDNA-based phylogenies may sometimes be in conflict with phylogenetic relationships derived from protein coding genes. In this study, the taxonomic position of the basidiomycetous mycobiont that forms the recently discovered sheathed ericoid mycorrhiza was investigated, because its nuc-rDNA is highly dissimilar to any other available fungal sequences in terms of nucleotide composition and length, and its nuc-rDNA-based phylogeny is inconclusive and significantly disagrees with protein coding sequences and morphological data. In the present work, this mycobiont was identified as Kurtia argillacea (= Hyphoderma argillaceum) residing in the order Hymenochaetales (Basidiomycota). Bioinformatic screening of the Kurtia ribosomal DNA sequence indicates that it represents a gene with a non-standard substitution rate or nucleotide composition heterogeneity rather than a deep paralogue or a pseudogene. Such a phenomenon probably also occurs in other lineages of the Fungi and should be taken into consideration when nuc-rDNA (especially that with unusual nucleotide composition) is used as a sole marker for phylogenetic reconstructions. Kurtia argillacea so far represents the only confirmed non-sebacinoid ericoid mycorrhizal fungus in the Basidiomycota and its intriguing placement among mostly saprobic and parasitic Hymenochaetales begs further investigation of its eco-physiology.
Affiliation(s)
- Miroslav Kolařík
- Laboratory of Fungal Genetics and Metabolism, Institute of Microbiology, Czech Academy of Sciences (CAS), Vídeňská 1083, CZ-14220 Prague, Czech Republic.
| | - Martin Vohník
- Department of Mycorrhizal Symbioses, Institute of Botany CAS, CZ-252 43 Průhonice, Czech Republic; Department of Experimental Plant Biology, Faculty of Science, Charles University, Viničná 5, CZ-128 44 Prague, Czech Republic
|
22
|
Mai U, Sayyari E, Mirarab S. Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction. PLoS One 2017; 12:e0182238. [PMID: 28800608 PMCID: PMC5553649 DOI: 10.1371/journal.pone.0182238] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2017] [Accepted: 06/25/2017] [Indexed: 12/29/2022] Open
Abstract
Phylogenetic trees inferred using commonly-used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root to tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods.
Affiliation(s)
- Uyen Mai
- Dept of Computer Science and Engineering, University of California at San Diego, San Diego, CA, United States of America
| | - Erfan Sayyari
- Dept of Electrical and Computer Engineering, University of California at San Diego, San Diego, CA, United States of America
| | - Siavash Mirarab
- Dept of Electrical and Computer Engineering, University of California at San Diego, San Diego, CA, United States of America
|
23
|
Nasir A, Kim KM, Caetano-Anollés G. Phylogenetic Tracings of Proteome Size Support the Gradual Accretion of Protein Structural Domains and the Early Origin of Viruses from Primordial Cells. Front Microbiol 2017; 8:1178. [PMID: 28690608 PMCID: PMC5481351 DOI: 10.3389/fmicb.2017.01178] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2017] [Accepted: 06/09/2017] [Indexed: 01/05/2023] Open
Abstract
Untangling the origin and evolution of viruses remains a challenging proposition. We recently studied the global distribution of protein domain structures in thousands of completely sequenced viral and cellular proteomes with comparative genomics, phylogenomics, and multidimensional scaling methods. A tree of life describing the evolution of proteomes revealed viruses emerging from the base of the tree as a fourth supergroup of life. A tree of domains indicated an early origin of modern viral lineages from ancient cells that co-existed with the cellular ancestors. However, it was recently argued that the rooting of our trees and the basal placement of viruses was artifactually induced by small genome (proteome) size. Here we show that these claims arise from misunderstanding and misinterpretations of cladistic methodology. Trees are reconstructed unrooted, and thus, their topologies cannot be distorted a posteriori by the rooting methodology. Tracing proteome size in trees and multidimensional views of evolutionary relationships as well as tests of leaf stability and exclusion/inclusion of taxa demonstrated that the smallest proteomes were neither attracted toward the root nor caused any topological distortions of the trees. Simulations confirmed that taxa clustering patterns were independent of proteome size and were determined by the presence of known evolutionary relatives in data matrices, highlighting the need for broader taxon sampling in phylogeny reconstruction. Instead, phylogenetic tracings of proteome size revealed a slowdown in innovation of the structural domain vocabulary and four regimes of allometric scaling that reflected a Heaps' law. These regimes explained increasing economies of scale in the evolutionary growth and accretion of kernel proteome repertoires of viruses and cellular organisms that resemble growth of human languages with limited vocabulary sizes. Results reconcile dynamic and static views of domain frequency distributions that are consistent with the axiom of spatiotemporal continuity that is a tenet of evolutionary thinking.
Affiliation(s)
- Arshan Nasir
- Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Kyung Mo Kim
- Division of Polar Life Sciences, Korea Polar Research Institute, Incheon, South Korea
| | - Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States
|
24
|
Williams TA, Szöllősi GJ, Spang A, Foster PG, Heaps SE, Boussau B, Ettema TJG, Embley TM. Integrative modeling of gene and genome evolution roots the archaeal tree of life. Proc Natl Acad Sci U S A 2017; 114:E4602-E4611. [PMID: 28533395 PMCID: PMC5468678 DOI: 10.1073/pnas.1618463114] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood-Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
Affiliation(s)
- Tom A Williams
- School of Earth Sciences, University of Bristol, Bristol BS8 1TQ, United Kingdom
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Gergely J Szöllősi
- MTA-ELTE Lendület Evolutionary Genomics Research Group, 1117 Budapest, Hungary
| | - Anja Spang
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123 Uppsala, Sweden
| | - Peter G Foster
- Department of Life Sciences, Natural History Museum, London SW7 5BD, United Kingdom
| | - Sarah E Heaps
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
| | - Bastien Boussau
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR5558, F-69622 Villeurbanne, France
| | - Thijs J G Ettema
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123 Uppsala, Sweden
| | - T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
|
25
|
Abstract
Animals make up only a small fraction of the eukaryotic tree of life, yet, from our vantage point as members of the animal kingdom, the evolution of the bewildering diversity of animal forms is endlessly fascinating. In the century following the publication of Darwin's Origin of Species, hypotheses regarding the evolution of the major branches of the animal kingdom - their relationships to each other and the evolution of their body plans - were based on a consideration of the morphological and developmental characteristics of the different animal groups. This morphology-based approach had many successes but important aspects of the evolutionary tree remained disputed. In the past three decades, molecular data, most obviously primary sequences of DNA and proteins, have provided an estimate of animal phylogeny largely independent of the morphological evolution we would ultimately like to understand. The molecular tree that has evolved over the past three decades has drastically altered our view of animal phylogeny and many aspects of the tree are no longer contentious. The focus of molecular studies on relationships between animal groups means, however, that the discipline has become somewhat divorced from the underlying biology and from the morphological characteristics whose evolution we aim to understand. Here, we consider what we currently know of animal phylogeny; what aspects we are still uncertain about and what our improved understanding of animal phylogeny can tell us about the evolution of the great diversity of animal life.
Affiliation(s)
- Maximilian J Telford
- Department of Genetics, Evolution and Environment, University College London, WC1E 6BT, UK.
| | - Graham E Budd
- Department of Earth Sciences, Palaeobiology, Uppsala University, Villavägen 16, 75236 Uppsala, Sweden
| | - Hervé Philippe
- Centre de Théorisation et de Modélisation de la Biodiversité, Station d'Ecologie Expérimentale du CNRS, USR CNRS 2936 Moulis, 09200, France; Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montréal, Québec, Canada
|
26
|
O'Malley MA. Histories of molecules: Reconciling the past. STUDIES IN HISTORY AND PHILOSOPHY OF SCIENCE 2016; 55:69-83. [PMID: 26774071 DOI: 10.1016/j.shpsa.2015.09.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/14/2015] [Revised: 09/07/2015] [Accepted: 09/08/2015] [Indexed: 06/05/2023]
Abstract
Molecular data and methods have become centrally important to evolutionary analysis, largely because they have enabled global phylogenetic reconstructions of the relationships between organisms in the tree of life. Often, however, molecular stories conflict dramatically with morphology-based histories of lineages. The evolutionary origin of animal groups provides one such case. In other instances, different molecular analyses have so far proved irreconcilable. The ancient and major divergence of eukaryotes from prokaryotic ancestors is an example of this sort of problem. Efforts to overcome these conflicts highlight the role models play in phylogenetic reconstruction. One crucial model is the molecular clock; another is that of 'simple-to-complex' modification. I will examine animal and eukaryote evolution against a backdrop of increasing methodological sophistication in molecular phylogeny, and conclude with some reflections on the nature of historical science in the molecular era of phylogeny.
|
27
|
Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, Lartillot N, Wörheide G. Genomic data do not support comb jellies as the sister group to all other animals. Proc Natl Acad Sci U S A 2015; 112:15402-7. [PMID: 26621703 PMCID: PMC4687580 DOI: 10.1073/pnas.1518127112] [Citation(s) in RCA: 208] [Impact Index Per Article: 20.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open
Abstract
Understanding how complex traits, such as epithelia, nervous systems, muscles, or guts, originated depends on a well-supported hypothesis about the phylogenetic relationships among major animal lineages. Traditionally, sponges (Porifera) have been interpreted as the sister group to the remaining animals, a hypothesis consistent with the conventional view that the last common animal ancestor was relatively simple and more complex body plans arose later in evolution. However, this premise has recently been challenged by analyses of the genomes of comb jellies (Ctenophora), which, instead, found ctenophores as the sister group to the remaining animals (the "Ctenophora-sister" hypothesis). Because ctenophores are morphologically complex predators with true epithelia, nervous systems, muscles, and guts, this scenario implies these traits were either present in the last common ancestor of all animals and were lost secondarily in sponges and placozoans (Trichoplax) or, alternatively, evolved convergently in comb jellies. Here, we analyze representative datasets from recent studies supporting Ctenophora-sister, including genome-scale alignments of concatenated protein sequences, as well as a genomic gene content dataset. We found no support for Ctenophora-sister and conclude it is an artifact resulting from inadequate methodology, especially the use of simplistic evolutionary models and inappropriate choice of species to root the metazoan tree. Our results reinforce a traditional scenario for the evolution of complexity in animals, and indicate that inferences about the evolution of Metazoa based on the Ctenophora-sister hypothesis are not supported by the currently available data.
Affiliation(s)
- Davide Pisani
- School of Earth Sciences, University of Bristol, Bristol BS8 1TG, United Kingdom; School of Biological Sciences, University of Bristol, Bristol BS8 1TG, United Kingdom
| | - Walker Pett
- Laboratoire de Biométrie et Biologie Évolutive, Université Lyon 1, CNRS, UMR 5558, 69622 Villeurbanne cedex, France
| | - Martin Dohrmann
- Department of Earth & Environmental Sciences & GeoBio-Center, Ludwig-Maximilians-Universität München, Munich 80333, Germany
| | - Roberto Feuda
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125
| | - Omar Rota-Stabelli
- Department of Sustainable Agro-Ecosystems and Bioresources, Research and Innovation Centre, Fondazione Edmund Mach, San Michele all' Adige 38010, Italy
| | - Hervé Philippe
- Centre for Biodiversity Theory and Modelling, USR CNRS 2936, Station d'Ecologie Expérimentale du CNRS, Moulis 09200, France; Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montreal, QC, Canada H3C 3J7
| | - Nicolas Lartillot
- Laboratoire de Biométrie et Biologie Évolutive, Université Lyon 1, CNRS, UMR 5558, 69622 Villeurbanne cedex, France
| | - Gert Wörheide
- Department of Earth & Environmental Sciences & GeoBio-Center, Ludwig-Maximilians-Universität München, Munich 80333, Germany; Bayerische Staatssammlung für Paläontologie und Geologie, Munich 80333, Germany
|
28
|
Kurland CG, Harish A. The phylogenomics of protein structures: The backstory. Biochimie 2015; 119:284-302. [DOI: 10.1016/j.biochi.2015.07.027] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2015] [Accepted: 07/28/2015] [Indexed: 12/11/2022]
|
29
|
Structural and evolutionary relationships of "AT-less" type I polyketide synthase ketosynthases. Proc Natl Acad Sci U S A 2015; 112:12693-8. [PMID: 26420866 DOI: 10.1073/pnas.1515460112] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Acyltransferase (AT)-less type I polyketide synthases (PKSs) break the type I PKS paradigm. They lack the integrated AT domains within their modules and instead use a discrete AT that acts in trans, whereas a type I PKS module minimally contains AT, acyl carrier protein (ACP), and ketosynthase (KS) domains. Structures of canonical type I PKS KS-AT didomains reveal structured linkers that connect the two domains. AT-less type I PKS KSs have remnants of these linkers, which have been hypothesized to be AT docking domains. Natural products produced by AT-less type I PKSs are very complex because of an increased representation of unique modifying domains. AT-less type I PKS KSs possess substrate specificity and fall into phylogenetic clades that correlate with their substrates, whereas canonical type I PKS KSs are monophyletic. We have solved crystal structures of seven AT-less type I PKS KS domains that represent various sequence clusters, revealing insight into the large structural and subtle amino acid residue differences that lead to unique active site topologies and substrate specificities. One set of structures represents a larger group of KS domains from both canonical and AT-less type I PKSs that accept amino acid-containing substrates. One structure has a partial AT-domain, revealing the structural consequences of a type I PKS KS evolving into an AT-less type I PKS KS. These structures highlight the structural diversity within the AT-less type I PKS KS family, and most important, provide a unique opportunity to study the molecular evolution of substrate specificity within the type I PKSs.
30
Lartillot N. Probabilistic models of eukaryotic evolution: time for integration. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140338. [PMID: 26323768 PMCID: PMC4571576 DOI: 10.1098/rstb.2014.0338] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Accepted: 06/03/2015] [Indexed: 11/12/2022] Open
Abstract
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments.
Affiliation(s)
- Nicolas Lartillot
- Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard Lyon 1, F-69622 Villeurbanne Cedex, France
31
Williams TA, Heaps SE, Cherlin S, Nye TMW, Boys RJ, Embley TM. New substitution models for rooting phylogenetic trees. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140336. [PMID: 26323766 PMCID: PMC4571574 DOI: 10.1098/rstb.2014.0336] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Accepted: 06/04/2015] [Indexed: 12/23/2022] Open
Abstract
The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.
Affiliation(s)
- Tom A Williams
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Sarah E Heaps
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK; School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Svetlana Cherlin
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK; School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Tom M W Nye
- School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Richard J Boys
- School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- T Martin Embley
- Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
32
Xu X, Dunn KA, Field C. A Robust ANOVA Approach to Estimating a Phylogeny from Multiple Genes. Mol Biol Evol 2015; 32:2186-94. [PMID: 25841490 DOI: 10.1093/molbev/msv084] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/18/2023] Open
Abstract
In this article, we address the issue of estimating the phylogenetic tree based on sequence data across a set of genes. Recognizing that the individual gene trees may not all share the same evolutionary history due to lateral gene transfer or differences in rates of evolution for instance, we develop a robust algorithm for tree estimation based on pairwise distances computed gene by gene. A robust analysis of variance (ANOVA) is used to combine the distances across all genes giving a summary distance for all genes. The tree can then be constructed using any distance method such as BIONJ. Using the weights from the robust ANOVA, we can then identify the outlying genes and taxa for further examination. As the method is based on distances, computation is much faster than maximum likelihood on the concatenated genes. It is also very straightforward to carry out a bootstrap analysis using standard methods for regression models. We test our methods in a comprehensive simulation study and apply them to three data sets recently analyzed in the literature.
Affiliation(s)
- Ximing Xu
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
- Katherine A Dunn
- Department of Biology, Dalhousie University, Halifax, NS, Canada
- Chris Field
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
33
Petitjean C, Deschamps P, López-García P, Moreira D, Brochier-Armanet C. Extending the conserved phylogenetic core of archaea disentangles the evolution of the third domain of life. Mol Biol Evol 2015; 32:1242-54. [PMID: 25660375 DOI: 10.1093/molbev/msv015] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022] Open
Abstract
Initial studies of the archaeal phylogeny relied mainly on the analysis of the RNA component of the small subunit of the ribosome (SSU rRNA). The resulting phylogenies have provided interesting but partial information on the evolutionary history of the third domain of life because SSU rRNA sequences do not contain enough phylogenetic signal to resolve all nodes of the archaeal tree. Thus, many relationships, and especially the most ancient ones, remained elusive. Moreover, SSU rRNA phylogenies can be heavily biased by tree reconstruction artifacts. The sequencing of complete genomes allows using a variety of protein markers as an alternative to SSU rRNA. Taking advantage of the recent burst of archaeal complete genome sequences, we have carried out an in-depth phylogenomic analysis of this domain. We have identified 200 new protein families that, in addition to the ribosomal proteins and the subunits of the RNA polymerase, form a conserved phylogenetic core of archaeal genes. The accurate analysis of these markers combined with desaturation approaches sheds new light on the evolutionary history of Archaea and reveals that several relationships recovered in recent analyses are likely the consequence of tree reconstruction artifacts. Among others, we resolve a number of important relationships, such as those among methanogens Class I, and we propose the definition of two new superclasses within the Euryarchaeota: Methanomada and Diaforarchaea.
Affiliation(s)
- Céline Petitjean
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- Philippe Deschamps
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- David Moreira
- UMR CNRS 8079, Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud, Orsay, France
- Céline Brochier-Armanet
- Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
34
Petitjean C, Deschamps P, López-García P, Moreira D. Rooting the domain archaea by phylogenomic analysis supports the foundation of the new kingdom Proteoarchaeota. Genome Biol Evol 2014; 7:191-204. [PMID: 25527841 PMCID: PMC4316627 DOI: 10.1093/gbe/evu274] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Indexed: 12/30/2022] Open
Abstract
The first 16S rRNA-based phylogenies of the Archaea showed a deep division between two groups, the kingdoms Euryarchaeota and Crenarchaeota. This bipartite classification has been challenged by the recent discovery of new deeply branching lineages (e.g., Thaumarchaeota, Aigarchaeota, Nanoarchaeota, Korarchaeota, Parvarchaeota, Aenigmarchaeota, Diapherotrites, and Nanohaloarchaeota) which have also been given the same taxonomic status of kingdoms. However, the phylogenetic position of some of these lineages is controversial. In addition, phylogenetic analyses of the Archaea have often been carried out without outgroup sequences, making it difficult to determine if these taxa actually define lineages at the same level as the Euryarchaeota and Crenarchaeota. We have addressed the question of the position of the root of the Archaea by reconstructing rooted archaeal phylogenetic trees using bacterial sequences as outgroup. These trees were based on commonly used conserved protein markers (32 ribosomal proteins) as well as on 38 new markers identified through phylogenomic analysis. We thus gathered a total of 70 conserved markers that we analyzed as a concatenated data set. In contrast with previous analyses, our trees consistently placed the root of the archaeal tree between the Euryarchaeota (including the Nanoarchaeota and other fast-evolving lineages) and the rest of archaeal species, which we propose to class within the new kingdom Proteoarchaeota. This implies the relegation of several groups previously classified as kingdoms (e.g., Crenarchaeota, Thaumarchaeota, Aigarchaeota, and Korarchaeota) to a lower taxonomic rank. In addition to taxonomic implications, this profound reorganization of the archaeal phylogeny has also consequences on our appraisal of the nature of the last archaeal ancestor, which most likely was a complex organism with a gene-rich genome.
Affiliation(s)
- Céline Petitjean
- Unité d'Ecologie, Systématique et Evolution, CNRS UMR 8079, Université Paris-Sud, Orsay, France
- Philippe Deschamps
- Unité d'Ecologie, Systématique et Evolution, CNRS UMR 8079, Université Paris-Sud, Orsay, France
- David Moreira
- Unité d'Ecologie, Systématique et Evolution, CNRS UMR 8079, Université Paris-Sud, Orsay, France
35
Kannan S, Rogozin IB, Koonin EV. MitoCOGs: clusters of orthologous genes from mitochondria and implications for the evolution of eukaryotes. BMC Evol Biol 2014; 14:237. [PMID: 25421434 PMCID: PMC4256733 DOI: 10.1186/s12862-014-0237-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 07/30/2014] [Accepted: 11/07/2014] [Indexed: 01/19/2023] Open
Abstract
Background: Mitochondria are ubiquitous membranous organelles of eukaryotic cells that evolved from an alpha-proteobacterial endosymbiont and possess a small genome that encompasses from 3 to 106 genes. Accumulation of thousands of mitochondrial genomes from diverse groups of eukaryotes provides an opportunity for a comprehensive reconstruction of the evolution of the mitochondrial gene repertoire.
Results: Clusters of orthologous mitochondrial protein-coding genes (MitoCOGs) were constructed from all available mitochondrial genomes and complemented with nuclear orthologs of mitochondrial genes. With minimal exceptions, the mitochondrial gene complements of eukaryotes are subsets of the superset of 66 genes found in jakobids. Reconstruction of the evolution of mitochondrial genomes indicates that the mitochondrial gene set of the last common ancestor of the extant eukaryotes was slightly larger than that of jakobids. This superset of mitochondrial genes likely represents an intermediate stage following the loss and transfer to the nucleus of most of the endosymbiont genes early in eukaryote evolution. Subsequent evolution in different lineages involved largely parallel transfer of ancestral endosymbiont genes to the nuclear genome. The intron density in nuclear orthologs of mitochondrial genes typically is nearly the same as in the rest of the genes in the respective genomes. However, in land plants, the intron density in nuclear orthologs of mitochondrial genes is almost 1.5-fold lower than the genomic mean, suggestive of ongoing transfer of functional genes from mitochondria to the nucleus.
Conclusions: The MitoCOGs are expected to become an important resource for the study of mitochondrial evolution. The nearly complete superset of mitochondrial genes in jakobids likely represents an intermediate stage in the evolution of eukaryotes after the initial, extensive loss and transfer of the endosymbiont genes. In addition, the bacterial multi-subunit RNA polymerase that is encoded in the jakobid mitochondrial genomes was replaced by a single-subunit phage-type RNA polymerase in the rest of the eukaryotes. These results are best compatible with the rooting of the eukaryotic tree between jakobids and the rest of the eukaryotes. The land plants are the only eukaryotic branch in which the gene transfer from the mitochondrial to the nuclear genome appears to be an active, ongoing process.
Affiliation(s)
- Sivakumar Kannan
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
36
Shpirer E, Chang ES, Diamant A, Rubinstein N, Cartwright P, Huchon D. Diversity and evolution of myxozoan minicollagens and nematogalectins. BMC Evol Biol 2014; 14:205. [PMID: 25262812 PMCID: PMC4195985 DOI: 10.1186/s12862-014-0205-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 07/01/2014] [Accepted: 09/19/2014] [Indexed: 11/10/2022] Open
Abstract
Background: Myxozoa are a diverse group of metazoan parasites with a very simple organization, whose evolutionary origin has remained elusive for decades. Their most prominent and characteristic feature is the polar capsule: a complex intracellular structure of the myxozoan spore, which plays a role in host infection. Striking morphological similarities have been found between myxozoan polar capsules and nematocysts, the stinging structures of cnidarians (corals, sea anemones and jellyfish), leading to the suggestion that Myxozoa and Cnidaria share a more recent common ancestry. This hypothesis has recently been supported by phylogenomic evidence and by the identification of a nematocyst-specific minicollagen gene in the myxozoan Tetracapsuloides bryosalmonae. Here we searched genomes and transcriptomes of several myxozoan taxa for the presence of additional cnidarian-specific genes and characterized these genes within a phylogenetic context.
Results: Illumina assemblies of transcriptome or genome data of three myxozoan species (Enteromyxum leei, Kudoa iwatai, and Sphaeromyxa zaharoni) and of the enigmatic cnidarian parasite Polypodium hydriforme (Polypodiozoa) were mined using tBlastn searches with nematocyst-specific proteins as queries. Several orthologs of nematogalectins and minicollagens were identified. Our phylogenetic analyses indicate that myxozoans possess three distinct minicollagens. We found that the cnidarian repertoire of nematogalectins is more complex than previously thought and we identified additional members of the nematogalectin family. Cnidarians were found to possess four nematogalectin/nematogalectin-related genes, while in myxozoans only three genes could be identified.
Conclusions: Our results demonstrate that myxozoans possess a diverse array of genes that are taxonomically restricted to Cnidaria. Characterization of these genes provides compelling evidence that polar capsules and nematocysts are homologous structures and that myxozoans are highly degenerate cnidarians. The diversity of minicollagens was higher than previously thought, with the presence of three minicollagen genes in myxozoans. Our phylogenetic results suggest that the different myxozoan sequences are the results of ancient divergences within Cnidaria and not of recent specializations of the polar capsule. For both minicollagen and nematogalectin, our results show that myxozoans possess fewer gene copies than their cnidarian counterparts, suggesting that the polar capsule gene repertoire was simplified with their reduced body plan.
37
Maguire F, Henriquez FL, Leonard G, Dacks JB, Brown MW, Richards TA. Complex patterns of gene fission in the eukaryotic folate biosynthesis pathway. Genome Biol Evol 2014; 6:2709-20. [PMID: 25252772 PMCID: PMC4224340 DOI: 10.1093/gbe/evu213] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022] Open
Abstract
Shared derived genomic characters can be useful for polarizing phylogenetic relationships; for example, gene fusions have been used to identify deep-branching relationships in the eukaryotes. Here, we report the evolutionary analysis of a three-gene fusion of folB, folK, and folP, which encode enzymes that catalyze consecutive steps in de novo folate biosynthesis. The folK-folP fusion was found across the eukaryotes and a sparse collection of prokaryotes. This suggests an ancient derivation with a number of gene losses in the eukaryotes, potentially as a consequence of adaptation to heterotrophic lifestyles. In contrast, the folB-folK-folP gene is specific to a mosaic collection of Amorphea taxa (a group encompassing Amoebozoa, Apusomonadida, Breviatea, and Opisthokonta). Next, we investigated the stability of this character. We identified numerous gene losses and a total of nine gene fission events, either by break up of an open reading frame (four events identified) or loss of a component domain (five events identified). This indicates that this three-gene fusion is highly labile. These data are consistent with a growing body of data indicating gene fission events occur at high relative rates. Accounting for these sources of homoplasy, our data suggest that the folB-folK-folP gene fusion was present in the last common ancestor of Amoebozoa and Opisthokonta but absent in the Metazoa including the human genome. Comparative genomic data of these genes provides an important resource for designing therapeutic strategies targeting the de novo folate biosynthesis pathway of a variety of eukaryotic pathogens such as Acanthamoeba castellanii.
Affiliation(s)
- Finlay Maguire
- Department of Life Sciences, Natural History Museum, London, United Kingdom
- Fiona L Henriquez
- Infection and Microbiology Research Group, Institute of Biomedical and Environmental Health Research, School of Science, University of the West of Scotland, Paisley, Renfrewshire, United Kingdom
- Guy Leonard
- Biosciences, University of Exeter, Geoffrey Pope Building, Exeter, United Kingdom
- Joel B Dacks
- Department of Life Sciences, Natural History Museum, London, United Kingdom; Department of Cell Biology, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Matthew W Brown
- Department of Biological Sciences, Mississippi State University
- Thomas A Richards
- Biosciences, University of Exeter, Geoffrey Pope Building, Exeter, United Kingdom; Canadian Institute for Advanced Research, CIFAR Program in Integrated Microbial Biodiversity
38
Parks SL, Goldman N. Maximum likelihood inference of small trees in the presence of long branches. Syst Biol 2014; 63:798-811. [PMID: 24996414 PMCID: PMC6371681 DOI: 10.1093/sysbio/syu044] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 08/14/2013] [Accepted: 06/20/2014] [Indexed: 11/14/2022] Open
Abstract
The statistical basis of maximum likelihood (ML), its robustness, and the fact that it appears to suffer less from biases lead to it being one of the most popular methods for tree reconstruction. Despite its popularity, very few analytical solutions for ML exist, so biases suffered by ML are not well understood. One possible bias is long branch attraction (LBA), a regularly cited term generally used to describe a propensity for long branches to be joined together in estimated trees. Although initially mentioned in connection with inconsistency of parsimony, LBA has been claimed to affect all major phylogenetic reconstruction methods, including ML. Despite the widespread use of this term in the literature, exactly what LBA is and what may be causing it is poorly understood, even for simple evolutionary models and small model trees. Studies looking at LBA have focused on the effect of two long branches on tree reconstruction. However, to understand the effect of two long branches it is also important to understand the effect of just one long branch. If ML struggles to reconstruct one long branch, then this may have an impact on LBA. In this study, we look at the effect of one long branch on three-taxon tree reconstruction. We show that, counterintuitively, long branches are preferentially placed at the tips of the tree. This can be understood through the use of analytical solutions to the ML equation and distance matrix methods. We go on to look at the placement of two long branches on four-taxon trees, showing that there is no attraction between long branches, but that for extreme branch lengths long branches are joined together disproportionally often. These results illustrate that even small model trees are still interesting to help understand how ML phylogenetic reconstruction works, and that LBA is a complicated phenomenon that deserves further study.
Affiliation(s)
- Sarah L Parks
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, United Kingdom
- Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, United Kingdom
39
Ramulu HG, Groussin M, Talla E, Planel R, Daubin V, Brochier-Armanet C. Ribosomal proteins: toward a next generation standard for prokaryotic systematics? Mol Phylogenet Evol 2014; 75:103-17. [PMID: 24583288 DOI: 10.1016/j.ympev.2014.02.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/15/2013] [Revised: 01/23/2014] [Accepted: 02/17/2014] [Indexed: 10/25/2022]
Abstract
The seminal work of Carl Woese and co-workers has contributed to promote the RNA component of the small subunit of the ribosome (SSU rRNA) as a "gold standard" of modern prokaryotic taxonomy and systematics, and an essential tool to explore microbial diversity. Yet, this marker has a limited resolving power, especially at deep phylogenetic depth and can lead to strongly biased trees. The ever-larger number of available complete genomes now calls for a novel standard dataset of robust protein markers that may complement SSU rRNA. In this respect, concatenation of ribosomal proteins (r-proteins) is being growingly used to reconstruct large-scale prokaryotic phylogenies, but their suitability for systematic and/or taxonomic purposes has not been specifically addressed. Using Proteobacteria as a case study, we show that amino acid and nucleic acid r-protein sequences contain a reliable phylogenetic signal at a wide range of taxonomic depths, which has not been totally blurred by mutational saturation or horizontal gene transfer. The use of accurate evolutionary models and reconstruction methods allows overcoming most tree reconstruction artefacts resulting from compositional biases and/or fast evolutionary rates. The inferred phylogenies allow clarifying the relationships among most proteobacterial orders and families, along with the position of several unclassified lineages, suggesting some possible revisions of the current classification. In addition, we investigate the root of the Proteobacteria by considering the time-variation of nucleic acid composition of r-protein sequences and the information carried by horizontal gene transfers, two approaches that do not require the use of an outgroup and limit tree reconstruction artefacts. Altogether, our analyses indicate that r-proteins may represent a promising standard for prokaryotic taxonomy and systematics.
Affiliation(s)
- Hemalatha Golaconda Ramulu
- Aix-Marseille Université, CNRS, UMR 7283, Laboratoire de Chimie Bactérienne, IMM, 31 chemin Joseph Aiguier, F-13402 Marseille, France
- Mathieu Groussin
- Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, 43 boulevard du 11 novembre 1918, F-69622 Villeurbanne, France
- Emmanuel Talla
- Aix-Marseille Université, CNRS, UMR 7283, Laboratoire de Chimie Bactérienne, IMM, 31 chemin Joseph Aiguier, F-13402 Marseille, France
- Remi Planel
- Aix-Marseille Université, CNRS, UMR 7283, Laboratoire de Chimie Bactérienne, IMM, 31 chemin Joseph Aiguier, F-13402 Marseille, France
- Vincent Daubin
- Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, 43 boulevard du 11 novembre 1918, F-69622 Villeurbanne, France
- Céline Brochier-Armanet
- Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, 43 boulevard du 11 novembre 1918, F-69622 Villeurbanne, France.
40
Fawcett RC, Parrow MW. Mixotrophy and loss of phototrophy among geographic isolates of freshwater Esoptrodinium/Bernardinium sp. (Dinophyceae). J Phycol 2014; 50:55-70. [PMID: 26988008 DOI: 10.1111/jpy.12144] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/13/2013] [Accepted: 10/13/2013] [Indexed: 06/05/2023]
Abstract
The genus Esoptrodinium Javornický consists of freshwater, athecate dinoflagellates with an incomplete cingulum. Strains isolated thus far feed on microalgae and most possess obvious pigmented chloroplasts, suggesting mixotrophy. However, some geographic isolates lack obvious pigmented chloroplasts. The purpose of this study was to comparatively examine this difference and the associated potential for mixotrophy among different isolates of Esoptrodinium. All isolates phagocytized prey cells through an unusual hatch-like peduncle located on the ventral episome, and were capable of ingesting various protist taxa. All Esoptrodinium isolates required both food and light to grow. However, only the tested strain with visible pigmented chloroplasts benefited from light in terms of increased biomass (phototrophy). Isolates lacking obvious chloroplasts received no biomass benefit from light, but nevertheless required light for sustained growth (i.e., photoobligate, but not phototrophic). Isolates with visible chloroplasts exhibited chlorophyll autofluorescence and formed a monophyletic psbA gene clade that suggested Esoptrodinium possesses inherited, peridinoid-type plastids. One isolate with cryptic, barely visible plastids lacked detectable chlorophyll and exhibited an apparent loss-of-function mutation in psbA, indicating the presence of nonphotosynthetic plastids. The other isolate that lacked visible chloroplasts lacked both detectable chlorophyll and an amplifiable psbA sequence. The results demonstrate mixotrophy quantitatively for the first time in a freshwater dinoflagellate, as well as apparent within-clade loss of phototrophy along with a correlated mutation sufficient to explain that phenotype. Phototrophy is a variable trait in Esoptrodinium; further study is required to determine if this represents an inter- or intraspecific (allelic) characteristic in this taxon.
Affiliation(s)
- Ryan C Fawcett
- Department of Biology, University of North Carolina at Charlotte, Charlotte, North Carolina, 28223, USA
- Matthew W Parrow
- Department of Biology, University of North Carolina at Charlotte, Charlotte, North Carolina, 28223, USA
41
Endosymbiotic gene transfer in tertiary plastid-containing dinoflagellates. Eukaryot Cell 2013; 13:246-55. [PMID: 24297445 DOI: 10.1128/ec.00299-13] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 01/01/2023]
Abstract
Plastid establishment involves the transfer of endosymbiotic genes to the host nucleus, a process known as endosymbiotic gene transfer (EGT). Large amounts of EGT have been shown in several photosynthetic lineages but also in present-day plastid-lacking organisms, supporting the notion that endosymbiotic genes leave a substantial genetic footprint in the host nucleus. Yet the extent of this genetic relocation remains debated, largely because the long period that has passed since most plastids originated has erased many of the clues to how this process unfolded. Among the dinoflagellates, however, the ancestral peridinin-containing plastid has been replaced by tertiary plastids on several more recent occasions, giving us a less ancient window to examine plastid origins. In this study, we evaluated the endosymbiotic contribution to the host genome in two dinoflagellate lineages with tertiary plastids. We generated the first nuclear transcriptome data sets for the "dinotoms," which harbor diatom-derived plastids, and analyzed these data in combination with the available transcriptomes for kareniaceans, which harbor haptophyte-derived plastids. We found a low level of detectable EGT in both dinoflagellate lineages, with only 9 genes and 90 genes of possible tertiary endosymbiotic origin in dinotoms and kareniaceans, respectively, suggesting that tertiary endosymbioses did not heavily impact the host dinoflagellate genomes.
|
42
|
Nozaki H, Yang Y, Maruyama S, Suzaki T. A case study for effects of operational taxonomic units from intracellular endoparasites and ciliates on the eukaryotic phylogeny: phylogenetic position of the Haptophyta in analyses of multiple slowly evolving genes. PLoS One 2012; 7:e50827. [PMID: 23226396 PMCID: PMC3511332 DOI: 10.1371/journal.pone.0050827] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/07/2012] [Accepted: 10/25/2012] [Indexed: 01/09/2023] Open
Abstract
Recent multigene phylogenetic analyses have contributed much to our understanding of eukaryotic phylogeny. However, the phylogenetic positions of various lineages within the eukaryotes have remained unresolved or in conflict between different phylogenetic studies. These ambiguities might have resulted from a combination of factors, including limited taxon sampling, missing data in the alignment, saturation of rapidly evolving genes, mixed analyses of short- and long-branched operational taxonomic units (OTUs), and the inclusion of intracellular endoparasite and ciliate OTUs with unusual substitution patterns. To evaluate the effect of co-analyzing intracellular endoparasite and ciliate OTUs on the eukaryotic phylogeny and to simplify the results, we used two different sets of data matrices of multiple slowly evolving genes with small amounts of missing data and examined the phylogenetic position of the Haptophyta, a group of secondary photosynthetic chromalveolates that is one of the most abundant groups of oceanic phytoplankton and a significant primary producer. In both sets, a robust sister relationship between Haptophyta and SAR (stramenopiles, alveolates, rhizarians, or SA [stramenopiles and alveolates]) was resolved when intracellular endoparasite/ciliate OTUs were excluded, but not in their presence. Based on comparisons of character optimizations on a fixed tree (with a clade composed of haptophytes and SAR or SA), disruption of the monophyly between haptophytes and SAR (or SA) in the presence of intracellular endoparasite/ciliate OTUs can be considered to be a result of multiple evolutionary reversals of character positions that supported the synapomorphy of the haptophyte and SAR (or SA) clade in the absence of intracellular endoparasite/ciliate OTUs.
Affiliation(s)
- Hisayoshi Nozaki
- Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, Japan.
|
43
|
Gallegos ME, Balakrishnan S, Chandramouli P, Arora S, Azameera A, Babushekar A, Bargoma E, Bokhari A, Chava SK, Das P, Desai M, Decena D, Saramma SDD, Dey B, Doss AL, Gor N, Gudiputi L, Guo C, Hande S, Jensen M, Jones S, Jones N, Jorgens D, Karamchedu P, Kamrani K, Kolora LD, Kristensen L, Kwan K, Lau H, Maharaj P, Mander N, Mangipudi K, Menakuru H, Mody V, Mohanty S, Mukkamala S, Mundra SA, Nagaraju S, Narayanaswamy R, Ndungu-Case C, Noorbakhsh M, Patel J, Patel P, Pendem SV, Ponakala A, Rath M, Robles MC, Rokkam D, Roth C, Sasidharan P, Shah S, Tandon S, Suprai J, Truong TQN, Uthayaruban R, Varma A, Ved U, Wang Z, Yu Z. The C. elegans rab family: identification, classification and toolkit construction. PLoS One 2012; 7:e49387. [PMID: 23185324 PMCID: PMC3504004 DOI: 10.1371/journal.pone.0049387] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 01/24/2012] [Accepted: 10/09/2012] [Indexed: 11/29/2022] Open
Abstract
Rab monomeric GTPases regulate specific aspects of vesicle transport in eukaryotes, including coat recruitment, uncoating, fission, motility, target selection and fusion. Moreover, individual Rab proteins function at specific sites within the cell, for example the ER, Golgi and early endosome. Importantly, the localization and function of individual Rab subfamily members are often conserved, underscoring the significant contributions that model organisms such as Caenorhabditis elegans can make towards a better understanding of human disease caused by Rab and vesicle trafficking malfunction. With this in mind, a bioinformatics approach was first taken to identify and classify the complete C. elegans Rab family, placing individual Rabs into specific subfamilies based on molecular phylogenetics. For genes that were difficult to classify by sequence similarity alone, we did a comparative analysis of intron position among specific subfamilies from yeast to humans. This two-pronged approach allowed the classification of 30 out of 31 C. elegans Rab proteins identified here, including Rab31/Rab50, a likely member of the last eukaryotic common ancestor (LECA). Second, a molecular toolset was created to facilitate research on biological processes that involve Rab proteins. Specifically, we used Gateway-compatible C. elegans ORFeome clones as starting material to create 44 full-length, sequence-verified, dominant-negative (DN) and constitutively active (CA) rab open reading frames (ORFs). Development of this toolset provided independent research projects for students enrolled in a research-based molecular techniques course at California State University, East Bay (CSUEB).
Affiliation(s)
- Maria E Gallegos
- Department of Biological Sciences, California State University East Bay, Hayward, CA, USA.
|
44
|
Lasek-Nesselquist E. A mitogenomic re-evaluation of the bdelloid phylogeny and relationships among the Syndermata. PLoS One 2012; 7:e43554. [PMID: 22927990 PMCID: PMC3426538 DOI: 10.1371/journal.pone.0043554] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 03/27/2012] [Accepted: 07/23/2012] [Indexed: 11/22/2022] Open
Abstract
Molecular and morphological data regarding the relationships among the three classes of Rotifera (Bdelloidea, Seisonidea, and Monogononta) and the phylum Acanthocephala are inconclusive. In particular, Bdelloidea lacks molecular-based phylogenetic appraisal. I obtained coding sequences from the mitochondrial genomes of twelve bdelloids and two monogononts to explore the molecular phylogeny of Bdelloidea and provide insight into the relationships among lineages of Syndermata (Rotifera + Acanthocephala). With additional sequences taken from previously published mitochondrial genomes, the total dataset included nine species of bdelloids, three species of monogononts, and two species of acanthocephalans. A supermatrix of these 10-12 mitochondrial proteins consistently recovered a bdelloid phylogeny that questions the validity of a generally accepted classification scheme, despite different methods of inference and various parameter adjustments. Specifically, results showed that neither the family Philodinidae nor the order Philodinida is monophyletic as currently defined. The application of a similar analytical strategy to assess syndermate relationships recovered either a tree with Bdelloidea and Monogononta as sister taxa (Eurotatoria) or Bdelloidea and Acanthocephala as sister taxa (Lemniscea). Both outgroup choice and method of inference affected the topological outcome, emphasizing the need for sequences from more closely related outgroups and more sophisticated methods of analysis that can account for the complexity of the data.
Affiliation(s)
- Erica Lasek-Nesselquist
- University of Connecticut, Department of Molecular and Cellular Biology, Storrs Connecticut, United States of America.
|
45
|
Fawcett RC, Parrow MW. Cytological and phylogenetic diversity in freshwater Esoptrodinium/Bernardinium species (Dinophyceae)(1). Journal of Phycology 2012; 48:793-807. [PMID: 27011096 DOI: 10.1111/j.1529-8817.2012.01174.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
The genera Esoptrodinium Javornický and Bernardinium Chodat comprise freshwater, athecate dinoflagellates with an incomplete cingulum but differing reports regarding cingulum orientation and the presence of chloroplasts and an eyespot. To examine this reported diversity, six isolates were collected from different freshwater ponds and brought into clonal culture. The isolates were examined using LM to determine major cytological differences, and rDNA sequences were compared to determine relatedness and overall phylogenetic position within the dinoflagellates. All isolates were athecate with a left-oriented cingulum that did not fully encircle the cell, corresponding to the current taxonomic concept of Esoptrodinium. However, consistent cytological differences were observed among clonal isolates. Most isolates exhibited unambiguous pale green chloroplasts and a distinct bright-red eyespot located at the base of the longitudinal flagellum. However, one isolate had cryptic chloroplasts that were difficult to observe using LM, and another had an eyespot that was so reduced as to be almost undetectable. Another isolate lacked visible chloroplasts but did possess the characteristic eyespot. Nuclear rDNA phylogenies strongly supported a monophyletic Esoptrodinium clade containing all isolates from this study together with a previous sequence from Portugal, within the Tovelliaceae. Esoptrodinium subclades were largely correlated with cytological differences, and the data suggested that independent chloroplast and eyespot reduction and/or loss may have occurred within this taxon. Overall, the isolates encompassed the majority of cytological diversity reported in previous observations of Bernardinium/Esoptrodinium in field samples. Systematic issues with the current taxonomic distinction between Bernardinium and Esoptrodinium are discussed.
Affiliation(s)
- Ryan C Fawcett
- Department of Biology, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, USA
- Matthew W Parrow
- Department of Biology, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, USA
|
46
|
Burki F, Flegontov P, Oborník M, Cihlář J, Pain A, Lukeš J, Keeling PJ. Re-evaluating the green versus red signal in eukaryotes with secondary plastid of red algal origin. Genome Biol Evol 2012; 4:626-35. [PMID: 22593553 PMCID: PMC3516247 DOI: 10.1093/gbe/evs049] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Indexed: 11/14/2022] Open
Abstract
The transition from endosymbiont to organelle in eukaryotic cells involves the transfer of significant numbers of genes to the host genomes, a process known as endosymbiotic gene transfer (EGT). In the case of plastid organelles, EGTs have been shown to leave a footprint in the nuclear genome that can be indicative of ancient photosynthetic activity in present-day plastid-lacking organisms, or even hint at the existence of cryptic plastids. Here, we evaluated the impact of EGT on eukaryote genomes by reanalyzing the recently published EST dataset for Chromera velia, an interesting test case of a photosynthetic alga closely related to apicomplexan parasites. Previously, 513 genes were reported to originate from red and green algae in a 1:1 ratio. In contrast, by manually inspecting newly generated trees indicating putative algal ancestry, we recovered only 51 genes congruent with EGT, of which 23 and 9 were of red and green algal origin, respectively, whereas 19 were ambiguous regarding the algal provenance. Our approach also uncovered 109 genes that branched within a monocot angiosperm clade, most likely representing a contamination. We emphasize the lack of congruence and the subjectivity resulting from independent phylogenomic screens for EGT, which appear to call for extreme caution when drawing conclusions for major evolutionary events.
Affiliation(s)
- Fabien Burki
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, Vancouver, Canada
- Pavel Flegontov
- Biology Centre, Institute of Parasitology, Czech Academy of Sciences, České Budějovice, Czech Republic
- Miroslav Oborník
- Biology Centre, Institute of Parasitology, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
- Institute of Microbiology, Czech Academy of Sciences, Třeboň, Czech Republic
- Jaromír Cihlář
- Biology Centre, Institute of Parasitology, Czech Academy of Sciences, České Budějovice, Czech Republic
- Arnab Pain
- Computational Bioscience Research Center (CBRC), Chemical Life Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Julius Lukeš
- Biology Centre, Institute of Parasitology, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
- Patrick J. Keeling
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, Vancouver, Canada
- *Corresponding author: E-mail:
|
47
|
Dollet M, Sturm NR, Campbell DA. The internal transcribed spacer of ribosomal RNA genes in plant trypanosomes (Phytomonas spp.) resolves 10 groups. Infection, Genetics and Evolution 2012; 12:299-308. [DOI: 10.1016/j.meegid.2011.11.010] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 05/17/2011] [Revised: 11/21/2011] [Accepted: 11/22/2011] [Indexed: 11/24/2022]
|
48
|
Kim KM, Caetano-Anollés G. The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. BMC Evol Biol 2012; 12:13. [PMID: 22284070 PMCID: PMC3306197 DOI: 10.1186/1471-2148-12-13] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 10/23/2011] [Accepted: 01/27/2012] [Indexed: 11/23/2022] Open
Abstract
Background: The entire evolutionary history of life can be studied using myriad sequences generated by genomic research. This includes the appearance of the first cells and of superkingdoms Archaea, Bacteria, and Eukarya. However, the use of molecular sequence information for deep phylogenetic analyses is limited by mutational saturation, differential evolutionary rates, lack of sequence site independence, and other biological and technical constraints. In contrast, protein structures are evolutionary modules that are highly conserved and diverse enough to enable deep historical exploration.
Results: Here we build phylogenies that describe the evolution of proteins and proteomes. These phylogenetic trees are derived from a genomic census of protein domains defined at the fold family (FF) level of structural classification. Phylogenomic trees of FF structures were reconstructed from genomic abundance levels of 2,397 FFs in 420 proteomes of free-living organisms. These trees defined timelines of domain appearance, with time spanning from the origin of proteins to the present. Timelines are divided into five different evolutionary phases according to patterns of sharing of FFs among superkingdoms: (1) a primordial protein world, (2) reductive evolution and the rise of Archaea, (3) the rise of Bacteria from the common ancestor of Bacteria and Eukarya and early development of the three superkingdoms, (4) the rise of Eukarya and widespread organismal diversification, and (5) eukaryal diversification. The relative ancestry of the FFs shows that reductive evolution by domain loss is dominant in the first three phases and is responsible for both the diversification of life from a universal cellular ancestor and the appearance of superkingdoms. On the other hand, domain gains are predominant in the last two phases and are responsible for organismal diversification, especially in Bacteria and Eukarya.
Conclusions: The evolution of functions that are associated with corresponding FFs along the timeline reveals that primordial metabolic domains evolved earlier than informational domains involved in translation and transcription, supporting the metabolism-first hypothesis rather than the RNA world scenario. In addition, phylogenomic trees of proteomes reconstructed from FFs appearing in each of the five phases of the protein world show that trees reconstructed from ancient domain structures were consistently rooted in archaeal lineages, supporting the proposal that the archaeal ancestor is more ancient than the ancestors of other superkingdoms.
Affiliation(s)
- Kyung Mo Kim
- Evolutionary Bioinformatics Laboratory, Department of Crop Science, University of Illinois, Urbana, IL 61801, USA
|
49
|
Soria-Carrasco V, Castresana J. Patterns of mammalian diversification in recent evolutionary times: global tendencies and methodological issues. J Evol Biol 2011; 24:2611-23. [PMID: 21955145 DOI: 10.1111/j.1420-9101.2011.02384.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
Changes in diversification patterns estimated from phylogenetic trees are an important source of information about the dynamics of evolution. To study the diversification of mammals, we reconstructed phylogenetic trees of 29 families and fitted both constant-rate and variable-rate models of diversification. In addition, we investigated the effect of clock models and phylogenetic reconstruction problems on diversification analyses. We observed, first, that none of the families increased its diversification rate during the last few million years, including the Pleistocene. Furthermore, we detected a decrease in diversification that, after application of different tests, was significant only for a minority of families. However, when diversification variation was analysed in a combined tree of all families, a global decline in diversification became significant. Therefore, although distorted by some methodological artefacts, we found an underlying signal of gradually decreasing diversification that suggests that ecological factors may have shaped the recent diversification of mammals.
Affiliation(s)
- V Soria-Carrasco
- Institute of Evolutionary Biology (CSIC-UPF), Passeig Marítim de la Barceloneta, Barcelona, Spain
|
50
|
Lim CH, Hamazaki T, Braun EL, Wade J, Terada N. Evolutionary genomics implies a specific function of Ant4 in mammalian and anole lizard male germ cells. PLoS One 2011; 6:e23122. [PMID: 21858006 PMCID: PMC3155547 DOI: 10.1371/journal.pone.0023122] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/18/2011] [Accepted: 07/11/2011] [Indexed: 11/18/2022] Open
Abstract
Most vertebrates have three paralogous genes with identical intron-exon structures and a high degree of sequence identity that encode mitochondrial adenine nucleotide translocase (Ant) proteins, Ant1 (Slc25a4), Ant2 (Slc25a5) and Ant3 (Slc25a6). Recently, we and others identified a fourth mammalian Ant paralog, Ant4 (Slc25a31), with a distinct intron-exon structure and a lower degree of sequence identity. Ant4 was expressed selectively in testis and sperm in adult mammals and was indeed essential for mouse spermatogenesis, but it was absent in birds, fish and frogs. Since Ant2 is X-linked in mammalian genomes, we hypothesized that the autosomal Ant4 gene may compensate for the loss of Ant2 gene expression during male meiosis in mammals. Here we report that the Ant4 ortholog is conserved in green anole lizard (Anolis carolinensis) and demonstrate that it is expressed in the anole testis. Further, a degenerate DNA fragment of a putative Ant4 gene was identified in syntenic regions of avian genomes, indicating that Ant4 was present in the common amniote ancestor. Phylogenetic analyses suggest an even more ancient origin of the Ant4 gene. Although anole lizards are presumed male (XY) heterogametic, like mammals, copy numbers of Ant2 as well as its neighboring gene were similar between male and female anole genomes, indicating that the anole Ant2 gene is either autosomal or located in the pseudoautosomal region of the sex chromosomes, in contrast to the case in mammals. These results imply that the conservation of Ant4 is not likely driven simply by the sex chromosomal localization of the Ant2 gene and its subsequent inactivation during male meiosis. Taken together with the fact that Ant4 protein has a uniquely conserved structure when compared to the somatic Ant1, 2 and 3, there may be a specific advantage for mammals and lizards to express Ant4 in their male germ cells.
Affiliation(s)
- Chae Ho Lim
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Takashi Hamazaki
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Edward L. Braun
- Department of Biology, College of Liberal Arts and Sciences, University of Florida, Gainesville, Florida, United States of America
- Juli Wade
- Neuroscience Program, Department of Psychology, Department of Zoology, Michigan State University, East Lansing, Michigan, United States of America
- Naohiro Terada
- Department of Pathology, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- * E-mail:
|