1
Braun EL. An evolutionary model motivated by physicochemical properties of amino acids reveals variation among proteins. Bioinformatics 2018; 34:i350-i356. [PMID: 29950007 PMCID: PMC6022633 DOI: 10.1093/bioinformatics/bty261] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]
Abstract
Motivation: The relative rates of amino acid interchanges over evolutionary time are likely to vary among proteins. Variation in those rates has the potential to reveal information about constraints on proteins. However, the most straightforward model that could be used to estimate relative rates of amino acid substitution is parameter-rich and it is therefore impractical to use for this purpose. Results: A six-parameter model of amino acid substitution that incorporates information about the physicochemical properties of amino acids was developed. It showed that amino acid side chain volume, polarity and aromaticity have major impacts on protein evolution. It also revealed variation among proteins in the relative importance of those properties. The same general approach can be used to improve the fit of empirical models such as the commonly used PAM and LG models. Availability and implementation: Perl code and test data are available from https://github.com/ebraun68/sixparam. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Edward L Braun
- Department of Biology and Genetics Institute, University of Florida, Gainesville, FL, USA
2
Giribet G, Edgecombe GD. Current Understanding of Ecdysozoa and its Internal Phylogenetic Relationships. Integr Comp Biol 2017; 57:455-466. [DOI: 10.1093/icb/icx072] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Indexed: 11/14/2022]
3
Chorev M, Joseph Bekker A, Goldberger J, Carmel L. Identification of introns harboring functional sequence elements through positional conservation. Sci Rep 2017. [PMID: 28646210 PMCID: PMC5482813 DOI: 10.1038/s41598-017-04476-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Many human introns carry out a function, in the sense that they are critical to maintain normal cellular activity. Their identification is fundamental to understanding cellular processes and disease. However, being noncoding elements, such functional introns are poorly predicted based on traditional approaches of sequence and structure conservation. Here, we generated a dataset of human functional introns that carry out different types of functions. We showed that functional introns share common characteristics, such as higher positional conservation along the coding sequence and reduced loss rates, regardless of their specific function. A unique property of the data is that if an intron is not known to be functional, that does not mean it is non-functional. We developed a probabilistic framework that explicitly accounts for this unique property, and predicts which specific human introns are functional. We show that we successfully predict function even when the algorithm is trained on introns with a different type of function. This ability has many implications in studying regulatory networks, gene regulation, the effect of mutations outside exons on human disease, and on our general understanding of intron evolution and their functional exaptation in mammals.
Affiliation(s)
- Michal Chorev
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat Ram, Jerusalem, 91904, Israel; The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem, 91904, Israel
- Jacob Goldberger
- Faculty of Engineering, Bar-Ilan University, Ramat Gan, 52900, Israel
- Liran Carmel
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat Ram, Jerusalem, 91904, Israel.
4
Trichinella spiralis: Adaptation and parasitism. Vet Parasitol 2016; 231:8-21. [PMID: 27425574 DOI: 10.1016/j.vetpar.2016.07.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/05/2016] [Revised: 06/29/2016] [Accepted: 07/02/2016] [Indexed: 11/21/2022]
Abstract
Publication of the genome from the clade I organism, Trichinella spiralis, has provided us with an avenue to address more holistic problems in parasitology; namely the processes of adaptation and the evolution of parasitism. Parasitism among nematodes has evolved in multiple, independent events. Deciphering processes that drive species diversity and adaptation are keys to understanding parasitism and advancing control strategies. Studies have been put forth on morphological and physiological aspects of parasitism and adaptation in nematodes; however, data are now becoming available to investigate adaptation, host switching and parasitism at the genomic level. Herein we compare proteomic data from the clade I parasite, Trichinella spiralis with data from Brugia malayi (clade III), Meloidogyne hapla and Meloidogyne incognita (clade IV), and free-living nematodes belonging to the genera Caenorhabditis and Pristionchus (clade V). We explore changes in protein family birth/death and expansion/reduction over the course of metazoan evolution using Homo sapiens, Drosophila melanogaster and Saccharomyces cerevisiae as outgroups for the phylum Nematoda. We further examine relationships between these changes and the ability and/or result of nematodes adapting to their environments. Data are consistent with gene loss occurring in conjunction with nematode specialization resulting from parasitic worms acclimating to well-defined, environmental niches. We observed evidence for independent, lateral gene transfer events involving conserved genes that may have played a role in the evolution of nematode parasitism. In general, parasitic nematodes gained proteins through duplication and lateral gene transfer, and lost proteins through random mutation and deletions. Data suggest independent acquisition rather than ancestral inheritance among the Nematoda followed by selective gene loss over evolutionary time.
Data also show that parasitism and adaptation affected a broad range of proteins, especially those involved in sensory perception, metabolism, and transcription/translation. New protein gains with functions related to regulating transcription and translation, and protein family expansions with functions related to morphology and body development have occurred in association with parasitism. Further gains occurred as a result of lateral gene transfer and in particular, with the cyanase protein family. In contrast, reductions and/or losses have occurred in protein families with functions related to metabolic process and signal transduction. Taking advantage of the independent occurrences of parasitism in nematodes, which enabled us to distinguish changes associated with parasitism from species specific niche adaptation, our study provides valuable insights into nematode parasitism at a proteome level using T. spiralis as a benchmark for early adaptation to or acquisition of parasitism.
5
Meiklejohn KA, Faircloth BC, Glenn TC, Kimball RT, Braun EL. Analysis of a Rapid Evolutionary Radiation Using Ultraconserved Elements: Evidence for a Bias in Some Multispecies Coalescent Methods. Syst Biol 2016; 65:612-27. [DOI: 10.1093/sysbio/syw014] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 05/25/2015] [Accepted: 01/25/2016] [Indexed: 01/30/2023]
6
Dunn CW, Ryan JF. The evolution of animal genomes. Curr Opin Genet Dev 2015; 35:25-32. [PMID: 26363125 DOI: 10.1016/j.gde.2015.08.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 06/05/2015] [Revised: 08/18/2015] [Accepted: 08/20/2015] [Indexed: 11/18/2022]
Abstract
Genome sequences are now available for hundreds of species sampled across the animal phylogeny, bringing key features of animal genome evolution into sharper focus. The field of animal evolutionary genomics has focused on identifying and classifying the diversity of genomic features, reconstructing the history of evolutionary changes in animal genomes, and testing hypotheses about the evolutionary relationships of animals. The grand challenges moving forward are to connect evolutionary changes in genomes with particular evolutionary changes in phenotypes, and to determine which changes are driven by selection. This will require far greater genome sampling both across and within species, extensive phenotype data, a well resolved animal phylogeny, and advances in comparative methods.
Affiliation(s)
- Casey W Dunn
- Department of Ecology and Evolutionary Biology, Brown University, 80 Waterman St., Providence, RI 02906, USA.
- Joseph F Ryan
- Whitney Laboratory for Marine Bioscience, University of Florida, 9505 Ocean Shore Blvd., St Augustine, FL 32080, USA; Department of Biology, University of Florida, Gainesville, FL 32611, USA
7
Wang B, Zhang Y, Wei P, Sun M, Ma X, Zhu X. Identification of nuclear low-copy genes and their phylogenetic utility in rosids. Genome 2015; 57:547-54. [PMID: 25761707 DOI: 10.1139/gen-2014-0138] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
So far, the interordinal relationships in rosids remain poorly resolved. Previous studies based on chloroplast, mitochondrial, and nuclear DNA have produced conflicting phylogenetic resolutions, which have become a widely recognized problem in recent phylogenetic studies. Here, a total of 96 single-copy nuclear gene loci were identified from the KOG (eukaryotic orthologous groups) database, most of which were first used for phylogenetic analysis of angiosperms. The orthologous sequence datasets from completely sequenced genomes of rosids were assembled for the resolution of the position of the COM (Celastrales-Oxalidales-Malpighiales) clade in rosids. Our analysis revealed strong and consistent support for the CM topology (the COM clade as sister to the malvids). Our results will contribute to further exploring the underlying cause of conflict between chloroplast, mitochondrial, and nuclear data. In addition, our study identified a few novel nuclear molecular markers with potential to investigate the deep phylogenetic relationships of plants or other eukaryotic taxonomical groups.
Affiliation(s)
- Baohua Wang
- School of Life Sciences, Nantong University, Nantong 226019, China
8
Giribet G. Morphology should not be forgotten in the era of genomics–a phylogenetic perspective. ZOOL ANZ 2015. [DOI: 10.1016/j.jcz.2015.01.003] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Indexed: 11/29/2022]
9
Borner J, Rehm P, Schill RO, Ebersberger I, Burmester T. A transcriptome approach to ecdysozoan phylogeny. Mol Phylogenet Evol 2014; 80:79-87. [PMID: 25124096 DOI: 10.1016/j.ympev.2014.08.001] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 03/18/2014] [Revised: 07/15/2014] [Accepted: 08/01/2014] [Indexed: 11/20/2022]
Abstract
The monophyly of Ecdysozoa, which comprise molting phyla, has received strong support from several lines of evidence. However, the internal relationships of Ecdysozoa are still contended. We generated expressed sequence tags from a priapulid (penis worm), a kinorhynch (mud dragon), a tardigrade (water bear) and five chelicerate taxa by 454 transcriptome sequencing. A multigene alignment was assembled from 63 taxa, which comprised after matrix optimization 24,249 amino acid positions with high data density (2.6% gaps, 19.1% missing data). Phylogenetic analyses employing various models support the monophyly of Ecdysozoa. A clade combining Priapulida and Kinorhyncha (i.e. Scalidophora) was recovered as the earliest branch among Ecdysozoa. We conclude that Cycloneuralia, a taxon erected to combine Priapulida, Kinorhyncha and Nematoda (and others), are paraphyletic. Rather Arthropoda (including Onychophora) are allied with Nematoda and Tardigrada. Within Arthropoda, we found strong support for most clades, including monophyletic Mandibulata and Pancrustacea. The phylogeny within the Euchelicerata remained largely unresolved. There is conflicting evidence on the position of tardigrades: While Bayesian and maximum likelihood analyses of only slowly evolving genes recovered Tardigrada as a sister group to Arthropoda, analyses of the full data set, and of subsets containing genes evolving at fast and intermediate rates identified a clade of Tardigrada and Nematoda. Notably, the latter topology is also supported by the analyses of indel patterns.
Affiliation(s)
- Janus Borner
- Institute of Zoology and Zoological Museum, University of Hamburg, D-20146 Hamburg, Germany
- Peter Rehm
- Institute of Zoology and Zoological Museum, University of Hamburg, D-20146 Hamburg, Germany
- Ralph O Schill
- Zoology, Biological Institute, University of Stuttgart, Germany
- Ingo Ebersberger
- Department for Applied Bioinformatics, University of Frankfurt, Institute for Cell Biology and Neuroscience, Germany
- Thorsten Burmester
- Institute of Zoology and Zoological Museum, University of Hamburg, D-20146 Hamburg, Germany.
10
Ray PS, Fox PL. Origin and evolution of glutamyl-prolyl tRNA synthetase WHEP domains reveal evolutionary relationships within Holozoa. PLoS One 2014; 9:e98493. [PMID: 24968216 PMCID: PMC4072531 DOI: 10.1371/journal.pone.0098493] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 01/30/2014] [Accepted: 05/02/2014] [Indexed: 02/05/2023]
Abstract
Repeated domains in proteins that have undergone duplication or loss, and sequence divergence, are especially informative about phylogenetic relationships. We have exploited divergent repeats of the highly structured, 50-amino acid WHEP domains that join the catalytic subunits of bifunctional glutamyl-prolyl tRNA synthetase (EPRS) as a sequence-informed repeat (SIR) to trace the origin and evolution of EPRS in holozoa. EPRS is the only fused tRNA synthetase, with two distinct aminoacylation activities, and a non-canonical translation regulatory function mediated by the WHEP domains in the linker. Investigating the duplications, deletions and divergence of WHEP domains, we traced the bifunctional EPRS to choanozoans and identified the fusion event leading to its origin at the divergence of ichthyosporea and emergence of filozoa nearly a billion years ago. Distribution of WHEP domains from a single species in two or more distinct clades suggested common descent, allowing the identification of linking organisms. The discrete assortment of choanoflagellate WHEP domains with choanozoan domains as well as with those in metazoans supported the phylogenetic position of choanoflagellates as the closest sister group to metazoans. Analysis of clustering and assortment of WHEP domains provided unexpected insights into phylogenetic relationships amongst holozoan taxa. Furthermore, observed gaps in the transition between WHEP domain groupings in distant taxa allowed the prediction of undiscovered or extinct evolutionary intermediates. Analysis based on SIR domains can provide a phylogenetic counterpart to palaeontological approaches of discovering “missing links” in the tree of life.
Affiliation(s)
- Partho Sarothi Ray
- Department of Cellular and Molecular Medicine, The Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Department of Biological Sciences, Indian Institute of Science Education and Research, Kolkata, India
- Paul L. Fox
- Department of Cellular and Molecular Medicine, The Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
11
Yi Z, Strüder-Kypke M, Hu X, Lin X, Song W. Sampling strategies for improving tree accuracy and phylogenetic analyses: a case study in ciliate protists, with notes on the genus Paramecium. Mol Phylogenet Evol 2013; 71:142-8. [PMID: 24315865 DOI: 10.1016/j.ympev.2013.11.013] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 05/03/2013] [Revised: 11/20/2013] [Accepted: 11/24/2013] [Indexed: 11/28/2022]
Abstract
In order to assess how dataset-selection for multi-gene analyses affects the accuracy of inferred phylogenetic trees in ciliates, we chose five genes and the genus Paramecium, one of the most widely used model protist genera, and compared tree topologies of the single- and multi-gene analyses. Our empirical study shows that: (1) Using multiple genes improves phylogenetic accuracy, even when their one-gene topologies are in conflict with each other. (2) The impact of missing data on phylogenetic accuracy is ambiguous: resolution power and topological similarity, but not number of represented taxa, are the most important criteria of a dataset for inclusion in concatenated analyses. (3) As an example, we tested the three classification models of the genus Paramecium with a multi-gene based approach, and only the monophyly of the subgenus Paramecium is supported.
Affiliation(s)
- Zhenzhen Yi
- Key Laboratory of Ecology and Environment Science in Guangdong Higher Education, School of Life Science, South China Normal University, Guangzhou 510631, China; Laboratory of Protozoology, Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China
- Michaela Strüder-Kypke
- Department of Molecular and Cellular Biology, University of Guelph, Guelph, Ontario N1G 2W1, Canada
- Xiaozhong Hu
- Laboratory of Protozoology, Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China
- Xiaofeng Lin
- Key Laboratory of Ecology and Environment Science in Guangdong Higher Education, School of Life Science, South China Normal University, Guangzhou 510631, China.
- Weibo Song
- Laboratory of Protozoology, Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China.
12
Campbell MA, Chen WJ, López JA. Are flatfishes (Pleuronectiformes) monophyletic? Mol Phylogenet Evol 2013; 69:664-73. [PMID: 23876291 PMCID: PMC4458374 DOI: 10.1016/j.ympev.2013.07.011] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 04/23/2013] [Revised: 07/07/2013] [Accepted: 07/12/2013] [Indexed: 02/03/2023]
Abstract
All extant species of flatfish (order Pleuronectiformes) are thought to descend from a common ancestor, and therefore to represent a monophyletic group. This hypothesis is based largely on the dramatic bilateral asymmetry and associated ocular migration characteristics of all flatfish. Yet, molecular-based phylogenetic studies have been inconclusive on this premise. Support for flatfish monophyly has varied with differences in taxonomic and gene region sampling schemes. Notably, the genus Psettodes has been found to be more related to non-flatfishes than to other flatfishes in many recent studies. The polyphyletic nature of the Pleuronectiformes is often inferred to be the result of weak historical signal and/or artifact of phylogenetic inference due to a bias in the data. In this study, we address the question of pleuronectiform monophyly with a broad set of markers (from six phylogenetically informative nuclear loci) and inference methods designed to limit the influence of phylogenetic artifacts. Concomitant with a character-rich analytical strategy, an extensive taxonomic sampling of flatfish and potential close relatives is used to increase power and resolution. Results of our analyses are most consistent with a non-monophyletic Pleuronectiformes with Psettodes always being excluded. A fossil-calibrated Bayesian relaxed clock analysis estimates the age of Pleuronectoidei to be 73 Ma, and the time to most recent common ancestor of Pleuronectoidei, Psettodes, and other relative taxa to be 77 Ma. The ages are much older than the records of any fossil pleuronectiform currently recognized. We discuss our findings in the context of the available morphological evidence and discuss the compatibility of our molecular hypothesis with morphological data regarding extinct and extant flatfish forms.
Affiliation(s)
- Matthew A. Campbell
- Department of Biology and Wildlife, University of Alaska Fairbanks, Fairbanks, Alaska 99775, USA
- Wei-Jen Chen
- Institute of Oceanography, National Taiwan University, Taipei 10617, Taiwan
- J. Andrés López
- School of Fisheries and Ocean Sciences, University of Alaska, Fairbanks, AK 99775, USA
- University of Alaska Museum, Fairbanks, AK 99775, USA
13
Ocampo EH, Robles R, Terossi M, Nuñez JD, Cledón M, Mantelatto FL. Phylogeny, phylogeography, and systematics of the American pea crab genus Calyptraeotheres Campos, 1990, inferred from molecular markers. Zool J Linn Soc 2013. [DOI: 10.1111/zoj.12045] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Emiliano H. Ocampo
- Instituto de Investigaciones Marinas y Costeras; CONICET-UNMDP; Mar del Plata; Buenos Aires; 7600; Argentina
- Rafael Robles
- Laboratory of Bioecology and Crustacean Systematics, Program in Comparative Biology, Department of Biology, Faculty of Philosophy, Science and Letters of Ribeirão Preto; University of São Paulo; Ribeirão Preto; São Paulo; 14040-901; Brazil
- Mariana Terossi
- Laboratory of Bioecology and Crustacean Systematics, Program in Comparative Biology, Department of Biology, Faculty of Philosophy, Science and Letters of Ribeirão Preto; University of São Paulo; Ribeirão Preto; São Paulo; 14040-901; Brazil
- Jesús D. Nuñez
- Instituto de Investigaciones Marinas y Costeras; CONICET-UNMDP; Mar del Plata; Buenos Aires; 7600; Argentina
- Maximiliano Cledón
- Instituto de Investigaciones Marinas y Costeras; CONICET-UNMDP; Mar del Plata; Buenos Aires; 7600; Argentina
- Fernando L. Mantelatto
- Laboratory of Bioecology and Crustacean Systematics, Program in Comparative Biology, Department of Biology, Faculty of Philosophy, Science and Letters of Ribeirão Preto; University of São Paulo; Ribeirão Preto; São Paulo; 14040-901; Brazil
14
Struck TH. The impact of paralogy on phylogenomic studies - a case study on annelid relationships. PLoS One 2013; 8:e62892. [PMID: 23667537 PMCID: PMC3647064 DOI: 10.1371/journal.pone.0062892] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 11/27/2012] [Accepted: 03/26/2013] [Indexed: 12/15/2022]
Abstract
Phylogenomic studies based on hundreds of genes derived from expressed sequence tags libraries are increasingly used to reveal the phylogeny of taxa. A prerequisite for these studies is the assignment of genes into clusters of orthologous sequences. Sophisticated methods of orthology prediction are used in such analyses, but it is rarely assessed whether paralogous sequences have been erroneously grouped together as orthologous sequences after the prediction, and whether this had an impact on the phylogenetic reconstruction using a super-matrix approach. Herein, I tested the impact of paralogous sequences on the reconstruction of annelid relationships based on phylogenomic datasets. Using single-partition analyses, screening for bootstrap support, BLAST searches and pruning of sequences in the supermatrix, wrongly assigned paralogous sequences were found in eight partitions and the placement of five taxa (the annelids Owenia, Scoloplos, Sthenelais and Eurythoe and the nemertean Cerebratulus) including the robust bootstrap support could be attributed to the presence of paralogous sequences in two partitions. Excluding these sequences resulted in a different, more weakly supported placement for these taxa. Moreover, the analyses revealed that paralogous sequences impacted the reconstruction when only a single taxon represented a previously supported higher taxon such as a polychaete family. One possibility of a priori detection of wrongly assigned paralogous sequences could combine 1) a screening of single-partition analyses based on criteria such as nodal support or internal branch length with 2) BLAST searches of suspicious cases as presented herein. Also possible are a posteriori approaches in which support for specific clades is investigated by comparing alternative hypotheses based on differences in per-site likelihoods.
Increasing the sizes of EST libraries will also decrease the likelihood of wrongly assigned paralogous sequences, and in the case of orthology prediction methods like HaMStR it is likewise decreased by using more than one reference taxon.
15
Bigot T, Daubin V, Lassalle F, Perrière G. TPMS: a set of utilities for querying collections of gene trees. BMC Bioinformatics 2013; 14:109. [PMID: 23530580 PMCID: PMC3655882 DOI: 10.1186/1471-2105-14-109] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/05/2012] [Accepted: 03/12/2013] [Indexed: 01/02/2023]
Abstract
Background: The information in large collections of phylogenetic trees is useful for many comparative genomic studies. Therefore, there is a need for flexible tools that allow exploration of such collections in order to retrieve relevant data as quickly as possible. Results: In this paper, we present TPMS (Tree Pattern-Matching Suite), a set of programs for handling and retrieving gene trees according to different criteria. The programs from the suite include utilities for tree collection building, specific tree-pattern search strategies and tree rooting. Use of TPMS is illustrated through three examples: systematic search for incongruencies in a large tree collection, a short study on the Coelomata/Ecdysozoa controversy and an evaluation of the level of support for a recently published Mammal phylogeny. Conclusion: TPMS is a powerful suite that allows quick retrieval of sets of trees matching complex patterns in large collections, and rooting of trees using more rigorous approaches than the classical midpoint method. As it is made of a set of command-line programs, it can be easily integrated into any sequence analysis pipeline for automated use.
Affiliation(s)
- Thomas Bigot
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Claude Bernard - Lyon 1, 43 bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
16
Yuri T, Kimball RT, Harshman J, Bowie RCK, Braun MJ, Chojnowski JL, Han KL, Hackett SJ, Huddleston CJ, Moore WS, Reddy S, Sheldon FH, Steadman DW, Witt CC, Braun EL. Parsimony and model-based analyses of indels in avian nuclear genes reveal congruent and incongruent phylogenetic signals. Biology 2013; 2:419-44. [PMID: 24832669 PMCID: PMC4009869 DOI: 10.3390/biology2010419] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 12/28/2012] [Revised: 02/21/2013] [Accepted: 02/22/2013] [Indexed: 11/19/2022]
Abstract
Insertion/deletion (indel) mutations, which are represented by gaps in multiple sequence alignments, have been used to examine phylogenetic hypotheses for some time. However, most analyses combine gap data with the nucleotide sequences in which they are embedded, probably because most phylogenetic datasets include few gap characters. Here, we report analyses of 12,030 gap characters from an alignment of avian nuclear genes using maximum parsimony (MP) and a simple maximum likelihood (ML) framework. Both trees were similar, and they exhibited almost all of the strongly supported relationships in the nucleotide tree, although neither gap tree supported many relationships that have proven difficult to recover in previous studies. Moreover, independent lines of evidence typically corroborated the nucleotide topology instead of the gap topology when they disagreed, although the number of conflicting nodes with high bootstrap support was limited. Filtering to remove short indels did not substantially reduce homoplasy or reduce conflict. Combined analyses of nucleotides and gaps resulted in the nucleotide topology, but with increased support, suggesting that gap data may prove most useful when analyzed in combination with nucleotide substitutions.
Affiliation(s)
- Tamaki Yuri
- Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Sam Noble Oklahoma Museum of Natural History, University of Oklahoma, Norman, OK 73072, USA
- Rebecca T. Kimball
- Department of Biology, University of Florida, Gainesville, FL 32611, USA
- John Harshman
- 4869 Pepperwood Way, San Jose, CA 95124, USA
- Rauri C. K. Bowie
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- Michael J. Braun
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, 4210 Silver Hill Road, Suitland, MD 20746, USA
- Behavior, Ecology, Evolution and Systematics Program, University of Maryland, College Park, MD 20742, USA
- Jena L. Chojnowski
- Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Kin-Lan Han
- Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Shannon J. Hackett
- Zoology Department, Field Museum of Natural History, 1400 South Lakeshore Drive, Chicago, IL 60605, USA
- Christopher J. Huddleston
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, 4210 Silver Hill Road, Suitland, MD 20746, USA
- William S. Moore
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI 48202, USA
- Sushma Reddy
- Biology Department, Loyola University Chicago, Chicago, IL 60660, USA
- Frederick H. Sheldon
- Museum of Natural Science, 119 Foster Hall, Louisiana State University, Baton Rouge, LA 70803, USA
- David W. Steadman
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Christopher C. Witt
- Department of Biology and Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Edward L. Braun
- Department of Biology, University of Florida, Gainesville, FL 32611, USA
| |
Collapse
|
17
|
Abstract
Gene structure data can substantially advance our understanding of metazoan evolution and deliver an independent approach to resolve conflicts among existing hypotheses. Here, we used changes of spliceosomal intron positions as a novel phylogenetic marker to reconstruct the animal tree. This kind of data is inferred from orthologous genes containing mutually exclusive introns at pairs of sequence positions in close proximity, so-called near intron pairs (NIPs). NIP data were collected for 48 species and utilized as binary genome-level characters in maximum parsimony (MP) analyses to reconstruct deep metazoan phylogeny. All groupings that were obtained with more than 80% bootstrap support are consistent with currently supported phylogenetic hypotheses. This includes monophyletic Chordata, Vertebrata, Nematoda, Platyhelminthes and Trochozoa. Several other clades such as Deuterostomia, Protostomia, Arthropoda, Ecdysozoa, Spiralia, and Eumetazoa, however, failed to be recovered due to a few problematic taxa such as the mite Ixodes and the warty comb jelly Mnemiopsis. The corresponding unexpected branchings can be explained by the paucity of synapomorphic changes of intron positions shared between some genomes, by the sensitivity of MP analyses to long-branch attraction (LBA), and by the very unequal evolutionary rates of intron loss and intron gain during evolution of the different subclades of metazoans. In addition, we obtained an assemblage of Cnidaria, Porifera, and Placozoa as sister group of Bilateria+Ctenophora with medium support, a disputable, but remarkable result. We conclude that NIPs can be used as phylogenetic characters also within a broader phylogenetic context, given that they have emerged regularly during evolution irrespective of the large variation of intron density across metazoan genomes.
Affiliation(s)
- Jörg Lehmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
|
18
|
Fong JJ, Brown JM, Fujita MK, Boussau B. A phylogenomic approach to vertebrate phylogeny supports a turtle-archosaur affinity and a possible paraphyletic lissamphibia. PLoS One 2012; 7:e48990. [PMID: 23145043 PMCID: PMC3492174 DOI: 10.1371/journal.pone.0048990] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 04/10/2012] [Accepted: 10/03/2012] [Indexed: 01/18/2023] Open
Abstract
In resolving the vertebrate tree of life, two fundamental questions remain: 1) what is the phylogenetic position of turtles within amniotes, and 2) what are the relationships between the three major lissamphibian (extant amphibian) groups? These relationships have historically been difficult to resolve, with five different hypotheses proposed for turtle placement, and four proposed branching patterns within Lissamphibia. We compiled a large cDNA/EST dataset for vertebrates (75 genes for 129 taxa) to address these outstanding questions. Gene-specific phylogenetic analyses revealed a great deal of variation in preferred topology, resulting in topologically ambiguous conclusions from the combined dataset. Due to consistent preferences for the same divergent topologies across genes, we suspected systematic phylogenetic error as a cause of some variation. Accordingly, we developed and tested a novel statistical method that identifies sites that have a high probability of containing biased signal for a specific phylogenetic relationship. After removing putatively biased sites, support emerged for a sister relationship between turtles and either crocodilians or archosaurs, as well as for a caecilian-salamander sister relationship within Lissamphibia, with Lissamphibia potentially paraphyletic.
Affiliation(s)
- Jonathan J Fong
- Museum of Vertebrate Zoology, University of California, Berkeley, CA, USA.
|
19
|
Li C, Matthes-Rosana KA, Garcia M, Naylor GJP. Phylogenetics of Chondrichthyes and the problem of rooting phylogenies with distant outgroups. Mol Phylogenet Evol 2012; 63:365-73. [PMID: 22300842 DOI: 10.1016/j.ympev.2012.01.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 09/02/2011] [Revised: 01/08/2012] [Accepted: 01/13/2012] [Indexed: 11/29/2022]
Abstract
Erroneous estimates of ingroup relationships can be caused by attributes in the outgroup chosen to root the tree. Phylogenetic analyses of DNA sequences frequently yield incorrect estimates of ingroup relationships when the outgroup used to "root" the tree is highly divergent from the ingroup. This is especially the case when the outgroup has a different base composition than the ingroup. Unfortunately, in many instances, alternative less divergent outgroups are not available. In such cases, investigators must either target genes with attributes that minimize the problem (slowly evolving genes with stationary base compositions--which are often not ideal for estimating relationships among the more closely related ingroup taxa) or use inference models that are explicitly tailored to deal with an attenuated historical signal with a superimposed non-stationary base composition. In this paper we explore the problem both empirically and through simulation. For the empirical component we looked at the phylogenetic relationships among elasmobranch fishes (sharks and rays), a group whose closest living outgroup, the holocephalan Ghost fishes, are separated from the elasmobranchs by more than 100 million years of evolution. We compiled a data set for analysis comprising 10 single-copy nuclear protein-coding genes (12,096 bp) for representatives of the major lineages within elasmobranchs and holocephalans. For the simulation, we used an evolutionary model on a fixed tree topology to generate DNA sequence data sets which varied both in their distance to the outgroup, and in their base compositional difference between ingroup and outgroup. Results from both the empirical data set and the simulation support the idea that deviation from base compositional stationarity, in conjunction with distance from the root, can act in concert to compromise accuracy of estimated relationships within the ingroup. We tested several approaches to mitigate such problems. We found that excluding genes with overall faster rates and heterogeneous base compositions, while the least sophisticated of the methods evaluated, seemed to be the most effective.
Affiliation(s)
- Chenhong Li
- School of Biological Sciences, University of Nebraska, Lincoln, NE 68588, USA
|
20
|
Nabhan AR, Sarkar IN. The impact of taxon sampling on phylogenetic inference: a review of two decades of controversy. Brief Bioinform 2012; 13:122-34. [PMID: 21436145 PMCID: PMC3251835 DOI: 10.1093/bib/bbr014] [Citation(s) in RCA: 122] [Impact Index Per Article: 10.2] [Received: 11/22/2010] [Revised: 02/25/2011] [Indexed: 11/13/2022] Open
Abstract
Over the past two decades, there has been a long-standing debate about the impact of taxon sampling on phylogenetic inference. Studies have been based on both real and simulated data sets, within actual and theoretical contexts, and using different inference methods, to study the impact of taxon sampling. In some cases, conflicting conclusions have been drawn for the same data set. The main questions explored in studies to date have been about the effects of using sparse data, adding new taxa, including more characters from genome sequences and using different (or concatenated) locus regions. These questions can be reduced to more fundamental ones about the assessment of data quality and the design guidelines of taxon sampling in phylogenetic inference experiments. This review summarizes progress to date in understanding the impact of taxon sampling on the accuracy of phylogenetic analysis.
Affiliation(s)
- Ahmed Ragab Nabhan
- Center for Clinical and Translational Science, 89 Beaumont Avenue, Given Courtyard N309, Burlington, VT 05405, USA
|
21
|
Gasnereau I, Herr P, Chia PZC, Basler K, Gleeson PA. Identification of an endocytosis motif in an intracellular loop of Wntless protein, essential for its recycling and the control of Wnt protein signaling. J Biol Chem 2011; 286:43324-33. [PMID: 22027831 DOI: 10.1074/jbc.m111.307231] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 11/06/2022] Open
Abstract
The secretion of Wnt signaling proteins is dependent upon the transmembrane sorting receptor, Wntless (Wls), which recycles between the trans-Golgi network and the cell surface. Loss of Wls results in impairment of Wnt secretion and defects in development and homeostasis in Drosophila, Caenorhabditis elegans, and the mouse. The sorting signals for the internalization and trafficking of Wls have not been defined. Here, we demonstrate that Wls internalization requires clathrin and dynamin I, components of the clathrin-mediated endocytosis pathway. Moreover, we have identified a conserved YXXϕ endocytosis motif in the third intracellular loop of the multipass membrane protein Wls. Mutation of the tyrosine-based motif YEGL to AEGL (Y425A) resulted in the accumulation of human mutant Wls on the cell surface of transfected HeLa cells. The cell surface accumulation of Wls(AEGL) was rescued by the insertion of a classical YXXϕ motif in the cytoplasmic tail. Significantly, a Drosophila Wls(AEGL) mutant displayed a wing notch phenotype, with reduced Wnt secretion and signaling. These findings demonstrate that YXXϕ endocytosis motifs can occur in the intracellular loops of multipass membrane proteins and, moreover, provide direct evidence that the trafficking of Wls is required for efficient secretion of Wnt signaling proteins.
Affiliation(s)
- Isabelle Gasnereau
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria 3010, Australia
|
22
|
Telford MJ, Copley RR. Improving animal phylogenies with genomic data. Trends Genet 2011; 27:186-95. [DOI: 10.1016/j.tig.2011.02.003] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 12/17/2010] [Revised: 02/08/2011] [Accepted: 02/09/2011] [Indexed: 02/04/2023]
|
23
|
Edgecombe GD, Giribet G, Dunn CW, Hejnol A, Kristensen RM, Neves RC, Rouse GW, Worsaae K, Sørensen MV. Higher-level metazoan relationships: recent progress and remaining questions. ORG DIVERS EVOL 2011. [DOI: 10.1007/s13127-011-0044-4] [Citation(s) in RCA: 206] [Impact Index Per Article: 15.8] [Indexed: 11/30/2022]
|
24
|
Staljanssens D, Azari EK, Christiaens O, Beaufays J, Lins L, Van Camp J, Smagghe G. The CCK(-like) receptor in the animal kingdom: functions, evolution and structures. Peptides 2011; 32:607-19. [PMID: 21167241 DOI: 10.1016/j.peptides.2010.11.025] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Received: 09/15/2010] [Revised: 11/27/2010] [Accepted: 11/30/2010] [Indexed: 01/09/2023]
Abstract
In this review, the cholecystokinin (CCK)(-like) receptors throughout the animal kingdom are compared on the level of physiological functions, evolutionary basis and molecular structure. In vertebrates, the CCK receptor is an important member of the G-protein coupled receptors as it is involved in the regulation of many physiological functions like satiety, gastrointestinal motility, gastric acid secretion, gall bladder contraction, pancreatic secretion, panic, anxiety and memory and learning processes. A homolog for this receptor is also found in nematodes and arthropods, called CK receptor and sulfakinin (SK) receptor, respectively. These receptors seem to have evolved from a common ancestor which is probably still closely related to the nematode CK receptor. The SK receptor is more closely related to the CCK receptor and seems to have similar functions. A molecular 3D-model for the CCK receptor type 1 has been built together with the docking of the natural ligands for the CCK and SK receptors in the CCK receptor type 1. These molecular models can help to study ligand-receptor interactions, which can in turn be useful in the development of new CCK(-like) receptor agonists and antagonists with beneficial health effects in humans or potential for pest control.
Affiliation(s)
- Dorien Staljanssens
- Department of Crop Protection, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
|
25
|
Gagnière N, Jollivet D, Boutet I, Brélivet Y, Busso D, Da Silva C, Gaill F, Higuet D, Hourdez S, Knoops B, Lallier F, Leize-Wagner E, Mary J, Moras D, Perrodou E, Rees JF, Segurens B, Shillito B, Tanguy A, Thierry JC, Weissenbach J, Wincker P, Zal F, Poch O, Lecompte O. Insights into metazoan evolution from Alvinella pompejana cDNAs. BMC Genomics 2010; 11:634. [PMID: 21080938 PMCID: PMC3018142 DOI: 10.1186/1471-2164-11-634] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 02/09/2010] [Accepted: 11/16/2010] [Indexed: 11/29/2022] Open
Abstract
Background Alvinella pompejana is a representative of Annelids, a key phylum for evo-devo studies that is still poorly studied at the sequence level. A. pompejana inhabits deep-sea hydrothermal vents and is currently known as one of the most thermotolerant Eukaryotes in marine environments, withstanding the largest known chemical and thermal ranges (from 5 to 105°C). This tube-dwelling worm forms dense colonies on the surface of hydrothermal chimneys and can withstand long periods of hypo/anoxia and long phases of exposure to hydrogen sulphides. A. pompejana specifically inhabits chimney walls of hydrothermal vents on the East Pacific Rise. To survive, Alvinella has developed numerous adaptations at the physiological and molecular levels, such as an increase in the thermostability of proteins and protein complexes. It represents an outstanding model organism for studying adaptation to harsh physicochemical conditions and for isolating stable macromolecules resistant to high temperatures. Results We have constructed four full length enriched cDNA libraries to investigate the biology and evolution of this intriguing animal. Analysis of more than 75,000 high quality reads led to the identification of 15,858 transcripts and 9,221 putative protein sequences. Our annotation reveals a good coverage of most animal pathways and networks with a prevalence of transcripts involved in oxidative stress resistance, detoxification, anti-bacterial defence, and heat shock protection. Alvinella proteins seem to show a slow evolutionary rate and a higher similarity with proteins from Vertebrates compared to proteins from Arthropods or Nematodes. Their composition shows enrichment in positively charged amino acids that might contribute to their thermostability. The gene content of Alvinella reveals that an important pool of genes previously considered to be specific to Deuterostomes were in fact already present in the last common ancestor of the Bilaterian animals, but have been secondarily lost in model invertebrates. This pool is enriched in glycoproteins that play a key role in intercellular communication, hormonal regulation and immunity. Conclusions Our study starts to unravel the gene content and sequence evolution of a deep-sea annelid, revealing key features in eukaryote adaptation to extreme environmental conditions and highlighting the proximity of Annelids and Vertebrates.
Affiliation(s)
- Nicolas Gagnière
- Department of Structural Biology and Genomics, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CERBM F-67400 Illkirch, France
|
26
|
Holton TA, Pisani D. Deep genomic-scale analyses of the metazoa reject Coelomata: evidence from single- and multigene families analyzed under a supertree and supermatrix paradigm. Genome Biol Evol 2010; 2:310-24. [PMID: 20624736 PMCID: PMC2997542 DOI: 10.1093/gbe/evq016] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 11/17/2022] Open
Abstract
Solving the phylogeny of the animals with bilateral symmetry has proven difficult. Morphological studies have suggested a variety of alternative hypotheses, of which, Hyman’s Coelomata hypothesis has become the most established. Studies based on 18S rRNA have failed to endorse Coelomata, supporting instead the rearrangement of the protostomes into two new clades: the Lophotrochozoa (including, e.g., the molluscs and the annelids) and the Ecdysozoa (including the Panarthropoda and most pseudocoelomates, such as the nematodes and priapulids). Support for this new animal phylogeny has been attained from expressed sequence tag studies, although these generally have a limited gene sampling. In contrast, deep genomic-scale analyses have often supported Coelomata. However, these studies are problematic due to their limited taxonomic sampling, which could exacerbate tree reconstruction artifacts. Here, we address both of these sampling limitations; we study the effect of long-branch attraction (LBA) in deep genomic-scale analyses and provide convincing evidence, using both single- and multigene families, that Coelomata is an artifact. We show that optimal outgroup selection is key in avoiding LBA and identify the use of inadequate outgroups as the reason previous deep genomic-scale analyses found strong support for Coelomata.
Affiliation(s)
- Thérèse A Holton
- Department of Biology, National University of Ireland, Maynooth, Maynooth Co. Kildare, Ireland
|
27
|
Krupovic M, Gribaldo S, Bamford DH, Forterre P. The evolutionary history of archaeal MCM helicases: a case study of vertical evolution combined with hitchhiking of mobile genetic elements. Mol Biol Evol 2010; 27:2716-32. [PMID: 20581330 DOI: 10.1093/molbev/msq161] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Indexed: 12/23/2022] Open
Abstract
Genes encoding DNA replication proteins have been frequently exchanged between cells and mobile elements, such as viruses or plasmids. This raises potential problems for reconstructing their history. Here, we combine phylogenetic and genomic context analyses to study the evolution of the replicative minichromosome maintenance (MCM) helicases in Archaea. Several archaeal genomes encode more than one copy of the mcm gene. Genome context analysis reveals that most of these additional copies are encoded within mobile elements. Exhaustive analysis of these elements reveals diverse groups of integrated archaeal plasmids or viruses, including several head-and-tail proviruses. Some MCMs encoded by mobile elements are structurally distinct from their cellular counterparts, with one case of novel domain organization. Both genome context and phylogenetic analysis indicate that MCMs encoded by mobile elements were recruited from cellular genomes. An accelerated evolution and a dramatic expansion of methanococcal MCMs suggest a host-to-virus-to-host transfer loop, possibly triggered by the loss of the archaeal initiator protein Cdc6 in Methanococcales. Surprisingly, despite extensive transfer of mcm genes between viruses, plasmids, and cells, the topology of the MCM tree is strikingly congruent with the consensus archaeal phylogeny, indicating that mobile elements encoding mcm have coevolved with their hosts and that DNA replication proteins can also be useful to reconstruct the history of the archaeal domain.
Affiliation(s)
- Mart Krupovic
- Department of Biosciences and Institute of Biotechnology, University of Helsinki, Helsinki, Finland
|
28
|
Giribet G. A new dimension in combining data? The use of morphology and phylogenomic data in metazoan systematics. ACTA ZOOL-STOCKHOLM 2010. [DOI: 10.1111/j.1463-6395.2009.00420.x] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
|
29
|
Belinky F, Cohen O, Huchon D. Large-scale parsimony analysis of metazoan indels in protein-coding genes. Mol Biol Evol 2009; 27:441-51. [PMID: 19864469 DOI: 10.1093/molbev/msp263] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
Insertions and deletions (indels) are considered to be rare evolutionary events, the analysis of which may resolve controversial phylogenetic relationships. Indeed, indel characters are often assumed to be less homoplastic than amino acid and nucleotide substitutions and, consequently, more reliable markers for phylogenetic reconstruction. In this study, we analyzed indels from over 1,000 metazoan orthologous genes. We studied the impact of different species sampling, ortholog data sets, lengths of included indels, and indel-coding methods on the resulting metazoan tree. Our results show that, similar to sequence substitutions, indels are homoplastic characters, and their analysis is sensitive to the long-branch attraction artifact. Furthermore, improving the taxon sampling and choosing a closely related outgroup greatly impact the phylogenetic inference. Our indel-based inferences support the Ecdysozoa hypothesis over the Coelomata hypothesis and suggest that sponges are a sister clade to other animals.
|
30
|
Takahashi T, McDougall C, Troscianko J, Chen WC, Jayaraman-Nagarajan A, Shimeld SM, Ferrier DEK. An EST screen from the annelid Pomatoceros lamarckii reveals patterns of gene loss and gain in animals. BMC Evol Biol 2009; 9:240. [PMID: 19781084 PMCID: PMC2762978 DOI: 10.1186/1471-2148-9-240] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 04/01/2009] [Accepted: 09/25/2009] [Indexed: 01/06/2023] Open
Abstract
Background Since the drastic reorganisation of the phylogeny of the animal kingdom into three major clades of bilaterians (Ecdysozoa, Lophotrochozoa and Deuterostomia), it has become glaringly obvious that the selection of model systems with extensive molecular resources was heavily biased towards only two of these three clades, namely the Ecdysozoa and Deuterostomia. Increasing efforts have been put towards redressing this imbalance in recent years, and one of the principal phyla in the vanguard of this endeavour is the Annelida. Results In the context of this effort we here report our characterisation of an Expressed Sequence Tag (EST) screen in the serpulid annelid, Pomatoceros lamarckii. We have sequenced over 5,000 ESTs which consolidate into over 2,000 sequences (clusters and singletons). These sequences are used to build phylogenetic trees to estimate relative branch lengths amongst different taxa and, by comparison to genomic data from other animals, patterns of gene retention and loss are deduced. Conclusion The molecular phylogenetic trees including the P. lamarckii sequences extend early observations that polychaetes tend to have relatively short branches in such trees, and hence are useful taxa with which to reconstruct gene family evolution. Also, with the availability of lophotrochozoan data such as that of P. lamarckii, it is now possible to make much more accurate reconstructions of the gene complement of the ancestor of the bilaterians than was previously possible from comparisons of ecdysozoan and deuterostome genomes to non-bilaterian outgroups. It is clear that the traditional molecular model systems for protostomes (e.g. Drosophila melanogaster and Caenorhabditis elegans), which are restricted to the Ecdysozoa, have undergone extensive gene loss during evolution. These ecdysozoan systems, in terms of gene content, are thus more derived from the bilaterian ancestral condition than lophotrochozoan systems like the polychaetes, and thus cannot be used as good, general representatives of protostome genomes. Currently sequenced insect and nematode genomes are less suitable models for deducing bilaterian ancestral states than lophotrochozoan genomes, despite the array of powerful genetic and mechanistic manipulation techniques in these ecdysozoans. A distinct category of genes that includes those present in non-bilaterians and lophotrochozoans, but which are absent from ecdysozoans and deuterostomes, highlights the need for further lophotrochozoan data to gain a more complete understanding of the gene complement of the bilaterian ancestor.
Affiliation(s)
- Tokiharu Takahashi
- Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, UK.
|
31
|
A 454 sequencing approach for large scale phylogenomic analysis of the common emperor scorpion (Pandinus imperator). Mol Phylogenet Evol 2009; 53:826-34. [PMID: 19695333 DOI: 10.1016/j.ympev.2009.08.014] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Received: 01/27/2009] [Revised: 08/02/2009] [Accepted: 08/04/2009] [Indexed: 10/20/2022]
Abstract
In recent years, phylogenetic tree reconstructions that rely on multiple gene alignments that had been deduced from expressed sequence tags (ESTs) have become a popular method in molecular systematics. Here, we present a 454 pyrosequencing approach to infer the transcriptome of the Emperor scorpion Pandinus imperator. We obtained 428,844 high-quality reads (mean length = 223 ± 50 b) from total cDNA, which were assembled into 8334 contigs (mean length 422 ± 313 bp) and 26,147 singletons. About 1200 contigs were successfully annotated by BLAST and orthology search. Specific analyses of eight distinct hemocyanin sequences provided further proof for the quality of the 454 reads and the assembly process. The P. imperator sequences were included in a concatenated alignment of 149 orthologous genes of 67 metazoan taxa that covers 39,842 amino acids. After removal of low-quality regions, 11,168 positions were employed for phylogenetic reconstructions. Using Bayesian and maximum likelihood methods, we obtained strongly supported monophyletic Ecdysozoa, Arthropoda (excluding Tardigrada), Euarthropoda, Pancrustacea and Hexapoda. We also recovered the Myriochelata (Chelicerata+Myriapoda). Within the chelicerates, Pycnogonida form the sister group of Euchelicerata. However, Arachnida were found paraphyletic because the Acari (mites and ticks) were recovered as sister group of a clade comprising Xiphosura, Scorpiones and Araneae. In summary, we have shown that 454 pyrosequencing is a cost-effective method that provides sufficient data and coverage depth for gene detection and multigene-based phylogenetic analyses.
|
32
|
Zarlenga DS, Gasbarre LC. From parasite genomes to one healthy world: Are we having fun yet? Vet Parasitol 2009; 163:235-49. [PMID: 19560277 DOI: 10.1016/j.vetpar.2009.06.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/20/2022]
Abstract
In 1990, the Human Genome Sequencing Project was established. This laid the groundwork for an explosion of sequence data that has since followed. As a result of this effort, the first complete genome of an animal, Caenorhabditis elegans, was published in 1998. The sequence of Drosophila melanogaster was made available in March 2000 and, in the following year, working drafts of the human genome were generated, with the completed sequence (92%) being released in 2003. Recent advancements and next-generation technologies have made sequencing commonplace and have infiltrated every aspect of biological research, including parasitology. To date, sequencing of 32 apicomplexa and 24 nematode genomes is either in progress or near completion, and over 600k nematode EST and 200k apicomplexa EST submissions fill the databases. However, the winds have shifted and efforts are now refocusing on how best to store, mine and apply these data to problem solving. Herein we tend not to summarize existing X-omics datasets or present new technological advances that promise future benefits. Rather, the information to follow condenses up-to-date applications of existing technologies to problem solving as it relates to parasite research. Advancements in non-parasite systems are also presented with the proviso that applications to parasite research are in the making.
Affiliation(s)
- Dante S Zarlenga
- USDA, ARS, ANRI Animal Parasitic Diseases Laboratory, Beltsville, MD 20705, USA.
|
33
|
Bleidorn C, Podsiadlowski L, Zhong M, Eeckhaut I, Hartmann S, Halanych KM, Tiedemann R. On the phylogenetic position of Myzostomida: can 77 genes get it wrong? BMC Evol Biol 2009; 9:150. [PMID: 19570199 PMCID: PMC2716322 DOI: 10.1186/1471-2148-9-150] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 01/28/2009] [Accepted: 07/01/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Phylogenomic analyses recently became popular to address questions about deep metazoan phylogeny. Ribosomal proteins (RP) dominate many of these analyses or are, in some cases, the only genes included. Despite initial hopes, phylogenomic analyses including tens to hundreds of genes still fail to robustly place many bilaterian taxa. RESULTS Using the phylogenetic position of myzostomids as an example, we show that phylogenies derived from RP genes and mitochondrial genes produce incongruent results. Whereas the former support a position within a clade of platyzoan taxa, mitochondrial data recovers an annelid affinity, which is strongly supported by the gene order data and is congruent with morphology. Using hypothesis testing, our RP data significantly rejects the annelid affinity, whereas a platyzoan relationship is significantly rejected by the mitochondrial data. CONCLUSION We conclude (i) that reliance on a set of markers belonging to a single class of macromolecular complexes might bias the analysis, and (ii) that concatenation of all available data might introduce conflicting signal into phylogenetic analyses. We therefore strongly recommend testing for data incongruence in phylogenomic analyses. Furthermore, judging all available data, we consider the annelid affinity hypothesis more plausible than a possible platyzoan affinity for myzostomids, and suspect long branch attraction is influencing the RP data. However, this hypothesis needs further confirmation by future analyses.
Affiliation(s)
- Christoph Bleidorn
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Strasse 24-25, Haus 26, D-14476 Potsdam-Golm, Germany
- Lars Podsiadlowski
- Institute of Evolutionary Biology and Ecology, Rheinische Friedrich-Wilhelms-Universität Bonn, An der Immenburg 1, D-53121 Bonn, Germany
- Min Zhong
- Department of Biological Sciences, Auburn University, 101 Life Science Building, AL 36849, USA
- Igor Eeckhaut
- Marine Biology Laboratory, Natural Sciences Building, University of Mons-Hainaut, Av. Champs de Mars 6, B-7000 Mons, Belgium
- Stefanie Hartmann
- Unit of Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Strasse 24-25, Haus 26, D-14476 Potsdam-Golm, Germany
- Kenneth M Halanych
- Department of Biological Sciences, Auburn University, 101 Life Science Building, AL 36849, USA
- Ralph Tiedemann
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Strasse 24-25, Haus 26, D-14476 Potsdam-Golm, Germany
|
34
|
Michelle C, Vourc'h P, Mignon L, Andres CR. What was the set of ubiquitin and ubiquitin-like conjugating enzymes in the eukaryote common ancestor? J Mol Evol 2009; 68:616-28. [PMID: 19452197 PMCID: PMC2691932 DOI: 10.1007/s00239-009-9225-6] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.7] [Received: 10/20/2008] [Revised: 03/06/2009] [Accepted: 03/17/2009] [Indexed: 11/03/2022]
Abstract
Ubiquitin (Ub)-conjugating enzymes (E2) are key enzymes in ubiquitination or Ub-like modifications of proteins. We searched for all proteins belonging to the E2 enzyme super-family in seven species (Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Schizosaccharomyces pombe, Saccharomyces cerevisiae, and Arabidopsis thaliana) to identify families and to reconstruct each family’s phylogeny. Our phylogenetic analysis of 207 genes led us to define 17 E2 families, with 37 E2 genes, in the human genome. The subdivision of E2 into four classes did not correspond to the phylogenetic tree. The sequence signature HPN (histidine–proline–asparagine), followed by a tryptophan residue at 16 (up to 29) amino acids, was highly conserved. When present, the active cysteine was found 7 to 8 amino acids from the C-terminal end of HPN. The secondary structures were characterized by a canonical alpha/beta fold. Only family 10 deviated from the common organization because the proteins were devoid of enzymatic activity. Family 7 had an insertion between beta strands 1 and 2; families 3, 5 and 14 had an insertion between the active cysteine and the conserved tryptophan. The three-dimensional data of these proteins highlight a strong structural conservation of the core domain. Our analysis shows that the primitive eukaryote ancestor possessed a diversified set of E2 enzymes, thus emphasizing the importance of the Ub pathway. This comprehensive overview of E2 enzymes emphasizes the diversity and evolution of this superfamily and helps clarify the nomenclature and true orthologies. A better understanding of the functions of these enzymes is necessary to decipher several human diseases.
Affiliation(s)
- Caroline Michelle
- Faculté de Médecine, Génétique de l'Autisme et des Déficiences Mentales, INSERM U930, Université François Rabelais, 10, boulevard Tonnellé, BP 3223, 37032, Tours, France
|
35
|
Abstract
Contemporary protein architectures can be regarded as molecular fossils, historical imprints that mark important milestones in the history of life. Whereas sequences change at a considerable pace, higher-order structures are constrained by the energetic landscape of protein folding, the exploration of sequence and structure space, and complex interactions mediated by the proteostasis and proteolytic machineries of the cell. The survey of architectures in the living world that was fuelled by recent structural genomic initiatives has been summarized in protein classification schemes, and the overall structure of fold space explored with novel bioinformatic approaches. However, metrics of general structural comparison have not yet unified architectural complexity using the 'shared and derived' tenet of evolutionary analysis. In contrast, a shift of focus from molecules to proteomes and a census of protein structure in fully sequenced genomes were able to uncover global evolutionary patterns in the structure of proteins. Timelines of discovery of architectures and functions unfolded episodes of specialization, reductive evolutionary tendencies of architectural repertoires in proteomes and the rise of modularity in the protein world. They revealed a biologically complex ancestral proteome and the early origin of the archaeal lineage. Studies also identified an origin of the protein world in enzymes of nucleotide metabolism harbouring the P-loop-containing triphosphate hydrolase fold and the explosive discovery of metabolic functions that recapitulated well-defined prebiotic shells and involved the recruitment of structures and functions. These observations have important implications for origins of modern biochemistry and diversification of life.
|
36
|
Alekseyenko AV, Lee CJ, Suchard MA. Wagner and Dollo: a stochastic duet by composing two parsimonious solos. Syst Biol 2008; 57:772-84. [PMID: 18853363 DOI: 10.1080/10635150802434394] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 10/21/2022] Open
Abstract
New contributions toward generalizing evolutionary models expand greatly our ability to analyze complex evolutionary characters and advance phylogeny reconstruction. In this article, we extend the binary stochastic Dollo model to allow for multi-state characters. In doing so, we align previously incompatible Wagner and Dollo parsimony principles under a common probabilistic framework by embedding arbitrary continuous-time Markov chains into the binary stochastic Dollo model. This approach enables us to analyze character traits that exhibit both Dollo and Wagner characteristics throughout their evolutionary histories. Utilizing Bayesian inference, we apply our novel model to analyze intron conservation patterns and the evolution of alternatively spliced exons. The generalized framework we develop demonstrates potential in distinguishing between phylogenetic hypotheses and providing robust estimates of evolutionary rates. Moreover, for the two applications analyzed here, our framework is the first to provide an adequate stochastic process for the data. We discuss possible extensions to the framework from both theoretical and applied perspectives.
Affiliation(s)
- Alexander V Alekseyenko
- Department of Biomathematics, David Geffen School of Medicine at UCLA, Los Angeles, California 90095, USA.
|
37
|
Nunes A, Nogueira PJ, Borrego MJ, Gomes JP. Chlamydia trachomatis diversity viewed as a tissue-specific coevolutionary arms race. Genome Biol 2008; 9:R153. [PMID: 18947394 PMCID: PMC2760880 DOI: 10.1186/gb-2008-9-10-r153] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 07/28/2008] [Revised: 09/26/2008] [Accepted: 10/23/2008] [Indexed: 01/13/2023] Open
Abstract
Analysis of 15 serovars of Chlamydia trachomatis reveals an evolutionary arms race in pathogen-host interactions. Background The genomes of pathogens are thought to have evolved under selective pressure provided by the host in a coevolutionary arms race (the 'Red Queen's Hypothesis'). Traditionally, adaptation by pathogens is thought to rely not on whole chromosome dynamics but on gain/loss of specific genes, yielding differential abilities to infect distinct tissues. Thus, it is not known whether distinct host organs differently shape the genome of the same pathogen. We tested this hypothesis using Chlamydia trachomatis as model species, looking at 15 serovars that infect different organs: eyes, genitalia and lymph nodes. Results We analyzed over 51,000 base pairs from all serovars using various phylogenetic approaches and a non-phylogenetic indel-based algorithm to study the evolution of individual and concatenated loci. This survey comprised about 33% of all single nucleotide polymorphisms in C. trachomatis chromosomes. We present a model in which genome evolution indeed correlates with the cell type (epithelial versus lymph cells) and organ (eyes versus genitalia) that a serovar infects, illustrating an adaptation to physiologically distinct niches, and discarding genetic drift as the dominant evolutionary driving force. We show that radiation of serovars occurred primarily by accumulation of single nucleotide polymorphisms in intergenomic regions, housekeeping genes, and genes encoding hypothetical and cell envelope proteins. Furthermore, serovar evolution also correlates with ecological success, as the two most successful serovars showed a parallel evolution. Conclusion We identified a single nucleotide polymorphism-based tissue-specific arms race for strains in the same species, reflecting global chromosomal dynamics. Studying such tissue-specific arms race scenarios is crucial for understanding pathogen-host interactions during the course of infectious diseases, in order to dissect pathogen biology and develop preventive and therapeutic strategies.
Affiliation(s)
- Alexandra Nunes
- Department of Infectious Diseases, National Institute of Health, Avenida Padre Cruz, Lisbon, Portugal
|
38
|
Marlétaz F, Le Parco Y. Careful with understudied phyla: the case of chaetognath. BMC Evol Biol 2008; 8:251. [PMID: 18798978 PMCID: PMC2566580 DOI: 10.1186/1471-2148-8-251] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 11/23/2007] [Accepted: 09/17/2008] [Indexed: 11/10/2022] Open
Abstract
Background A recent study by Barthélémy et al. described a set of ribosomal protein (RP) genes extracted from a collection of expressed sequence tags (ESTs) of the chaetognath (arrow worm) Spadella cephaloptera. Three main conclusions were drawn in this paper. First, the authors stated that RP genes present paralogous copies, which have arisen through allopolyploidization. Second, they reported two alternate nucleotide stretches conserved within the 5' untranslated regions (UTR) of multiple ribosomal cDNAs and they suggested that these motifs are involved in the differential transcriptional regulation of paralogous RP genes. Third, they claimed that the phylogenetic position of chaetognaths could not be accurately inferred from an RP dataset because of the persistence of two problems: a long branch attraction (LBA) artefact and a compositional bias. Results We reconsider here the results described in Barthélémy et al. and question the evidence on which they are based. We find that their evidence for paralogous copies relies on faulty PCR experiments since they attempted to amplify DNA fragments absent from the genomic template. Our PCR experiments proved that the conserved motifs in 5'UTRs that they targeted in their amplifications are added post-transcriptionally by a trans-splicing mechanism. Then, we showed that the lack of phylogenetic resolution observed by these authors is due to limited taxon sampling and not to LBA or to compositional bias. A ribosomal protein dataset thus fully supports the position of chaetognaths as sister group of all other protostomes. This reinterpretation demonstrates that the statements of Barthélémy et al. should be taken with caution because they rely on inaccurate evidence. Conclusion The genomic study of an unconventional model organism is a meaningful approach to understand the evolution of animals. However, the previous study came to incorrect conclusions on the basis of experiments that omitted validation procedures.
Affiliation(s)
- Ferdinand Marlétaz
- Station Marine d'Endoume, CNRS UMR 6540 DIMAR, Centre d'Océanologie de Marseille, Université de Méditerranée, Marseille, France.
|
39
|
Lartillot N, Philippe H. Improvement of molecular phylogenetic inference and the phylogeny of Bilateria. Philos Trans R Soc Lond B Biol Sci 2008; 363:1463-72. [PMID: 18192187 DOI: 10.1098/rstb.2007.2236] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.5] [Indexed: 11/12/2022] Open
Abstract
Inferring the relationships among Bilateria has been an active and controversial research area since Haeckel. The lack of a sufficient number of phylogenetically reliable characters was the main limitation of traditional phylogenies based on morphology. With the advent of molecular data, this problem has been replaced by another one, statistical inconsistency, which stems from an erroneous interpretation of convergences induced by multiple changes. The analysis of alignments rich in both genes and species, combined with a probabilistic method (maximum likelihood or Bayesian) using sophisticated models of sequence evolution, should alleviate these two major limitations. We applied this approach to a dataset of 94 genes and 79 species using CAT, a previously developed model accounting for site-specific amino acid replacement patterns. The resulting tree is in good agreement with current knowledge: the monophyly of most major groups (e.g. Chordata, Arthropoda, Lophotrochozoa, Ecdysozoa, Protostomia) was recovered with high support. Two results are surprising and are discussed in an evo-devo framework: the sister-group relationship of Platyhelminthes and Annelida to the exclusion of Mollusca, contradicting the Neotrochozoa hypothesis, and, with a lower statistical support, the paraphyly of Deuterostomia. These results, in particular the status of deuterostomes, need further confirmation, both through increased taxonomic sampling, and future improvements of probabilistic models.
Affiliation(s)
- Nicolas Lartillot
- Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, CNRS-Université de Montpellier 2, 34392 Montpellier Cedex 5, France
|
40
|
Telford MJ, Bourlat SJ, Economou A, Papillon D, Rota-Stabelli O. The evolution of the Ecdysozoa. Philos Trans R Soc Lond B Biol Sci 2008; 363:1529-37. [PMID: 18192181 DOI: 10.1098/rstb.2007.2243] [Citation(s) in RCA: 169] [Impact Index Per Article: 10.6] [Indexed: 11/12/2022] Open
Abstract
Ecdysozoa is a clade composed of eight phyla: the arthropods, tardigrades and onychophorans that share segmentation and appendages and the nematodes, nematomorphs, priapulids, kinorhynchs and loriciferans, which are worms with an anterior proboscis or introvert. Ecdysozoa contains the vast majority of animal species and there is a great diversity of body plans among both living and fossil members. The monophyly of the clade has been called into question by some workers based on analyses of whole genome datasets. We review the evidence that now conclusively supports the unique origin of these phyla. Relationships within Ecdysozoa are also controversial and we discuss the molecular and morphological evidence for a number of monophyletic groups within this superphylum.
|
41
|
Peregrín-Alvarez JM, Parkinson J. The global landscape of sequence diversity. Genome Biol 2008; 8:R238. [PMID: 17996061 PMCID: PMC2258180 DOI: 10.1186/gb-2007-8-11-r238] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 05/25/2007] [Revised: 10/18/2007] [Accepted: 11/08/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Systematic comparisons between genomic sequence datasets have revealed a wide spectrum of sequence specificity from sequences that are highly conserved to those that are specific to individual species. Due to the limited number of fully sequenced eukaryotic genomes, analyses of this spectrum have largely focused on prokaryotes. Combining existing genomic datasets with the partial genomes of 193 eukaryotes derived from collections of expressed sequence tags, we performed a quantitative analysis of the sequence specificity spectrum to provide a global view of the origins and extent of sequence diversity across the three domains of life. RESULTS Comparisons with prokaryotic datasets reveal a greater genetic diversity within eukaryotes that may be related to differences in modes of genetic inheritance. Mapping this diversity within a phylogenetic framework revealed that the majority of sequences are either highly conserved or specific to the species or taxon from which they derive. Between these two extremes, several evolutionary landmarks consisting of large numbers of sequences conserved within specific taxonomic groups were identified. For example, 8% of sequences derived from metazoan species are specific and conserved within the metazoan lineage. Many of these sequences likely mediate metazoan specific functions, such as cell-cell communication and differentiation. CONCLUSION Through the use of partial genome datasets, this study provides a unique perspective of sequence conservation across the three domains of life. The provision of taxon restricted sequences should prove valuable for future computational and biochemical analyses aimed at understanding evolutionary and functional relationships.
Affiliation(s)
- José Manuel Peregrín-Alvarez
- Molecular Structure and Function, Hospital for Sick Children, 555 University Avenue, Toronto, ON M5G 1X8, Canada.
|
42
|
Rokas A, Chatzimanolis S. From gene-scale to genome-scale phylogenetics: the data flood in, but the challenges remain. Methods Mol Biol 2008; 422:1-12. [PMID: 18629657 DOI: 10.1007/978-1-59745-581-7_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 05/06/2023]
Abstract
An important goal of phylogenetics is to be able to consistently and accurately reconstruct the historical patterns of cladogenesis among major organismic groups. Gene-scale phylogenetics is insufficient to attain this goal owing to the presence of poor resolution and incongruence in single- and few-gene phylogenies. The increasing availability of genome-scale amounts of data promises to overcome the insufficiency of gene-scale phylogenetics and uncover the genealogical tapestry uniting all living organisms with unprecedented accuracy. Here, we argue that a vast increase in data size alone, although necessary, may not be sufficient to achieve the desired accuracy for three reasons: (i) the existence of short stems in the tree of life, (ii) the saturation of phylogenetic signal in molecular sequences, and (iii) the effect of systematic error on phylogenetic inference. Devising strategies to ameliorate the effect of such challenges on sequence evolution will be critical to the success of current efforts to reconstruct the tree of life.
Affiliation(s)
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
|
43
|
Janssen T, Meelkop E, Lindemans M, Verstraelen K, Husson SJ, Temmerman L, Nachman RJ, Schoofs L. Discovery of a cholecystokinin-gastrin-like signaling system in nematodes. Endocrinology 2008; 149:2826-39. [PMID: 18339709 DOI: 10.1210/en.2007-1772] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Indexed: 11/19/2022]
Abstract
Members of the cholecystokinin (CCK)/gastrin family of peptides, including the arthropod sulfakinins, and their cognate receptors, play an important role in the regulation of feeding behavior and energy homeostasis. Despite many efforts after the discovery of CCK/gastrin immunoreactivity in nematodes 23 yr ago, the identity of these nematode CCK/gastrin-related peptides has remained a mystery ever since. The Caenorhabditis elegans genome contains two genes with high identity to the mammalian CCK receptors and their invertebrate counterparts, the sulfakinin receptors. By using the potential C. elegans CCK receptors as a fishing hook, we have isolated and identified two CCK-like neuropeptides encoded by neuropeptide-like protein-12 (nlp-12) as the endogenous ligands of these receptors. The neuropeptide-like protein-12 peptides have a very limited neuronal expression pattern, seem to occur in vivo in the unsulfated form, and react specifically with a human CCK-8 antibody. Both receptors and ligands share a high degree of structural similarity with their vertebrate and arthropod counterparts, and also display similar biological activities with respect to digestive enzyme secretion and fat storage. Our data indicate that the gastrin-CCK signaling system was already well established before the divergence of protostomes and deuterostomes.
Affiliation(s)
- Tom Janssen
- Functional Genomics and Proteomics Unit, Department of Biology, Katholieke Universiteit Leuven, Naamsestraat 59, B-3000 Leuven, Belgium.
|
44
|
Abstract
The advent of numerical methods for analysing phylogenetic relationships, along with the study of morphology and molecular data, has driven our understanding of animal relationships for the past three decades. Within the protostome branch of the animal tree of life, these data have sufficed to establish its two main side branches, the moulting Ecdysozoa and the non-moulting Lophotrochozoa. In this review, I explore our current knowledge of protostome relationships and discuss progress and future perspectives and strategies to increase resolution within the main lophotrochozoan clades. Novel approaches to coding morphological characters are needed by scoring real observations on species selected as terminals. Still, methodological issues, for example, how to deal with inapplicable characters or the coding of absences, may require novel algorithmic developments. Taxon sampling is another key issue, as phyla should include enough species so as to represent their span of anatomical disparity. On the molecular side, phylogenomics is playing an increasingly important role in elucidating animal relationships, but genomic sampling is still fairly limited within the lophotrochozoan protostomes, for which only three phyla are represented in currently available phylogenies. Future work should therefore concentrate on generating novel morphological observations and on producing genomic data for the lophotrochozoan side of the animal tree of life.
Affiliation(s)
- Gonzalo Giribet
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA.
|
45
|
Ewing GB, Ebersberger I, Schmidt HA, von Haeseler A. Rooted triple consensus and anomalous gene trees. BMC Evol Biol 2008; 8:118. [PMID: 18439266 PMCID: PMC2409437 DOI: 10.1186/1471-2148-8-118] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Received: 11/28/2007] [Accepted: 04/25/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Anomalous gene trees (AGTs) are gene trees whose topology differs from that of the species tree yet that are more probable than congruent gene trees. In this paper we propose a rooted triple approach to finding the correct species tree in the presence of AGTs. RESULTS Based on simulated data we show that our method outperforms the extended majority rule consensus strategy, while still resolving the species tree. Applying both methods to a metazoan data set of 216 genes, we tested whether AGTs substantially interfere with the reconstruction of the metazoan phylogeny. CONCLUSION Evidence of AGTs was not found in this data set, suggesting that erroneously reconstructed gene trees are the most significant challenge in the reconstruction of phylogenetic relationships among species with current data. The new method does however rule out the erroneous reconstruction of deep or poorly resolved splits in the presence of lineage sorting.
Affiliation(s)
- Gregory B Ewing
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, Dr. Bohr Gasse 9, A-1030 Vienna, Austria.
|
46
|
Rogozin IB, Thomson K, Csürös M, Carmel L, Koonin EV. Homoplasy in genome-wide analysis of rare amino acid replacements: the molecular-evolutionary basis for Vavilov's law of homologous series. Biol Direct 2008; 3:7. [PMID: 18346278 PMCID: PMC2292158 DOI: 10.1186/1745-6150-3-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 02/20/2008] [Accepted: 03/17/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Rare genomic changes (RGCs) that are thought to comprise derived shared characters of individual clades are becoming an increasingly important class of markers in genome-wide phylogenetic studies. Recently, we proposed a new type of RGCs designated RGC_CAMs (after Conserved Amino acids-Multiple substitutions) that were inferred using genome-wide identification of amino acid replacements that were: i) located in unambiguously aligned regions of orthologous genes, ii) shared by two or more taxa in positions that contain a different, conserved amino acid in a much broader range of taxa, and iii) require two or three nucleotide substitutions. When applied to animal phylogeny, the RGC_CAM approach supported the coelomate clade that unites deuterostomes with arthropods as opposed to the ecdysozoan (molting animals) clade. However, a non-negligible level of homoplasy was detected. RESULTS We provide a direct estimate of the level of homoplasy caused by parallel changes and reversals among the RGC_CAMs using 462 alignments of orthologous genes from 19 eukaryotic species. It is shown that the impact of parallel changes and reversals on the results of phylogenetic inference using RGC_CAMs cannot explain the observed support for the Coelomata clade. In contrast, the evidence in support of the Ecdysozoa clade, in large part, can be attributed to parallel changes. It is demonstrated that parallel changes are significantly more common in internal branches of different subtrees that are separated from the respective common ancestor by relatively short times than in terminal branches separated by longer time intervals. A similar but much weaker trend was detected for reversals. The observed evolutionary trend of parallel changes is explained in terms of the covarion model of molecular evolution. As the overlap between the covarion sets in orthologous genes from different lineages decreases with time after divergence, the likelihood of parallel changes decreases as well. 
CONCLUSION The level of homoplasy observed here appears to be low enough to justify the utility of RGC_CAMs and other types of RGCs for resolution of hard problems in phylogeny. Parallel changes, one of the major classes of events leading to homoplasy, occur much more often in relatively recently diverged lineages than in those separated from their last common ancestor by longer time intervals. This pattern seems to provide the molecular-evolutionary underpinning of Vavilov's law of homologous series and is readily interpreted within the framework of the covarion model of molecular evolution.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Karen Thomson
- Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA
- Miklós Csürös
- Department of Computer Science and Operations Research, Université de Montréal, Montréal, Québec H3C 3J7, Canada
- Liran Carmel
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
47
|
Roy SW, Irimia M. Rare Genomic Characters Do Not Support Coelomata: RGC_CAMs. J Mol Evol 2008; 66:308-15. [DOI: 10.1007/s00239-008-9077-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 11/08/2007] [Revised: 12/21/2007] [Accepted: 01/25/2008] [Indexed: 11/29/2022]
|
48
|
Huerta-Cepas J, Dopazo H, Dopazo J, Gabaldón T. The human phylome. Genome Biol 2008; 8:R109. [PMID: 17567924 PMCID: PMC2394744 DOI: 10.1186/gb-2007-8-6-r109] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.0] [Received: 11/30/2006] [Revised: 03/16/2007] [Accepted: 06/13/2007] [Indexed: 01/09/2023] Open
Abstract
The human phylome, which includes evolutionary relationships of all human proteins and their homologs among thirty-nine fully sequenced eukaryotes, is reconstructed. Background: Phylogenomics analyses serve to establish evolutionary relationships among organisms and their genes. A phylome, the complete collection of all gene phylogenies in a genome, constitutes a valuable source of information, but its use in large genomes still constitutes a technical challenge. The use of phylomes also requires the development of new methods that help us to interpret them. Results: We reconstruct here the human phylome, which includes the evolutionary relationships of all human proteins and their homologs among 39 fully sequenced eukaryotes. Phylogenetic techniques used include alignment trimming, branch length optimization, evolutionary model testing and maximum likelihood and Bayesian methods. Although differences with alternative topologies are minor, most of the trees support the Coelomata and Unikont hypotheses as well as the grouping of primates with Laurasiatheria to the exclusion of rodents. We assess the extent of gene duplication events and their relationship with the functional roles of the protein families involved. We find support for at least one, and probably two, rounds of whole genome duplications before vertebrate radiation. Using a novel algorithm that is independent of a species phylogeny, we derive orthology and paralogy relationships of human proteins among eukaryotic genomes. Conclusion: Topological variations among phylogenies for different genes are to be expected, highlighting the danger of gene-sampling effects in phylogenomic analyses. Several links can be established between the functions of gene families duplicated at certain phylogenetic splits and major evolutionary transitions in those lineages. The pipeline implemented here can be easily adapted for use in other organisms.
Affiliation(s)
- Jaime Huerta-Cepas
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Hernán Dopazo
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Joaquín Dopazo
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
- Toni Gabaldón
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, Autopista del Saler, 46013 Valencia, Spain
|
49
|
Kosarek JN, Woodruff RV, Rivera-Begeman A, Guo C, D'Souza S, Koonin EV, Walker GC, Friedberg EC. Comparative analysis of in vivo interactions between Rev1 protein and other Y-family DNA polymerases in animals and yeasts. DNA Repair (Amst) 2008; 7:439-51. [PMID: 18242152 DOI: 10.1016/j.dnarep.2007.11.016] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 10/05/2007] [Revised: 11/26/2007] [Accepted: 11/28/2007] [Indexed: 10/22/2022]
Abstract
Eukaryotes are endowed with multiple specialized DNA polymerases, some (if not all) of which are believed to play important roles in the tolerance of base damage during DNA replication. Among these DNA polymerases, Rev1 protein (a deoxycytidyl transferase) from vertebrates interacts with several other specialized polymerases via a highly conserved C-terminal region. The present studies assessed whether these interactions are retained in more experimentally tractable model systems, including yeasts, flies, and the nematode C. elegans. We observed a physical interaction between Rev1 protein and other Y-family polymerases in the fruit fly Drosophila melanogaster. However, despite the fact that the C-terminal regions of Drosophila and yeast Rev1 are conserved from vertebrates to a similar extent, such interactions were not observed in Saccharomyces cerevisiae or Schizosaccharomyces pombe. With respect to regions in specialized DNA polymerases that are required for interaction with Rev1, we find predicted disorder to be an underlying structural commonality. The results of this study suggest that special consideration should be exercised when making mechanistic extrapolations regarding translesion DNA synthesis from one eukaryotic system to another.
Affiliation(s)
- J Nicole Kosarek
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9072, USA
|
50
|
Basu MK, Carmel L, Rogozin IB, Koonin EV. Evolution of protein domain promiscuity in eukaryotes. Genome Res 2008; 18:449-61. [PMID: 18230802 DOI: 10.1101/gr.6943508] [Citation(s) in RCA: 135] [Impact Index Per Article: 8.4] [Indexed: 11/25/2022]
Abstract
Numerous eukaryotic proteins contain multiple domains. Certain domains show a tendency to occur in diverse domain architectures and can be considered "promiscuous." These promiscuous domains are, typically, involved in protein-protein interactions and play crucial roles in interaction networks, particularly those that contribute to signal transduction. A systematic comparative-genomic analysis of promiscuous domains in eukaryotes is described. Two quantitative measures of domain promiscuity are introduced and applied to the analysis of 28 genomes of diverse eukaryotes. Altogether, 215 domains are identified as strongly promiscuous. The fraction of promiscuous domains in animals is shown to be significantly greater than that in fungi or plants. Evolutionary reconstructions indicate that domain promiscuity is a volatile, relatively fast-changing feature of eukaryotic proteins, with few domains remaining promiscuous throughout the evolution of eukaryotes. Some domains appear to have attained promiscuity independently in different lineages, for example, animals and plants. It is proposed that promiscuous domains persist within a relatively small pool of evolutionarily stable domain combinations from which numerous rare architectures emerge during evolution. Domain promiscuity positively correlates with the number of experimentally detected domain interactions and with the strength of purifying selection affecting a domain. Thus, evolution of promiscuous domains seems to be constrained by the diversity of their interaction partners. The set of promiscuous domains is enriched for domains mediating protein-protein interactions that are involved in various forms of signal transduction, especially in the ubiquitin system and in chromatin. Thus, a limited repertoire of promiscuous domains makes a major contribution to the diversity and evolvability of eukaryotic proteomes and signaling networks.
Affiliation(s)
- Malay Kumar Basu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|