51
Abstract
A universal Tree of Life has been a longstanding goal of the biosciences. The most common Tree of Life, based on the small subunit rRNA gene, may or may not represent the phylogenetic history of microorganisms. The horizontal transfer of genes from one taxon to another provides a means by which each gene may tell of an independent history. When complete genomes became available, the extent to which horizontal gene transfer (HGT) has occurred became more evident. When using genomic data to study the Tree of Life, one can use any of four broad approaches: (i) build lots of individual gene trees ("phylogenomics"), (ii) concatenate genes together for an analysis yielding one "supergene" tree, (iii) form a single tree based on the "gene content" within genomes using either orthologs or homologs, or (iv) investigate the order of genes within genomes to discern some aspects of microbial evolution. The application of whole genome tree building has suggested that there is a core tree, that such a core tree can be investigated using these varied methods, and that the results are largely similar to those of the rRNA universal Tree of Life. Some of the most interesting features of the rRNA tree, such as early diverging hyperthermophilic lineages, are still uncertain but remain a possibility. Genomic trees and geologic evidence together suggest that the vertical descent of genes and the horizontal transfer of genes between genetically similar lineages ultimately result in a core Tree of Life with at least some lineages that have phenotypic characteristics recognizable for billions of years.
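Approach (iii), a tree built from gene content, is commonly illustrated with a pairwise distance computed from the presence or absence of gene families. The following is a minimal Python sketch under stated assumptions: the Jaccard distance and the toy genomes (`genome_A`, `fam1`, and so on) are hypothetical illustrations, not the measure or data used in the paper.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two sets of gene-family identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def gene_content_distances(genomes):
    """Compute the distance for every unordered pair of genomes."""
    return {
        (x, y): jaccard_distance(genomes[x], genomes[y])
        for x, y in combinations(sorted(genomes), 2)
    }


# Hypothetical toy data: each genome is the set of gene families it contains.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3", "fam5"},
    "genome_C": {"fam1", "fam6", "fam7", "fam8"},
}

distances = gene_content_distances(toy_genomes)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A distance matrix of this kind could then be passed to any distance-based tree-building method; which method is appropriate is outside the scope of this sketch.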
Affiliation(s)
- Christopher H House
- Department of Geosciences and Pennsylvania State Astrobiology Research Center, Pennsylvania State University, University Park, PA, USA
52
Beiko RG, Doolittle WF, Charlebois RL. The Impact of Reticulate Evolution on Genome Phylogeny. Syst Biol 2008; 57:844-56. [DOI: 10.1080/10635150802559265] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 10/21/2022] Open
Affiliation(s)
- Robert G. Beiko
- Faculty of Computer Science, Dalhousie University, and Institute for Molecular Bioscience/ARC Centre for Bioinformatics, Brisbane, Australia
- W. Ford Doolittle
- Genome Atlantic, Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Robert L. Charlebois
- Genome Atlantic, Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
53
Matsuoka MP, Infante C, Reith M, Cañavate JP, Douglas SE, Manchado M. Translational machinery of Senegalese sole (Solea senegalensis Kaup) and Atlantic halibut (Hippoglossus hippoglossus L.): comparative sequence analysis of the complete set of 60s ribosomal proteins and their expression. MARINE BIOTECHNOLOGY (NEW YORK, N.Y.) 2008; 10:676-691. [PMID: 18478294 DOI: 10.1007/s10126-008-9104-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/14/2008] [Revised: 03/26/2008] [Accepted: 04/04/2008] [Indexed: 05/26/2023]
Abstract
Ribosomal proteins (RPs) comprise a large set of highly evolutionarily conserved proteins that are often over-represented in complementary DNA libraries. They have become very useful markers in comparative genomics, genome evolution, and phylogenetic studies across taxa. In this study, we report the sequences of the complete set of 60S RPs in Senegalese sole (Solea senegalensis) and Atlantic halibut (Hippoglossus hippoglossus), two commercially important flatfish species. Amino-acid sequence comparisons of the encoded proteins showed a high similarity both between these two flatfish species and with respect to other fish and human counterparts. Expressed sequence tag analysis revealed the existence of paralogous genes for RPL3, RPL7, RPL41, and RPLP2 in Atlantic halibut and RPL13a in Senegalese sole as well as RPL19 and RPL22 in both species. Phylogenetic analysis of paralogs revealed distinct evolutionary histories for each RP in agreement with three rounds of genome duplications and lineage-specific duplications during flatfish evolution. Steady-state transcript levels for RPL19 and RPL22 RPs were quantitated during larval development and in different tissues of sole and halibut using a real-time polymerase chain reaction approach. All paralogs were expressed ubiquitously although at different levels in different tissues. Most RP transcripts increased coordinately after larval first-feeding in both species but decreased progressively during the metamorphic process. In all cases, expression profiles and transcript levels of orthologous genes in Senegalese sole and Atlantic halibut were highly congruent. The genomic resources and knowledge developed in this survey will be useful for the study of Pleuronectiformes evolution.
Affiliation(s)
- Makoto P Matsuoka
- Institute for Marine Biosciences, National Research Council, 1411 Oxford Street, Halifax, Nova Scotia, B3H 3Z1, Canada
54
Indicators from archaeal secretomes. Microbiol Res 2008; 165:1-10. [PMID: 18407482 DOI: 10.1016/j.micres.2008.03.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 11/13/2007] [Revised: 02/14/2008] [Accepted: 03/01/2008] [Indexed: 11/21/2022]
Abstract
Just as in the Eukarya and the Bacteria, members of the Archaea need to export proteins beyond the cell membrane. Export is required for a variety of essential functions, such as nutrient acquisition, biotransformations, and the maintenance of extracellular structures. Unlike the Eukarya and the Bacteria, however, members of the Archaea share a number of unique characteristics. Does this uniqueness extend to the protein secretion system? The objective of this study was to answer this question. To overcome the limited experimental information on secreted proteins in Archaea, the available archaeal genomes, which represent halophiles, thermophiles, and extreme thermophiles, were subjected to bioinformatics analysis; specifically, the properties of the archaeal secretomes were examined using the ExProt program. A total of 24 genomes were analyzed. Secretomes ranged from 6% of total ORFs (Methanopyrus kandleri) to 19% (Halobacterium sp. NRC-1). Methanosarcina acetivorans has the highest fraction of lipoproteins (at 89), and the lowest (at 1) was found in members of the Thermoplasma, Pyrobaculum aerophilum, and Nanoarchaeum equitans. Based on the Tat consensus sequence, the contribution of these secreted proteins to the secretomes was negligible, making up 8 proteins out of a total of 7105 predicted exported proteins. The amino acid composition of predicted archaeal signal peptides, an attribute not used as a selection criterion by ExProt, shows that in the haloarchaeal secretomes the frequency of Lys is much lower than that seen in bacterial signal peptides but is compensated for by a higher frequency of Arg. It also shows that higher frequencies of Thr, Val, and Gly contribute to the hydrophobic character of haloarchaeal signal peptides, unlike bacterial signal peptides, in which the hydrophobic character is dominated by Leu and Ile.
55
Berthon J, Cortez D, Forterre P. Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation. Genome Biol 2008; 9:R71. [PMID: 18400081 PMCID: PMC2643942 DOI: 10.1186/gb-2008-9-4-r71] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 12/21/2007] [Revised: 02/22/2008] [Accepted: 04/09/2008] [Indexed: 11/05/2022] Open
Abstract
Specific functional interactions of proteins involved in DNA replication and/or DNA repair or transcription might occur in Archaea, suggesting a previously unrecognized regulatory network coupling DNA replication and translation, which might also exist in Eukarya. Background Comparative analysis of genomes is valuable to explore evolution of genomes, deduce gene functions, or predict functional linking between proteins. Here, we have systematically analyzed the genomic environment of all known DNA replication genes in 27 archaeal genomes to infer new connections for DNA replication proteins from conserved genomic associations. Results Two distinct sets of DNA replication genes frequently co-localize in archaeal genomes: the first includes the genes for PCNA, the small subunit of the DNA primase (PriS), and Gins15; the second comprises the genes for MCM and Gins23. Other genomic associations of genes encoding proteins involved in informational processes that may be functionally relevant at the cellular level have also been noted; in particular, the association between the genes for PCNA, transcription factor S, and NudF. Surprisingly, a conserved cluster of genes coding for proteins involved in translation or ribosome biogenesis (S27E, L44E, aIF-2 alpha, Nop10) is almost systematically contiguous to the group of genes coding for PCNA, PriS, and Gins15. The functional relevance of this cluster encoding proteins conserved in Archaea and Eukarya is strongly supported by statistical analysis. Interestingly, the gene encoding the S27E protein, also known as metallopanstimulin 1 (MPS-1) in human, is overexpressed in multiple cancer cell lines. Conclusion Our genome context analysis suggests specific functional interactions for proteins involved in DNA replication between each other or with proteins involved in DNA repair or transcription. 
Furthermore, it suggests a previously unrecognized regulatory network coupling DNA replication and translation in Archaea that may also exist in Eukarya.
Affiliation(s)
- Jonathan Berthon
- Univ. Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405 Orsay CEDEX, France.
56
Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol 2008; 6:245-52. [PMID: 18274537 DOI: 10.1038/nrmicro1852] [Citation(s) in RCA: 631] [Impact Index Per Article: 39.4] [Indexed: 02/06/2023]
Abstract
The archaeal domain is currently divided into two major phyla, the Euryarchaeota and Crenarchaeota. During the past few years, diverse groups of uncultivated mesophilic archaea have been discovered and affiliated with the Crenarchaeota. It was recently recognized that these archaea have a major role in geochemical cycles. Based on the first genome sequence of a crenarchaeote, Cenarchaeum symbiosum, we show that these mesophilic archaea are different from hyperthermophilic Crenarchaeota and branch deeper than was previously assumed. Our results indicate that C. symbiosum and its relatives are not Crenarchaeota, but should be considered as a third archaeal phylum, which we propose to name Thaumarchaeota (from the Greek 'thaumas', meaning wonder).
57
Sherrer RL, Ho JML, Söll D. Divergence of selenocysteine tRNA recognition by archaeal and eukaryotic O-phosphoseryl-tRNASec kinase. Nucleic Acids Res 2008; 36:1871-80. [PMID: 18267971 PMCID: PMC2330242 DOI: 10.1093/nar/gkn036] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 01/23/2023] Open
Abstract
Selenocysteine (Sec) biosynthesis in archaea and eukaryotes requires three steps: serylation of tRNASec by seryl-tRNA synthetase (SerRS), phosphorylation of Ser-tRNASec by O-phosphoseryl-tRNASec kinase (PSTK), and conversion of O-phosphoseryl-tRNASec (Sep-tRNASec) by Sep-tRNA:Sec-tRNA synthase (SepSecS) to Sec-tRNASec. Although SerRS recognizes both tRNASec and tRNASer species, PSTK must discriminate Ser-tRNASec from Ser-tRNASer. Based on a comparison of the sequences and secondary structures of archaeal tRNASec and tRNASer, we introduced mutations into Methanococcus maripaludis tRNASec to investigate how Methanocaldococcus jannaschii PSTK distinguishes tRNASec from tRNASer. Unlike eukaryotic PSTK, the archaeal enzyme was found to recognize the acceptor stem rather than the length and secondary structure of the D-stem. While the D-arm and T-loop provide minor identity elements, the acceptor stem base pairs G2-C71 and C3-G70 in tRNASec were crucial for discrimination from tRNASer. Furthermore, the A5-U68 base pair in tRNASer has some antideterminant properties for PSTK. Transplantation of these identity elements into the tRNASerUGA scaffold resulted in phosphorylation of the chimeric Ser-tRNA. The chimera was able to stimulate the ATPase activity of PSTK albeit at a lower level than tRNASec, whereas tRNASer did not. Additionally, the seryl moiety of Ser-tRNASec is not required for enzyme recognition, as PSTK efficiently phosphorylated Thr-tRNASec.
Affiliation(s)
- R Lynn Sherrer
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
58
Lebedinsky AV, Chernyh NA, Bonch-Osmolovskaya EA. Phylogenetic systematics of microorganisms inhabiting thermal environments. BIOCHEMISTRY (MOSCOW) 2007; 72:1299-312. [DOI: 10.1134/s0006297907120048] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/23/2022]
59
Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct 2007; 2:33. [PMID: 18042280 PMCID: PMC2222616 DOI: 10.1186/1745-6150-2-33] [Citation(s) in RCA: 150] [Impact Index Per Article: 8.8] [Received: 11/02/2007] [Accepted: 11/27/2007] [Indexed: 12/29/2022] Open
Abstract
Background An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. 
It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. Conclusion The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: . Reviewers This article was reviewed by Peer Bork, Patrick Forterre, and Purificacion Lopez-Garcia.
Affiliation(s)
- Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
60
Coker JA, DasSarma S. Genetic and transcriptomic analysis of transcription factor genes in the model halophilic Archaeon: coordinate action of TbpD and TfbA. BMC Genet 2007; 8:61. [PMID: 17892563 PMCID: PMC2121645 DOI: 10.1186/1471-2156-8-61] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 02/15/2007] [Accepted: 09/24/2007] [Indexed: 11/10/2022] Open
Abstract
Background Archaea are prokaryotic organisms with simplified versions of eukaryotic transcription systems. Genes coding for the general transcription factors TBP and TFB are present in multiple copies in several Archaea, including Halobacterium sp. NRC-1. Multiple TBP and TFBs have been proposed to participate in transcription of genes via recognition and recruitment of RNA polymerase to different classes of promoters. Results We attempted to knock out all six TBP and seven TFB genes in Halobacterium sp. NRC-1 using the ura3-based gene deletion system. Knockouts were obtained for six out of thirteen genes, tbpCDF and tfbACG, indicating that they are not essential for cell viability under standard conditions. Screening of a population of 1,000 candidate mutants showed that genes which did not yield mutants contained less than 0.1% knockouts, strongly suggesting that they are essential. The transcriptomes of two mutants, ΔtbpD and ΔtfbA, were compared to the parental strain and showed coordinate down regulation of many genes. Over 500 out of 2,677 total genes were regulated in the ΔtbpD and ΔtfbA mutants with 363 regulated in both, indicating that over 10% of genes in both strains require the action of both TbpD and TfbA for normal transcription. Culturing studies on the ΔtbpD and ΔtfbA mutant strains showed them to grow more slowly than the wild-type at an elevated temperature, 49°C, and they showed reduced viability at 56°C, suggesting TbpD and TfbA are involved in the heat shock response. Alignment of TBP and TFB protein sequences suggested the expansion of the TBP gene family, especially in Halobacterium sp. NRC-1, and TFB gene family in representatives of five different genera of haloarchaea in which genome sequences are available. Conclusion Six of thirteen TBP and TFB genes of Halobacterium sp. NRC-1 are non-essential under standard growth conditions. TbpD and TfbA coordinate the expression of over 10% of the genes in the NRC-1 genome. 
The ΔtbpD and ΔtfbA mutant strains are temperature sensitive, possibly as a result of down regulation of heat shock genes. Sequence alignments suggest the existence of several families of TBP and TFB transcription factors in Halobacterium which may function in transcription of different classes of genes.
Affiliation(s)
- James A Coker
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 East Pratt Street, Baltimore, MD 21202, USA
- Shiladitya DasSarma
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 East Pratt Street, Baltimore, MD 21202, USA
61
Koonin EV. The Biological Big Bang model for the major transitions in evolution. Biol Direct 2007; 2:21. [PMID: 17708768 PMCID: PMC1973067 DOI: 10.1186/1745-6150-2-21] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Received: 08/15/2007] [Accepted: 08/20/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Major transitions in biological evolution show the same pattern of sudden emergence of diverse forms at a new level of complexity. The relationships between major groups within an emergent new class of biological entities are hard to decipher and do not seem to fit the tree pattern that, following Darwin's original proposal, remains the dominant description of biological evolution. The cases in point include the origin of complex RNA molecules and protein folds; major groups of viruses; archaea and bacteria, and the principal lineages within each of these prokaryotic domains; eukaryotic supergroups; and animal phyla. In each of these pivotal nexuses in life's history, the principal "types" seem to appear rapidly and fully equipped with the signature features of the respective new level of biological organization. No intermediate "grades" or intermediate forms between different types are detectable. Usually, this pattern is attributed to cladogenesis compressed in time, combined with the inevitable erosion of the phylogenetic signal. HYPOTHESIS I propose that most or all major evolutionary transitions that show the "explosive" pattern of emergence of new types of biological entities correspond to a boundary between two qualitatively distinct evolutionary phases. The first, inflationary phase is characterized by extremely rapid evolution driven by various processes of genetic information exchange, such as horizontal gene transfer, recombination, fusion, fission, and spread of mobile elements. These processes give rise to a vast diversity of forms from which the main classes of entities at the new level of complexity emerge independently, through a sampling process. In the second phase, evolution dramatically slows down, the respective process of genetic information exchange tapers off, and multiple lineages of the new type of entities emerge, each of them evolving in a tree-like fashion from that point on. 
This biphasic model of evolution incorporates the previously developed concepts of the emergence of protein folds by recombination of small structural units and origin of viruses and cells from a pre-cellular compartmentalized pool of recombining genetic elements. The model is extended to encompass other major transitions. It is proposed that bacterial and archaeal phyla emerged independently from two distinct populations of primordial cells that, originally, possessed leaky membranes, which made the cells prone to rampant gene exchange; and that the eukaryotic supergroups emerged through distinct, secondary endosymbiotic events (as opposed to the primary, mitochondrial endosymbiosis). This biphasic model of evolution is substantially analogous to the scenario of the origin of universes in the eternal inflation version of modern cosmology. Under this model, universes like ours emerge in the infinite multiverse when the eternal process of exponential expansion, known as inflation, ceases in a particular region as a result of false vacuum decay, a first order phase transition process. The result is the nucleation of a new universe, which is traditionally denoted Big Bang, although this scenario is radically different from the Big Bang of the traditional model of an expanding universe. Hence I denote the phase transitions at the end of each inflationary epoch in the history of life Biological Big Bangs (BBB). CONCLUSION A Biological Big Bang (BBB) model is proposed for the major transitions in life's evolution. According to this model, each transition is a BBB such that new classes of biological entities emerge at the end of a rapid phase of evolution (inflation) that is characterized by extensive exchange of genetic information which takes distinct forms for different BBBs. The major types of new forms emerge independently, via a sampling process, from the pool of recombining entities of the preceding generation. 
This process is envisaged as being qualitatively different from tree-pattern cladogenesis.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
62
Coulson RMR, Touboul N, Ouzounis CA. Lineage-specific partitions in archaeal transcription. ARCHAEA (VANCOUVER, B.C.) 2007; 2:117-25. [PMID: 17350932 PMCID: PMC2686387 DOI: 10.1155/2006/629868] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 09/13/2006] [Accepted: 10/23/2006] [Indexed: 11/18/2022]
Abstract
The phylogenetic distribution of the components comprising the transcriptional machinery in the crenarchaeal and euryarchaeal lineages of the Archaea was analyzed in a systematic manner by genome-wide profiling of transcription complements in fifteen complete archaeal genome sequences. Initially, a reference set of transcription-associated proteins (TAPs) consisting of sequences functioning in all aspects of the transcriptional process, and originating from the three domains of life, was used to query the genomes. TAP-families were detected by sequence clustering of the TAPs and their archaeal homologues, and through extensive database searching, these families were assigned a function. The phylogenetic origins of archaeal genes matching hidden Markov model profiles of protein domains associated with transcription, and those encoding the TAP-homologues, showed there is extensive lineage-specificity of proteins that function as regulators of transcription: most of these sequences are present solely in the Euryarchaeota, with nearly all of them homologous to bacterial DNA-binding proteins. Strikingly, the hidden Markov model profile searches revealed that archaeal chromatin and histone-modifying enzymes also display extensive taxon-restrictedness, both across and within the two phyla.
Affiliation(s)
- Richard M R Coulson
- Microarray Group, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
63
McCarren J, DeLong EF. Proteorhodopsin photosystem gene clusters exhibit co-evolutionary trends and shared ancestry among diverse marine microbial phyla. Environ Microbiol 2007; 9:846-58. [PMID: 17359257 DOI: 10.1111/j.1462-2920.2006.01203.x] [Citation(s) in RCA: 81] [Impact Index Per Article: 4.8] [Indexed: 11/27/2022]
Abstract
Since the recent discovery of retinylidene proteins in marine bacteria (proteorhodopsins), the estimated abundance and diversity of this gene family has expanded rapidly. To explore proteorhodopsin photosystem evolutionary and distributional trends, we identified and compared 16 different proteorhodopsin-containing genome fragments recovered from naturally occurring bacterioplankton populations. In addition to finding several deep-branching proteorhodopsin sequences, proteorhodopsins were found in novel taxonomic contexts, including a betaproteobacterium and a planctomycete. Approximately one-third of the proteorhodopsin-containing genome fragments analysed, as well as a number of recently reported marine bacterial whole genome sequences, contained a linked set of genes required for biosynthesis of the rhodopsin chromophore, retinal. Phylogenetic analyses of the retinal biosynthetic genes suggested their co-evolution and probable coordinated lateral gene transfer into disparate lineages, including Euryarchaeota, Planctomycetales, and three different proteobacterial lineages. The lateral transfer and retention of genes required to assemble a functional proteorhodopsin photosystem appears to be a coordinated and relatively frequent evolutionary event. Strong selection pressure apparently acts to preserve these light-dependent photosystems in diverse marine microbial lineages.
Affiliation(s)
- Jay McCarren
- Department of Civil and Environmental Engineering and Division of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
64
Phylogenomic analysis of proteins that are distinctive of Archaea and its main subgroups and the origin of methanogenesis. BMC Genomics 2007; 8:86. [PMID: 17394648 PMCID: PMC1852104 DOI: 10.1186/1471-2164-8-86] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Received: 07/26/2006] [Accepted: 03/29/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Archaea are highly diverse in terms of their physiology, metabolism and ecology. Presently, very few molecular characteristics are known that are uniquely shared by either all archaea or the different main groups within archaea. The evolutionary relationships among different groups within the Euryarchaeota branch are also not clearly understood. RESULTS We have carried out comprehensive analyses on all open reading frames (ORFs) in the genomes of 11 archaea (3 Crenarchaeota--Aeropyrum pernix, Pyrobaculum aerophilum and Sulfolobus acidocaldarius; 8 Euryarchaeota--Pyrococcus abyssi, Methanococcus maripaludis, Methanopyrus kandleri, Methanococcoides burtonii, Halobacterium sp. NRC-1, Haloquadratum walsbyi, Thermoplasma acidophilum and Picrophilus torridus) to search for proteins that are unique to either all Archaea or its main subgroups. These studies have identified 1448 proteins or ORFs that are distinctive characteristics of Archaea and its various subgroups and whose homologues are not found in other organisms. Six of these proteins are unique to all Archaea, 10 others are only missing in Nanoarchaeum equitans and a large number of other proteins are specific for various main groups within the Archaea (e.g. Crenarchaeota, Euryarchaeota, Sulfolobales and Desulfurococcales, Halobacteriales, Thermococci, Thermoplasmata, all methanogenic archaea or particular groups of methanogens). Of particular importance is the observation that 31 proteins are uniquely present in virtually all methanogens (including M. kandleri) and 10 additional proteins are only found in different methanogens as well as A. fulgidus. In contrast, no protein was exclusively shared by various methanogens and any of the Halobacteriales or Thermoplasmatales. These results strongly indicate that all methanogenic archaea form a monophyletic group exclusive of other archaea and that this lineage likely evolved from Archaeoglobus. 
In addition, 15 proteins that are uniquely shared by M. kandleri and Methanobacteriales suggest a close evolutionary relationship between them. In contrast to the phylogenomics studies, a monophyletic grouping of archaea is not supported by phylogenetic analyses based on protein sequences. CONCLUSION The identified archaea-specific proteins provide novel molecular markers or signature proteins that are distinctive characteristics of Archaea and all of its major subgroups. The species distributions of these proteins provide novel insights into the evolutionary relationships among different groups within Archaea, particularly regarding the origin of methanogenesis. Most of these proteins are of unknown function and further studies should lead to discovery of novel biochemical and physiological characteristics that are unique to either all archaea or its different subgroups.
|
65
|
House CH. Linking taxonomy with environmental geochemistry and why it matters to the field of geobiology. Geobiology 2007; 5:1-3. [PMID: 36298873 DOI: 10.1111/j.1472-4669.2007.00097.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 06/16/2023]
Affiliation(s)
- C H House
- Penn State Astrobiology Research Center and Department of Geosciences, Pennsylvania State University, Pennsylvania, USA
|
66
|
Wilkinson M, McInerney JO, Hirt RP, Foster PG, Embley TM. Of clades and clans: terms for phylogenetic relationships in unrooted trees. Trends Ecol Evol 2007; 22:114-5. [PMID: 17239486 DOI: 10.1016/j.tree.2007.01.002] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.7] [Received: 09/25/2006] [Revised: 12/07/2006] [Accepted: 01/11/2007] [Indexed: 10/23/2022]
|
67
|
Bucknam J, Boucher Y, Bapteste E. Refuting phylogenetic relationships. Biol Direct 2006; 1:26. [PMID: 16956399 PMCID: PMC1574289 DOI: 10.1186/1745-6150-1-26] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 08/25/2006] [Accepted: 09/06/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Phylogenetic methods are philosophically grounded, and so can be philosophically biased in ways that limit explanatory power. This constitutes an important methodological dimension not often taken into account. Here we address this dimension in the context of concatenation approaches to phylogeny. RESULTS We discuss some of the limits of a methodology restricted to verificationism, the philosophy on which gene concatenation practices generally rely. As an alternative, we describe software that identifies and focuses on impossible or refuted relationships, through a simple analysis of bootstrap bipartitions, followed by multivariate statistical analyses. We show how refuting phylogenetic relationships could in principle facilitate systematics. We also apply our method to the study of two complex phylogenies: the phylogeny of the archaea and the phylogeny of the core of genes shared by all life forms. While many groups are rejected, our results left open a possible proximity of N. equitans and the Methanopyrales, of the Archaea and the Cyanobacteria, as well as the possible grouping of the Methanobacteriales/Methanococcales and Thermoplasmatales, of the Spirochaetes and the Actinobacteria and of the Proteobacteria and Firmicutes. CONCLUSION It is sometimes easier (and preferable) to decide which species do not group together than which ones do. When possible topologies are limited, identifying local relationships that are rejected may be a useful alternative to classical concatenation approaches aiming to find a globally resolved tree on the basis of weak phylogenetic markers. REVIEWERS This article was reviewed by Mark Ragan, Eugene V Koonin and J Peter Gogarten.
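The analysis of bootstrap bipartitions mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the nested-tuple tree encoding, the function names (`leaves`, `bipartitions`, `split_support`, `group_support`) and the toy trees are all assumptions made for illustration. The sketch counts how often each non-trivial split occurs across bootstrap trees, so that a candidate grouping that never (or almost never) appears can be flagged as a candidate for rejection.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions of an unrooted tree.

    Each split is stored as the side that excludes a fixed reference
    taxon, so the same split is always encoded the same way.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        if reference in side:
            side = all_taxa - side
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def split_support(bootstrap_trees):
    """Fraction of bootstrap trees containing each bipartition."""
    counts = Counter()
    for tree in bootstrap_trees:
        counts.update(bipartitions(tree))
    return {split: n / len(bootstrap_trees) for split, n in counts.items()}


def group_support(candidate, supports, all_taxa):
    """Support for the split that separates `candidate` from the other taxa."""
    side = frozenset(candidate)
    if min(all_taxa) in side:
        side = frozenset(all_taxa) - side
    return supports.get(side, 0.0)


# Hypothetical bootstrap replicates over five placeholder taxa.
trees = [
    (("A", "B"), "C", ("D", "E")),
    (("A", "B"), "C", ("D", "E")),
    (("A", "C"), "B", ("D", "E")),
    (("A", "B"), ("C", "D"), "E"),
]
taxa = leaves(trees[0])
supports = split_support(trees)
print(group_support({"A", "B"}, supports, taxa))  # grouping seen in 3 of 4 trees
print(group_support({"B", "C"}, supports, taxa))  # grouping never observed
```

Any cut-off used to call a grouping "refuted" (for example, support below some small fraction) would be a further assumption; the abstract does not state the criterion or the multivariate analyses that follow.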
Affiliation(s)
- James Bucknam
- Canadian Institute for Advanced Research and Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 4H7, Canada
- Yan Boucher
- Department of Chemistry and Molecular Biosciences, Macquarie University, North Ryde, NSW 2109, Australia
- Eric Bapteste
- Canadian Institute for Advanced Research and Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 4H7, Canada
|
68
|
Ashby MK. Distribution, structure and diversity of "bacterial" genes encoding two-component proteins in the Euryarchaeota. Archaea 2006; 2:11-30. [PMID: 16877318 PMCID: PMC2685588 DOI: 10.1155/2006/562404] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
Abstract
The publicly available annotated archaeal genome sequences (23 complete and three partial annotations, October 2005) were searched for the presence of potential two-component open reading frames (ORFs) using gene category lists and BLASTP. A total of 489 potential two-component genes were identified from the gene category lists and BLASTP. Two-component genes were found in 14 of the 21 Euryarchaeal sequences (October 2005) and in neither the Crenarchaeota nor the Nanoarchaeota. A total of 20 predicted protein domains were identified in the putative two-component ORFs that, in addition to the histidine kinase and receiver domains, also include sensor and signalling domains. The detailed structure of these putative proteins is shown, as is the distribution of each class of two-component genes in each species. Potential members of orthologous groups have been identified, as have any potential operons containing two or more two-component genes. The number of two-component genes in those Euryarchaeal species which have them seems to be linked more to lifestyle and habitat than to genome complexity, with most examples being found in Methanospirillum hungatei, Haloarcula marismortui, Methanococcoides burtonii and the mesophilic Methanosarcinales group. The large numbers of two-component genes in these species may reflect a greater requirement for internal regulation. Phylogenetic analysis of orthologous groups of five different protein classes, three probably involved in regulating taxis, suggests that most of these ORFs have been inherited vertically from an ancestral Euryarchaeal species and points to a limited number of key horizontal gene transfer events.
Affiliation(s)
- Mark K Ashby
- Department of Basic Medical Sciences, Biochemistry Section, University of the West Indies, Mona Campus, Kingston 7, Jamaica.
|
69
|
Gribaldo S, Brochier-Armanet C. The origin and evolution of Archaea: a state of the art. Philos Trans R Soc Lond B Biol Sci 2006; 361:1007-22. [PMID: 16754611 PMCID: PMC1578729 DOI: 10.1098/rstb.2006.1841] [Citation(s) in RCA: 189] [Impact Index Per Article: 10.5] [Indexed: 01/24/2023] Open
Abstract
Environmental surveys indicate that the Archaea are diverse and abundant not only in extreme environments, but also in soil, oceans and freshwater, where they may fulfil a key role in the biogeochemical cycles of the planet. Archaea display unique capacities, such as methanogenesis and survival at temperatures higher than 90 °C, that make them crucial for understanding the nature of the biota of early Earth. Molecular, genomics and phylogenetics data strengthen Woese's definition of Archaea as a third domain of life in addition to Bacteria and Eukarya. Phylogenomics analyses of the components of different molecular systems are highlighting a core of mainly vertically inherited genes in Archaea. This allows recovering a globally well-resolved picture of archaeal evolution, as opposed to what is observed for Bacteria and Eukarya. This may be due to the fact that no rapid divergence occurred at the emergence of present-day archaeal lineages. This phylogeny supports a hyperthermophilic and non-methanogenic ancestor to present-day archaeal lineages, and a profound divergence between two major phyla, the Crenarchaeota and the Euryarchaeota, that may not have an equivalent in the other two domains of life. Nanoarchaea may not represent a third and ancestral archaeal phylum, but a fast-evolving euryarchaeal lineage. Methanogenesis seems to have appeared only once and early in the evolution of Euryarchaeota. Filling up this picture of archaeal evolution by adding presently uncultivated species, and placing it back in geological time remain two essential goals for the future.
Affiliation(s)
- Simonetta Gribaldo
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extremophiles, 25 rue du Dr Roux, 75724 Paris Cedex 15, France.
|
70
|
Wang HC, Xia X, Hickey D. Thermal Adaptation of the Small Subunit Ribosomal RNA Gene: A Comparative Study. J Mol Evol 2006; 63:120-6. [PMID: 16786438 DOI: 10.1007/s00239-005-0255-4] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 10/28/2005] [Accepted: 03/01/2006] [Indexed: 11/27/2022]
Abstract
We carried out a comprehensive survey of small subunit ribosomal RNA sequences from archaeal, bacterial, and eukaryotic lineages in order to understand the general patterns of thermal adaptation in the rRNA genes. Within each lineage, we compared sequences from mesophilic, moderately thermophilic, and hyperthermophilic species. We carried out a more detailed study of the archaea, because of the wide range of growth temperatures within this group. Our results confirmed that there is a clear correlation between the GC content of the paired stem regions of the 16S rRNA genes and the optimal growth temperature, and we show that this correlation cannot be explained simply by phylogenetic relatedness among the thermophilic archaeal species. In addition, we found a significant, positive relationship between rRNA stem length and growth temperature. These correlations are found in both bacterial and archaeal rRNA genes. Finally, we compared rRNA sequences from warm-blooded and cold-blooded vertebrates. We found that, while rRNA sequences from the warm-blooded vertebrates have a higher overall GC content than those from the cold-blooded vertebrates, this difference is not concentrated in the paired regions of the molecule, suggesting that thermal adaptation is not the cause of the nucleotide differences between the vertebrate lineages.
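The kind of relationship reported here, GC content of paired stem regions against optimal growth temperature, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stem sequences and temperatures below are invented placeholders rather than data from the study, and the helper names `gc_content` and `pearson_r` are hypothetical.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G and C bases in a DNA or RNA sequence."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder (stem sequence, optimal growth temperature in deg C) pairs;
# these values are invented purely to exercise the functions.
toy_data = [
    ("GCAUGCAU", 37.0),
    ("GCGCGCAU", 65.0),
    ("GCGCGCGC", 95.0),
]
stem_gc = [gc_content(seq) for seq, _ in toy_data]
temperatures = [temp for _, temp in toy_data]
print(pearson_r(stem_gc, temperatures))
```

A plain correlation like this does not account for shared ancestry among species, which is the confound the abstract says the authors ruled out; a phylogenetically informed comparison would need additional steps not shown here.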
Affiliation(s)
- Huai-Chun Wang
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, B3H 3J5, Canada
|
71
|
Forterre P. DNA topoisomerase V: a new fold of mysterious origin. Trends Biotechnol 2006; 24:245-7. [PMID: 16650908 DOI: 10.1016/j.tibtech.2006.04.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 03/03/2006] [Revised: 03/23/2006] [Accepted: 04/13/2006] [Indexed: 10/24/2022]
Abstract
Although all other topoisomerases have a broad phylogenetic distribution, DNA topoisomerase V, the major component of the ThermoFidelase sequencing kit, is presently only known in a single species--the archaeon Methanopyrus kandleri. Resolution of the structure of this enzyme by Taneja and co-workers now reveals that this atypical topoisomerase has no structural similarity with other proteins. So, where did it come from? It is my contention that Topo V, and many other orphan proteins, could have a viral origin.
Affiliation(s)
- Patrick Forterre
- Biologie Moléculaire du Gène chez les Extrêmophiles, Institut Pasteur, 25 rue du Dr Roux, 75015, Paris, France.
|
72
|
Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McInerney JO. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol 2006; 6:29. [PMID: 16563161 PMCID: PMC1435933 DOI: 10.1186/1471-2148-6-29] [Citation(s) in RCA: 794] [Impact Index Per Article: 44.1] [Received: 12/19/2005] [Accepted: 03/24/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In recent years, model based approaches such as maximum likelihood have become the methods of choice for constructing phylogenies. A number of authors have shown the importance of using adequate substitution models in order to produce accurate phylogenies. In the past, many empirical models of amino acid substitution have been derived using a variety of different methods and protein datasets. These matrices are normally used as surrogates, rather than deriving the maximum likelihood model from the dataset being examined. With few exceptions, selection between alternative matrices has been carried out in an ad hoc manner. RESULTS We start by highlighting the potential dangers of arbitrarily choosing protein models by demonstrating an empirical example where a single alignment can produce two topologically different and strongly supported phylogenies using two different arbitrarily-chosen amino acid substitution models. We demonstrate that in simple simulations, statistical methods of model selection are indeed robust and likely to be useful for protein model selection. We have investigated patterns of amino acid substitution among homologous sequences from the three Domains of life and our results show that no single amino acid matrix is optimal for any of the datasets. Perhaps most interestingly, we demonstrate that for two large datasets derived from the proteobacteria and archaea, one of the most favored models in both datasets is a model that was originally derived from retroviral Pol proteins. CONCLUSION This demonstrates that choosing protein models based on their source or method of construction may not be appropriate.
Affiliation(s)
- Thomas M Keane
- Bioinformatics Laboratory, Department of Biology, National University of Ireland, Maynooth, Co. Kildare, Ireland
- Melissa M Pentony
- Department of Computer Science, University College London, Gower Street, London, UK
- Thomas J Naughton
- Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland
- James O McInerney
- Bioinformatics Laboratory, Department of Biology, National University of Ireland, Maynooth, Co. Kildare, Ireland
|
73
|
DasSarma S, Berquist BR, Coker JA, DasSarma P, Müller JA. Post-genomics of the model haloarchaeon Halobacterium sp. NRC-1. Saline Syst 2006; 2:3. [PMID: 16542428 PMCID: PMC1447603 DOI: 10.1186/1746-1448-2-3] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 02/01/2006] [Accepted: 03/16/2006] [Indexed: 11/21/2022]
Abstract
Halobacterium sp. NRC-1 is an extremely halophilic archaeon that is easily cultured and genetically tractable. Since its genome sequence was completed in 2000, a combination of genetic, transcriptomic, proteomic, and bioinformatic approaches has provided insights into both its extremophilic lifestyle and fundamental cellular processes common to all life forms. Here, we review post-genomic research on this archaeon, including investigations of DNA replication and repair systems, phototrophic, anaerobic, and other physiological capabilities, acidity of the proteome for function at high salinity, and role of lateral gene transfer in its evolution.
Affiliation(s)
- Shiladitya DasSarma
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 E. Pratt Street, Suite 236, Baltimore, MD 21202, USA
- Brian R Berquist
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 E. Pratt Street, Suite 236, Baltimore, MD 21202, USA
- James A Coker
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 E. Pratt Street, Suite 236, Baltimore, MD 21202, USA
- Priya DasSarma
- University of Maryland Biotechnology Institute, Center of Marine Biotechnology, 701 E. Pratt Street, Suite 236, Baltimore, MD 21202, USA
- Jochen A Müller
- Department of Biology, Morgan State University, 1700 East Cold Spring Lane, Baltimore, MD 21251, USA
|
74
|
Teeling H, Gloeckner FO. RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits. BMC Bioinformatics 2006; 7:66. [PMID: 16476165 PMCID: PMC1421441 DOI: 10.1186/1471-2105-7-66] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 06/10/2005] [Accepted: 02/13/2006] [Indexed: 11/28/2022] Open
Abstract
Background Until today, analysis of 16S ribosomal RNA (rRNA) sequences has been the de-facto gold standard for the assessment of phylogenetic relationships among prokaryotes. However, the branching order of the individual phyla is not well-resolved in 16S rRNA-based trees. In search of an improvement, new phylogenetic methods have been developed alongside the growing availability of complete genome sequences. Unfortunately, only a few genes in prokaryotic genomes qualify as universal phylogenetic markers and almost all of them have a lower information content than the 16S rRNA gene. Therefore, emphasis has been placed on methods that are based on multiple genes or even entire genomes. The concatenation of ribosomal protein sequences is one method which has been ascribed an improved resolution. Since there is neither a comprehensive database for ribosomal protein sequences nor a tool that assists in sequence retrieval and generation of respective input files for phylogenetic reconstruction programs, RibAlign has been developed to fill this gap. Results RibAlign serves two purposes: First, it provides a fast and scalable database that has been specifically adapted to eubacterial ribosomal protein sequences and second, it provides sophisticated import and export capabilities. This includes semi-automatic extraction of ribosomal protein sequences from whole-genome GenBank and FASTA files as well as exporting aligned, concatenated and filtered sequence files that can directly be used in conjunction with the PHYLIP and MrBayes phylogenetic reconstruction programs. Conclusion Up to now, phylogeny based on concatenated ribosomal protein sequences is hampered by the limited set of sequenced genomes and high computational requirements. However, hundreds of full and draft genome sequencing projects are on the way, and advances in cluster-computing and algorithms make phylogenetic reconstructions feasible even with large alignments of concatenated marker genes. 
RibAlign is a first step in this direction and may be particularly interesting to scientists involved in whole genome sequencing of representatives of new or sparsely studied eubacterial phyla. RibAlign is available at
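The concatenation step this abstract describes can be sketched generically in Python. This is not RibAlign's code or interface: the function name `concatenate_alignments`, the dictionary layout, and the toy protein alignments are assumptions for illustration. Each per-protein alignment maps a taxon name to its aligned sequence, and taxa missing from a given alignment are padded with gap characters so that every row of the resulting supermatrix has the same length.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one supermatrix keyed by taxon.

    `alignments` is a list of dicts mapping taxon name -> aligned sequence.
    Taxa absent from an alignment are filled with '-' for that block.
    """
    taxa = sorted({taxon for alignment in alignments for taxon in alignment})
    blocks = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            blocks[taxon].append(alignment.get(taxon, "-" * width))
    return {taxon: "".join(parts) for taxon, parts in blocks.items()}


# Invented toy alignments for two hypothetical ribosomal proteins.
protein_a = {"taxon_1": "MAV-K", "taxon_2": "MAIRK"}
protein_b = {"taxon_1": "MIG", "taxon_2": "MLG", "taxon_3": "M-G"}

supermatrix = concatenate_alignments([protein_a, protein_b])
for taxon, sequence in supermatrix.items():
    print(taxon, sequence)
```

Padding absent proteins with gaps keeps every taxon in the matrix; whether to keep or drop such taxa, and how to filter poorly aligned columns, are separate choices that this sketch leaves out.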
Affiliation(s)
- Hanno Teeling
- Microbial Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
- Frank Oliver Gloeckner
- Microbial Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
- International University Bremen, D-28759 Bremen, Germany
|
75
|
O'Donoghue P, Sethi A, Woese CR, Luthey-Schulten ZA. The evolutionary history of Cys-tRNA(Cys) formation. Proc Natl Acad Sci U S A 2005; 102:19003-8. [PMID: 16380427 PMCID: PMC1323144 DOI: 10.1073/pnas.0509617102] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022] Open
Abstract
The recent discovery of an alternate pathway for indirectly charging tRNA(Cys) has stimulated a re-examination of the evolutionary history of Cys-tRNA(Cys) formation. In the first step of the pathway, O-phosphoseryl-tRNA synthetase charges tRNA(Cys) with O-phosphoserine (Sep), a precursor of the cognate amino acid. In the following step, Sep-tRNA:Cys-tRNA synthase (SepCysS) converts Sep to Cys in a tRNA-dependent reaction. The existence of such a pathway raises several evolutionary questions, including whether the indirect pathway is a recent evolutionary invention, as might be implied from its localization to the Euryarchaea, or, as evidence presented here indicates, whether this pathway is more ancient, perhaps already in existence at the time of the last universal common ancestral state. A comparative phylogenetic approach is used, combining evolutionary information from protein sequences and structures, that takes both the signature of horizontal gene transfer and the recurrence of the full canonical phylogenetic pattern into account, to document the complete evolutionary history of cysteine coding and understand the nature of this process in the last universal common ancestral state. Resulting from the historical study of tRNA(Cys) aminoacylation and the integrative perspective of sequence, structure, and function are 3D models of O-phosphoseryl-tRNA synthetase and SepCysS, which provide experimentally testable predictions regarding the identity and function of key active-site residues in these proteins. The model of SepCysS is used to suggest a sulfhydrylation reaction mechanism, which is predicted to occur at the interface of a SepCysS dimer.
Affiliation(s)
- Patrick O'Donoghue
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
76
|
Hsiao WWL, Ung K, Aeschliman D, Bryan J, Finlay BB, Brinkman FSL. Evidence of a large novel gene pool associated with prokaryotic genomic islands. PLoS Genet 2005; 1:e62. [PMID: 16299586 PMCID: PMC1285063 DOI: 10.1371/journal.pgen.0010062] [Citation(s) in RCA: 146] [Impact Index Per Article: 7.7] [Received: 08/01/2005] [Accepted: 10/13/2005] [Indexed: 11/21/2022] Open
Abstract
Microbial genes that are “novel” (no detectable homologs in other species) have become of increasing interest as environmental sampling suggests that there are many more such novel genes in yet-to-be-cultured microorganisms. By analyzing known microbial genomic islands and prophages, we developed criteria for systematic identification of putative genomic islands (clusters of genes of probable horizontal origin in a prokaryotic genome) in 63 prokaryotic genomes, and then characterized the distribution of novel genes and other features. All but a few of the genomes examined contained significantly higher proportions of novel genes in their predicted genomic islands compared with the rest of their genome (Paired t test = 4.43E-14 to 1.27E-18, depending on method). Moreover, the reverse observation (i.e., higher proportions of novel genes outside of islands) never reached statistical significance in any organism examined. We show that this higher proportion of novel genes in predicted genomic islands is not due to less accurate gene prediction in genomic island regions, but likely reflects a genuine increase in novel genes in these regions for both bacteria and archaea. This represents the first comprehensive analysis of novel genes in prokaryotic genomic islands and provides clues regarding the origin of novel genes. Our collective results imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited. Moreover, there are more novel genes within the gene pool associated with genomic islands. Since genomic islands are frequently associated with a particular microbial adaptation, such as antibiotic resistance, pathogen virulence, or metal resistance, this suggests that microbes may have access to a larger “arsenal” of novel genes for adaptation than previously thought. More than 250 microbial genomes have been sequenced to date. 
A significant proportion of the genes in these genomes have no apparent similarity to known genes and their functions are unknown (i.e., they appear to be novel). As the number of sequenced genomes increases, the number of these novel genes continues to increase. In this paper, the authors now show, through an analysis of a diverse range of prokaryotic genomes, that novel genes are more prevalent in regions called genomic islands. Genomic islands are clusters of genes in genomes that show evidence of horizontal origins. This study is notable since genomic islands disproportionately contain many genes of medical, agricultural, and environmental importance (e.g., animal and plant pathogen virulence factors, antibiotic resistance genes, phenolic degradation genes, etc.). The observation that high proportions of novel genes are also localized to genomic islands suggests that microbes may have access to a larger “arsenal” of novel genes for important adaptations than previously thought. These results also imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited. The authors suggest that further studies involving large-scale environmental genomic sampling are required to help characterize this understudied gene pool.
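The paired comparison mentioned in this abstract (novel-gene proportion inside predicted islands versus the rest of the same genome) can be sketched as a paired t statistic. The sketch below uses only the Python standard library; the function name `paired_t_statistic` and the per-genome proportions are invented placeholders, not values from the study, and the conversion of the statistic to a p-value is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-pair differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Placeholder proportions of novel genes for four hypothetical genomes.
in_islands = [0.30, 0.45, 0.35, 0.25]
rest_of_genome = [0.20, 0.25, 0.20, 0.20]

t_value, dof = paired_t_statistic(in_islands, rest_of_genome)
print(t_value, dof)
```

Pairing by genome means each organism serves as its own control, so differences in overall novel-gene content between genomes do not inflate the comparison.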
Affiliation(s)
- William W. L Hsiao
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Korine Ung
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Dana Aeschliman
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- Jenny Bryan
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- B. Brett Finlay
- Michael Smith Laboratory, University of British Columbia, Vancouver, British Columbia, Canada
- Fiona S. L Brinkman
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- * To whom correspondence should be addressed. E-mail:
|
77
|
Silverman BD. Asymmetry in the burial of hydrophobic residues along the histone chains of eukarya, archaea and a transcription factor. BMC Struct Biol 2005; 5:20. [PMID: 16242031 PMCID: PMC1283977 DOI: 10.1186/1472-6807-5-20] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 07/18/2005] [Accepted: 10/21/2005] [Indexed: 11/30/2022]
Abstract
Background The histone fold is a common structural motif of proteins involved in the chromatin packaging of DNA and in transcription regulation. This single chain fold is stabilized by either homo- or hetero-dimer formation in archaea and eukarya. X-ray structures at atomic resolution have shown the eukaryotic nucleosome core particle to consist of a central tetramer of two bound H3-H4 dimers flanked by two H2A-H2B dimers. The C-terminal region of the H3 histone fold involved in coupling the two eukaryotic dimers of the tetramer, through a four-fold helical bundle, had previously been shown to be a region of reduced burial of hydrophobic residues within the dimers, and thereby provide a rationale for the observed reduced stability of the H3-H4 dimer compared with that of the H2A-H2B dimer. Furthermore, comparison between eukaryal and archaeal histones had suggested that this asymmetry in the distribution of hydrophobic residues along the H3 histone chains could be due to selective evolution that enhanced the coupling between the eukaryotic dimers of the tetramer. Results and discussion The present work describes calculations utilizing the X-ray structures at atomic resolution of a hyperthermophile from Methanopyrus kandleri (HMk) and a eukaryotic transcription factor from Drosophila melanogaster (DRm), that are structurally homologous to the eukaryotic (H3-H4)2 tetramer. The results for several other related structures are also described. Reduced burial of hydrophobic residues, at the homologous H3 C-terminal regions of these structures, is found to parallel the burial at the C-terminal regions of the H3 histones and is, thereby, expected to affect dimer stability and the processes involving histone structural rearrangement. Significantly different sequence homology between the two histones of the HMk doublet with other archaeal sequences is observed, and how this might have occurred during selection to enhance tetramer stability is described.
Affiliation(s)
- B David Silverman
- IBM Thomas J. Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA.
|
78
|
Gophna U, Bapteste E, Doolittle WF, Biran D, Ron EZ. Evolutionary plasticity of methionine biosynthesis. Gene 2005; 355:48-57. [PMID: 16046084 DOI: 10.1016/j.gene.2005.05.028] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 03/06/2005] [Accepted: 05/17/2005] [Indexed: 11/25/2022]
Abstract
Methionine is an essential cellular constituent, the initiator of protein synthesis and a precursor in many metabolic activities, such as methylation and formylation. Here we investigate the genomic distribution of the methionine biosynthetic pathway and analyze its evolutionary history by reconstructing the phylogeny of its enzymatic components. We demonstrate the evolutionary complexity of methionine synthesis and describe the various mechanisms that have shaped this biosynthetic pathway: gene duplication, functional reassignment, lateral acquisition and gene loss. Lateral gene transfer within and between domains and gene recruitment have played an important role in the evolution of this pathway, especially in its first and third enzymatic steps--homoserine activation and homocysteine methylation. These analyses are also the basis of predictions regarding methionine synthesis in Archaea, where the pathway is yet to be characterized. This study illustrates how diverse molecular solutions can fulfill a conserved function in living beings.
Affiliation(s)
- Uri Gophna
- Genome Atlantic and Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada B3H 1X5.
|
79
|
Ge F, Wang LS, Kim J. The cobweb of life revealed by genome-scale estimates of horizontal gene transfer. PLoS Biol 2005; 3:e316. [PMID: 16122348 PMCID: PMC1233574 DOI: 10.1371/journal.pbio.0030316] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 02/24/2005] [Accepted: 07/11/2005] [Indexed: 11/18/2022] Open
Abstract
With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been suggested to be so widespread that it might be the “essence of phylogeny” and might leave the treelike form of genealogy in doubt. On the other hand, possible biased estimation of HGT extent and the findings of coherent phylogenetic patterns indicate that phylogeny of life is well represented by tree graphs. Here, we reexamine this question by assessing the extent of HGT among core orthologous genes using a novel statistical method based on statistical comparisons of tree topology. We apply the method to 40 microbial genomes in the Clusters of Orthologous Groups database over a curated set of 297 orthologous gene clusters, and we detect significant HGT events in 33 out of 297 clusters over a wide range of functional categories. Estimates of positions of HGT events suggest a low mean genome-specific rate of HGT (2.0%) among the orthologous genes, which is in general agreement with other quantitative analyses of HGT. We propose that HGT events, even when relatively common, still leave the treelike history of phylogenies intact, much like cobwebs hanging from tree branches. A statistical approach applied to 297 orthologous gene clusters in 40 microbial genomes suggests a low rate of interspecies gene transfer. Species relationships can therefore be modeled with a tree structure.
Affiliation(s)
- Fan Ge
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Li-San Wang
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Junhyong Kim
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
80
|
Zhang R, Zhang CT. Identification of replication origins in archaeal genomes based on the Z-curve method. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2005; 1:335-46. [PMID: 15876567 PMCID: PMC2685548 DOI: 10.1155/2005/509646] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022]
Abstract
The Z-curve is a three-dimensional curve that constitutes a unique representation of a DNA sequence, i.e., both the Z-curve and the given DNA sequence can be uniquely reconstructed from the other. We employed Z-curve analysis to identify one replication origin in the Methanocaldococcus jannaschii genome, two replication origins in the Halobacterium species NRC-1 genome and one replication origin in the Methanosarcina mazei genome. One of the predicted replication origins of Halobacterium species NRC-1 is the same as a replication origin later identified by in vivo experiments. The Z-curve analysis of the Sulfolobus solfataricus P2 genome suggested the existence of three replication origins, which is also consistent with later experimental results. This review aims to summarize applications of the Z-curve in identifying replication origins of archaeal genomes, and to provide clues about the locations of as yet unidentified replication origins of the Aeropyrum pernix K1, Methanococcus maripaludis S2, Picrophilus torridus DSM 9790 and Pyrobaculum aerophilum str. IM2 genomes.
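As a rough illustration of the representation described above, the sketch below computes Z-curve coordinates for a short sequence and then recovers the sequence from them. The component definitions (purine/pyrimidine, amino/keto, weak/strong hydrogen bond) are the standard ones from the Z-curve literature rather than anything stated in this abstract, and the names `STEP`, `z_curve`, and `reconstruct` are purely illustrative assumptions.

```python
from itertools import accumulate

# Per-base increments (dx, dy, dz), assumed from the standard Z-curve definition:
#   x: purines (A, G) minus pyrimidines (C, T)
#   y: amino bases (A, C) minus keto bases (G, T)
#   z: weak H-bond bases (A, T) minus strong H-bond bases (G, C)
STEP = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}


def z_curve(seq):
    """Return the cumulative (x_n, y_n, z_n) points for n = 1..len(seq)."""
    steps = [STEP[base] for base in seq.upper()]
    return list(
        accumulate(steps, lambda p, s: (p[0] + s[0], p[1] + s[1], p[2] + s[2]))
    )


def reconstruct(points):
    """Recover the sequence from Z-curve points (the mapping is one-to-one)."""
    inverse = {step: base for base, step in STEP.items()}
    prev = (0, 0, 0)
    bases = []
    for p in points:
        delta = (p[0] - prev[0], p[1] - prev[1], p[2] - prev[2])
        bases.append(inverse[delta])
        prev = p
    return "".join(bases)
```

How the curve is then inspected for replication-origin signals is not described in this abstract, so the sketch stops at the representation itself.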
Affiliation(s)
- Ren Zhang
- Department of Epidemiology and Biostatistics, Tianjin Cancer Institute and Hospital, Tianjin 300060, China
- Chun-Ting Zhang
- Department of Physics, Tianjin University, Tianjin 300072, China
- Corresponding author
|
81
|
Bapteste E, Brochier C, Boucher Y. Higher-level classification of the Archaea: evolution of methanogenesis and methanogens. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2005; 1:353-63. [PMID: 15876569 PMCID: PMC2685549 DOI: 10.1155/2005/859728] [Citation(s) in RCA: 147] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022]
Abstract
We used a phylogenetic approach to analyze the evolution of methanogenesis and methanogens. We show that 23 vertically transmitted ribosomal proteins do not support the monophyly of methanogens, and propose instead that there are two distantly related groups of extant archaea that produce methane, which we have named Class I and Class II. Based on this finding, we subsequently investigated the uniqueness of the origin of methanogenesis by studying both the enzymes of methanogenesis and the proteins that synthesize its specific coenzymes. We conclude that hydrogenotrophic methanogenesis appeared only once during evolution. Genes involved in the seven central steps of the methanogenic reduction of carbon dioxide (CO₂) are ubiquitous in methanogens and share a common history. This suggests that, although extant methanogens produce methane from various substrates (CO₂, formate, acetate, methylated C-1 compounds), these archaea have a core of conserved enzymes that have undergone little evolutionary change. Furthermore, this core of methanogenesis enzymes seems to originate (as a whole) from the last ancestor of all methanogens and does not appear to have been horizontally transmitted to other organisms or between members of Class I and Class II. The observation of a unique and ancestral form of methanogenesis suggests that it was preserved in two independent lineages, with some instances of specialization or added metabolic flexibility. It was likely lost in the Halobacteriales, Thermoplasmatales and Archaeoglobales. Given that fossil evidence for methanogenesis dates back 2.8 billion years, a unique origin of this process makes the methanogenic archaea a very ancient taxon.
Affiliation(s)
- Eric Bapteste
- Department of Biochemistry and Molecular Biology, Dalhousie University, Sir Charles Tupper Building, Halifax, NS, B3H 4H7, Canada
|
82
|
Brochier C, Forterre P, Gribaldo S. An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences. BMC Evol Biol 2005; 5:36. [PMID: 15932645 PMCID: PMC1177939 DOI: 10.1186/1471-2148-5-36] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.2] [Received: 01/13/2005] [Accepted: 06/02/2005] [Indexed: 12/01/2022] Open
Abstract
Background: The concept of a genomic core, defined as the set of genes ubiquitous in all genomes of a monophyletic group, has become crucial in comparative and evolutionary genomics. However, it is still a matter of debate whether lateral gene transfers (LGT) may affect the components of genomic cores, preventing their use to retrace species evolution. We have recently reconstructed the phylogeny of Archaea by using two large concatenated datasets of core proteins involved in translation and transcription, respectively. The resulting trees were largely congruent, showing that informational gene components of the archaeal genomic core belonging to two distinct molecular systems contain a coherent signal for archaeal phylogeny. However, some incongruence remained between the two phylogenies. This may be due either to undetected LGT and/or to a lack of sufficient phylogenetic signal in the datasets. Results: We present evidence strongly favoring the latter hypothesis. In fact, we have updated our transcription and translation datasets with five new archaeal genomes for a total of 6384 and 2928 amino acid positions, respectively, and 25 taxa. This increase in taxonomic sampling led to the nearly complete convergence of the transcription-based and translation-based trees on a single phylogenetic pattern for archaeal evolution. In fact, only a single incongruence persisted between the two phylogenies. This concerned Methanopyrus kandleri, whose placement remained strongly biased in the transcription tree due to its above average evolutionary rates, and could not be counterbalanced due to the lack of availability of closely related and/or slower-evolving relatives. Conclusion: To our knowledge, this is the first report of evidence that the phylogenetic signal harbored by components of the archaeal translation apparatus is confirmed by additional markers belonging to a second molecular system (i.e. transcription). This rules out the risk of circularity when inferring species evolution by small subunit ribosomal RNA and ribosomal protein sequences, since it has been suggested that concerted LGT may affect these markers. Our results strongly support the existence of a core of proteins that has evolved mainly through vertical inheritance in Archaea, and carries a bona fide phylogenetic signal that can be used to retrace the evolutionary history of this domain. The identification and analysis of additional molecular markers not affected by LGT should continue defining the emerging picture of a genuine phylogenetic core for the third domain of life.
Affiliation(s)
- Céline Brochier
- Laboratoire EGEE (Evolution, Génomique, Environnement) Université Aix-Marseille I, Centre Saint-Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille, Cedex 3, France
- Patrick Forterre
- Unite Biologie Moléculaire du Gène chez les Extremophiles, Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France
- Simonetta Gribaldo
- Unite Biologie Moléculaire du Gène chez les Extremophiles, Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France
- Atelier de Bioinformatique, Université Paris 6, 12 rue Cuvier, Paris, France
|
83
|
Bapteste E, Susko E, Leigh J, MacLeod D, Charlebois RL, Doolittle WF. Do orthologous gene phylogenies really support tree-thinking? BMC Evol Biol 2005; 5:33. [PMID: 15913459 PMCID: PMC1156881 DOI: 10.1186/1471-2148-5-33] [Citation(s) in RCA: 148] [Impact Index Per Article: 7.8] [Received: 04/01/2005] [Accepted: 05/24/2005] [Indexed: 11/17/2022] Open
Abstract
Background: Since Darwin's Origin of Species, reconstructing the Tree of Life has been a goal of evolutionists, and tree-thinking has become a major concept of evolutionary biology. Practically, building the Tree of Life has proven to be tedious. Too few morphological characters are useful for conducting conclusive phylogenetic analyses at the highest taxonomic level. Consequently, molecular sequences (genes, proteins, and genomes) likely constitute the only useful characters for constructing a phylogeny of all life. For this reason, tree-makers expect a lot from gene comparisons. The simultaneous study of the largest number of molecular markers possible is sometimes considered to be one of the best solutions in reconstructing the genealogy of organisms. This conclusion is a direct consequence of tree-thinking: if gene inheritance conforms to a tree-like model of evolution, sampling more of these molecules will provide enough phylogenetic signal to build the Tree of Life. The selection of congruent markers is thus a fundamental step in simultaneous analysis of many genes. Results: Heat map analyses were used to investigate the congruence of orthologues in four datasets (archaeal, bacterial, eukaryotic and alpha-proteobacterial). We conclude that we simply cannot determine if a large portion of the genes have a common history. In addition, none of these datasets can be considered free of lateral gene transfer. Conclusion: Our phylogenetic analyses do not support tree-thinking. These results have important conceptual and practical implications. We argue that representations other than a tree should be investigated in this case because a non-critical concatenation of markers could be highly misleading.
Affiliation(s)
- E Bapteste
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Biochemistry & Molecular Biology, 5850 College St., Halifax, NS, B3H 1X5, Canada
- E Susko
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Mathematics and Statistics, Halifax, Nova Scotia, Canada
- J Leigh
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Biochemistry & Molecular Biology, 5850 College St., Halifax, NS, B3H 1X5, Canada
- D MacLeod
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Biochemistry & Molecular Biology, 5850 College St., Halifax, NS, B3H 1X5, Canada
- RL Charlebois
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Biochemistry & Molecular Biology, 5850 College St., Halifax, NS, B3H 1X5, Canada
- WF Doolittle
- GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
- Dalhousie University, Department of Biochemistry & Molecular Biology, 5850 College St., Halifax, NS, B3H 1X5, Canada
|
84
|
Brochier C, Gribaldo S, Zivanovic Y, Confalonieri F, Forterre P. Nanoarchaea: representatives of a novel archaeal phylum or a fast-evolving euryarchaeal lineage related to Thermococcales? Genome Biol 2005; 6:R42. [PMID: 15892870 PMCID: PMC1175954 DOI: 10.1186/gb-2005-6-5-r42] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.9] [Received: 12/03/2004] [Revised: 02/10/2005] [Accepted: 03/09/2005] [Indexed: 11/13/2022] Open
Abstract
An analysis of the position of Nanoarchaeum equitans in the archaeal phylogeny using a large dataset of concatenated ribosomal proteins from 25 archaeal genomes suggests that N. equitans is likely to be the representative of a fast-evolving euryarchaeal lineage. Background: Cultivable archaeal species are assigned to two phyla - the Crenarchaeota and the Euryarchaeota - by a number of important genetic differences, and this ancient split is strongly supported by phylogenetic analysis. The recently described hyperthermophile Nanoarchaeum equitans, harboring the smallest cellular genome ever sequenced (480 kb), has been suggested as the representative of a new phylum - the Nanoarchaeota - that would have diverged before the Crenarchaeota/Euryarchaeota split. Confirming the phylogenetic position of N. equitans is thus crucial for deciphering the history of the archaeal domain. Results: We tested the placement of N. equitans in the archaeal phylogeny using a large dataset of concatenated ribosomal proteins from 25 archaeal genomes. We indicate that the placement of N. equitans in archaeal phylogenies on the basis of ribosomal protein concatenation may be strongly biased by the coupled effect of its above-average evolutionary rate and lateral gene transfers. Indeed, we show that different subsets of ribosomal proteins harbor a conflicting phylogenetic signal for the placement of N. equitans. A BLASTP-based survey of the phylogenetic pattern of all open reading frames (ORFs) in the genome of N. equitans revealed a surprisingly high fraction of close hits with Euryarchaeota, notably Thermococcales. Strikingly, a specific affinity of N. equitans and Thermococcales was strongly supported by phylogenies based on a subset of ribosomal proteins, and on a number of unrelated molecular markers. Conclusion: We suggest that N. equitans may more probably be the representative of a fast-evolving euryarchaeal lineage (possibly related to Thermococcales) than the representative of a novel and early diverging archaeal phylum.
Affiliation(s)
- Celine Brochier
- EA EGEE (Evolution, Génomique, Environnement) Université Aix-Marseille I, Centre Saint-Charles, 3 Place Victor Hugo, 13331 Marseille, Cedex 3, France.
|
85
|
Blair JE, Shah P, Hedges SB. Evolutionary sequence analysis of complete eukaryote genomes. BMC Bioinformatics 2005; 6:53. [PMID: 15762985 PMCID: PMC1274250 DOI: 10.1186/1471-2105-6-53] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 10/25/2004] [Accepted: 03/11/2005] [Indexed: 11/29/2022] Open
Abstract
Background: Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation. Results: Despite the conservative criterion used, 753 panorthologs (proteins) were identified for evolutionary analysis with four genomes, resulting in a single alignment of 287,000 amino acids. With this data set, we estimate that the divergence between deuterostomes and arthropods took place in the Precambrian, approximately 400 million years before the first appearance of animals in the fossil record. Additional analyses were performed with seven, 12, and 15 eukaryote genomes resulting in similar divergence time estimates and phylogenies. Conclusion: Our results with available eukaryote genomes agree with previous results using conventional methods of sequence data assembly from genomes. They show that large sequence data sets can be generated relatively quickly and efficiently for evolutionary analyses of complete genomes.
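The single-copy criterion described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the input format (a mapping from cluster ID to per-genome gene lists) and the names `find_panorthologs`, `clusters`, and `genomes` are hypothetical and not taken from the paper, whose actual pipeline is not detailed in this abstract.

```python
def find_panorthologs(clusters, genomes):
    """Keep only clusters with exactly one gene in every genome.

    clusters: dict mapping a cluster ID to {genome name: [gene IDs]}
              (a hypothetical input format chosen for this sketch).
    genomes:  iterable of genome names that must all be represented.
    """
    genomes = list(genomes)
    return {
        cid: {g: members[g][0] for g in genomes}
        for cid, members in clusters.items()
        # A cluster is dropped if any genome lacks a gene or has duplicates.
        if all(len(members.get(g, [])) == 1 for g in genomes)
    }


# Toy example with made-up identifiers.
example_clusters = {
    "c1": {"human": ["h1"], "fly": ["f1"], "yeast": ["y1"]},
    "c2": {"human": ["h2", "h3"], "fly": ["f2"], "yeast": ["y2"]},  # duplication
    "c3": {"human": ["h4"], "fly": ["f3"]},                         # gene loss
}
example_result = find_panorthologs(example_clusters, ["human", "fly", "yeast"])
```

In the toy example only `c1` survives, since `c2` contains a duplicated gene and `c3` is missing from one genome.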
Affiliation(s)
- Jaime E Blair
- NASA Astrobiology Institute and Department of Biology, The Pennsylvania State University, 208 Mueller Laboratory, University Park, Pennsylvania 16802-5301, USA
- Prachi Shah
- NASA Astrobiology Institute and Department of Biology, The Pennsylvania State University, 208 Mueller Laboratory, University Park, Pennsylvania 16802-5301, USA
- S Blair Hedges
- NASA Astrobiology Institute and Department of Biology, The Pennsylvania State University, 208 Mueller Laboratory, University Park, Pennsylvania 16802-5301, USA
|
86
|
Erkel C, Kemnitz D, Kube M, Ricke P, Chin KJ, Dedysh S, Reinhardt R, Conrad R, Liesack W. Retrieval of first genome data for rice cluster I methanogens by a combination of cultivation and molecular techniques. FEMS Microbiol Ecol 2005; 53:187-204. [PMID: 16329940 DOI: 10.1016/j.femsec.2004.12.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 07/28/2004] [Revised: 11/05/2004] [Accepted: 12/09/2004] [Indexed: 01/08/2023] Open
Abstract
We report first insights into a representative genome of rice cluster I (RC-I), a major group of as-yet uncultured methanogens. The starting point of our study was the methanogenic consortium MRE50 that had been stably maintained for 3 years by consecutive transfers to fresh medium and anaerobic incubation at 50 °C. Process-oriented measurements provided evidence for hydrogenotrophic CO₂-reducing methanogenesis. Assessment of the diversity of consortium MRE50 suggested members of the families Thermoanaerobacteriaceae and Clostridiaceae to constitute the major bacterial component, while the archaeal population was represented entirely by RC-I. The RC-I population amounted to more than 50% of total cells, as concluded from fluorescence in situ hybridization using specific probes for either Bacteria or Archaea. The high enrichment status of RC-I prompted construction of a large insert fosmid library from consortium MRE50. Comparative sequence analysis of internal transcribed spacer (ITS) regions revealed that three different RC-I rrn operon variants were present in the fosmid library. Three genomic fragments of approximately 40 kb, each representative of one of the three different rrn operon variants, were recovered and sequenced. Computational analysis of the sequence data resulted in two major findings: (i) consortium MRE50 most likely harbours only a single RC-I genotype, which is characterized by multiple rrn operon copies; (ii) seven genes were identified to possess a strong phylogenetic signal (eIF2a, dnaG, priA, pcrA, gatD, gatE, and a gene encoding a putative RNA-binding protein). Trees exemplarily computed for the deduced amino acid sequences of eIF2a, dnaG, and priA corroborated a specific phylogenetic association of RC-I with the Methanosarcinales.
Affiliation(s)
- Christoph Erkel
- Max-Planck-Institut für Terrestrische Mikrobiologie, Marburg, Germany
|
87
|
A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol Biol 2004; 4:44. [PMID: 15535883 PMCID: PMC533871 DOI: 10.1186/1471-2148-4-44] [Citation(s) in RCA: 314] [Impact Index Per Article: 15.7] [Received: 08/07/2004] [Accepted: 11/09/2004] [Indexed: 11/10/2022] Open
Abstract
Background: The timescale of prokaryote evolution has been difficult to reconstruct because of a limited fossil record and complexities associated with molecular clocks and deep divergences. However, the relatively large number of genome sequences currently available has provided a better opportunity to control for potential biases such as horizontal gene transfer and rate differences among lineages. We assembled a data set of sequences from 32 proteins (~7600 amino acids) common to 72 species and estimated phylogenetic relationships and divergence times with a local clock method. Results: Our phylogenetic results support most of the currently recognized higher-level groupings of prokaryotes. Of particular interest is a well-supported group of three major lineages of eubacteria (Actinobacteria, Deinococcus, and Cyanobacteria) that we call Terrabacteria and associate with an early colonization of land. Divergence time estimates for the major groups of eubacteria are between 2.5–3.2 billion years ago (Ga) while those for archaebacteria are mostly between 3.1–4.1 Ga. The time estimates suggest a Hadean origin of life (prior to 4.1 Ga), an early origin of methanogenesis (3.8–4.1 Ga), an origin of anaerobic methanotrophy after 3.1 Ga, an origin of phototrophy prior to 3.2 Ga, an early colonization of land 2.8–3.1 Ga, and an origin of aerobic methanotrophy 2.5–2.8 Ga. Conclusions: Our early time estimates for methanogenesis support the consideration of methane, in addition to carbon dioxide, as a greenhouse gas responsible for the early warming of the Earth's surface. Our divergence times for the origin of anaerobic methanotrophy are compatible with highly depleted carbon isotopic values found in rocks dated 2.8–2.6 Ga. An early origin of phototrophy is consistent with the earliest bacterial mats and structures identified as stromatolites, but a 2.6 Ga origin of cyanobacteria suggests that those Archean structures, if biologically produced, were made by anoxygenic photosynthesizers. The resistance to desiccation of Terrabacteria and their elaboration of photoprotective compounds suggests that the common ancestor of this group inhabited land. If true, then oxygenic photosynthesis may owe its origin to terrestrial adaptations.
|
88
|
Rivera MC, Lake JA. The ring of life provides evidence for a genome fusion origin of eukaryotes. Nature 2004; 431:152-5. [PMID: 15356622 DOI: 10.1038/nature02848] [Citation(s) in RCA: 258] [Impact Index Per Article: 12.9] [Received: 01/29/2004] [Accepted: 07/15/2004] [Indexed: 10/26/2022]
Abstract
Genomes hold within them the record of the evolution of life on Earth. But genome fusions and horizontal gene transfer seem to have obscured sufficiently the gene sequence record such that it is difficult to reconstruct the phylogenetic tree of life. Here we determine the general outline of the tree using complete genome data from representative prokaryotes and eukaryotes and a new genome analysis method that makes it possible to reconstruct ancient genome fusions and phylogenetic trees. Our analyses indicate that the eukaryotic genome resulted from a fusion of two diverse prokaryotic genomes, and therefore at the deepest levels linking prokaryotes and eukaryotes, the tree of life is actually a ring of life. One fusion partner branches from deep within an ancient photosynthetic clade, and the other is related to the archaeal prokaryotes. The eubacterial organism is either a proteobacterium, or a member of a larger photosynthetic clade that includes the Cyanobacteria and the Proteobacteria.
Affiliation(s)
- Maria C Rivera
- Molecular Biology Institute, MCD Biology, University of California, Los Angeles 90095, USA
|