101
Abstract
Two significant evolutionary processes are fundamentally not tree-like in nature: lateral gene transfer among prokaryotes and endosymbiotic gene transfer (from organelles) among eukaryotes. To incorporate such processes into the bigger picture of early evolution, biologists need to depart from the preconceived notion that all genomes are related by a single bifurcating tree.
Affiliation(s)
- Tal Dagan
- Institute of Botany, University of Düsseldorf, D-40225 Düsseldorf, Germany.
102
Whitaker RJ, Banfield JF. Population genomics in natural microbial communities. Trends Ecol Evol 2006; 21:508-16. [PMID: 16859806 DOI: 10.1016/j.tree.2006.07.001] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Received: 10/14/2005] [Revised: 05/02/2006] [Accepted: 07/03/2006] [Indexed: 11/24/2022]
Abstract
Little is known about the evolutionary processes that structure and maintain microbial diversity because, until recently, it was difficult to explore individual-level patterns of variation at the microbial scale. Now, community-genomic sequence data enable such variation to be assessed across large segments of microbial genomes. Here, we discuss how population-genomic analysis of these data can be used to determine how selection and genetic exchange shape the evolution of new microbial lineages. We show that once independent lineages have been identified, such analyses enable the identification of genome changes that drive niche differentiation and promote the coexistence of closely related lineages within the same environment. We suggest that understanding the evolutionary ecology of natural microbial populations through population-genomic analyses will enhance our understanding of genome evolution across all domains of life.
Affiliation(s)
- Rachel J Whitaker
- Ecosystem Sciences, 137 Mulford Hall, University of California, Berkeley, CA 94720-3114, USA.
103
Gribaldo S, Brochier-Armanet C. The origin and evolution of Archaea: a state of the art. Philos Trans R Soc Lond B Biol Sci 2006; 361:1007-22. [PMID: 16754611 PMCID: PMC1578729 DOI: 10.1098/rstb.2006.1841] [Citation(s) in RCA: 141] [Impact Index Per Article: 7.8] [Indexed: 01/24/2023] Open
Abstract
Environmental surveys indicate that the Archaea are diverse and abundant not only in extreme environments, but also in soil, oceans and freshwater, where they may fulfil a key role in the biogeochemical cycles of the planet. Archaea display unique capacities, such as methanogenesis and survival at temperatures higher than 90 °C, that make them crucial for understanding the nature of the biota of early Earth. Molecular, genomic and phylogenetic data strengthen Woese's definition of Archaea as a third domain of life in addition to Bacteria and Eukarya. Phylogenomic analyses of the components of different molecular systems are highlighting a core of mainly vertically inherited genes in Archaea. This allows the recovery of a globally well-resolved picture of archaeal evolution, as opposed to what is observed for Bacteria and Eukarya. This may be because no rapid divergence occurred at the emergence of present-day archaeal lineages. This phylogeny supports a hyperthermophilic and non-methanogenic ancestor to present-day archaeal lineages, and a profound divergence between two major phyla, the Crenarchaeota and the Euryarchaeota, that may not have an equivalent in the other two domains of life. Nanoarchaea may not represent a third and ancestral archaeal phylum, but a fast-evolving euryarchaeal lineage. Methanogenesis seems to have appeared only once and early in the evolution of Euryarchaeota. Filling in this picture of archaeal evolution by adding presently uncultivated species, and placing it back in geological time, remain two essential goals for the future.
Affiliation(s)
- Simonetta Gribaldo
- Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extremophiles, 25 rue du Dr Roux, 75724 Paris Cedex 15, France.
104
Delmotte F, Rispe C, Schaber J, Silva FJ, Moya A. Tempo and mode of early gene loss in endosymbiotic bacteria from insects. BMC Evol Biol 2006; 6:56. [PMID: 16848891 PMCID: PMC1544356 DOI: 10.1186/1471-2148-6-56] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 04/11/2006] [Accepted: 07/18/2006] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Understanding evolutionary processes that drive genome reduction requires determining the tempo (rate) and the mode (size and types of deletions) of gene losses. In this study, we analysed five endosymbiotic genome sequences of the gamma-proteobacteria (three different Buchnera aphidicola strains, Wigglesworthia glossinidia, Blochmannia floridanus) to test if gene loss could be driven by the selective importance of genes. We used a parsimony method to reconstruct a minimal ancestral genome of insect endosymbionts and quantified gene loss along the branches of the phylogenetic tree. To evaluate the selective or functional importance of genes, we used a parameter that measures the level of adaptive codon bias in E. coli (i.e. codon adaptation index, or CAI), and also estimates of evolutionary rates (Ka) between pairs of orthologs either in free-living bacteria or in pairs of symbionts. RESULTS Our results demonstrate that genes lost in the early stages of symbiosis were on average less selectively constrained than genes conserved in any of the extant symbiotic strains studied. These results also extend to more recent events of gene losses (i.e. among Buchnera strains) that still tend to concentrate on genes with low adaptive bias in E. coli and high evolutionary rates both in free-living and in symbiotic lineages. In addition, we analyzed the physical organization of gene losses for early steps of symbiosis acquisition under the hypothesis of a common origin of different symbioses. In contrast with previous findings, we show that gene losses mostly occurred through loss of rather small blocks and mostly in syntenic regions between at least one of the symbionts and present-day E. coli.
CONCLUSION At both ancient and recent stages of symbiosis evolution, gene loss was at least partially influenced by selection, with highly conserved genes being retained more readily than poorly conserved genes: although losses might result from drift due to the bottlenecking of endosymbiotic populations, we demonstrated that purifying selection also acted by retaining genes of greater selective importance.
Affiliation(s)
- F Delmotte
- UMR Santé Végétale (INRA-ENITAB), INRA BP81, 33883 Villenave d'Ornon Cedex, France
- C Rispe
- UMR Biologie des Organismes et des Populations appliquée à la Protection des Plantes [BIO3P], INRA BP 35327, 35653 Le Rheu Cedex, France
- J Schaber
- Max Planck Institute for Molecular Genetics, Ihnestrasse 63–73, 14196 Berlin, Germany
- FJ Silva
- Instituto Cavanilles de Biodiversidad y Biologia Evolutiva, Universidad de Valencia, A.C. 22085, 46071 Valencia, Spain
- A Moya
- Instituto Cavanilles de Biodiversidad y Biologia Evolutiva, Universidad de Valencia, A.C. 22085, 46071 Valencia, Spain
105
Hao W, Golding GB. The fate of laterally transferred genes: life in the fast lane to adaptation or death. Genome Res 2006; 16:636-43. [PMID: 16651664 PMCID: PMC1457040 DOI: 10.1101/gr.4746406] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.0] [Indexed: 11/24/2022]
Abstract
Large-scale genome arrangement plays an important role in bacterial genome evolution. A substantial number of genes can be inserted into, deleted from, or rearranged within genomes during evolution. Detecting or inferring gene insertions/deletions is of interest because such information provides insights into bacterial genome evolution and speciation. However, efficient inference of genome events is difficult because genome comparisons alone do not generally supply enough information to distinguish insertions, deletions, and other rearrangements. In this study, homologous genes from the complete genomes of 13 closely related bacteria were examined. The presence or absence of genes from each genome was cataloged, and a maximum likelihood method was used to infer insertion/deletion rates according to the phylogenetic history of the taxa. It was found that whole gene insertions/deletions in genomes occur at rates comparable to or greater than the rate of nucleotide substitution and that higher insertion/deletion rates are often inferred to be present at the tips of the phylogeny with lower rates on more ancient interior branches. Recently transferred genes are under faster and relaxed evolution compared with more ancient genes. Together, this implies that many of the lineage-specific insertions are lost quickly during evolution and that perhaps a few of the genes inserted by lateral transfer are niche specific.
Affiliation(s)
- Weilong Hao
- Department of Biology, McMaster University, Hamilton, Ontario, Canada L8S 4K1
- G. Brian Golding
- Department of Biology, McMaster University, Hamilton, Ontario, Canada L8S 4K1
- Corresponding author. Fax (905) 522-6066
106
Mau B, Glasner JD, Darling AE, Perna NT. Genome-wide detection and analysis of homologous recombination among sequenced strains of Escherichia coli. Genome Biol 2006; 7:R44. [PMID: 16737554 PMCID: PMC1779527 DOI: 10.1186/gb-2006-7-5-r44] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 11/01/2005] [Revised: 02/08/2006] [Accepted: 05/08/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Comparisons of complete bacterial genomes reveal evidence of lateral transfer of DNA across otherwise clonally diverging lineages. Some lateral transfer events result in acquisition of novel genomic segments and are easily detected through genome comparison. Other more subtle lateral transfers involve homologous recombination events that result in substitution of alleles within conserved genomic regions. This type of event is observed infrequently among distantly related organisms. It is reported to be more common within species, but the frequency has been difficult to quantify since the sequences under comparison tend to have relatively few polymorphic sites. RESULTS Here we report a genome-wide assessment of homologous recombination among a collection of six complete Escherichia coli and Shigella flexneri genome sequences. We construct a whole-genome multiple alignment and identify clusters of polymorphic sites that exhibit atypical patterns of nucleotide substitution using a random walk-based method. The analysis reveals one large segment (approximately 100 kb) and 186 smaller clusters of single base pair differences that suggest lateral exchange between lineages. These clusters include portions of 10% of the 3,100 genes conserved in six genomes. Statistical analysis of the functional roles of these genes reveals that several classes of genes are over-represented, including those involved in recombination, transport and motility. CONCLUSION We demonstrate that intraspecific recombination in E. coli is much more common than previously appreciated and may show a bias for certain types of genes. The described method provides high-specificity, conservative inference of past recombination events.
Affiliation(s)
- Bob Mau
- Department of Mathematics, Lincoln Drive, University of Wisconsin, Madison WI 53706, USA
- Department of Oncology, University Ave, University of Wisconsin, Madison WI 53706, USA
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Jeremy D Glasner
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Aaron E Darling
- Department of Computer Science, W. Dayton St, University of Wisconsin, Madison WI 53706, USA
- Nicole T Perna
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Department of Animal Health and Biomedical Sciences, Linden Drive, University of Wisconsin, Madison WI 53706, USA
107
Abstract
Exponentially accumulating genetic molecular data were supposed to bring us closer to resolving one of the most fundamental issues in biology: the reconstruction of the tree of life. This tree should encompass the evolutionary history of all living creatures on Earth and trace back a few billion years to the most ancient microbial ancestor. Ironically, this abundance of data only blurs our traditional beliefs and seems to make this goal harder to achieve than initially thought. This is largely due to lateral gene transfer, the passage of genetic material between organisms other than through lineal descent. Evolution in light of lateral transfer tangles the traditional universal tree of life, turning it into a network of relationships. Lateral transfer is a significant factor in microbial evolution and is the mechanism by which antibiotic resistance spreads among bacterial species.
In this paper we survey current methods designed to cope with lateral transfer in conjunction with vertical inheritance. We distinguish between phylogenetic-based methods and sequence-based methods and illuminate the advantages and disadvantages of each. Finally, we sketch a new statistically rigorous approach aimed at identifying lateral transfer between two genomes.
Affiliation(s)
- Sagi Snir
- Institute of Evolution, University of Haifa, 31905 Haifa, Israel and Department of Computer Science, Netanya Academic College
108
Lebrun E, Santini JM, Brugna M, Ducluzeau AL, Ouchane S, Schoepp-Cothenet B, Baymann F, Nitschke W. The Rieske Protein: A Case Study on the Pitfalls of Multiple Sequence Alignments and Phylogenetic Reconstruction. Mol Biol Evol 2006; 23:1180-91. [PMID: 16569761 DOI: 10.1093/molbev/msk010] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open
Abstract
Previously published phylogenetic trees reconstructed on "Rieske protein" sequences frequently are at odds with each other, with those of other subunits of the parent enzymes and with small-subunit rRNA trees. These differences are shown to be at least partially if not completely due to problems in the reconstruction procedures. A major source of erroneous Rieske protein trees lies in the presence of a large, poorly conserved domain prone to accommodate very long insertions in well-defined structural hot spots substantially hampering multiple alignments. The remaining smaller domain, in contrast, is too conserved to allow distant phylogenies to be deduced with sufficient confidence. Three-dimensional structures of representatives from this protein family are now available from phylogenetically distant species and from diverse enzymes. Multiple alignments can thus be refined on the basis of these structures. We show that structurally guided alignments of Rieske proteins from Rieske-cytochrome b complexes and arsenite oxidases strongly reduce conflicts between resulting trees and those obtained on their companion enzyme subunits. Further problems encountered during this work, mainly consisting in database errors such as wrong annotations and frameshifts, are described. The obtained results are discussed against the background of hypotheses stipulating pervasive lateral gene transfer in prokaryotes.
Affiliation(s)
- Evelyne Lebrun
- Laboratoire de Bioénergétique et Ingénierie des Protéines, Institut de Biologie Structurale et Microbiologie (IFR), Marseille, France
109
Budowle B, Johnson MD, Fraser CM, Leighton TJ, Murch RS, Chakraborty R. Genetic analysis and attribution of microbial forensics evidence. Crit Rev Microbiol 2006; 31:233-54. [PMID: 16417203 DOI: 10.1080/10408410500304082] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 01/21/2023]
Abstract
Because of the availability of pathogenic microorganisms and the relatively low cost of preparing and disseminating bioweapons, there is a continuing threat of biocrime and bioterrorism. Thus, enhanced capabilities are needed that enable the full and robust forensic exploitation and interpretation of microbial evidence from acts of bioterrorism or biocrimes. To respond to the need, greater resources and efforts are being applied to the burgeoning field of microbial forensics. Microbial forensics focuses on the characterization, analysis and interpretation of evidence for attributional purposes from a bioterrorism act, biocrime, hoax or inadvertent agent release. To enhance attribution capabilities, a major component of microbial forensics is the analysis of nucleic acids to associate or eliminate putative samples. The degree to which attribution can be addressed depends on the context of the case, the available knowledge of the genetics, phylogeny, and ecology of the target microorganism, and the technologies applied. The types of genetic markers and features that can impact statistical inferences of microbial forensic evidence include: single nucleotide polymorphisms, repetitive sequences, insertions and deletions, mobile elements, pathogenicity islands, virulence and resistance genes, housekeeping genes, structural genes, whole genome sequences, asexual and sexual reproduction, horizontal gene transfer, conjugation, transduction, lysogeny, gene conversion, recombination, gene duplication, rearrangements, and mutational hotspots. Nucleic acid based typing technologies include: PCR, real-time PCR, MLST, MLVA, whole genome sequencing, and microarrays.
110
Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P. Toward automatic reconstruction of a highly resolved tree of life. Science 2006; 311:1283-7. [PMID: 16513982 DOI: 10.1126/science.1123061] [Citation(s) in RCA: 1065] [Impact Index Per Article: 59.2] [Indexed: 11/02/2022]
Abstract
We have developed an automatable procedure for reconstructing the tree of life with branch lengths comparable across all three domains. The tree has its basis in a concatenation of 31 orthologs occurring in 191 species with sequenced genomes. It revealed interdomain discrepancies in taxonomic classification. Systematic detection and subsequent exclusion of products of horizontal gene transfer increased phylogenetic resolution, allowing us to confirm accepted relationships and resolve disputed and preliminary classifications. For example, we place the phylum Acidobacteria as a sister group of delta-Proteobacteria, support a Gram-positive origin of Bacteria, and suggest a thermophilic last universal common ancestor.
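The concatenation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `concatenate_alignments`, the toy taxa and the toy sequences are all assumptions made for illustration, and the sketch assumes each ortholog has already been aligned separately. Taxa missing from a given ortholog alignment are padded with gap characters so that every row of the resulting supermatrix has the same length.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments into one supermatrix (hypothetical helper).

    alignments: list of dicts mapping taxon name -> aligned sequence;
                all sequences inside one dict must have the same length.
    taxa:       list of taxon names to include in the output.
    """
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("each alignment needs sequences of one common length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene is represented by a run of gaps.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy data (invented): two aligned "orthologs", the second missing in taxon B.
toy_alignments = [
    {"A": "MKV-L", "B": "MKVAL", "C": "MRVAL"},
    {"A": "GHT", "C": "GHS"},
]
supermatrix = concatenate_alignments(toy_alignments, ["A", "B", "C"])
for taxon, row in supermatrix.items():
    print(taxon, row)
```

A real analysis would also need ortholog detection, alignment trimming and the screening for horizontally transferred genes that the abstract describes; none of that is attempted here.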
111
Liu J, Glazko G, Mushegian A. Protein repertoire of double-stranded DNA bacteriophages. Virus Res 2006; 117:68-80. [PMID: 16490276 DOI: 10.1016/j.virusres.2006.01.015] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 09/07/2005] [Revised: 01/11/2006] [Accepted: 01/18/2006] [Indexed: 01/21/2023]
Abstract
The complexity and diversity of phage gene sets, which are produced by rapid evolution of phage genomes and rampant gene exchanges among phages, hamper efforts to decipher the evolutionary relationships between individual phage proteins and reconstruct the complete set of evolutionary events leading to the known phages. To start unraveling the natural history of phages, we built the phage orthologous groups (POGs), a natural system of phage protein families that includes 6378 genes from 164 complete genome sequences of double-stranded DNA bacteriophages. Phage proteomes have high POG coverage: on average, 39 genes per phage genome belong to POGs, which is close to half of all genes in most phages. In agreement with the notion of a phage role in horizontal gene transfer, we see many cases of likely gene exchange between phages and their microbial hosts. At the same time, about 80% of all POGs are highly specific to phage genomes and are not commonly found in microbial genomes, indicating coherence and a large degree of evolutionary independence of phage gene sets. The information on orthologous genes is essential for evolutionary classification of known bacteriophages and for reconstruction of ancestral phage genomes.
Affiliation(s)
- Jing Liu
- Stowers Institute for Medical Research, 1000 E 50th St., Kansas City, MO 64110, USA
112
Teeling H, Gloeckner FO. RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits. BMC Bioinformatics 2006; 7:66. [PMID: 16476165 PMCID: PMC1421441 DOI: 10.1186/1471-2105-7-66] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 06/10/2005] [Accepted: 02/13/2006] [Indexed: 11/28/2022] Open
Abstract
Background Until today, analysis of 16S ribosomal RNA (rRNA) sequences has been the de facto gold standard for the assessment of phylogenetic relationships among prokaryotes. However, the branching order of the individual phyla is not well resolved in 16S rRNA-based trees. In search of an improvement, new phylogenetic methods have been developed alongside the growing availability of complete genome sequences. Unfortunately, only a few genes in prokaryotic genomes qualify as universal phylogenetic markers and almost all of them have a lower information content than the 16S rRNA gene. Therefore, emphasis has been placed on methods that are based on multiple genes or even entire genomes. The concatenation of ribosomal protein sequences is one method which has been ascribed an improved resolution. Since there is neither a comprehensive database for ribosomal protein sequences nor a tool that assists in sequence retrieval and generation of respective input files for phylogenetic reconstruction programs, RibAlign has been developed to fill this gap. Results RibAlign serves two purposes: first, it provides a fast and scalable database that has been specifically adapted to eubacterial ribosomal protein sequences, and second, it provides sophisticated import and export capabilities. This includes semi-automatic extraction of ribosomal protein sequences from whole-genome GenBank and FASTA files as well as exporting aligned, concatenated and filtered sequence files that can directly be used in conjunction with the PHYLIP and MrBayes phylogenetic reconstruction programs. Conclusion Up to now, phylogeny based on concatenated ribosomal protein sequences has been hampered by the limited set of sequenced genomes and high computational requirements. However, hundreds of full and draft genome sequencing projects are under way, and advances in cluster computing and algorithms make phylogenetic reconstructions feasible even with large alignments of concatenated marker genes.
RibAlign is a first step in this direction and may be particularly interesting to scientists involved in whole genome sequencing of representatives of new or sparsely studied eubacterial phyla. RibAlign is available at
Affiliation(s)
- Hanno Teeling
- Microbial Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
- Frank Oliver Gloeckner
- Microbial Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
- International University Bremen, D-28759 Bremen, Germany
113
Campillos M, von Mering C, Jensen LJ, Bork P. Identification and analysis of evolutionarily cohesive functional modules in protein networks. Genome Res 2006; 16:374-82. [PMID: 16449501 PMCID: PMC1415216 DOI: 10.1101/gr.4336406] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 11/25/2022]
Abstract
The increasing number of sequenced genomes makes it possible to infer the evolutionary history of functional modules, i.e., groups of proteins that contribute jointly to the same cellular function in a given species. Here we identify and analyze those prokaryotic functional modules, whose composition remains largely unchanged during evolution, and study their properties. Such "cohesive" modules have a large number of internal functional connections, encode genes that tend to be in close proximity in prokaryotic genomes, and correspond to physical complexes or complex functional systems like the flagellar apparatus. Cohesive modules are enriched in processes such as energy and amino acid metabolism, cell motility, and intracellular trafficking, or secretion. By grouping genes into modules we achieve a more precise estimate of their age and find that the young modules are often horizontally transferred between species and are enriched in functions involved in interactions with the environment, implying that they play an important role in the adaptation of species to new environments.
Affiliation(s)
- Mónica Campillos
- The European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
114
Abstract
Genome trees are a means to capture the overwhelming amount of phylogenetic information that is present in genomes. Different formalisms have been introduced to reconstruct genome trees on the basis of various aspects of the genome. On the basis of these aspects, we separate genome trees into five classes: (a) alignment-free trees based on statistical properties of the genome, (b) gene content trees based on the presence and absence of genes, (c) trees based on chromosomal gene order, (d) trees based on average sequence similarity, and (e) phylogenomics-based genome trees. Despite their recent development, genome tree methods have already had some impact on the phylogenetic classification of bacterial species. However, their main impact so far has been on our understanding of the nature of genome evolution and the role of horizontal gene transfer therein. An ideal genome tree method should be capable of using all gene families, including those containing paralogs, in a phylogenomics framework capitalizing on existing methods in conventional phylogenetic reconstruction. We expect such sophisticated methods to help us resolve the branching order between the main bacterial phyla.
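Class (b) above, gene content trees, starts from pairwise distances computed on gene presence and absence. The sketch below shows one simple way to obtain such distances, using the Jaccard distance between sets of gene family identifiers. The choice of Jaccard is an assumption made here for illustration (published gene content methods use various normalisations), and the function names and toy genomes are invented.

```python
from itertools import combinations


def gene_content_distance(genes_a, genes_b):
    """Jaccard distance between two gene repertoires (illustrative choice)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0  # two empty repertoires are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes):
    """Symmetric distance table keyed by (genome, genome) name pairs."""
    names = sorted(genomes)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = gene_content_distance(genomes[x], genomes[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Toy repertoires (invented gene family labels).
toy_genomes = {
    "g1": {"a", "b", "c", "d"},
    "g2": {"a", "b", "c"},
    "g3": {"a", "e"},
}
toy_dist = distance_matrix(toy_genomes)
for pair in sorted(toy_dist):
    print(pair, round(toy_dist[pair], 3))
```

The resulting table could be passed to any distance-based tree-building method such as neighbour joining; that step is left out to keep the sketch short.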
Affiliation(s)
- Berend Snel
- Center for Molecular and Biomolecular Informatics, Nijmegen, The Netherlands.
115
Than C, Ruths D, Innan H, Nakhleh L. Identifiability Issues in Phylogeny-Based Detection of Horizontal Gene Transfer. COMPARATIVE GENOMICS 2006. [DOI: 10.1007/11864127_17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/12/2023]
116
Abstract
No field of research has embraced and applied genomic technology more than the field of microbiology. Comparative analysis of nearly 300 microbial species has demonstrated that the microbial genome is a dynamic entity shaped by multiple forces. Microbial genomics has provided a foundation for a broad range of applications, from understanding basic biological processes, host-pathogen interactions, and protein-protein interactions, to discovering DNA variations that can be used in genotyping or forensic analyses, the design of novel antimicrobial compounds and vaccines, and the engineering of microbes for industrial applications. Most recently, metagenomics approaches are allowing us to begin to probe complex microbial communities for the first time, and they hold great promise in helping to unravel the relationships between microbial species.
117
Zagorski N. Profile of Nancy A. Moran. Proc Natl Acad Sci U S A 2005; 102:16916-8. [PMID: 16286644 PMCID: PMC1288003 DOI: 10.1073/pnas.0508498102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
118
Abstract
The ranks higher than the species in the prokaryotic taxonomy are primarily designated based on phylogenetic analysis of the 16S rRNA gene sequences, but no definite standards exist for the absolute relatedness (measured by 16S rRNA or other means) between the ranks. Accordingly, it remains unknown how comparable the ranks are between different organisms. To gain insights into this question, we studied the relationship between shared gene content and genetic relatedness for 175 fully sequenced strains, using as a robust measure of relatedness the average amino acid identity (AAI) of the shared genes. Our results reveal that adjacent ranks (e.g., phylum versus class) frequently show extensive overlap in terms of genetic and gene content relatedness of the grouped organisms, and hence, the current system is of limited predictive power in this respect. The overlap between nonadjacent ranks (e.g., phylum versus family) is generally limited and attributable to clear inconsistencies of the taxonomy. In addition to providing means for standardizing taxonomy, our AAI-based approach provides a means to evaluate the robustness of alternative genetic markers for phylogenetic purposes. For instance, the 23S rRNA gene was found to be as good a marker as the 16S rRNA gene, while several of the widely distributed protein-coding genes, such as the RNA polymerase and gyrase subunits, show a strong phylogenetic signal, albeit less strong than the rRNA genes (0.78 > R² > 0.69 for the protein-coding genes versus R² = 0.84 for the rRNA genes). The AAI approach outlined here could contribute significantly to a genome-based taxonomy for all microbial organisms.
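The AAI measure used in this abstract can be sketched as the mean, over shared genes, of the percent identity between the two genomes' protein copies. The sketch below is a simplification and not the authors' procedure: it assumes the shared genes have already been paired and aligned (the abstract does not say how, and genome-scale analyses typically rely on similarity searches to do this), it ignores gapped columns when counting identity, and all names and sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gapped columns are skipped (a simplifying assumption)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def average_amino_acid_identity(shared_gene_alignments):
    """Mean percent identity across pre-aligned pairs of shared genes."""
    identities = [percent_identity(a, b) for a, b in shared_gene_alignments]
    if not identities:
        raise ValueError("no shared genes supplied")
    return sum(identities) / len(identities)


# Toy pairs (invented): one aligned protein pair per shared gene.
toy_pairs = [("MKVAL", "MKVAL"), ("GHT-S", "GHSAS"), ("MRLK", "MKLK")]
toy_aai = average_amino_acid_identity(toy_pairs)
print(round(toy_aai, 2))
```

Averaging per gene, as done here, weights short and long genes equally; weighting by alignment length is an equally plausible design and would give a different value.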
119
Molenaar D, Bringel F, Schuren FH, de Vos WM, Siezen RJ, Kleerebezem M. Exploring Lactobacillus plantarum genome diversity by using microarrays. J Bacteriol 2005; 187:6119-27. [PMID: 16109953 PMCID: PMC1196139 DOI: 10.1128/jb.187.17.6119-6127.2005] [Citation(s) in RCA: 199] [Impact Index Per Article: 10.5] [Indexed: 11/20/2022] Open
Abstract
Lactobacillus plantarum is a versatile and flexible species that is encountered in a variety of niches and can utilize a broad range of fermentable carbon sources. To assess if this versatility is linked to a variable gene pool, microarrays containing a subset of small genomic fragments of L. plantarum strain WCFS1 were used to perform stringent genotyping of 20 strains of L. plantarum from various sources. The gene categories with the most genes conserved in all strains were those involved in biosynthesis or degradation of structural compounds like proteins, lipids, and DNA. Conversely, genes involved in sugar transport and catabolism were highly variable between strains. Moreover, besides the obvious regions of variance, like prophages, other regions varied between the strains, including regions encoding plantaricin biosynthesis, nonribosomal peptide biosynthesis, and exopolysaccharide biosynthesis. In many cases, these variable regions colocalized with regions of unusual base composition. Two large regions of flexibility were identified, between 2.70 and 2.85 Mb and between 3.10 and 3.29 Mb of the WCFS1 chromosome, the latter being close to the origin of replication. The majority of genes encoded in these variable regions are involved in sugar metabolism. This functional overrepresentation and the unusual base composition of these regions led to the hypothesis that they represented lifestyle adaptation regions in L. plantarum. The present study consolidates this hypothesis by showing that there is a high degree of gene content variation among L. plantarum strains in genes located in these regions of the WCFS1 genome. Interestingly, based on our genotyping data, L. plantarum strains clustered into two clearly distinguishable groups, which coincided with an earlier proposed subdivision of this species based on conventional methods.
Affiliation(s)
- Douwe Molenaar
- Wageningen Centre for Food Sciences; NIZO food research, P.O. Box 20, 6710 BA Ede, The Netherlands
|
120
|
Cummings MP, Meyer A. Magic bullets and golden rules: data sampling in molecular phylogenetics. ZOOLOGY 2005; 108:329-36. [PMID: 16351981 DOI: 10.1016/j.zool.2005.09.006] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 07/13/2005] [Revised: 09/22/2005] [Accepted: 09/23/2005] [Indexed: 11/23/2022]
Abstract
Data collection for molecular phylogenetic studies is based on samples of both genes and taxa. In an ideal world, with no limitations to resources, as many genes could be sampled as deemed necessary to address phylogenetic problems. Given limited resources in the real world, inadequate (in terms of choice of genes or number of genes) sequences or restricted taxon sampling can adversely affect the reliability or information gained in phylogenetics. Recent empirical and simulation-based studies of data sampling in molecular phylogenetics have reached differing conclusions on how to deal with these problems. Some advocated sampling more genes, others more taxa. There is certainly no 'magic bullet' that will fit all phylogenetic problems, and no specific 'golden rules' have been deduced, other than that single genes may not always contain sufficient phylogenetic information. However, several general conclusions and suggestions can be made. One suggestion is that the determination of a multiple, but moderate number (e.g., 6-10) of gene sequences might take precedence over sequencing a larger set of genes and thereby permit the sampling of more taxa for a phylogenetic study.
Affiliation(s)
- Michael P Cummings
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA.
|
121
|
Itaya M, Tsuge K, Koizumi M, Fujita K. Combining two genomes in one cell: stable cloning of the Synechocystis PCC6803 genome in the Bacillus subtilis 168 genome. Proc Natl Acad Sci U S A 2005; 102:15971-6. [PMID: 16236728 PMCID: PMC1276048 DOI: 10.1073/pnas.0503868102] [Citation(s) in RCA: 142] [Impact Index Per Article: 7.5] [Indexed: 11/18/2022]
Abstract
Cloning the whole 3.5-megabase (Mb) genome of the photosynthetic bacterium Synechocystis PCC6803 into the 4.2-Mb genome of the mesophilic bacterium Bacillus subtilis 168 resulted in a 7.7-Mb composite genome. We succeeded in such unprecedented large-size cloning by progressively assembling and editing contiguous DNA regions that cover the entire Synechocystis genome. The strain containing the two sets of genome grew only in the B. subtilis culture medium where all of the cloning procedures were carried out. The high structural stability of the cloned Synechocystis genome was closely associated with the symmetry of the bacterial genome structure of the DNA replication origin (oriC) and its termination (terC) and the exclusivity of Synechocystis ribosomal RNA operon genes (rrnA and rrnB). Given the significant diversity in genome structure observed upon horizontal DNA transfer in nature, our stable laboratory-generated composite genome raised fundamental questions concerning two complete genomes in one cell. Our megasize DNA cloning method, designated megacloning, may be generally applicable to other genomes or genome loci of free-living organisms.
Affiliation(s)
- Mitsuhiro Itaya
- Mitsubishi Kagaku Institute of Life Sciences, 11 Minamiooya, Machida-shi, Tokyo 194-8511, Japan.
|
122
|
Abstract
To what extent is the tree of life the best representation of the evolutionary history of microorganisms? Recent work has shown that, among sets of prokaryotic genomes in which most homologous genes show extremely low sequence divergence, gene content can vary enormously, implying that those genes that are variably present or absent are frequently horizontally transferred. Traditionally, successful horizontal gene transfer was assumed to provide a selective advantage to either the host or the gene itself, but could horizontally transferred genes be neutral or nearly neutral? We suggest that for many prokaryotes, the boundaries between species are fuzzy, and therefore the principles of population genetics must be broadened so that they can be applied to higher taxonomic categories.
Affiliation(s)
- J Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut 06269-3125, USA.
|
123
|
Srinivasan BS, Caberoy NB, Suen G, Taylor RG, Shah R, Tengra F, Goldman BS, Garza AG, Welch RD. Functional genome annotation through phylogenomic mapping. Nat Biotechnol 2005; 23:691-8. [PMID: 15940241 DOI: 10.1038/nbt1098] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022]
Abstract
Accurate determination of functional interactions among proteins at the genome level remains a challenge for genomic research. Here we introduce a genome-scale approach to functional protein annotation--phylogenomic mapping--that requires only sequence data, can be applied equally well to both finished and unfinished genomes, and can be extended beyond single genomes to annotate multiple genomes simultaneously. We have developed and applied it to more than 200 sequenced bacterial genomes. Proteins with similar evolutionary histories were grouped together, placed on a three dimensional map and visualized as a topographical landscape. The resulting phylogenomic maps display thousands of proteins clustered in mountains on the basis of coinheritance, a strong indicator of shared function. In addition to systematic computational validation, we have experimentally confirmed the ability of phylogenomic maps to predict both mutant phenotype and gene function in the delta proteobacterium Myxococcus xanthus.
Affiliation(s)
- Balaji S Srinivasan
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
|
124
|
Nightingale KK, Windham K, Wiedmann M. Evolution and molecular phylogeny of Listeria monocytogenes isolated from human and animal listeriosis cases and foods. J Bacteriol 2005; 187:5537-51. [PMID: 16077098 PMCID: PMC1196091 DOI: 10.1128/jb.187.16.5537-5551.2005] [Citation(s) in RCA: 159] [Impact Index Per Article: 8.4] [Indexed: 11/20/2022]
Abstract
To probe the evolution and phylogeny of Listeria monocytogenes from defined host species and environments, L. monocytogenes isolates from human (n = 60) and animal (n = 30) listeriosis cases and food samples (n = 30) were randomly selected from a larger collection of isolates (n = 354) obtained in New York State between 1999 and 2001. Partial sequencing of four housekeeping genes (gap, prs, purM, and ribC), one stress response gene (sigB), and two virulence genes (actA and inlA) revealed between 11 (gap) and 33 (inlA) allelic types as well as 52 sequence types (unique combination of allelic types). actA, ribC, and purM demonstrated the highest levels of nucleotide diversity (pi > 0.05). actA and inlA as well as prs and the hypervariable housekeeping genes ribC and purM showed evidence of horizontal gene transfer and recombination. actA and inlA also showed evidence of positive selection at specific amino acid sites. Maximum likelihood phylogenies for all seven genes confirmed that L. monocytogenes contains two deeply separated evolutionary lineages. Lineage I was found to be highly clonal, while lineage II showed greater diversity and evidence of horizontal gene transfer. Allelic types were exclusive to lineages, except for a single gap allele, and nucleotide distance within lineages was much lower than that between lineages, suggesting that genetic exchange between lineages is rare. Our data show that (i) L. monocytogenes is a highly diverse species with at least two distinct phylogenetic lineages differing in their evolutionary history and population structure and (ii) horizontal gene transfer as well as positive selection contributed to the evolution of L. monocytogenes.
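The nucleotide diversity (pi) threshold quoted in this abstract can be made concrete with a minimal sketch. It uses the standard definition of pi as the mean number of pairwise differences per aligned site; it is not the authors' pipeline, and the function name and the toy alignment are hypothetical. Gaps and ambiguous bases are treated as ordinary characters here, which is a simplifying assumption.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment (three 10-bp alleles), for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

On this toy alignment the three pairs differ at 1, 2, and 1 sites, so pi is 4 / (3 x 10), about 0.133, which would exceed the 0.05 level mentioned in the abstract.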
Affiliation(s)
- K K Nightingale
- Department of Food Science, Cornell University, 412B Stocking Hall, Ithaca, NY 14853, USA
|
125
|
Pazos F, Ranea JAG, Juan D, Sternberg MJE. Assessing Protein Co-evolution in the Context of the Tree of Life Assists in the Prediction of the Interactome. J Mol Biol 2005; 352:1002-15. [PMID: 16139301 DOI: 10.1016/j.jmb.2005.07.005] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.1] [Received: 01/20/2005] [Revised: 06/22/2005] [Accepted: 07/04/2005] [Indexed: 11/19/2022]
Abstract
The identification of the whole set of protein interactions taking place in an organism is one of the main tasks in genomics, proteomics and systems biology. One of the computational techniques used by many investigators for studying and predicting protein interactions is the comparison of evolutionary histories (phylogenetic trees), under the hypothesis that interacting proteins would be subject to a similar evolutionary pressure resulting in a similar topology of the corresponding trees. Here, we present a new approach to predict protein interactions from phylogenetic trees, which incorporates information on the overall evolutionary histories of the species (i.e. the canonical "tree of life") in order to correct for the expected background similarity due to the underlying speciation events. We test the new approach on the largest set of annotated interacting proteins for Escherichia coli. This assessment of co-evolution in the context of the tree of life leads to a highly significant improvement (P(N) by sign test approximately 10E-6) in predicting interaction partners with respect to the previous technique, which does not incorporate information on the overall speciation tree. For half of the proteins we found a real interactor among the 6.4% top scores, compared with the 16.5% by the previous method. We applied the new method to the whole E. coli proteome and propose functions for some hypothetical proteins based on their predicted interactors. The new approach also allows us to detect non-canonical evolutionary events, in particular horizontal gene transfers. We also show that taking into account these non-canonical evolutionary events when assessing the similarity between evolutionary trees improves the performance of the method predicting interactions.
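The background correction described in this abstract can be sketched roughly as follows. This is an illustrative assumption about how such a correction might look, not the authors' implementation: each protein's inter-species distance matrix has the species ("tree of life") distances subtracted, and the residuals of two proteins are then compared by Pearson correlation. The function names and the toy matrices are hypothetical, and the species distances are assumed to be already on the same scale as the protein distances.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant distances")
    return cov / sqrt(vx * vy)


def raw_similarity(prot_a, prot_b):
    """Uncorrected tree similarity: correlate the two distance matrices."""
    return pearson(upper_triangle(prot_a), upper_triangle(prot_b))


def corrected_similarity(prot_a, prot_b, species):
    """Correlate residual distances after removing the species signal.

    Assumption: species distances are pre-scaled to the protein distances.
    """
    s = upper_triangle(species)
    res_a = [d - b for d, b in zip(upper_triangle(prot_a), s)]
    res_b = [d - b for d, b in zip(upper_triangle(prot_b), s)]
    return pearson(res_a, res_b)


# Hypothetical toy data for four species.
species = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
prot_a = [[0, 1.1, 1.9, 3.1], [1.1, 0, 3.9, 5.1],
          [1.9, 3.9, 0, 5.9], [3.1, 5.1, 5.9, 0]]
prot_b = [[0, 0.9, 2.1, 2.9], [0.9, 0, 4.1, 4.9],
          [2.1, 4.1, 0, 6.1], [2.9, 4.9, 6.1, 0]]

raw = raw_similarity(prot_a, prot_b)
corrected = corrected_similarity(prot_a, prot_b, species)
print(f"raw r = {raw:.3f}, corrected r = {corrected:.3f}")
```

In the toy data both proteins track the species distances closely, so the raw correlation is high even though their protein-specific deviations are opposite; after subtraction the correlation becomes strongly negative. This illustrates why a shared speciation history can inflate apparent co-evolution.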
Affiliation(s)
- Florencio Pazos
- Structural Bioinformatics Group, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK.
|
126
|
Schlüter A, Heuer H, Szczepanowski R, Poler SM, Schneiker S, Pühler A, Top EM. Plasmid pB8 is closely related to the prototype IncP-1β plasmid R751 but transfers poorly to Escherichia coli and carries a new transposon encoding a small multidrug resistance efflux protein. Plasmid 2005; 54:135-48. [PMID: 16122561 DOI: 10.1016/j.plasmid.2005.03.001] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.7] [Received: 11/19/2004] [Revised: 02/04/2005] [Accepted: 03/04/2005] [Indexed: 11/18/2022]
Abstract
The IncP-1beta plasmid pB8, which confers resistance to amoxicillin, spectinomycin, streptomycin, and sulfonamides, was previously isolated from a sewage treatment plant. It was found to possess abnormal conjugative transfer properties, i.e., transfer to Escherichia coli by conjugation or electroporation could not be detected. We showed in this study that plasmid pB8 is transferable to E. coli by conjugation, but only at low frequencies and under specific experimental conditions, a phenomenon that is very unusual for IncP-1 plasmids. Determination of the complete 57,198bp pB8 nucleotide sequence revealed that the backbone of the plasmid consists of a complete set of IncP-1beta-specific genes for replication initiation, conjugative plasmid transfer, stable inheritance, and plasmid control with an organisation identical to that of the prototype IncP-1beta plasmid R751. All of the minor differences in the pB8 backbone sequence compared to that of R751 were also found in other IncP-1beta plasmids known to transfer to and replicate in E. coli. Plasmids pB8 and R751 can be distinguished with respect to their accessory genetic elements. First, the pB8 region downstream of the replication initiation gene trfA contains two transposable elements one of which is similar to Tn5501. The latter transposon encodes a putative post-segregational-killing system and the small multidrug resistance (SMR) protein QacF, mediating quaternary ammonium compound resistance. The accessory genes in this region are not responsible for the poor plasmid transfer to E. coli since a pB8 deletion derivative devoid of all genes in that region showed the same conjugative transfer properties as pB8. A Tn5090/Tn402 derivative carrying a class 1 integron is located between the conjugative transfer modules. 
The Tn5090/Tn402 integration-sites are exactly identical on pB8 and R751 but in contrast to R751 the pB8 element carries the resistance gene cassettes oxa-2 for amoxicillin resistance and aadA4 for streptomycin/spectinomycin resistance, the integron-specific conserved segment consisting of the genes qacEDelta1, sul1, and orf5, and a truncated tni transposition module (tniAB). Although future work will have to determine the molecular basis for the poor transfer of pB8 to E. coli, our findings demonstrate that the host-range of typical IncP-1 plasmids may be less broad than expected.
Affiliation(s)
- Andreas Schlüter
- Department of Biological Sciences, University of Idaho, Moscow, ID 83844-3051, USA.
|
127
|
Ge F, Wang LS, Kim J. The cobweb of life revealed by genome-scale estimates of horizontal gene transfer. PLoS Biol 2005; 3:e316. [PMID: 16122348 PMCID: PMC1233574 DOI: 10.1371/journal.pbio.0030316] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 02/24/2005] [Accepted: 07/11/2005] [Indexed: 11/18/2022]
Abstract
With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been suggested to be so widespread that it might be the “essence of phylogeny” and might leave the treelike form of genealogy in doubt. On the other hand, possible biased estimation of HGT extent and the findings of coherent phylogenetic patterns indicate that phylogeny of life is well represented by tree graphs. Here, we reexamine this question by assessing the extent of HGT among core orthologous genes using a novel statistical method based on statistical comparisons of tree topology. We apply the method to 40 microbial genomes in the Clusters of Orthologous Groups database over a curated set of 297 orthologous gene clusters, and we detect significant HGT events in 33 out of 297 clusters over a wide range of functional categories. Estimates of positions of HGT events suggest a low mean genome-specific rate of HGT (2.0%) among the orthologous genes, which is in general agreement with other quantitative analyses of HGT. We propose that HGT events, even when relatively common, still leave the treelike history of phylogenies intact, much like cobwebs hanging from tree branches. A statistical approach applied to 297 orthologous gene clusters in 40 microbial genomes suggests a low rate of interspecies gene transfer. Species relationships can therefore be modeled with a tree structure.
Affiliation(s)
- Fan Ge
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Li-San Wang
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Junhyong Kim
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
128
|
Whitaker RJ, Grogan DW, Taylor JW. Recombination shapes the natural population structure of the hyperthermophilic archaeon Sulfolobus islandicus. Mol Biol Evol 2005; 22:2354-61. [PMID: 16093568 DOI: 10.1093/molbev/msi233] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Indexed: 12/12/2022]
Abstract
Although microorganisms make up the preponderance of the biodiversity on Earth, the ecological and evolutionary factors that structure microbial populations are not well understood. We investigated the genetic structure of a thermoacidophilic crenarchaeal species, Sulfolobus islandicus, using multilocus sequence analysis of six variable protein-coding loci on a set of 60 isolates from the Mutnovsky region of Kamchatka, Russia. We demonstrate significant incongruence among gene genealogies and a lack of association between alleles consistent with recombination rates greater than the rate of mutation. The observation of high relative rates of recombination suggests that the structure of this natural population does not fit the periodic selection model often used to describe populations of asexual microorganisms. We propose instead that frequent recombination among closely related individuals prevents periodic selection from purging diversity and provides a fundamental cohesive mechanism within this and perhaps other archaeal species.
Affiliation(s)
- Rachel J Whitaker
- Department of Plant and Microbial Biology, University of California, Berkeley, USA.
|
129
|
Bern M, Goldberg D. Automatic selection of representative proteins for bacterial phylogeny. BMC Evol Biol 2005; 5:34. [PMID: 15927057 PMCID: PMC1175084 DOI: 10.1186/1471-2148-5-34] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 11/02/2004] [Accepted: 05/31/2005] [Indexed: 11/22/2022]
Abstract
Background Although there are now about 200 complete bacterial genomes in GenBank, deep bacterial phylogeny remains a difficult problem, due to confounding horizontal gene transfers and other phylogenetic "noise". Previous methods have relied primarily upon biological intuition or manual curation for choosing genomic sequences unlikely to be horizontally transferred, and have given inconsistent phylogenies with poor bootstrap confidence. Results We describe an algorithm that automatically picks "representative" protein families from entire genomes for use as phylogenetic characters. A representative protein family is one that, taken alone, gives an organismal distance matrix in good agreement with a distance matrix computed from all sufficiently conserved proteins. We then use maximum-likelihood methods to compute phylogenetic trees from a concatenation of representative sequences. We validate the use of representative proteins on a number of small phylogenetic questions with accepted answers. We then use our methodology to compute a robust and well-resolved phylogenetic tree for a diverse set of sequenced bacteria. The tree agrees closely with a recently published tree computed using manually curated proteins, and supports two proposed high-level clades: one containing Actinobacteria, Deinococcus, and Cyanobacteria ("Terrabacteria"), and another containing Planctomycetes and Chlamydiales. Conclusion Representative proteins provide an effective solution to the problem of selecting phylogenetic characters.
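The selection criterion stated in this abstract (a family is representative if its own organismal distance matrix agrees with the matrix from all conserved proteins) could be sketched as below. The abstract does not say how the consensus matrix or the agreement score is computed, so the element-wise mean and the root-mean-square disagreement used here are assumptions, and all names and toy matrices are hypothetical.

```python
from math import sqrt


def consensus_matrix(family_matrices):
    """Element-wise mean of all family distance matrices (an assumption)."""
    mats = list(family_matrices.values())
    n = len(mats[0])
    return [
        [sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
        for i in range(n)
    ]


def rms_disagreement(matrix, reference):
    """Root-mean-square difference over the strict upper triangle."""
    n = len(matrix)
    squares = [
        (matrix[i][j] - reference[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sqrt(sum(squares) / len(squares))


def pick_representatives(family_matrices, k):
    """Return the k families whose matrices best match the consensus."""
    reference = consensus_matrix(family_matrices)
    ranked = sorted(
        family_matrices,
        key=lambda name: rms_disagreement(family_matrices[name], reference),
    )
    return ranked[:k]


# Hypothetical toy data: three organisms, three protein families.
# famC mimics a family whose distances are distorted, e.g. by gene transfer.
toy_families = {
    "famA": [[0, 1.0, 2.0], [1.0, 0, 3.0], [2.0, 3.0, 0]],
    "famB": [[0, 1.1, 2.1], [1.1, 0, 2.9], [2.1, 2.9, 0]],
    "famC": [[0, 3.0, 0.5], [3.0, 0, 1.0], [0.5, 1.0, 0]],
}
chosen = pick_representatives(toy_families, 2)
print(chosen)
```

In the toy data the distorted family scores worst against the consensus and is left out. A real analysis would also need the conservation filter and the maximum-likelihood tree building mentioned in the abstract, which this sketch does not attempt.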
Affiliation(s)
- Marshall Bern
- Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA
- David Goldberg
- Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA
|
130
|
Ochman H, Lerat E, Daubin V. Examining bacterial species under the specter of gene transfer and exchange. Proc Natl Acad Sci U S A 2005; 102 Suppl 1:6595-9. [PMID: 15851673 PMCID: PMC1131874 DOI: 10.1073/pnas.0502035102] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.3] [Indexed: 11/18/2022]
Abstract
Even in lieu of a dependable species concept for asexual organisms, the classification of bacteria into discrete taxonomic units is considered to be obstructed by the potential for lateral gene transfer (LGT) among lineages at virtually all phylogenetic levels. In most bacterial genomes, large proportions of genes are introduced by LGT, as indicated by their compositional features and/or phylogenetic distributions, and there is also clear evidence of LGT between very distantly related organisms. By adopting a whole-genome approach, which examined the history of every gene in numerous bacterial genomes, we show that LGT does not hamper phylogenetic reconstruction at many of the shallower taxonomic levels. Despite the high levels of gene acquisition, the only taxonomic group for which appreciable amounts of homologous recombination were detected was within bacterial species. Taken as a whole, the results derived from the analysis of complete gene inventories support several of the current means to recognize and define bacterial species.
Affiliation(s)
- Howard Ochman
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, 85721, USA.
|
131
|
Pollack JD, Li Q, Pearl DK. Taxonomic utility of a phylogenetic analysis of phosphoglycerate kinase proteins of Archaea, Bacteria, and Eukaryota: Insights by Bayesian analyses. Mol Phylogenet Evol 2005; 35:420-30. [PMID: 15804412 DOI: 10.1016/j.ympev.2005.02.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 04/15/2004] [Revised: 02/04/2005] [Accepted: 02/07/2005] [Indexed: 10/25/2022]
Abstract
We studied 131 protein sequences of the essentially ubiquitous glycolytic enzyme 3-phosphoglycerate kinase (3-PGK) by Bayesian analyses in three Domains: 15 Archaea, 83 Bacteria, and 33 Eukaryota. The posterior distribution of phylogenetic trees was based on a uniform prior, the WAG model of protein evolution, Metropolis-Hastings sampling in a Markov chain Monte Carlo analysis, and a package of diagnostics to critically evaluate the validity of the analyses. The 15 Archaea separated with high posterior probability. The archaean Phyla Euryarchaeota and the apparently Euryarchaeota-derived Crenarchaeota were monophyletic. The 33 Eukaryota separated into two main groups: the non-chlorophyllous forms, with coherent sub-groupings of Euglenozoa, Alveolata, Fungi, and Metazoa, and all the chlorophyllous species studied: the Plantae (Viridaeplantae), chlorophyllous Stramenopiles, and the chlorophyllous Bacteria. This association supports other opinions concerning the related lineage of cyanobacteria and the Plantae. The 3-PGK sequences from 83 Bacteria associated, in almost every instance, by their recognized taxal group: alpha-, beta-, gamma-, epsilon-proteobacteria, Chlamydia, Actinobacteridae, and Firmicutes. Firmicutes sequences were subdivided into three apparently monophyletic groups: the anaerobic Clostridia, the spore-forming Bacillales and a group containing the Mollicutes, Lactobacillales and non-spore-forming Bacillales. The 3-PGK-gene tree assemblage was notable for its pervasive clustering in three Domains according to recognized taxonomic groupings of Class, Order, Family, and Genus. The 3-PGK enzyme or 3-PGK-like activity may have played a central role in the metabolism of the Universal Ancestor.
Affiliation(s)
- J Dennis Pollack
- Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University, 333 West 10th Avenue, Columbus, OH 43210, USA.
|
132
|
Lerat E, Daubin V, Ochman H, Moran NA. Evolutionary origins of genomic repertoires in bacteria. PLoS Biol 2005; 3:e130. [PMID: 15799709 PMCID: PMC1073693 DOI: 10.1371/journal.pbio.0030130] [Citation(s) in RCA: 274] [Impact Index Per Article: 14.4] [Received: 10/15/2004] [Accepted: 02/12/2005] [Indexed: 11/18/2022]
Abstract
Explaining the diversity of gene repertoires has been a major problem in modern evolutionary biology. In eukaryotes, this diversity is believed to result mainly from gene duplication and loss, but in prokaryotes, lateral gene transfer (LGT) can also contribute substantially to genome contents. To determine the histories of gene inventories, we conducted an exhaustive analysis of gene phylogenies for all gene families in a widely sampled group, the γ-Proteobacteria. We show that, although these bacterial genomes display striking differences in gene repertoires, most gene families having representatives in several species have congruent histories. Other than the few vast multigene families, gene duplication has contributed relatively little to the contents of these genomes; instead, LGT, over time, provides most of the diversity in genomic repertoires. Most such acquired genes are lost, but the majority of those that persist in genomes are transmitted strictly vertically. Although our analyses are limited to the γ-Proteobacteria, these results resolve a long-standing paradox, i.e., the ability to make robust phylogenetic inferences in light of substantial LGT. Lateral gene transfer, rather than duplication, is responsible for most gene diversity present in γ-Proteobacteria; however, these genes are then vertically transmitted and have little impact on gene phylogenies.
Affiliation(s)
- Emmanuelle Lerat
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, United States of America
- Vincent Daubin
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, Arizona, United States of America
- Howard Ochman
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, Arizona, United States of America
- Nancy A Moran
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, United States of America
|
133
|
Hughes AL, Ekollu V, Friedman R, Rose JR. Gene Family Content-Based Phylogeny of Prokaryotes: The Effect of Criteria for Inferring Homology. Syst Biol 2005; 54:268-76. [PMID: 16012097 DOI: 10.1080/10635150590923335] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 10/25/2022]
Abstract
A number of recent papers have suggested that gene family content can be used to resolve phylogenies, particularly in the case of prokaryotes, in which extensive horizontal gene transfer means that individual gene phylogenies may not mirror the organismal phylogeny. However, no study has yet examined how sensitive such analyses are to the criterion of homology assessment used to assemble multigene families. Using data from 99 completely sequenced prokaryotic genomes, we examined the effect of homology criteria in phylogenetic analyses wherein presence or absence of each family in the genome was used as a cladistic character. Different criteria resulted in evidence for contradictory tree topologies, sometimes with high bootstrap support. A moderately strict criterion seemed best for assembling multigene families in a biologically meaningful way, but it was not necessarily preferable for phylogenetic analysis. Instead, a very strict criterion, which broke up gene families into smaller subfamilies, seemed to have advantages for phylogenetic purposes. The poor performance of gene family content-based phylogenetic analysis in the case of prokaryotes appears to reflect high levels of homoplasy resulting not only from horizontal gene transfer but also, more importantly, from extensive parallel loss of gene families in certain bacteria genomes.
Affiliation(s)
- Austin L Hughes
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29205, USA.
|
134
|
Caetano-Anollés G, Caetano-Anollés D. Universal Sharing Patterns in Proteomes and Evolution of Protein Fold Architecture and Life. J Mol Evol 2005; 60:484-98. [PMID: 15883883 DOI: 10.1007/s00239-004-0221-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Received: 06/16/2004] [Accepted: 10/11/2004] [Indexed: 11/30/2022]
Abstract
Protein evolution is imprinted in both the sequence and the structure of evolutionary building blocks known as protein domains. These domains share a common ancestry and can be unified into a comparatively small set of folding architectures, the protein folds. We have traced the distribution of protein folds between and within proteomes belonging to Eukarya, Archaea, and Bacteria along the branches of a universal phylogeny of protein architecture. This tree was reconstructed from global fold-usage statistics derived from a structural census of proteomes. We found that folds shared by the three organismal domains were placed almost exclusively at the base of the rooted tree and that there were marked heterogeneities in fold distribution and clear evolutionary patterns related to protein architecture and organismal diversification. These include a relative timing for the emergence of prokaryotes, congruent episodes of architectural loss and diversification in Archaea and Bacteria, and a late and quite massive rise of architectural novelties in Eukarya perhaps linked to multicellularity.
Affiliation(s)
- Gustavo Caetano-Anollés
- Department of Crop Sciences, University of Illinois, 332 NSRC, 1101 West Peabody Drive, Urbana, IL, 61801, USA.
|
135
|
Foster J, Ganatra M, Kamal I, Ware J, Makarova K, Ivanova N, Bhattacharyya A, Kapatral V, Kumar S, Posfai J, Vincze T, Ingram J, Moran L, Lapidus A, Omelchenko M, Kyrpides N, Ghedin E, Wang S, Goltsman E, Joukov V, Ostrovskaya O, Tsukerman K, Mazur M, Comb D, Koonin E, Slatko B. The Wolbachia genome of Brugia malayi: endosymbiont evolution within a human pathogenic nematode. PLoS Biol 2005; 3:e121. [PMID: 15780005 PMCID: PMC1069646 DOI: 10.1371/journal.pbio.0030121] [Citation(s) in RCA: 443] [Impact Index Per Article: 23.3] [Received: 11/23/2004] [Accepted: 02/02/2005] [Indexed: 11/18/2022]
Abstract
Complete genome DNA sequence and analysis is presented for Wolbachia, the obligate alpha-proteobacterial endosymbiont required for fertility and survival of the human filarial parasitic nematode Brugia malayi. Although, quantitatively, the genome is even more degraded than those of closely related Rickettsia species, Wolbachia has retained more intact metabolic pathways. The ability to provide riboflavin, flavin adenine dinucleotide, heme, and nucleotides is likely to be Wolbachia's principal contribution to the mutualistic relationship, whereas the host nematode likely supplies amino acids required for Wolbachia growth. Genome comparison of the Wolbachia endosymbiont of B. malayi (wBm) with the Wolbachia endosymbiont of Drosophila melanogaster (wMel) shows that they share similar metabolic trends, although their genomes show a high degree of genome shuffling. In contrast to wMel, wBm contains no prophage and has a reduced level of repeated DNA. Both Wolbachia have lost a considerable number of membrane biogenesis genes that apparently make them unable to synthesize lipid A, the usual component of proteobacterial membranes. However, differences in their peptidoglycan structures may reflect the mutualistic lifestyle of wBm in contrast to the parasitic lifestyle of wMel. The smaller genome size of wBm, relative to wMel, may reflect the loss of genes required for infecting host cells and avoiding host defense systems. Analysis of this first sequenced endosymbiont genome from a filarial nematode provides insight into endosymbiont evolution and additionally provides new potential targets for elimination of cutaneous and lymphatic human filarial disease. Analysis of this Wolbachia genome, which resides within filarial parasites, offers insight into endosymbiont evolution and the promise of new strategies for the elimination of human filarial disease.
Affiliation(s)
- Jeremy Foster
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Mehul Ganatra
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Ibrahim Kamal
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Jennifer Ware
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Kira Makarova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Natalia Ivanova
- Integrated Genomics, Chicago, Illinois, United States of America
- Sanjay Kumar
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Janos Posfai
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Tamas Vincze
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Jessica Ingram
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Laurie Moran
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Alla Lapidus
- Integrated Genomics, Chicago, Illinois, United States of America
- Marina Omelchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Nikos Kyrpides
- Integrated Genomics, Chicago, Illinois, United States of America
- Elodie Ghedin
- Parasite Genomics, Institute for Genomic Research, Rockville, Maryland, United States of America
- Shiliang Wang
- Parasite Genomics, Institute for Genomic Research, Rockville, Maryland, United States of America
- Eugene Goltsman
- Integrated Genomics, Chicago, Illinois, United States of America
- Victor Joukov
- Integrated Genomics, Chicago, Illinois, United States of America
- Kiryl Tsukerman
- Integrated Genomics, Chicago, Illinois, United States of America
- Mikhail Mazur
- Integrated Genomics, Chicago, Illinois, United States of America
- Donald Comb
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
- Eugene Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Barton Slatko
- Molecular Parasitology Division, New England Biolabs, Beverly, Massachusetts, United States of America
|
136
|
Belda E, Moya A, Silva FJ. Genome rearrangement distances and gene order phylogeny in gamma-Proteobacteria. Mol Biol Evol 2005; 22:1456-67. [PMID: 15772379 DOI: 10.1093/molbev/msi134] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.3] [Indexed: 11/14/2022]
Abstract
Genome rearrangements have been studied in 30 gamma-proteobacterial complete genomes by comparing the order of a reduced set of genes on the chromosome. This set included those genes fulfilling several characteristics, the main ones being that an ortholog was present in every genome and that none of them had been acquired by horizontal gene transfer. Genome rearrangement distances were estimated based on either the number of breakpoints or the minimal number of inversions separating two genomes. Breakpoint and inversion distances were highly correlated, indicating that inversions were the main type of rearrangement event in gamma-Proteobacteria. In general, the progressive increase in sequence-based distances between genome pairs was associated with the increase in their rearrangement-based distances, but several groups of distances did not follow this pattern. Compared with free-living enteric bacteria, the lineages of Pasteurellaceae were evolving, on average, at relatively higher rates of between 1.64 and 2.02, while the endosymbiotic bacterial lineages of Buchnera aphidicola and Wigglesworthia glossinidia were evolving at moderately higher rates of 1.38 and 1.35, respectively. Because we know that the rearrangement rate in the Bu. aphidicola lineage was close to zero during the last 100-150 Myr of evolution, we deduced that a much higher rate took place in the first period of lineage evolution after the divergence of the Escherichia coli lineage. On the other hand, the lineage of the endosymbiont Blochmannia floridanus showed a rate almost identical to that of free-living enteric bacteria, indicating that the increase in the genome rearrangement rate is not a general change associated with bacterial endosymbiosis. Phylogenetic reconstruction based on rearrangement distances showed a different topology from the one inferred by sequence information. This topology broke the proposed monophyly of the three endosymbiotic lineages and placed Bl. floridanus as a closer relative to E. coli than Yersinia pestis. These results indicate that the phylogeny of these insect endosymbionts is still an open question that will require the development of specific phylogenetic methods to confirm whether the sisterhood of the three endosymbiotic lineages is real or a consequence of a long-branch attraction phenomenon.
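The breakpoint distance mentioned in this abstract can be illustrated with a small sketch. It assumes each genome is a list of signed integers (one per shared ortholog, sign giving the strand) on a circular chromosome; the function names and the toy gene orders are hypothetical and are not taken from the paper, whose exact procedure is not described here.

```python
def adjacencies(order, circular=True):
    """Return the set of signed adjacencies in a gene order.

    Genes are nonzero integers; the sign encodes the strand. An adjacency
    (a, b) read from the opposite strand is (-b, -a), so both forms are
    mapped to one canonical key.
    """
    pairs = list(zip(order, order[1:]))
    if circular and len(order) > 1:
        pairs.append((order[-1], order[0]))  # close the circular chromosome
    return {min((a, b), (-b, -a)) for a, b in pairs}


def breakpoint_distance(order_a, order_b, circular=True):
    """Count adjacencies of order_a that are not conserved in order_b."""
    if sorted(map(abs, order_a)) != sorted(map(abs, order_b)):
        raise ValueError("both genomes must contain the same gene set")
    return len(adjacencies(order_a, circular) - adjacencies(order_b, circular))


# Hypothetical five-gene chromosome and a copy with genes 2..3 inverted.
reference = [1, 2, 3, 4, 5]
inverted = [1, -3, -2, 4, 5]
print(breakpoint_distance(reference, inverted))  # 2: one inversion, two breakpoints
```

A single inversion disrupts at most two adjacencies, which is one intuition for why breakpoint and inversion distances can be strongly correlated when inversions dominate.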
Affiliation(s)
- Eugeni Belda
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva and Departament de Genètica, Universitat de València, Valencia, Spain
|
137
|
O'Malley MA, Boucher Y. Paradigm change in evolutionary microbiology. Stud Hist Philos Biol Biomed Sci 2005; 36:183-208. [PMID: 16120264 DOI: 10.1016/j.shpsc.2004.12.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 04/22/2004] [Revised: 07/19/2004] [Indexed: 05/04/2023]
Abstract
Thomas Kuhn had little to say about scientific change in biological science, and biologists are ambivalent about how applicable his framework is for their disciplines. We apply Kuhn's account of paradigm change to evolutionary microbiology, where key Darwinian tenets are being challenged by two decades of findings from molecular phylogenetics. The chief culprit is lateral gene transfer, which undermines the role of vertical descent and the representation of evolutionary history as a tree of life. To assess Kuhn's relevance to this controversy, we add a social analysis of the scientists involved to the historical and philosophical debates. We conclude that while Kuhn's account may capture aspects of the pattern (or outcome) of an episode of scientific change, he has little to say about how the process of generating new understandings is occurring in evolutionary microbiology. Once Kuhn's application is limited to that of an initial investigative probe into how scientific problem-solving occurs, his disciplinary scope becomes broader.
Affiliation(s)
- Maureen A O'Malley
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada B3H 1X5.
|
138
|
Rohmer L, Guttman DS, Dangl JL. Diverse evolutionary mechanisms shape the type III effector virulence factor repertoire in the plant pathogen Pseudomonas syringae. Genetics 2005; 167:1341-60. [PMID: 15280247 PMCID: PMC1470954 DOI: 10.1534/genetics.103.019638] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.2] [Indexed: 01/07/2023]
Abstract
Many gram-negative pathogenic bacteria directly translocate effector proteins into eukaryotic host cells via type III delivery systems. Type III effector proteins are determinants of virulence on susceptible plant hosts; they are also the proteins that trigger specific disease resistance in resistant plant hosts. Evolution of type III effectors is dominated by competing forces: the likely requirement for conservation of virulence function, the avoidance of host defenses, and possible adaptation to new hosts. To understand the evolutionary history of type III effectors in Pseudomonas syringae, we searched for homologs to 44 known or candidate P. syringae type III effectors and two effector chaperones. We examined 24 gene families for distribution among bacterial species, amino acid sequence diversity, and features indicative of horizontal transfer. We assessed the role of diversifying and purifying selection in the evolution of these gene families. While some P. syringae type III effectors were acquired recently, others have evolved predominantly by descent. The majority of codons in most of these genes were subjected to purifying selection, suggesting selective pressure to maintain presumed virulence function. However, members of 7 families had domains subject to diversifying selection.
Affiliation(s)
- Laurence Rohmer
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599, USA
|
139
|
Charlebois RL, Doolittle WF. Computing prokaryotic gene ubiquity: rescuing the core from extinction. Genome Res 2005; 14:2469-77. [PMID: 15574825 PMCID: PMC534671 DOI: 10.1101/gr.3024704] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.5] [Indexed: 11/25/2022]
Abstract
The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly fewer than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it makes the core harder to define. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
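The effect of relaxing the ubiquity requirement can be seen in a minimal sketch. It assumes genomes are given as a mapping from a genome name to its set of gene-family identifiers; `core_genes`, `min_fraction`, and the toy data are hypothetical names for illustration. The sketch covers only the strict and threshold-relaxed cores, not the phylogenetically balanced core that the authors propose.

```python
from collections import Counter


def core_genes(genomes, min_fraction=1.0):
    """Return gene families present in at least min_fraction of genomes.

    min_fraction=1.0 gives the strict core (present in every genome);
    lower values relax the ubiquity requirement.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    counts = Counter(
        family for families in genomes.values() for family in set(families)
    )
    n = len(genomes)
    return {family for family, count in counts.items() if count / n >= min_fraction}


# Hypothetical toy data: four genomes, three gene families.
genomes = {
    "genome_A": {"rpoB", "recA", "lacZ"},
    "genome_B": {"rpoB", "recA"},
    "genome_C": {"rpoB", "lacZ"},
    "genome_D": {"rpoB", "recA"},
}
print(sorted(core_genes(genomes)))                     # strict core
print(sorted(core_genes(genomes, min_fraction=0.75)))  # relaxed core
```

With the strict threshold, a single missing annotation removes a family from the core, which is the shrinkage the abstract describes as more genomes are added.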
Affiliation(s)
- Robert L Charlebois
- Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 1X5, Canada
|
140
|
Dufayard JF, Duret L, Penel S, Gouy M, Rechenmann F, Perrière G. Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases. Bioinformatics 2005; 21:2596-603. [PMID: 15713731 DOI: 10.1093/bioinformatics/bti325] [Citation(s) in RCA: 132] [Impact Index Per Article: 6.9] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Comparative sequence analysis is widely used to study genome function and evolution. This approach first requires the identification of homologous genes and then the interpretation of their homology relationships (orthology or paralogy). To provide help in this complex task, we developed three databases of homologous genes containing sequences, multiple alignments and phylogenetic trees: HOBACGEN, HOVERGEN and HOGENOM. In this paper, we present two new tools for automating the search for orthologs or paralogs in these databases. RESULTS: First, we have developed and implemented an algorithm to infer speciation and duplication events by comparison of gene and species trees (tree reconciliation). Second, we have developed a general method to search our databases for gene families whose tree topology matches a particular tree pattern. This algorithm of unordered tree pattern matching has been implemented in the FamFetch graphical interface. With the help of a graphical editor, the user can specify the topology of the tree pattern, and set constraints on its nodes and leaves. Then, this pattern is compared with all the phylogenetic trees of the database, to retrieve the families in which one or several occurrences of this pattern are found. By specifying ad hoc patterns, it is therefore possible to identify orthologs in our databases.
|
141
|
Abstract
Metagenomics (also referred to as environmental and community genomics) is the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms. The development of metagenomics stemmed from the ineluctable evidence that as-yet-uncultured microorganisms represent the vast majority of organisms in most environments on earth. This evidence was derived from analyses of 16S rRNA gene sequences amplified directly from the environment, an approach that avoided the bias imposed by culturing and led to the discovery of vast new lineages of microbial life. Although the portrait of the microbial world was revolutionized by analysis of 16S rRNA genes, such studies yielded only a phylogenetic description of community membership, providing little insight into the genetics, physiology, and biochemistry of the members. Metagenomics provides a second tier of technical innovation that facilitates study of the physiology and ecology of environmental microorganisms. Novel genes and gene products discovered through metagenomics include the first bacteriorhodopsin of bacterial origin; novel small molecules with antimicrobial activity; and new members of families of known proteins, such as an Na+(Li+)/H+ antiporter, RecA, DNA polymerase, and antibiotic resistance determinants. Reassembly of multiple genomes has provided insight into energy and nutrient cycling within the community, genome structure, gene function, population genetics and microheterogeneity, and lateral gene transfer among members of an uncultured community. The application of metagenomic sequence information will facilitate the design of better culturing strategies to link genomic analysis with pure culture studies.
Affiliation(s)
- Jo Handelsman
- Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
142
|
Erkel C, Kemnitz D, Kube M, Ricke P, Chin KJ, Dedysh S, Reinhardt R, Conrad R, Liesack W. Retrieval of first genome data for rice cluster I methanogens by a combination of cultivation and molecular techniques. FEMS Microbiol Ecol 2005; 53:187-204. [PMID: 16329940 DOI: 10.1016/j.femsec.2004.12.004] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 07/28/2004] [Revised: 11/05/2004] [Accepted: 12/09/2004] [Indexed: 01/08/2023]
Abstract
We report first insights into a representative genome of rice cluster I (RC-I), a major group of as-yet uncultured methanogens. The starting point of our study was the methanogenic consortium MRE50 that had been stably maintained for 3 years by consecutive transfers to fresh medium and anaerobic incubation at 50 °C. Process-oriented measurements provided evidence for hydrogenotrophic CO2-reducing methanogenesis. Assessment of the diversity of consortium MRE50 suggested members of the families Thermoanaerobacteriaceae and Clostridiaceae to constitute the major bacterial component, while the archaeal population was represented entirely by RC-I. The RC-I population amounted to more than 50% of total cells, as concluded from fluorescence in situ hybridization using specific probes for either Bacteria or Archaea. The high enrichment status of RC-I prompted construction of a large insert fosmid library from consortium MRE50. Comparative sequence analysis of internal transcribed spacer (ITS) regions revealed that three different RC-I rrn operon variants were present in the fosmid library. Three genomic fragments of approximately 40 kb, each representative of one of the three different rrn operon variants, were recovered and sequenced. Computational analysis of the sequence data resulted in two major findings: (i) consortium MRE50 most likely harbours only a single RC-I genotype, which is characterized by multiple rrn operon copies; (ii) seven genes were identified to possess a strong phylogenetic signal (eIF2a, dnaG, priA, pcrA, gatD, gatE, and a gene encoding a putative RNA-binding protein). Example trees computed for the deduced amino acid sequences of eIF2a, dnaG, and priA corroborated a specific phylogenetic association of RC-I with the Methanosarcinales.
Affiliation(s)
- Christoph Erkel
- Max-Planck-Institut für Terrestrische Mikrobiologie, Marburg, Germany
|
143
|
Hughes AL, Friedman R. Poxvirus genome evolution by gene gain and loss. Mol Phylogenet Evol 2005; 35:186-95. [PMID: 15737590 DOI: 10.1016/j.ympev.2004.12.008] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.6] [Received: 04/23/2004] [Revised: 11/12/2004] [Accepted: 12/13/2004] [Indexed: 01/01/2023]
Abstract
The poxviruses (Poxviridae) are a family of viruses with double-stranded DNA genomes and substantial numbers (often >200) of genes per genome. We studied the patterns of gene gain and loss over the evolutionary history of 17 poxvirus complete genomes. A phylogeny based on gene family presence/absence showed good agreement with phylogenies based on concatenated amino acid sequences of conserved single-copy genes. Gene duplications in poxviruses were often lineage specific, and the most extensively duplicated viral gene families were found in only a few of the genomes analyzed. A total of 34 gene families were found to include a member in at least one of the poxvirus genomes analyzed and at least one animal genome; in 16 (47%) of these families, there was evidence of recent horizontal gene transfer (HGT) from host to virus. Gene families with evidence of HGT included several involved in host immune defense mechanisms (the MHC class I, interleukin-10, interleukin-24, interleukin-18, the interferon gamma receptor, and tumor necrosis factor receptor II) and others (glutaredoxin and glutathione peroxidase) involved in resistance of cells to oxidative stress. Thus "capture" of host genes by HGT has been a recurrent feature of poxvirus evolution and has played an important role in adapting the virus to survive host antiviral defense mechanisms.
Affiliation(s)
- Austin L Hughes
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA.
|
144
|
Abstract
In prokaryotic genomes, related genes are frequently clustered in operons and higher-order arrangements that reflect functional context. Organization emerges despite rearrangements that constantly shuffle gene and operon order. Evidence is presented that the tandem duplication of related genes acts as a driving evolutionary force in the origin and maintenance of clusters. Gene amplification can be viewed as a dynamic and reversible regulatory mechanism that facilitates adaptation to variable environments. Clustered genes confer selective benefits via their ability to be coamplified. During evolution, rearrangements that bring together related genes can be selected if they increase the fitness of the organism in which they reside. Similarly, the benefits of gene amplification can prevent the dispersal of existing clusters. Examples of frequent and spontaneous amplification of large genomic fragments are provided. The possibility is raised that tandem gene duplication works in concert with horizontal gene transfer as interrelated evolutionary forces for gene clustering.
Affiliation(s)
- Andrew B Reams
- Section of Microbiology, University of California, Davis, California 95616, USA.
|
145
|
Stuart GW, Berry MW. An SVD-based comparison of nine whole eukaryotic genomes supports a coelomate rather than ecdysozoan lineage. BMC Bioinformatics 2004; 5:204. [PMID: 15606920 PMCID: PMC544558 DOI: 10.1186/1471-2105-5-204] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Received: 07/22/2004] [Accepted: 12/17/2004] [Indexed: 11/24/2022]
Abstract
Background: Eukaryotic whole genome sequences are accumulating at an impressive rate. Effective methods for comparing multiple whole eukaryotic genomes on a large scale are needed. Most attempted solutions involve the production of large scale alignments, and many of these require a high stringency pre-screen for putative orthologs in order to reduce the effective size of the dataset and provide a reasonably high but unknown fraction of correctly aligned homologous sites for comparison. As an alternative, highly efficient methods that do not require the pre-alignment of operationally defined orthologs are also being explored. Results: A non-alignment method based on the Singular Value Decomposition (SVD) was used to compare the predicted protein complement of nine whole eukaryotic genomes ranging from yeast to man. This analysis resulted in the simultaneous identification and definition of a large number of well conserved motifs and gene families, and produced a species tree supporting one of two conflicting hypotheses of metazoan relationships. Conclusions: Our SVD-based analysis of the entire protein complement of nine whole eukaryotic genomes suggests that highly conserved motifs and gene families can be identified and effectively compared in a single coherent definition space for the easy extraction of gene and species trees. While this occurs without the explicit definition of orthologs or homologous sites, the analysis can provide a basis for these definitions.
Affiliation(s)
- Gary W Stuart
- Department of Life Sciences, Indiana State University, Terre Haute, IN 47809, USA
- Visiting Scientist, Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA
- Michael W Berry
- Department of Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA
|
146
|
Dutilh BE, Huynen MA, Bruno WJ, Snel B. The consistent phylogenetic signal in genome trees revealed by reducing the impact of noise. J Mol Evol 2004; 58:527-39. [PMID: 15170256 DOI: 10.1007/s00239-003-2575-6] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Received: 06/27/2003] [Accepted: 11/12/2003] [Indexed: 11/25/2022]
Abstract
Phylogenetic trees based on gene repertoires are remarkably similar to the current consensus of life history. Yet it has been argued that shared gene content is unreliable for phylogenetic reconstruction because of convergence in gene content due to horizontal gene transfer and parallel gene loss. Here we test this argument, by filtering out as noise those orthologous groups that have an inconsistent phylogenetic distribution, using two independent methods. The resulting phylogenies do indeed contain small but significant improvements. More importantly, we find that the majority of orthologous groups contain some phylogenetic signal and that the resulting phylogeny is the only detectable signal present in the gene distribution across genomes. Horizontal gene transfer or parallel gene loss does not cause systematic biases in the gene content tree.
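As a rough illustration of how gene repertoires can feed a tree-building method, the sketch below computes pairwise gene-content distances. It assumes each genome is a set of orthologous-group identifiers and uses one common normalization (shared groups divided by the size of the smaller genome); this choice, the function names, and the toy data are assumptions for illustration and are not necessarily what the paper uses.

```python
from itertools import combinations


def gene_content_distance(groups_a, groups_b):
    """Distance = 1 - (shared orthologous groups / size of smaller genome)."""
    if not groups_a or not groups_b:
        raise ValueError("genomes must contain at least one orthologous group")
    shared = len(groups_a & groups_b)
    return 1.0 - shared / min(len(groups_a), len(groups_b))


def distance_matrix(genomes):
    """Return pairwise distances keyed by sorted (name_a, name_b) tuples."""
    return {
        (a, b): gene_content_distance(genomes[a], genomes[b])
        for a, b in combinations(sorted(genomes), 2)
    }


# Hypothetical genomes described by orthologous-group identifiers.
genomes = {
    "large": {"og1", "og2", "og3", "og4"},
    "reduced": {"og1", "og2"},
    "other": {"og1", "og5"},
}
for pair, dist in distance_matrix(genomes).items():
    print(pair, dist)
```

Normalizing by the smaller genome keeps a reduced genome close to a larger relative whose repertoire contains it. The resulting matrix could then be passed to any distance-based tree method.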
Affiliation(s)
- Bas E Dutilh
- Center for Molecular and Biomolecular Informatics/Nijmegen Center for Molecular Life Sciences, University of Nijmegen, Nijmegen, The Netherlands.
|
147
|
Stoebel DM. Lack of Evidence for Horizontal Transfer of the lac Operon into Escherichia coli. Mol Biol Evol 2004; 22:683-90. [PMID: 15563718 DOI: 10.1093/molbev/msi056] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
The idea that Escherichia coli gained the lac operon via horizontal transfer, allowing it to invade a new niche and form a new species, has become a paradigmatic example of bacterial nonpathogenic adaptation and speciation catalyzed by horizontal transfer. Surprisingly, empirical evidence for this event is essentially nonexistent. To see whether horizontal transfer occurred, I compared a phylogeny of 14 Enterobacteriaceae based on two housekeeping genes to a phylogeny of a part of their lac operon. Although several species in this clade appear to have acquired some or all of the operon via horizontal transfer, there is no evidence of horizontal transfer into E. coli. It is not clear whether the horizontal transfer events for which there is evidence were adaptive because those species which have acquired the operon are not thought to live in high lactose environments. I propose that vertical transmission from the common ancestor of the Enterobacteriaceae, with subsequent loss of these genes in many species can explain much of the patchy distribution of lactose use in this clade. Finally, I argue that we need new, well-supported examples of horizontal transfer spurring niche expansion and speciation, particularly in nonpathogenic cases, before we can accept claims that horizontal transfer is a hallmark of bacterial adaptation.
Affiliation(s)
- Daniel M Stoebel
- Department of Ecology and Evolution, Stony Brook University, USA.
|
148
|
Holmes EC, Rambaut A. Viral evolution and the emergence of SARS coronavirus. Philos Trans R Soc Lond B Biol Sci 2004; 359:1059-65. [PMID: 15306390 PMCID: PMC1693395 DOI: 10.1098/rstb.2004.1478] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
The recent appearance of severe acute respiratory syndrome coronavirus (SARS-CoV) highlights the continual threat to human health posed by emerging viruses. However, the central processes in the evolution of emerging viruses are unclear, particularly the selection pressures faced by viruses in new host species. We outline some of the key evolutionary genetic aspects of viral emergence. We emphasize that, although the high mutation rates of RNA viruses provide them with great adaptability and explain why they are the main cause of emerging diseases, their limited genome size means that they are also subject to major evolutionary constraints. Understanding the mechanistic basis of these constraints, particularly the roles played by epistasis and pleiotropy, is likely to be central in explaining why some RNA viruses are more able than others to cross species boundaries. Viral genetic factors have also been implicated in the emergence of SARS-CoV, with the suggestion that this virus is a recombinant between mammalian and avian coronaviruses. We show, however, that the phylogenetic patterns cited as evidence for recombination are more probably caused by a variation in substitution rate among lineages and that recombination is unlikely to explain the appearance of SARS in humans.
Affiliation(s)
- Edward C Holmes
- Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK.
|
149
|
A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol Biol 2004; 4:44. [PMID: 15535883 PMCID: PMC533871 DOI: 10.1186/1471-2148-4-44] [Citation(s) in RCA: 322] [Impact Index Per Article: 16.1] [Received: 08/07/2004] [Accepted: 11/09/2004] [Indexed: 11/10/2022]
Abstract
Background: The timescale of prokaryote evolution has been difficult to reconstruct because of a limited fossil record and complexities associated with molecular clocks and deep divergences. However, the relatively large number of genome sequences currently available has provided a better opportunity to control for potential biases such as horizontal gene transfer and rate differences among lineages. We assembled a data set of sequences from 32 proteins (~7600 amino acids) common to 72 species and estimated phylogenetic relationships and divergence times with a local clock method. Results: Our phylogenetic results support most of the currently recognized higher-level groupings of prokaryotes. Of particular interest is a well-supported group of three major lineages of eubacteria (Actinobacteria, Deinococcus, and Cyanobacteria) that we call Terrabacteria and associate with an early colonization of land. Divergence time estimates for the major groups of eubacteria are between 2.5 and 3.2 billion years ago (Ga) while those for archaebacteria are mostly between 3.1 and 4.1 Ga. The time estimates suggest a Hadean origin of life (prior to 4.1 Ga), an early origin of methanogenesis (3.8–4.1 Ga), an origin of anaerobic methanotrophy after 3.1 Ga, an origin of phototrophy prior to 3.2 Ga, an early colonization of land 2.8–3.1 Ga, and an origin of aerobic methanotrophy 2.5–2.8 Ga. Conclusions: Our early time estimates for methanogenesis support the consideration of methane, in addition to carbon dioxide, as a greenhouse gas responsible for the early warming of the Earth's surface. Our divergence times for the origin of anaerobic methanotrophy are compatible with highly depleted carbon isotopic values found in rocks dated 2.8–2.6 Ga. An early origin of phototrophy is consistent with the earliest bacterial mats and structures identified as stromatolites, but a 2.6 Ga origin of cyanobacteria suggests that those Archean structures, if biologically produced, were made by anoxygenic photosynthesizers. The resistance to desiccation of Terrabacteria and their elaboration of photoprotective compounds suggests that the common ancestor of this group inhabited land. If true, then oxygenic photosynthesis may owe its origin to terrestrial adaptations.
|
150
|
Herbeck JT, Degnan PH, Wernegreen JJ. Nonhomogeneous model of sequence evolution indicates independent origins of primary endosymbionts within the enterobacteriales (gamma-Proteobacteria). Mol Biol Evol 2004; 22:520-32. [PMID: 15525700 DOI: 10.1093/molbev/msi036] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.6] [Indexed: 12/25/2022]
Abstract
Standard methods of phylogenetic reconstruction are based on models that assume homogeneity of nucleotide composition among taxa. However, this assumption is often violated in biological data sets. In this study, we examine possible effects of nucleotide heterogeneity among lineages on the phylogenetic reconstruction of a bacterial group that spans a wide range of genomic nucleotide contents: obligately endosymbiotic bacteria and free-living or commensal species in the gamma-Proteobacteria. We focus on AT-rich primary endosymbionts to better understand the origins of obligately intracellular lifestyles. Previous phylogenetic analyses of this bacterial group point to the importance of accounting for base compositional variation in estimating relationships, particularly between endosymbiotic and free-living taxa. Here, we develop an approach to compare susceptibility of various phylogenetic reconstruction methods to the effects of nucleotide heterogeneity. First, we identify candidate trees of gamma-Proteobacteria groEL and 16S rRNA using approaches that assume homogeneous and stationary base composition, including Bayesian, maximum likelihood, parsimony, and distance methods. We then create permutations of the resulting candidate trees by varying the placement of the AT-rich endosymbiont Buchnera. These permutations are evaluated under the nonhomogeneous and nonstationary maximum likelihood model of Galtier and Gouy, which allows equilibrium base content to vary among examined lineages. Our results show that commonly used phylogenetic methods produce incongruent trees of the Enterobacteriales, and that the placement of Buchnera is especially unstable. However, under a nonhomogeneous model, various groEL and 16S rRNA phylogenies that separate Buchnera from other AT-rich endosymbionts (Blochmannia and Wigglesworthia) have consistently and significantly higher likelihood scores. 
Blochmannia and Wigglesworthia appear to have evolved from secondary endosymbionts, and represent an origin of primary endosymbiosis that is independent from Buchnera. This application of a nonhomogeneous model offers a computationally feasible way to test specific phylogenetic hypotheses for taxa with heterogeneous and nonstationary base composition.
Affiliation(s)
- Joshua T Herbeck
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, USA.
|