1
|
Quartet analysis of putative horizontal gene transfer in Crenarchaeota. J Mol Evol 2013; 78:163-70. [PMID: 24346234 DOI: 10.1007/s00239-013-9607-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2013] [Accepted: 12/06/2013] [Indexed: 10/25/2022]
Abstract
Horizontal gene transfers (HGT) between four Crenarchaeota species (Metallosphaera cuprina Ar-4T, Acidianus hospitalis W1T, Vulcanisaeta moutnovskia 768-28T, and Pyrobaculum islandicum DSM 4184T) were investigated with quartet analysis. Strong support was found for individual genes that disagree with the phylogeny of the majority, implying genomic mosaicism. One such gene, a ferredoxin-related gene, was investigated further and incorporated into a larger phylogeny, which provided evidence for HGT of this gene from the Vulcanisaeta lineage to the Acidianus lineage. This is the first application of quartet analysis of HGT for the phylum Crenarchaeota. The results have shown that quartet analysis is a powerful technique to screen homologous sequences for putative HGTs and is useful in visually describing genomic mosaicism and HGT within four taxa.
|
2
|
Williams D, Gogarten JP, Papke RT. Quantifying homologous replacement of loci between haloarchaeal species. Genome Biol Evol 2013; 4:1223-44. [PMID: 23160063 PMCID: PMC3542582 DOI: 10.1093/gbe/evs098] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 01/14/2023]
Abstract
In vitro studies of the haloarchaeal genus Haloferax have demonstrated their ability to frequently exchange DNA between species, whereas rates of homologous recombination estimated from natural populations in the genus Halorubrum are high enough to maintain random association of alleles between five loci. To quantify the effects of gene transfer and recombination of commonly held (relaxed core) genes during the evolution of the class Halobacteria (haloarchaea), we reconstructed the history of 21 genomes representing all major groups. Using a novel algorithm and a concatenated ribosomal protein phylogeny as a reference, we created a directed horizontal genetic transfer (HGT) network of contemporary and ancestral genomes. Gene order analysis revealed that 90% of testable HGTs were by direct homologous replacement, rather than nonhomologous integration followed by a loss. Network analysis revealed an inverse log-linear relationship between HGT frequency and ribosomal protein evolutionary distance that is maintained across the deepest divergences in Halobacteria. We use this mathematical relationship to estimate the total transfers and amino acid substitutions delivered by HGTs in each genome, providing a measure of chimerism. For the relaxed core genes of each genome, we conservatively estimate that 11–20% of their evolution occurred in other haloarchaea. Our findings are unexpected, because the transfer and homologous recombination of relaxed core genes between members of the class Halobacteria disrupts the coevolution of genes; however, the generation of new combinations of divergent but functionally related genes may lead to adaptive phenotypes not available through cumulative mutations and recombination within a single population.
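The log-linear relationship between HGT frequency and evolutionary distance mentioned in this abstract can be illustrated with a small least-squares fit. The sketch below assumes one possible reading of that relationship, namely ln(frequency) = a + b * distance with b < 0; the function name `fit_log_linear` and the data points are hypothetical and synthetic, not values or code from the paper.

```python
import math

def fit_log_linear(distances, frequencies):
    """Ordinary least-squares fit of ln(frequency) = intercept + slope * distance."""
    ys = [math.log(f) for f in frequencies]
    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic illustration only: frequencies generated from ln(f) = 3 - 5 * d.
distances = [0.1, 0.2, 0.4, 0.8]
frequencies = [math.exp(3.0 - 5.0 * d) for d in distances]
intercept, slope = fit_log_linear(distances, frequencies)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

A negative fitted slope corresponds to transfer frequency falling off as the donor and recipient become more distant. Once the two parameters are fitted, the expected frequency at any distance is `math.exp(intercept + slope * d)`.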
Affiliation(s)
- David Williams
- Department of Molecular and Cell Biology, University of Connecticut, CT, USA
|
3
|
Mao F, Williams D, Zhaxybayeva O, Poptsova M, Lapierre P, Gogarten JP, Xu Y. Quartet decomposition server: a platform for analyzing phylogenetic trees. BMC Bioinformatics 2012; 13:123. [PMID: 22676320 PMCID: PMC3447714 DOI: 10.1186/1471-2105-13-123] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/15/2011] [Accepted: 06/07/2012] [Indexed: 11/11/2022]
Abstract
Background The frequent exchange of genetic material among prokaryotes means that extracting a majority or plurality phylogenetic signal from many gene families, and identifying gene families that are in significant conflict with that signal, are frequent tasks in comparative genomics, especially in phylogenomic analyses. Decomposition of gene trees into embedded quartets (unrooted trees each with four taxa) is a convenient and statistically powerful technique to address this challenging problem. This approach was shown to be useful in several studies of completely sequenced microbial genomes. Results We present here a web server that takes a collection of gene phylogenies, decomposes them into quartets, generates a Quartet Spectrum, and draws a split network. Users are also provided with various data download options for further analyses. Each gene phylogeny is to be represented by an assessment of phylogenetic information content, such as sets of trees reconstructed from bootstrap replicates or sampled from a posterior distribution. The Quartet Decomposition server is accessible at http://quartets.uga.edu. Conclusions The Quartet Decomposition server presented here provides a convenient means to perform Quartet Decomposition analyses and will empower users to find statistically supported phylogenetic conflicts.
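The decomposition of a tree into embedded quartets described in this abstract can be sketched in a few lines. The example below is a minimal illustration and not the server's implementation: it assumes the tree is supplied as a list of its nontrivial splits (each split given as the set of taxa on one side of an internal branch), and the names `quartet_topology` and `decompose` are hypothetical.

```python
from itertools import combinations

def quartet_topology(splits, quartet):
    """Return the topology of four taxa as a set of two pairs, or None if unresolved."""
    members = set(quartet)
    for side in splits:
        inside = members & side
        # A branch resolves the quartet when it separates the four taxa two against two.
        if len(inside) == 2:
            return frozenset([frozenset(inside), frozenset(members - inside)])
    return None

def decompose(taxa, splits):
    """Map every four-taxon subset to its embedded quartet topology."""
    return {frozenset(q): quartet_topology(splits, q)
            for q in combinations(sorted(taxa), 4)}

# Example tree ((A,B),C,(D,E)): its two internal branches give the splits AB|CDE and DE|ABC.
taxa = "ABCDE"
splits = [frozenset("AB"), frozenset("DE")]
quartets = decompose(taxa, splits)
for members, topology in sorted(quartets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(members), [sorted(pair) for pair in topology] if topology else None)
```

Tallying how often each of the three possible topologies of a given quartet appears across many gene trees (or across bootstrap replicates of one gene) gives the kind of support counts on which a quartet spectrum is built.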
Affiliation(s)
- Fenglou Mao
- Department of Biochemistry and Molecular Biology, University of Georgia, 120 Green St, Athens, GA 30622, USA
|
4
|
Williams D, Fournier GP, Lapierre P, Swithers KS, Green AG, Andam CP, Gogarten JP. A rooted net of life. Biol Direct 2011; 6:45. [PMID: 21936906 PMCID: PMC3189188 DOI: 10.1186/1745-6150-6-45] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 02/07/2011] [Accepted: 09/21/2011] [Indexed: 01/29/2023]
Abstract
Phylogenetic reconstruction using DNA and protein sequences has allowed the reconstruction of evolutionary histories encompassing all life. We present and discuss a means to incorporate much of this rich narrative into a single model that acknowledges the discrete evolutionary units that constitute the organism. Briefly, this Rooted Net of Life genome phylogeny is constructed around an initial, well resolved and rooted tree scaffold inferred from a supermatrix of combined ribosomal genes. Extant sampled ribosomes form the leaves of the tree scaffold. These leaves, but not necessarily the deeper parts of the scaffold, can be considered to represent a genome or pan-genome, and to be associated with members of other gene families within that sequenced (pan)genome. Unrooted phylogenies of gene families containing four or more members are reconstructed and superimposed over the scaffold. Initially, reticulations are formed where incongruities between topologies exist. Given sufficient evidence, edges may then be differentiated as those representing vertical lines of inheritance within lineages and those representing horizontal genetic transfers or endosymbioses between lineages. Reviewers: W. Ford Doolittle, Eric Bapteste and Robert Beiko.
Affiliation(s)
- David Williams
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA.
|
5
|
Abstract
Phylogenetic trees of individual genes of prokaryotes (archaea and bacteria) generally have different topologies, largely owing to extensive horizontal gene transfer (HGT), suggesting that the Tree of Life (TOL) should be replaced by a "net of life" as the paradigm of prokaryote evolution. However, trees remain the natural representation of the histories of individual genes given the fundamentally bifurcating process of gene replication. Therefore, although no single tree can fully represent the evolution of prokaryote genomes, the complete picture of evolution will necessarily combine trees and nets. A quantitative measure of the signals of tree and net evolution is derived from an analysis of all quartets of species in all trees of the "Forest of Life" (FOL), which consists of approximately 7,000 phylogenetic trees for prokaryote genes including approximately 100 nearly universal trees (NUTs). Although diverse routes of net-like evolution collectively dominate the FOL, the pattern of tree-like evolution that reflects the consistent topologies of the NUTs is the most prominent coherent trend. We show that the contributions of tree-like and net-like evolutionary processes substantially differ across bacterial and archaeal lineages and between functional classes of genes. Evolutionary simulations indicate that the central tree-like signal cannot be realistically explained by a self-reinforcing pattern of biased HGT.
Affiliation(s)
- Pere Puigbò
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
|
6
|
Tang K, Huang H, Jiao N, Wu CH. Phylogenomic analysis of marine Roseobacters. PLoS One 2010; 5:e11604. [PMID: 20657646 PMCID: PMC2904699 DOI: 10.1371/journal.pone.0011604] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 03/31/2010] [Accepted: 06/20/2010] [Indexed: 11/24/2022]
Abstract
BACKGROUND Members of the Roseobacter clade, which play a key role in the biogeochemical cycles of the ocean, are diverse and abundant, comprising 10-25% of the bacterioplankton in most marine surface waters. The rapid accumulation of whole-genome sequence data for the Roseobacter clade allows us to obtain a clearer picture of its evolution. METHODOLOGY/PRINCIPAL FINDINGS In this study about 1,200 likely orthologous protein families were identified from 17 Roseobacter bacteria genomes. Functional annotations for these genes are provided by iProClass. Phylogenetic trees were constructed for each gene using maximum likelihood (ML) and neighbor joining (NJ). Putative organismal phylogenetic trees were built with phylogenomic methods. These trees were compared and analyzed using principal coordinates analysis (PCoA), approximately unbiased (AU) and Shimodaira-Hasegawa (SH) tests. A core set of 694 genes with vertical descent signal that are resistant to horizontal gene transfer (HGT) is used to reconstruct a robust organismal phylogeny. In addition, we discovered the 109 most likely HGT genes. The core set contains genes that encode ribosomal apparatus, ABC transporters and chaperones often found in the environmental metagenomic and metatranscriptomic data. These genes in the core set are spread out uniformly among the various functional classes and biological processes. CONCLUSIONS/SIGNIFICANCE Here we report a new multigene-derived phylogenetic tree of the Roseobacter clade. Of particular interest is the HGT of eleven genes involved in vitamin B12 synthesis as well as key enzymes for dimethylsulfoniopropionate (DMSP) degradation. These acquired genes are essential for the growth of Roseobacters and their eukaryotic partners.
Affiliation(s)
- Kai Tang
- State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen, China
- Hongzhan Huang
- Protein Information Resource (PIR), Georgetown University, Washington, D.C., United States of America
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, United States of America
- Nianzhi Jiao
- State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen, China
- Cathy H. Wu
- Protein Information Resource (PIR), Georgetown University, Washington, D.C., United States of America
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, United States of America
|
7
|
Abstract
The notion that all prokaryotes belong to genomically and phenomically cohesive clusters that we might legitimately call "species" is a contentious one. At issue are (1) whether such clusters actually exist; (2) what species definition might most reliably identify them, if they do; and (3) what species concept -- by which is meant a genetic and ecological theory of speciation -- might best explain species existence and rationalize a species definition, if we could agree on one. We review existing theories and some relevant data. We conclude that microbiologists now understand in some detail the various genetic, population, and ecological processes that effect the evolution of prokaryotes. There will be on occasion circumstances under which these, working together, will form groups of related organisms sufficiently like each other that we might all agree to call them "species," but there is no reason that this must always be so. Thus, there is no principled way in which questions about prokaryotic species, such as how many there are, how large their populations are, or how globally they are distributed, can be answered. These questions can, however, be reformulated so that metagenomic methods and thinking will meaningfully address the biological patterns and processes whose understanding is our ultimate target.
|
8
|
Abstract
This chapter discusses the pros and cons of the existing computational methods for the detection of horizontal (or lateral) gene transfer and highlights the genome-wide studies utilizing these methods. The impact of horizontal gene transfer (HGT) on prokaryote genome evolution is discussed.
|
9
|
Genome evolution in cyanobacteria: the stable core and the variable shell. Proc Natl Acad Sci U S A 2008; 105:2510-5. [PMID: 18268351 DOI: 10.1073/pnas.0711165105] [Citation(s) in RCA: 117] [Impact Index Per Article: 7.3] [Indexed: 11/18/2022]
Abstract
Cyanobacteria are the only known prokaryotes capable of oxygenic photosynthesis, the evolution of which transformed the biology and geochemistry of Earth. The rapid increase in published genomic sequences of cyanobacteria provides the first opportunity to reconstruct events in the evolution of oxygenic photosynthesis on the scale of entire genomes. Here, we demonstrate the overall phylogenetic incongruence among 682 orthologous protein families from 13 genomes of cyanobacteria. However, using principal coordinates analysis, we discovered a core set of 323 genes with similar evolutionary trajectories. The core set is highly conserved in amino acid sequence and contains genes encoding the major components in the photosynthetic and ribosomal apparatus. Many of the key proteins are encoded by genome-wide conserved small gene clusters, which often are indicative of protein-protein, protein-prosthetic group, and protein-lipid interactions. We propose that the macromolecular interactions in complex protein structures and metabolic pathways retard the tempo of evolution of the core genes and hence exert a selection pressure that restricts piecemeal horizontal gene transfer of components of the core. Identification of the core establishes a foundation for reconstructing robust organismal phylogeny in genome space. Our phylogenetic trees constructed from 16S rRNA gene sequences, concatenated orthologous proteins, and the core gene set all suggest that the ancestral cyanobacterium did not fix nitrogen and probably was a thermophilic organism.
|
10
|
Podell S, Gaasterland T. DarkHorse: a method for genome-wide prediction of horizontal gene transfer. Genome Biol 2007; 8:R16. [PMID: 17274820 PMCID: PMC1852411 DOI: 10.1186/gb-2007-8-2-r16] [Citation(s) in RCA: 123] [Impact Index Per Article: 7.2] [Received: 08/04/2006] [Revised: 11/09/2006] [Accepted: 02/02/2007] [Indexed: 12/14/2022]
Abstract
DarkHorse is a new approach to rapid, genome-wide identification and ranking of horizontal transfer candidate proteins. The method is quantitative, reproducible, and computationally undemanding. It can be combined with genomic signature and/or phylogenetic tree-building procedures to improve accuracy and efficiency. The method is also useful for retrospective assessments of horizontal transfer prediction reliability, recognizing orthologous sequences that may have been previously overlooked or unavailable. These features are demonstrated in bacterial, archaeal, and eukaryotic examples.
Affiliation(s)
- Sheila Podell
- Scripps Genome Center, Scripps Institution of Oceanography, University of California at San Diego, Gilman Drive, La Jolla, CA 92093-0202, USA
- Terry Gaasterland
- Scripps Genome Center, Scripps Institution of Oceanography, University of California at San Diego, Gilman Drive, La Jolla, CA 92093-0202, USA
|
11
|
Poptsova MS, Gogarten JP. BranchClust: a phylogenetic algorithm for selecting gene families. BMC Bioinformatics 2007; 8:120. [PMID: 17425803 PMCID: PMC1853112 DOI: 10.1186/1471-2105-8-120] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Received: 12/08/2006] [Accepted: 04/10/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The first are easy to automate but usually they do not distinguish between paralogs and orthologs or have restriction on the number of taxa. Phylogenetic methods often are based on reconciliation of a gene tree with a known rooted species tree; a limitation of this approach, especially in case of prokaryotes, is that the species tree is often unknown, and that from the analyses of single gene families the branching order between related organisms frequently is unresolved. RESULTS Here we describe an algorithm for the automated selection of orthologous genes that recognizes orthologous genes from different species in a phylogenetic tree for any number of taxa. The algorithm is capable of distinguishing complete (containing all taxa) and incomplete (not containing all taxa) families and recognizes in- and outparalogs. The BranchClust algorithm is implemented in Perl with the use of the BioPerl module for parsing trees and is freely available at http://bioinformatics.org/branchclust. CONCLUSION BranchClust outperforms the Reciprocal Best Blast hit method in selecting more sets of putatively orthologous genes. In the test cases examined, the correctness of the selected families and of the identified in- and outparalogs was confirmed by inspection of the pertinent phylogenetic trees.
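The reciprocal best hit method that this abstract uses as a point of comparison can be sketched as follows. This is an illustration of that generic similarity-based baseline only, not of the BranchClust algorithm; the function names, gene identifiers, and scores are hypothetical, and the input is assumed to be a plain dictionary of pairwise similarity scores rather than real BLAST output.

```python
def best_hits(scores):
    """Given {(query, subject): score}, return the top-scoring subject for each query."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical scores for two genomes; a2 and b2 are not reciprocal best hits
# because b2 scores higher against a1.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 195, ("b2", "a1"): 140, ("b2", "a2"): 80}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input only one pair survives the reciprocity check, which illustrates why a purely score-based method can return fewer putative ortholog sets than a tree-based selection when paralogs are present.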
Affiliation(s)
- Maria S Poptsova
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
- J Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
|
12
|
The power of phylogenetic approaches to detect horizontally transferred genes. BMC Evol Biol 2007; 7:45. [PMID: 17376230 PMCID: PMC1847511 DOI: 10.1186/1471-2148-7-45] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 10/02/2006] [Accepted: 03/21/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND Horizontal gene transfer plays an important role in evolution because it sometimes allows recipient lineages to adapt to new ecological niches. High gene transfer frequencies were inferred for prokaryotic and early eukaryotic evolution. Does horizontal gene transfer also impact phylogenetic reconstruction of the evolutionary history of genomes and organisms? The answer to this question depends at least in part on the actual gene transfer frequencies and on the ability to weed out transferred genes from further analyses. Are the detected transfers mainly false positives, or are they the tip of an iceberg of many transfer events most of which go undetected by current methods? RESULTS Phylogenetic detection methods appear to be the method of choice to infer gene transfers, especially for ancient transfers and those followed by orthologous replacement. Here we explore how well some of these methods perform using in silico transfers between the terminal branches of a gamma proteobacterial, genome-based phylogeny. For the experiments performed here, on average the AU test at a 5% significance level detects 90.3% of the transfers and 91% of the exchanges as significant. Using the Robinson-Foulds distance, only 57.7% of the exchanges and 60% of the donations were identified as significant. Analyses using bipartition spectra appeared most successful in our test case. The power of detection was on average 97% using a 70% cut-off and 94.2% with a 90% cut-off for identifying conflicting bipartitions, while the rate of false positives was below 4.2% and 2.1% for the two cut-offs, respectively. For all methods the detection rates improved when more intervening branches separated donor and recipient. CONCLUSION Rates of detected transfers should not be mistaken for the actual transfer rates; most analyses of gene transfers remain anecdotal. The method and significance level to identify potential gene transfer events represent a trade-off between the frequency of erroneous identification (false positives) and the power to detect actual transfer events.
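The Robinson-Foulds distance named in this abstract counts the bipartitions found in one tree but not the other. The sketch below shows the standard unweighted definition under the assumption that each tree is already given as a list of its nontrivial bipartitions over the same taxon set; it does not reproduce the significance testing used in the study, and the function name `rf_distance` is hypothetical.

```python
def rf_distance(splits_a, splits_b, taxa):
    """Unweighted Robinson-Foulds distance: size of the symmetric difference of split sets."""
    all_taxa = frozenset(taxa)

    def canonical(side):
        # A split can be written from either side; always keep the same representative.
        side = frozenset(side)
        other = all_taxa - side
        return min(side, other, key=lambda s: (len(s), sorted(s)))

    a = {canonical(s) for s in splits_a}
    b = {canonical(s) for s in splits_b}
    return len(a ^ b)

taxa = "ABCDE"
reference_tree = [{"A", "B"}, {"D", "E"}]   # ((A,B),C,(D,E))
gene_tree = [{"A", "C"}, {"D", "E"}]        # ((A,C),B,(D,E))
distance = rf_distance(reference_tree, gene_tree, taxa)
print(distance)
```

Here the two trees share the DE split and differ in one split each, so the distance is 2. Writing a split from its other side (for example CDE instead of AB) does not change the result.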
|
13
|
Zhaxybayeva O, Gogarten JP, Charlebois RL, Doolittle WF, Papke RT. Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events. Genome Res 2006; 16:1099-108. [PMID: 16899658 PMCID: PMC1557764 DOI: 10.1101/gr.5322306] [Citation(s) in RCA: 208] [Impact Index Per Article: 11.6] [Received: 03/25/2006] [Indexed: 11/25/2022]
Abstract
Using 1128 protein-coding gene families from 11 completely sequenced cyanobacterial genomes, we attempt to quantify horizontal gene transfer events within cyanobacteria, as well as between cyanobacteria and other phyla. A novel method of detecting and enumerating potential horizontal gene transfer events within a group of organisms based on analyses of "embedded quartets" allows us to identify phylogenetic signal consistent with a plurality of gene families, as well as to delineate cases of conflict to the plurality signal, which include horizontally transferred genes. To infer horizontal gene transfer events between cyanobacteria and other phyla, we added homologs from 168 available genomes. We screened phylogenetic trees reconstructed for each of these extended gene families for highly supported monophyly of cyanobacteria (or lack of it). Cyanobacterial genomes reveal a complex evolutionary history, which cannot be represented by a single strictly bifurcating tree for all genes or even most genes, although a single completely resolved phylogeny was recovered from the quartets' plurality signals. We find more conflicts within cyanobacteria than between cyanobacteria and other phyla. We also find that genes from all functional categories are subject to transfer. However, in interphylum as compared to intraphylum transfers, the proportion of metabolic (operational) gene transfers increases, while the proportion of informational gene transfers decreases.
Affiliation(s)
- Olga Zhaxybayeva
- Genome Atlantic and Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia B3H 1X5, Canada.
|
14
|
Zhaxybayeva O, Lapierre P, Gogarten JP. Ancient gene duplications and the root(s) of the tree of life. PROTOPLASMA 2005; 227:53-64. [PMID: 16389494 DOI: 10.1007/s00709-005-0135-1] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Received: 03/18/2005] [Accepted: 05/31/2005] [Indexed: 05/06/2023]
Abstract
Tracing organismal histories on the timescale of the tree of life remains one of the challenging tasks in evolutionary biology. The hotly debated questions include the evolutionary relationship between the three domains of life (e.g., which of the three domains are sister domains, are the domains para-, poly-, or monophyletic) and the location of the root within the universal tree of life. For the latter, many different points of view have been considered but so far no consensus has been reached. The only widely accepted rationale to root the universal tree of life is to use anciently duplicated paralogous genes that are present in all three domains of life. To date only few anciently duplicated gene families useful for phylogenetic reconstruction have been identified. Here we present results from a systematic search for ancient gene duplications using twelve representative, completely sequenced, archaeal and bacterial genomes. Phylogenetic analyses of identified cases show that the majority of datasets support a root between Archaea and Bacteria; however, some datasets support alternative hypotheses, and all of them suffer from a lack of strong phylogenetic signal. The results are discussed with respect to the impact of horizontal gene transfer on the ability to reconstruct organismal evolution. The exchange of genetic information between divergent organisms gives rise to mosaic genomes, where different genes in a genome have different histories. Simulations show that even low rates of horizontal gene transfer dramatically complicate the reconstruction of organismal evolution, and that the different most recent common molecular ancestors likely existed at different times and in different lineages.
Affiliation(s)
- Olga Zhaxybayeva
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut 06269-3125, USA
|
15
|
Abstract
To what extent is the tree of life the best representation of the evolutionary history of microorganisms? Recent work has shown that, among sets of prokaryotic genomes in which most homologous genes show extremely low sequence divergence, gene content can vary enormously, implying that those genes that are variably present or absent are frequently horizontally transferred. Traditionally, successful horizontal gene transfer was assumed to provide a selective advantage to either the host or the gene itself, but could horizontally transferred genes be neutral or nearly neutral? We suggest that for many prokaryotes, the boundaries between species are fuzzy, and therefore the principles of population genetics must be broadened so that they can be applied to higher taxonomic categories.
Affiliation(s)
- J Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut 06269-3125, USA.
|
16
|
Hamel L, Zhaxybayeva O, Gogarten JP. PentaPlot: a software tool for the illustration of genome mosaicism. BMC Bioinformatics 2005; 6:139. [PMID: 15938752 PMCID: PMC1177926 DOI: 10.1186/1471-2105-6-139] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/04/2005] [Accepted: 06/06/2005] [Indexed: 12/02/2022]
Abstract
Background Dekapentagonal maps depict the phylogenetic relationships of five genomes in a visually appealing diagram and can be viewed as an alternative to a single evolutionary consensus tree. In particular, the generated maps focus attention on those gene families that significantly deviate from the consensus or plurality phylogeny. PentaPlot is a software tool that computes such dekapentagonal maps given an appropriate probability support matrix. Results The visualization with dekapentagonal maps critically depends on the optimal layout of unrooted tree topologies representing different evolutionary relationships among five organisms along the vertices of the dekapentagon. This is a difficult optimization problem given the large number of possible layouts. At its core our tool utilizes a genetic algorithm with demes and a local search strategy to search for the optimal layout. The hybrid genetic algorithm performs satisfactorily even in those cases where the chosen genomes are so divergent that little phylogenetic information has survived in the individual gene families. Conclusion PentaPlot is being made publicly available as an open source project at .
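The fifteen vertices of a dekapentagon match the number of distinct unrooted, fully resolved tree topologies for five labeled taxa, which is (2n - 5)!! for n taxa. The short sketch below computes that count; it is a standard combinatorial formula rather than anything taken from the PentaPlot code, and the function name is hypothetical.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of unrooted, fully resolved topologies for n labeled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("an unrooted binary tree needs at least three taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 6):
    print(n, count_unrooted_binary_trees(n))
```

The count grows quickly with the number of genomes (3 topologies for four taxa, 15 for five, 105 for six), which is one reason visual maps of this kind are practical only for very small sets of genomes.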
Affiliation(s)
- Lutz Hamel
- Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA
- Olga Zhaxybayeva
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, 06269-3125, USA
- Department of Biochemistry and Molecular Biology, Dalhousie University, 5850 College Street, Halifax, NS B3H 1X5, Canada
- J Peter Gogarten
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, 06269-3125, USA
|
17
|
Bern M, Goldberg D. Automatic selection of representative proteins for bacterial phylogeny. BMC Evol Biol 2005; 5:34. [PMID: 15927057 PMCID: PMC1175084 DOI: 10.1186/1471-2148-5-34] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 11/02/2004] [Accepted: 05/31/2005] [Indexed: 11/22/2022]
Abstract
Background Although there are now about 200 complete bacterial genomes in GenBank, deep bacterial phylogeny remains a difficult problem, due to confounding horizontal gene transfers and other phylogenetic "noise". Previous methods have relied primarily upon biological intuition or manual curation for choosing genomic sequences unlikely to be horizontally transferred, and have given inconsistent phylogenies with poor bootstrap confidence. Results We describe an algorithm that automatically picks "representative" protein families from entire genomes for use as phylogenetic characters. A representative protein family is one that, taken alone, gives an organismal distance matrix in good agreement with a distance matrix computed from all sufficiently conserved proteins. We then use maximum-likelihood methods to compute phylogenetic trees from a concatenation of representative sequences. We validate the use of representative proteins on a number of small phylogenetic questions with accepted answers. We then use our methodology to compute a robust and well-resolved phylogenetic tree for a diverse set of sequenced bacteria. The tree agrees closely with a recently published tree computed using manually curated proteins, and supports two proposed high-level clades: one containing Actinobacteria, Deinococcus, and Cyanobacteria ("Terrabacteria"), and another containing Planctomycetes and Chlamydiales. Conclusion Representative proteins provide an effective solution to the problem of selecting phylogenetic characters.
Affiliation(s)
- Marshall Bern
- Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA
- David Goldberg
- Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA
|
18
|
MacLeod D, Charlebois RL, Doolittle F, Bapteste E. Deduction of probable events of lateral gene transfer through comparison of phylogenetic trees by recursive consolidation and rearrangement. BMC Evol Biol 2005; 5:27. [PMID: 15819979 PMCID: PMC1087482 DOI: 10.1186/1471-2148-5-27] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.7] [Received: 11/26/2004] [Accepted: 04/08/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: When organismal phylogenies based on sequences of single marker genes are poorly resolved, a logical approach is to add more markers, on the assumption that weak but congruent phylogenetic signal will be reinforced in such multigene trees. Such approaches are valid only when the several markers indeed have identical phylogenies, an issue which many multigene methods (such as the use of concatenated gene sequences or the assembly of supertrees) do not directly address. Indeed, even when the true history is a mixture of vertical descent for some genes and lateral gene transfer (LGT) for others, such methods produce unique topologies. RESULTS: We have developed software that aims to extract evidence for vertical and lateral inheritance from a set of gene trees compared against an arbitrary reference tree. This evidence is then displayed as a synthesis showing support over the tree for vertical inheritance, overlaid with explicit LGT events inferred to have occurred over the history of the tree. Like splits-tree methods, one can thus identify nodes at which conflict occurs. Additionally one can make reasonable inferences about vertical and lateral signal, assigning putative donors and recipients. CONCLUSION: A tool such as ours can serve to explore the reticulated dimensionality of molecular evolution, by dissecting vertical and lateral inheritance at high resolution. By this, we mean that individual nodes can be examined not only for congruence, but also for coherence in light of LGT. We assert that our tools will facilitate the comparison of phylogenetic trees, and the interpretation of conflicting data.
Affiliation(s)
- Dave MacLeod
  - GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
- Robert L Charlebois
  - GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
- Ford Doolittle
  - GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
- Eric Bapteste
  - GenomeAtlantic, 1721 Lower Water Street, Suite 401, Halifax, NS, B3J 1S5, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, 5850 College St., Halifax, NS, B3H 1X5, Canada
|
19
|
Affiliation(s)
- Olga Zhaxybayeva
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
|
20
|
Zhaxybayeva O, Hamel L, Raymond J, Gogarten JP. Visualization of the phylogenetic content of five genomes using dekapentagonal maps. Genome Biol 2004; 5:R20. [PMID: 15003123 PMCID: PMC395770 DOI: 10.1186/gb-2004-5-3-r20] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 11/04/2003] [Revised: 12/18/2003] [Accepted: 01/13/2004] [Indexed: 11/12/2022] Open
Abstract
Dekapentagonal maps depict phylogenetic information for orthologous genes present in five genomes, and provide a pre-screen for putatively horizontally transferred genes. The methods presented here summarize phylogenetic relationships of genomes in visually appealing and informative figures. If the majority of individual gene phylogenies are unresolved, bipartition histograms provide a means of uncovering and analyzing the plurality consensus. Analyses of genomes representing five photosynthetic bacterial phyla and of the prokaryotic contributions to the eukaryotic cell illustrate the utility of the methods.
Affiliation(s)
- Olga Zhaxybayeva
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
- Lutz Hamel
  - Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA
- Jason Raymond
  - Department of Chemistry and Biochemistry, Arizona State University, Tempe, AZ 85287-1604, USA
- J Peter Gogarten
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA
|