1
|
Lu B. Evolutionary Insights into the Relationship of Frogs, Salamanders, and Caecilians and Their Adaptive Traits, with an Emphasis on Salamander Regeneration and Longevity. Animals (Basel) 2023; 13:3449. [PMID: 38003067 PMCID: PMC10668855 DOI: 10.3390/ani13223449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 11/01/2023] [Accepted: 11/06/2023] [Indexed: 11/26/2023]
Abstract
The extant amphibians have developed uncanny abilities to adapt to their environment. I compared the genes of amphibians to those of other vertebrates to investigate the genetic changes underlying their unique traits, especially salamanders' regeneration and longevity. Using the well-supported Batrachia tree, I found that salamander genomes have undergone accelerated adaptive evolution, especially for development-related genes. The group-based comparison showed that several genes are under positive selection, rapid evolution, and unexpected parallel evolution with traits shared by distantly related species, such as the tail-regenerative lizard and the longer-lived naked mole rat. The genes, such as EEF1E1, PAFAH1B1, and OGFR, may be involved in salamander regeneration, as they are involved in the apoptotic process, blastema formation, and cell proliferation, respectively. The genes PCNA and SIRT1 may be involved in extending lifespan, as they are involved in DNA repair and histone modification, respectively. Some genes, such as PCNA and OGFR, have dual roles in regeneration and aging, which suggests that these two processes are interconnected. My experiment validated the time course differential expression pattern of SERPINI1 and OGFR, two genes that have evolved in parallel in salamanders and lizards during the regeneration process of salamander limbs. In addition, I found several candidate genes responsible for frogs' frequent vocalization and caecilians' degenerative vision. This study provides much-needed insights into the processes of regeneration and aging, and the discovery of the critical genes paves the way for further functional analysis, which could open up new avenues for exploiting the genetic potential of humans and improving human well-being.
Affiliation(s)
- Bin Lu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
|
2
|
Patra AK, Kwon YM, Yang Y. Complete gammaproteobacterial endosymbiont genome assembly from a seep tubeworm Lamellibrachia satsuma. J Microbiol 2022; 60:916-927. [DOI: 10.1007/s12275-022-2057-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 05/09/2022] [Accepted: 05/24/2022] [Indexed: 11/27/2022]
|
3
|
Dhakal U, Dobhal S, Alvarez AM, Arif M. Phylogenetic Analyses of Xanthomonads Causing Bacterial Leaf Spot of Tomato and Pepper: Xanthomonas euvesicatoria Revealed Homologous Populations Despite Distant Geographical Distribution. Microorganisms 2019; 7:462. [PMID: 31623235 PMCID: PMC6843189 DOI: 10.3390/microorganisms7100462] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/01/2019] [Revised: 10/13/2019] [Accepted: 10/14/2019] [Indexed: 12/28/2022]
Abstract
Bacterial leaf spot of tomato and pepper (BLS), an economically important bacterial disease caused by four species of Xanthomonas (X. euvesicatoria (Xe), X. vesicatoria (Xv), X. gardneri (Xg), and X. perforans (Xp)), is a global problem and can cause over 50% crop loss under unfavorable conditions. Among the four species, Xe and Xv are prevalent worldwide. Characterization of the pathogens is crucial for disease management and regulatory purposes. In this study, we performed a multilocus sequence analysis (MLSA) with six genes (hrcN, dnaA, gyrB, gapA, pdg, and hmbs) on BLS strains. Other Xanthomonas species were included to determine phylogenetic relationships within and among the tested strains. Four BLS species comprising 76 strains from different serological groups and diverse geographical locations were resolved into three major clades. BLS xanthomonads formed distinct clusters in the phylogenetic analyses. Three other xanthomonads, X. albilineans, X. sacchari, and X. translucens pv. undulosa, revealed less than 85%, 88%, and 89% average nucleotide identity (ANI), respectively, with the other species of Xanthomonas included in this study. Both antibody and MLSA data showed that Xv was clearly separated from Xe and that the latter strains were remarkably clonal, even though they originated from distant geographical locations. The Xe strains formed two separate phylogenetic groups; Xe group A1 consisted only of tomato strains, whereas Xe group A2 included strains from pepper and tomato. In contrast, the Xv group showed greater heterogeneity. Some Xv strains from South America were closely related to strains from California, while others grouped closer to a strain from Indiana and more distantly to a strain from Hawaii. Using this information, molecular tests can now be devised to track the distribution of clonal populations that may be introduced into new geographic areas through seeds and other infected plant materials.
Affiliation(s)
- Upasana Dhakal
- Department of Plant and Environmental Protection Sciences, University of Hawaii at Manoa, Manoa, HI 96822, USA.
- Shefali Dobhal
- Department of Plant and Environmental Protection Sciences, University of Hawaii at Manoa, Manoa, HI 96822, USA.
- Anne M Alvarez
- Department of Plant and Environmental Protection Sciences, University of Hawaii at Manoa, Manoa, HI 96822, USA.
- Mohammad Arif
- Department of Plant and Environmental Protection Sciences, University of Hawaii at Manoa, Manoa, HI 96822, USA.
|
4
|
Zahariev M, Chen W, Visagie CM, Lévesque CA. Cluster oligonucleotide signatures for rapid identification by sequencing. BMC Bioinformatics 2018; 19:395. [PMID: 30522439 PMCID: PMC6284311 DOI: 10.1186/s12859-018-2363-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/16/2017] [Accepted: 09/09/2018] [Indexed: 01/19/2023]
Abstract
BACKGROUND Oligonucleotide signatures (signatures) have been widely used for studying microbial diversity and function in wet-lab settings, but using them for accurate in silico identification of organisms from high-throughput sequencing (HTS) data is only a proof of concept. Existing signature design programs for sequence signatures (signatures matching exactly one sequence) or clade signatures (signatures matching every sequence in a phylogenetic clade) are not able to identify all possible polymorphic sites for sequences with high similarity and perform poorly when handling large genome sequencing datasets. RESULTS We introduce cluster signatures: subsequences that match perfectly and exclusively any group of sequences in a data set. Cluster signatures provide complete recall for primer/probe design and increased discrimination between sequences beyond that of clade signatures. Using cluster signatures for in silico identification of HTS targets achieves good precision/recall and running time performance. This method has been implemented in an open source tool, the Automated Oligonucleotide Design Pipeline (aodp), included in supplementary material and available at: https://bitbucket.org/wenchen_aafc/aodp_v2.0_release . CONCLUSIONS Cluster signatures provide a rapid and universal analysis tool to identify all possible short diagnostic DNA markers and variants from any DNA sequencing dataset. They are particularly useful in discriminating genetic material from closely related organisms and in detecting deleterious mutations in highly or perfectly conserved genomic sites.
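The definition of a cluster signature given in the abstract (a subsequence that matches perfectly and exclusively some group of sequences) can be illustrated with a small sketch. This is not the aodp implementation; it is a naive fixed-length k-mer version written for illustration, and the function name `cluster_signatures`, the toy sequences, and the choice of k = 3 are all assumptions.

```python
from collections import defaultdict


def cluster_signatures(seqs, k):
    """Group k-mers by the exact set of sequences that contain them.

    Hypothetical helper for illustration only: a k-mer is treated as a
    signature of a group when it occurs in every member of that group
    and in no other sequence of the data set.
    """
    owners = defaultdict(set)  # k-mer -> names of sequences containing it
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    groups = defaultdict(set)  # group of names -> k-mers exclusive to it
    for kmer, names in owners.items():
        groups[frozenset(names)].add(kmer)
    return dict(groups)


# Toy data set (made up for this sketch).
toy = {"a": "ACGTAC", "b": "ACGTTT", "c": "GGGTTT"}
sigs = cluster_signatures(toy, 3)
for group, kmers in sigs.items():
    print(sorted(group), sorted(kmers))
```

In this toy set, sequence "b" has no signature of its own, but it is still distinguishable through the groups it shares with "a" and with "c", which is the kind of extra discrimination the abstract attributes to cluster signatures.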
Affiliation(s)
- Manuel Zahariev
- Ottawa R&D Centre, Agriculture & Agri-Food Canada, 960 Carling Ave., Ottawa, ON, K1A 0C6 Canada
- Wen Chen
- Skwez Technology Corp, Box 3674, Garibaldi Highlands, BC, V0N 1T0 Canada
- Cobus M. Visagie
- The Agricultural Research Council – PPRI, P/Bag X134, Queenswood, 0121 South Africa
- C. André Lévesque
- Sidney Laboratory Project - Science, Canadian Food Inspection Agency, Floor 2E, Room 233, 59 Camelot Drive, Ottawa, ON, K1A 0Y9 Canada
|
5
|
Gao A, Zhang J, Zhang W. Evolution of RAD- and DIV-Like Genes in Plants. Int J Mol Sci 2017; 18:1961. [PMID: 28902138 PMCID: PMC5618610 DOI: 10.3390/ijms18091961] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/21/2017] [Revised: 09/01/2017] [Accepted: 09/01/2017] [Indexed: 11/25/2022]
Abstract
Developmental genetic studies of Antirrhinum majus demonstrated that two transcription factors from the MYB gene family, RADIALIS (RAD) and DIVARICATA (DIV), interact through antagonism to regulate floral dorsoventral asymmetry. Interestingly, a similar antagonistic interaction found among proteins of FSM1 (RAD-like) and MYBI (DIV-like) in Solanum lycopersicum is involved in fruit development. Here, we report the reconstruction of the phylogeny of I-box-like and R-R-type clades, where RAD- and DIV-like genes belong, respectively. We also examined the homology of these antagonistic MYB proteins using these phylogenies. The results show that there are likely three paralogs of RAD-/I-box-like genes, RAD1, RAD2, and RAD3, which originated in the common ancestor of the core eudicots. In contrast, R-R-type sequences fall into two major clades, RR1 and RR2, the result of gene duplication in the common ancestor of both monocots and dicots. RR1 was divided into clades RR1A, RR1B, and RR1C, while RR2 was divided into clades RR2A/DIV1, RR2B/DIV2, and RR2C/DIV3. We demonstrate that among similar antagonistic interactions in An. majus and So. lycopersicum, RAD-like genes originate from the RAD2 clade, while DIV-like genes originate from distantly related paralogs of the R-R-type lineage. The phylogenetic analyses of these two MYB clades lay the foundation for future comparative studies including testing the evolution of the antagonistic relationship of proteins.
Affiliation(s)
- Ao Gao
- Department of Biology, Virginia Commonwealth University, 1000 West Cary Street, Richmond, VA 23284, USA.
- Jingbo Zhang
- Department of Biology, Virginia Commonwealth University, 1000 West Cary Street, Richmond, VA 23284, USA.
- Wenheng Zhang
- Department of Biology, Virginia Commonwealth University, 1000 West Cary Street, Richmond, VA 23284, USA.
|
6
|
Lake JA, Larsen J, Sarna B, de la Haba RR, Pu Y, Koo H, Zhao J, Sinsheimer JS. Rings Reconcile Genotypic and Phenotypic Evolution within the Proteobacteria. Genome Biol Evol 2015; 7:3434-42. [PMID: 26659922 PMCID: PMC4700952 DOI: 10.1093/gbe/evv221] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Accepted: 11/09/2015] [Indexed: 11/13/2022]
Abstract
Although prokaryotes are usually classified using molecular phylogenies instead of phenotypes after the advent of gene sequencing, neither of these methods is satisfactory because the phenotypes cannot explain the molecular trees and the trees do not fit the phenotypes. This scientific crisis still exists and the profound disconnection between these two pillars of evolutionary biology--genotypes and phenotypes--grows larger. We use rings and a genomic form of goods thinking to resolve this conundrum (McInerney JO, Cummins C, Haggerty L. 2011. Goods thinking vs. tree thinking. Mobile Genet Elements. 1:304-308; Nelson-Sathi S, et al. 2015. Origins of major archaeal clades correspond to gene acquisitions from bacteria. Nature 517:77-80). The Proteobacteria is the most speciose prokaryotic phylum known. It is an ideal phylogenetic model for reconstructing Earth's evolutionary history. It contains diverse free living, pathogenic, photosynthetic, sulfur metabolizing, and symbiotic species. Due to its large number of species (Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: the unseen majority. Proc Nat Acad Sci U S A. 95:6578-6583) it was initially expected to provide strong phylogenetic support for a proteobacterial tree of life. But despite its many species, sequence-based tree analyses are unable to resolve its topology. Here we develop new rooted ring analyses and study proteobacterial evolution. Using protein family data and new genome-based outgroup rooting procedures, we reconstruct the complex evolutionary history of the proteobacterial rings (combinations of tree-like divergences and endosymbiotic-like convergences). We identify and map the origins of major gene flows within the rooted proteobacterial rings (P < 3.6 × 10^-6) and find that the evolution of the "Alpha-," "Beta-," and "Gammaproteobacteria" is represented by a unique set of rings. Using new techniques presented here we also root these rings using outgroups. We also map the independent flows of genes involved in DNA-, RNA-, ATP-, and membrane-related processes within the Proteobacteria and thereby demonstrate that these large gene flows are consistent with endosymbioses (P < 3.6 × 10^-9). Our analyses illustrate what it means to find that a gene is present, or absent, within a gene flow, and thereby clarify the origin of the apparent conflicts between genotypes and phenotypes. Here we identify the gene flows that introduced photosynthesis into the Alpha-, Beta-, and Gammaproteobacteria from the common ancestor of the Actinobacteria and the Firmicutes. Our results also explain why rooted rings, unlike trees, are consistent with the observed genotypic and phenotypic relationships observed among the various proteobacterial classes. We find that ring phylogenies can explain the genotypes and the phenotypes of biological processes within large and complex groups like the Proteobacteria.
Affiliation(s)
- Yiyi Pu
- University of California, Los Angeles; Zhejiang University, Zhejiang, China
- HyunMin Koo
- University of California, Los Angeles; University of Alabama, Birmingham
- Jun Zhao
- University of California, Los Angeles; Peking University, Beijing, China
|
7
|
Kumar N, Lad G, Giuntini E, Kaye ME, Udomwong P, Shamsani NJ, Young JPW, Bailly X. Bacterial genospecies that are not ecologically coherent: population genomics of Rhizobium leguminosarum. Open Biol 2015; 5:140133. [PMID: 25589577 PMCID: PMC4313370 DOI: 10.1098/rsob.140133] [Citation(s) in RCA: 105] [Impact Index Per Article: 11.7] [Indexed: 01/08/2023]
Abstract
Biological species may remain distinct because of genetic isolation or ecological adaptation, but these two aspects do not always coincide. To establish the nature of the species boundary within a local bacterial population, we characterized a sympatric population of the bacterium Rhizobium leguminosarum by genomic sequencing of 72 isolates. Although all strains have 16S rRNA typical of R. leguminosarum, they fall into five genospecies by the criterion of average nucleotide identity (ANI). Many genes, on plasmids as well as the chromosome, support this division: recombination of core genes has been largely within genospecies. Nevertheless, variation in ecological properties, including symbiotic host range and carbon-source utilization, cuts across these genospecies, so that none of these phenotypes is diagnostic of genospecies. This phenotypic variation is conferred by mobile genes. The genospecies meet the Mayr criteria for biological species in respect of their core genes, but do not correspond to coherent ecological groups, so periodic selection may not be effective in purging variation within them. The population structure is incompatible with traditional ‘polyphasic taxonomy’ that requires bacterial species to have both phylogenetic coherence and distinctive phenotypes. More generally, genomics has revealed that many bacterial species share adaptive modules by horizontal gene transfer, and we envisage a more consistent taxonomic framework that explicitly recognizes this. Significant phenotypes should be recognized as ‘biovars’ within species that are defined by core gene phylogeny.
Affiliation(s)
- Nitin Kumar
- Department of Biology, University of York, York YO10 5DD, UK
- Ganesh Lad
- Department of Biology, University of York, York YO10 5DD, UK
- Elisa Giuntini
- Department of Biology, University of York, York YO10 5DD, UK
- Maria E Kaye
- Department of Biology, University of York, York YO10 5DD, UK
- J Peter W Young
- Department of Biology, University of York, York YO10 5DD, UK
- Xavier Bailly
- Department of Biology, University of York, York YO10 5DD, UK
|
8
|
Pillonel T, Bertelli C, Salamin N, Greub G. Taxogenomics of the order Chlamydiales. Int J Syst Evol Microbiol 2015; 65:1381-1393. [PMID: 25634949 DOI: 10.1099/ijs.0.000090] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022]
Abstract
Bacterial classification is a long-standing problem for taxonomists and species definition itself is constantly debated among specialists. The classification of strict intracellular bacteria such as members of the order Chlamydiales mainly relies on DNA- or protein-based phylogenetic reconstructions because these organisms exhibit few phenotypic differences and are difficult to culture. The availability of full genome sequences allows the comparison of the performance of conserved protein sequences to reconstruct Chlamydiales phylogeny. This approach permits the identification of markers that maximize the phylogenetic signal and the robustness of the inferred tree. In this study, a set of 424 core proteins was identified and concatenated to reconstruct a reference species tree. Although individual protein trees present variable topologies, we detected only a few cases of incongruence with the reference species tree, which were due to horizontal gene transfers. Detailed analysis of the phylogenetic information of individual protein sequences (i) showed that phylogenies based on single randomly chosen core proteins are not reliable and (ii) led to the identification of twenty taxonomically highly reliable proteins, allowing the reconstruction of a robust tree close to the reference species tree. We recommend using these protein sequences to precisely classify newly discovered isolates at the family, genus and species levels.
Affiliation(s)
- Trestan Pillonel
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Research on Intracellular Bacteria, Institute of Microbiology, University Hospital Center and University of Lausanne, Lausanne, Switzerland
- Claire Bertelli
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Research on Intracellular Bacteria, Institute of Microbiology, University Hospital Center and University of Lausanne, Lausanne, Switzerland
- Nicolas Salamin
- Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Gilbert Greub
- Center for Research on Intracellular Bacteria, Institute of Microbiology, University Hospital Center and University of Lausanne, Lausanne, Switzerland
|
9
|
Li J, Wong CF, Wong MT, Huang H, Leung FC. Modularized evolution in archaeal methanogens phylogenetic forest. Genome Biol Evol 2014; 6:3344-59. [PMID: 25502908 PMCID: PMC4986457 DOI: 10.1093/gbe/evu259] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 11/17/2014] [Indexed: 11/13/2022]
Abstract
Methanogens are methane-producing archaea that play a key role in the global carbon cycle. To date, the evolutionary history of methanogens and closely related nonmethanogen species remains unresolved among studies based on different genetic markers, a discordance attributed to horizontal gene transfers (HGTs). To decipher both congruent and conflicting evolutionary events, and to reconstruct coevolved gene clusters and the hierarchical structure of the archaeal methanogen phylogenetic forest, comprehensive evolution and network analyses were performed upon 3,694 gene families from 41 methanogens and 33 closely related archaea. Our results show that 1) greater than 50% of genes are in topological dissonance with others; 2) the prevalent interorder HGTs, even for core genes, in methanogen genomes led to their scrambled phylogenetic relationships; 3) most methanogenesis-related genes have experienced at least one HGT; 4) greater than 20% of the genes in methanogen genomes were transferred horizontally from other archaea, with genes involved in cell-wall synthesis and defense system having been transferred most frequently; 5) the coevolution network contains seven statistically robust modules, wherein the central module has the highest average node strength and comprises a majority of the core genes; 6) genes of different coevolutionary modules expanded at different times and in different evolutionary lineages, constructing diversified pan-genome structures; 7) the modularized evolution is also closely related to the vertical evolution signals and the HGT rate of the genes. Overall, this study presents a modularized phylogenetic forest that describes a combination of complicated vertical and nonvertical evolutionary processes for methanogenic archaeal species.
Affiliation(s)
- Jun Li
- School of Biological Sciences, Faculty of Science, The University of Hong Kong, China
- Chi-Fat Wong
- School of Biological Sciences, Faculty of Science, The University of Hong Kong, China
- Mabel Ting Wong
- School of Biological Sciences, Faculty of Science, The University of Hong Kong, China; Present address: Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, Canada
- He Huang
- Center for Marine Environmental Studies, Ehime University, Japan
- Frederick C Leung
- School of Biological Sciences, Faculty of Science, The University of Hong Kong, China; Bioinformatics Center, Nanjing Agricultural University, People's Republic of China
|
10
|
Struck TH, Wey-Fabrizius AR, Golombek A, Hering L, Weigert A, Bleidorn C, Klebow S, Iakovenko N, Hausdorf B, Petersen M, Kück P, Herlyn H, Hankeln T. Platyzoan paraphyly based on phylogenomic data supports a noncoelomate ancestry of spiralia. Mol Biol Evol 2014; 31:1833-49. [PMID: 24748651 DOI: 10.1093/molbev/msu143] [Citation(s) in RCA: 112] [Impact Index Per Article: 11.2] [Indexed: 12/24/2022]
Abstract
Based on molecular data, three major clades have been recognized within Bilateria: Deuterostomia, Ecdysozoa, and Spiralia. Within Spiralia, small-sized and simply organized animals such as flatworms, gastrotrichs, and gnathostomulids have recently been grouped together as Platyzoa. However, the representation of putative platyzoans was low in the respective molecular phylogenetic studies, in terms of both taxon number and sequence data. Furthermore, increased substitution rates in platyzoan taxa raised the possibility that monophyletic Platyzoa represents an artifact due to long-branch attraction. In order to overcome such problems, we employed a phylogenomic approach, thereby substantially increasing 1) the number of sampled species within Platyzoa and 2) species-specific sequence coverage in data sets of up to 82,162 amino acid positions. Using established and new measures (long-branch score), we disentangled phylogenetic signal from misleading effects such as long-branch attraction. In doing so, our phylogenomic analyses did not recover a monophyletic origin of platyzoan taxa that, instead, appeared paraphyletic with respect to the other spiralians. Platyhelminthes and Gastrotricha formed a monophylum, which we name Rouphozoa. To the exclusion of Gnathifera, Rouphozoa and all other spiralians represent a monophyletic group, which we name Platytrochozoa. Platyzoan paraphyly suggests that the last common ancestor of Spiralia was a simple-bodied organism lacking coelomic cavities, segmentation, and complex brain structures, and that more complex animals such as annelids evolved from such a simply organized ancestor. This conclusion contradicts alternative evolutionary scenarios proposing an annelid-like ancestor of Bilateria and Spiralia and several independent events of secondary reduction.
Affiliation(s)
- Torsten H Struck
- Zoological Research Museum Alexander Koenig, Bonn, Germany; University of Osnabrück, FB05 Biology/Chemistry, AG Zoology, Osnabrück, Germany
- Alexandra R Wey-Fabrizius
- Institute of Molecular Genetics, Biosafety Research and Consulting, Johannes Gutenberg University, Mainz, Germany
- Anja Golombek
- Zoological Research Museum Alexander Koenig, Bonn, Germany
- Lars Hering
- Animal Evolution and Development, Institute of Biology II, University of Leipzig, Leipzig, Germany
- Anne Weigert
- Molecular Evolution and Systematics of Animals, Institute of Biology, University of Leipzig, Leipzig, Germany
- Christoph Bleidorn
- Molecular Evolution and Systematics of Animals, Institute of Biology, University of Leipzig, Leipzig, Germany
- Sabrina Klebow
- Institute of Molecular Genetics, Biosafety Research and Consulting, Johannes Gutenberg University, Mainz, Germany
- Nataliia Iakovenko
- Department of Biology and Ecology, Ostravian University in Ostrava, Ostrava, Czech Republic; Department of Invertebrate Fauna and Systematics, Schmalhausen Institute of Zoology NAS of Ukraine, Kyiv, Ukraine
- Malte Petersen
- Zoological Research Museum Alexander Koenig, Bonn, Germany
- Patrick Kück
- Zoological Research Museum Alexander Koenig, Bonn, Germany
- Holger Herlyn
- Institute of Anthropology, Johannes Gutenberg University, Mainz, Germany
- Thomas Hankeln
- Institute of Molecular Genetics, Biosafety Research and Consulting, Johannes Gutenberg University, Mainz, Germany
|
11
|
Abstract
Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease.
|
12
|
Fontanez KM, Cavanaugh CM. Evidence for horizontal transmission from multilocus phylogeny of deep-sea mussel (Mytilidae) symbionts. Environ Microbiol 2014; 16:3608-21. [DOI: 10.1111/1462-2920.12379] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 09/04/2013] [Accepted: 12/22/2013] [Indexed: 11/29/2022]
Affiliation(s)
- Kristina M. Fontanez
- Department of Organismic and Evolutionary Biology; Harvard University; Cambridge MA 02138 USA
- Colleen M. Cavanaugh
- Department of Organismic and Evolutionary Biology; Harvard University; Cambridge MA 02138 USA
|
13
|
Capella-Gutierrez S, Kauff F, Gabaldón T. A phylogenomics approach for selecting robust sets of phylogenetic markers. Nucleic Acids Res 2014; 42:e54. [PMID: 24476915 PMCID: PMC3985644 DOI: 10.1093/nar/gku071] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 12/28/2022]
Abstract
Reconstructing the evolutionary relationships of species is a major goal in biology. Despite the increasing number of completely sequenced genomes, a large number of phylogenetic projects rely on targeted sequencing and analysis of a relatively small sample of marker genes. The selection of these phylogenetic markers should ideally be based on accurate predictions of their combined, rather than individual, potential to accurately resolve the phylogeny of interest. Here we present and validate a new phylogenomics strategy to efficiently select a minimal set of stable markers able to reconstruct the underlying species phylogeny. In contrast to previous approaches, our methodology does not only rely on the ability of individual genes to reconstruct a known phylogeny, but it also explores the combined power of sets of concatenated genes to accurately infer phylogenetic relationships of species not previously analyzed. We applied our approach to two broad sets of cyanobacterial and ascomycetous fungal species, and provide two minimal sets of six and four genes, respectively, necessary to fully resolve the target phylogenies. This approach paves the way for the informed selection of phylogenetic markers in the effort of reconstructing the tree of life.
Affiliation(s)
- Salvador Capella-Gutierrez
- Bioinformatics and Genomics Programme. Centre for Genomic Regulation (CRG) and UPF. Doctor Aiguader, 88. 08003 Barcelona, Spain, Universitat Pompeu Fabra (UPF). 08003 Barcelona, Spain, University of Kaiserslautern, Molecular Phylogenetics, Postfach 3049, 67653 Kaiserslautern, Germany and Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
|
14
|
Matzke NJ, Shih PM, Kerfeld CA. Bayesian analysis of congruence of core genes in Prochlorococcus and Synechococcus and implications on horizontal gene transfer. PLoS One 2014; 9:e85103. [PMID: 24465485 PMCID: PMC3897415 DOI: 10.1371/journal.pone.0085103] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/30/2013] [Accepted: 11/22/2013] [Indexed: 01/28/2023]
Abstract
It is often suggested that horizontal gene transfer is so ubiquitous in microbes that the concept of a phylogenetic tree representing the pattern of vertical inheritance is oversimplified or even positively misleading. "Universal proteins" have been used to infer the organismal phylogeny, but have been criticized as being only the "tree of one percent." Currently, few options exist for those wishing to rigorously assess how well a universal protein phylogeny, based on a relative handful of well-conserved genes, represents the phylogenetic histories of hundreds of genes. Here, we address this problem by proposing a visualization method and a statistical test within a Bayesian framework. We use the genomes of marine cyanobacteria, a group thought to exhibit substantial amounts of HGT, as a test case. We take 379 orthologous gene families from 28 cyanobacteria genomes and estimate the Bayesian posterior distributions of trees - a "treecloud" - for each, as well as for a concatenated dataset based on putative "universal proteins." We then calculate the average distance between trees within and between all treeclouds on various metrics and visualize this high-dimensional space with non-metric multidimensional scaling (NMMDS). We show that the tree space is strongly clustered and that the universal protein treecloud is statistically significantly closer to the center of this tree space than any individual gene treecloud. We apply several commonly-used tests for incongruence/HGT and show that they agree HGT is rare in this dataset, but make different choices about which genes were subject to HGT. Our results show that the question of the representativeness of the "tree of one percent" is a quantitative empirical question, and that the phylogenetic central tendency is a meaningful observation even if many individual genes disagree due to the various sources of incongruence.
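The step of averaging distances between trees in two treeclouds can be sketched as follows. The abstract says only that "various metrics" were used, so the Robinson-Foulds-style split distance here is an assumption chosen for simplicity, not the authors' procedure. Trees are represented as sets of bipartitions, and the names `rf_distance`, `mean_cloud_distance`, and the five-taxon toy trees are hypothetical.

```python
from itertools import product


def rf_distance(tree_a, tree_b):
    """Count the splits present in exactly one of the two trees.

    Each tree is a frozenset of splits; each split is a frozenset holding
    the taxa on one side of an internal branch (here, the side without "E").
    """
    return len(tree_a ^ tree_b)


def mean_cloud_distance(cloud_a, cloud_b):
    """Average the split distance over all pairs drawn from the two clouds."""
    pairs = list(product(cloud_a, cloud_b))
    return sum(rf_distance(a, b) for a, b in pairs) / len(pairs)


def tree(*splits):
    """Build a toy tree from strings such as "AB" (taxa A and B grouped)."""
    return frozenset(frozenset(s) for s in splits)


# Toy five-taxon trees (A to E), made up for this sketch.
t1 = tree("AB", "ABC")
t2 = tree("AB", "ABD")
t3 = tree("AC", "ABC")

gene_cloud = [t1, t2]    # stand-in for one gene family's posterior sample
reference_cloud = [t1]   # stand-in for the concatenated-data sample
other_cloud = [t3]

print(mean_cloud_distance(gene_cloud, reference_cloud))
print(mean_cloud_distance(gene_cloud, other_cloud))
```

A matrix of such averages over all pairs of clouds is the kind of input an ordination method such as non-metric multidimensional scaling would take.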
Affiliation(s)
- Nicholas J. Matzke
- Department of Integrative Biology, University of California, Berkeley, California, United States of America
- Patrick M. Shih
- Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America
- Cheryl A. Kerfeld
- Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America
- US Department of Energy-Joint Genome Institute, Walnut Creek, California, United States of America
15
Kück P, Struck TH. BaCoCa – A heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions. Mol Phylogenet Evol 2014; 70:94-8. [DOI: 10.1016/j.ympev.2013.09.011] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 07/04/2013] [Revised: 09/12/2013] [Accepted: 09/14/2013] [Indexed: 10/26/2022]
16
Lu B, Yang W, Dai Q, Fu J. Using genes as characters and a parsimony analysis to explore the phylogenetic position of turtles. PLoS One 2013; 8:e79348. [PMID: 24278129 PMCID: PMC3836853 DOI: 10.1371/journal.pone.0079348] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 07/18/2013] [Accepted: 09/26/2013] [Indexed: 11/18/2022] Open
Abstract
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a "genes as characters" approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog) and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara) but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis), similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. 
Finally, treating genes as characters provides a convenient method for examining gene and genome evolution.
Affiliation(s)
- Bin Lu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Weizhao Yang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Qiang Dai
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Jinzhong Fu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, Sichuan, China
- Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
17
Lefeuvre P, Cellier G, Remenant B, Chiroleu F, Prior P. Constraints on genome dynamics revealed from gene distribution among the Ralstonia solanacearum species. PLoS One 2013; 8:e63155. [PMID: 23723974 PMCID: PMC3665557 DOI: 10.1371/journal.pone.0063155] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 01/14/2013] [Accepted: 03/28/2013] [Indexed: 01/11/2023] Open
Abstract
Because it is suspected that gene content may partly explain host adaptation and ecology of pathogenic bacteria, it is important to study factors affecting genome composition and its evolution. While recent genomic advances have revealed extremely large pan-genomes for some bacterial species, it remains difficult to predict to what extent gene pool is accessible within or transferable between populations. As genomes bear imprints of the history of the organisms, gene distribution pattern analyses should provide insights into the forces and factors at play in the shaping and maintaining of bacterial genomes. In this study, we revisited the data obtained from a previous CGH microarrays analysis in order to assess the genomic plasticity of the R. solanacearum species complex. Gene distribution analyses demonstrated the remarkably dispersed genome of R. solanacearum with more than half of the genes being accessory. From the reconstruction of the ancestral genomes compositions, we were able to infer the number of gene gain and loss events along the phylogeny. Analyses of gene movement patterns reveal that factors associated with gene function, genomic localization and ecology delineate gene flow patterns. While the chromosome displayed lower rates of movement, the megaplasmid was clearly associated with hot-spots of gene gain and loss. Gene function was also confirmed to be an essential factor in gene gain and loss dynamics with significant differences in movement patterns between different COG categories. Finally, analyses of gene distribution highlighted possible highways of horizontal gene transfer. Due to sampling and design bias, we can only speculate on factors at play in this gene movement dynamic. Further studies examining precise conditions that favor gene transfer would provide invaluable insights in the fate of bacteria, species delineation and the emergence of successful pathogens.
Affiliation(s)
- Pierre Lefeuvre
- CIRAD UMR Peuplements Végétaux et Bioagresseurs en Milieu Tropical, CIRAD-Université de la Réunion, Pôle de Protection des Plantes, Saint Pierre, La Réunion, France.
18
Lasek-Nesselquist E, Gogarten JP. The effects of model choice and mitigating bias on the ribosomal tree of life. Mol Phylogenet Evol 2013; 69:17-38. [PMID: 23707703 DOI: 10.1016/j.ympev.2013.05.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 01/25/2013] [Revised: 04/26/2013] [Accepted: 05/08/2013] [Indexed: 01/03/2023]
Abstract
Deep-level relationships within Bacteria, Archaea, and Eukarya as well as the relationships of these three domains to each other require resolution. The ribosomal machinery, universal to all cellular life, represents a protein repertoire resistant to horizontal gene transfer, which provides a largely congruent signal necessary for reconstructing a tree suitable as a backbone for life's reticulate history. Here, we generate a ribosomal tree of life from a robust taxonomic sampling of Bacteria, Archaea, and Eukarya to elucidate deep-level intra-domain and inter-domain relationships. Lack of phylogenetic information and systematic errors caused by inadequate models (that cannot account for substitution rate or compositional heterogeneities) or improper model selection compound conflicting phylogenetic signals from HGT and/or paralogy. Thus, we tested several models of varying sophistication on three different datasets, performed removal of fast-evolving or long-branched Archaea and Eukarya, and employed three different strategies to remove compositional heterogeneity to examine their effects on the topological outcome. Our results support a two-domain topology for the tree of life, where Eukarya emerges from within Archaea as sister to a Korarchaeota/Thaumarchaeota (KT) or Crenarchaeota/KT clade for all models under all or at least one of the strategies employed. Taxonomic manipulation allows single-matrix and certain mixture models to vacillate between two-domain and three-domain phylogenies. We find that models vary in their ability to resolve different areas of the tree of life, which does not necessarily correlate with model complexity. For example, both single-matrix and some mixture models recover monophyletic Crenarchaeota and Euryarchaeota archaeal phyla. In contrast, the most sophisticated model recovers a paraphyletic Euryarchaeota but detects two large clades that comprise the Bacteria, which were recovered separately but never together in the other models. 
Overall, models recovered consistent topologies despite dataset modifications due to the removal of compositional bias, which reflects either ineffective bias reduction or robust datasets that allow models to overcome reconstruction artifacts. We recommend a comparative approach for evolutionary models to identify model weaknesses as well as consensus relationships.
19
Merhej V, Raoult D. Rhizome of life, catastrophes, sequence exchanges, gene creations, and giant viruses: how microbial genomics challenges Darwin. Front Cell Infect Microbiol 2012; 2:113. [PMID: 22973559 PMCID: PMC3428605 DOI: 10.3389/fcimb.2012.00113] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/23/2012] [Accepted: 08/06/2012] [Indexed: 11/29/2022] Open
Abstract
Darwin's theory about the evolution of species has been the object of considerable dispute. In this review, we have described seven key principles in Darwin's book The Origin of Species and tried to present how genomics challenges each of these concepts and improves our knowledge about evolution. Darwin believed that species evolution consists of a positive directional selection ensuring the “survival of the fittest.” The most developed state of the species is characterized by increasing complexity. Darwin proposed the theory of “descent with modification” according to which all species evolve from a single common ancestor through a gradual process of small modification of their vertical inheritance. Finally, the process of evolution can be depicted in the form of a tree. However, microbial genomics showed that evolution is better described as the “biological changes over time.” The mode of change is not unidirectional and does not necessarily favor advantageous mutations to increase fitness; it is rather subject to random selection as a result of catastrophic stochastic processes. Complexity is not necessarily the completion of development: several complex organisms have gone extinct and many microbes including bacteria with intracellular lifestyle have streamlined highly effective genomes. Genomes evolve through large events of gene deletions, duplications, insertions, and genome rearrangements rather than a gradual adaptive process. Genomes are dynamic and chimeric entities with gene repertoires that result from vertical and horizontal acquisitions as well as de novo gene creation. The chimeric character of microbial genomes excludes the possibility of finding a single common ancestor for all the genes recorded currently. Genomes are collections of genes with different evolutionary histories that cannot be represented by a single tree of life (TOL). A forest, a network or a rhizome of life may be more accurate to represent evolutionary relationships among species.
Affiliation(s)
- Vicky Merhej
- URMITE, UM63, CNRS 7278, IRD 198, INSERM U1095, Aix Marseille Université Marseille, France
20
Bhandari V, Naushad HS, Gupta RS. Protein based molecular markers provide reliable means to understand prokaryotic phylogeny and support Darwinian mode of evolution. Front Cell Infect Microbiol 2012; 2:98. [PMID: 22919687 PMCID: PMC3417386 DOI: 10.3389/fcimb.2012.00098] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 04/02/2012] [Accepted: 06/27/2012] [Indexed: 11/20/2022] Open
Abstract
The analyses of genome sequences have led to the proposal that lateral gene transfers (LGTs) among prokaryotes are so widespread that they disguise the interrelationships among these organisms. This has led to questioning of whether the Darwinian model of evolution is applicable to prokaryotic organisms. In this review, we discuss the usefulness of taxon-specific molecular markers such as conserved signature indels (CSIs) and conserved signature proteins (CSPs) for understanding the evolutionary relationships among prokaryotes and for assessing the influence of LGTs on prokaryotic evolution. The analyses of genomic sequences have identified large numbers of CSIs and CSPs that are unique properties of different groups of prokaryotes ranging from phylum to genus levels. The species distribution patterns of these molecular signatures strongly support a tree-like vertical inheritance of the genes containing these molecular signatures that is consistent with phylogenetic trees. Recent detailed studies in this regard on the Thermotogae and Archaea, which are reviewed here, have identified large numbers of CSIs and CSPs that are specific for the species from these two taxa and a number of their major clades. The genetic changes responsible for these CSIs (and CSPs) initially likely occurred in the common ancestors of these taxa and were then vertically transferred to various descendants. Although some CSIs and CSPs in unrelated groups of prokaryotes were identified, their small numbers and random occurrence have no apparent influence on the consistent tree-like branching pattern emerging from other markers. These results provide evidence that although LGT is an important evolutionary force, it does not mask the tree-like branching pattern of prokaryotes or understanding of their evolutionary relationships. The identified CSIs and CSPs also provide novel and highly specific means for identification of different groups of microbes and for taxonomical and biochemical studies.
Affiliation(s)
- Vaibhav Bhandari
- Department of Biochemistry and Biomedical Sciences, McMaster University Hamilton, ON, Canada
21
Xu L, Kuo J, Liu JK, Wong TY. Bacterial phylogenetic tree construction based on genomic translation stop signals. Microbial Informatics and Experimentation 2012; 2:6. [PMID: 22651236 PMCID: PMC3466146 DOI: 10.1186/2042-5783-2-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/23/2012] [Accepted: 04/15/2012] [Indexed: 11/10/2022]
Abstract
Background: The efficiencies of the stop codons TAA, TAG, and TGA in protein synthesis termination are not the same. These variations could allow many genes to be regulated. There are many similar nucleotide trimers found on the second and third reading-frames of a gene. They are called premature stop codons (PSC). Like stop codons, the PSC in bacterial genomes are also highly biased in terms of their quantities and qualities on the genes. Phylogenetically related species often share a similar PSC profile. We want to know whether the selective forces that influence the stop codons and the PSC usage biases in a genome are related. We also wish to know how strongly these trimers in a genome are related to the natural history of the bacterium. Knowing these relations may provide better knowledge of the phylogeny of bacteria. Results: A 16S rRNA-alignment tree of 19 well-studied α-, β- and γ-Proteobacteria type species is used as the standard reference for bacterial phylogeny. The genomes of sixty-one bacteria, belonging to the α-, β- and γ-Proteobacteria subphyla, are used for this study. The stop codons and PSC are collectively termed “Translation Stop Signals” (TSS). A gene is represented by nine scalars corresponding to the counts of TAA, TAG, and TGA on each of the three reading-frames of that gene. “Translation Stop Signals Ratio” (TSSR) is the ratio between the TSS counts. Four types of TSSR are investigated. The TSSR-1, TSSR-2 and TSSR-3 are each a 3-scalar series corresponding respectively to the average ratio of TAA: TAG: TGA on the first, second, and third reading-frames of all genes in a genome. The Genomic-TSSR is a 9-scalar series representing the ratio of distribution of all TSS on the three reading-frames of all genes in a genome. Results show that bacteria grouped by their similarities based on TSSR-1, TSSR-2, or TSSR-3 values could only partially resolve the phylogeny of the species. However, grouping bacteria based on their Genomic-TSSR values resulted in clusters of bacteria identical to those bacterial clusters of the reference tree. Unlike the 16S rRNA method, the Genomic-TSSR tree is also able to separate closely related species/strains at high resolution. Species and strains separated by the Genomic-TSSR grouping method are often in good agreement with those classified by other taxonomic methods. Correspondence analysis of individual genes shows that most genes in a bacterial genome share a similar TSSR value. However, within a chromosome, the Genic-TSSR values of genes near the replication origin region (Ori) are more similar to each other than those of genes near the terminus region (Ter). Conclusion: The translation stop signals on the three reading-frames of the genes on a bacterial genome are interrelated, possibly due to frequent off-frame recombination facilitated by translational-associated recombination (TSR). However, TSR may not occur randomly in a bacterial chromosome. Genes near the Ori region are often highly expressed and a bacterium always maintains multiple copies of Ori. Frequent collisions between DNA polymerase and RNA polymerase would create many DNA strand-breaks on the genes, whereas DNA strand-break-induced homologous recombination is more likely to take place between genes with similar sequence. Thus, localized recombination could explain why the TSSR of genes near the Ori region are more similar to each other. The quantity and quality of these TSS in a genome strongly reflect the natural history of a bacterium. We propose that the Genomic-TSSR can be used as a subjective biomarker to represent the phyletic status of a bacterium.
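The nine-scalar gene representation described in this abstract can be sketched in a few lines of Python. The sketch counts TAA, TAG, and TGA in each of the three reading frames of a gene and then pools the counts over all genes into a nine-value proportion vector. Normalizing the pooled counts so they sum to 1 is an assumption about how the "ratio of distribution" is expressed; the function names `count_tss` and `genomic_tssr` are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List

STOP_TRIMERS = ("TAA", "TAG", "TGA")


def count_tss(gene: str) -> List[int]:
    """Nine counts for one gene: TAA, TAG, TGA in frame 1, then 2, then 3."""
    seq = gene.upper()
    counts: List[int] = []
    for frame in range(3):
        # Non-overlapping trimers read in this frame; incomplete tails are dropped.
        trimers = Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.extend(trimers[t] for t in STOP_TRIMERS)
    return counts


def genomic_tssr(genes: List[str]) -> List[float]:
    """Pooled nine-value vector over all genes, scaled to sum to 1 (assumed)."""
    totals = [0] * 9
    for gene in genes:
        for k, c in enumerate(count_tss(gene)):
            totals[k] += c
    grand = sum(totals)
    return [t / grand for t in totals] if grand else [0.0] * 9


# Toy sequence: TAA in frame 1, TAG and TGA in frame 2, TGA in frame 3.
print(count_tss("ATGATAGCTGAATAA"))
```

Genomes could then be grouped by any distance between their nine-value vectors; the abstract does not state which clustering method was used, so that step is left out.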
Affiliation(s)
- Lijing Xu
- Department of Biological Sciences, Bioinformatics Program, The University of Memphis, Memphis, TN, USA
- Jimmy Kuo
- Department of Planning and Research, National Museum of Marine Biology and Aquarium, Pingtung, Taiwan
- Jong-Kang Liu
- Department of Biological Sciences, National Sun Yat-sen University, Kaohsiung, Taiwan
- Tit-Yee Wong
- Department of Biological Sciences, Bioinformatics Program, The University of Memphis, Memphis, TN, USA
22
MultiLocus Sequence Analysis- and Amplified Fragment Length Polymorphism-based characterization of xanthomonads associated with bacterial spot of tomato and pepper and their relatedness to Xanthomonas species. Syst Appl Microbiol 2012; 35:183-90. [DOI: 10.1016/j.syapm.2011.12.005] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 09/27/2011] [Revised: 12/07/2011] [Accepted: 12/16/2011] [Indexed: 11/21/2022]
23
de Vienne DM, Ollier S, Aguileta G. Phylo-MCOA: a fast and efficient method to detect outlier genes and species in phylogenomics using multiple co-inertia analysis. Mol Biol Evol 2012; 29:1587-98. [PMID: 22319162 DOI: 10.1093/molbev/msr317] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022] Open
Abstract
Full genome data sets are currently being explored on a regular basis to infer phylogenetic trees, but there are often discordances among the trees produced by different genes. An important goal in phylogenomics is to identify which individual gene and species produce the same phylogenetic tree and are thus likely to share the same evolutionary history. On the other hand, it is also essential to identify which genes and species produce discordant topologies and therefore evolve in a different way or represent noise in the data. The latter are outlier genes or species and they can provide a wealth of information on potentially interesting biological processes, such as incomplete lineage sorting, hybridization, and horizontal gene transfers. Here, we propose a new method to explore the genomic tree space and detect outlier genes and species based on multiple co-inertia analysis (MCOA), which efficiently captures and compares the similarities in the phylogenetic topologies produced by individual genes. Our method allows the rapid identification of outlier genes and species by extracting the similarities and discrepancies, in terms of the pairwise distances, between all the species in all the trees, simultaneously. This is achieved by using MCOA, which finds successive decomposition axes from individual ordinations (i.e., derived from distance matrices) that maximize a covariance function. The method is freely available as a set of R functions. The source code and tutorial can be found online at http://phylomcoa.cgenomics.org.
Affiliation(s)
- Damien M de Vienne
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain.
24
Narechania A, Baker RH, Sit R, Kolokotronis SO, DeSalle R, Planet PJ. Random Addition Concatenation Analysis: a novel approach to the exploration of phylogenomic signal reveals strong agreement between core and shell genomic partitions in the cyanobacteria. Genome Biol Evol 2011; 4:30-43. [PMID: 22094860 PMCID: PMC3267395 DOI: 10.1093/gbe/evr121] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Accepted: 11/12/2011] [Indexed: 11/14/2022] Open
Abstract
Recent whole-genome approaches to microbial phylogeny have emphasized partitioning genes into functional classes, often focusing on differences between a stable core of genes and a variable shell. To rigorously address the effects of partitioning and combining genes in genome-level analyses, we developed a novel technique called Random Addition Concatenation Analysis (RADICAL). RADICAL operates by sequentially concatenating randomly chosen gene partitions starting with a single-gene partition and ending with the entire genomic data set. A phylogenetic tree is built for every successive addition, and the entire process is repeated creating multiple random concatenation paths. The result is a library of trees representing a large variety of differently sized random gene partitions. This library can then be mined to identify unique topologies, assess overall agreement, and measure support for different trees. To evaluate RADICAL, we used 682 orthologous genes across 13 cyanobacterial genomes. Despite previous assertions of substantial differences between a core and a shell set of genes for this data set, RADICAL reveals the two partitions contain congruent phylogenetic signal. Substantial disagreement within the data set is limited to a few nodes and genes involved in metabolism, a functional group that is distributed evenly between the core and the shell partitions. We highlight numerous examples where RADICAL reveals aspects of phylogenetic behavior not evident by examining individual gene trees or a "total evidence" tree. Our method also demonstrates that most emergent phylogenetic signal appears early in the concatenation process. The software is freely available at http://desalle.amnh.org.
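The random-addition loop described in this abstract can be sketched as follows. This is a minimal Python illustration, not the RADICAL software: alignments are assumed to be dictionaries mapping taxon names to aligned sequences, and `build_tree` is a hypothetical callable standing in for whatever tree-inference program a user would plug in.

```python
import random
from typing import Callable, Dict, List, Sequence, TypeVar

Alignment = Dict[str, str]  # assumed encoding: taxon name -> aligned sequence
T = TypeVar("T")            # whatever the tree builder returns


def concatenate(parts: Sequence[Alignment]) -> Alignment:
    """Join gene alignments end to end, taxon by taxon."""
    return {taxon: "".join(p[taxon] for p in parts) for taxon in parts[0]}


def radical_paths(
    genes: Sequence[Alignment],
    build_tree: Callable[[Alignment], T],
    n_paths: int,
    seed: int = 0,
) -> List[List[T]]:
    """Return one list of trees per random concatenation path."""
    rng = random.Random(seed)
    library: List[List[T]] = []
    for _ in range(n_paths):
        order = list(genes)
        rng.shuffle(order)  # a fresh random addition order for this path
        # One tree per step: first 1 gene, then 2, ... up to the full data set.
        path = [build_tree(concatenate(order[:k])) for k in range(1, len(order) + 1)]
        library.append(path)
    return library


# Toy run: the stand-in "tree builder" just reports the alignment length.
toy_genes = [{"a": "AC", "b": "AG"}, {"a": "T", "b": "T"}, {"a": "GG", "b": "CC"}]
toy_library = radical_paths(toy_genes, lambda aln: len(aln["a"]), n_paths=4)
print(toy_library)
```

With a real tree builder, the returned library is what would be mined for unique topologies and for the step at which each path settles on its final tree.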
Affiliation(s)
- Apurva Narechania
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Richard H. Baker
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Ryan Sit
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Sergios-Orestis Kolokotronis
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Present address: Department of Biology, Barnard College, Columbia University
- Rob DeSalle
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Paul J. Planet
- Sackler Institute for Comparative Genomics, American Museum of Natural History
- Department of Pediatrics, College of Physicians and Surgeons, Columbia University
25
Toward an efficient method of identifying core genes for evolutionary and functional microbial phylogenies. PLoS One 2011; 6:e24704. [PMID: 21931822 PMCID: PMC3171473 DOI: 10.1371/journal.pone.0024704] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Received: 04/25/2011] [Accepted: 08/16/2011] [Indexed: 02/04/2023] Open
Abstract
Microbial community metagenomes and individual microbial genomes are becoming increasingly accessible by means of high-throughput sequencing. Assessing organismal membership within a community is typically performed using one or a few taxonomic marker genes such as the 16S rDNA, and these same genes are also employed to reconstruct molecular phylogenies. There is thus a growing need to bioinformatically catalog strongly conserved core genes that can serve as effective taxonomic markers, to assess the agreement among phylogenies generated from different core genes, and to characterize the biological functions enriched within core genes and thus conserved throughout large microbial clades. We present a method to recursively identify core genes (i.e. genes ubiquitous within a microbial clade) in high-throughput from a large number of complete input genomes. We analyzed over 1,100 genomes to produce core gene sets spanning 2,861 bacterial and archaeal clades, ranging in size from one to >2,000 genes in inverse correlation with the α-diversity (total phylogenetic branch length) spanned by each clade. These cores are enriched as expected for housekeeping functions including translation, transcription, and replication, in addition to significant representations of regulatory, chaperone, and conserved uncharacterized proteins. In agreement with previous manually curated core gene sets, phylogenies constructed from one or more of these core genes agree with those built using 16S rDNA sequence similarity, suggesting that systematic core gene selection can be used to optimize both comparative genomics and determination of microbial community structure. Finally, we examine functional phylogenies constructed by clustering genomes by the presence or absence of orthologous gene families and show that they provide an informative complement to standard sequence-based molecular phylogenies.
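The idea of a core as the set of genes ubiquitous within a clade can be sketched recursively. This minimal Python illustration assumes strict presence in every genome, a clade tree written as nested tuples of genome names, and a precomputed mapping from genome to gene-family identifiers; all of these, and the function name `clade_cores`, are assumptions for illustration rather than the published method.

```python
from typing import Dict, FrozenSet, Set


def clade_cores(clade, gene_content: Dict[str, Set[str]], out: dict) -> FrozenSet[str]:
    """Record the core gene set of `clade` and of every clade nested inside it.

    `clade` is either a genome name (leaf) or a non-empty tuple of sub-clades.
    """
    if isinstance(clade, str):
        core = frozenset(gene_content[clade])
    else:
        # A gene is core for a clade only if it is core for every child clade.
        core = frozenset.intersection(
            *(clade_cores(child, gene_content, out) for child in clade)
        )
    out[clade] = core
    return core


# Toy data with hypothetical genome and gene-family names.
content = {
    "g1": {"rpoB", "recA", "x"},
    "g2": {"rpoB", "recA", "y"},
    "g3": {"rpoB", "z"},
}
tree = (("g1", "g2"), "g3")
cores: dict = {}
root_core = clade_cores(tree, content, cores)
print(sorted(root_core), sorted(cores[("g1", "g2")]))
```

In the toy data the inner clade keeps two core genes while the broader root clade keeps one, which mirrors the inverse relation between core size and clade diversity reported in the abstract.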
26
Beauregard-Racine J, Bicep C, Schliep K, Lopez P, Lapointe FJ, Bapteste E. Of woods and webs: possible alternatives to the tree of life for studying genomic fluidity in E. coli. Biol Direct 2011; 6:39; discussion 39. [PMID: 21774799 PMCID: PMC3160433 DOI: 10.1186/1745-6150-6-39] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 04/19/2011] [Accepted: 07/20/2011] [Indexed: 12/26/2022] Open
Abstract
Background We introduce several forest-based and network-based methods for exploring microbial evolution, and apply them to the study of thousands of genes from 30 strains of E. coli. This case study illustrates how additional analyses could offer fast heuristic alternatives to standard tree of life (TOL) approaches. Results We use gene networks to identify genes with atypical modes of evolution, and genome networks to characterize the evolution of genetic partnerships between E. coli and mobile genetic elements. We develop a novel polychromatic quartet method to capture patterns of recombination within E. coli, to update the clanistic toolkit, and to search for the impact of lateral gene transfer and of pathogenicity on gene evolution in two large forests of trees bearing E. coli. We unravel high rates of lateral gene transfer involving E. coli (about 40% of the trees under study), and show that both core genes and shell genes of E. coli are affected by non-tree-like evolutionary processes. We show that pathogenic lifestyle impacted the structure of 30% of the gene trees, and that pathogenic strains are more likely to transfer genes with one another than with non-pathogenic strains. In addition, we propose five groups of genes as candidate mobile modules of pathogenicity. We also present strong evidence for recent lateral gene transfer between E. coli and mobile genetic elements. Conclusions Depending on which evolutionary questions biologists want to address (i.e. the identification of modules, genetic partnerships, recombination, lateral gene transfer, or genes with atypical evolutionary modes, etc.), forest-based and network-based methods are preferable to the reconstruction of a single tree, because they provide insights and produce hypotheses about the dynamics of genome evolution, rather than the relative branching order of species and lineages. 
Such a methodological pluralism - the use of woods and webs - is to be encouraged to analyse the evolutionary processes at play in microbial evolution. This manuscript was reviewed by: Ford Doolittle, Tal Pupko, Richard Burian, James McInerney, Didier Raoult, and Yan Boucher
27
Abstract
BACKGROUND Genome sequencing has revolutionized our view of the relationships among genomes, particularly in revealing the confounding effects of lateral genetic transfer (LGT). Phylogenomic techniques have been used to construct purported trees of microbial life. Although such trees are easily interpreted and allow the use of a subset of genomes as "proxies" for the full set, LGT and other phenomena impact the positioning of different groups in genome trees, confounding and potentially invalidating attempts to construct a phylogeny-based taxonomy of microorganisms. Network and graph approaches can reveal complex sets of relationships, but applying these techniques to large data sets is a significant challenge. Notwithstanding the question of what exactly it might represent, generating and interpreting a Tree or Network of All Genomes will only be feasible if current algorithms can be improved upon. RESULTS Complex relationships among even the most-similar genomes demonstrate that proxy-based approaches to simplifying large sets of genomes are not alone sufficient to solve the analysis problem. A phylogenomic analysis of 1173 sequenced bacterial and archaeal genomes generated phylogenetic trees for 159,905 distinct homologous gene sets. The relationships inferred from this set can be heavily dependent on the inclusion of other taxa: for example, phyla such as Spirochaetes, Proteobacteria and Firmicutes are recovered as cohesive groups or split depending on the presence of other specific lineages. Furthermore, named groups such as Acidithiobacillus, Coprothermobacter and Brachyspira show a multitude of affiliations that are more consistent with their ecology than with small subunit ribosomal DNA-based taxonomy. Network and graph representations can illustrate the multitude of conflicting affinities, but all methods impose constraints on the input data and create challenges of construction and interpretation. 
CONCLUSIONS These complex relationships highlight the need for an inclusive approach to genomic data, and current methods with minor alterations will likely scale to allow the analysis of data sets with 10,000 or more genomes. The main challenges lie in the visualization and interpretation of genomic relationships, and the redefinition of microbial taxonomy when subsets of genomic data are so evidently in conflict with one another, and with the "canonical" molecular taxonomy.
Affiliation(s)
- Robert G Beiko
- Faculty of Computer Science, Dalhousie University, Halifax, NS B3H 1W5 Canada.
28
Leigh JW, Lapointe FJ, Lopez P, Bapteste E. Evaluating phylogenetic congruence in the post-genomic era. Genome Biol Evol 2011; 3:571-87. [PMID: 21712432 PMCID: PMC3156567 DOI: 10.1093/gbe/evr050] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Accepted: 05/27/2011] [Indexed: 12/04/2022]
Abstract
Congruence is a broadly applied notion in evolutionary biology used to justify multigene phylogeny or phylogenomics, as well as in studies of coevolution, lateral gene transfer, and as evidence for common descent. Existing methods for identifying incongruence or heterogeneity using character data were designed for data sets that are both small and expected to be rarely incongruent. At the same time, methods that assess incongruence using comparison of trees test a null hypothesis of uncorrelated tree structures, which may be inappropriate for phylogenomic studies. As such, they are ill-suited for the growing number of available genome sequences, most of which are from prokaryotes and viruses, either for phylogenomic analysis or for studies of the evolutionary forces and events that have shaped these genomes. Specifically, many existing methods scale poorly with large numbers of genes, cannot accommodate high levels of incongruence, and do not adequately model patterns of missing taxa for different markers. We propose the development of novel incongruence assessment methods suitable for the analysis of the molecular evolution of the vast majority of life and support the investigation of homogeneity of evolutionary process in cases where markers do not share identical tree structures.
Affiliation(s)
- Jessica W Leigh
- Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand.
29
Campbell V, Legendre P, Lapointe FJ. The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis. BMC Evol Biol 2011; 11:64. [PMID: 21388552 PMCID: PMC3065422 DOI: 10.1186/1471-2148-11-64] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 01/17/2010] [Accepted: 03/09/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. RESULTS Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. CONCLUSIONS Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously.
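As an illustration of the kind of statistic the abstract describes, the sketch below computes Kendall's coefficient of concordance (W) among the ranked off-diagonal entries of several distance matrices. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, no tie correction is applied to W, and the permutation step used to test significance is omitted.

```python
from itertools import combinations


def upper_triangle(matrix):
    """Unfold the upper triangle of a symmetric distance matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def kendall_w(matrices):
    """Kendall's W among ranked pairwise distances (assumption: no tie correction)."""
    ranked = [average_ranks(upper_triangle(m)) for m in matrices]
    m, n = len(ranked), len(ranked[0])
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical 4-taxon distance matrices, for illustration only.
d1 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
d2 = [[0, 6, 5, 4], [6, 0, 3, 2], [5, 3, 0, 1], [4, 2, 1, 0]]
w_same = kendall_w([d1, d1])      # identical rankings
w_opposite = kendall_w([d1, d2])  # exactly reversed rankings
```

W ranges from 0 (no concordance among the ranked distances) to 1 (identical rankings), which is why the null hypothesis here is incongruence rather than congruence.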
Affiliation(s)
- Véronique Campbell
- Département de Sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec, H3C 3J7, Canada
- Pierre Legendre
- Département de Sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec, H3C 3J7, Canada
- François-Joseph Lapointe
- Département de Sciences biologiques, Université de Montréal, C.P. 6128, Succ. Centre-ville, Montréal, Québec, H3C 3J7, Canada
30
Holloway C, Beiko RG. Assembling networks of microbial genomes using linear programming. BMC Evol Biol 2010; 10:360. [PMID: 21092133 PMCID: PMC3224671 DOI: 10.1186/1471-2148-10-360] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/19/2010] [Accepted: 11/20/2010] [Indexed: 01/04/2023]
Abstract
Background Microbial genomes exhibit complex sets of genetic affinities due to lateral genetic transfer. Assessing the relative contributions of parent-to-offspring inheritance and gene sharing is a vital step in understanding the evolutionary origins and modern-day function of an organism, but recovering and showing these relationships is a challenging problem. Results We have developed a new approach that uses linear programming to find between-genome relationships, by treating tables of genetic affinities (here, represented by transformed BLAST e-values) as an optimization problem. Validation trials on simulated data demonstrate the effectiveness of the approach in recovering and representing vertical and lateral relationships among genomes. Application of the technique to a set comprising Aquifex aeolicus and 75 other thermophiles showed an important role for large genomes as 'hubs' in the gene sharing network, and suggested that genes are preferentially shared between organisms with similar optimal growth temperatures. We were also able to discover distinct and common genetic contributors to each sequenced representative of genus Pseudomonas. Conclusions The linear programming approach we have developed can serve as an effective inference tool in its own right, and can be an efficient first step in a more-intensive phylogenomic analysis.
Affiliation(s)
- Catherine Holloway
- Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia B3H 1W5, Canada
31
Genome sequencing reveals widespread virulence gene exchange among human Neisseria species. PLoS One 2010; 5:e11835. [PMID: 20676376 PMCID: PMC2911385 DOI: 10.1371/journal.pone.0011835] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.7] [Received: 05/24/2010] [Accepted: 06/01/2010] [Indexed: 11/19/2022]
Abstract
Commensal bacteria comprise a large part of the microbial world, playing important roles in human development, health and disease. However, little is known about the genomic content of commensals or how related they are to their pathogenic counterparts. The genus Neisseria, containing both commensal and pathogenic species, provides an excellent opportunity to study these issues. We undertook a comprehensive sequencing and analysis of human commensal and pathogenic Neisseria genomes. Commensals have an extensive repertoire of virulence alleles, a large fraction of which has been exchanged among Neisseria species. Commensals also have the genetic capacity to donate DNA to, and take up DNA from, other Neisseria. Our findings strongly suggest that commensal Neisseria serve as reservoirs of virulence alleles, and that they engage extensively in genetic exchange.
32
Tang K, Huang H, Jiao N, Wu CH. Phylogenomic analysis of marine Roseobacters. PLoS One 2010; 5:e11604. [PMID: 20657646 PMCID: PMC2904699 DOI: 10.1371/journal.pone.0011604] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 03/31/2010] [Accepted: 06/20/2010] [Indexed: 11/24/2022]
Abstract
BACKGROUND Members of the Roseobacter clade, which play a key role in the biogeochemical cycles of the ocean, are diverse and abundant, comprising 10-25% of the bacterioplankton in most marine surface waters. The rapid accumulation of whole-genome sequence data for the Roseobacter clade allows us to obtain a clearer picture of its evolution. METHODOLOGY/PRINCIPAL FINDINGS In this study, about 1,200 likely orthologous protein families were identified from 17 Roseobacter genomes. Functional annotations for these genes are provided by iProClass. Phylogenetic trees were constructed for each gene using maximum likelihood (ML) and neighbor joining (NJ). Putative organismal phylogenetic trees were built with phylogenomic methods. These trees were compared and analyzed using principal coordinates analysis (PCoA) and the approximately unbiased (AU) and Shimodaira-Hasegawa (SH) tests. A core set of 694 genes with a vertical descent signal, resistant to horizontal gene transfer (HGT), was used to reconstruct a robust organismal phylogeny. We also identified the 109 most likely HGT genes. The core set contains genes that encode the ribosomal apparatus, ABC transporters and chaperones, which are often found in environmental metagenomic and metatranscriptomic data. The genes in the core set are spread uniformly among the various functional classes and biological processes. CONCLUSIONS/SIGNIFICANCE Here we report a new multigene-derived phylogenetic tree of the Roseobacter clade. Of particular interest is the HGT of eleven genes involved in vitamin B12 synthesis as well as key enzymes for dimethylsulfoniopropionate (DMSP) degradation. These acquired genes are essential for the growth of Roseobacters and their eukaryotic partners.
Affiliation(s)
- Kai Tang
- State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen, China
- Hongzhan Huang
- Protein Information Resource (PIR), Georgetown University, Washington, D.C., United States of America
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, United States of America
- Nianzhi Jiao
- State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen, China
- Cathy H. Wu
- Protein Information Resource (PIR), Georgetown University, Washington, D.C., United States of America
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, United States of America
33
Blank CE. Not so old Archaea - the antiquity of biogeochemical processes in the archaeal domain of life. Geobiology 2009; 7:495-514. [PMID: 19843187 DOI: 10.1111/j.1472-4669.2009.00219.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Indexed: 05/18/2023]
Abstract
Since the archaeal domain of life was first recognized, it has often been assumed that Archaea are ancient, and harbor primitive traits. In fact, the names of the major archaeal lineages reflect our assumptions regarding the antiquity of their traits. Ancestral state reconstruction and relaxed molecular clock analyses using newly articulated oxygen age constraints show that although the archaeal domain itself is old, tracing back to the Archean eon, many clades and traits within the domain are not ancient or primitive. Indeed many clades and traits, particularly in the Euryarchaeota, were inferred to be Neoproterozoic or Phanerozoic in age. Both Eury- and Crenarchaeota show increasing metabolic and physiological diversity through time. Early archaeal microbial communities were likely limited to sulfur reduction and hydrogenotrophic methanogenesis, and were confined to high-temperature geothermal environments. However, after the appearance of atmospheric oxygen, nodes containing a wide variety of traits (sulfate and thiosulfate reduction, sulfur oxidation, sulfide oxidation, aerobic respiration, nitrate reduction, mesophilic methanogenesis in sedimentary environments) appear, first in environments containing terrestrial Crenarchaeota in the Meso/Neoproterozoic followed by environments containing marine Euryarchaeota in the Neoproterozoic and Phanerozoic. This provides phylogenetic evidence for increasing complexity in the biogeochemical cycling of C, N, and S through geologic time, likely as a consequence of microbial evolution and the gradual oxygenation of various compartments within the biosphere. This work has implications not only for the large-scale evolution of microbial communities and biogeochemical processes, but also for the interpretation of microbial biosignatures in the ancient rock record.
Affiliation(s)
- Carrine E Blank
- Department of Geosciences, University of Montana, Missoula, MT, USA.
34
Doolittle WF. The practice of classification and the theory of evolution, and what the demise of Charles Darwin's tree of life hypothesis means for both of them. Philos Trans R Soc Lond B Biol Sci 2009; 364:2221-8. [PMID: 19571242 DOI: 10.1098/rstb.2009.0032] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 11/12/2022]
Abstract
Debates over the status of the tree of life (TOL) often proceed without agreement as to what it is supposed to be: a hierarchical classification scheme, a tracing of genomic and organismal history or a hypothesis about evolutionary processes and the patterns they can generate. I will argue that for Darwin it was a hypothesis, which lateral gene transfer in prokaryotes now shows to be false. I will propose a more general and relaxed evolutionary theory and point out why anti-evolutionists should take no comfort from disproof of the TOL hypothesis.
Affiliation(s)
- W Ford Doolittle
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.
35
Bapteste E, O'Malley MA, Beiko RG, Ereshefsky M, Gogarten JP, Franklin-Hall L, Lapointe FJ, Dupré J, Dagan T, Boucher Y, Martin W. Prokaryotic evolution and the tree of life are two different things. Biol Direct 2009; 4:34. [PMID: 19788731 PMCID: PMC2761302 DOI: 10.1186/1745-6150-4-34] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Received: 07/16/2009] [Accepted: 09/29/2009] [Indexed: 01/04/2023]
Abstract
BACKGROUND The concept of a tree of life is prevalent in the evolutionary literature. It stems from attempting to obtain a grand unified natural system that reflects a recurrent process of species and lineage splittings for all forms of life. Traditionally, the discipline of systematics operates in a similar hierarchy of bifurcating (sometimes multifurcating) categories. The assumption of a universal tree of life hinges upon the process of evolution being tree-like throughout all forms of life and all of biological time. In multicellular eukaryotes, the molecular mechanisms and species-level population genetics of variation do indeed mainly cause a tree-like structure over time. In prokaryotes, they do not. Prokaryotic evolution and the tree of life are two different things, and we need to treat them as such, rather than extrapolating from macroscopic life to prokaryotes. In the following we will consider this circumstance from philosophical, scientific, and epistemological perspectives, surmising that phylogeny opted for a single model as a holdover from the Modern Synthesis of evolution. RESULTS It was far easier to envision and defend the concept of a universal tree of life before we had data from genomes. But the belief that prokaryotes are related by such a tree has now become stronger than the data to support it. The monistic concept of a single universal tree of life appears, in the face of genome data, increasingly obsolete. This traditional model to describe evolution is no longer the most scientifically productive position to hold, because of the plurality of evolutionary patterns and mechanisms involved. Forcing a single bifurcating scheme onto prokaryotic evolution disregards the non-tree-like nature of natural variation among prokaryotes and accounts for only a minority of observations from genomes. CONCLUSION Prokaryotic evolution and the tree of life are two different things. 
Hence we will briefly set out alternative models to the tree of life to study their evolution. Ultimately, the plurality of evolutionary patterns and mechanisms involved, such as the discontinuity of the process of evolution across the prokaryote-eukaryote divide, summons forth a pluralistic approach to studying evolution. REVIEWERS This article was reviewed by Ford Doolittle, John Logsdon and Nicolas Galtier.
36
Galtier N, Daubin V. Dealing with incongruence in phylogenomic analyses. Philos Trans R Soc Lond B Biol Sci 2009; 363:4023-9. [PMID: 18852109 DOI: 10.1098/rstb.2008.0144] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/12/2022]
Abstract
Incongruence between gene trees is the main challenge faced by phylogeneticists in the genomic era. Incongruence can occur for artefactual reasons, when we fail to recover the correct gene trees, or for biological reasons, when true gene trees are actually distinct from each other, and from the species tree. Horizontal gene transfers (HGTs) between genomes are an important process of bacterial evolution, resulting in a substantial number of phylogenetic conflicts between gene trees. We argue that the (bacterial) species tree is still a meaningful scientific concept even in the case of HGTs, and that reconstructing it is still a valid goal. We tentatively assess the amount of phylogenetic incongruence caused by HGTs in bacteria by comparing bacterial datasets to a metazoan dataset in which transfers are presumably very scarce or absent. We review existing phylogenomic methods and their ability to return to the user both the vertical (speciation/extinction history) and horizontal (gene transfers) phylogenetic signals.
Affiliation(s)
- Nicolas Galtier
- CNRS UMR 5554, Institut des Sciences de l'Evolution, Université Montpellier 2, CC64, Place E. Bataillon, 34095 Montpellier, France.
37
Merkl R, Wiezer A. GO4genome: a prokaryotic phylogeny based on genome organization. J Mol Evol 2009; 68:550-62. [PMID: 19436929 PMCID: PMC3085772 DOI: 10.1007/s00239-009-9233-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/27/2008] [Revised: 03/10/2009] [Accepted: 04/03/2009] [Indexed: 11/24/2022]
Abstract
Determining the phylogeny of closely related prokaryotes may fail in an analysis of rRNA or a small set of sequences. Whole-genome phylogeny utilizes the maximally available sample space. For a precise determination of genome similarity, two aspects have to be considered when developing an algorithm of whole-genome phylogeny: (1) gene order conservation is a more precise signal than gene content; and (2) when using sequence similarity, failures in identifying orthologues or the in situ replacement of genes via horizontal gene transfer may give misleading results. GO4genome is a new paradigm, which is based on a detailed analysis of gene function and the location of the respective genes. For characterization of genes, the algorithm uses gene ontology enabling a comparison of function independent of evolutionary relationship. After the identification of locally optimal series of gene functions, their length distribution is utilized to compute a phylogenetic distance. The outcome is a classification of genomes based on metabolic capabilities and their organization. Thus, the impact of effects on genome organization that are not covered by methods of molecular phylogeny can be studied. Genomes of strains belonging to Escherichia coli, Shigella, Streptococcus, Methanosarcina, and Yersinia were analyzed. Differences from the findings of classical methods are discussed.
Affiliation(s)
- Rainer Merkl
- Institut für Biophysik und Physikalische Biochemie, Universität Regensburg, 93040, Regensburg, Germany.
38
Blank CE. Phylogenomic dating--a method of constraining the age of microbial taxa that lack a conventional fossil record. Astrobiology 2009; 9:173-191. [PMID: 19371160 DOI: 10.1089/ast.2008.0247] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/27/2023]
Abstract
A phylogenomic dating approach was used to identify potential age constraints for multiple archaeal groups, many of which have no fossil, isotopic, or biomarker record. First, well-resolved phylogenetic trees were inferred with the use of multiple gene sequences obtained from whole genome sequences. Next, the ability to use oxygen as a terminal electron acceptor was coded into characters, and ancestral state reconstruction was used to identify clades with taxa that metabolize oxygen and likely had an aerobic ancestor. Next, the habitat of the ancestor was inferred. If the local presence of Cyanobacteria could be excluded from the putative ancestral habitat, then these clades would have originated after the rise in atmospheric oxygen 2.32 Ga. With this method, an upper age of 2.32 Ga (an "oxygen age constraint") is proposed for four major archaeal clades: the Sulfolobales, Thermoplasmatales, Thermoproteus neutrophilus/Pyrobaculum spp., and the Thermoproteales. It was also shown that the halophilic archaea likely had an aerobic common ancestor, yet the possibility of local oxygen oases before oxygenation of the atmosphere could not be formally rejected. Thus, an oxygen age constraint was not assessed for this group. This work suggests that many archaeal groups are not as ancient as many in the research community have previously assumed, and it provides a new method for establishing upper age constraints for major microbial groups that lack a conventional fossil record.
Affiliation(s)
- Carrine E Blank
- Department of Geosciences, University of Montana, Missoula, Montana 59808-1296, USA.
39
Lang P, Lefébure T, Wang W, Zadoks RN, Schukken Y, Stanhope MJ. Gene content differences across strains of Streptococcus uberis identified using oligonucleotide microarray comparative genomic hybridization. Infection, Genetics and Evolution 2008; 9:179-88. [PMID: 19056519 DOI: 10.1016/j.meegid.2008.10.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 05/29/2008] [Revised: 10/16/2008] [Accepted: 10/26/2008] [Indexed: 11/29/2022]
Abstract
Streptococcus uberis is one of the principal causative agents of bovine mastitis. The organism is typically considered an environmental pathogen. In this study, two multilocus sequence typing (MLST) schemes and whole genome DNA microarrays were used to evaluate the degree and nature of genome flexibility between S. uberis strains. The 21 isolates examined in this study arise from a collection of 232 international isolates for which previous epidemiological and preliminary genotyping data existed. The microarray analysis resulted in an estimate of the core genome for S. uberis, consisting of 1530 ORFs, among 1855 tested, representing 82.5% of the S. uberis 0140J genome. The remaining ORFs were variable in gene content across the 21 tested strains. A total of 26 regions of difference (RDs), consisting of three or more contiguous ORFs, were identified among the variable genes. Core genes mainly encoded housekeeping functions, while the variable genes primarily fell within categories such as protection responses, degradation of small molecules, laterally acquired elements, and two component systems. Recombination detection procedures involving the MLST loci suggested S. uberis is a highly recombinant species, precluding accurate phylogenetic reconstructions involving these data. On the other hand, the microarray data did provide limited support for an association of gene content with strains found in multiple cows and/or multiple herds, suggesting the possibility of genes related to bovine transmissibility or host-adaptation.
Affiliation(s)
- Ping Lang
- Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
40
Castillo-Ramírez S, González V. Factors affecting the concordance between orthologous gene trees and species tree in bacteria. BMC Evol Biol 2008; 8:300. [PMID: 18973688 PMCID: PMC2614993 DOI: 10.1186/1471-2148-8-300] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 07/01/2008] [Accepted: 10/30/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND As originally defined, orthologous genes implied a reflection of the history of the species. In recent years, many studies have examined the concordance between orthologous gene trees and species trees in bacteria. These studies have produced contradictory results that may have been influenced by orthologous gene misidentification and artefactual phylogenetic reconstructions. Here, using a method that allows the detection and exclusion of false positives during identification of orthologous genes, we address the question of whether putative orthologous genes within bacteria really reflect the history of the species. RESULTS We identified a set of 370 orthologous genes from the bacterial order Rhizobiales. Although manifesting strong vertical signal, almost every orthologous gene had a distinct phylogeny, and the most common topology among the orthologous gene trees did not correspond with the best estimate of the species tree. However, each orthologous gene tree shared an average of 70% of its bipartitions with the best estimate of the species tree. Stochastic error related to gene size affected the concordance between the best estimate of the species tree and the orthologous gene trees, although this effect was weak and distributed unevenly among the functional categories. The nodes showing the greatest discordance were those defined by the shortest internal branches in the best estimate of the species tree. Moreover, a clear bias was evident with respect to the function of the orthologous genes, and the degree of divergence among the orthologous genes appeared to be related to their functional classification. CONCLUSION Orthologous genes do not reflect the history of the species when taken as individual markers, but they do when taken as a whole. Stochastic error affected the concordance of orthologous genes with the species tree, albeit weakly.
We conclude that two important biological causes of discordance among orthologous genes are incomplete lineage sorting and functional restriction.
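The "shared bipartitions" measure reported in this abstract can be illustrated with a small sketch. It assumes trees are given as nested tuples of leaf names (a hypothetical representation chosen for brevity), extracts the non-trivial bipartitions of each tree, and reports the fraction of a gene tree's bipartitions that also occur in the species tree. The function names and example trees are assumptions, not the authors' code.

```python
def bipartitions(tree):
    """Return the non-trivial bipartitions of a tree given as nested tuples.

    Each bipartition is stored as the frozenset of leaves on the side that
    does not contain the alphabetically smallest leaf, so that the same split
    is represented identically regardless of rooting.
    """
    def leaf_set(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaf_set(child) for child in node))

    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        return clade

    visit(tree)
    return splits


def shared_fraction(gene_tree, species_tree):
    """Fraction of the gene tree's bipartitions also present in the species tree."""
    gene = bipartitions(gene_tree)
    return len(gene & bipartitions(species_tree)) / len(gene)


# Hypothetical five-taxon trees, for illustration only.
species = ((("A", "B"), "C"), ("D", "E"))
gene = ((("A", "C"), "B"), ("D", "E"))
fraction_shared = shared_fraction(gene, species)
```

Here the gene tree agrees with the species tree on the split separating D and E from the rest but not on the grouping of A with B, so half of its bipartitions are shared.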
Affiliation(s)
- Santiago Castillo-Ramírez
- Programa de Genómica Evolutiva, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Apartado Postal 565-A, CP 62210, Cuernavaca, Morelos, México.
41
Esser C, Martin W, Dagan T. The origin of mitochondria in light of a fluid prokaryotic chromosome model. Biol Lett 2008; 3:180-4. [PMID: 17251118 PMCID: PMC2375920 DOI: 10.1098/rsbl.2006.0582] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Indexed: 11/12/2022]
Abstract
Biologists agree that the ancestor of mitochondria was an alpha-proteobacterium. But there is no consensus as to what constitutes an alpha-proteobacterial gene. Is it a gene found in all or several alpha-proteobacteria, or in only one? Here, we examine the proportion of alpha-proteobacterial genes in alpha-proteobacterial genomes by means of sequence comparisons. We find that each alpha-proteobacterium harbours a particular collection of genes and that, depending upon the lineage examined, between 97 and 33% are alpha-proteobacterial by the nearest-neighbour criterion. Our findings bear upon attempts to reconstruct the mitochondrial ancestor and upon inferences concerning the collection of genes that the mitochondrial ancestor possessed at the time that it became an endosymbiont.
42
Modular networks and cumulative impact of lateral transfer in prokaryote genome evolution. Proc Natl Acad Sci U S A 2008; 105:10039-44. [PMID: 18632554 DOI: 10.1073/pnas.0800679105] [Citation(s) in RCA: 249] [Impact Index Per Article: 15.6] [Indexed: 11/18/2022]
Abstract
Lateral gene transfer is an important mechanism of natural variation among prokaryotes, but the significance of its quantitative contribution to genome evolution is debated. Here, we report networks that capture both vertical and lateral components of evolutionary history among 539,723 genes distributed across 181 sequenced prokaryotic genomes. Partitioning of these networks by an eigenspectrum analysis identifies community structure in prokaryotic gene-sharing networks, the modules of which do not correspond to a strictly hierarchical prokaryotic classification. Our results indicate that, on average, at least 81 ± 15% of the genes in each genome studied were involved in lateral gene transfer at some point in their history, even though they can be vertically inherited after acquisition, uncovering a substantial cumulative effect of lateral gene transfer on longer evolutionary time scales.
43
Shapiro BJ, Alm EJ. Comparing patterns of natural selection across species using selective signatures. PLoS Genet 2008; 4:e23. [PMID: 18266472 PMCID: PMC2233676 DOI: 10.1371/journal.pgen.0040023] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 08/23/2007] [Accepted: 12/18/2007] [Indexed: 12/04/2022]
Abstract
Comparing gene expression profiles over many different conditions has led to insights that were not obvious from single experiments. In the same way, comparing patterns of natural selection across a set of ecologically distinct species may extend what can be learned from individual genome-wide surveys. Toward this end, we show how variation in protein evolutionary rates, after correcting for genome-wide effects such as mutation rate and demographic factors, can be used to estimate the level and types of natural selection acting on genes across different species. We identify unusually rapidly and slowly evolving genes, relative to empirically derived genome-wide and gene family-specific background rates for 744 core protein families in 30 γ-proteobacterial species. We describe the pattern of fast or slow evolution across species as the “selective signature” of a gene. Selective signatures represent a profile of selection across species that is predictive of gene function: pairs of genes with correlated selective signatures are more likely to share the same cellular function, and genes in the same pathway can evolve in concert. For example, glycolysis and phenylalanine metabolism genes evolve rapidly in Idiomarina loihiensis, mirroring an ecological shift in carbon source from sugars to amino acids. In a broader context, our results suggest that the genomic landscape is organized into functional modules even at the level of natural selection, and thus it may be easier than expected to understand the complex evolutionary pressures on a cell.
Natural selection promotes the survival of the fittest individuals within a species. Over many generations, this may result in the maintenance of ancestral traits (conservation through purifying selection), or the emergence of newly beneficial traits (adaptation through positive selection). At the genetic level, long-term purifying or positive selection can cause genes to evolve more slowly, or more rapidly, providing a way to identify these evolutionary forces. While some genes are subject to consistent purifying or positive selection in most species, other genes show unexpected levels of selection in a particular species or group of species, a pattern we refer to as the “selective signature” of the gene. In this work, we demonstrate that these patterns of natural selection can be mined for information about gene function and species ecology. In the future, this method could be applied to any set of related species with fully sequenced genomes to better understand the genetic basis of ecological divergence.
Affiliation(s)
- B. Jesse Shapiro
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Eric J Alm
- Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- The Virtual institute of Microbial Stress and Survival, Berkeley, California, United States of America
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- * To whom correspondence should be addressed. E-mail:
|
44
|
Bapteste E, Boucher Y. Lateral gene transfer challenges principles of microbial systematics. Trends Microbiol 2008; 16:200-7. [PMID: 18420414 DOI: 10.1016/j.tim.2008.02.005] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 11/05/2007] [Revised: 02/13/2008] [Accepted: 02/15/2008] [Indexed: 10/22/2022]
Abstract
Evolutionists strive to learn about the natural historical process that gave rise to various taxa, while also attempting to classify them efficiently and make generalizations about them. The quantitative importance of lateral gene transfer inferred from genomic data, although well acknowledged by microbiologists, is in conflict with the conceptual foundations of the traditional phylogenetic system erected to achieve these goals. To provide a true account of microbial evolution, we suggest developing an alternative conception of natural groups and introduce a new notion--the composite evolutionary unit. Furthermore, we argue that a comprehensive database containing overlapping taxonomical groups would constitute a step forward regarding the classification of microbes in the presence of lateral gene transfer.
|
45
|
Leigh JW, Susko E, Baumgartner M, Roger AJ. Testing Congruence in Phylogenomic Analysis. Syst Biol 2008; 57:104-15. [DOI: 10.1080/10635150801910436] [Citation(s) in RCA: 131] [Impact Index Per Article: 8.2] [Indexed: 10/22/2022] Open
Affiliation(s)
- Jessica W. Leigh
- Department of Biochemistry and Molecular Biology, Dalhousie University Halifax NS, Canada B3H 1X5; E-mail: (A.J.R.)
- Edward Susko
- Department of Mathematics and Statistics and Genome Atlantic, Dalhousie University Halifax NS, Canada B3H 3J5
- Manuela Baumgartner
- Department für Biologie I, Botanik, Ludwig-Maximilians-Universität München Menzingerstraße 67, D-80638 München, Germany
- Andrew J. Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University Halifax NS, Canada B3H 1X5; E-mail: (A.J.R.)
|
46
|
Abstract
How much horizontal gene transfer (HGT) between species influences bacterial phylogenomics is a controversial issue. This debate, however, lacks any quantitative assessment of the impact of HGT on phylogenies and of the ability of tree-building methods to cope with such events. I introduce a Markov model of genome evolution with HGT, accounting for the constraints on time -- an HGT event can only occur between concomitantly living species. This model is used to simulate multigene sequence data sets with or without HGT. The consequences of HGT on phylogenomic inference are analyzed and compared to other well-known phylogenetic artefacts. It is found that supertree methods are quite robust to HGT, keeping high levels of performance even when gene trees are largely incongruent with each other. Gene tree incongruence per se is not indicative of HGT. HGT, however, removes the (otherwise observed) positive relationship between sequence length and gene tree congruence to the estimated species tree. Surprisingly, when applied to a bacterial and a eukaryotic multigene data set, this criterion rejects the HGT hypothesis for the former, but not the latter data set.
Affiliation(s)
- Nicolas Galtier
- Institut des Sciences de l'Evolution (UM2-CNRS), Université Montpellier 2, Montpellier, France.
|
47
|
Doolittle WF, Bapteste E. Pattern pluralism and the Tree of Life hypothesis. Proc Natl Acad Sci U S A 2007; 104:2043-9. [PMID: 17261804 PMCID: PMC1892968 DOI: 10.1073/pnas.0610699104] [Citation(s) in RCA: 366] [Impact Index Per Article: 21.5] [Received: 11/11/2006] [Indexed: 11/18/2022] Open
Abstract
Darwin claimed that a unique inclusively hierarchical pattern of relationships between all organisms based on their similarities and differences [the Tree of Life (TOL)] was a fact of nature, for which evolution, and in particular a branching process of descent with modification, was the explanation. However, there is no independent evidence that the natural order is an inclusive hierarchy, and incorporation of prokaryotes into the TOL is especially problematic. The only data sets from which we might construct a universal hierarchy including prokaryotes, the sequences of genes, often disagree and can seldom be proven to agree. Hierarchical structure can always be imposed on or extracted from such data sets by algorithms designed to do so, but at its base the universal TOL rests on an unproven assumption about pattern that, given what we know about process, is unlikely to be broadly true. This is not to say that similarities and differences between organisms are not to be accounted for by evolutionary mechanisms, but descent with modification is only one of these mechanisms, and a single tree-like pattern is not the necessary (or expected) result of their collective operation. Pattern pluralism (the recognition that different evolutionary models and representations of relationships will be appropriate, and true, for different taxa or at different scales or for different purposes) is an attractive alternative to the quixotic pursuit of a single true TOL.
Affiliation(s)
- W Ford Doolittle
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada B3H 1X5.
|
48
|
Comas I, Moya A, González-Candelas F. Phylogenetic signal and functional categories in Proteobacteria genomes. BMC Evol Biol 2007; 7 Suppl 1:S7. [PMID: 17288580 PMCID: PMC1796616 DOI: 10.1186/1471-2148-7-s1-s7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: A comprehensive evolutionary analysis of bacterial genomes implies to identify the hallmark of vertical and non-vertical signals and to discriminate them from the presence of mere phylogenetic noise. In this report we have addressed the impact of factors like the universal distribution of the genes, their essentiality or their functional role in the cell on the inference of vertical signal through phylogenomic methods.
RESULTS: We have established that supermatrices derived from data sets composed mainly by genes suspected to be essential for bacterial cellular life perform better on the recovery of vertical signal than those composed by widely distributed genes. In addition, we show that the "Transcription" category of genes seems to harbor a better vertical signal than other functional categories. Moreover, the "Poorly characterized" category performs better than other categories related with metabolism or cellular processes.
CONCLUSION: From these results we conclude that different data sets allow addressing different questions in phylogenomic analyses. The vertical signal seems to be more present in essential genes although these also include a significant degree of incongruence. From a functional perspective, as expected, informational genes perform better than operational ones but we have also shown the surprising behavior of poorly annotated genes, which points to their importance in the genome evolution of bacteria.
Affiliation(s)
- Iñaki Comas
- Instituto Cavanilles de Biodiversidad y Biología Evolutiva. Universidad de Valencia. Apartado Oficial 22085, Valencia E-46071, Spain
- Andrés Moya
- Instituto Cavanilles de Biodiversidad y Biología Evolutiva. Universidad de Valencia. Apartado Oficial 22085, Valencia E-46071, Spain
- Fernando González-Candelas
- Instituto Cavanilles de Biodiversidad y Biología Evolutiva. Universidad de Valencia. Apartado Oficial 22085, Valencia E-46071, Spain
|
49
|
Comas I, Moya A, González-Candelas F. From phylogenetics to phylogenomics: the evolutionary relationships of insect endosymbiotic gamma-Proteobacteria as a test case. Syst Biol 2007; 56:1-16. [PMID: 17366133 DOI: 10.1080/10635150601109759] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023] Open
Abstract
The increasing availability of complete genome sequences and the development of new, faster methods for phylogenetic reconstruction allow the exploration of the set of evolutionary trees for each gene in the genome of any species. This has led to the development of new phylogenomic methods. Here, we have compared different phylogenetic and phylogenomic methods in the analysis of the monophyletic origin of insect endosymbionts from the gamma-Proteobacteria, a hotly debated issue with several recent, conflicting reports. We have obtained the phylogenetic tree for each of the 579 identified protein-coding genes in the genome of the primary endosymbiont of carpenter ants, Blochmannia floridanus, after determining their presumed orthologs in 20 additional Proteobacteria genomes. A reference phylogeny reflecting the monophyletic origin of insect endosymbionts was further confirmed with different approaches, which led us to consider it as the presumed species tree. Remarkably, only 43 individual genes produced exactly the same topology as this presumed species tree. Most discrepancies between this tree and those obtained from individual genes or by concatenation of different genes were due to the grouping of Xanthomonadales with beta-Proteobacteria and not to uncertainties over the monophyly of insect endosymbionts. As previously noted, operational genes were more prone to reject the presumed species tree than those included in information-processing categories, but caution should be exerted when selecting genes for phylogenetic inference on the basis of their functional category assignment. We have obtained strong evidence in support of the monophyletic origin of gamma-Proteobacteria insect endosymbionts by a combination of phylogenetic and phylogenomic methods. In our analysis, the use of concatenated genes has shown to be a valuable tool for analyzing primary phylogenetic signals coded in the genomes. 
Nevertheless, other phylogenomic methods such as supertree approaches were useful in revealing alternative phylogenetic signals and should be included in comprehensive phylogenomic studies.
Affiliation(s)
- Iñaki Comas
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Valencia, Spain.
|
50
|
Dagan T, Martin W. Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution. Proc Natl Acad Sci U S A 2007; 104:870-5. [PMID: 17213324 PMCID: PMC1783406 DOI: 10.1073/pnas.0606318104] [Citation(s) in RCA: 143] [Impact Index Per Article: 8.4] [Indexed: 11/18/2022] Open
Abstract
The amount of lateral gene transfer (LGT) that has occurred in microbial evolution is heavily debated. Efforts to quantify LGT through gene-tree comparisons have delivered estimates that between 2% and 60% of all prokaryotic genes have been affected by LGT, the 30-fold discrepancy reflecting differences among gene samples studied and uncertainties inherent in phylogenetic reconstruction. Here we present a simple method that is independent of gene-tree comparisons to estimate the LGT rate among sequenced prokaryotic genomes. If little or no LGT has occurred during evolution, ancestral genome sizes would become unrealistically large, whereas too much LGT would render them far too small. We determine the amount of LGT that is necessary and sufficient to bring the distribution of inferred ancestral genome sizes into agreement with that observed among modern microbes. Rather than testing for phylogenetic congruence or lack thereof across genes, we assume that all gene trees are compatible; hence, our method delivers very conservative lower-bound estimates of the average LGT rate. The results indicate that among 57,670 gene families distributed across 190 sequenced genomes, at least two-thirds and probably all, have been affected by LGT at some time in their evolutionary past. A component of common ancestry nonetheless remains detectable in gene distribution patterns. We estimate the minimum lower bound for the average LGT rate across all genes as 1.1 LGT events per gene family and gene family lifespan and this minimum rate increases sharply when genes present in only a few genomes are excluded from the analysis.
Affiliation(s)
- Tal Dagan
- Institut für Botanik III, Heinrich-Heine Universität, Universitätsstrasse 1, 40225 Düsseldorf, Germany.
|