1
Liu Y, Brinkhoff T, Berger M, Poehlein A, Voget S, Paoli L, Sunagawa S, Amann R, Simon M. Metagenome-assembled genomes reveal greatly expanded taxonomic and functional diversification of the abundant marine Roseobacter RCA cluster. MICROBIOME 2023; 11:265. [PMID: 38007474 PMCID: PMC10675870 DOI: 10.1186/s40168-023-01644-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 08/07/2023] [Indexed: 11/27/2023]
Abstract
BACKGROUND The RCA (Roseobacter clade affiliated) cluster belongs to the family Roseobacteraceae and represents a major Roseobacter lineage in temperate to polar oceans. Despite its prevalence and abundance, only a few genomes and one described species, Planktomarina temperata, exist. To gain more insights into our limited understanding of this cluster and its taxonomic and functional diversity and biogeography, we screened metagenomic datasets from the global oceans and reconstructed metagenome-assembled genomes (MAGs) affiliated to this cluster. RESULTS A total of 82 MAGs, plus five genomes of isolates, reveal an unexpected diversity and novel insights into the genomic features, the functional diversity, and greatly refined biogeographic patterns of the RCA cluster. This cluster is subdivided into three genera: Planktomarina, Pseudoplanktomarina, and the most deeply branching Candidatus Paraplanktomarina. Six of the eight Planktomarina species have larger genome sizes (2.44-3.12 Mbp) and higher G + C contents (46.36-53.70%) than the four Pseudoplanktomarina species (2.26-2.72 Mbp, 42.22-43.72% G + C). Cand. Paraplanktomarina is represented only by one species with a genome size of 2.40 Mbp and a G + C content of 45.85%. Three novel species of the genera Planktomarina and Pseudoplanktomarina are validly described according to the SeqCode nomenclature for prokaryotic genomes. Aerobic anoxygenic photosynthesis (AAP) is encoded in three Planktomarina species. Unexpectedly, proteorhodopsin (PR) is encoded in the other Planktomarina and all Pseudoplanktomarina species, suggesting that this light-driven proton pump is the most important mode of acquiring complementary energy of the RCA cluster. The Pseudoplanktomarina species exhibit differences in functional traits compared to Planktomarina species and adaptations to more resource-limited conditions. An assessment of the global biogeography of the different species greatly expands the range of occurrence and shows that the different species exhibit distinct biogeographic patterns. They partially reflect the genomic features of the species. CONCLUSIONS Our detailed MAG-based analyses shed new light on the diversification, environmental adaptation, and global biogeography of a major lineage of pelagic bacteria. The taxonomic delineation and validation by the SeqCode nomenclature of prominent genera and species of the RCA cluster may be a promising way for a refined taxonomic identification of major prokaryotic lineages and sublineages in marine and other prokaryotic communities assessed by metagenomics approaches.
Affiliation(s)
- Yanting Liu
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Carl Von Ossietzky Str. 9-11, 26129, Oldenburg, Germany
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - State Key Laboratory for Marine Environmental Science, Institute of Marine Microbes and Ecospheres, Xiamen University, Xiamen, People's Republic of China
- Thorsten Brinkhoff
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Carl Von Ossietzky Str. 9-11, 26129, Oldenburg, Germany
- Martine Berger
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Carl Von Ossietzky Str. 9-11, 26129, Oldenburg, Germany
- Anja Poehlein
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Georg-August University Göttingen, Grisebachstr. 8, 37077, Göttingen, Germany
- Sonja Voget
  - Department of Genomic and Applied Microbiology & Göttingen Genomics Laboratory, Georg-August University Göttingen, Grisebachstr. 8, 37077, Göttingen, Germany
- Lucas Paoli
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zurich, Switzerland
- Shinichi Sunagawa
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zurich, Switzerland
- Rudolf Amann
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Meinhard Simon
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Carl Von Ossietzky Str. 9-11, 26129, Oldenburg, Germany
  - Helmholtz Institute for Functional Marine Biodiversity at the University of Oldenburg (HIFMB), Ammerländer Heerstr. 231, 26129, Oldenburg, Germany
2
Halter T, Köstlbacher S, Collingro A, Sixt BS, Tönshoff ER, Hendrickx F, Kostanjšek R, Horn M. Ecology and evolution of chlamydial symbionts of arthropods. ISME COMMUNICATIONS 2022; 2:45. [PMID: 37938728 PMCID: PMC9723776 DOI: 10.1038/s43705-022-00124-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/27/2021] [Revised: 03/31/2022] [Accepted: 04/08/2022] [Indexed: 05/08/2023]
Abstract
The phylum Chlamydiae consists of obligate intracellular bacteria including major human pathogens and diverse environmental representatives. Here we investigated the Rhabdochlamydiaceae, which is predicted to be the largest and most diverse chlamydial family, with the few described members known to infect arthropod hosts. Using published 16S rRNA gene sequence data we identified at least 388 genus-level lineages containing about 14,051 putative species within this family. We show that rhabdochlamydiae are mainly found in freshwater and soil environments, suggesting the existence of diverse, yet unknown hosts. Next, we used a comprehensive genome dataset including metagenome-assembled genomes classified as members of the family Rhabdochlamydiaceae, and we added novel complete genome sequences of Rhabdochlamydia porcellionis infecting the woodlouse Porcellio scaber, and of 'Candidatus R. oedothoracis' associated with the linyphiid dwarf spider Oedothorax gibbosus. Comparative analysis of basic genome features and gene content with reference genomes of well-studied chlamydial families with known host ranges, namely Parachlamydiaceae (protist hosts) and Chlamydiaceae (human and other vertebrate hosts) suggested distinct niches for members of the Rhabdochlamydiaceae. We propose that members of the family represent intermediate stages of adaptation of chlamydiae from protists to vertebrate hosts. Within the genus Rhabdochlamydia, pronounced genome size reduction could be observed (1.49-1.93 Mb). The abundance and genomic distribution of transposases suggests transposable element expansion and subsequent gene inactivation as a mechanism of genome streamlining during adaptation to new hosts. This type of genome reduction has never been described before for any member of the phylum Chlamydiae. This study provides new insights into the molecular ecology, genomic diversity, and evolution of representatives of one of the most divergent chlamydial families.
Affiliation(s)
- Tamara Halter
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria
  - Doctoral School in Microbiology and Environmental Science, University of Vienna, Vienna, Austria
- Stephan Köstlbacher
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria
  - Doctoral School in Microbiology and Environmental Science, University of Vienna, Vienna, Austria
- Astrid Collingro
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria
- Barbara S Sixt
  - The Laboratory for Molecular Infection Medicine Sweden (MIMS), Umeå Centre for Microbial Research (UCMR), Department of Molecular Biology, Umeå University, Umeå, Sweden
- Elena R Tönshoff
  - Institute of Molecular Biology and Biophysics, Eidgenössische Technische Hochschule Zürich (ETH), Zurich, Switzerland
- Rok Kostanjšek
  - Department of Biology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Matthias Horn
  - Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria
3
Duan B, Ding P, Navarre WW, Liu J, Xia B. Xenogeneic Silencing and Bacterial Genome Evolution: Mechanisms for DNA Recognition Imply Multifaceted Roles of Xenogeneic Silencers. Mol Biol Evol 2021; 38:4135-4148. [PMID: 34003286 PMCID: PMC8476142 DOI: 10.1093/molbev/msab136] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/11/2021] [Revised: 04/08/2021] [Indexed: 12/14/2022] Open
Abstract
Horizontal gene transfer (HGT) is a major driving force for bacterial evolution. To avoid the deleterious effects due to the unregulated expression of newly acquired foreign genes, bacteria have evolved specific proteins named xenogeneic silencers to recognize foreign DNA sequences and suppress their transcription. As there is considerable diversity in genomic base compositions among bacteria, how xenogeneic silencers distinguish self- from nonself DNA in different bacteria remains poorly understood. This review summarizes the progress in studying the DNA binding preferences and the underlying molecular mechanisms of known xenogeneic silencer families, represented by H-NS of Escherichia coli, Lsr2 of Mycobacterium, MvaT of Pseudomonas, and Rok of Bacillus. Comparative analyses of the published data indicate that the differences in DNA recognition mechanisms enable these xenogeneic silencers to have clear characteristics in DNA sequence preferences, which are further correlated with different host genomic features. These correlations provide insights into the mechanisms of how these xenogeneic silencers selectively target foreign DNA in different genomic backgrounds. Furthermore, it is revealed that the genomic AT contents of bacterial species with the same xenogeneic silencer family proteins are distributed in a limited range and are generally lower than those species without any known xenogeneic silencers in the same phylum/class/genus, indicating that xenogeneic silencers have multifaceted roles on bacterial genome evolution. In addition to regulating horizontal gene transfer, xenogeneic silencers also act as a selective force against the GC to AT mutational bias found in bacterial genomes and help the host genomic AT contents maintained at relatively low levels.
Affiliation(s)
- Bo Duan
  - Beijing Nuclear Magnetic Resonance Center, College of Chemistry and Molecular Engineering, and School of Life Sciences, Peking University, Beijing, 100871, China
- Pengfei Ding
  - Beijing Nuclear Magnetic Resonance Center, College of Chemistry and Molecular Engineering, and School of Life Sciences, Peking University, Beijing, 100871, China
- William Wiley Navarre
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, M5G 1M1, Canada
- Jun Liu
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, M5G 1M1, Canada
- Bin Xia
  - Beijing Nuclear Magnetic Resonance Center, College of Chemistry and Molecular Engineering, and School of Life Sciences, Peking University, Beijing, 100871, China
4
Kovalchuk SN, Babii AV. Draft genome sequence data and comparative analysis of Erysipelothrix rhusiopathiae vaccine strain VR-2. 3 Biotech 2020; 10:455. [PMID: 33088652 DOI: 10.1007/s13205-020-02451-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2020] [Accepted: 09/21/2020] [Indexed: 11/29/2022] Open
Abstract
Erysipelothrix rhusiopathiae VR-2 is a commercially available live attenuated vaccine strain widely used in Russia, Kazakhstan, and a number of European countries for immunization of pigs against swine erysipelas. The draft genome sequence of E. rhusiopathiae strain VR-2 reported in this paper is 1,704,727 bp in length, has CG content of 36.5%, and contains 1680 genes, including 51 tRNA, 3 rRNA, and 1408 protein-coding genes. Comparative sequence analysis between Fujisawa (serovar 1a), VR-2 and six other serovar N strains of E. rhusiopathiae revealed wide genetic variability of the chromosomal region essential for serovar-specific antigenicity and virulence of E. rhusiopathiae strains. We have performed a BLAST search and found 12 genomic loci potentially specific for the E. rhusiopathiae VR-2 strain. These data could be helpful for developing genetic assays for differentiation of field isolates and this live attenuated vaccine strain, which is especially important for epizootical monitoring of swine erysipelas in countries, where the live vaccine strain E. rhusiopathiae VR-2 is used for pig immunization, as well as for the design of recombinant vaccines against swine erysipelas. The genome of E. rhusiopathiae VR-2 has been submitted in GenBank under accession number RJTK00000000.1.
Affiliation(s)
- Svetlana N Kovalchuk
  - Federal Science Center for Animal Husbandry Named After Academy Member L.K. Ernst, Dubrovitsy 60, Podolsk Municipal District, 142132 Moscow Region Russian Federation
- Anna V Babii
  - Federal Science Center for Animal Husbandry Named After Academy Member L.K. Ernst, Dubrovitsy 60, Podolsk Municipal District, 142132 Moscow Region Russian Federation
5
Bohlin J, Rose B, Brynildsrud O, De Blasio BF. A simple stochastic model describing genomic evolution over time of GC content in microbial symbionts. J Theor Biol 2020; 503:110389. [PMID: 32634385 DOI: 10.1016/j.jtbi.2020.110389] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/21/2019] [Revised: 04/21/2020] [Accepted: 06/24/2020] [Indexed: 11/29/2022]
Abstract
An organism's genomic base composition is usually summarized by its AT or GC content due to Chargaff's parity laws. Variation in prokaryotic GC content can be substantial between taxa but is generally small within microbial genomes. This variation has been found to correlate with both phylogeny and environmental factors. Since novel single-nucleotide polymorphisms (SNPs) within genomes are at least partially linked to the environment through natural selection, SNP GC content can be considered a compound measure of an organism's environmental influences, lifestyle, phylogeny as well as other more or less random processes. While there are several models describing genomic GC content few, if any, consider AT/GC mutation rates subjected to random perturbations. We present a mathematical model that describes how GC content in microbial genomes evolves over time as a function of the AT → GC and GC → AT mutation rates with Gaussian white noise disturbances. The model, which is suited specifically to non-recombining vertically transmitted prokaryotic symbionts, suggests that small differences in the AT/GC mutation rates can lead to profound differences in outcome due to the ensuing stochastic process. In other words, the model indicates that time to extinction could be a consequence of the mutation rate trajectory on which the symbiont embarked early on in its evolutionary history.
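As a rough illustration of the kind of model this abstract describes, the sketch below simulates GC content driven by AT → GC and GC → AT mutation rates plus Gaussian white noise, using Euler-Maruyama steps. The drift term, the function and parameter names (`simulate_gc`, `at_to_gc`, `gc_to_at`, `sigma`), and all numeric values are illustrative assumptions; they are not the equations or estimates from the paper.

```python
import math
import random


def simulate_gc(gc0, at_to_gc, gc_to_at, sigma, dt, steps, seed=0):
    """Euler-Maruyama simulation of a hypothetical GC-content SDE.

    Assumed form (an illustration, not the published model):
        dG = (at_to_gc * (1 - G) - gc_to_at * G) dt + sigma dW
    where G is the GC fraction of the genome.
    """
    rng = random.Random(seed)
    gc = gc0
    path = [gc]
    for _ in range(steps):
        # Deterministic part: AT sites gained as GC minus GC sites lost to AT.
        drift = at_to_gc * (1.0 - gc) - gc_to_at * gc
        # Stochastic part: Gaussian white noise scaled by sqrt(dt).
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        gc = gc + drift * dt + noise
        # Keep the value a valid fraction.
        gc = min(1.0, max(0.0, gc))
        path.append(gc)
    return path


def equilibrium_gc(at_to_gc, gc_to_at):
    """Noise-free equilibrium GC fraction of the assumed drift term."""
    return at_to_gc / (at_to_gc + gc_to_at)


if __name__ == "__main__":
    # Hypothetical rates: GC -> AT mutations twice as frequent as AT -> GC.
    noisy = simulate_gc(0.5, 0.01, 0.02, sigma=0.005, dt=1.0, steps=2000, seed=1)
    print("final GC fraction:", round(noisy[-1], 3))
    print("noise-free equilibrium:", round(equilibrium_gc(0.01, 0.02), 3))
```

With `sigma` set to zero the trajectory settles at `at_to_gc / (at_to_gc + gc_to_at)`. With noise, runs that start from the same rates can drift apart, which is the qualitative behaviour the abstract attributes to the stochastic process.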
Affiliation(s)
- Jon Bohlin
  - Division of Infection Control and Environmental Health, Norwegian Institute of Public Health, Oslo, Norway; Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway; Department of Production Animals, Faculty of Veterinary Medicine, Norwegian University of Life Science, Oslo, Norway
- Brittany Rose
  - Division of Infection Control and Environmental Health, Norwegian Institute of Public Health, Oslo, Norway; Department of Biostatistics, Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
- Ola Brynildsrud
  - Division of Infection Control and Environmental Health, Norwegian Institute of Public Health, Oslo, Norway; Department of Production Animals, Faculty of Veterinary Medicine, Norwegian University of Life Science, Oslo, Norway
- Birgitte Freiesleben De Blasio
  - Division of Infection Control and Environmental Health, Norwegian Institute of Public Health, Oslo, Norway; Department of Biostatistics, Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
6
Cen S, Yin R, Mao B, Zhao J, Zhang H, Zhai Q, Chen W. Comparative genomics shows niche-specific variations of Lactobacillus plantarum strains isolated from human, Drosophila melanogaster, vegetable and dairy sources. FOOD BIOSCI 2020. [DOI: 10.1016/j.fbio.2020.100581] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 02/07/2023]
7
Bohlin J, Rose B, Pettersson JHO. Estimation of AT and GC content distributions of nucleotide substitution rates in bacterial core genomes. BIG DATA ANALYTICS 2019. [DOI: 10.1186/s41044-019-0042-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open
8
Bohlin J, Pettersson JHO. Evolution of Genomic Base Composition: From Single Cell Microbes to Multicellular Animals. Comput Struct Biotechnol J 2019; 17:362-370. [PMID: 30949307 PMCID: PMC6429543 DOI: 10.1016/j.csbj.2019.03.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/28/2018] [Revised: 02/28/2019] [Accepted: 03/01/2019] [Indexed: 01/07/2023] Open
Abstract
Whole genome sequencing (WGS) of thousands of microbial genomes has provided considerable insight into evolutionary mechanisms in the microbial world. While substantially fewer eukaryotic genomes are available for analyses the number is rapidly increasing. This mini-review summarizes broadly evolutionary dynamics of base composition in the different domains of life from the perspective of prokaryotes. Common and different evolutionary mechanisms influencing genomic base composition in eukaryotes and prokaryotes are discussed. The conclusion from the data currently available suggests that while there are similarities there are also striking differences in how genomic base composition has evolved within prokaryotes and eukaryotes. For instance, homologous recombination appears to increase GC content locally in eukaryotes due to a non-selective process termed GC-biased gene conversion (gBGC). For prokaryotes on the other hand, increase in genomic GC content seems to be driven by the environment and selection. We find that similar phenomena observed for some organisms in each respective domain may be caused by very different mechanisms: while gBGC and recombination rates appear to explain the negative correlation between GC3 (GC content based on the third codon nucleotides) and genome size in some eukaryotes uptake of AT rich DNA sequences is the main reason for a similar negative correlation observed in prokaryotes. We provide further examples that indicate that base composition in prokaryotes and eukaryotes have evolved under very different constraints.
Affiliation(s)
- Jon Bohlin
  - Norwegian Institute of Public Health, Division of Infection Control and Environmental Health, Department of Infectious Disease Epidemiology and Modelling, Lovisenberggata 8, 0456 Oslo, Norway; Centre for Fertility and Health, Norwegian Institute of Public Health, PO-Box 222 Skøyen, N-0213 Oslo, Norway; Norwegian University of Life Sciences, Faculty of Veterinary Sciences, Production Animal Clinical Sciences, Ullevålsveien 72, 0454 Oslo, Norway
- John H-O Pettersson
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School the University of Sydney, New South Wales 2006, Australia; Zoonosis Science Center, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; Public Health Agency of Sweden, Nobels väg 18, SE-171 82 Solna, Sweden
9
More than 18,000 effectors in the Legionella genus genome provide multiple, independent combinations for replication in human cells. Proc Natl Acad Sci U S A 2019; 116:2265-2273. [PMID: 30659146 DOI: 10.1073/pnas.1808016116] [Citation(s) in RCA: 137] [Impact Index Per Article: 27.4] [Indexed: 12/31/2022] Open
Abstract
The genus Legionella comprises 65 species, among which Legionella pneumophila is a human pathogen causing severe pneumonia. To understand the evolution of an environmental to an accidental human pathogen, we have functionally analyzed 80 Legionella genomes spanning 58 species. Uniquely, an immense repository of 18,000 secreted proteins encoding 137 different eukaryotic-like domains and over 200 eukaryotic-like proteins is paired with a highly conserved type IV secretion system (T4SS). Specifically, we show that eukaryotic Rho- and Rab-GTPase domains are found nearly exclusively in eukaryotes and Legionella. Translocation assays for selected Rab-GTPase proteins revealed that they are indeed T4SS secreted substrates. Furthermore, F-box, U-box, and SET domains were present in >70% of all species, suggesting that manipulation of host signal transduction, protein turnover, and chromatin modification pathways are fundamental intracellular replication strategies for legionellae. In contrast, the Sec-7 domain was restricted to L. pneumophila and seven other species, indicating effector repertoire tailoring within different amoebae. Functional screening of 47 species revealed 60% were competent for intracellular replication in THP-1 cells, but interestingly, this phenotype was associated with diverse effector assemblages. These data, combined with evolutionary analysis, indicate that the capacity to infect eukaryotic cells has been acquired independently many times within the genus and that a highly conserved yet versatile T4SS secretes an exceptional number of different proteins shaped by interdomain gene transfer. Furthermore, we revealed the surprising extent to which legionellae have coopted genes and thus cellular functions from their eukaryotic hosts, providing an understanding of how dynamic reshuffling and gene acquisition have led to the emergence of major human pathogens.
10
Almpanis A, Swain M, Gatherer D, McEwan N. Correlation between bacterial G+C content, genome size and the G+C content of associated plasmids and bacteriophages. Microb Genom 2018; 4:e000168. [PMID: 29633935 PMCID: PMC5989581 DOI: 10.1099/mgen.0.000168] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 11/27/2017] [Accepted: 03/06/2018] [Indexed: 02/06/2023] Open
Abstract
Based on complete bacterial genome sequence data, we demonstrate a correlation between bacterial chromosome length and the G+C content of the genome, with longer genomes having higher G+C contents. The correlation value decreases at shorter genome sizes, where there is a wider spread of G+C values. However, although significant (P<0.001), the correlation value (Pearson R=0.58) suggests that other factors also have a significant influence. A similar pattern was seen for plasmids; longer plasmids had higher G+C values, although the large number of shorter plasmids had a wide spread of G+C values. There was also a significant (P<0.0001) correlation between the G+C content of plasmids and the G+C content of their bacterial host. Conversely, the G+C content of bacteriophages tended to reduce with larger genome sizes, and although there was a correlation between host genome G+C content and that of the bacteriophage, it was not as strong as that seen between plasmids and their hosts.
Affiliation(s)
- Apostolos Almpanis
  - Aberystwyth University, Aberystwyth, UK
  - Newcastle University, Newcastle-upon-Tyne, UK
- Neil McEwan
  - Aberystwyth University, Aberystwyth, UK
  - School of Pharmacy and Life Sciences, Robert Gordon University, Aberdeen, UK
11
Bohlin J, Eldholm V, Pettersson JHO, Brynildsrud O, Snipen L. The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes. BMC Genomics 2017; 18:151. [PMID: 28187704 PMCID: PMC5303225 DOI: 10.1186/s12864-017-3543-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 12/09/2016] [Accepted: 02/02/2017] [Indexed: 12/02/2022] Open
Abstract
Background The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. Results We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. Conclusion The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
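The two quantities this abstract relies on, GC content and relative entropy of trinucleotide frequencies against a mononucleotide expectation, can be sketched as follows. This is a minimal illustration that assumes relative entropy means the Kullback-Leibler divergence in bits and that trinucleotides are counted in overlapping windows; the function names are hypothetical, and the paper's exact procedure (for example, counting in codon frame) may differ.

```python
import math
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def trinucleotide_relative_entropy(seq):
    """KL divergence (bits) of observed trinucleotide frequencies from the
    frequencies expected if bases were independent (mononucleotide model).

    Assumption: trinucleotides are counted in overlapping windows.
    """
    seq = seq.upper()
    mono = Counter(base for base in seq if base in "ACGT")
    total_mono = sum(mono.values())

    tri = Counter()
    for i in range(len(seq) - 2):
        word = seq[i:i + 3]
        if all(base in "ACGT" for base in word):
            tri[word] += 1
    total_tri = sum(tri.values())
    if total_tri == 0:
        raise ValueError("sequence too short or contains no valid trinucleotides")

    entropy = 0.0
    for word, count in tri.items():
        observed = count / total_tri
        # Expected frequency is the product of the three base frequencies.
        expected = 1.0
        for base in word:
            expected *= mono[base] / total_mono
        entropy += observed * math.log2(observed / expected)
    return entropy


if __name__ == "__main__":
    toy = "ATGGCGTACGCTAGCTAGGCTAACGTTAGC"  # made-up sequence for illustration
    print("GC content:", round(gc_content(toy), 3))
    print("relative entropy (bits):", round(trinucleotide_relative_entropy(toy), 3))
```

A value of zero means the trinucleotide usage is exactly what base composition alone predicts; larger values indicate more sequence structure beyond base composition, which is the sense in which the abstract compares core and accessory genomes.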
Affiliation(s)
- Jon Bohlin
  - Infectious Disease Control and Environmental Health, Norwegian Institute of Public Health, Lovisenberggata 8, P.O. Box 4404, 0403, Oslo, Norway
- Vegard Eldholm
  - Infectious Disease Control and Environmental Health, Norwegian Institute of Public Health, Lovisenberggata 8, P.O. Box 4404, 0403, Oslo, Norway
- John H O Pettersson
  - Infectious Disease Control and Environmental Health, Norwegian Institute of Public Health, Lovisenberggata 8, P.O. Box 4404, 0403, Oslo, Norway
- Ola Brynildsrud
  - Infectious Disease Control and Environmental Health, Norwegian Institute of Public Health, Lovisenberggata 8, P.O. Box 4404, 0403, Oslo, Norway
- Lars Snipen
  - Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, 1430, Ås, Norway
12
Bohlin J. Genome expansion in bacteria: the curios case of Chlamydia trachomatis. BMC Res Notes 2015; 8:512. [PMID: 26423146 PMCID: PMC4589037 DOI: 10.1186/s13104-015-1464-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/28/2015] [Accepted: 09/21/2015] [Indexed: 11/23/2022] Open
Abstract
Background Recent findings indicated that a correlation between genomic % AT and genome size within strains of microbial species was predominantly associated with the uptake of foreign DNA. One species however, Chlamydia trachomatis, defied any explanation. In the present study 79 fully sequenced C. trachomatis genomes, representing ocular- (nine strains), urogenital- (36 strains) and lymphogranuloma venereum strains (LGV, 22 strains), in three pathogroups, in addition to 12 laboratory isolates, were scrutinized with the intent of elucidating the positive correlation between genomic AT content and genome size. Results The average size difference between the strains of each pathogroup was largely explained by the incorporation of genetic fragments. These fragments were slightly more AT rich than their corresponding host genomes, but not enough to justify the difference in AT content between the strains of the smaller genomes lacking the fragments. In addition, a genetic region predominantly found in the ocular strains, which had the largest genomes, was on average more GC rich than the host genomes of the urogenital strains (58.64 % AT vs. 58.69 % AT), which had the second largest genomes, implying that the foreign genetic regions cannot alone explain the association between genome size and AT content in C. trachomatis. 23,492 SNPs were identified for all 79 genomes, and although the SNPs were on average slightly GC rich (~47 % AT), a significant association was found between genome-wide SNP AT content, for each pathogroup, and genome size (p < 0.001, R2 = 0.86) in the C. trachomatis strains. Conclusions The correlation between genome size and AT content, with respect to the C. trachomatis pathogroups, was explained by the incorporation of genetic fragments unique to the ocular and/or urogenital strains into the LGV- and urogenital strains in addition to the genome-wide SNP AT content differences between the three pathogroups.
Affiliation(s)
- Jon Bohlin
  - Department of Bacteriology and Immunology, Norwegian Institute of Public Health, Lovisenberggata 6, P.O. Box 4404, 0403, Oslo, Norway
13
Bohlin J, Brynildsrud OB, Sekse C, Snipen L. An evolutionary analysis of genome expansion and pathogenicity in Escherichia coli. BMC Genomics 2014; 15:882. [PMID: 25297974 PMCID: PMC4200225 DOI: 10.1186/1471-2164-15-882] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/24/2014] [Accepted: 09/29/2014] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND There are several studies describing loss of genes through reductive evolution in microbes, but how selective forces are associated with genome expansion due to horizontal gene transfer (HGT) has not received similar attention. The aim of this study was therefore to examine how selective pressures influence genome expansion in 53 fully sequenced and assembled Escherichia coli strains. We also explored potential connections between genome expansion and the attainment of virulence factors. This was performed using estimations of several genomic parameters such as AT content, genomic drift (measured using relative entropy), genome size and estimated HGT size, which were subsequently compared to analogous parameters computed from the core genome consisting of 1729 genes common to the 53 E. coli strains. Moreover, we analyzed how selective pressures (quantified using relative entropy and dN/dS), acting on the E. coli core genome, influenced lineage and phylogroup formation. RESULTS Hierarchical clustering of dS and dN estimations from the E. coli core genome resulted in phylogenetic trees with topologies in agreement with known E. coli taxonomy and phylogroups. High values of dS, compared to dN, indicate that the E. coli core genome has been subjected to substantial purifying selection over time; significantly more than the non-core part of the genome (p<0.001). This is further supported by a linear association between strain-wise dS and dN values (β = 26.94 ± 0.44, R2~0.98, p<0.001). The non-core part of the genome was also significantly more AT-rich (p<0.001) than the core genome and E. coli genome size correlated with estimated HGT size (p<0.001). In addition, genome size (p<0.001), AT content (p<0.001) as well as estimated HGT size (p<0.005) were all associated with the presence of virulence factors, suggesting that pathogenicity traits in E. coli are largely attained through HGT. No associations were found between selective pressures operating on the E. coli core genome, as estimated using relative entropy, and genome size (p~0.98). CONCLUSIONS On a larger time frame, genome expansion in E. coli, which is significantly associated with the acquisition of virulence factors, appears to be independent of selective forces operating on the core genome.
Affiliation(s)
- Jon Bohlin
  - Division of Epidemiology, Norwegian Institute of Public Health, Marcus Thranes gate 6, P.O. Box 4404, Oslo 0403, Norway