1
Curry KD, Yu FB, Vance SE, Segarra S, Bhaya D, Chikhi R, Rocha EPC, Treangen TJ. Reference-free structural variant detection in microbiomes via long-read co-assembly graphs. Bioinformatics 2024; 40:i58-i67. [PMID: 38940156 PMCID: PMC11211843 DOI: 10.1093/bioinformatics/btae224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation, growth, and their impact on host phenotype. Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by combining all metagenomic samples in a series (ordered by time or another metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining.
RESULTS We show rhea to outperform existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and strain diversity increases. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between successive time and temperature samples, suggesting host advantage. Our approach leverages previous work in assembly graph structural and coverage patterns to provide versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial gene flux.
AVAILABILITY AND IMPLEMENTATION rhea is open source and available at: https://github.com/treangenlab/rhea.
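The coverage comparison described in the abstract above can be illustrated with a minimal sketch. This is not rhea's implementation: the function names, the base-2 logarithm, the pseudocount, and the calling threshold are all assumptions chosen for illustration, and the coverage values are made-up numbers for a single graph feature across three successive samples.

```python
import math


def log2_fold_changes(coverage, pseudocount=1.0):
    """Log2 fold change in coverage between each pair of successive samples.

    `pseudocount` (an assumption, not taken from the paper) avoids division
    by zero when a feature is absent from a sample.
    """
    return [
        math.log2((after + pseudocount) / (before + pseudocount))
        for before, after in zip(coverage, coverage[1:])
    ]


def classify(lfc, threshold=1.0):
    """Label a change as thriving, declining, or stable.

    `threshold` is a hypothetical cutoff used only for this example.
    """
    if lfc >= threshold:
        return "thriving"
    if lfc <= -threshold:
        return "declining"
    return "stable"


# Hypothetical coverage of one graph feature in three successive samples.
coverage = [7.0, 15.0, 3.0]
changes = log2_fold_changes(coverage)
labels = [classify(c) for c in changes]
print(changes, labels)
```

With these made-up values the feature doubles in adjusted coverage between the first two samples and drops fourfold between the last two, so it would be labeled thriving and then declining under the assumed threshold.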
Affiliation(s)
- Kristen D Curry
  - Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States
  - Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France
- Summer E Vance
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA 94720, United States
- Santiago Segarra
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, United States
- Devaki Bhaya
  - Carnegie Institution for Science, Department of Plant Biology, Stanford, CA 94305, United States
- Rayan Chikhi
  - Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris 75015, France
- Eduardo P C Rocha
  - Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France
- Todd J Treangen
  - Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States
2
Curry KD, Yu FB, Vance SE, Segarra S, Bhaya D, Chikhi R, Rocha EP, Treangen TJ. Reference-free Structural Variant Detection in Microbiomes via Long-read Coassembly Graphs. bioRxiv 2024:2024.01.25.577285. [PMID: 38352454 PMCID: PMC10862772 DOI: 10.1101/2024.01.25.577285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Bacterial genome dynamics are vital for understanding the mechanisms underlying microbial adaptation, growth, and their broader impact on host phenotype. Structural variants (SVs), genomic alterations of 10 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by using a single metagenome coassembly graph constructed from all samples in a series. The log fold change in graph coverage between subsequent samples is then calculated to call SVs that are thriving or declining throughout the series. We show rhea to outperform existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, which is particularly noticeable as the simulated reads diverge from reference genomes and strain diversity increases. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between subsequent time and temperature samples, suggesting host advantage. Our approach leverages raw read patterns rather than references or MAGs to include all sequencing reads in analysis, and thus provides versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial genome dynamics.
Affiliation(s)
- Kristen D. Curry
  - Rice University, Department of Computer Science, Houston, TX 77005, United States
  - Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Microbial Evolutionary Genomics, 75015 Paris, France
- Summer E. Vance
  - University of California, Berkeley, Department of Environmental Science, Policy, and Management, Berkeley, CA 94720, United States
- Santiago Segarra
  - Rice University, Department of Electrical and Computer Engineering, Houston, TX 77005, United States
- Devaki Bhaya
  - Carnegie Institution for Science, Department of Plant Biology, Stanford, CA 94305, United States
- Rayan Chikhi
  - Institut Pasteur, Université Paris Cité, Sequence Bioinformatics unit, 75015 Paris, France
- Eduardo P.C. Rocha
  - Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Microbial Evolutionary Genomics, 75015 Paris, France
- Todd J. Treangen
  - Rice University, Department of Computer Science, Houston, TX 77005, United States
3
Laux M, Piroupo CM, Setubal JC, Giani A. The Raphidiopsis (= Cylindrospermopsis) raciborskii pangenome updated: Two new metagenome-assembled genomes from the South American clade. Harmful Algae 2023; 129:102518. [PMID: 37951618 DOI: 10.1016/j.hal.2023.102518] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/14/2023] [Revised: 09/15/2023] [Accepted: 09/28/2023] [Indexed: 11/14/2023]
Abstract
Two Raphidiopsis (=Cylindrospermopsis) raciborskii metagenome-assembled genomes (MAGs) were recovered from two freshwater metagenomic datasets sampled in 2011 and 2012 in Pampulha Lake, a hypereutrophic, artificial, shallow reservoir located in the city of Belo Horizonte (MG), Brazil. Since the late 1970s, the lake has undergone increasing eutrophication pressure due to wastewater input, leading to frequent cyanobacterial blooms. The major difference observed between the PAMP2011 and PAMP2012 MAGs was the lack of the saxitoxin gene cluster in PAMP2012, which also presented a smaller genome, while PAMP2011 presented the complete sxt cluster and all essential proteins and clusters. The pangenome analysis was performed with all Raphidiopsis/Cylindrospermopsis genomes available at NCBI to date, with the addition of the PAMP2011 and PAMP2012 MAGs (All33 subset), but also without the South American strains (noSA subset), and only among the South American strains (SA10 and SA8 subsets). We observed a substantial increase in the core genome size for the 'noSA' subset in comparison to the 'All33' subset. Since the core genome reflects the closeness among the pangenome members, the results strongly suggest that the conservation level of the essential gene repertoire is affected by the geographic origin of the strains being analyzed, supporting the existence of a distinct SA clade. The Raphidiopsis pangenome comprised a total of 7943 orthologous protein clusters, and the two new MAGs increased the pangenome size by 11%. The pangenome-based phylogenetic relationships among the 33 analyzed genomes showed that the SA genomes clustered together with 99% bootstrap support, reinforcing the metabolic particularity of the Raphidiopsis South American clade, related to its unique saxitoxin-producing ability, while also indicating a different evolutionary history due to its geographic isolation.
Affiliation(s)
- Marcele Laux
  - Department of Botany, Phycology Laboratory, Universidade Federal de Minas Gerais, 31270-901, Belo Horizonte, MG, Brazil
- Carlos Morais Piroupo
  - Department of Biochemistry, Institute of Chemistry, Universidade de São Paulo, 05508-000, São Paulo, SP, Brazil
- João Carlos Setubal
  - Department of Biochemistry, Institute of Chemistry, Universidade de São Paulo, 05508-000, São Paulo, SP, Brazil
- Alessandra Giani
  - Department of Botany, Phycology Laboratory, Universidade Federal de Minas Gerais, 31270-901, Belo Horizonte, MG, Brazil
4
Huang S, Li H, Ma L, Liu R, Li Y, Wang H, Lu X, Huang X, Wu X, Liu X. Insertion sequence contributes to the evolution and environmental adaptation of Acidithiobacillus. BMC Genomics 2023; 24:282. [PMID: 37231368 DOI: 10.1186/s12864-023-09372-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 05/10/2023] [Indexed: 05/27/2023] Open
Abstract
BACKGROUND The genus Acidithiobacillus has attracted wide attention due to its superior survival and oxidation ability in acid mine drainage (AMD). However, knowledge of the contribution of insertion sequences (ISs) to its biological evolution and environmental adaptation is very limited. ISs are the simplest kind of mobile genetic elements (MGEs), capable of interrupting genes and operons or regulating gene expression through transposition activity. ISs can be classified into different families, each with its own members, which occur in different copy numbers.
RESULTS In this study, the distribution and evolution of ISs, as well as the functions of the genes around ISs in 36 Acidithiobacillus genomes, were analyzed. The results showed that 248 members belonging to 23 IS families, with a total of 10,652 copies, were identified within the target genomes. The IS families and copy numbers differed significantly among species, indicating that the IS distribution in Acidithiobacillus was uneven. A. ferrooxidans had 166 IS members and may have developed more gene transposition strategies than other Acidithiobacillus spp. Moreover, A. thiooxidans harbored the most IS copies, suggesting that its ISs were the most active and the most likely to transpose. The ISs clustered in the phylogenetic tree approximately according to family, which mostly differed from the evolutionary trends of their host genomes. Thus, it was suggested that the recent activity of ISs in Acidithiobacillus was determined not only by their genetic characteristics but also by environmental pressure. In addition, many ISs, especially those of the Tn3 and IS110 families, were inserted around regions whose functions were As/Hg/Cu/Co/Zn/Cd translocation and sulfur oxidation, implying that ISs could improve the adaptive capacity of Acidithiobacillus to the extremely acidic environment by enhancing its resistance to heavy metals and its utilization of sulfur.
CONCLUSIONS This study provided genomic evidence for the contribution of ISs to the evolution and adaptation of Acidithiobacillus, offering new insights into the genome plasticity of these acidophiles.
Affiliation(s)
- Shanshan Huang
  - School of Minerals Processing and Bioengineering, Central South University, 410083, Changsha, China
- Huiying Li
  - School of Minerals Processing and Bioengineering, Central South University, 410083, Changsha, China
- Liyuan Ma
  - Hubei Key Laboratory of Yangtze Catchment Environmental Aquatic Science, School of Environmental Studies, China University of Geosciences, 430074, Wuhan, China
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, 430074, Wuhan, China
- Rui Liu
  - Hubei Key Laboratory of Yangtze Catchment Environmental Aquatic Science, School of Environmental Studies, China University of Geosciences, 430074, Wuhan, China
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, 430074, Wuhan, China
- Yiran Li
  - School of Minerals Processing and Bioengineering, Central South University, 410083, Changsha, China
- Hongmei Wang
  - Hubei Key Laboratory of Yangtze Catchment Environmental Aquatic Science, School of Environmental Studies, China University of Geosciences, 430074, Wuhan, China
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, 430074, Wuhan, China
- Xiaolu Lu
  - Hubei Key Laboratory of Yangtze Catchment Environmental Aquatic Science, School of Environmental Studies, China University of Geosciences, 430074, Wuhan, China
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, 430074, Wuhan, China
- Xinping Huang
  - Hubei Key Laboratory of Yangtze Catchment Environmental Aquatic Science, School of Environmental Studies, China University of Geosciences, 430074, Wuhan, China
- Xinhong Wu
  - School of Minerals Processing and Bioengineering, Central South University, 410083, Changsha, China
- Xueduan Liu
  - School of Minerals Processing and Bioengineering, Central South University, 410083, Changsha, China
5
Two Archaeal Metagenome-Assembled Genomes from El Tatio Provide New Insights into the Crenarchaeota Phylum. Genes (Basel) 2021; 12:genes12030391. [PMID: 33803363 PMCID: PMC7999037 DOI: 10.3390/genes12030391] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/02/2021] [Revised: 02/26/2021] [Accepted: 03/03/2021] [Indexed: 11/16/2022] Open
Abstract
A phylogenomic and functional analysis of the first two Crenarchaeota MAGs from the El Tatio geyser field in Chile is reported. A soil sample taken next to a lagoon exposed to geothermal activity at El Tatio was used for shotgun sequencing. Afterwards, contigs were binned into individual population-specific genomes. A phylogenetic placement was carried out for both MAG 9-5TAT and MAG 47-5TAT. Then functional comparisons and metabolic reconstruction were carried out. Results showed that MAG 9-5TAT and MAG 47-5TAT likely represent new species in the genus Thermoproteus and the genus Sulfolobus, respectively. These findings provide new insights into the phylogenetic and genomic diversity of the archaeal species that inhabit the El Tatio geyser field and expand the understanding of the diversity of the Crenarchaeota phylum.
6
Abstract
Species belonging to the family Lactobacillaceae are found in highly diverse environments and play an important role in fermented foods and probiotic products. Many of these species have been individually reported to harbour plasmids that encode important genes. In this study, we performed comparative genomic analysis of publicly available data for 512 plasmids from 282 strains represented by 51 species of this family and correlated the genomic features of plasmids with the ecological niches in which these species are found. Two-thirds of the species had at least one plasmid-harbouring strain. Plasmid abundance and GC content were significantly lower in vertebrate-adapted species as compared to nomadic and free-living species. Hierarchical clustering highlighted the distinct nature of plasmids from the nomadic and free-living species than those from the vertebrate-adapted species. EggNOG-assisted functional annotation revealed that genes associated with transposition, conjugation, DNA repair and recombination, exopolysaccharide production, metal ion transport, toxin–antitoxin system, and stress tolerance were significantly enriched on the plasmids of the nomadic and in some cases nomadic and free-living species. On the other hand, genes related to anaerobic metabolism, ABC transporters and the major facilitator superfamily were overrepresented on the plasmids of the vertebrate-adapted species. These genomic signatures correlate with the comparatively nutrient-depleted, stressful and dynamic environments of nomadic and free-living species and nutrient-rich and anaerobic environments of vertebrate-adapted species. Thus, these results indicate the contribution of the plasmids in the adaptation of lactobacilli to their respective habitats. This study also underlines the potential application of these plasmids in improving the technological and probiotic properties of lactic acid bacteria.
Affiliation(s)
- Dimple Davray
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Lavale, Pune 412115, India
- Dipti Deo
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Lavale, Pune 412115, India
- Ram Kulkarni
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Lavale, Pune 412115, India
7
Son S, Oh JD, Lee SH, Shin D, Kim Y. Comparative genomics of canine Lactobacillus reuteri reveals adaptation to a shared environment with humans. Genes Genomics 2020; 42:1107-1116. [PMID: 32761525 DOI: 10.1007/s13258-020-00978-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/08/2020] [Accepted: 07/23/2020] [Indexed: 01/19/2023]
Abstract
BACKGROUND Lactobacillus reuteri is a gram-positive, non-motile bacterial species that has been used as a representative model microorganism to describe the ecology and evolution of vertebrate gut symbionts.
OBJECTIVE Because the genetic features and evolutionary strategies of L. reuteri from the gastrointestinal tract of canines remain unknown, we aimed to construct a draft genome of canine L. reuteri and investigate modified, acquired, or lost genetic features that have facilitated the evolution and adaptation of strains to specific environmental niches.
METHODS To examine canine L. reuteri, we sequenced an L. reuteri strain isolated from a dog in Korea. A comparative genomic approach was used to assess genetic diversity and gain insight into the distinguishing features related to different hosts based on 27 published genomic sequences.
RESULTS The pan-genome of 28 L. reuteri strains contained 7,369 gene families, and the core genome contained 1,070 gene families. In the ANI tree based on the core genes, the canine L. reuteri strain (C1) was very close to three strains (IRT, DSM20016, JCM1112) from humans. Evolutionarily, these four strains formed one clade, which we regarded as the C1-clade in this study. We examined a total of 32,050 amino acid substitutions among the 28 L. reuteri strain genomes. In this comparison, 283 amino acid substitutions were specific to strain C1, and the four strains in the C1-clade shared most of these 283 C1-specific amino acid substitutions, strongly suggesting similar selective pressure. Among accessory genes, we identified 127 C1-clade host-specific genes and found that several genes were closely related to replication, recombination, and repair.
CONCLUSION This study provides new insights into the adaptation of L. reuteri to the canine intestinal habitat, and suggests that the genome of L. reuteri from canines is closely associated with their living environment, which is shared with humans.
Affiliation(s)
- Seungwoo Son
  - The Animal Molecular Genetics and Breeding Center, Jeonbuk National University, Jeollabuk-do, Jeonju-si, 54896, Republic of Korea
- Jae-Don Oh
  - The Animal Molecular Genetics and Breeding Center, Jeonbuk National University, Jeollabuk-do, Jeonju-si, 54896, Republic of Korea
- Sung Ho Lee
  - Woogene B&G Co., Ltd., Gyeonggi-do, Hwaseong-si, 18630, Republic of Korea
- Donghyun Shin
  - The Animal Molecular Genetics and Breeding Center, Jeonbuk National University, Jeollabuk-do, Jeonju-si, 54896, Republic of Korea
- Yangseon Kim
  - Center for Industrialization of Agriculture and Livestock Microorganism, Jeongeup-si, Jeollabuk-do, 56212, Republic of Korea
8
Genome analysis of Rubritalea profundi SAORIC-165T, the first deep-sea verrucomicrobial isolate, from the northwestern Pacific Ocean. J Microbiol 2019; 57:413-422. [PMID: 30806980 DOI: 10.1007/s12275-019-8712-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/28/2018] [Revised: 01/04/2019] [Accepted: 01/04/2019] [Indexed: 12/13/2022]
Abstract
Although culture-independent studies have shown the presence of Verrucomicrobia in the deep sea, verrucomicrobial strains from deep-sea environments have rarely been cultured and characterized. Recently, Rubritalea profundi SAORIC-165T, a psychrophilic bacterium of the phylum Verrucomicrobia, was isolated from a depth of 2,000 m in the northwestern Pacific Ocean. In this study, the genome sequence of R. profundi SAORIC-165T, the first deep-sea verrucomicrobial isolate, is reported with a description of the genome properties and a comparison to surface-borne Rubritalea genomes. The draft genome consisted of four contigs with a total size of 4,167,407 bp and a G+C content of 47.5%. The SAORIC-165T genome was predicted to have 3,844 protein-coding genes and 45 non-coding RNA genes. The genome contained a repertoire of metabolic pathways, including the Embden-Meyerhof-Parnas pathway, pentose phosphate pathway, tricarboxylic acid cycle, assimilatory sulfate reduction, and biosynthesis of nicotinate/nicotinamide, pantothenate/coenzyme A, folate, and lycopene. The comparative genomic analyses with two surface-derived Rubritalea genomes showed that the SAORIC-165T genome was enriched in genes involved in transposition of mobile elements, signal transduction, and carbohydrate metabolism, some of which might be related to enhanced ecological fitness in the deep-sea environment. Amplicon sequencing of 16S rRNA genes from the water column revealed that R. profundi-related phylotypes were relatively abundant at 2,000 m and preferred a particle-associated lifestyle in the deep sea. These findings suggest that R. profundi represents a genetically unique and ecologically relevant verrucomicrobial group well adapted to the deep-sea environment.
9
Blesa A, Sánchez M, Sacristán-Horcajada E, González-de la Fuente S, Peiró R, Berenguer J. Into the Thermus Mobilome: Presence, Diversity and Recent Activities of Insertion Sequences Across Thermus spp. Microorganisms 2019; 7:microorganisms7010025. [PMID: 30669685 PMCID: PMC6352166 DOI: 10.3390/microorganisms7010025] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/27/2018] [Revised: 01/09/2019] [Accepted: 01/17/2019] [Indexed: 11/28/2022] Open
Abstract
A high level of transposon-mediated genome rearrangement is a common trait among microorganisms isolated from thermal environments, probably contributing to the extraordinary genomic plasticity and horizontal gene transfer (HGT) observed in these habitats. In this work, active and inactive insertion sequences (ISs) spanning the sequenced members of the genus Thermus were characterized, with special emphasis on three T. thermophilus strains: HB27, HB8, and NAR1. A large number of full ISs and fragments derived from different IS families were found, concentrating within megaplasmids present in most isolates. Potentially active ISs were identified through analysis of transposase integrity, and domestication-related transposition events of ISTth7 were identified in laboratory-adapted HB27 derivatives. Many partial copies of ISs appeared throughout the genome, which may serve as specific targets for homologous recombination contributing to genome rearrangement. Moreover, recruitment of IS1000 32 bp segments as spacers for CRISPR sequence was identified, pointing to the adaptability of these elements in the biology of these thermophiles. Further knowledge about the activity and functional diversity of ISs in this genus may contribute to the generation of engineered transposons as new genetic tools, and enrich our understanding of the outstanding plasticity shown by these thermophiles.
Affiliation(s)
- Alba Blesa
  - Department of Biotechnology, Faculty of Experimental Sciences, Universidad Francisco de Vitoria, Madrid 28223, Spain
- Mercedes Sánchez
  - Centro de Biología Molecular Severo Ochoa (CBMSO), Universidad Autónoma de Madrid-Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Eva Sacristán-Horcajada
  - Centro de Biología Molecular Severo Ochoa (CBMSO), Universidad Autónoma de Madrid-Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Sandra González-de la Fuente
  - Centro de Biología Molecular Severo Ochoa (CBMSO), Universidad Autónoma de Madrid-Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- Ramón Peiró
  - Centro de Biología Molecular Severo Ochoa (CBMSO), Universidad Autónoma de Madrid-Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
- José Berenguer
  - Centro de Biología Molecular Severo Ochoa (CBMSO), Universidad Autónoma de Madrid-Consejo Superior de Investigaciones Científicas, Madrid 28049, Spain
10
Rosen MJ, Davison M, Fisher DS, Bhaya D. Probing the ecological and evolutionary history of a thermophilic cyanobacterial population via statistical properties of its microdiversity. PLoS One 2018; 13:e0205396. [PMID: 30427861 PMCID: PMC6235289 DOI: 10.1371/journal.pone.0205396] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/18/2017] [Accepted: 09/25/2018] [Indexed: 12/11/2022] Open
Abstract
Despite extensive DNA sequencing data derived from natural microbial communities, it remains a major challenge to identify the key evolutionary and ecological forces that shape microbial populations. We have focused on the extensive microdiversity of the cyanobacterium Synechococcus sp., which is a dominant member of the dense phototrophic biofilms in the hot springs of Yellowstone National Park. From deep amplicon sequencing of many loci and statistical analyses of these data, we showed previously that the population has undergone an unexpectedly high degree of homologous recombination, unlinking synonymous SNP-pair correlations even on intragenic length scales. Here, we analyze the genic amino acid diversity, which provides new evidence of selection and insights into the evolutionary history of the population. Surprisingly, some features of the data, including the spectrum of distances between genic-alleles, appear consistent with primarily asexual neutral drift. Yet the non-synonymous site frequency spectrum has too large an excess of low-frequency polymorphisms to result from negative selection on deleterious mutations given the distribution of coalescent times that we infer. And our previous analyses showed that the population is not asexual. Taken together, these apparently contradictory data suggest that selection, epistasis, and hitchhiking all play essential roles in generating and stabilizing the diversity. We discuss these as well as potential roles of ecological niches at genomic and genic levels. From quantitative properties of the diversity and comparative genomic data, we infer aspects of the history and inter-spring dispersal of the meta-population since it was established in the Yellowstone Caldera. Our investigations illustrate the need for combining multiple types of sequencing data and quantitative statistical analyses to develop an understanding of microdiversity in natural microbial populations.
Affiliation(s)
- Michael J. Rosen
  - Applied Physics Department, Stanford University, Stanford, CA, United States of America
- Michelle Davison
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, United States of America
- Daniel S. Fisher
  - Applied Physics Department, Stanford University, Stanford, CA, United States of America
- Devaki Bhaya
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, United States of America
11
Vigil-Stenman T, Ininbergs K, Bergman B, Ekman M. High abundance and expression of transposases in bacteria from the Baltic Sea. The ISME Journal 2017; 11:2611-2623. [PMID: 28731472 PMCID: PMC5649170 DOI: 10.1038/ismej.2017.114] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 02/20/2017] [Revised: 05/23/2017] [Accepted: 06/01/2017] [Indexed: 02/06/2023]
Abstract
Transposases are mobile genetic elements suggested to have an important role in bacterial genome plasticity and host adaptation but their transcriptional activity in natural bacterial communities is largely unexplored. Here we analyzed metagenomes and -transcriptomes of size fractionated (0.1-0.8, 0.8-3.0 and 3.0-200 μm) bacterial communities from the brackish Baltic Sea, and adjacent marine waters. The Baltic Sea transposase levels, up to 1.7% of bacterial genes and 2% of bacterial transcripts, were considerably higher than in marine waters and similar to levels reported for extreme environments. Large variations in expression were found between transposase families and groups of bacteria, with a two-fold higher transcription in Cyanobacteria than in any other phylum. The community-level results were corroborated at the genus level by Synechococcus transposases reaching up to 5.2% of genes and 6.9% of transcripts, which is in contrast to marine Synechococcus that largely lack these genes. Levels peaked in Synechococcus from the largest size fraction, suggesting high frequencies of lateral gene transfer and high genome plasticity in colony-forming picocyanobacteria. Together, the results support an elevated rate of transposition-based genome change and adaptation in bacterial populations of the Baltic Sea, and possibly also of other highly dynamic estuarine waters.
Affiliation(s)
- Theoden Vigil-Stenman
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- Karolina Ininbergs
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- Birgitta Bergman
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- Martin Ekman
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
12
|
Disentangling the effects of selection and loss bias on gene dynamics. Proc Natl Acad Sci U S A 2017; 114:E5616-E5624. [PMID: 28652353 DOI: 10.1073/pnas.1704925114] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022] Open
Abstract
We combine mathematical modeling of genome evolution with comparative analysis of prokaryotic genomes to estimate the relative contributions of selection and intrinsic loss bias to the evolution of different functional classes of genes and mobile genetic elements (MGE). An exact solution for the dynamics of gene family size was obtained under a linear duplication-transfer-loss model with selection. With the exception of genes involved in information processing, particularly translation, which are maintained by strong selection, the average selection coefficient for most nonparasitic genes is low albeit positive, compatible with observed positive correlation between genome size and effective population size. Free-living microbes evolve under stronger selection for gene retention than parasites. Different classes of MGE show a broad range of fitness effects, from the nearly neutral transposons to prophages, which are actively eliminated by selection. Genes involved in antiparasite defense, on average, incur a fitness cost to the host that is at least as high as the cost of plasmids. This cost is probably due to the adverse effects of autoimmunity and curtailment of horizontal gene transfer caused by the defense systems and selfish behavior of some of these systems, such as toxin-antitoxin and restriction modification modules. Transposons follow a biphasic dynamics, with bursts of gene proliferation followed by decay in the copy number that is quantitatively captured by the model. The horizontal gene transfer to loss ratio, but not duplication to loss ratio, correlates with genome size, potentially explaining increased abundance of neutral and costly elements in larger genomes.
|
13
|
Elbehery AHA, Aziz RK, Siam R. Insertion sequences enrichment in extreme Red sea brine pool vent. Extremophiles 2016; 21:271-282. [PMID: 27915389 DOI: 10.1007/s00792-016-0900-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/09/2016] [Accepted: 11/27/2016] [Indexed: 01/24/2023]
Abstract
Mobile genetic elements are major agents of genome diversification and evolution. Few studies have addressed their characteristics, including abundance, and their role in extreme habitats. The Red Sea brine pools are among the rare natural habitats exposed to multiple extreme conditions, including high temperature, salinity and concentration of heavy metals. We assessed the abundance and distribution of different mobile genetic elements in four Red Sea brine pools, including the world's largest known multiple-extreme deep-sea environment, the Red Sea Atlantis II Deep. We report a gradient in the abundance of mobile genetic elements, dramatically increasing in the harshest environment of the pool. Additionally, we identified a strong association between the abundance of insertion sequences and extreme conditions, with abundance highest in the harshest and deepest layer of the Red Sea Atlantis II Deep. Our comparative analyses of mobile genetic elements in secluded, extreme and relatively non-extreme environments suggest that insertion sequences predominantly contribute to polyextremophile genome plasticity.
Affiliation(s)
- Ali H A Elbehery, Graduate Program of Biotechnology, School of Sciences and Engineering, The American University in Cairo, New Cairo, 11835, Cairo, Egypt
- Ramy K Aziz, Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt
- Rania Siam, Graduate Program of Biotechnology, School of Sciences and Engineering, The American University in Cairo, New Cairo, 11835, Cairo, Egypt; Department of Biology, School of Sciences and Engineering, The American University in Cairo, SSE (Parcel 7), Second Floor, Office: Room 2194, AUC Avenue, New Cairo, 11835, Cairo, Egypt
|
14
|
Bacteria and Archaea diversity within the hot springs of Lake Magadi and Little Magadi in Kenya. BMC Microbiol 2016; 16:136. [PMID: 27388368 PMCID: PMC4936230 DOI: 10.1186/s12866-016-0748-x] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 12/30/2015] [Accepted: 06/15/2016] [Indexed: 02/02/2023] Open
Abstract
Background Lake Magadi and Little Magadi are hypersaline, alkaline lakes situated in the southern part of the Kenyan Rift Valley. Solutes are supplied mainly by a series of alkaline hot springs with temperatures as high as 86 °C. Previous culture-dependent and culture-independent studies have revealed diverse groups of microorganisms thriving under these conditions. Previous culture-independent studies were based on the analysis of 16S rDNA but were done on less saline lakes. For the first time, this study combined Illumina sequencing and analysis of amplicons of both total community rDNA and 16S rRNA cDNA to determine the diversity and community structure of bacteria and archaea within three hot springs of Lake Magadi and Little Magadi. Methods Water, wet sediments and microbial mats were collected from springs in the main lake at a temperature of 45.1 °C and from Little Magadi “Nasikie eng’ida” (temperatures of 81 °C and 83.6 °C). Total community DNA and RNA were extracted from samples using phenol-chloroform and Trizol RNA extraction protocols, respectively. The 16S rRNA gene variable region (V4–V7) of the extracted DNA and RNA was amplified and library construction performed following the Illumina sequencing protocol. Sequences were analyzed using QIIME, while calculation of Bray-Curtis dissimilarities between datasets, hierarchical clustering, non-metric multidimensional scaling (NMDS), redundancy analysis (RDA) and diversity indices were carried out using the R programming language and the vegan package. Results Three thousand four hundred twenty-six and one thousand nine hundred thirteen OTUs were recovered from 16S rDNA and 16S rRNA cDNA, respectively. Uncultured diversity accounted for 89.35 % of 16S rDNA and 87.61 % of 16S rRNA cDNA reads.
The most abundant phyla in both the 16S rDNA and 16S rRNA cDNA datasets included Proteobacteria (8.33–50 %), Firmicutes (3.52–28.92 %), Bacteroidetes (3.45–26.44 %), Actinobacteria (0.98–28.57 %) and Euryarchaeota (3.55–34.48 %) in all samples. NMDS analyses of taxonomic composition clustered the taxa into three groups according to sample types (i.e. wet sediments, mats and water samples) with evident overlap of clusters between wet sediments and microbial mats from the three sample types in both DNA and cDNA datasets. The hot spring (45.1 °C) contained less diverse populations compared to those in Little Magadi (81–83 °C). Conclusion There were significant differences in microbial community structure at the 95 % level of confidence for both total diversity (P value, 0.009) based on 16S rDNA analysis and active microbial diversity (P value, 0.01) based on 16S rRNA cDNA analysis, within the three hot springs. Differences in microbial composition and structure were observed as a function of sample type and temperature, with wet sediments harboring the highest diversity. Electronic supplementary material The online version of this article (doi:10.1186/s12866-016-0748-x) contains supplementary material, which is available to authorized users.
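The Bray-Curtis dissimilarity named in the Methods above can be illustrated with a minimal sketch. The study itself used R and the vegan package; the Python function below is only a stand-in for the standard formula, and the sample names and OTU counts are hypothetical values chosen for illustration, not data from the article.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        # Two empty samples: treat them as identical.
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def pairwise(samples: Dict[str, Sequence[float]]) -> Dict[Tuple[str, str], float]:
    """Dissimilarity for every unordered pair of samples."""
    return {
        (a, b): bray_curtis(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }


# Hypothetical OTU count vectors (one entry per OTU), for illustration only.
samples = {
    "water": [10, 0, 5, 5],
    "mat": [5, 5, 5, 5],
    "sediment": [0, 10, 5, 5],
}

distances = pairwise(samples)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A value of 0 means two samples have identical counts and 1 means they share no OTUs; a matrix of such values is the usual input to ordination methods such as NMDS.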
|
15
|
Rosen MJ, Davison M, Bhaya D, Fisher DS. Microbial diversity. Fine-scale diversity and extensive recombination in a quasisexual bacterial population occupying a broad niche. Science 2015; 348:1019-23. [PMID: 26023139 DOI: 10.1126/science.aaa4456] [Citation(s) in RCA: 84] [Impact Index Per Article: 9.3] [Indexed: 12/15/2022]
Abstract
Extensive fine-scale genetic diversity is found in many microbial species across varied environments, but for most, the evolutionary scenarios that generate the observed variation remain unclear. Deep sequencing of a thermophilic cyanobacterial population and analysis of the statistics of synonymous single-nucleotide polymorphisms revealed a high rate of homologous recombination and departures from neutral drift consistent with the effects of genetic hitchhiking. A sequenced isolate genome resembled an unlinked random mixture of the allelic diversity at the sampled loci. These observations suggested a quasisexual microbial population that occupies a broad ecological niche, with selection driving frequencies of alleles rather than whole genomes.
Affiliation(s)
- Michael J Rosen, Applied Physics Department, Stanford University, Stanford, CA 94305, USA
- Michelle Davison, Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Devaki Bhaya, Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Daniel S Fisher, Applied Physics Department, Stanford University, Stanford, CA 94305, USA; Bioengineering Department, Stanford University, Stanford, CA 94305, USA
|
16
|
Iranzo J, Gómez MJ, López de Saro FJ, Manrubia S. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes. PLoS Comput Biol 2014; 10:e1003680. [PMID: 24967627 PMCID: PMC4072520 DOI: 10.1371/journal.pcbi.1003680] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 12/02/2013] [Accepted: 05/08/2014] [Indexed: 11/18/2022] Open
Abstract
Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, it is thought that they can promote genomic plasticity and genetic exchange, thus being a major force of evolutionary change. The main processes that determine IS content in genomes are, though, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for a detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed.
Insertion sequences (IS) are mobile genetic elements found in most prokaryotic genomes. They are able to autonomously change position and proliferate in chromosomes. The nature of the coevolutionary dynamics of IS with the genome that hosts them is a matter of debate: Do IS proliferate to the point of causing the extinction of the host? Is it possible that IS and hosts stably coexist? Can environmental perturbations cause IS expansions? What is the role of selection in controlling IS copy number? In this study, we have analysed abundance patterns of IS families to test two different evolutionary hypotheses: in the first one IS evolve neutrally, while in the second case they are affected by selection. Our results indicate that, most of the time, IS and their hosts coexist stably in a neutral scenario where the proliferation of IS through duplications and lateral gene transfer is balanced by regular deletions. Occasionally, though, this balance may be disrupted, causing temporary explosions of IS abundance.
Affiliation(s)
- Jaime Iranzo, Centro de Astrobiología (CAB), INTA-CSIC, Torrejón de Ardoz, Madrid, Spain
- Manuel J. Gómez, Centro de Astrobiología (CAB), INTA-CSIC, Torrejón de Ardoz, Madrid, Spain
- Susanna Manrubia, Centro de Astrobiología (CAB), INTA-CSIC, Torrejón de Ardoz, Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain
|
17
|
Lin L, Xu J. Dissecting and engineering metabolic and regulatory networks of thermophilic bacteria for biofuel production. Biotechnol Adv 2013; 31:827-37. [DOI: 10.1016/j.biotechadv.2013.03.003] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 11/30/2012] [Revised: 03/06/2013] [Accepted: 03/10/2013] [Indexed: 01/08/2023]
|
18
|
Lewin A, Wentzel A, Valla S. Metagenomics of microbial life in extreme temperature environments. Curr Opin Biotechnol 2013; 24:516-25. [DOI: 10.1016/j.copbio.2012.10.012] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 08/31/2012] [Revised: 10/15/2012] [Accepted: 10/17/2012] [Indexed: 02/04/2023]
|
19
|
López-López O, Cerdán ME, González-Siso MI. Hot spring metagenomics. Life (Basel) 2013; 3:308-20. [PMID: 25369743 PMCID: PMC4187134 DOI: 10.3390/life3020308] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 02/07/2013] [Revised: 04/11/2013] [Accepted: 04/15/2013] [Indexed: 12/14/2022] Open
Abstract
Hot springs have been investigated since the 19th century, but isolation and examination of their thermophilic microbial inhabitants did not start until the 1950s. Many thermophilic microorganisms and their viruses have since been discovered, although the real complexity of thermal communities became apparent only when research based on PCR amplification of 16S rRNA genes arose. Thereafter, the possibility of cloning and sequencing the total environmental DNA, defined as the metagenome, and the study of the genes recovered in metagenomic libraries and assemblies made it possible to gain a more comprehensive understanding of microbial communities: their diversity, structure, the interactions existing between their components, and the factors shaping the nature of these communities. In the last decade, hot springs have been a source of thermophilic enzymes of industrial interest, encouraging further study of the poorly understood diversity of microbial life in these habitats.
Affiliation(s)
- Olalla López-López, Departamento de Bioloxía Celular e Molecular, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
- María Esperanza Cerdán, Departamento de Bioloxía Celular e Molecular, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
- María Isabel González-Siso, Departamento de Bioloxía Celular e Molecular, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
|
20
|
Wang H, Sivonen K, Rouhiainen L, Fewer DP, Lyra C, Rantala-Ylinen A, Vestola J, Jokela J, Rantasärkkä K, Li Z, Liu B. Genome-derived insights into the biology of the hepatotoxic bloom-forming cyanobacterium Anabaena sp. strain 90. BMC Genomics 2012; 13:613. [PMID: 23148582 PMCID: PMC3542288 DOI: 10.1186/1471-2164-13-613] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 05/14/2012] [Accepted: 11/05/2012] [Indexed: 11/15/2022] Open
Abstract
Background Cyanobacteria can form massive toxic blooms in fresh and brackish bodies of water and are frequently responsible for the poisoning of animals and pose a health risk for humans. Anabaena is a genus of filamentous diazotrophic cyanobacteria commonly implicated as a toxin producer in blooms in aquatic ecosystems throughout the world. The biology of bloom-forming cyanobacteria is poorly understood at the genome level. Results Here, we report the complete sequence and comprehensive annotation of the bloom-forming Anabaena sp. strain 90 genome. It comprises two circular chromosomes and three plasmids with a total size of 5.3 Mb, encoding a total of 4,738 genes. The genome is replete with mobile genetic elements. Detailed manual annotation demonstrated that almost 5% of the gene repertoire consists of pseudogenes. A further 5% of the genome is dedicated to the synthesis of small peptides that are the products of both ribosomal and nonribosomal biosynthetic pathways. Inactivation of the hassallidin (an antifungal cyclic peptide) biosynthetic gene cluster through a deletion event and a natural mutation of the buoyancy-permitting gvpG gas vesicle gene were documented. The genome contains a large number of genes encoding restriction-modification systems. Two novel excision elements were found in the nifH gene that is required for nitrogen fixation. Conclusions Genome analysis demonstrated that this strain invests heavily in the production of bioactive compounds and restriction-modification systems. This well-annotated genome provides a platform for future studies on the ecology and biology of these important bloom-forming cyanobacteria.
Affiliation(s)
- Hao Wang, Department of Food and Environmental Sciences, University of Helsinki, Helsinki, FIN-00014, Finland
|
21
|
Novel miniature transposable elements in thermophilic Synechococcus strains and their impact on an environmental population. J Bacteriol 2012; 194:3636-42. [PMID: 22563047 DOI: 10.1128/jb.00333-12] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/28/2023] Open
Abstract
The genomes of the two closely related freshwater thermophilic cyanobacteria Synechococcus sp. strain JA-3-3Ab and Synechococcus sp. strain JA-2-3B'a(2-13) each host several families of insertion sequences (ISSoc families) at various copy numbers, resulting in an overall high abundance of insertion sequences in the genomes. In addition to full-length copies, a large number of internal deletion variants have been identified. ISSoc2 has two variants (ISSoc2∂-1 and ISSoc2∂-2) that are observed to have multiple near-exact copies. Comparison of environmental metagenomic sequences to the Synechococcus genomes reveals novel placement of copies of ISSoc2, ISSoc2∂-1, and ISSoc2∂-2. Thus, ISSoc2∂-1 and ISSoc2∂-2 appear to be active nonautonomous mobile elements derived by internal deletion from ISSoc2. Insertion sites interrupting genes that are likely critical for cell viability were detected; however, most insertions either were intergenic or were within genes of unknown function. Most novel insertions detected in the metagenome were rare, suggesting a stringent selective environment. Evidence for mobility of internal deletion variants of other insertion sequences in these isolates suggests that this is a general mechanism for the formation of miniature insertion sequences.
|