1
MArVD2: a machine learning enhanced tool to discriminate between archaeal and bacterial viruses in viral datasets. ISME COMMUNICATIONS 2023; 3:87. [PMID: 37620369 PMCID: PMC10449787 DOI: 10.1038/s43705-023-00295-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2022] [Revised: 08/04/2023] [Accepted: 08/09/2023] [Indexed: 08/26/2023]
Abstract
Our knowledge of viral sequence space has exploded with advancing sequencing technologies and large-scale sampling and analytical efforts. Though archaea are important and abundant prokaryotes in many systems, our knowledge of archaeal viruses outside of extreme environments is limited. This largely stems from the lack of a robust, high-throughput, and systematic way to distinguish between bacterial and archaeal viruses in datasets of curated viruses. Here we upgrade our prior text-based tool (MArVD) via training and testing a random forest machine learning algorithm against a newly curated dataset of archaeal viruses. After optimization, MArVD2 presented a significant improvement over its predecessor in terms of scalability, usability, and flexibility, and will allow user-defined custom training datasets as archaeal virus discovery progresses. Benchmarking showed that a model trained with viral sequences from the hypersaline, marine, and hot spring environments correctly classified 85% of the archaeal viruses with a false detection rate below 2% using a random forest prediction threshold of 80% in a separate benchmarking dataset from the same habitats.
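The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes a classifier has already produced a per-sequence "archaeal virus" probability; the function names and the toy scores are hypothetical, not MArVD2's interface or data. "False detection rate" is taken here to mean false positives divided by all positive calls, which is an assumption about the abstract's usage.

```python
# Minimal sketch (not MArVD2's code): apply a probability threshold to
# classifier scores and summarize the calls against known labels.

def classify(scores, threshold=0.80):
    """Call a sequence 'archaeal virus' when its score meets the threshold."""
    return [s >= threshold for s in scores]

def recall_and_fdr(predicted, truth):
    """Return (recall, false detection rate) for boolean calls vs. labels."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: false detection rate = FP / (TP + FP).
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return recall, fdr

# Hypothetical toy data: scores and true labels (True = archaeal virus).
scores = [0.95, 0.85, 0.60, 0.82, 0.10, 0.30]
truth = [True, True, True, False, False, False]

predicted = classify(scores)
recall, fdr = recall_and_fdr(predicted, truth)
print(predicted, round(recall, 3), round(fdr, 3))
```

Raising the threshold generally trades recall for fewer false detections, which is why the abstract reports both figures at a stated threshold.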
2
Abstract
DNA viruses are increasingly recognized as influencing marine microbes and microbe-mediated biogeochemical cycling. However, little is known about global marine RNA virus diversity, ecology, and ecosystem roles. In this study, we uncover patterns and predictors of marine RNA virus community- and "species"-level diversity and contextualize their ecological impacts from pole to pole. Our analyses revealed four ecological zones, latitudinal and depth diversity patterns, and environmental correlates for RNA viruses. Our findings only partially parallel those of cosampled plankton and show unexpectedly high polar ecological interactions. The influence of RNA viruses on ecosystems appears to be large, as predicted hosts are ecologically important. Moreover, the occurrence of auxiliary metabolic genes indicates that RNA viruses cause reprogramming of diverse host metabolisms, including photosynthesis and carbon cycling, and that RNA virus abundances predict ocean carbon export.
3
Abstract
Whereas DNA viruses are known to be abundant, diverse, and commonly key ecosystem players, RNA viruses are insufficiently studied outside disease settings. In this study, we analyzed ≈28 terabases of Global Ocean RNA sequences to expand Earth's RNA virus catalogs and their taxonomy, investigate their evolutionary origins, and assess their marine biogeography from pole to pole. Using new approaches to optimize discovery and classification, we identified RNA viruses that necessitate substantive revisions of taxonomy (doubling phyla and adding >50% new classes) and evolutionary understanding. "Species"-rank abundance determination revealed that viruses of the new phyla "Taraviricota," a missing link in early RNA virus evolution, and "Arctiviricota" are widespread and dominant in the oceans. These efforts provide foundational knowledge critical to integrating RNA viruses into ecological and epidemiological models.
4
MetaPop: a pipeline for macro- and microdiversity analyses and visualization of microbial and viral metagenome-derived populations. MICROBIOME 2022; 10:49. [PMID: 35287721 PMCID: PMC8922842 DOI: 10.1186/s40168-022-01231-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/14/2021] [Accepted: 11/29/2021] [Indexed: 05/08/2023]
Abstract
BACKGROUND Microbes and their viruses are hidden engines driving Earth's ecosystems from the oceans and soils to humans and bioreactors. Though gene marker approaches can now be complemented by genome-resolved studies of inter-(macrodiversity) and intra-(microdiversity) population variation, analytical tools to do so remain scattered or under-developed. RESULTS Here, we introduce MetaPop, an open-source bioinformatic pipeline that provides a single interface to analyze and visualize microbial and viral community metagenomes at both the macro- and microdiversity levels. Macrodiversity estimates include population abundances and α- and β-diversity. Microdiversity calculations include identification of single nucleotide polymorphisms, novel codon-constrained linkage of SNPs, nucleotide diversity (π and θ), and selective pressures (pN/pS and Tajima's D) within and fixation indices (FST) between populations. MetaPop will also identify genes with distinct codon usage. Following rigorous validation, we applied MetaPop to the gut viromes of autistic children that underwent fecal microbiota transfers and their neurotypical peers. The macrodiversity results confirmed our prior findings for viral populations (microbial shotgun metagenomes were not available) that diversity did not significantly differ between autistic and neurotypical children. However, by also quantifying microdiversity, MetaPop revealed lower average viral nucleotide diversity (π) in autistic children. The percentage of genomes detected under positive selection was also lower among autistic children, suggesting that higher viral π in neurotypical children may be beneficial because it allows populations to better "bet hedge" in changing environments.
Further, comparisons of microdiversity pre- and post-FMT in autistic children revealed that the FMT delivery method (oral versus rectal) may influence viral activity and engraftment of microdiverse viral populations, with children who received their FMT rectally having higher microdiversity post-FMT. Overall, these results show that analyses at the macro level alone can miss important biological differences. CONCLUSIONS These findings suggest that standardized population and genetic variation analyses will be invaluable for maximizing biological inference, and MetaPop provides a convenient tool package to explore the dual impact of macro- and microdiversity across microbial communities.
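As an illustration of the nucleotide diversity (π) statistic named in the abstract, the sketch below uses the standard textbook per-site estimator, π = n/(n−1) · (1 − Σ pᵢ²), averaged over genome length. This is an assumed, generic formulation and not necessarily how MetaPop computes π; the function names and counts are hypothetical.

```python
# Minimal sketch (not MetaPop's code): per-site nucleotide diversity from
# allele read counts, using the standard unbiased heterozygosity estimator.

def site_pi(allele_counts):
    """Nucleotide diversity at one site from counts of each allele (A, C, G, T)."""
    n = sum(allele_counts)
    if n < 2:
        return 0.0  # diversity is undefined with fewer than two observations
    homozygosity = sum((c / n) ** 2 for c in allele_counts)
    return (n / (n - 1)) * (1.0 - homozygosity)

def mean_pi(variable_sites, genome_length):
    """Average pi per base: invariant sites contribute zero."""
    return sum(site_pi(counts) for counts in variable_sites) / genome_length

# Hypothetical example: one polymorphic site (5 reads A, 5 reads G) in a
# 10-bp region; all other sites are invariant.
example_sites = [[5, 0, 5, 0]]
print(round(site_pi(example_sites[0]), 4), round(mean_pi(example_sites, 10), 4))
```

A population with more polymorphic sites, or with more evenly balanced alleles at those sites, yields a higher π, which is the sense in which the abstract compares groups.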
5
iVirus 2.0: Cyberinfrastructure-supported tools and data to power DNA virus ecology. ISME COMMUNICATIONS 2021; 1:77. [PMID: 36765102 PMCID: PMC9723767 DOI: 10.1038/s43705-021-00083-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/20/2021] [Revised: 11/24/2021] [Accepted: 11/29/2021] [Indexed: 11/09/2022]
Abstract
Microbes drive myriad ecosystem processes, but under strong influence from viruses. Because studying viruses in complex systems requires different tools than those for microbes, they remain underexplored. To combat this, we previously aggregated double-stranded DNA (dsDNA) virus analysis capabilities and resources into 'iVirus' on the CyVerse collaborative cyberinfrastructure. Here we substantially expand iVirus's functionality and accessibility, to iVirus 2.0, as follows. First, core iVirus apps were integrated into the Department of Energy's Systems Biology KnowledgeBase (KBase) to provide an additional analytical platform. Second, at CyVerse, 20 software tools (apps) were upgraded or added as new tools and capabilities. Third, nearly 20-fold more sequence reads were aggregated to capture new data and environments. Finally, documentation, as "live" protocols, was updated to maximize user interaction with and contribution to infrastructure development. Together, iVirus 2.0 serves as a uniquely central and accessible analytical platform for studying how viruses, particularly dsDNA viruses, impact diverse microbial ecosystems.
6
Expanding standards in viromics: in silico evaluation of dsDNA viral genome identification, classification, and auxiliary metabolic gene curation. PeerJ 2021; 9:e11447. [PMID: 34178438 PMCID: PMC8210812 DOI: 10.7717/peerj.11447] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 11/27/2020] [Accepted: 04/22/2021] [Indexed: 12/18/2022]
Abstract
BACKGROUND Viruses influence global patterns of microbial diversity and nutrient cycles. Though viral metagenomics (viromics), specifically targeting dsDNA viruses, has been critical for revealing viral roles across diverse ecosystems, its analyses differ in many ways from those used for microbes. To date, viromics benchmarking has covered read pre-processing, assembly, relative abundance, read mapping thresholds and diversity estimation, but other steps would benefit from benchmarking and standardization. Here we use in silico-generated datasets and an extensive literature survey to evaluate and highlight how dataset composition (i.e., viromes vs bulk metagenomes) and assembly fragmentation impact (i) viral contig identification tools, (ii) virus taxonomic classification, and (iii) identification and curation of auxiliary metabolic genes (AMGs). RESULTS The in silico benchmarking of five commonly used virus identification tools shows that gene-content-based tools consistently performed well for long (≥3 kbp) contigs, while k-mer- and blast-based tools were uniquely able to detect viruses from short (≤3 kbp) contigs. Notably, however, the performance increase of k-mer- and blast-based tools for short contigs was obtained at the cost of increased false positives (sometimes up to ∼5% for virome and ∼75% for bulk samples), particularly when eukaryotic or mobile genetic element sequences were included in the test datasets. For viral classification, variously sized genome fragments were assessed using gene-sharing network analytics to quantify drop-offs in taxonomic assignments, which revealed correct assignations ranging from ∼95% (whole genomes) down to ∼80% (3 kbp sized genome fragments). A similar trend was also observed for other viral classification tools such as VPF-class, ViPTree and VIRIDIC, suggesting that caution is warranted when classifying short genome fragments and not full genomes.
Finally, we highlight how fragmented assemblies can lead to erroneous identification of AMGs and outline a best-practices workflow to curate candidate AMGs in viral genomes assembled from metagenomes. CONCLUSION Together, these benchmarking experiments and annotation guidelines should aid researchers seeking to best detect, classify, and characterize the myriad viruses 'hidden' in diverse sequence datasets.
7
Potential virus-mediated nitrogen cycling in oxygen-depleted oceanic waters. THE ISME JOURNAL 2021; 15:981-998. [PMID: 33199808 PMCID: PMC8115048 DOI: 10.1038/s41396-020-00825-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 04/23/2020] [Revised: 09/30/2020] [Accepted: 10/27/2020] [Indexed: 01/29/2023]
Abstract
Viruses play an important role in the ecology and biogeochemistry of marine ecosystems. Beyond mortality and gene transfer, viruses can reprogram microbial metabolism during infection by expressing auxiliary metabolic genes (AMGs) involved in photosynthesis, central carbon metabolism, and nutrient cycling. While previous studies have focused on AMG diversity in the sunlit and dark ocean, less is known about the role of viruses in shaping metabolic networks along redox gradients associated with marine oxygen minimum zones (OMZs). Here, we analyzed relatively quantitative viral metagenomic datasets that profiled the oxygen gradient across Eastern Tropical South Pacific (ETSP) OMZ waters, assessing whether OMZ viruses might impact nitrogen (N) cycling via AMGs. Identified viral genomes encoded six N-cycle AMGs associated with denitrification, nitrification, assimilatory nitrate reduction, and nitrite transport. The majority of these AMGs (80%) were identified in T4-like Myoviridae phages, predicted to infect Cyanobacteria and Proteobacteria, or in unclassified archaeal viruses predicted to infect Thaumarchaeota. Four AMGs were exclusive to anoxic waters and had distributions that paralleled homologous microbial genes. Together, these findings suggest viruses modulate N-cycling processes within the ETSP OMZ and may contribute to nitrogen loss throughout the global oceans, thus providing a baseline for their inclusion in ecosystem and geochemical models.
8
Abstract
Viruses, despite their great abundance and significance in biological systems, remain largely mysterious. Indeed, the vast majority of the perhaps hundreds of millions of viral species on the planet remain undiscovered. Additionally, many viruses deposited in central databases like GenBank and RefSeq are littered with genes annotated as 'hypothetical protein' or the equivalent. Cenote-Taker 2, a virus discovery and annotation tool available on command line and with a graphical user interface with free high-performance computation access, utilizes highly sensitive models of hallmark virus genes to discover familiar or divergent viral sequences from user-input contigs. Additionally, Cenote-Taker 2 uses a flexible set of modules to automatically annotate the sequence features of contigs, providing more gene information than comparable tools. The outputs include readable and interactive genome maps, virome summary tables, and files that can be directly submitted to GenBank. We expect Cenote-Taker 2 to facilitate virus discovery, annotation, and expansion of the known virome.
9
The Gut Virome Database Reveals Age-Dependent Patterns of Virome Diversity in the Human Gut. Cell Host Microbe 2020; 28:724-740.e8. [PMID: 32841606 PMCID: PMC7443397 DOI: 10.1016/j.chom.2020.08.003] [Citation(s) in RCA: 265] [Impact Index Per Article: 66.3] [Received: 06/15/2020] [Revised: 07/14/2020] [Accepted: 08/06/2020] [Indexed: 12/12/2022]
Abstract
The gut microbiome profoundly affects human health and disease, and the viruses that infect gut microbes are likely as important but are often missed because of reference database limitations. Here, we (1) build a human Gut Virome Database (GVD) from 2,697 viral particle or microbial metagenomes from 1,986 individuals representing 16 countries, (2) assess its effectiveness, and (3) report a meta-analysis that reveals age-dependent patterns across healthy Westerners. The GVD contains 33,242 unique viral populations (approximately species-level taxa) and improves average viral detection rates over viral RefSeq and IMG/VR nearly 182-fold and 2.6-fold, respectively. GVD meta-analyses show highly personalized viromes, reveal that inter-study variability from technical artifacts is larger than any "disease" effect at the population level, and document how viral diversity changes from human infancy into senescence. Together, this compact foundational resource, these standardization guidelines, and these meta-analysis findings provide a systematic toolkit to help maximize our understanding of viral roles in health and disease.
10
DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res 2020; 48:8883-8900. [PMID: 32766782 PMCID: PMC7498326 DOI: 10.1093/nar/gkaa621] [Citation(s) in RCA: 299] [Impact Index Per Article: 74.8] [Received: 03/24/2020] [Revised: 06/29/2020] [Accepted: 07/21/2020] [Indexed: 12/20/2022]
Abstract
Microbial and viral communities transform the chemistry of Earth's ecosystems, yet the specific reactions catalyzed by these biological engines are hard to decode due to the absence of a scalable, metabolically resolved, annotation software. Here, we present DRAM (Distilled and Refined Annotation of Metabolism), a framework to translate the deluge of microbiome-based genomic information into a catalog of microbial traits. To demonstrate the applicability of DRAM across metabolically diverse genomes, we evaluated DRAM performance on a defined, in silico soil community and previously published human gut metagenomes. We show that DRAM accurately assigned microbial contributions to geochemical cycles and automated the partitioning of gut microbial carbohydrate metabolism at substrate levels. DRAM-v, the viral mode of DRAM, established rules to identify virally-encoded auxiliary metabolic genes (AMGs), resulting in the metabolic categorization of thousands of putative AMGs from soils and guts. Together DRAM and DRAM-v provide critical metabolic profiling capabilities that decipher mechanisms underpinning microbiome function.
11
The IsoGenie database: an interdisciplinary data management solution for ecosystems biology and environmental research. PeerJ 2020. [DOI: 10.7717/peerj.9467] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
Modern microbial and ecosystem sciences require diverse interdisciplinary teams that are often challenged in “speaking” to one another due to different languages and data product types. Here we introduce the IsoGenie Database (IsoGenieDB; https://isogenie-db.asc.ohio-state.edu/), a de novo developed data management and exploration platform, as a solution to this challenge of accurately representing and integrating heterogeneous environmental and microbial data across ecosystem scales. The IsoGenieDB is a public and private data infrastructure designed to store and query data generated by the IsoGenie Project, a ~10-year DOE-funded project focused on discovering ecosystem climate feedbacks in a thawing permafrost landscape. The IsoGenieDB provides (i) a platform for IsoGenie Project members to explore the project’s interdisciplinary datasets across scales through the inherent relationships among data entities, (ii) a framework to consolidate and harmonize the datasets needed by the team’s modelers, and (iii) a public venue that leverages the same spatially explicit, disciplinarily integrated data structure to share published datasets. The IsoGenieDB is also being expanded to cover the NASA-funded Archaea to Atmosphere (A2A) project, which scales the findings of IsoGenie to a broader suite of Arctic peatlands, via the umbrella A2A Database (A2A-DB). The IsoGenieDB’s expandability and flexible architecture allow it to serve as an example ecosystems database.
12
Abstract
Arctic regions, which are changing rapidly as they warm 2 to 3 times faster than the global average, still retain microbial habitats that serve as natural laboratories for understanding mechanisms of microbial adaptation to extreme conditions. Seawater-derived brines within both sea ice (sea-ice brine) and ancient layers of permafrost (cryopeg brine) support diverse microbes adapted to subzero temperatures and high salinities, yet little is known about viruses in these extreme environments, which, if analogous to other systems, could play important evolutionary and ecosystem roles. Here, we characterized viral communities and their functions in samples of cryopeg brine, sea-ice brine, and melted sea ice. Viral abundance was high in cryopeg brine (1.2 × 10⁸ ml⁻¹) and much lower in sea-ice brine (1.3 × 10⁵ to 2.1 × 10⁵ ml⁻¹), which roughly paralleled the differences in cell concentrations in these samples. Five low-input, quantitative viral metagenomes were sequenced to yield 476 viral populations (i.e., species level; ≥10 kb), only 12% of which could be assigned taxonomy by traditional database approaches, indicating a high degree of novelty. Additional analyses revealed that these viruses: (i) formed communities that differed between sample type and vertically with sea-ice depth; (ii) infected hosts that dominated these extreme ecosystems, including Marinobacter, Glaciecola, and Colwellia; and (iii) encoded fatty acid desaturase (FAD) genes that likely helped their hosts overcome cold and salt stress during infection, as well as mediated horizontal gene transfer of FAD genes between microbes.
Together, these findings contribute to understanding viral abundances and communities and how viruses impact their microbial hosts in subzero brines and sea ice. IMPORTANCE This study explores viral community structure and function in remote and extreme Arctic environments, including subzero brines within marine layers of permafrost and sea ice, using a modern viral ecogenomics toolkit for the first time. In addition to providing foundational data sets for these climate-threatened habitats, we found evidence that the viruses had habitat specificity, infected dominant microbial hosts, encoded host-derived metabolic genes, and mediated horizontal gene transfer among hosts. These results advance our understanding of the virosphere and how viruses influence extreme ecosystems. More broadly, the evidence that virally mediated gene transfers may be limited by host range in these extreme habitats contributes to a mechanistic understanding of genetic exchange among microbes under stressful conditions in other systems.
13
Biotic and Environmental Drivers of Plant Microbiomes Across a Permafrost Thaw Gradient. Front Microbiol 2020; 11:796. [PMID: 32499761 PMCID: PMC7243355 DOI: 10.3389/fmicb.2020.00796] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/15/2019] [Accepted: 04/03/2020] [Indexed: 02/01/2023]
Abstract
Plant-associated microbiomes are structured by environmental conditions and plant associates, both of which are being altered by climate change. The future structure of plant microbiomes will depend on the largely unknown relative importance of each. This uncertainty is particularly relevant for arctic peatlands, which are undergoing large shifts in plant communities and soil microbiomes as permafrost thaws, and are potentially appreciable sources of climate change feedbacks due to their soil carbon (C) storage. We characterized phyllosphere and rhizosphere microbiomes of six plant species, and bulk peat, across a permafrost thaw progression (from intact permafrost, to partially- and fully-thawed stages) via 16S rRNA gene amplicon sequencing. We tested the hypothesis that the relative influence of biotic versus environmental filtering (the role of plant species versus thaw-defined habitat) in structuring microbial communities would differ among phyllosphere, rhizosphere, and bulk peat. Using both abundance- and phylogenetic-based approaches, we found that phyllosphere microbial composition was more strongly explained by plant associate, with little influence of habitat, whereas in the rhizosphere, plant and habitat had similar influence. Network-based community analyses showed that keystone taxa exhibited similar patterns with stronger responses to drivers. However, plant associates appeared to have a larger influence on organisms belonging to families associated with methane-cycling than the bulk community. Putative methanogens were more strongly influenced by plant than habitat in the rhizosphere, and in the phyllosphere putative methanotrophs were more strongly influenced by plant than was the community at large. We conclude that biotic effects can be stronger than environmental filtering, but their relative importance varies among microbial groups. For most microbes in this system, biotic filtering was stronger aboveground than belowground.
However, for putative methane-cyclers, plant associations have a stronger influence on community composition than environment despite major hydrological changes with thaw. This suggests that plant successional dynamics may be as important as hydrological changes in determining microbial relevance to C-cycling climate feedbacks. By partitioning the degree to which plant versus environmental filtering drives microbiome composition and function, we can improve our ability to predict the consequences of warming for C-cycling in other arctic areas undergoing similar permafrost thaw transitions.
14
Towards optimized viral metagenomes for double-stranded and single-stranded DNA viruses from challenging soils. PeerJ 2019; 7:e7265. [PMID: 31309007 PMCID: PMC6612421 DOI: 10.7717/peerj.7265] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 03/21/2019] [Accepted: 06/07/2019] [Indexed: 11/29/2022]
Abstract
Soils impact global carbon cycling and their resident microbes are critical to their biogeochemical processing and ecosystem outputs. Based on studies in marine systems, viruses infecting soil microbes likely modulate host activities via mortality, horizontal gene transfer, and metabolic control. However, their roles remain largely unexplored due to technical challenges with separating, isolating, and extracting DNA from viruses in soils. Some of these challenges have been overcome by using whole genome amplification methods and while these have allowed insights into the identities of soil viruses and their genomes, their inherent biases have prevented meaningful ecological interpretations. Here we experimentally optimized steps for generating quantitatively-amplified viral metagenomes to better capture both ssDNA and dsDNA viruses across three distinct soil habitats along a permafrost thaw gradient. First, we assessed differing DNA extraction methods (PowerSoil, Wizard mini columns, and cetyl trimethylammonium bromide) for quantity and quality of viral DNA. This established PowerSoil as best for yield and quality of DNA from our samples, though ∼1/3 of the viral populations captured by each extraction kit were unique, suggesting appreciable differential biases among DNA extraction kits. Second, we evaluated the impact of purifying viral particles after resuspension (by cesium chloride gradients; CsCl) and of viral lysis method (heat vs bead-beating) on the resultant viromes. DNA yields after CsCl particle-purification were largely non-detectable, while unpurified samples yielded 1–2-fold more DNA after lysis by heat than by bead-beating. Virome quality was assessed by the number and size of metagenome-assembled viral contigs, which showed no increase after CsCl-purification, but did from heat lysis relative to bead-beating. We also evaluated sample preparation protocols for ssDNA virus recovery.
In both CsCl-purified and non-purified samples, ssDNA viruses were successfully recovered by using the Accel-NGS 1S Plus Library Kit. While ssDNA viruses were identified in all three soil types, none were identified in the samples that used bead-beating, suggesting this lysis method may impact recovery. Further, 13 ssDNA vOTUs were identified compared to 582 dsDNA vOTUs, and the ssDNA vOTUs only accounted for ∼4% of the assembled reads, implying dsDNA viruses were dominant in these samples. This optimized approach was combined with the previously published viral resuspension protocol into a sample-to-virome protocol for soils now available at protocols.io, where community feedback creates ‘living’ protocols. This collective approach will be particularly valuable given the high physicochemical variability of soils, which may require considerable soil type-specific optimization. This optimized protocol provides a starting place for developing quantitatively-amplified viromic datasets and will help enable viral ecogenomic studies on organic-rich soils.
15
Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nat Biotechnol 2019; 37:632-639. [PMID: 31061483 DOI: 10.1038/s41587-019-0100-8] [Citation(s) in RCA: 395] [Impact Index Per Article: 79.0] [Received: 07/25/2018] [Accepted: 03/11/2019] [Indexed: 01/03/2023]
Abstract
Microbiomes from every environment contain a myriad of uncultivated archaeal and bacterial viruses, but studying these viruses is hampered by the lack of a universal, scalable taxonomic framework. We present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.
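The abstract does not say how gene-sharing profiles are scored. One common way to weight an edge in a gene-sharing network is to ask how unlikely the observed number of shared protein clusters would be by chance, using a hypergeometric tail probability. The sketch below illustrates that idea under this assumption; the function name, parameters, and numbers are hypothetical and are not taken from vConTACT v.2.0.

```python
# Minimal sketch (assumed scoring scheme, not vConTACT's code): hypergeometric
# probability that two genomes share at least `shared` protein clusters (PCs).
from math import comb

def shared_pc_pvalue(total_pcs, pcs_a, pcs_b, shared):
    """P(X >= shared) when genome B draws pcs_b PCs at random from total_pcs,
    of which pcs_a belong to genome A."""
    upper = min(pcs_a, pcs_b)
    tail = sum(
        comb(pcs_a, i) * comb(total_pcs - pcs_a, pcs_b - i)
        for i in range(shared, upper + 1)
    )
    return tail / comb(total_pcs, pcs_b)

# Hypothetical example: 1,000 PCs in the dataset, two genomes with 50 and 40
# PCs that share 10 of them.
p = shared_pc_pvalue(1000, 50, 40, 10)
print(p)
```

A small probability suggests the two genomes share more gene content than expected by chance, so the pair would receive a strong edge before any clustering step.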
16
Discovery and ecogenomic context of a global Caldiserica-related phylum active in thawing permafrost, Candidatus Cryosericota phylum nov., Ca. Cryosericia class nov., Ca. Cryosericales ord. nov., Ca. Cryosericaceae fam. nov., comprising the four species Cryosericum septentrionale gen. nov. sp. nov., Ca. C. hinesii sp. nov., Ca. C. odellii sp. nov., Ca. C. terrychapinii sp. nov. Syst Appl Microbiol 2019; 42:54-66. [DOI: 10.1016/j.syapm.2018.12.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/30/2018] [Revised: 12/05/2018] [Accepted: 12/05/2018] [Indexed: 10/27/2022]
17
Soil Viruses Are Underexplored Players in Ecosystem Carbon Processing. mSystems 2018; 3:e00076-18. [PMID: 30320215 PMCID: PMC6172770 DOI: 10.1128/msystems.00076-18] [Citation(s) in RCA: 127] [Impact Index Per Article: 21.2] [Received: 05/27/2018] [Accepted: 08/24/2018] [Indexed: 01/10/2023]
Abstract
Rapidly thawing permafrost harbors ∼30 to 50% of global soil carbon, and the fate of this carbon remains unknown. Microorganisms will play a central role in its fate, and their viruses could modulate that impact via induced mortality and metabolic controls. Because of the challenges of recovering viruses from soils, little is known about soil viruses or their role(s) in microbial biogeochemical cycling. Here, we describe 53 viral populations (viral operational taxonomic units [vOTUs]) recovered from seven quantitatively derived (i.e., not multiple-displacement-amplified) viral-particle metagenomes (viromes) along a permafrost thaw gradient at the Stordalen Mire field site in northern Sweden. Only 15% of these vOTUs had genetic similarity to publicly available viruses in the RefSeq database, and ∼30% of the genes could be annotated, supporting the concept of soils as reservoirs of substantial undescribed viral genetic diversity. The vOTUs exhibited distinct ecology, with different distributions along the thaw gradient habitats, and a shift from soil-virus-like assemblages in the dry palsas to aquatic-virus-like assemblages in the inundated fen. Seventeen vOTUs were linked to microbial hosts (in silico), implicating viruses in infecting abundant microbial lineages from Acidobacteria, Verrucomicrobia, and Deltaproteobacteria, including those encoding key biogeochemical functions such as organic matter degradation. Thirty auxiliary metabolic genes (AMGs) were identified and suggested virus-mediated modulation of central carbon metabolism, soil organic matter degradation, polysaccharide binding, and regulation of sporulation. Together, these findings suggest that these soil viruses have distinct ecology, impact host-mediated biogeochemistry, and likely impact ecosystem function in the rapidly changing Arctic. 
IMPORTANCE This work is part of a 10-year project to examine thawing permafrost peatlands and is the first virome-particle-based approach to characterize viruses in these systems. This method yielded >2-fold-more viral populations (vOTUs) per gigabase of metagenome than vOTUs derived from bulk-soil metagenomes from the same site (J. B. Emerson, S. Roux, J. R. Brum, B. Bolduc, et al., Nat Microbiol 3:870-880, 2018, https://doi.org/10.1038/s41564-018-0190-y). We compared the ecology of the recovered vOTUs along a permafrost thaw gradient and found (i) habitat specificity, (ii) a shift in viral community identity from soil-like to aquatic-like viruses, (iii) infection of dominant microbial hosts, and (iv) carriage of host metabolic genes. These vOTUs can impact ecosystem carbon processing via top-down (inferred from lysing dominant microbial hosts) and bottom-up (inferred from carriage of auxiliary metabolic genes) controls. This work serves as a foundation which future studies can build upon to increase our understanding of the soil virosphere and how viruses affect soil ecosystem services.
|
18
|
Single-virus genomics reveals hidden cosmopolitan and abundant viruses. Nat Commun 2017; 8:15892. [PMID: 28643787 PMCID: PMC5490008 DOI: 10.1038/ncomms15892] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Received: 12/19/2016] [Accepted: 05/10/2017] [Indexed: 12/22/2022]
Abstract
Microbes drive ecosystems under constraints imposed by viruses. However, a lack of virus genome information hinders our ability to answer fundamental biological questions concerning microbial communities. Here we apply single-virus genomics (SVGs) to assess whether portions of marine viral communities are missed by current techniques. Most of the 44 viral single-amplified genomes (vSAGs) identified here are more abundant in global ocean virome data sets than published metagenome-assembled viral genomes or isolates. This indicates that vSAGs likely best represent the dsDNA viral populations dominating the oceans. Species-specific recruitment patterns and virome simulation data suggest that vSAGs are highly microdiverse and that microdiversity hinders metagenomic assembly, which could explain why their genomes have not been identified before. Altogether, SVGs enable the discovery of what are likely some of the most abundant and ecologically relevant marine viral species, such as vSAG 37-F6, which were overlooked by other methodologies.
|
19
|
vConTACT: an iVirus tool to classify double-stranded DNA viruses that infect Archaea and Bacteria. PeerJ 2017; 5:e3243. [PMID: 28480138 PMCID: PMC5419219 DOI: 10.7717/peerj.3243] [Citation(s) in RCA: 161] [Impact Index Per Article: 23.0] [Received: 01/03/2017] [Accepted: 03/28/2017] [Indexed: 12/15/2022]
Abstract
Taxonomic classification of archaeal and bacterial viruses is challenging, yet also fundamental for developing a predictive understanding of microbial ecosystems. Recent identification of hundreds of thousands of new viral genomes and genome fragments, whose hosts remain unknown, requires a paradigm shift away from traditional classification approaches and towards the use of genomes for taxonomy. Here we revisited the use of genomes and their protein content as a means for developing a viral taxonomy for bacterial and archaeal viruses. A network-based analytic was evaluated and benchmarked against authority-accepted taxonomic assignments and found to be largely concordant. Exceptions were manually examined and found to represent areas of viral genome 'sequence space' that are under-sampled or prone to excessive genetic exchange. While both cases are poorly resolved by genome-based taxonomic approaches, the former will improve as viral sequence space is better sampled and the latter are uncommon. Finally, given the largely robust taxonomic capabilities of this approach, we sought to enable researchers to easily and systematically classify new viruses. Thus, we established a tool, vConTACT, as an app at iVirus, where it operates as a fast, highly scalable, user-friendly app within the free and powerful CyVerse cyberinfrastructure.
|
20
|
Comparative Metagenomics of Eight Geographically Remote Terrestrial Hot Springs. MICROBIAL ECOLOGY 2015; 70:411-424. [PMID: 25712554 DOI: 10.1007/s00248-015-0576-9] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 10/28/2014] [Accepted: 01/23/2015] [Indexed: 06/04/2023]
Abstract
Hot springs are natural habitats for thermophilic Archaea and Bacteria. In this paper, we present the metagenomic analysis of eight globally distributed terrestrial hot springs from China, Iceland, Italy, Russia, and the USA with a temperature range between 61 and 92 °C and pH between 1.8 and 7. A comparison of the biodiversity and community composition generally showed a decrease in biodiversity with increasing temperature and decreasing pH. Another important factor shaping microbial diversity of the studied sites was the abundance of organic substrates. Several species of the Crenarchaeal order Thermoprotei were detected, whereas no single bacterial species was found in all samples, suggesting a better adaptation of certain archaeal species to different thermophilic environments. Two hot springs show high abundance of Acidithiobacillus, supporting the idea of a true thermophilic Acidithiobacillus species that can thrive in hyperthermophilic environments. Depending on the sample, up to 58% of sequencing reads could not be assigned to a known phylum, reinforcing the fact that a large number of microorganisms in nature, including those thriving in hot environments, remain to be isolated and characterized.
|
21
|
Viral assemblage composition in Yellowstone acidic hot springs assessed by network analysis. ISME JOURNAL 2015; 9:2162-77. [PMID: 26125684 DOI: 10.1038/ismej.2015.28] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 06/16/2014] [Revised: 12/29/2014] [Accepted: 01/12/2015] [Indexed: 01/31/2023]
Abstract
Understanding of viral assemblage structure in natural environments remains a daunting task. Total viral assemblage sequencing (for example, viral metagenomics) provides a tractable approach. However, even with the availability of next-generation sequencing technology it is usually only possible to obtain a fragmented view of viral assemblages in natural ecosystems. In this study, we applied a network-based approach in combination with viral metagenomics to investigate viral assemblage structure in the high temperature, acidic hot springs of Yellowstone National Park, USA. Our results show that this approach can identify distinct viral groups and provide insights into the viral assemblage structure. We identified 110 viral groups in the hot springs environment, with each viral group likely representing a viral family at the sub-family taxonomic level. Most of these viral groups are previously unknown DNA viruses likely infecting archaeal hosts. Overall, this study demonstrates the utility of combining viral assemblage sequencing approaches with network analysis to gain insights into viral assemblage structure in natural ecosystems.
|
22
|
40 Years of archaeal virology: Expanding viral diversity. Virology 2015; 479-480:369-78. [PMID: 25866378 DOI: 10.1016/j.virol.2015.03.031] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 12/19/2014] [Revised: 02/07/2015] [Accepted: 03/17/2015] [Indexed: 10/23/2022]
Abstract
The first archaeal virus was isolated over 40 years ago, prior to the recognition of the three-domain structure of life. In the ensuing years, our knowledge of Archaea and their viruses has increased, but they still remain the most mysterious of life's three domains. Currently, over 100 archaeal viruses have been discovered, but few have been described in biochemical or structural detail. However, those that have been characterized have revealed a new world of structural, biochemical and genetic diversity. Several model systems for studying archaeal virus-host interactions have been developed, revealing evolutionary linkages between viruses infecting the three domains of life, new viral lysis systems, and unusual features of host-virus interactions. It is likely that the study of archaeal viruses will continue to provide fertile ground for fundamental discoveries in virus diversity, structure and function.
|
23
|
Abstract
The Archaea, and their viruses, remain the most enigmatic of life's three domains. Once thought to inhabit only extreme environments, archaea are now known to inhabit diverse environments. Even though the first archaeal virus was described over 40 years ago, only 117 archaeal viruses have been discovered to date. Despite this small number, these viruses have painted a portrait of enormous morphological and genetic diversity. For example, research centered around the various steps of the archaeal virus life cycle has led to the discovery of unique mechanisms employed by archaeal viruses during replication, maturation, and virion release. In many instances, archaeal virus proteins display very low levels of sequence homology to other proteins listed in the public database, and therefore, structural characterization of these proteins has played an integral role in functional assignment. These structural studies have not only provided insights into structure-function relationships but have also identified links between viruses across all three domains of life.
|
24
|
A target-unrelated peptide in an M13 phage display library traced to an advantageous mutation in the gene II ribosome-binding site. Anal Biochem 2007; 373:88-98. [PMID: 17976366 DOI: 10.1016/j.ab.2007.10.015] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Received: 07/12/2007] [Revised: 10/03/2007] [Accepted: 10/09/2007] [Indexed: 10/22/2022]
Abstract
Screening of the commercially available Ph.D.-7 phage-displayed heptapeptide library for peptides that bind immobilized Zn2+ resulted in the repeated selection of the peptide HAIYPRH, although binding assays indicated that HAIYPRH is not a zinc-binding peptide. HAIYPRH has also been selected in several other laboratories using completely different targets, and its ubiquity suggests that it is a target-unrelated peptide. We demonstrated that phage displaying HAIYPRH are enriched after serial amplification of the library without exposure to target. The amplification of phage displaying HAIYPRH was found to be dramatically faster than that of the library itself. DNA sequencing uncovered a mutation in the Shine-Dalgarno (SD) sequence for gIIp, a protein involved in phage replication, imparting to the SD sequence better complementarity to the 16S ribosomal RNA (rRNA). Introducing this mutation into phage lacking a displayed peptide resulted in accelerated propagation, whereas phage displaying HAIYPRH with a wild-type SD sequence were found to amplify normally. The SD mutation may alter gIIp expression and, consequently, the rate of propagation of phage. In the Ph.D.-7 library, the mutation is coincident with the displayed peptide HAIYPRH, accounting for the target-unrelated selection of this peptide in multiple reported panning experiments.
|