1
Shi Z, Long X, Zhang C, Chen Z, Usman M, Zhang Y, Zhang S, Luo G. Viral and Bacterial Community Dynamics in Food Waste and Digestate from Full-Scale Biogas Plants. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:13010-13022. [PMID: 38989650 DOI: 10.1021/acs.est.4c04109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Anaerobic digestion (AD) is commonly used in food waste treatment. Prokaryotic microbial communities in AD of food waste have been comprehensively studied. The role of viruses, known to affect microbial dynamics and metabolism, remains largely unexplored. This study employed metagenomic analysis and recovered 967 high-quality viral bins within food waste and digestate derived from 8 full-scale biogas plants. The diversity of viral communities was higher in digestate. In silico predictions linked 20.8% of viruses to microbial host populations, highlighting possible virus predators of key functional microbes. Lineage-specific virus-host ratio varied, indicating that viral infection dynamics might differentially affect microbial responses to the varying process parameters. Evidence for virus-mediated gene transfer was identified, emphasizing the potential role of viruses in controlling the microbiome. AD altered the specific process parameters, potentially promoting a shift in viral lifestyle from lysogenic to lytic. Viruses encoding auxiliary metabolic genes (AMGs) were involved in microbial carbon and nutrient cycling, and most AMGs were transcriptionally expressed in digestate, meaning that viruses with active functional states were likely actively involved in AD. These findings provided a comprehensive profile of viral and bacterial communities and expanded knowledge of the interactions between viruses and hosts in food waste and digestate.
Affiliation(s)
- Zhijian Shi
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
- Xinyi Long
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
- Chao Zhang
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
- Zheng Chen
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
- Muhammad Usman
  - Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada
- Yalei Zhang
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - State Key Laboratory of Pollution Control and Resources Reuse, College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
- Shicheng Zhang
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - Shanghai Technical Service Platform for Pollution Control and Resource Utilization of Organic Wastes, Shanghai 200438, China
- Gang Luo
  - Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - Shanghai Technical Service Platform for Pollution Control and Resource Utilization of Organic Wastes, Shanghai 200438, China
2
Liu X, Liu Y, Liu J, Zhang H, Shan C, Guo Y, Gong X, Cui M, Li X, Tang M. Correlation between the gut microbiome and neurodegenerative diseases: a review of metagenomics evidence. Neural Regen Res 2024; 19:833-845. [PMID: 37843219 PMCID: PMC10664138 DOI: 10.4103/1673-5374.382223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/19/2023] [Accepted: 06/17/2023] [Indexed: 10/17/2023] Open
Abstract
A growing body of evidence suggests that the gut microbiota contributes to the development of neurodegenerative diseases via the microbiota-gut-brain axis. As a contributing factor, microbiota dysbiosis always occurs in pathological changes of neurodegenerative diseases, such as Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis. High-throughput sequencing technology has helped to reveal that the bidirectional communication between the central nervous system and the enteric nervous system is facilitated by the microbiota's diverse microorganisms and involves both the neuroimmune and neuroendocrine systems. Here, we summarize the bioinformatics analysis and wet-biology validation for gut metagenomics in neurodegenerative diseases, with an emphasis on multi-omics studies and the gut virome. The pathogen-associated signaling biomarkers for identifying brain disorders and potential therapeutic targets are also elucidated. Finally, we discuss the role of diet, prebiotics, probiotics, postbiotics and exercise interventions in remodeling the microbiome and reducing the symptoms of neurodegenerative diseases.
Affiliation(s)
- Xiaoyan Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yi Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
  - Institute of Animal Husbandry, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu Province, China
- Junlin Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Hantao Zhang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Chaofan Shan
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yinglu Guo
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Xun Gong
  - Department of Rheumatology & Immunology, Affiliated Hospital of Jiangsu University, Zhenjiang, Jiangsu Province, China
- Mengmeng Cui
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Xiubin Li
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
3
Moubset O, Filloux D, Fontes H, Julian C, Fernandez E, Galzi S, Blondin L, Chehida SB, Lett JM, Mesléard F, Kraberger S, Custer JM, Salywon A, Makings E, Marais A, Chiroleu F, Lefeuvre P, Martin DP, Candresse T, Varsani A, Ravigné V, Roumagnac P. Virome release of an invasive exotic plant species in southern France. Virus Evol 2024; 10:veae025. [PMID: 38566975 PMCID: PMC10986800 DOI: 10.1093/ve/veae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/27/2024] [Accepted: 03/06/2024] [Indexed: 04/04/2024] Open
Abstract
The increase in human-mediated introduction of plant species to new regions has resulted in a rise of invasive exotic plant species (IEPS) that has had significant effects on biodiversity and ecosystem processes. One commonly accepted mechanism of invasions is that proposed by the enemy release hypothesis (ERH), which states that IEPS free from their native herbivores and natural enemies in new environments can outcompete indigenous species and become invasive. We here propose the virome release hypothesis (VRH) as a virus-centered variant of the conventional ERH that is only focused on enemies. The VRH predicts that vertically transmitted plant-associated viruses (PAV, encompassing phytoviruses and mycoviruses) should be co-introduced during the dissemination of the IEPS, while horizontally transmitted PAV of IEPS should be left behind or should not be locally transmitted in the introduced area due to a maladaptation of local vectors. To document the VRH, virome richness and composition as well as PAV prevalence, co-infection, host range, and transmission modes were compared between indigenous plant species and an invasive grass, cane bluestem (Bothriochloa barbinodis), in both its introduced range (southern France) and one area of its native range (Sonoran Desert, Arizona, USA). Contrary to the VRH, we show that invasive populations of B. barbinodis in France were not associated with a lower PAV prevalence or richness than native populations of B. barbinodis from the USA. However, comparison of virome compositions and network analyses further revealed more diverse and complex plant-virus interactions in the French ecosystem, with a significant richness of mycoviruses. Setting mycoviruses apart, only one putatively vertically transmitted phytovirus (belonging to the Amalgaviridae family) and one putatively horizontally transmitted phytovirus (belonging to the Geminiviridae family) were identified from B. barbinodis plants in the introduced area. 
Collectively, these characteristics of the B. barbinodis-associated PAV community in southern France suggest that a virome release phase may have immediately followed the introduction of B. barbinodis to France in the 1960s or 1970s, and that, since then, the invasive populations of this IEPS have already transitioned out of this virome release phase, and have started interacting with several local mycoviruses and a few local plant viruses.
Affiliation(s)
- Oumaima Moubset
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Denis Filloux
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Hugo Fontes
  - Tour du Valat, Institut de recherche pour la conservation des zones humides méditerranéennes, Le Sambuc, Arles 13200, France
  - Institut Méditerranéen de Biodiversité et Ecologie, UMR CNRS-IRD, Avignon Université, Aix-Marseille Université, IUT d’Avignon, Avignon 84911, France
- Charlotte Julian
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Emmanuel Fernandez
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Serge Galzi
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Laurence Blondin
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- François Mesléard
  - Tour du Valat, Institut de recherche pour la conservation des zones humides méditerranéennes, Le Sambuc, Arles 13200, France
  - Institut Méditerranéen de Biodiversité et Ecologie, UMR CNRS-IRD, Avignon Université, Aix-Marseille Université, IUT d’Avignon, Avignon 84911, France
- Simona Kraberger
  - The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Joy M Custer
  - The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Andrew Salywon
  - Department of Research, Conservation and Collections, Desert Botanical Garden, Phoenix, AZ 85008, USA
- Elizabeth Makings
  - Vascular Plant Herbarium, School of Life Sciences, Arizona State University, 734 West Alameda Drive, Tempe, AZ 85282, USA
- Armelle Marais
  - UMR BFP, University Bordeaux, INRAE, Villenave d’Ornon 33140, France
- Darren P Martin
  - Division of Computational Biology, Department of Integrative Biomedical Sciences, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Anzio Rd, Cape Town 7925, South Africa
- Thierry Candresse
  - UMR BFP, University Bordeaux, INRAE, Villenave d’Ornon 33140, France
- Arvind Varsani
  - The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
  - Structural Biology Research Unit, Department of Integrative Biomedical Sciences, University of Cape Town, Observatory, Cape Town 7700, South Africa
- Virginie Ravigné
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
- Philippe Roumagnac
  - UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France
  - PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
4
Gong C, Chakraborty D, Koudelka GB. A prophage encoded ribosomal RNA methyltransferase regulates the virulence of Shiga-toxin-producing Escherichia coli (STEC). Nucleic Acids Res 2024; 52:856-871. [PMID: 38084890 PMCID: PMC10810198 DOI: 10.1093/nar/gkad1150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 01/26/2024] Open
Abstract
Shiga toxin (Stx) released by Shiga toxin-producing Escherichia coli (STEC) causes life-threatening illness. Its production and release require induction of Stx-encoding prophage resident within the STEC genome. We identified two different STEC strains, PA2 and PA8, bearing Stx-encoding prophages whose sequences primarily differ by the position of an IS629 insertion element, yet the strains differ in their abilities to kill eukaryotic cells and their prophages differ in their spontaneous induction frequencies. The IS629 element in ϕPA2 disrupts an ORF predicted to encode a DNA adenine methyltransferase, whereas in ϕPA8, this element lies in an intergenic region. Introducing a plasmid expressing the methyltransferase gene product into ϕPA2-bearing strains increases both the prophage spontaneous induction frequency and virulence to those exhibited by ϕPA8-bearing strains. However, a plasmid bearing mutations predicted to disrupt the putative active site of the methyltransferase does not complement either of these defects. When complexed with a second protein, the methyltransferase holoenzyme preferentially uses 16S rRNA as a substrate. The second subunit is responsible for directing the preferential methylation of rRNA. Together these findings reveal a previously unrecognized role for rRNA methylation in regulating induction of Stx-encoding prophage.
Affiliation(s)
- Chen Gong
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY 14260, USA
- Gerald B Koudelka
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY 14260, USA
5
Brait N, Hackl T, Morel C, Exbrayat A, Gutierrez S, Lequime S. A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data. Virus Evol 2023; 10:vead088. [PMID: 38516656 PMCID: PMC10956553 DOI: 10.1093/ve/vead088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 11/24/2023] [Accepted: 12/22/2023] [Indexed: 03/23/2024] Open
Abstract
Large-scale metagenomic and -transcriptomic studies have revolutionized our understanding of viral diversity and abundance. In contrast, endogenous viral elements (EVEs), remnants of viral sequences integrated into host genomes, have received limited attention in the context of virus discovery, especially in RNA-Seq data. EVEs resemble their original viruses, which makes distinguishing between active infections and integrated remnants difficult, affects virus classification, and biases downstream analyses. Here, we systematically assess the effects of EVEs on a prototypical virus discovery pipeline, evaluate their impact on data integrity and classification accuracy, and provide some recommendations for better practices. We examined EVEs and exogenous viral sequences linked to Orthomyxoviridae, a diverse family of negative-sense segmented RNA viruses, in 13 genomic and 538 transcriptomic datasets of Culicinae mosquitoes. Our analysis revealed a substantial number of viral sequences in transcriptomic datasets. However, a significant portion appeared not to be exogenous viruses but transcripts derived from EVEs. Distinguishing between transcribed EVEs and exogenous virus sequences was especially difficult in samples with low viral abundance. For example, three transcribed EVEs showed full-length segments, devoid of frameshift and nonsense mutations, exhibiting sufficient mean read depths that qualify them as exogenous virus hits. Mapping reads on a host genome containing EVEs before assembly somewhat alleviated the EVE burden, but it led to a drastic reduction of viral hits and reduced quality of assemblies, especially in regions of the viral genome relatively similar to EVEs. Our study highlights that our knowledge of the genetic diversity of viruses can be altered by the underestimated presence of EVEs in transcriptomic datasets, leading to false positives and altered or missing sequence information.
Thus, recognizing and addressing the influence of EVEs in virus discovery pipelines will be key in enhancing our ability to capture the full spectrum of viral diversity.
Affiliation(s)
- Nadja Brait
  - Cluster of Microbial Ecology, Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen 9747 AG, The Netherlands
- Côme Morel
  - ASTRE research unit, Cirad, INRAe, Université de Montpellier, Montpellier 34398, France
- Antoni Exbrayat
  - ASTRE research unit, Cirad, INRAe, Université de Montpellier, Montpellier 34398, France
- Serafin Gutierrez
  - ASTRE research unit, Cirad, INRAe, Université de Montpellier, Montpellier 34398, France
- Sebastian Lequime
  - Cluster of Microbial Ecology, Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen 9747 AG, The Netherlands
6
Du Y, Fuhrman JA, Sun F. ViralCC retrieves complete viral genomes and virus-host pairs from metagenomic Hi-C data. Nat Commun 2023; 14:502. [PMID: 36720887 PMCID: PMC9889337 DOI: 10.1038/s41467-023-35945-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/22/2022] [Accepted: 01/09/2023] [Indexed: 02/01/2023] Open
Abstract
The introduction of high-throughput chromosome conformation capture (Hi-C) into metagenomics enables reconstructing high-quality metagenome-assembled genomes (MAGs) from microbial communities. Despite recent advances in recovering eukaryotic, bacterial, and archaeal genomes using Hi-C contact maps, few Hi-C-based methods are designed to retrieve viral genomes. Here we introduce ViralCC, a publicly available tool to recover complete viral genomes and detect virus-host pairs using Hi-C data. Compared to other Hi-C-based methods, ViralCC leverages the virus-host proximity structure as a complementary information source for the Hi-C interactions. Using mock and real metagenomic Hi-C datasets from several different microbial ecosystems, including human gut, cow fecal, and wastewater samples, we demonstrate that ViralCC outperforms existing Hi-C-based binning methods as well as state-of-the-art tools specifically dedicated to metagenomic viral binning. ViralCC can also reveal the taxonomic structure of viruses and virus-host pairs in microbial communities. When applied to a real wastewater metagenomic Hi-C dataset, ViralCC constructs a phage-host network, which is further validated using CRISPR spacer analyses. ViralCC is an open-source pipeline available at https://github.com/dyxstat/ViralCC.
Affiliation(s)
- Yuxuan Du
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Jed A Fuhrman
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Fengzhu Sun
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
7
Tithi SS, Aylward FO, Jensen RV, Zhang L. FastViromeExplorer-Novel: Recovering Draft Genomes of Novel Viruses and Phages in Metagenomic Data. JOURNAL OF COMPUTATIONAL BIOLOGY : A JOURNAL OF COMPUTATIONAL MOLECULAR CELL BIOLOGY 2023; 30:391-408. [PMID: 36607772 DOI: 10.1089/cmb.2022.0397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
Despite the recent surge of viral metagenomic studies, recovering complete virus/phage genomes from metagenomic data is still extremely difficult, and most viral contigs generated from de novo assembly programs are highly fragmented, posing serious challenges to downstream analysis and inference. In this study, we develop FastViromeExplorer (FVE)-novel, a computational pipeline for reconstructing complete or near-complete viral draft genomes from metagenomic data. FVE-novel deploys FVE to efficiently map metagenomic reads to viral reference genomes, performs de novo assembly of the mapped reads to generate contigs, and extends the contigs through iterative assembly to produce final viral scaffolds. We applied FVE-novel to an ocean metagenomic sample and obtained 268 viral scaffolds that potentially come from novel viruses. Through manual examination and validation of the 10 longest scaffolds, we successfully recovered 4 complete viral genomes: 2 are novel, as they cannot be found in the existing databases, and the other 2 are related to known phages. The hybrid reference-based and de novo assembly approach used by FVE-novel represents a powerful new approach for uncovering near-complete viral genomes in metagenomic data.
Affiliation(s)
- Frank O Aylward
  - Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, USA
- Roderick V Jensen
  - Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, USA
- Liqing Zhang
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, USA
8
Zhao J, Wang Z, Li C, Shi T, Liang Y, Jiao N, Zhang Y. Significant Differences in Planktonic Virus Communities Between "Cellular Fraction" (0.22 ~ 3.0 µm) and "Viral Fraction" (< 0.22 μm) in the Ocean. MICROBIAL ECOLOGY 2022:10.1007/s00248-022-02167-6. [PMID: 36585490 DOI: 10.1007/s00248-022-02167-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/01/2022] [Accepted: 12/26/2022] [Indexed: 06/17/2023]
Abstract
Compared to free-living viruses (< 0.22 μm) in the ocean, planktonic viruses in the "cellular fraction" (0.22 ~ 3.0 μm) are far less well understood, and the differences between them remain largely unexplored. Here, we revealed that even in the same seawater samples, the "cellular fraction" comprised significantly distinct virus communities from the free virioplankton, with only 13.87% overlap in viral contigs at the species level. Compared to the viral genomes deposited in the NCBI RefSeq database, 99% of the assembled viral genomes in the "cellular fraction" represented novel genera. Notably, the assembled (near-) complete viral genomes within the "cellular fraction" were significantly larger than those in the "viral fraction," and the "cellular fraction" contained three times more species of giant viruses or jumbo phages with genomes > 200 kb than the "viral fraction." The longest complete genomes of jumbo phage (~ 252 kb) and giant virus (~ 716 kb) were both detected only in the "cellular fraction." Moreover, a relatively higher proportion of proviruses was predicted within the "cellular fraction" than in the "viral fraction." Besides the substantial divergence in viral community structure, the different fractions also contained their own unique viral auxiliary metabolic genes; e.g., those potentially participating in inorganic carbon fixation in the deep sea were detected only in the "cellular-fraction" viromes. In addition, there was a considerable divergence in the community structure of both "cellular fraction" and "viral fraction" viromes between the surface and deep-sea habitats, suggesting that they might have similar environmental adaptation properties. The findings deepen our understanding of the complexity of viral community structure and function in the ocean.
Affiliation(s)
- Jiulong Zhao
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Zengmeng Wang
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Chengcheng Li
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
- Tongmei Shi
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Yantao Liang
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
- Nianzhi Jiao
  - State Key Laboratory of Marine Environmental Science, Xiamen University, Xiamen, 361005, China
- Yongyu Zhang
  - Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
9
Gupta AK, Kumar M. Benchmarking and Assessment of Eight De Novo Genome Assemblers on Viral Next-Generation Sequencing Data, Including the SARS-CoV-2. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2022; 26:372-381. [PMID: 35759429 DOI: 10.1089/omi.2022.0042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Viral genomics has become crucial in clinical diagnostics and ecology, not to mention in stemming the COVID-19 pandemic. Whole-genome sequencing (WGS) is pivotal in gaining an improved understanding of viral evolution, genomic epidemiology, infectious outbreaks, pathobiology, clinical management, and vaccine development. Genome assembly is one of the crucial steps in WGS data analyses. A series of different assemblers has been developed with the advent of high-throughput next-generation sequencing (NGS). Various studies have reported the evaluation of these assembly tools on distinct datasets; however, these lack data of viral origin. In this study, we performed a comparative evaluation and benchmarking of eight de novo assemblers: SOAPdenovo, Velvet, assembly by short sequences (ABySS), iterative De Bruijn graph assembler (IDBA), SPAdes, Edena, iterative virus assembler, and VICUNA on the viral NGS data from distinct Illumina (GAIIx, Hiseq, Miseq, and Nextseq) platforms. WGS data of diverse viruses, that is, severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), dengue virus 3, human immunodeficiency virus 1, hepatitis B virus, human herpesvirus 8, human papillomavirus 16, rhinovirus A, and West Nile virus, were utilized to assess these assemblers. Performance metrics such as genome fraction recovery, assembly lengths, NG50, N50, contig length, contig numbers, mismatches, and misassemblies were analyzed. Overall, three assemblers, that is, SPAdes, IDBA, and ABySS, performed consistently well, including for genome assembly of SARS-CoV-2. These assembly methods should be considered and recommended for future studies of viruses. The study also suggests that implementing two or more assembly approaches should be considered in viral NGS studies, especially in clinical settings.
Taken together, the benchmarking of eight de novo genome assemblers reported in this study can inform future public health and ecology research concerning the viruses, the COVID-19 pandemic, and viral outbreaks.
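The abstract above lists N50 and NG50 among its assembly metrics without defining them. As a minimal sketch under the usual definitions (not code from the study), N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. NG50 uses half of the reference genome length instead. The function name `n50` and its `reference_length` parameter are illustrative assumptions.

```python
from typing import Iterable, Optional


def n50(lengths: Iterable[int], reference_length: Optional[int] = None) -> Optional[int]:
    """Return N50 of the contig lengths, or NG50 when reference_length is given.

    Hypothetical helper for illustration only; it is not taken from any
    assembler or evaluation tool named in the abstract.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    if not ordered:
        raise ValueError("at least one contig length is required")
    # N50 targets half the assembly size; NG50 targets half the reference size.
    total = reference_length if reference_length is not None else sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:  # cumulative length reached half of the target
            return length
    # Only reachable for NG50: the contigs cover less than half the reference.
    return None


contigs = [80, 70, 50, 40, 30, 20]             # toy contig lengths, total 290
print(n50(contigs))                            # N50 of the assembly
print(n50(contigs, reference_length=400))      # NG50 against a 400 bp reference
```

Because NG50 is normalized by the reference rather than by the assembly itself, it allows comparison between assemblers whose total assembly lengths differ, which is presumably why benchmarks report both values.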
Affiliation(s)
- Amit Kumar Gupta
  - Virology Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research (CSIR), Chandigarh, India
- Manoj Kumar
  - Virology Unit and Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research (CSIR), Chandigarh, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
10
Andrade-Martínez JS, Camelo Valera LC, Chica Cárdenas LA, Forero-Junco L, López-Leal G, Moreno-Gallego JL, Rangel-Pineros G, Reyes A. Computational Tools for the Analysis of Uncultivated Phage Genomes. Microbiol Mol Biol Rev 2022; 86:e0000421. [PMID: 35311574 PMCID: PMC9199400 DOI: 10.1128/mmbr.00004-21] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/15/2022] Open
Abstract
Over a century of bacteriophage research has uncovered a plethora of fundamental aspects of their biology, ecology, and evolution. Furthermore, the introduction of community-level studies through metagenomics has revealed unprecedented insights into the impact that phages have on a range of ecological and physiological processes. It was not until the introduction of viral metagenomics that we began to grasp the astonishing breadth of genetic diversity encompassed by phage genomes. Novel phage genomes have been reported from a diverse range of biomes at an increasing rate, which has prompted the development of computational tools that support the multilevel characterization of these novel phages based solely on their genome sequences. The impact of these technologies has been so large that, together with MAGs (metagenome-assembled genomes), we now have UViGs (uncultivated virus genomes), which are now officially recognized by the International Committee on Taxonomy of Viruses (ICTV), and new taxonomic groups can now be created based exclusively on genomic sequence information. Even though the available tools have immensely contributed to our knowledge of phage diversity and ecology, the ongoing surge in software programs makes it challenging to keep up with them and the purpose each one is designed for. Therefore, in this review, we describe a comprehensive set of currently available computational tools designed for the characterization of phage genome sequences, focusing on five specific analyses: (i) assembly and identification of phage and prophage sequences, (ii) phage genome annotation, (iii) phage taxonomic classification, (iv) phage-host interaction analysis, and (v) phage microdiversity.
Affiliation(s)
- Juan Sebastián Andrade-Martínez
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Laura Carolina Camelo Valera
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Luis Alberto Chica Cárdenas
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Laura Forero-Junco
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Department of Plant and Environmental Science, University of Copenhagen, Frederiksberg, Denmark
- Gamaliel López-Leal
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- J. Leonardo Moreno-Gallego
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Guillermo Rangel-Pineros
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Alejandro Reyes
- Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, USA
11
Weinheimer AR, Aylward FO. Infection strategy and biogeography distinguish cosmopolitan groups of marine jumbo bacteriophages. The ISME Journal 2022; 16:1657-1667. [PMID: 35260829 PMCID: PMC9123017 DOI: 10.1038/s41396-022-01214-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/18/2021] [Revised: 02/03/2022] [Accepted: 02/10/2022] [Indexed: 11/08/2022]
Abstract
Recent research has underscored the immense diversity and key biogeochemical roles of large DNA viruses in the ocean. Although they are important constituents of marine ecosystems, it is sometimes difficult to detect these viruses due to their large size and complex genomes. This is true for "jumbo" bacteriophages, which have genome sizes >200 kbp and large capsids reaching up to 0.45 µm in diameter. In this study, we sought to assess the genomic diversity and distribution of these bacteriophages in the ocean by generating and analyzing jumbo phage genomes from metagenomes. We recover 85 marine jumbo phages that ranged in size from 201 to 498 kilobases, and we examine their genetic similarities and biogeography together with a reference database of marine jumbo phage genomes. By analyzing Tara Oceans metagenomic data, we show that although most jumbo phages can be detected in a range of different size fractions, 17 of our bins tend to be found in those greater than 0.22 µm, potentially due to their large size. Our network-based analysis of gene-sharing patterns reveals that jumbo bacteriophages belong to five genome clusters that are typified by diverse replication strategies, genomic repertoires, and potential host ranges. Our analysis of jumbo phage distributions in the ocean reveals that depth is a major factor shaping their biogeography, with some phage genome clusters occurring preferentially in either surface or mesopelagic waters, respectively. Taken together, our findings indicate that jumbo phages are widespread community members in the ocean with complex genomic repertoires and ecological impacts that warrant further targeted investigation.
Affiliation(s)
- Frank O Aylward
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
- Center for Emerging, Zoonotic, and Arthropod-borne Pathogens, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061-0913, USA
12
VPipe: an Automated Bioinformatics Platform for Assembly and Management of Viral Next-Generation Sequencing Data. Microbiol Spectr 2022; 10:e0256421. [PMID: 35234489 PMCID: PMC8941893 DOI: 10.1128/spectrum.02564-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Abstract
Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated from these technologies remains a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data. Users access the pipeline through a secure web-based portal, which provides an easy-to-use interface with advanced search capabilities for reviewing results. In addition, VPipe provides a centralized system for storing and analyzing NGS data, eliminating common bottlenecks in bioinformatics analyses for public health laboratories with limited on-site computational infrastructure. The performance of VPipe was validated through the analysis of publicly available NGS data sets for viral pathogens, generating high-quality assemblies for 12 data sets. VPipe also generated assemblies with greater contiguity than similar pipelines for 41 human respiratory syncytial virus isolates and 23 SARS-CoV-2 specimens. IMPORTANCE Computational infrastructure and bioinformatics analysis are bottlenecks in the application of NGS to viral pathogens. As of September 2021, VPipe has been used by the U.S. Centers for Disease Control and Prevention (CDC) and 12 state public health laboratories to characterize >17,500 and 1,500 clinical specimens and isolates, respectively. VPipe automates genome assembly for a wide range of viruses, including high-consequence pathogens such as SARS-CoV-2. Such automated functionality expedites public health responses to viral outbreaks and pathogen surveillance.
13
Johansen J, Plichta DR, Nissen JN, Jespersen ML, Shah SA, Deng L, Stokholm J, Bisgaard H, Nielsen DS, Sørensen SJ, Rasmussen S. Genome binning of viral entities from bulk metagenomics data. Nat Commun 2022; 13:965. [PMID: 35181661 PMCID: PMC8857322 DOI: 10.1038/s41467-022-28581-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 08/06/2021] [Accepted: 01/28/2022] [Indexed: 12/26/2022]
Abstract
Despite the accelerating number of uncultivated virus sequences discovered in metagenomics and their apparent importance for health and disease, the human gut virome and its interactions with bacteria in the gastrointestinal tract are not well understood. This is partly due to a paucity of whole-virome datasets and limitations in current approaches for identifying viral sequences in metagenomics data. Here, combining a deep-learning based metagenomics binning algorithm with paired metagenome and metavirome datasets, we develop Phages from Metagenomics Binning (PHAMB), an approach that allows the binning of thousands of viral genomes directly from bulk metagenomics data, while simultaneously enabling clustering of viral genomes into accurate taxonomic viral populations. When applied on the Human Microbiome Project 2 (HMP2) dataset, PHAMB recovered 6,077 high-quality genomes from 1,024 viral populations, and identified viral-microbial host interactions. PHAMB can be advantageously applied to existing and future metagenomes to illuminate viral ecological dynamics with other microbiome constituents.
Affiliation(s)
- Joachim Johansen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Damian R Plichta
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jakob Nybo Nissen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Statens Serum Institut, Viral & Microbial Special diagnostics, Copenhagen, Denmark
- Marie Louise Jespersen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
- Shiraz A Shah
- Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Ling Deng
- Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Jakob Stokholm
- Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Hans Bisgaard
- Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Dennis Sandris Nielsen
- Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Søren J Sørensen
- Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
14
Arisdakessian CG, Nigro OD, Steward GF, Poisson G, Belcaid M. CoCoNet: an efficient deep learning tool for viral metagenome binning. Bioinformatics 2021; 37:2803-2810. [PMID: 33822891 DOI: 10.1093/bioinformatics/btab213] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/18/2020] [Revised: 03/24/2021] [Accepted: 04/02/2021] [Indexed: 02/02/2023]
Abstract
MOTIVATION Metagenomic approaches hold the potential to characterize microbial communities and unravel the intricate link between the microbiome and biological processes. Assembly is one of the most critical steps in metagenomics experiments. It consists of transforming overlapping DNA sequencing reads into sufficiently accurate representations of the community's genomes. This process is computationally difficult and commonly results in genomes fragmented across many contigs. Computational binning methods are used to mitigate fragmentation by partitioning contigs based on their sequence composition, abundance or chromosome organization into bins representing the community's genomes. Existing binning methods have been principally tuned for bacterial genomes and do not perform favorably on viral metagenomes. RESULTS We propose Composition and Coverage Network (CoCoNet), a new binning method for viral metagenomes that leverages the flexibility and the effectiveness of deep learning to model the co-occurrence of contigs belonging to the same viral genome and provide a rigorous framework for binning viral contigs. Our results show that CoCoNet substantially outperforms existing binning methods on viral datasets. AVAILABILITY AND IMPLEMENTATION CoCoNet was implemented in Python and is available for download on PyPi (https://pypi.org/). The source code is hosted on GitHub at https://github.com/Puumanamana/CoCoNet and the documentation is available at https://coconet.readthedocs.io/en/latest/index.html. CoCoNet does not require extensive resources to run. For example, binning 100k contigs took about 4 h on 10 Intel CPU Cores (2.4 GHz), with a memory peak at 27 GB (see Supplementary Fig. S9). To process a large dataset, CoCoNet may need to be run on a high RAM capacity server. Such servers are typically available in high-performance or cloud computing settings. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cédric G Arisdakessian
- Department of Information and Computer Sciences, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
- Olivia D Nigro
- Department of Natural Science, Hawai'i Pacific University, Honolulu, HI 96813, USA
- Grieg F Steward
- Department of Oceanography, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
- Guylaine Poisson
- Department of Information and Computer Sciences, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
- Mahdi Belcaid
- Department of Information and Computer Sciences, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
- Hawai'i Institute of Marine Biology, University of Hawai'i at Mānoa, Honolulu, HI 96816, USA
15
Sakamoto T, Ortega JM. Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree. BMC Bioinformatics 2021; 22:388. [PMID: 34325658 PMCID: PMC8323199 DOI: 10.1186/s12859-021-04304-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/21/2021] [Accepted: 07/12/2021] [Indexed: 01/02/2023]
Abstract
BACKGROUND NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases since all organisms with sequence accessions deposited on INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of data as a table would facilitate the use of information for processing bioinformatics data. To do so, since some taxonomic-ranks are missing in some lineages, an algorithm might propose provisional names for all taxonomic-ranks. RESULTS To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The procedures performed by the algorithm consist of attempting to assign a taxonomic-rank to an existing clade or "no rank" node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia) or interpolating parent nodes when needed (e.g. Cla_of_Ornithischia), both examples given for the dinosaur Brachylophosaurus lineage. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic-ranks, and it contains 41 hierarchical levels corresponding to the 41 taxonomic-ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete taxonomic lineage with 41 nodes of all taxa available in the NCBI Taxonomy database, without any hazard to the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles. CONCLUSION Taxallnomy applies to any bioinformatics analyses that depend on the information from NCBI Taxonomy. Taxallnomy is updated periodically but with a distributed PERL script users can generate it locally using NCBI Taxonomy as input. 
All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy.
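The interpolation idea in the abstract above (naming a missing parent rank after a named descendant, as in Cla_of_Ornithischia) can be illustrated with a deliberately simplified sketch. This is not the Taxallnomy algorithm or its Perl script: the function name `fill_missing_parents`, the four-rank list, and the three-letter abbreviation rule are assumptions made for illustration, and the sketch takes for granted that the clade Ornithischia has already been assigned the order rank.

```python
# Illustrative subset of ranks, ordered from highest to lowest.
# Taxallnomy itself uses 41 ranks; this short list is an assumption for the sketch.
RANKS = ["class", "order", "family", "genus"]


def fill_missing_parents(lineage):
    """Fill unnamed ranks that sit above a named taxon.

    `lineage` maps rank -> taxon name for the ranks that exist.
    A missing rank is named '<Abb>_of_<nearest named descendant>'.
    Missing ranks with no named descendant are left as None.
    """
    completed = {}
    nearest_descendant = None
    # Walk from the lowest rank upward so the nearest named descendant is known.
    for rank in reversed(RANKS):
        if rank in lineage:
            nearest_descendant = lineage[rank]
            completed[rank] = nearest_descendant
        elif nearest_descendant is not None:
            abbreviation = rank[:3].capitalize()  # hypothetical rule: "class" -> "Cla"
            completed[rank] = f"{abbreviation}_of_{nearest_descendant}"
        else:
            completed[rank] = None
    # Return the ranks in their original high-to-low order.
    return {rank: completed[rank] for rank in RANKS}


example = fill_missing_parents(
    {"order": "Ornithischia", "family": "Hadrosauridae", "genus": "Brachylophosaurus"}
)
print(example["class"])  # Cla_of_Ornithischia
```

The output is a flat rank-to-name table for one lineage, which is the tabular form the abstract argues is easier to process than the original tree.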
Affiliation(s)
- Tetsu Sakamoto
- BioME - Bioinformatics Multidisciplinary Environment, Instituto Metrópole Digital (IMD), Universidade Federal Do Rio Grande Do Norte (UFRN), Natal, RN, Brazil
- Laboratório de Biodados, Departamento de Bioquímica E Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
- J Miguel Ortega
- Laboratório de Biodados, Departamento de Bioquímica E Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil
16
Benler S, Koonin EV. Fishing for phages in metagenomes: what do we catch, what do we miss? Curr Opin Virol 2021; 49:142-150. [PMID: 34139668 DOI: 10.1016/j.coviro.2021.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/02/2023]
Abstract
Metagenomics and metatranscriptomics have become the principal approaches for discovery of novel bacteriophages and preliminary characterization of their ecology and biology. Metagenomic sequencing dramatically expanded the known diversity of tailed and non-tailed phages with double-stranded DNA genomes and those with single-stranded DNA genomes, whereas metatranscriptomics led to the discovery of thousands of new single-stranded RNA phages. Apart from expanding phage diversity, metagenomics studies discover major novel groups of phages with unique features of genome organization, expression strategy and virus-host interaction, such as the putative order 'crAssvirales', which includes the most abundant human-associated viruses. The continued success of metagenomics hinges on the combination of the most powerful computational methods for phage genome assembly and analysis including harnessing CRISPR spacers for the discovery of novel phages and host assignment. Together, these approaches could make a comprehensive characterization of the earth phageome a realistic goal.
Affiliation(s)
- Sean Benler
- National Center for Biotechnology Information, National Institutes of Health, Bethesda MD, United States
- Eugene V Koonin
- National Center for Biotechnology Information, National Institutes of Health, Bethesda MD, United States
17
Yuan Z, Ye X, Zhu L, Zhang N, An Z, Zheng WJ. Virome assembly and annotation in brain tissue based on next-generation sequencing. Cancer Med 2020; 9:6776-6790. [PMID: 32738030 PMCID: PMC7520322 DOI: 10.1002/cam4.3325] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/20/2019] [Revised: 06/20/2020] [Accepted: 07/01/2020] [Indexed: 12/15/2022]
Abstract
Glioblastoma multiforme (GBM) is one of the deadliest tumors. It has been speculated that viruses play a role in GBM, but the evidence is controversial. Published research is mainly limited to studies on the presence of human cytomegalovirus (HCMV) in GBM. No comprehensive assessment of the brain virome, the collection of viral material in the brain, based on recently sequenced data has been performed. Here, we characterized the virome from 111 GBM samples and 57 normal brain samples from eight projects in the SRA database using a tested and comprehensive assembly approach. The annotation of the assembled contigs showed that most viral sequences in the brain belong to the viral family Retroviridae. In some GBM samples, we also detected the full genome sequence of a novel picornavirus recently discovered in invertebrates. Unlike previous reports, our study did not detect herpesviruses such as HCMV in GBM in the data we used. However, some contigs that could not be annotated with any known genes exhibited antibody epitopes in their sequences. These findings provide several avenues for potential cancer therapy: the newly discovered picornavirus could be a starting point for engineering a novel oncolytic virus, and the exhibited antibody epitopes could be a source for exploring potential drug targets for immune cancer therapy. By characterizing the virosphere in GBM and normal brain at a global level, the results from this study strengthen the link between GBM and viral infection, which warrants further investigation.
Affiliation(s)
- Zihao Yuan
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Texas Therapeutics Institute, Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- Xiaohua Ye
- Texas Therapeutics Institute, Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- Lisha Zhu
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Ningyan Zhang
- Texas Therapeutics Institute, Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- Zhiqiang An
- Texas Therapeutics Institute, Institute of Molecular Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- W. Jim Zheng
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
18
Ledesma J, Williams D, Stanford FA, Hewitt PE, Zuckerman M, Bansal S, Dhawan A, Mbisa JL, Tedder R, Ijaz S. Resolution by deep sequencing of a dual hepatitis E virus infection transmitted via blood components. J Gen Virol 2020; 100:1491-1500. [PMID: 31592753 DOI: 10.1099/jgv.0.001302] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
Hepatitis E virus (HEV) is a zoonotic infection, with consumption of processed pork products thought to be the major route of transmission in England. The clinical features of HEV infection range from asymptomatic infection to mild hepatitis to fulminant liver failure. Persistent, chronic hepatitis is increasingly recognized in immunocompromised patients. Infection via HEV-containing blood components and organs has been reported and measures to reduce this transmission risk were introduced into the blood service in England in 2016. We report here the sequence and phylogenetic findings from investigations into a transmission event from an HEV-infected donor to two recipients. Phylogenetic analysis of HEV genome sequence fragments obtained by Sanger sequencing showed that, whilst most of the sequences from both recipients' samples grouped with the sequence from the blood donor sample, the relationships of five sequences from recipient 2 were unresolved. Analysis of Illumina short-read deep sequence data demonstrated the presence of two divergent viral populations in the donor's sample that were also present in samples from both recipients. A clear phylogenetic relationship was established, indicating a probable transmission of both populations from the donor to each of the immunocompromised recipients. This study demonstrates the value of the application of new sequencing technologies combined with bioinformatic data analysis when Sanger sequencing is not able to clarify a proper phylogenetic relationship in the investigation of transmission events.
Affiliation(s)
- Juan Ledesma
- National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Blood Borne and Sexually Transmitted Infections, London, UK
- Antiviral Unit, Virus Reference Department, National Infection Service, Public Health England, London, UK
- David Williams
- Bioinformatics, Virus Reference Department, National Infection Service, Public Health England, London, UK
- Felicia Adelina Stanford
- Blood Borne Virus Unit, Virus Reference Department, National Infection Service, Public Health England, London, UK
- Mark Zuckerman
- South London Specialist Virology Centre, King's College Hospital NHS Foundation Trust, London, UK
- Sanjay Bansal
- Paediatric Liver, GI and Nutrition Centre and Mowat Labs, King's College Hospital, London, UK
- Anil Dhawan
- Paediatric Liver, GI and Nutrition Centre and Mowat Labs, King's College Hospital, London, UK
- Jean Lutamyo Mbisa
- National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Blood Borne and Sexually Transmitted Infections, London, UK
- Antiviral Unit, Virus Reference Department, National Infection Service, Public Health England, London, UK
- Richard Tedder
- Blood Borne Virus Unit, Virus Reference Department, National Infection Service, Public Health England, London, UK
- Samreen Ijaz
- Blood Borne Virus Unit, Virus Reference Department, National Infection Service, Public Health England, London, UK
- National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Blood Borne and Sexually Transmitted Infections, London, UK
19
Whole-Virome Analysis Sheds Light on Viral Dark Matter in Inflammatory Bowel Disease. Cell Host Microbe 2019; 26:764-778.e5. [DOI: 10.1016/j.chom.2019.10.009] [Citation(s) in RCA: 162] [Impact Index Per Article: 32.4] [Received: 06/26/2019] [Revised: 09/02/2019] [Accepted: 10/14/2019] [Indexed: 12/18/2022]
20
Sutton TDS, Hill C. Gut Bacteriophage: Current Understanding and Challenges. Front Endocrinol (Lausanne) 2019; 10:784. [PMID: 31849833 PMCID: PMC6895007 DOI: 10.3389/fendo.2019.00784] [Citation(s) in RCA: 96] [Impact Index Per Article: 19.2] [Received: 08/09/2019] [Accepted: 10/28/2019] [Indexed: 12/13/2022]
Abstract
The gut microbiome is widely accepted to have a significant impact on human health yet, despite years of research on this complex ecosystem, the contributions of different forces driving microbial population structure remain to be fully elucidated. The viral component of the human gut microbiome is dominated by bacteriophage, which are known to play crucial roles in shaping microbial composition, driving bacterial diversity, and facilitating horizontal gene transfer. Bacteriophage are also one of the most poorly understood components of the human gut microbiome, with the vast majority of viral sequences sharing little to no homology to reference databases. If we are to understand the dynamics of bacteriophage populations, their interaction with the human microbiome and ultimately their influence on human health, we will depend heavily on sequence based approaches and in silico tools. This is complicated by the fact that, as with any research field in its infancy, methods of analyses vary and this can impede our ability to compare the outputs of different studies. Here, we discuss the major findings to date regarding the human virome and reflect on our current understanding of how gut bacteriophage shape the microbiome. We consider whether or not the virome field is built on unstable foundations and if so, how can we provide a solid basis for future experimentation. The virome is a challenging yet crucial piece of the human microbiome puzzle. In order to develop our understanding, we will discuss the need to underpin future studies with robust research methods and suggest some solutions to existing challenges.
Affiliation(s)
- Colin Hill
- APC Microbiome Ireland and School of Microbiology, University College Cork, Cork, Ireland
21
Sutton TDS, Clooney AG, Ryan FJ, Ross RP, Hill C. Choice of assembly software has a critical impact on virome characterisation. Microbiome 2019; 7:12. [PMID: 30691529 PMCID: PMC6350398 DOI: 10.1186/s40168-019-0626-5] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 10/16/2018] [Accepted: 01/14/2019] [Indexed: 05/19/2023]
Abstract
BACKGROUND The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to reference databases. As a result, investigation of the viral metagenome (virome) relies heavily on de novo assembly of short sequencing reads to recover compositional and functional information. Metagenomic assembly is particularly challenging for virome data, often resulting in fragmented assemblies and poor recovery of viral community members. Despite the essential role of assembly in virome analysis and difficulties posed by these data, current assembly comparisons have been limited to subsections of virome studies or bacterial datasets. DESIGN This study presents the most comprehensive virome assembly comparison to date, featuring 16 metagenomic assembly approaches which have featured in human virome studies. Assemblers were assessed using four independent virome datasets, namely, simulated reads, two mock communities, viromes spiked with a known phage and human gut viromes. RESULTS Assembly performance varied significantly across all test datasets, with SPAdes (meta) performing consistently well. Performance of MIRA and VICUNA varied, highlighting the importance of using a range of datasets when comparing assembly programs. It was also found that while some assemblers addressed the challenges of virome data better than others, all assemblers had limitations. Low read coverage and genomic repeats resulted in assemblies with poor genome recovery, high degrees of fragmentation and low-accuracy contigs across all assemblers. These limitations must be considered when setting thresholds for downstream analysis and when drawing conclusions from virome data.
Affiliation(s)
- Thomas D S Sutton
- APC Microbiome Ireland, Cork, Ireland
- School for Microbiology, University College Cork, Cork, Ireland
- Adam G Clooney
- APC Microbiome Ireland, Cork, Ireland
- School for Microbiology, University College Cork, Cork, Ireland
- Feargal J Ryan
- APC Microbiome Ireland, Cork, Ireland
- School for Microbiology, University College Cork, Cork, Ireland
- Present Address: South Australian Health and Medical Research Institute, Adelaide, Australia
- R Paul Ross
- APC Microbiome Ireland, Cork, Ireland
- School for Microbiology, University College Cork, Cork, Ireland
- Teagasc Food Research Centre, Fermoy, Cork, Ireland
- Colin Hill
- APC Microbiome Ireland, Cork, Ireland
- School for Microbiology, University College Cork, Cork, Ireland
22
Papudeshi B, Haggerty JM, Doane M, Morris MM, Walsh K, Beattie DT, Pande D, Zaeri P, Silva GGZ, Thompson F, Edwards RA, Dinsdale EA. Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes. BMC Genomics 2017; 18:915. [PMID: 29183281 PMCID: PMC5706307 DOI: 10.1186/s12864-017-4294-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 06/08/2017] [Accepted: 11/13/2017] [Indexed: 11/12/2022]
Abstract
Background: Microbiome/host interactions describe characteristics that affect the host's health. Shotgun metagenomics includes sequencing a random subset of the microbiome to analyze its taxonomic and metabolic potential. Reconstruction of DNA fragments into genomes from metagenomes (called metagenome-assembled genomes) assigns unknown fragments to taxa/function and facilitates discovery of novel organisms. Genome reconstruction incorporates sequence assembly and sorting of assembled sequences into bins, each characteristic of a genome. However, the microbial community composition, including taxonomic and phylogenetic diversity, may influence genome reconstruction. We determine the optimal reconstruction method for four microbiome projects that had variable sequencing platforms (IonTorrent and Illumina), diversity (high or low), and environment (coral reefs and kelp forests), using a set of parameters to select for optimal assembly and binning tools.
Methods: We tested the effects of the assembly and binning processes on population genome reconstruction using 105 marine metagenomes from 4 projects. Reconstructed genomes were obtained from each project using 3 assemblers (IDBA, MetaVelvet, and SPAdes) and 2 binning tools (GroopM and MetaBat). We assessed the efficiency of assemblers using statistics including contig continuity and contig chimerism, and the effectiveness of binning tools using genome completeness and taxonomic identification.
Results: We concluded that SPAdes assembled more contigs (143,718 ± 124 contigs) of longer length (N50 = 1632 ± 108 bp), and incorporated the most sequences (sequences-assembled = 19.65%). The microbial richness and evenness were maintained across the assembly, suggesting low contig chimeras. SPAdes assembly was responsive to the biological and technological variations within the project, compared with other assemblers. Among binning tools, we conclude that MetaBat produced bins with less variation in GC content (average standard deviation: 1.49), low species richness (4.91 ± 0.66), and higher genome completeness (40.92 ± 1.75) across all projects. MetaBat extracted 115 bins from the 4 projects, of which 66 bins were identified as reconstructed metagenome-assembled genomes with sequences belonging to a specific genus. We identified 13 novel genomes, some of which were 100% complete but show low similarity to genomes within databases.
Conclusions: We present a set of biologically relevant parameters for evaluation to select for optimal assembly and binning tools. For the tools we tested, the SPAdes assembler and MetaBat binning tool reconstructed quality metagenome-assembled genomes for the four projects. We also conclude that metagenomes from microbial communities that have high coverage of phylogenetically distinct taxa and low taxonomic diversity result in the highest-quality metagenome-assembled genomes.
Electronic supplementary material: The online version of this article (10.1186/s12864-017-4294-1) contains supplementary material, which is available to authorized users.
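The abstract reports contig continuity as an N50 value. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition (the length of the contig at which the running total of lengths, sorted longest first, reaches half of the assembly size). The function name `n50` and the example lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
example_n50 = n50(example_lengths)
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly used to compare assemblers on the same reads.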
Affiliation(s)
- Bhavya Papudeshi
- Bioinformatics and Medical Informatics, San Diego State University, San Diego, California, USA; National Center for Genome Analysis Support, Indiana University, Bloomington, Indiana, USA
| | - J Matthew Haggerty
- Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, 92115, California, USA
| | - Michael Doane
- Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, 92115, California, USA
| | - Megan M Morris
- Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, 92115, California, USA
| | - Kevin Walsh
- Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, 92115, California, USA
| | - Douglas T Beattie
- Department of Biology, University of New South Wales, Sydney, New South Wales, Australia
| | - Dnyanada Pande
- Bioinformatics and Medical Informatics, San Diego State University, San Diego, California, USA
| | - Parisa Zaeri
- Department of Mathematics and Statistics, San Diego State University, San Diego, California, USA
| | - Genivaldo G Z Silva
- Computational Science Research Center, San Diego State University, San Diego, California, USA
| | - Fabiano Thompson
- Institute of Biology, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil
| | - Robert A Edwards
- Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California, USA
| | - Elizabeth A Dinsdale
- Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, 92115, California, USA.
|
23
|
François S, Filloux D, Frayssinet M, Roumagnac P, Martin DP, Ogliastro M, Froissart R. Increase in taxonomic assignment efficiency of viral reads in metagenomic studies. Virus Res 2017; 244:230-234. [PMID: 29154906 DOI: 10.1016/j.virusres.2017.11.011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/27/2017] [Revised: 11/10/2017] [Accepted: 11/10/2017] [Indexed: 12/17/2022]
Abstract
Metagenomic studies have revolutionized the field of biology by revealing the presence of many previously unisolated and uncultured micro-organisms. However, one of the main problems encountered in metagenomic studies is the high percentage of sequences that cannot be assigned taxonomically using commonly used similarity-based approaches (e.g. BLAST or HMM). These unassigned sequences are allegorically called "dark matter" in the metagenomic literature and are often referred to as being derived from new or unknown organisms. Here, based on published and original metagenomic datasets coming from virus-like particle enriched samples, we present and quantify the improvement of viral taxonomic assignment that is achievable with a new similarity-based approach. Prior to any use of similarity-based taxonomic assignment methods, we propose assembling contigs from short reads, as is currently routinely done in metagenomic studies, and then further mapping unassembled reads to the assembled contigs. This additional mapping step significantly increases the proportions of taxonomically assignable sequence reads from a variety of virome studies: plant, insect and environmental (estuary, lakes, soil, feces).
Affiliation(s)
- S François
- INRA-Université de Montpellier UMR DGIMI 34095 Montpellier, France
| | - D Filloux
- CIRAD-INRA-Supagro, UMR BGPI, Campus International de Baillarguet, 34398 Montpellier, France
| | - M Frayssinet
- INRA-Université de Montpellier UMR DGIMI 34095 Montpellier, France
| | - P Roumagnac
- CIRAD-INRA-Supagro, UMR BGPI, Campus International de Baillarguet, 34398 Montpellier, France
| | - D P Martin
- Computational Biology Group, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory, South Africa
| | - M Ogliastro
- INRA-Université de Montpellier UMR DGIMI 34095 Montpellier, France
| | - R Froissart
- CNRS-IRD-Université de Montpellier, UMR MIVEGEC, 911 avenue Agropolis, 34394, Montpellier, France.
|
24
|
Roux S, Emerson JB, Eloe-Fadrosh EA, Sullivan MB. Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity. PeerJ 2017; 5:e3817. [PMID: 28948103 PMCID: PMC5610896 DOI: 10.7717/peerj.3817] [Citation(s) in RCA: 170] [Impact Index Per Article: 24.3] [Received: 06/24/2017] [Accepted: 08/26/2017] [Indexed: 12/20/2022] Open
Abstract
Background: Viral metagenomics (viromics) is increasingly used to obtain uncultivated viral genomes, evaluate community diversity, and assess ecological hypotheses. While viromic experimental methods are relatively mature and widely accepted by the research community, robust bioinformatics standards remain to be established. Here we used in silico mock viral communities to evaluate the viromic sequence-to-ecological-inference pipeline, including (i) read pre-processing and metagenome assembly, (ii) thresholds applied to estimate viral relative abundances based on read mapping to assembled contigs, and (iii) normalization methods applied to the matrix of viral relative abundances for alpha and beta diversity estimates.
Results: Tools designed for metagenomes, specifically metaSPAdes, MEGAHIT, and IDBA-UD, were the most effective at assembling viromes. Read pre-processing, such as partitioning, had virtually no impact on assembly output, but may be useful when hardware is limited. Viral populations with 2–5× coverage typically assembled well, whereas lesser coverage led to fragmented assembly. Strain heterogeneity within populations hampered assembly, especially when strains were closely related (average nucleotide identity, or ANI, ≥97%) and when the most abundant strain represented <50% of the population. Viral community composition assessments based on read recruitment were generally accurate when the following thresholds for detection were applied: (i) ≥10 kb contig lengths to define populations, (ii) coverage defined from reads mapping at ≥90% identity, and (iii) ≥75% of contig length with ≥1× coverage. Finally, although data are limited to the most abundant viruses in a community, alpha and beta diversity patterns were robustly estimated (±10%) when comparing samples of similar sequencing depth, but more divergent (up to 80%) when sequencing depth was uneven across the dataset. In the latter cases, the use of normalization methods specifically developed for metagenomes provided the best estimates.
Conclusions: These simulations provide benchmarks for selecting analysis cut-offs and establish that an optimized sample-to-ecological-inference viromics pipeline is robust for making ecological inferences from natural viral communities. Continued development to better access RNA, rare, and/or diverse viral populations and improved reference viral genome availability will alleviate many of viromics' remaining limitations.
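The detection thresholds listed in the abstract (contig length ≥10 kb, and ≥75% of the contig covered at ≥1× by reads mapped at ≥90% identity) can be read as a simple per-contig filter. The sketch below is only an illustration of that reading and is not the authors' pipeline: the `ContigCoverage` record, the `is_detected` function, and the example values are all hypothetical, and the code assumes the ≥90% identity filter was already applied when the covered positions were counted.

```python
from dataclasses import dataclass


@dataclass
class ContigCoverage:
    """Per-contig coverage summary (hypothetical record layout)."""
    name: str
    length_bp: int
    # Positions with >=1x coverage, counted only from reads that
    # mapped at >=90% identity (assumed to be filtered upstream).
    covered_bp: int


def is_detected(contig, min_length_bp=10_000, min_covered_fraction=0.75):
    """Apply the length and covered-fraction thresholds to one contig."""
    if contig.length_bp < min_length_bp:
        return False
    return contig.covered_bp / contig.length_bp >= min_covered_fraction


# Made-up example values, for illustration only.
contigs = [
    ContigCoverage("contig_A", 25_000, 20_000),  # long enough, 80% covered
    ContigCoverage("contig_B", 12_000, 6_000),   # long enough, 50% covered
    ContigCoverage("contig_C", 8_000, 8_000),    # fully covered but < 10 kb
]
detected = [c.name for c in contigs if is_detected(c)]
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive a detection call is to the chosen cut-offs.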
Affiliation(s)
- Simon Roux
- Department of Microbiology, Ohio State University, Columbus, OH, United States of America
| | - Joanne B Emerson
- Department of Microbiology, Ohio State University, Columbus, OH, United States of America
| | - Emiley A Eloe-Fadrosh
- Joint Genome Institute, Department of Energy, Walnut Creek, CA, United States of America
| | - Matthew B Sullivan
- Department of Microbiology, Ohio State University, Columbus, OH, United States of America; Department of Civil, Environmental and Geodetic Engineering, Ohio State University, Columbus, OH, United States of America
|
25
|
Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res 2017; 27:824-834. [PMID: 28298430 PMCID: PMC5411777 DOI: 10.1101/gr.213959.116] [Citation(s) in RCA: 2086] [Impact Index Per Article: 298.0] [Received: 08/01/2016] [Accepted: 03/13/2017] [Indexed: 01/25/2023]
Abstract
While metagenomics has emerged as a technology of choice for analyzing bacterial populations, the assembly of metagenomic data remains challenging, thus stifling biological discoveries. Moreover, recent studies revealed that complex bacterial populations may be composed of dozens of related strains, further amplifying the challenge of metagenomic assembly. metaSPAdes addresses various challenges of metagenomic assembly by capitalizing on computational ideas that proved to be useful in assemblies of single cells and highly polymorphic diploid genomes. We benchmark metaSPAdes against other state-of-the-art metagenome assemblers and demonstrate that it results in high-quality assemblies across diverse data sets.
Affiliation(s)
- Sergey Nurk
- Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia 199004
| | - Dmitry Meleshko
- Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia 199004
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia 199004; Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia 198515
| | - Pavel A Pevzner
- Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia 199004; Department of Computer Science and Engineering, University of California, San Diego, California 92093-0404, USA
|
26
|
Hesse U, van Heusden P, Kirby BM, Olonade I, van Zyl LJ, Trindade M. Virome Assembly and Annotation: A Surprise in the Namib Desert. Front Microbiol 2017; 8:13. [PMID: 28167933 PMCID: PMC5253355 DOI: 10.3389/fmicb.2017.00013] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/26/2016] [Accepted: 01/03/2017] [Indexed: 11/13/2022] Open
Abstract
Sequencing, assembly, and annotation of environmental virome samples are challenging. Methodological biases and differences in species abundance result in fragmentary read coverage; sequence reconstruction is further complicated by the mosaic nature of viral genomes. In this paper, we focus on biocomputational aspects of virome analysis, emphasizing latent pitfalls in sequence annotation. Using simulated viromes that mimic environmental data challenges, we assessed the performance of five assemblers (CLC-Workbench, IDBA-UD, SPAdes, RayMeta, ABySS). Individual analyses of relevant scaffold length fractions revealed shortcomings of some programs in reconstruction of viral genomes with excessive read coverage (IDBA-UD, RayMeta), and in accurate assembly of scaffolds ≥50 kb (SPAdes, RayMeta, ABySS). The CLC-Workbench assembler performed best in terms of genome recovery (including highly covered genomes) and correct reconstruction of large scaffolds, and was used to assemble a virome from a copper-rich site in the Namib Desert. We found that scaffold network analysis and cluster-specific read reassembly improved reconstruction of sequences with excessive read coverage, and that strict data filtering for non-viral sequences prior to downstream analyses was essential. In this study we describe novel viral genomes identified in the Namib Desert copper site virome. Taxonomic affiliations of diverse proteins in the dataset and phylogenetic analyses of circovirus-like proteins indicated links to the marine habitat. Considering additional evidence from this dataset, we hypothesize that viruses may have been carried from the Atlantic Ocean into the Namib Desert by fog and wind, highlighting the impact of the extended environment on an investigated niche in metagenome studies.
Affiliation(s)
- Uljana Hesse
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa
- South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
| | - Peter van Heusden
- South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
| | - Bronwyn M. Kirby
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa
| | - Israel Olonade
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa
| | - Leonardo J. van Zyl
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa
| | - Marla Trindade
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Bellville, South Africa
|
27
|
Cobián Güemes AG, Youle M, Cantú VA, Felts B, Nulton J, Rohwer F. Viruses as Winners in the Game of Life. Annu Rev Virol 2016; 3:197-214. [DOI: 10.1146/annurev-virology-100114-054952] [Citation(s) in RCA: 158] [Impact Index Per Article: 19.8] [Indexed: 11/09/2022]
Affiliation(s)
| | | | - Vito Adrian Cantú
- Computational Sciences Research Center, San Diego State University, San Diego, California 92182
| | - Ben Felts
- Department of Mathematics and Statistics, San Diego State University, San Diego, California 92182
| | - James Nulton
- Department of Mathematics and Statistics, San Diego State University, San Diego, California 92182
| | - Forest Rohwer
- Department of Biology, San Diego State University, San Diego, California 92182
|