Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: García-López R, Vázquez-Castellanos JF, Moya A. Fragmentation and Coverage Variation in Viral Metagenome Assemblies, and Their Effect in Diversity Calculations. Front Bioeng Biotechnol 2015;3:141. [PMID: 26442255 PMCID: PMC4585024 DOI: 10.3389/fbioe.2015.00141] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/29/2015] [Accepted: 09/03/2015] [Indexed: 01/01/2023] Open

Cited by Other Article(s)

Shi Z, Long X, Zhang C, Chen Z, Usman M, Zhang Y, Zhang S, Luo G. Viral and Bacterial Community Dynamics in Food Waste and Digestate from Full-Scale Biogas Plants. Environ Sci Technol 2024;58:13010-13022. [PMID: 38989650 DOI: 10.1021/acs.est.4c04109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]

Affiliation(s)

Zhijian Shi Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
Xinyi Long Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
Chao Zhang Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
Zheng Chen Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
Muhammad Usman Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada
Yalei Zhang Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China; State Key Laboratory of Pollution Control and Resources Reuse, College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
Shicheng Zhang Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China; Shanghai Technical Service Platform for Pollution Control and Resource Utilization of Organic Wastes, Shanghai 200438, China
Gang Luo Shanghai Key Laboratory of Atmospheric Particle Pollution and Prevention (LAP3), Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China; Shanghai Technical Service Platform for Pollution Control and Resource Utilization of Organic Wastes, Shanghai 200438, China

Liu X, Liu Y, Liu J, Zhang H, Shan C, Guo Y, Gong X, Cui M, Li X, Tang M. Correlation between the gut microbiome and neurodegenerative diseases: a review of metagenomics evidence. Neural Regen Res 2024;19:833-845. [PMID: 37843219 PMCID: PMC10664138 DOI: 10.4103/1673-5374.382223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/19/2023] [Accepted: 06/17/2023] [Indexed: 10/17/2023] Open

Moubset O, Filloux D, Fontes H, Julian C, Fernandez E, Galzi S, Blondin L, Chehida SB, Lett JM, Mesléard F, Kraberger S, Custer JM, Salywon A, Makings E, Marais A, Chiroleu F, Lefeuvre P, Martin DP, Candresse T, Varsani A, Ravigné V, Roumagnac P. Virome release of an invasive exotic plant species in southern France. Virus Evol 2024;10:veae025. [PMID: 38566975 PMCID: PMC10986800 DOI: 10.1093/ve/veae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/27/2024] [Accepted: 03/06/2024] [Indexed: 04/04/2024] Open

Abstract

The increase in human-mediated introduction of plant species to new regions has resulted in a rise of invasive exotic plant species (IEPS) that has had significant effects on biodiversity and ecosystem processes. One commonly accepted mechanism of invasions is that proposed by the enemy release hypothesis (ERH), which states that IEPS free from their native herbivores and natural enemies in new environments can outcompete indigenous species and become invasive. We here propose the virome release hypothesis (VRH) as a virus-centered variant of the conventional ERH that is only focused on enemies. The VRH predicts that vertically transmitted plant-associated viruses (PAV, encompassing phytoviruses and mycoviruses) should be co-introduced during the dissemination of the IEPS, while horizontally transmitted PAV of IEPS should be left behind or should not be locally transmitted in the introduced area due to a maladaptation of local vectors. To document the VRH, virome richness and composition as well as PAV prevalence, co-infection, host range, and transmission modes were compared between indigenous plant species and an invasive grass, cane bluestem (Bothriochloa barbinodis), in both its introduced range (southern France) and one area of its native range (Sonoran Desert, Arizona, USA). Contrary to the VRH, we show that invasive populations of B. barbinodis in France were not associated with a lower PAV prevalence or richness than native populations of B. barbinodis from the USA. However, comparison of virome compositions and network analyses further revealed more diverse and complex plant-virus interactions in the French ecosystem, with a significant richness of mycoviruses. Setting mycoviruses apart, only one putatively vertically transmitted phytovirus (belonging to the Amalgaviridae family) and one putatively horizontally transmitted phytovirus (belonging to the Geminiviridae family) were identified from B. barbinodis plants in the introduced area. 
Collectively, these characteristics of the B. barbinodis-associated PAV community in southern France suggest that a virome release phase may have immediately followed the introduction of B. barbinodis to France in the 1960s or 1970s, and that, since then, the invasive populations of this IEPS have already transitioned out of this virome release phase, and have started interacting with several local mycoviruses and a few local plant viruses.

Affiliation(s)

Oumaima Moubset UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Denis Filloux UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Hugo Fontes Tour du Valat, Institut de recherche pour la conservation des zones humides méditerranéennes, Le Sambuc, Arles 13200, France; Institut Méditerranéen de Biodiversité et Ecologie, UMR CNRS-IRD, Avignon Université, Aix-Marseille Université, IUT d’Avignon, Avignon 84911, France
Charlotte Julian UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Emmanuel Fernandez UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Serge Galzi UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Laurence Blondin UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Sélim Ben Chehida UMR PVBMT, CIRAD, Saint-Pierre, La Réunion F-97410, France
Jean-Michel Lett UMR PVBMT, CIRAD, Saint-Pierre, La Réunion F-97410, France
François Mesléard Tour du Valat, Institut de recherche pour la conservation des zones humides méditerranéennes, Le Sambuc, Arles 13200, France; Institut Méditerranéen de Biodiversité et Ecologie, UMR CNRS-IRD, Avignon Université, Aix-Marseille Université, IUT d’Avignon, Avignon 84911, France
Simona Kraberger The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
Joy M Custer The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
Andrew Salywon Department of Research, Conservation and Collections, Desert Botanical Garden, Phoenix, AZ 85008, USA
Elizabeth Makings Vascular Plant Herbarium, School of Life Sciences, Arizona State University, 734 West Alameda Drive, Tempe, AZ 85282, USA
Armelle Marais UMR BFP, University Bordeaux, INRAE, Villenave d’Ornon 33140, France
Frédéric Chiroleu UMR PVBMT, CIRAD, Saint-Pierre, La Réunion F-97410, France
Pierre Lefeuvre UMR PVBMT, CIRAD, Saint-Pierre, La Réunion F-97410, France
Darren P Martin Division of Computational Biology, Department of Integrative Biomedical Sciences, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Anzio Rd, Cape Town 7925, South Africa
Thierry Candresse UMR BFP, University Bordeaux, INRAE, Villenave d’Ornon 33140, France
Arvind Varsani The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA; Structural Biology Research Unit, Department of Integrative Biomedical Sciences, University of Cape Town, Observatory, Cape Town 7700, South Africa
Virginie Ravigné UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France
Philippe Roumagnac UMR PHIM, CIRAD, Baillarguet TA A-54/K, Montpellier 34090, France; PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Baillarguet TA A-54/K, Montpellier 34090, France

Gong C, Chakraborty D, Koudelka GB. A prophage encoded ribosomal RNA methyltransferase regulates the virulence of Shiga-toxin-producing Escherichia coli (STEC). Nucleic Acids Res 2024;52:856-871. [PMID: 38084890 PMCID: PMC10810198 DOI: 10.1093/nar/gkad1150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 01/26/2024] Open

Brait N, Hackl T, Morel C, Exbrayat A, Gutierrez S, Lequime S. A tale of caution: How endogenous viral elements affect virus discovery in transcriptomic data. Virus Evol 2023;10:vead088. [PMID: 38516656 PMCID: PMC10956553 DOI: 10.1093/ve/vead088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 11/24/2023] [Accepted: 12/22/2023] [Indexed: 03/23/2024] Open

Abstract

Large-scale metagenomic and -transcriptomic studies have revolutionized our understanding of viral diversity and abundance. In contrast, endogenous viral elements (EVEs), remnants of viral sequences integrated into host genomes, have received limited attention in the context of virus discovery, especially in RNA-Seq data. EVEs resemble their original viruses, a challenge that makes distinguishing between active infections and integrated remnants difficult, affecting virus classification and biases downstream analyses. Here, we systematically assess the effects of EVEs on a prototypical virus discovery pipeline, evaluate their impact on data integrity and classification accuracy, and provide some recommendations for better practices. We examined EVEs and exogenous viral sequences linked to Orthomyxoviridae, a diverse family of negative-sense segmented RNA viruses, in 13 genomic and 538 transcriptomic datasets of Culicinae mosquitoes. Our analysis revealed a substantial number of viral sequences in transcriptomic datasets. However, a significant portion appeared not to be exogenous viruses but transcripts derived from EVEs. Distinguishing between transcribed EVEs and exogenous virus sequences was especially difficult in samples with low viral abundance. For example, three transcribed EVEs showed full-length segments, devoid of frameshift and nonsense mutations, exhibiting sufficient mean read depths that qualify them as exogenous virus hits. Mapping reads on a host genome containing EVEs before assembly somewhat alleviated the EVE burden, but it led to a drastic reduction of viral hits and reduced quality of assemblies, especially in regions of the viral genome relatively similar to EVEs. Our study highlights that our knowledge of the genetic diversity of viruses can be altered by the underestimated presence of EVEs in transcriptomic datasets, leading to false positives and altered or missing sequence information. 
Thus, recognizing and addressing the influence of EVEs in virus discovery pipelines will be key in enhancing our ability to capture the full spectrum of viral diversity.

Du Y, Fuhrman JA, Sun F. ViralCC retrieves complete viral genomes and virus-host pairs from metagenomic Hi-C data. Nat Commun 2023;14:502. [PMID: 36720887 PMCID: PMC9889337 DOI: 10.1038/s41467-023-35945-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/22/2022] [Accepted: 01/09/2023] [Indexed: 02/01/2023] Open

Tithi SS, Aylward FO, Jensen RV, Zhang L. FastViromeExplorer-Novel: Recovering Draft Genomes of Novel Viruses and Phages in Metagenomic Data. J Comput Biol 2023;30:391-408. [PMID: 36607772 DOI: 10.1089/cmb.2022.0397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]

Zhao J, Wang Z, Li C, Shi T, Liang Y, Jiao N, Zhang Y. Significant Differences in Planktonic Virus Communities Between "Cellular Fraction" (0.22 ~ 3.0 µm) and "Viral Fraction" (< 0.22 µm) in the Ocean. Microb Ecol 2022:10.1007/s00248-022-02167-6. [PMID: 36585490 DOI: 10.1007/s00248-022-02167-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/01/2022] [Accepted: 12/26/2022] [Indexed: 06/17/2023]

Gupta AK, Kumar M. Benchmarking and Assessment of Eight De Novo Genome Assemblers on Viral Next-Generation Sequencing Data, Including the SARS-CoV-2. OMICS 2022;26:372-381. [PMID: 35759429 DOI: 10.1089/omi.2022.0042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]

Abstract

Viral genomics has become crucial in clinical diagnostics and ecology, not to mention to stem the COVID-19 pandemic. Whole-genome sequencing (WGS) is pivotal in gaining an improved understanding of viral evolution, genomic epidemiology, infectious outbreaks, pathobiology, clinical management, and vaccine development. Genome assembly is one of the crucial steps in WGS data analyses. A series of different assemblers has been developed with the advent of high-throughput next-generation sequencing (NGS). Various studies have reported the evaluation of these assembly tools on distinct datasets; however, these lack data from viral origin. In this study, we performed a comparative evaluation and benchmarking of eight de novo assemblers: SOAPdenovo, Velvet, assembly by short sequences (ABySS), iterative De Bruijn graph assembler (IDBA), SPAdes, Edena, iterative virus assembler, and VICUNA on the viral NGS data from distinct Illumina (GAIIx, Hiseq, Miseq, and Nextseq) platforms. WGS data of diverse viruses, that is, severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), dengue virus 3, human immunodeficiency virus 1, hepatitis B virus, human herpesvirus 8, human papillomavirus 16, rhinovirus A, and West Nile virus, were utilized to assess these assemblers. Performance metrics such as genome fraction recovery, assembly lengths, NG50, N50, contig length, contig numbers, mismatches, and misassemblies were analyzed. Overall, three assemblers, that is, SPAdes, IDBA, and ABySS, performed consistently well, including for genome assembly of SARS-CoV-2. These assembly methods should be considered and recommended for future studies of viruses. The study also suggests that implementing two or more assembly approaches should be considered in viral NGS studies, especially in clinical settings. 
Taken together, the benchmarking of eight de novo genome assemblers reported in this study can inform future public health and ecology research concerning the viruses, the COVID-19 pandemic, and viral outbreaks.
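
The contiguity metrics named in the abstract above (N50 and NG50) can be illustrated with a minimal sketch. This is not code from the cited study; the function names and the example contig lengths are hypothetical, and the definitions used are the standard ones: N50 is the length of the contig at which the cumulative length, taken from the longest contig downward, first reaches half of the total assembly length, while NG50 uses half of a reference genome size instead.

```python
from typing import Iterable, Optional


def _nx(lengths: Iterable[int], target: float) -> Optional[int]:
    """Return the contig length at which the cumulative sum reaches target."""
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None  # the assembly never reaches the target length


def n50(lengths: Iterable[int]) -> Optional[int]:
    """N50: threshold is half of the total assembled length."""
    lengths = list(lengths)
    return _nx(lengths, sum(lengths) / 2)


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """NG50: threshold is half of the (assumed known) genome size."""
    return _nx(list(lengths), genome_size / 2)


# Hypothetical contig lengths, for illustration only.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))         # 70 (80 + 70 = 150 >= 145)
print(ng50(contigs, 400))   # 50 (80 + 70 + 50 = 200 >= 200)
print(ng50(contigs, 1000))  # None (total of 290 is below 500)
```

Because NG50 depends on a fixed genome size rather than on the assembly itself, it allows fragmented assemblies of the same genome to be compared on a common scale, which N50 alone does not.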

Andrade-Martínez JS, Camelo Valera LC, Chica Cárdenas LA, Forero-Junco L, López-Leal G, Moreno-Gallego JL, Rangel-Pineros G, Reyes A. Computational Tools for the Analysis of Uncultivated Phage Genomes. Microbiol Mol Biol Rev 2022;86:e0000421. [PMID: 35311574 PMCID: PMC9199400 DOI: 10.1128/mmbr.00004-21] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/15/2022] Open

Weinheimer AR, Aylward FO. Infection strategy and biogeography distinguish cosmopolitan groups of marine jumbo bacteriophages. ISME J 2022;16:1657-1667. [PMID: 35260829 PMCID: PMC9123017 DOI: 10.1038/s41396-022-01214-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/18/2021] [Revised: 02/03/2022] [Accepted: 02/10/2022] [Indexed: 11/08/2022]

VPipe: an Automated Bioinformatics Platform for Assembly and Management of Viral Next-Generation Sequencing Data. Microbiol Spectr 2022;10:e0256421. [PMID: 35234489 PMCID: PMC8941893 DOI: 10.1128/spectrum.02564-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022] Open

Abstract

Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated from these technologies remains a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data. Users access the pipeline through a secure web-based portal, which provides an easy-to-use interface with advanced search capabilities for reviewing results. In addition, VPipe provides a centralized system for storing and analyzing NGS data, eliminating common bottlenecks in bioinformatics analyses for public health laboratories with limited on-site computational infrastructure. The performance of VPipe was validated through the analysis of publicly available NGS data sets for viral pathogens, generating high-quality assemblies for 12 data sets. VPipe also generated assemblies with greater contiguity than similar pipelines for 41 human respiratory syncytial virus isolates and 23 SARS-CoV-2 specimens.

IMPORTANCE

Computational infrastructure and bioinformatics analysis are bottlenecks in the application of NGS to viral pathogens. As of September 2021, VPipe has been used by the U.S. Centers for Disease Control and Prevention (CDC) and 12 state public health laboratories to characterize >17,500 and 1,500 clinical specimens and isolates, respectively. VPipe automates genome assembly for a wide range of viruses, including high-consequence pathogens such as SARS-CoV-2. Such automated functionality expedites public health responses to viral outbreaks and pathogen surveillance.

Johansen J, Plichta DR, Nissen JN, Jespersen ML, Shah SA, Deng L, Stokholm J, Bisgaard H, Nielsen DS, Sørensen SJ, Rasmussen S. Genome binning of viral entities from bulk metagenomics data. Nat Commun 2022;13:965. [PMID: 35181661 PMCID: PMC8857322 DOI: 10.1038/s41467-022-28581-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 08/06/2021] [Accepted: 01/28/2022] [Indexed: 12/26/2022] Open

Affiliation(s)

Joachim Johansen Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Damian R Plichta Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Jakob Nybo Nissen Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Statens Serum Institut, Viral & Microbial Special diagnostics, Copenhagen, Denmark
Marie Louise Jespersen Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
Shiraz A Shah Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
Ling Deng Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
Jakob Stokholm Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark; Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
Hans Bisgaard Copenhagen Prospective Studies on Asthma in Childhood (COPSAC), Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
Dennis Sandris Nielsen Section of Food Microbiology and Fermentation, Department of Food Science, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
Søren J Sørensen Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
Simon Rasmussen Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

Arisdakessian CG, Nigro OD, Steward GF, Poisson G, Belcaid M. CoCoNet: an efficient deep learning tool for viral metagenome binning. Bioinformatics 2021;37:2803-2810. [PMID: 33822891 DOI: 10.1093/bioinformatics/btab213] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/18/2020] [Revised: 03/24/2021] [Accepted: 04/02/2021] [Indexed: 02/02/2023] Open

Abstract

MOTIVATION

Metagenomic approaches hold the potential to characterize microbial communities and unravel the intricate link between the microbiome and biological processes. Assembly is one of the most critical steps in metagenomics experiments. It consists of transforming overlapping DNA sequencing reads into sufficiently accurate representations of the community's genomes. This process is computationally difficult and commonly results in genomes fragmented across many contigs. Computational binning methods are used to mitigate fragmentation by partitioning contigs based on their sequence composition, abundance or chromosome organization into bins representing the community's genomes. Existing binning methods have been principally tuned for bacterial genomes and do not perform favorably on viral metagenomes.

RESULTS

We propose Composition and Coverage Network (CoCoNet), a new binning method for viral metagenomes that leverages the flexibility and the effectiveness of deep learning to model the co-occurrence of contigs belonging to the same viral genome and provide a rigorous framework for binning viral contigs. Our results show that CoCoNet substantially outperforms existing binning methods on viral datasets.

AVAILABILITY AND IMPLEMENTATION

CoCoNet was implemented in Python and is available for download on PyPi (https://pypi.org/). The source code is hosted on GitHub at https://github.com/Puumanamana/CoCoNet and the documentation is available at https://coconet.readthedocs.io/en/latest/index.html. CoCoNet does not require extensive resources to run. For example, binning 100k contigs took about 4 h on 10 Intel CPU Cores (2.4 GHz), with a memory peak at 27 GB (see Supplementary Fig. S9). To process a large dataset, CoCoNet may need to be run on a high RAM capacity server. Such servers are typically available in high-performance or cloud computing settings.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
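
The abstract above describes binning as partitioning contigs by sequence composition and abundance. As a rough illustration of the composition side only, the sketch below builds a normalized k-mer frequency profile per contig and compares two profiles by Euclidean distance. It is not CoCoNet's implementation (which the abstract describes as a deep learning model); the function names, the choice of k, and the distance measure are assumptions made for this example.

```python
from collections import Counter
from itertools import product
from math import sqrt
from typing import List


def kmer_profile(seq: str, k: int = 4) -> List[float]:
    """Normalized frequency of every A/C/G/T k-mer in seq, in sorted order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def profile_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two k-mer profiles of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical contigs, for illustration only.
contig_a = "ACGTACGTACGTACGT"
contig_b = "ACGTACGTTTGTACGT"
contig_c = "GGGGCCCCGGGGCCCC"
print(profile_distance(kmer_profile(contig_a, 2), kmer_profile(contig_b, 2)))
print(profile_distance(kmer_profile(contig_a, 2), kmer_profile(contig_c, 2)))
```

Contigs from the same genome are expected to have more similar profiles than contigs from different genomes; in practice, binning tools combine such composition features with per-sample coverage, and short contigs give noisy profiles.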

Sakamoto T, Ortega JM. Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree. BMC Bioinformatics 2021;22:388. [PMID: 34325658 PMCID: PMC8323199 DOI: 10.1186/s12859-021-04304-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/21/2021] [Accepted: 07/12/2021] [Indexed: 01/02/2023] Open

Abstract

BACKGROUND

NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases since all organisms with sequence accessions deposited on INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of data as a table would facilitate the use of information for processing bioinformatics data. To do so, since some taxonomic-ranks are missing in some lineages, an algorithm might propose provisional names for all taxonomic-ranks.

RESULTS

To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The procedures performed by the algorithm consist of attempting to assign a taxonomic-rank to an existing clade or "no rank" node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia) or interpolating parent nodes when needed (e.g. Cla_of_Ornithischia), both examples given for the dinosaur Brachylophosaurus lineage. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic-ranks, and it contains 41 hierarchical levels corresponding to the 41 taxonomic-ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete taxonomic lineage with 41 nodes of all taxa available in the NCBI Taxonomy database, without any hazard to the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles.

CONCLUSION

Taxallnomy applies to any bioinformatics analyses that depend on the information from NCBI Taxonomy. Taxallnomy is updated periodically but with a distributed PERL script users can generate it locally using NCBI Taxonomy as input. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy .

Benler S, Koonin EV. Fishing for phages in metagenomes: what do we catch, what do we miss? Curr Opin Virol 2021;49:142-150. [PMID: 34139668 DOI: 10.1016/j.coviro.2021.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/02/2023]

Yuan Z, Ye X, Zhu L, Zhang N, An Z, Zheng WJ. Virome assembly and annotation in brain tissue based on next-generation sequencing. Cancer Med 2020;9:6776-6790. [PMID: 32738030 PMCID: PMC7520322 DOI: 10.1002/cam4.3325] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/20/2019] [Revised: 06/20/2020] [Accepted: 07/01/2020] [Indexed: 12/15/2022] Open

Ledesma J, Williams D, Stanford FA, Hewitt PE, Zuckerman M, Bansal S, Dhawan A, Mbisa JL, Tedder R, Ijaz S. Resolution by deep sequencing of a dual hepatitis E virus infection transmitted via blood components. J Gen Virol 2020;100:1491-1500. [PMID: 31592753 DOI: 10.1099/jgv.0.001302] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022] Open

Whole-Virome Analysis Sheds Light on Viral Dark Matter in Inflammatory Bowel Disease. Cell Host Microbe 2019;26:764-778.e5. [DOI: 10.1016/j.chom.2019.10.009] [Citation(s) in RCA: 162] [Impact Index Per Article: 32.4] [Received: 06/26/2019] [Revised: 09/02/2019] [Accepted: 10/14/2019] [Indexed: 12/18/2022]

Sutton TDS, Hill C. Gut Bacteriophage: Current Understanding and Challenges. Front Endocrinol (Lausanne) 2019;10:784. [PMID: 31849833 PMCID: PMC6895007 DOI: 10.3389/fendo.2019.00784] [Citation(s) in RCA: 96] [Impact Index Per Article: 19.2] [Received: 08/09/2019] [Accepted: 10/28/2019] [Indexed: 12/13/2022] Open

Sutton TDS, Clooney AG, Ryan FJ, Ross RP, Hill C. Choice of assembly software has a critical impact on virome characterisation. Microbiome 2019;7:12. [PMID: 30691529 PMCID: PMC6350398 DOI: 10.1186/s40168-019-0626-5] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 10/16/2018] [Accepted: 01/14/2019] [Indexed: 05/19/2023]

Abstract

BACKGROUND

The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to reference databases. As a result, investigation of the viral metagenome (virome) relies heavily on de novo assembly of short sequencing reads to recover compositional and functional information. Metagenomic assembly is particularly challenging for virome data, often resulting in fragmented assemblies and poor recovery of viral community members. Despite the essential role of assembly in virome analysis and difficulties posed by these data, current assembly comparisons have been limited to subsections of virome studies or bacterial datasets.

DESIGN

This study presents the most comprehensive virome assembly comparison to date, featuring 16 metagenomic assembly approaches which have featured in human virome studies. Assemblers were assessed using four independent virome datasets, namely, simulated reads, two mock communities, viromes spiked with a known phage and human gut viromes.

RESULTS

Assembly performance varied significantly across all test datasets, with SPAdes (meta) performing consistently well. Performance of MIRA and VICUNA varied, highlighting the importance of using a range of datasets when comparing assembly programs. It was also found that while some assemblers addressed the challenges of virome data better than others, all assemblers had limitations. Low read coverage and genomic repeats resulted in assemblies with poor genome recovery, high degrees of fragmentation and low-accuracy contigs across all assemblers. These limitations must be considered when setting thresholds for downstream analysis and when drawing conclusions from virome data.

Papudeshi B, Haggerty JM, Doane M, Morris MM, Walsh K, Beattie DT, Pande D, Zaeri P, Silva GGZ, Thompson F, Edwards RA, Dinsdale EA. Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes. BMC Genomics 2017;18:915. [PMID: 29183281 PMCID: PMC5706307 DOI: 10.1186/s12864-017-4294-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 06/08/2017] [Accepted: 11/13/2017] [Indexed: 11/12/2022] Open

Abstract

Background

Microbiome/host interactions describe characteristics that affect the host's health. Shotgun metagenomics includes sequencing a random subset of the microbiome to analyze its taxonomic and metabolic potential. Reconstruction of DNA fragments into genomes from metagenomes (called metagenome-assembled genomes) assigns unknown fragments to taxa/function and facilitates discovery of novel organisms. Genome reconstruction incorporates sequence assembly and sorting of assembled sequences into bins, characteristic of a genome. However, the microbial community composition, including taxonomic and phylogenetic diversity may influence genome reconstruction. We determine the optimal reconstruction method for four microbiome projects that had variable sequencing platforms (IonTorrent and Illumina), diversity (high or low), and environment (coral reefs and kelp forests), using a set of parameters to select for optimal assembly and binning tools.

Methods

We tested the effects of the assembly and binning processes on population genome reconstruction using 105 marine metagenomes from 4 projects. Reconstructed genomes were obtained from each project using 3 assemblers (IDBA, MetaVelvet, and SPAdes) and 2 binning tools (GroopM and MetaBat). We assessed the efficiency of the assemblers using statistics including contig continuity and contig chimerism, and the effectiveness of the binning tools using genome completeness and taxonomic identification.

Results

We concluded that SPAdes assembled more contigs (143,718 ± 124 contigs) of longer length (N50 = 1632 ± 108 bp) and incorporated the most sequences (sequences-assembled = 19.65%). The microbial richness and evenness were maintained across the assembly, suggesting a low rate of contig chimerism. SPAdes assembly was responsive to the biological and technological variations within the project, compared with other assemblers. Among binning tools, we conclude that MetaBat produced bins with less variation in GC content (average standard deviation: 1.49), low species richness (4.91 ± 0.66), and higher genome completeness (40.92 ± 1.75) across all projects. MetaBat extracted 115 bins from the 4 projects, of which 66 bins were identified as reconstructed metagenome-assembled genomes with sequences belonging to a specific genus. We identified 13 novel genomes, some of which were 100% complete but showed low similarity to genomes within databases.

Conclusions

In conclusion, we present a set of biologically relevant parameters for evaluating and selecting optimal assembly and binning tools. For the tools we tested, the SPAdes assembler and the MetaBat binning tool reconstructed quality metagenome-assembled genomes for the four projects. We also conclude that metagenomes from microbial communities that have high coverage of phylogenetically distinct taxa and low taxonomic diversity result in the highest-quality metagenome-assembled genomes.

Electronic supplementary material

The online version of this article (10.1186/s12864-017-4294-1) contains supplementary material, which is available to authorized users.
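The Results above report contig continuity as an N50 value (N50 = 1632 ± 108 bp). As a minimal sketch of how that statistic is usually computed, assuming only a list of contig lengths as input (the function name `n50` is illustrative and not taken from the cited study):

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, walking from the longest
    contig downwards, the running total first reaches at least half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy assembly: total length 32 bp, so half is 16 bp.
# The two 8 bp contigs already reach 16 bp, so N50 is 8.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly read as a proxy for lower fragmentation.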

François S, Filloux D, Frayssinet M, Roumagnac P, Martin DP, Ogliastro M, Froissart R. Increase in taxonomic assignment efficiency of viral reads in metagenomic studies. Virus Res 2017;244:230-234. [PMID: 29154906 DOI: 10.1016/j.virusres.2017.11.011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/27/2017] [Revised: 11/10/2017] [Accepted: 11/10/2017] [Indexed: 12/17/2022]

Roux S, Emerson JB, Eloe-Fadrosh EA, Sullivan MB. Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity. PeerJ 2017;5:e3817. [PMID: 28948103 PMCID: PMC5610896 DOI: 10.7717/peerj.3817] [Citation(s) in RCA: 170] [Impact Index Per Article: 24.3] [Received: 06/24/2017] [Accepted: 08/26/2017] [Indexed: 12/20/2022] Open

Abstract

Background

Viral metagenomics (viromics) is increasingly used to obtain uncultivated viral genomes, evaluate community diversity, and assess ecological hypotheses. While viromic experimental methods are relatively mature and widely accepted by the research community, robust bioinformatics standards remain to be established. Here we used in silico mock viral communities to evaluate the viromic sequence-to-ecological-inference pipeline, including (i) read pre-processing and metagenome assembly, (ii) thresholds applied to estimate viral relative abundances based on read mapping to assembled contigs, and (iii) normalization methods applied to the matrix of viral relative abundances for alpha and beta diversity estimates.

Results

Tools specifically designed for metagenomes, namely metaSPAdes, MEGAHIT, and IDBA-UD, were the most effective at assembling viromes. Read pre-processing, such as partitioning, had virtually no impact on assembly output, but may be useful when hardware is limited. Viral populations with 2–5× coverage typically assembled well, whereas lower coverage led to fragmented assembly. Strain heterogeneity within populations hampered assembly, especially when strains were closely related (average nucleotide identity, or ANI ≥97%) and when the most abundant strain represented <50% of the population. Viral community composition assessments based on read recruitment were generally accurate when the following thresholds for detection were applied: (i) ≥10 kb contig lengths to define populations, (ii) coverage defined from reads mapping at ≥90% identity, and (iii) ≥75% of contig length with ≥1× coverage. Finally, although data are limited to the most abundant viruses in a community, alpha and beta diversity patterns were robustly estimated (±10%) when comparing samples of similar sequencing depth, but were more divergent (up to 80%) when sequencing depth was uneven across the dataset. In the latter cases, the use of normalization methods specifically developed for metagenomes provided the best estimates.

Conclusions

These simulations provide benchmarks for selecting analysis cut-offs and establish that an optimized sample-to-ecological-inference viromics pipeline is robust for making ecological inferences from natural viral communities. Continued development to better access RNA, rare, and/or diverse viral populations, together with improved reference viral genome availability, will alleviate many of viromics' remaining limitations.
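The three detection thresholds listed in the Results of this abstract (contigs ≥10 kb, reads mapping at ≥90% identity, and ≥75% of the contig length covered at ≥1×) can be combined into a single check. The following is a minimal sketch under stated assumptions: the alignment representation (`(start, end, identity)` tuples with 0-based, half-open coordinates) and the names `covered_bp` and `is_detected` are hypothetical and are not taken from the authors' pipeline.

```python
MIN_CONTIG_LENGTH = 10_000    # (i) contigs of at least 10 kb define populations
MIN_READ_IDENTITY = 0.90      # (ii) only reads mapping at >= 90% identity count
MIN_COVERED_FRACTION = 0.75   # (iii) >= 75% of the contig covered at >= 1x


def covered_bp(alignments, min_identity=MIN_READ_IDENTITY):
    """Count contig positions covered by at least one qualifying read.

    `alignments` is an iterable of (start, end, identity) tuples, an assumed
    format with 0-based, half-open coordinates on the contig.
    """
    intervals = sorted(
        (start, end) for start, end, identity in alignments
        if identity >= min_identity
    )
    covered = 0
    last_end = 0
    for start, end in intervals:
        start = max(start, last_end)  # skip positions already counted
        if end > start:
            covered += end - start
            last_end = end
    return covered


def is_detected(contig_length, alignments):
    """Apply the three thresholds to one contig and its read alignments."""
    if contig_length < MIN_CONTIG_LENGTH:
        return False
    return covered_bp(alignments) / contig_length >= MIN_COVERED_FRACTION


# Toy example: a 12 kb contig where the third read falls below 90% identity.
toy_alignments = [(0, 5000, 0.95), (4000, 9500, 0.92), (9500, 12000, 0.80)]
toy_detected = is_detected(12_000, toy_alignments)
```

In the toy example the two qualifying reads cover 9,500 of 12,000 bp (about 79%), so the contig passes; in practice the alignments would come from a read mapper's output rather than hand-written tuples.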

Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res 2017;27:824-834. [PMID: 28298430 PMCID: PMC5411777 DOI: 10.1101/gr.213959.116] [Citation(s) in RCA: 2086] [Impact Index Per Article: 298.0] [Received: 08/01/2016] [Accepted: 03/13/2017] [Indexed: 01/25/2023]

Hesse U, van Heusden P, Kirby BM, Olonade I, van Zyl LJ, Trindade M. Virome Assembly and Annotation: A Surprise in the Namib Desert. Front Microbiol 2017;8:13. [PMID: 28167933 PMCID: PMC5253355 DOI: 10.3389/fmicb.2017.00013] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/26/2016] [Accepted: 01/03/2017] [Indexed: 11/13/2022] Open

Cobián Güemes AG, Youle M, Cantú VA, Felts B, Nulton J, Rohwer F. Viruses as Winners in the Game of Life. Annu Rev Virol 2016;3:197-214. [DOI: 10.1146/annurev-virology-100114-054952] [Citation(s) in RCA: 158] [Impact Index Per Article: 19.8] [Indexed: 11/09/2022]