51
|
Adler A, Poirier S, Pagni M, Maillard J, Holliger C. Disentangle genus microdiversity within a complex microbial community by using a multi-distance long-read binning method: example of Candidatus Accumulibacter. Environ Microbiol 2022; 24:2136-2156. [PMID: 35315560 PMCID: PMC9311429 DOI: 10.1111/1462-2920.15947] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/16/2021] [Accepted: 02/19/2022] [Indexed: 11/26/2022]
Abstract
Complete genomes can be recovered from metagenomes by assembling and binning DNA sequences into metagenome assembled genomes (MAGs). Yet, the presence of microdiversity can hamper the assembly and binning processes, possibly yielding chimeric, highly fragmented and incomplete genomes. Here, the metagenomes of four samples of aerobic granular sludge bioreactors containing Candidatus (Ca.) Accumulibacter, a phosphate-accumulating organism of interest for wastewater treatment, were sequenced with both PacBio and Illumina. Different strategies of genome assembly and binning were investigated, including published protocols and a binning procedure adapted to the binning of long contigs (MuLoBiSC). Multiple criteria were considered to select the best strategy for Ca. Accumulibacter, whose multiple strains in every sample represent a challenging microdiversity. In this case, the best strategy relies on long-read only assembly and a custom binning procedure including MuLoBiSC in metaWRAP. Several high-quality Ca. Accumulibacter MAGs, including a novel species, were obtained independently from different samples. Comparative genomic analysis showed that MAGs retrieved in different samples harbour genomic rearrangements in addition to accumulation of point mutations. The microdiversity of Ca. Accumulibacter, likely driven by mobile genetic elements, causes major difficulties in recovering MAGs, but it is also a hallmark of the panmictic lifestyle of these bacteria.
Affiliation(s)
- Aline Adler
- Laboratory for Environmental Biotechnology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Simon Poirier
- Laboratory for Environmental Biotechnology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Marco Pagni
- Laboratory for Environmental Biotechnology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Julien Maillard
- Laboratory for Environmental Biotechnology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- IFP Energie nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison Cedex, France
- Christof Holliger
- Laboratory for Environmental Biotechnology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
52
|
Translational multi-omics microbiome research for strategies to improve cattle production and health. Emerg Top Life Sci 2022; 6:201-213. [PMID: 35311904 DOI: 10.1042/etls20210257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/28/2021] [Revised: 02/23/2022] [Accepted: 03/01/2022] [Indexed: 12/27/2022]
Abstract
The cattle microbiome plays a vital role in cattle growth and performance and affects many economically important traits such as feed efficiency, milk/meat yield and quality, methane emission, immunity and health. To date, most cattle microbiome research has focused on metataxonomic and metagenomic characterization to reveal which microbes are present and what they may do, which has prevented the determination of the active functional dynamics in vivo and their causal relationships with the traits. Therefore, there is an urgent need to combine other advanced omics approaches to improve microbiome analysis and determine modes of action and host-microbiome interactions in vivo. This review will critically discuss the current multi-omics microbiome research in beef and dairy cattle, aiming to provide insights on how the information generated can be applied to future strategies to improve production efficiency, health and welfare, and environment-friendliness in cattle production through microbiome manipulations.
|
53
|
Ventolero MF, Wang S, Hu H, Li X. Computational analyses of bacterial strains from shotgun reads. Brief Bioinform 2022; 23:6524011. [PMID: 35136954 DOI: 10.1093/bib/bbac013] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/04/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 12/21/2022]
Abstract
Shotgun sequencing is routinely employed to study bacteria in microbial communities. With the vast amount of shotgun sequencing reads generated in a metagenomic project, it is crucial to determine the microbial composition at the strain level. This study investigated 20 computational tools that attempt to infer bacterial strain genomes from shotgun reads. For the first time, we discussed the methodology behind these tools. We also systematically evaluated six novel-strain-targeting tools on the same datasets and found that BHap, mixtureS and StrainFinder performed better than other tools. Because the performance of the best tools is still suboptimal, we discussed future directions that may address the limitations.
Affiliation(s)
- Saidi Wang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Haiyan Hu
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
- Xiaoman Li
- Burnett School of Biomedical Science, University of Central Florida, Orlando, FL 32816, USA
|
54
|
Vuong P, Wise MJ, Whiteley AS, Kaur P. Small investments with big returns: environmental genomic bioprospecting of microbial life. Crit Rev Microbiol 2022; 48:641-655. [PMID: 35100064 DOI: 10.1080/1040841x.2021.2011833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Microorganisms and their natural products are major drivers of ecological processes and industrial applications. Microbial bioprospecting has been critical for the advancement in various fields such as pharmaceuticals, sustainable industries, food security and bioremediation. Next generation sequencing has been paramount in the exploration of diverse environmental microbiomes. It presents a culture-independent approach to investigating hitherto uncultured taxa, resulting in the creation of massive sequence databases, which are available in the public domain. Genome mining searches available (meta)genomic data for target biosynthetic genes, and combined with the large-scale public data, this in-silico bioprospecting method presents an efficient and extensive way to uncover microbial bioproducts. Bioinformatic tools have progressed to a stage where we can recover genomes from the environment; these metagenome-assembled genomes present a way to understand the metabolic capacity of microorganisms in a physiological and ecological context. Environmental sampling has been extensive across various ecological settings, including microbiomes with unique physicochemical properties that could influence the discovery of novel functions and metabolic pathways. Although in-silico methods cannot completely substitute in-vitro studies, the contextual information they provide is invaluable for understanding the ecological and taxonomic distribution of microbial genotypes and for forming effective strategies for future microbial bioprospecting efforts.
Affiliation(s)
- Paton Vuong
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
- Michael J Wise
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, Australia
- Andrew S Whiteley
- Centre for Environment & Life Sciences, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Floreat, Australia
- Parwinder Kaur
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
|
55
|
Dai D, Brown C, Bürgmann H, Larsson DGJ, Nambi I, Zhang T, Flach CF, Pruden A, Vikesland PJ. Long-read metagenomic sequencing reveals shifts in associations of antibiotic resistance genes with mobile genetic elements from sewage to activated sludge. Microbiome 2022; 10:20. [PMID: 35093160 PMCID: PMC8801152 DOI: 10.1186/s40168-021-01216-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 12/08/2021] [Accepted: 12/13/2021] [Indexed: 05/10/2023]
Abstract
BACKGROUND: There is concern that the microbially rich activated sludge environment of wastewater treatment plants (WWTPs) may contribute to the dissemination of antibiotic resistance genes (ARGs). We applied long-read (nanopore) sequencing to profile ARGs and their neighboring genes to illuminate their fate in the activated sludge treatment by comparing their abundance, genetic locations, mobility potential, and bacterial hosts within activated sludge relative to those in influent sewage across five WWTPs from three continents. RESULTS: The abundances (gene copies per Gb of reads, aka gc/Gb) of all ARGs and those carried by putative pathogens decreased 75-90% from influent sewage (192-605 gc/Gb) to activated sludge (31-62 gc/Gb) at all five WWTPs. Long reads enabled quantification of the percent abundance of ARGs with mobility potential (i.e., located on plasmids or co-located with other mobile genetic elements (MGEs)). The abundance of plasmid-associated ARGs decreased at four of five WWTPs (from 40-73 to 31-68%), and ARGs co-located with transposable, integrative, and conjugative element hallmark genes showed similar trends. Most ARG-associated elements decreased 0.35-13.52% while integrative and transposable elements displayed slight increases at two WWTPs (1.4-2.4%). While resistome and taxonomic compositions both shifted significantly, host phyla for chromosomal ARG classes remained relatively consistent, indicating vertical gene transfer via active biomass growth in activated sludge as the key pathway of chromosomal ARG dissemination. CONCLUSIONS: Overall, our results suggest that the activated sludge process acted as a barrier against the proliferation of most ARGs, while those that persisted or increased warrant further attention.
Affiliation(s)
- Dongjuan Dai
- Department of Civil and Environmental Engineering, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Connor Brown
- Department of Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Helmut Bürgmann
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
- D G Joakim Larsson
- Institute of Biomedicine, Department of Infectious Diseases, University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Indumathi Nambi
- Department of Civil Engineering, Indian Institute of Technology, Madras, India
- Tong Zhang
- Department of Civil Engineering, The University of Hong Kong, Hong Kong SAR, China
- Carl-Fredrik Flach
- Institute of Biomedicine, Department of Infectious Diseases, University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Amy Pruden
- Department of Civil and Environmental Engineering, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Peter J Vikesland
- Department of Civil and Environmental Engineering, Virginia Polytechnic and State University, Blacksburg, VA, USA
|
56
|
Trigodet F, Lolans K, Fogarty E, Shaiber A, Morrison HG, Barreiro L, Jabri B, Eren AM. High molecular weight DNA extraction strategies for long-read sequencing of complex metagenomes. Mol Ecol Resour 2022; 22:1786-1802. [PMID: 35068060 PMCID: PMC9177515 DOI: 10.1111/1755-0998.13588] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 05/10/2021] [Revised: 12/10/2021] [Accepted: 01/14/2022] [Indexed: 11/28/2022]
Abstract
By offering extremely long contiguous characterization of individual DNA molecules, rapidly emerging long‐read sequencing strategies offer comprehensive insights into the organization of genetic information in genomes and metagenomes. However, successful long‐read sequencing experiments demand high concentrations of highly purified DNA of high molecular weight (HMW), which limits the utility of established DNA extraction kits designed for short‐read sequencing. The challenges associated with input DNA quality intensify further when working with complex environmental samples of low microbial biomass, which requires new protocols that are tailored to study metagenomes with long‐read sequencing. Here, we use human tongue scrapings to benchmark six HMW DNA extraction strategies that are based on commercially available kits, phenol–chloroform (PC) extraction and agarose encasement followed by agarase digestion. A typical end goal of HMW DNA extractions is to obtain the longest possible reads during sequencing, which is often achieved by PC extractions, as demonstrated in sequencing of cultured cells. Yet our analyses that consider overall read‐size distribution, assembly performance and the number of circularized elements found in sequencing results suggest that column‐based kits with enzyme supplementation, rather than PC methods, may be more appropriate for long‐read sequencing of metagenomes.
Affiliation(s)
- Florian Trigodet
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
- Karen Lolans
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
- Emily Fogarty
- Committee on Microbiology, University of Chicago, Chicago, IL, 60637, USA
- Alon Shaiber
- BioPhysical Sciences Program, The University of Chicago, Chicago, IL, 60637, USA
- Hilary G Morrison
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, 02543, USA
- Luis Barreiro
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
- Bana Jabri
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
- A Murat Eren
- Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA
- Committee on Microbiology, University of Chicago, Chicago, IL, 60637, USA
- BioPhysical Sciences Program, The University of Chicago, Chicago, IL, 60637, USA
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, 02543, USA
|
57
|
Garber AI, Zehnpfennig JR, Sheik CS, Henson MW, Ramírez GA, Mahon AR, Halanych KM, Learman DR. Metagenomics of Antarctic Marine Sediment Reveals Potential for Diverse Chemolithoautotrophy. mSphere 2021; 6:e0077021. [PMID: 34817234 PMCID: PMC8612310 DOI: 10.1128/msphere.00770-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/12/2021] [Accepted: 11/10/2021] [Indexed: 11/30/2022]
Abstract
The microbial biogeochemical processes occurring in marine sediment in Antarctica remain underexplored due to limited access. Further, these polar habitats are unique, as they are being exposed to significant changes in their climate. To explore how microbes drive biogeochemistry in these sediments, we performed a shotgun metagenomic survey of marine surficial sediment (0 to 3 cm of the seafloor) collected from 13 locations in western Antarctica and assembled 16 high-quality metagenome assembled genomes for focused interrogation of the lifestyles of some abundant lineages. We observe an abundance of genes from pathways for the utilization of reduced carbon, sulfur, and nitrogen sources. Although organotrophy is pervasive, nitrification and sulfide oxidation are the dominant lithotrophic pathways and likely fuel carbon fixation via the reverse tricarboxylic acid and Calvin cycles. Oxygen-dependent terminal oxidases are common, and genes for reduction of oxidized nitrogen are sporadically present in our samples. Our results suggest that the underlying benthic communities are well primed for the utilization of settling organic matter, which is consistent with findings from highly productive surface water. Despite the genetic potential for nitrate reduction, the net catabolic pathway in our samples remains aerobic respiration, likely coupled to the oxidation of sulfur and nitrogen imported from the highly productive Antarctic water column above. IMPORTANCE The impacts of climate change in polar regions, like Antarctica, have the potential to alter numerous ecosystems and biogeochemical cycles. Increasing temperature and freshwater runoff from melting ice can have profound impacts on the cycling of organic and inorganic nutrients between the pelagic and benthic ecosystems. Within the benthos, sediment microbial communities play a critical role in carbon mineralization and the cycles of essential nutrients like nitrogen and sulfur. Metagenomic data collected from sediment samples from the continental shelf of western Antarctica help to examine this unique system and document the metagenomic potential for lithotrophic metabolisms and the cycles of both nitrogen and sulfur, which support not only benthic microbes but also life in the pelagic zone.
Affiliation(s)
- Arkadiy I. Garber
- Biodesign Center for Mechanisms for Evolution, Arizona State University, Tempe, Arizona, USA
- Cody S. Sheik
- Biology Department and Large Lakes Observatory, University of Minnesota Duluth, Duluth, Minnesota, USA
- Michael W. Henson
- Department of Biology, Central Michigan University, Mt. Pleasant, Michigan, USA
- Gustavo A. Ramírez
- College of Veterinary Medicine, Western University of Health Sciences, Pomona, California, USA
- Department of Marine Biology, Haifa University, Haifa, Israel
- Andrew R. Mahon
- Department of Biology, Central Michigan University, Mt. Pleasant, Michigan, USA
- Kenneth M. Halanych
- Center for Marine Science, University of North Carolina Wilmington, Wilmington, North Carolina, USA
- Deric R. Learman
- Department of Biology, Central Michigan University, Mt. Pleasant, Michigan, USA
|
58
|
Behera BK, Dehury B, Rout AK, Patra B, Mantri N, Chakraborty HJ, Sarkar DJ, Kaushik NK, Bansal V, Singh I, Das BK, Rao AR, Rai A. Metagenomics study in aquatic resource management: Recent trends, applied methodologies and future needs. Gene Reports 2021. [DOI: 10.1016/j.genrep.2021.101372] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022]
|
59
|
Robinson SL, Piel J, Sunagawa S. A roadmap for metagenomic enzyme discovery. Nat Prod Rep 2021; 38:1994-2023. [PMID: 34821235 PMCID: PMC8597712 DOI: 10.1039/d1np00006c] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 01/31/2021] [Indexed: 12/13/2022]
Abstract
Covering: up to 2021. Metagenomics has yielded massive amounts of sequencing data offering a glimpse into the biosynthetic potential of the uncultivated microbial majority. While genome-resolved information about microbial communities from nearly every environment on earth is now available, the ability to accurately predict biocatalytic functions directly from sequencing data remains challenging. Compared to primary metabolic pathways, enzymes involved in secondary metabolism often catalyze specialized reactions with diverse substrates, making these pathways rich resources for the discovery of new enzymology. To date, functional insights gained from studies on environmental DNA (eDNA) have largely relied on PCR- or activity-based screening of eDNA fragments cloned in fosmid or cosmid libraries. As an alternative, shotgun metagenomics holds underexplored potential for the discovery of new enzymes directly from eDNA by avoiding common biases introduced through PCR- or activity-guided functional metagenomics workflows. However, inferring new enzyme functions directly from eDNA is similar to searching for a 'needle in a haystack' without direct links between genotype and phenotype. The goal of this review is to provide a roadmap to navigate shotgun metagenomic sequencing data and identify new candidate biosynthetic enzymes. We cover both computational and experimental strategies to mine metagenomes and explore protein sequence space with a spotlight on natural product biosynthesis. Specifically, we compare in silico methods for enzyme discovery including phylogenetics, sequence similarity networks, genomic context, 3D structure-based approaches, and machine learning techniques. We also discuss various experimental strategies to test computational predictions including heterologous expression and screening. Finally, we provide an outlook for future directions in the field with an emphasis on meta-omics, single-cell genomics, cell-free expression systems, and sequence-independent methods.
Affiliation(s)
- Jörn Piel
- Eidgenössische Technische Hochschule (ETH), Zürich, Switzerland
|
60
|
Ma T, McAllister TA, Guan LL. A review of the resistome within the digestive tract of livestock. J Anim Sci Biotechnol 2021; 12:121. [PMID: 34763729 PMCID: PMC8588621 DOI: 10.1186/s40104-021-00643-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/01/2021] [Accepted: 10/07/2021] [Indexed: 12/25/2022]
Abstract
Antimicrobials have been widely used to prevent and treat infectious diseases and promote growth in food-production animals. However, the occurrence of antimicrobial resistance poses a huge threat to public and animal health, especially in less developed countries where food-producing animals often intermingle with humans. To limit the spread of antimicrobial resistance from food-production animals to humans and the environment, it is essential to have a comprehensive knowledge of the role of the resistome in antimicrobial resistance (AMR). The resistome refers to the collection of all antimicrobial resistance genes associated with microbiota in a given environment. The dense microbiota in the digestive tract is known to harbour one of the most diverse resistomes in nature. Studies of the resistome in the digestive tract of humans and animals are increasing exponentially as a result of advancements in next-generation sequencing and the expansion of bioinformatic resources/tools to identify and describe the resistome. In this review, we outline the various tools/bioinformatic pipelines currently available to characterize and understand the nature of the intestinal resistome of swine, poultry, and ruminants. We then propose future research directions, including analysis of the resistome using long-read sequencing and investigation of the role of mobile genetic elements in the expression, function and transmission of AMR. This review outlines the current knowledge and approaches to studying the resistome in food-producing animals and sheds light on future strategies to reduce antimicrobial usage and control the spread of AMR both within and from livestock production systems.
Affiliation(s)
- Tao Ma
- Key laboratory of Feed Biotechnology of the Ministry of Agriculture, Institute of Feed Research, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Department of Agricultural, Food and Nutritional Science, University of Alberta, T6G2P5, Edmonton, AB, Canada
- Tim A McAllister
- Lethbridge Research and Development Centre, Lethbridge, AB, T1J 4P4, Canada
- Le Luo Guan
- Department of Agricultural, Food and Nutritional Science, University of Alberta, T6G2P5, Edmonton, AB, Canada
|
61
|
Reconstruction of evolving gene variants and fitness from short sequencing reads. Nat Chem Biol 2021; 17:1188-1198. [PMID: 34635842 PMCID: PMC8551035 DOI: 10.1038/s41589-021-00876-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/25/2020] [Accepted: 08/09/2021] [Indexed: 12/23/2022]
Abstract
Directed evolution can generate proteins with tailor-made activities. However, full-length genotypes, their frequencies and fitnesses are difficult to measure for evolving gene-length biomolecules using most high-throughput DNA sequencing methods, as short read lengths can lose mutation linkages in haplotypes. Here we present Evoracle, a machine learning method that accurately reconstructs full-length genotypes (R² = 0.94) and fitness using short-read data from directed evolution experiments, with substantial improvements over related methods. We validate Evoracle on phage-assisted continuous evolution (PACE) and phage-assisted non-continuous evolution (PANCE) of adenine base editors and OrthoRep evolution of drug-resistant enzymes. Evoracle retains strong performance (R² = 0.86) on data with complete linkage loss between neighboring nucleotides and large measurement noise, such as pooled Sanger sequencing data (~US$10 per timepoint), and broadens the accessibility of training machine learning models on gene variant fitnesses. Evoracle can also identify high-fitness variants, including low-frequency 'rising stars', well before they are identifiable from consensus mutations.
|
62
|
Wan XH. Artificial intelligence reveals roles of gut microbiota in driving human colorectal cancer evolution. Artif Intell Cancer 2021; 2:69-78. [DOI: 10.35713/aic.v2.i5.69] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Revised: 10/24/2021] [Accepted: 10/27/2021] [Indexed: 02/06/2023]
Abstract
With the rapid development of high-throughput sequencing and artificial intelligence (AI) techniques, the gut mucosal microbiota are beginning to be recognized as critical drivers of human colorectal cancer (CRC). Various AI approaches have been designed to obtain effective information from the enormous numbers of microbial cells residing in the gut mucosa as well as from cancer cells. These mainly include detection of microbial markers for early clinical diagnosis of stage-specific CRC, characterization of pathogenic bacterial activities via genomic and transcriptomic analyses, and prediction of interplay between bacterial drivers and host immune systems. Here I review the current progress of AI applications in profiling gut microbiomes linked to CRC initiation and development. I further look forward to future AI research for improving our understanding of the roles of gut microbiota in CRC evolution.
Affiliation(s)
- Xue-Hua Wan
- TEDA Institute of Biological Sciences and Biotechnology, Nankai University, Tianjin 300457, China
|
63
|
Abstract
Xylella fastidiosa (Xf) is a globally distributed plant-pathogenic bacterium. The primary control strategy for Xf diseases is eradicating infected plants; therefore, timely and accurate detection is necessary to prevent crop losses and further pathogen dispersal. Conventional Xf diagnostics primarily relies on quantitative PCR (qPCR) assays. However, these methods do not consider new or emerging variants due to pathogen genetic recombination and sensitivity limitations. We developed and tested a metagenomics pipeline using in-house short-read sequencing as a complementary approach for affordable, fast, and highly accurate Xf detection. We used metagenomics to identify Xf to the strain level in single- and mixed-infected plant samples at concentrations as low as 1 pg of bacterial DNA per gram of tissue. We also tested naturally infected samples from various plant species originating from Europe and the United States. We identified Xf subspecies in samples previously considered inconclusive with real-time PCR (quantification cycle [Cq], >35). Overall, we showed the versatility of the pipeline by using different plant hosts and DNA extraction methods. Our pipeline provides taxonomic and functional information for Xf diagnostics without extensive knowledge of the disease. This pipeline demonstrates that metagenomics can be used for early detection of Xf and incorporated as a tool to inform disease management strategies. IMPORTANCE Destructive Xylella fastidiosa (Xf) outbreaks in Europe highlight this pathogen’s capacity to expand its host range and geographical distribution. The current disease diagnostic approaches are limited by a multiple-step process, biases to known sequences, and detection limits. We developed a low-cost, user-friendly metagenomic sequencing tool for Xf detection. In less than 3 days, we were able to identify Xf subspecies and strains in field-collected samples. Overall, our pipeline is a diagnostics tool that could be easily extended to other plant-pathogen interactions and implemented for emerging plant threat surveillance.
|
64
|
Trindade M, Sithole N, Kubicki S, Thies S, Burger A. Screening Strategies for Biosurfactant Discovery. Advances in Biochemical Engineering/Biotechnology 2021; 181:17-52. [PMID: 34518910 DOI: 10.1007/10_2021_174] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
The isolation and screening of bacteria and fungi for the production of surface-active compounds has been the basis for the majority of the biosurfactants discovered to date. Hence, a wide variety of well-established and relatively simple methods are available for screening, mostly focused on the detection of surface or interfacial activity of the culture supernatant. However, the success of any biodiscovery effort, specifically aiming to access novelty, relies directly on the characteristics being screened for and the uniqueness of the microorganisms being screened. Therefore, given that rather few novel biosurfactant structures have been discovered during the last decade, advanced strategies are now needed to widen access to novel chemistries and properties. In addition, more modern Omics technologies should be considered alongside the traditional culture-based approaches for biosurfactant discovery. This chapter summarizes the screening methods and strategies typically used for the discovery of biosurfactants and highlights some of the Omics-based approaches that have resulted in the discovery of unique biosurfactants. These studies illustrate the potentially enormous diversity that has yet to be unlocked and how we can begin to tap into these biological resources.
Affiliation(s)
- Marla Trindade
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Cape Town, South Africa.
| | - Nombuso Sithole
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Cape Town, South Africa
| | - Sonja Kubicki
- Institute of Molecular Enzyme Technology, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
| | - Stephan Thies
- Institute of Molecular Enzyme Technology, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
| | - Anita Burger
- Institute for Microbial Biotechnology and Metagenomics, University of the Western Cape, Cape Town, South Africa
|
65
|
Kayani MUR, Huang W, Feng R, Chen L. Genome-resolved metagenomics using environmental and clinical samples. Brief Bioinform 2021; 22:bbab030. [PMID: 33758906 PMCID: PMC8425419 DOI: 10.1093/bib/bbab030] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2020] [Revised: 11/29/2020] [Accepted: 01/20/2021] [Indexed: 12/25/2022] Open
Abstract
Recent advances in high-throughput sequencing technologies and computational methods have added a new dimension to metagenomic data analysis i.e. genome-resolved metagenomics. In general terms, it refers to the recovery of draft or high-quality microbial genomes and their taxonomic classification and functional annotation. In recent years, several studies have utilized the genome-resolved metagenome analysis approach and identified previously unknown microbial species from human and environmental metagenomes. In this review, we describe genome-resolved metagenome analysis as a series of four necessary steps: (i) preprocessing of the sequencing reads, (ii) de novo metagenome assembly, (iii) genome binning and (iv) taxonomic and functional analysis of the recovered genomes. For each of these four steps, we discuss the most commonly used tools and the currently available pipelines to guide the scientific community in the recovery and subsequent analyses of genomes from any metagenome sample. Furthermore, we also discuss the tools required for validation of assembly quality as well as for improving quality of the recovered genomes. We also highlight the currently available pipelines that can be used to automate the whole analysis without having advanced bioinformatics knowledge. Finally, we will highlight the most widely adapted and actively maintained tools and pipelines that can be helpful to the scientific community in decision making before they commence the analysis.
Affiliation(s)
- Masood ur Rehman Kayani
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
| | - Wanqiu Huang
- Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 200,000, China
| | - Ru Feng
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
| | - Lei Chen
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
|
66
|
Chen YH, Chiang PW, Rogozin DY, Degermendzhy AG, Chiu HH, Tang SL. Salvaging high-quality genomes of microbial species from a meromictic lake using a hybrid sequencing approach. Commun Biol 2021; 4:996. [PMID: 34426638 PMCID: PMC8382752 DOI: 10.1038/s42003-021-02510-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Accepted: 08/01/2021] [Indexed: 11/08/2022] Open
Abstract
Most of Earth's bacteria have yet to be cultivated. The metabolic and functional potentials of these uncultivated microorganisms thus remain mysterious, and the metagenome-assembled genome (MAG) approach is the most robust method for uncovering these potentials. However, MAGs discovered by conventional metagenomic assembly and binning are usually highly fragmented genomes with heterogeneous sequence contamination. In this study, we combined Illumina and Nanopore data to develop a new workflow to reconstruct 233 MAGs (six novel bacterial orders, 20 families, 66 genera, and 154 species) from Lake Shunet, a secluded meromictic lake in Siberia. With our workflow, the average N50 of reconstructed MAGs increased 10- to 40-fold compared with the conventional Illumina assembly and binning method. More importantly, six complete MAGs were recovered from our datasets. The recovery of 154 novel species MAGs from a rarely explored lake greatly expands the current bacterial genome encyclopedia.
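The N50 statistic cited in this abstract is a standard measure of assembly contiguity: the length L such that contigs of length at least L together cover at least half of the total assembled bases. As a rough illustration only, it could be computed as in the following sketch. The function name `n50` and the example contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths (in kb) for the same 1,000 kb genome bin
fragmented = [10] * 100              # many short contigs
contiguous = [400, 300, 200, 100]    # a few long contigs

fold_change = n50(contiguous) / n50(fragmented)  # 300 / 10
```

With these made-up inputs the two bins have the same total size, but the N50 rises from 10 kb to 300 kb, which is the kind of fold change the abstract reports.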
Affiliation(s)
- Yu-Hsiang Chen
- Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei, Taiwan
- Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
| | - Pei-Wen Chiang
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
| | - Denis Yu Rogozin
- Institute of Biophysics, Siberian Branch of Russian Academy of Sciences, Krasnoyarsk, Russia
- Siberian Federal University, Krasnoyarsk, Russia
| | - Andrey G Degermendzhy
- Institute of Biophysics, Siberian Branch of Russian Academy of Sciences, Krasnoyarsk, Russia
| | - Hsiu-Hui Chiu
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
| | - Sen-Lin Tang
- Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan.
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan.
|
67
|
Nisrina L, Effendi Y, Pancoro A. Revealing the role of Plant Growth Promoting Rhizobacteria in suppressive soils against Fusarium oxysporum f.sp. cubense based on metagenomic analysis. Heliyon 2021; 7:e07636. [PMID: 34401567 PMCID: PMC8353484 DOI: 10.1016/j.heliyon.2021.e07636] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/13/2020] [Revised: 09/23/2020] [Accepted: 07/19/2021] [Indexed: 02/04/2023] Open
Abstract
Fusarium oxysporum f.sp. cubense (Foc) is a soil-borne pathogen causing Fusarium wilt of banana. Management of soil-borne diseases generally requires the application of toxic pesticides or fungicides, which strongly affect the soil microbiome. Suppressive soil is a promising means of controlling soil-borne pathogens, and the soil microbiome may contribute to its suppressiveness. A comparative analysis of microbial diversity in suppressive and conducive soils was conducted using whole shotgun metagenomic DNA data. Two suppressive soil samples and two conducive soil samples were collected from a banana plantation in Sukabumi, West Java, Indonesia. Each soil sample was prepared by mixing soil collected at three sampling points at a depth of 20 cm. Microbial abundance, diversity, and co-occurrence networks were analyzed using Metagenome Analyzer 6 (MEGAN6), and functional analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG). The data showed that the abundance of Actinobacteria, Betaproteobacteria, Rhizobiales, Burkholderiales, Bradyrhizobiaceae, Methylobacteriaceae, Rhodopseudomonas palustris, and Methylobacterium nodulans was higher in the suppressive than in the conducive soils. Interestingly, these bacterial groups are known members of the Plant Growth Promoting Rhizobacteria (PGPR). The co-occurrence analysis showed that Pseudomonas, Burkholderia, and Streptomyces were present in the suppressive soils, while Bacillus and more Streptomyces were found in the conducive soils. Furthermore, the relative abundance of Pseudomonas, Burkholderia, Bacillus, and Streptomyces was analyzed. The analysis showed that the relative abundance of Pseudomonas and Burkholderia was higher in the suppressive than in the conducive soils. Therefore, Pseudomonas and Burkholderia are assumed to play a role in suppressing Foc, based on the co-occurrence and abundance analyses.
Functional analysis of Pseudomonas and Burkholderia showed that the zinc/manganese transport system was higher in the suppressive than in the conducive soils. In contrast, the phosphate transport system was not found in conducive soils. Both functions may be responsible for siderophore synthesis and phosphate solubilization. In conclusion, this study indicates that PGPR may contribute to the suppression of Foc growth by releasing secondary metabolites.
Affiliation(s)
- Lulu' Nisrina
- School of Life Sciences and Technology, Bandung Institute of Technology, Jalan Ganesha 10, 40132, Bandung, Indonesia
| | - Yunus Effendi
- Department of Biology, Al-Azhar University of Indonesia, Jalan Sisimangaraja 2, 12110, Jakarta, Indonesia
| | - Adi Pancoro
- School of Life Sciences and Technology, Bandung Institute of Technology, Jalan Ganesha 10, 40132, Bandung, Indonesia
|
68
|
Yen S, Johnson JS. Metagenomics: a path to understanding the gut microbiome. Mamm Genome 2021; 32:282-296. [PMID: 34259891 PMCID: PMC8295064 DOI: 10.1007/s00335-021-09889-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/09/2021] [Accepted: 06/28/2021] [Indexed: 12/16/2022]
Abstract
The gut microbiome is a major determinant of host health, yet it is only in the last 2 decades that the advent of next-generation sequencing has enabled it to be studied at a genomic level. Shotgun sequencing is beginning to provide insight into the prokaryotic as well as eukaryotic and viral components of the gut community, revealing not just their taxonomy, but also the functions encoded by their collective metagenome. This revolution in understanding is being driven by continued development of sequencing technologies and in consequence necessitates reciprocal development of computational approaches that can adapt to the evolving nature of sequence datasets. In this review, we provide an overview of current bioinformatic strategies for handling metagenomic sequence data and discuss their strengths and limitations. We then go on to discuss key technological developments that have the potential to once again revolutionise the way we are able to view and hence understand the microbiome.
Affiliation(s)
- Sandi Yen
- Oxford Centre for Microbiome Studies, Kennedy Institute of Rheumatology, University of Oxford, Roosevelt Drive, Headington, Oxford, OX3 7FY, UK
| | - Jethro S Johnson
- Oxford Centre for Microbiome Studies, Kennedy Institute of Rheumatology, University of Oxford, Roosevelt Drive, Headington, Oxford, OX3 7FY, UK.
|
69
|
Milani C, Lugli GA, Fontana F, Mancabelli L, Alessandri G, Longhi G, Anzalone R, Viappiani A, Turroni F, van Sinderen D, Ventura M. METAnnotatorX2: a Comprehensive Tool for Deep and Shallow Metagenomic Data Set Analyses. mSystems 2021; 6:e0058321. [PMID: 34184911 PMCID: PMC8269244 DOI: 10.1128/msystems.00583-21] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/10/2021] [Accepted: 06/10/2021] [Indexed: 12/04/2022] Open
Abstract
The use of bioinformatic tools for read-based taxonomic and functional analyses of metagenomic data sets, including their assembly and management, is rather fragmentary due to the absence of an accepted gold standard. Moreover, most currently available software tools need input of millions of reads and rely on approximations in data analysis in order to reduce computing times. These issues lead to suboptimal results in terms of accuracy, sensitivity, and specificity when used either for the reconstruction of taxonomic or functional profiles through read analysis or analysis of genomes reconstructed by metagenomic assembly. Moreover, the recent introduction of novel DNA sequencing technologies that generate long reads, such as Nanopore and PacBio, represents a valuable data resource that still suffers from a lack of dedicated tools to perform integrated hybrid analysis alongside short read data. In order to overcome these limitations, here we describe a comprehensive bioinformatic platform, METAnnotatorX2, aimed at providing an optimized user-friendly resource which maximizes output quality, while also allowing user-specific adaptation of the pipeline and straightforward integrated analysis of both short and long read data. To further improve performance quality and accuracy of taxonomic assignment of reads and contigs, custom preprocessed and taxonomically revised genomic databases for viruses, prokaryotes, and various eukaryotes were developed. The performance of METAnnotatorX2 was tested by analysis of artificial data sets encompassing viral, archaeal, bacterial, and eukaryotic (fungal) sequence reads that simulate different biological matrices. Moreover, real biological samples were employed to validate in silico results.
IMPORTANCE We developed a novel tool, i.e., METAnnotatorX2, that includes a number of new advanced features for analysis of deep and shallow metagenomic data sets and is accompanied by (regularly updated) customized databases for archaea, bacteria, fungi, protists, and viruses. Both software and databases were developed so as to maximize sensitivity and specificity while including support for shallow metagenomic data sets. Through extensive tests performed on Illumina and Nanopore artificial data sets, we demonstrated the high performance of the software to not only extract taxonomic and functional information from sequence reads but also to assemble and process genomes from metagenomic data. The robustness of these functionalities was validated using "real-life" data sets obtained from Illumina and Nanopore sequencing of biological samples. Furthermore, the performance of METAnnotatorX2 was compared to other available software tools for analysis of shotgun metagenomics data.
Affiliation(s)
- Christian Milani
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
| | - Gabriele Andrea Lugli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
| | - Federico Fontana
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- GenProbio srl, Parma, Italy
| | - Leonardo Mancabelli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
| | - Giulia Alessandri
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
| | - Giulia Longhi
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- GenProbio srl, Parma, Italy
| | | | | | - Francesca Turroni
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
| | - Douwe van Sinderen
- APC Microbiome Ireland and School of Microbiology, Bioscience Institute, National University of Ireland, Cork, Ireland
| | - Marco Ventura
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
|
70
|
Abstract
It is critical to identify individual genomes from microbiomic samples in order to carry out analysis of the microbes. Methods based on existing databases, however, may have limited capabilities in elucidating and quantifying the microbes due to the largely unidentified microbial species in natural or human-associated environments. We thus developed a database-free method, MaxBin 2.0, to aid in the process of recovering microbial genomes from metagenomes in a de novo manner. The recovery of individual genomes allows analysis of the microbiome in terms of a collection of microbial genomes so that one can understand the functional roles of each species. The data of individual microbes may then be analyzed collectively to untangle the interactions between different microbial organisms. By reporting the genome abundance information for co-assembled metagenomes, one may also identify which microorganisms dominate the microbiome and which species may co-occur from the MaxBin 2.0 results. © 2021 Wiley Periodicals LLC.
- Basic Protocol 1: Recovering genomes from one shotgun metagenome with coverage information
- Basic Protocol 2: Recovering genomes from one shotgun metagenome without coverage information
- Basic Protocol 3: Recovering genomes given multiple shotgun metagenomes with coverage information for each metagenome
- Basic Protocol 4: Recovering genomes given multiple shotgun metagenomes without coverage information
- Support Protocol 1: MaxBin installation
- Support Protocol 2: Assembling and co-assembling NGS reads
Affiliation(s)
- Yu-Wei Wu
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
| | - Steven W Singer
- Joint BioEnergy Institute, Emeryville, California
- Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California
|
71
|
Alam I, Kamau AA, Ngugi DK, Gojobori T, Duarte CM, Bajic VB. KAUST Metagenomic Analysis Platform (KMAP), enabling access to massive analytics of re-annotated metagenomic data. Sci Rep 2021; 11:11511. [PMID: 34075103 PMCID: PMC8169707 DOI: 10.1038/s41598-021-90799-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2020] [Accepted: 05/18/2021] [Indexed: 11/09/2022] Open
Abstract
The exponential rise of metagenomics sequencing is delivering massive functional environmental genomics data. However, this also generates a procedural bottleneck for ongoing re-analysis: as reference databases grow and methods improve, analyses need to be updated for consistency, which requires access to increasingly demanding bioinformatic and computational resources. Here, we present the KAUST Metagenomic Analysis Platform (KMAP), a new integrated open web-based tool for the comprehensive exploration of shotgun metagenomic data. We illustrate the capacities KMAP provides through the re-assembly of ~27,000 public metagenomic samples captured in ~450 studies sampled across ~77 diverse habitats. A small subset of these metagenomic assemblies is used in this pilot study, grouped into 36 new habitat-specific gene catalogs, all based on full-length (complete) genes. Extensive taxonomic and gene annotations are stored in Gene Information Tables (GITs), a simple tractable data integration format useful for analysis through command line or for database management. The KMAP pilot study provides the exploration and comparison of microbial GITs across different habitats with over 275 million genes. KMAP access to data and analyses is available at https://www.cbrc.kaust.edu.sa/aamg/kmap.start.
Affiliation(s)
- Intikhab Alam
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
| | - Allan Anthony Kamau
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - David Kamanda Ngugi
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Inhoffenstraße 7B, 38124, Brunswick, Germany
| | - Takashi Gojobori
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Carlos M Duarte
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Red Sea Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Vladimir B Bajic
- Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
|
72
|
Robinson T, Harkin J, Shukla P. Hardware Acceleration of Genomics Data Analysis: Challenges and Opportunities. Bioinformatics 2021; 37:1785-1795. [PMID: 34037688 PMCID: PMC8317111 DOI: 10.1093/bioinformatics/btab017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/17/2020] [Revised: 11/03/2020] [Accepted: 05/24/2021] [Indexed: 12/11/2022] Open
Abstract
The significant decline in the cost of genome sequencing has dramatically changed the typical bioinformatics pipeline for analysing sequencing data. Whereas sequencing itself was traditionally the main challenge, it is now secondary to the computational challenge of genomic data analysis. Short read alignment (SRA) is a ubiquitous process within every modern bioinformatics pipeline in the field of genomics and is often regarded as the principal computational bottleneck. Many hardware and software approaches have been provided to solve the challenge of acceleration. However, previous attempts to increase throughput using many-core processing strategies have enjoyed limited success, mainly due to a dependence on global memory for each computational block. The limited scalability and high energy costs of many-core SRA implementations pose a significant constraint in maintaining acceleration. The Networks-On-Chip (NoC) hardware interconnect mechanism has advanced the scalability of many-core computing systems and, more recently, has demonstrated potential in SRA implementations by integrating multiple computational blocks such as pre-alignment filtering and sequence alignment efficiently, while minimising memory latency and global memory access. This paper provides a state-of-the-art review of current hardware acceleration strategies for genomic data analysis, and it establishes the challenges and opportunities of utilising NoCs as a critical building block in next-generation sequencing (NGS) technologies for advancing the speed of analysis.
Affiliation(s)
- Tony Robinson
- School of Computing, Engineering and Intelligent Systems, Ulster University, Magee Campus, Derry/Londonderry, BT48 7JL, UK
| | - Jim Harkin
- School of Computing, Engineering and Intelligent Systems, Ulster University, Magee Campus, Derry/Londonderry, BT48 7JL, UK
| | - Priyank Shukla
- Northern Ireland Centre for Stratified Medicine, Biomedical Sciences Research Institute, Ulster University, C-TRIC Building, Altnagelvin Area Hospital, Derry/Londonderry, BT47 6SB, UK
|
73
|
Precision Pandemic Preparedness: Improving Diagnostics with Metagenomics. J Clin Microbiol 2021; 59:JCM.02146-20. [PMID: 33472896 DOI: 10.1128/jcm.02146-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022] Open
Abstract
The threat posed by novel pandemics in the future remains active. Equipping our routine laboratory with clinical metagenomics to detect unknown threats early on offers a considerable advantage and may be feasible and scalable with the ability to identify complicated infectious diseases in routine care. Though several technical and regulatory challenges still exist, clinical metagenomics may improve individual patient outcomes and provide earlier warning signs to improve pandemic preparedness.
|
74
|
Lui LM, Nielsen TN, Arkin AP. A method for achieving complete microbial genomes and improving bins from metagenomics data. PLoS Comput Biol 2021; 17:e1008972. [PMID: 33961626 PMCID: PMC8172020 DOI: 10.1371/journal.pcbi.1008972] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/18/2020] [Revised: 06/02/2021] [Accepted: 04/16/2021] [Indexed: 11/19/2022] Open
Abstract
Metagenomics facilitates the study of the genetic information from uncultured microbes and complex microbial communities. Assembling complete genomes from metagenomics data is difficult because most samples have high organismal complexity and strain diversity. Some studies have attempted to extract complete bacterial, archaeal, and viral genomes and often focus on species with circular genomes so they can help confirm completeness with circularity. However, less than 100 circularized bacterial and archaeal genomes have been assembled and published from metagenomics data despite the thousands of datasets that are available. Circularized genomes are important for (1) building a reference collection as scaffolds for future assemblies, (2) providing complete gene content of a genome, (3) confirming little or no contamination of a genome, (4) studying the genomic context and synteny of genes, and (5) linking protein coding genes to ribosomal RNA genes to aid metabolic inference in 16S rRNA gene sequencing studies. We developed a semi-automated method called Jorg to help circularize small bacterial, archaeal, and viral genomes using iterative assembly, binning, and read mapping. In addition, this method exposes potential misassemblies from k-mer based assemblies. We chose species of the Candidate Phyla Radiation (CPR) to focus our initial efforts because they have small genomes and are only known to have one ribosomal RNA operon. In addition to 34 circular CPR genomes, we present one circular Margulisbacteria genome, one circular Chloroflexi genome, and two circular megaphage genomes from 19 public and published datasets. We demonstrate findings that would likely be difficult without circularizing genomes, including that ribosomal genes are likely not operonic in the majority of CPR, and that some CPR harbor diverged forms of RNase P RNA. 
Code and a tutorial for this method are available at https://github.com/lmlui/Jorg, and the method is also available on the DOE Systems Biology KnowledgeBase as a beta app.
Affiliation(s)
- Lauren M. Lui
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Torben N. Nielsen
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Adam P. Arkin
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
- Department of Bioengineering, University of California, Berkeley, California, United States of America
- Innovative Genomics Institute, Berkeley, CA, United States of America
|
75
|
Werbin ZR, Hackos B, Lopez-Nava J, Dietze MC, Bhatnagar JM. The National Ecological Observatory Network's soil metagenomes: assembly and basic analysis. F1000Res 2021; 10:299. [PMID: 35707452 PMCID: PMC9178279 DOI: 10.12688/f1000research.51494.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 02/08/2022] [Indexed: 11/20/2022] Open
Abstract
The largest dataset of soil metagenomes has recently been released by the National Ecological Observatory Network (NEON), which performs annual shotgun sequencing of soils at 47 sites across the United States. NEON serves as a valuable educational resource, thanks to its open data and programming tutorials, but there is currently no introductory tutorial for accessing and analyzing the soil shotgun metagenomic dataset. Here, we describe methods for processing raw soil metagenome sequencing reads using a bioinformatics pipeline tailored to the high complexity and diversity of the soil microbiome. We describe the rationale, necessary resources, and implementation of steps such as cleaning raw reads, taxonomic classification, assembly into contigs or genomes, annotation of predicted genes using custom protein databases, and exporting data for downstream analysis. The workflow presented here aims to increase the accessibility of NEON's shotgun metagenome data, which can provide important clues about soil microbial communities and their ecological roles.
Affiliation(s)
- Zoey R. Werbin
- Department of Biology, Boston University, Boston, MA, 02215, USA
| | - Briana Hackos
- Department of Mathematics, University of Colorado, Boulder, Boulder, CO, 80309, USA
| | - Jorge Lopez-Nava
- Department of Mathematics, Swarthmore College, Swarthmore, PA 19081, USA
| | - Michael C. Dietze
- Department of Earth & Environment, Boston University, Boston, MA, 02215, USA
|
76
|
Garg S. Computational methods for chromosome-scale haplotype reconstruction. Genome Biol 2021; 22:101. [PMID: 33845884 PMCID: PMC8040228 DOI: 10.1186/s13059-021-02328-9] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 01/10/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022] Open
Abstract
High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes, and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short read sequencing does not yield haplotype information spanning whole chromosomes directly. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review and discuss recent methodological progress and perspectives in these areas.
Affiliation(s)
- Shilpa Garg
- Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
77
|
Lapidus AL, Korobeynikov AI. Metagenomic Data Assembly - The Way of Decoding Unknown Microorganisms. Front Microbiol 2021; 12:613791. [PMID: 33833738 PMCID: PMC8021871 DOI: 10.3389/fmicb.2021.613791] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/03/2020] [Accepted: 03/03/2021] [Indexed: 01/08/2023] Open
Abstract
Metagenomics is a segment of conventional microbial genomics dedicated to the sequencing and analysis of combined genomic DNA of entire environmental samples. The most critical step of the metagenomic data analysis is the reconstruction of individual genes and genomes of the microorganisms in the communities using metagenomic assemblers - computational programs that put together small fragments of sequenced DNA generated by sequencing instruments. Here, we describe the challenges of metagenomic assembly, a wide spectrum of applications in which metagenomic assemblies were used to better understand the ecology and evolution of microbial ecosystems, and present one of the most efficient microbial assemblers, SPAdes that was upgraded to become applicable for metagenomics.
Affiliation(s)
- Alla L. Lapidus
- Center for Algorithmic Biotechnology, St. Petersburg State University, Saint Petersburg, Russia
|
78
|
Deng Z, Delwart E. ContigExtender: a new approach to improving de novo sequence assembly for viral metagenomics data. BMC Bioinformatics 2021; 22:119. [PMID: 33706720 PMCID: PMC7953547 DOI: 10.1186/s12859-021-04038-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/01/2019] [Accepted: 02/21/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Metagenomics is the study of microbial genomes for pathogen detection and discovery in human clinical, animal, and environmental samples via Next-Generation Sequencing (NGS). Metagenome de novo sequence assembly is a crucial analytical step in which longer contigs, ideally whole chromosomes/genomes, are formed from shorter NGS reads. However, the contigs generated from the de novo assembly are often very fragmented and rarely longer than a few kilo base pairs (kb). Therefore, a time-consuming extension process is routinely performed on the de novo assembled contigs. RESULTS To facilitate this process, we propose a new tool for metagenome contig extension after de novo assembly. ContigExtender employs a novel recursive extending strategy that explores multiple extending paths to achieve highly accurate longer contigs. We demonstrate that ContigExtender outperforms existing tools in synthetic, animal, and human metagenomics datasets. CONCLUSIONS A novel software tool ContigExtender has been developed to assist and enhance the performance of metagenome de novo assembly. ContigExtender effectively extends contigs from a variety of sources and can be incorporated in most viral metagenomics analysis pipelines for a wide variety of applications, including pathogen detection and viral discovery.
Affiliation(s)
- Zachary Deng
- Vitalant Research Institute, San Francisco, CA, 94118, USA.
- Department of Laboratory Medicine, University of California at San Francisco, San Francisco, CA, 94107, USA.
- Eric Delwart
- Vitalant Research Institute, San Francisco, CA, 94118, USA.
- Department of Laboratory Medicine, University of California at San Francisco, San Francisco, CA, 94107, USA.
|
79
|
Abstract
Cystic fibrosis patients frequently suffer from recurring respiratory infections caused by colonizing pathogenic and commensal bacteria. Although modern therapies can sometimes alleviate respiratory symptoms by ameliorating residual function of the protein responsible for the disorder, management of chronic respiratory infections remains an issue. In cystic fibrosis, dynamic and complex communities of microbial pathogens and commensals can colonize the lung. Cultured isolates from lung sputum reveal high inter- and intraindividual variability in pathogen strains, sequence variants, and phenotypes; disease progression likely depends on the precise combination of infecting lineages. Routine clinical protocols, however, provide a limited overview of the colonizer populations. Therefore, a more comprehensive and precise identification and characterization of infecting lineages could assist in making corresponding decisions on treatment. Here, we describe longitudinal tracking for four cystic fibrosis patients who exhibited extreme clinical phenotypes and, thus, were selected from a pilot cohort of 11 patients with repeated sampling for more than a year. Following metagenomics sequencing of lung sputum, we find that the taxonomic identity of individual colonizer lineages can be easily established. Crucially, even superficially clonal pathogens can be subdivided into multiple sublineages at the sequence level. By tracking individual allelic differences over time, an assembly-free clustering approach allows us to reconstruct multiple lineage-specific genomes with clear structural differences. Our study showcases a culture-independent shotgun metagenomics approach for longitudinal tracking of sublineage pathogen dynamics, opening up the possibility of using such methods to assist in monitoring disease progression through providing high-resolution routine characterization of the cystic fibrosis lung microbiome.
|
80
|
Brown CL, Keenum IM, Dai D, Zhang L, Vikesland PJ, Pruden A. Critical evaluation of short, long, and hybrid assembly for contextual analysis of antibiotic resistance genes in complex environmental metagenomes. Sci Rep 2021; 11:3753. [PMID: 33580146 PMCID: PMC7881036 DOI: 10.1038/s41598-021-83081-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/01/2020] [Accepted: 01/05/2021] [Indexed: 02/06/2023] Open
Abstract
In the fight to limit the global spread of antibiotic resistance, the assembly of environmental metagenomes has the potential to provide rich contextual information (e.g., taxonomic hosts, carriage on mobile genetic elements) about antibiotic resistance genes (ARG) in the environment. However, computational challenges associated with assembly can impact the accuracy of downstream analyses. This work critically evaluates the impact of assembly leveraging short reads, nanopore MinION long-reads, and a combination of the two (hybrid) on ARG contextualization for ten environmental metagenomes using seven prominent assemblers (IDBA-UD, MEGAHIT, Canu, Flye, Opera-MS, metaSpades and HybridSpades). While short-read and hybrid assemblies produced similar patterns of ARG contextualization, raw or assembled long nanopore reads produced distinct patterns. Based on an in-silico spike-in experiment using real and simulated reads, we show that low to intermediate coverage species are more likely to be incorporated into chimeric contigs across all assemblers and sequencing technologies, while more abundant species produce assemblies with a greater frequency of inversions and insertion/deletions (indels). In sum, our analyses support hybrid assembly as a valuable technique for boosting the reliability and accuracy of assembly-based analyses of ARGs and neighboring genes at environmentally-relevant coverages, provided that sufficient short-read sequencing depth is achieved.
Affiliation(s)
- Connor L Brown
- Genetics, Bioinformatics, and Computational Biology, Virginia Tech, Blacksburg, VA, 24060, USA
- Ishi M Keenum
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, VA, 24060, USA
- Dongjuan Dai
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, VA, 24060, USA
- Liqing Zhang
- Genetics, Bioinformatics, and Computational Biology, Virginia Tech, Blacksburg, VA, 24060, USA
- Peter J Vikesland
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, VA, 24060, USA
- Amy Pruden
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, VA, 24060, USA
|
81
|
Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome. ENTROPY 2021; 23:e23020187. [PMID: 33540903 PMCID: PMC7913240 DOI: 10.3390/e23020187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 01/11/2021] [Accepted: 01/13/2021] [Indexed: 11/26/2022]
Abstract
Quantitative metagenomics is an important field that has delivered successful microbiome biomarkers associated with host phenotypes. The current convention mainly depends on unsupervised assembly of metagenomic contigs with a possibility of leaving interesting genetic material unassembled. Additionally, biomarkers are commonly defined on the differential relative abundance of compositional or functional units. Accumulating evidence supports that microbial genetic variations are as important as the differential abundance content, implying the need for novel methods accounting for the genetic variations in metagenomics studies. We propose an information theoretic metagenome assembly algorithm, discovering genomic fragments with maximal self-information, defined by the empirical distributions of nucleotides across the phenotypes and quantified with the help of statistical tests. Our algorithm infers fragments populating the most informative genetic variants in a single contig, named supervariant fragments. Experiments on simulated metagenomes, as well as on a colorectal cancer and an atherosclerotic cardiovascular disease dataset consistently discovered sequences strongly associated with the disease phenotypes. Moreover, the discriminatory power of these putative biomarkers was mainly attributed to the genetic variations rather than relative abundance. Our results support that a focus on metagenomics methods considering microbiome population genetics might be useful in discovering disease biomarkers with a great potential of translating to molecular diagnostics and biotherapeutics applications.
|
82
|
Brumfield KD, Cotruvo JA, Shanks OC, Sivaganesan M, Hey J, Hasan NA, Huq A, Colwell RR, Leddy MB. Metagenomic Sequencing and Quantitative Real-Time PCR for Fecal Pollution Assessment in an Urban Watershed. FRONTIERS IN WATER 2021; 3:626849. [PMID: 34263162 PMCID: PMC8274573 DOI: 10.3389/frwa.2021.626849] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 05/06/2023]
Abstract
Microbial contamination of recreation waters is a major concern globally, with pollutants originating from many sources, including human and other animal wastes often introduced during storm events. Fecal contamination is traditionally monitored by employing culture methods targeting fecal indicator bacteria (FIB), namely E. coli and enterococci, which provides only limited information of a few microbial taxa and no information on their sources. Host-associated qPCR and metagenomic DNA sequencing are complementary methods for FIB monitoring that can provide enhanced understanding of microbial communities and sources of fecal pollution. Whole metagenome sequencing (WMS), quantitative real-time PCR (qPCR), and culture-based FIB tests were performed in an urban watershed before and after a rainfall event to determine the feasibility and application of employing a multi-assay approach for examining microbial content of ambient source waters. Cultivated E. coli and enterococci enumeration confirmed presence of fecal contamination in all samples exceeding local single sample recreational water quality thresholds (E. coli, 410 MPN/100 mL; enterococci, 107 MPN/100 mL) following a rainfall. Test results obtained with qPCR showed concentrations of E. coli, enterococci, and human-associated genetic markers increased after rainfall by 1.52-, 1.26-, and 1.11-fold log10 copies per 100 mL, respectively. Taxonomic analysis of the surface water microbiome and detection of antibiotic resistance genes, general FIB, and human-associated microorganisms were also employed. Results showed that fecal contamination from multiple sources (human, avian, dog, and ruminant), as well as FIB, enteric microorganisms, and antibiotic resistance genes increased demonstrably after a storm event. In summary, the addition of qPCR and WMS to traditional surrogate techniques may provide enhanced characterization and improved understanding of microbial pollution sources in ambient waters.
Affiliation(s)
- Kyle D. Brumfield
- Maryland Pathogen Research Institute, University of Maryland, College Park, MD, United States
- University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD, United States
- Orin C. Shanks
- U.S. Environmental Protection Agency, Office of Research and Development, Cincinnati, OH, United States
- Mano Sivaganesan
- U.S. Environmental Protection Agency, Office of Research and Development, Cincinnati, OH, United States
- Jessica Hey
- U.S. Environmental Protection Agency, Office of Research and Development, Cincinnati, OH, United States
- Nur A. Hasan
- University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD, United States
- Anwar Huq
- Maryland Pathogen Research Institute, University of Maryland, College Park, MD, United States
- Rita R. Colwell
- Maryland Pathogen Research Institute, University of Maryland, College Park, MD, United States
- University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, MD, United States
- CosmosID Inc., Rockville, MD, United States
- Correspondence: Rita R. Colwell, Menu B. Leddy
- Menu B. Leddy
- Essential Environmental and Engineering Systems, Huntington Beach, CA, United States
- Correspondence: Rita R. Colwell, Menu B. Leddy
|
83
|
Albanese D, Donati C. Genome Recovery, Functional Profiling, and Taxonomic Classification from Metagenomes. Methods Mol Biol 2021; 2242:153-172. [PMID: 33961223 DOI: 10.1007/978-1-0716-1099-2_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Recovering and annotating bacterial genomes from metagenomes involves a series of complex computational tools that are often difficult to use for researchers without a specialist bioinformatics background. In this chapter we review all the steps that lead from raw reads to a collection of quality-controlled, functionally annotated bacterial genomes and propose a working protocol using state-of-the-art, open source software tools.
Affiliation(s)
- Davide Albanese
- Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach, San Michele all'Adige, Italy
- Claudio Donati
- Unit of Computational Biology, Research and Innovation Centre, Fondazione Edmund Mach, San Michele all'Adige, Italy.
|
84
|
Abstract
Microbial communities are found across diverse environments, including within and across the human body. As many microbes are unculturable in the lab, much of what is known about a microbiome (a collection of bacteria, fungi, archaea, and viruses inhabiting an environment) is from the sequencing of DNA from within the constituent community. Here, we provide an introduction to whole-metagenome shotgun sequencing studies, a ubiquitous approach for characterizing microbial communities, by reviewing three major research areas in metagenomics: assembly, community profiling, and functional profiling. Though not exhaustive, these areas encompass a large component of the metagenomics literature. We discuss each area in depth, the challenges posed by whole-metagenome shotgun sequencing, and approaches fundamental to the solutions of each. We conclude by discussing promising areas for future research. Though our emphasis is on the human microbiome, the methods discussed are broadly applicable across study systems.
Affiliation(s)
- Tyler A Joseph
- Department of Computer Science, Fu Foundation School of Engineering & Applied Science, Columbia University, New York, NY, USA
- Itsik Pe'er
- Department of Computer Science, Fu Foundation School of Engineering & Applied Science, Columbia University, New York, NY, USA.
|
85
|
Davis BC, Riquelme MV, Ramirez-Toro G, Bandaragoda C, Garner E, Rhoads WJ, Vikesland P, Pruden A. Demonstrating an Integrated Antibiotic Resistance Gene Surveillance Approach in Puerto Rican Watersheds Post-Hurricane Maria. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:15108-15119. [PMID: 33205660 DOI: 10.1021/acs.est.0c05567] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 05/24/2023]
Abstract
Comprehensive surveillance approaches are needed to assess sources, clinical relevance, and mobility of antibiotic resistance genes (ARGs) in watersheds. Here, we examined metrics derived from shotgun metagenomic sequencing and relationship to human fecal markers (HFMs; crAssphage, enterococci) and anthropogenic antibiotic resistance markers (AARMs; intI1, sul1) in three distinct Puerto Rican watersheds as a function of adjacent land use and wastewater treatment plant (WWTP) input 6 months after Hurricane Maria, a category V storm. Relative abundance and diversity of total ARGs increased markedly downstream of WWTP inputs, with ARGs unique to WWTP and WWTP-impacted river samples predominantly belonging to the aminoglycoside and β-lactam resistance classes. WWTP and other anthropogenic inputs were similarly associated with elevated resistome risk scores and mobility incidence (M%). Contig analysis indicated a wide variety of mobile β-lactam ARGs associated with pathogens downstream of WWTP discharge that were consistent with regional clinical concern, e.g., Klebsiella pneumoniae contigs containing KPC-2 within an ISKpn6-like transposase. HFMs and AARMs correlated strongly with the absolute abundance of total ARGs, but AARMs better predicted the majority of ARGs in general (85.4 versus <2%) and β-lactam ARGs in particular. This study reveals sensitive, quantitative, mobile, clinically relevant, and comprehensive targets for antibiotic resistance surveillance in watersheds.
Affiliation(s)
- Benjamin C Davis
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Maria Virginia Riquelme
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Graciela Ramirez-Toro
- Center for Environmental Education, Conservation and Research, Inter American University, San Germán, Puerto Rico 00683, United States
- Christina Bandaragoda
- Department of Civil and Environmental Engineering, University of Washington, Seattle, Washington 98195, United States
- Emily Garner
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- William J Rhoads
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Peter Vikesland
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Amy Pruden
- Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
|
86
|
Laczi K, Erdeiné Kis Á, Szilágyi Á, Bounedjoum N, Bodor A, Vincze GE, Kovács T, Rákhely G, Perei K. New Frontiers of Anaerobic Hydrocarbon Biodegradation in the Multi-Omics Era. Front Microbiol 2020; 11:590049. [PMID: 33304336 PMCID: PMC7701123 DOI: 10.3389/fmicb.2020.590049] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/31/2020] [Accepted: 10/26/2020] [Indexed: 12/17/2022] Open
Abstract
The accumulation of petroleum hydrocarbons in the environment substantially endangers terrestrial and aquatic ecosystems. Many microbial strains have been recognized to utilize aliphatic and aromatic hydrocarbons under aerobic conditions. Nevertheless, most of these pollutants are transferred by natural processes, including rain, into the underground anaerobic zones where their degradation is much more problematic. In oxic zones, anaerobic microenvironments can be formed as a consequence of the intensive respiratory activities of (facultative) aerobic microbes. Even though aerobic bioremediation has been well-characterized over the past few decades, ample research is yet to be done in the field of anaerobic hydrocarbon biodegradation. With the emergence of high-throughput techniques, known as omics (e.g., genomics and metagenomics), the individual biodegraders, hydrocarbon-degrading microbial communities, metabolic pathways, and interactions can be described at a contaminated site. Omics approaches provide the opportunity to examine single microorganisms or microbial communities at the system level and elucidate the metabolic networks and interspecies interactions during hydrocarbon mineralization. Metatranscriptomics and metaproteomics, for example, can shed light on the active genes and proteins and functional importance of the less abundant species. Moreover, novel unculturable hydrocarbon-degrading strains and enzymes can be discovered and fit into the metabolic networks of the community. Our objective is to review the anaerobic hydrocarbon biodegradation processes, the most important hydrocarbon degraders and their diverse metabolic pathways, including the use of various terminal electron acceptors and various electron transfer processes. The review primarily focuses on the achievements obtained by the current high-throughput (multi-omics) techniques which opened new perspectives in understanding the processes at the system level, including the metabolic routes of individual strains and the metabolic/electric interactions of the members of microbial communities. Based on the multi-omics techniques, novel metabolic blocks can be designed and used for the construction of microbial strains/consortia for efficient removal of hydrocarbons in anaerobic zones.
Affiliation(s)
- Krisztián Laczi
- Department of Biotechnology, University of Szeged, Szeged, Hungary
- Ágnes Erdeiné Kis
- Department of Biotechnology, University of Szeged, Szeged, Hungary; Institute of Biophysics, Biological Research Centre, Szeged, Hungary
- Árpád Szilágyi
- Department of Biotechnology, University of Szeged, Szeged, Hungary
- Naila Bounedjoum
- Department of Biotechnology, University of Szeged, Szeged, Hungary; Institute of Environmental and Technological Sciences, University of Szeged, Szeged, Hungary
- Attila Bodor
- Department of Biotechnology, University of Szeged, Szeged, Hungary; Institute of Biophysics, Biological Research Centre, Szeged, Hungary; Institute of Environmental and Technological Sciences, University of Szeged, Szeged, Hungary
- Tamás Kovács
- Department of Biotechnology, Nanophagetherapy Center, Enviroinvest Corporation, Pécs, Hungary
- Gábor Rákhely
- Department of Biotechnology, University of Szeged, Szeged, Hungary; Institute of Biophysics, Biological Research Centre, Szeged, Hungary; Institute of Environmental and Technological Sciences, University of Szeged, Szeged, Hungary
- Katalin Perei
- Department of Biotechnology, University of Szeged, Szeged, Hungary; Institute of Environmental and Technological Sciences, University of Szeged, Szeged, Hungary
|
87
|
Tran Q, Phan V. Assembling Reads Improves Taxonomic Classification of Species. Genes (Basel) 2020; 11:E946. [PMID: 32824429 PMCID: PMC7465921 DOI: 10.3390/genes11080946] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/16/2020] [Revised: 08/11/2020] [Accepted: 08/13/2020] [Indexed: 11/22/2022] Open
Abstract
Most current approaches to metagenomic classification employ short next generation sequencing (NGS) reads that are present in metagenomic samples to identify unique genomic regions. NGS reads, however, might not be long enough to differentiate similar genomes. This suggests a potential for using longer reads to improve classification performance. Presently, longer reads tend to have a higher rate of sequencing errors. Thus, given the pros and cons, it remains unclear which type of read is better for metagenomic classification. We compared two taxonomic classification protocols: a traditional assembly-free protocol and a novel assembly-based protocol. The novel assembly-based protocol consists of assembling short reads into longer reads, which are subsequently classified by a traditional taxonomic classifier. We discovered that most classifiers made fewer predictions with longer reads and that they achieved higher classification performance on synthetic metagenomic data. Generally, we observed a significant increase in precision, while having similar recall rates. On real data, we observed similar characteristics, which suggest that with longer reads the classifiers might likewise achieve higher precision with similar recall. We have shown a noticeable difference in performance between assembly-based and assembly-free taxonomic classification. This finding strongly suggests that classifying species in metagenomic environments can be achieved with higher overall performance simply by assembling short reads. Further, it also suggests that long-read technologies might be better for species classification.
Affiliation(s)
- Quang Tran
- Department of Computer Science, University of Memphis, Memphis, TN 38152, USA
|
88
|
Yan Y, Nguyen LH, Franzosa EA, Huttenhower C. Strain-level epidemiology of microbial communities and the human microbiome. Genome Med 2020; 12:71. [PMID: 32791981 PMCID: PMC7427293 DOI: 10.1186/s13073-020-00765-y] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 02/04/2020] [Accepted: 07/14/2020] [Indexed: 02/07/2023] Open
Abstract
The biological importance and varied metabolic capabilities of specific microbial strains have long been established in the scientific community. Strains have, in the past, been largely defined and characterized based on microbial isolates. However, the emergence of new technologies and techniques has enabled assessments of their ecology and phenotypes within microbial communities and the human microbiome. While it is now more obvious how pathogenic strain variants are detrimental to human health, the consequences of subtle genetic variation in the microbiome have only recently been exposed. Here, we review the operational definitions of strains (e.g., genetic and structural variants) as they can now be identified from microbial communities using different high-throughput, often culture-independent techniques. We summarize the distribution and diversity of strains across the human body and their emerging links to health maintenance, disease risk and progression, and biochemical responses to perturbations, such as diet or drugs. We list methods for identifying, quantifying, and tracking strains, utilizing high-throughput sequencing along with other molecular and “culturomics” technologies. Finally, we discuss implications of population studies in bridging experimental gaps and leading to a better understanding of the health effects of strains in the human microbiome.
Affiliation(s)
- Yan Yan
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Long H Nguyen
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA; Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric A Franzosa
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Curtis Huttenhower
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
89
|
Latorre-Pérez A, Villalba-Bermell P, Pascual J, Vilanova C. Assembly methods for nanopore-based metagenomic sequencing: a comparative study. Sci Rep 2020; 10:13588. [PMID: 32788623 PMCID: PMC7423617 DOI: 10.1038/s41598-020-70491-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/10/2020] [Accepted: 07/22/2020] [Indexed: 02/08/2023] Open
Abstract
Metagenomic sequencing has allowed for the recovery of previously unexplored microbial genomes. Whereas short-read sequencing platforms often result in highly fragmented metagenomes, nanopore-based sequencers could lead to more contiguous assemblies due to their potential to generate long reads. Nevertheless, there is a lack of updated and systematic studies evaluating the performance of different assembly tools on nanopore data. In this study, we have benchmarked the ability of different assemblers to reconstruct two different commercially-available mock communities that have been sequenced using Oxford Nanopore Technologies platforms. Among the tested tools, only metaFlye, Raven, and Canu performed well in all the datasets. These tools retrieved highly contiguous genomes (or even complete genomes) directly from the metagenomic data. Despite the intrinsic high error of nanopore sequencing, final assemblies reached high accuracy (~99.5 to 99.8% consensus accuracy). Polishing strategies proved necessary for reducing the number of indels, and this had an impact on the prediction of biosynthetic gene clusters. Correction with high quality short reads did not always result in higher quality draft assemblies. Overall, nanopore metagenomic sequencing data, adapted to MinION's current output, proved sufficient for assembling and characterizing low-complexity microbial communities.
|
90
|
De novo sequence assembly requires bioinformatic checking of chimeric sequences. PLoS One 2020; 15:e0237455. [PMID: 32777809 PMCID: PMC7417191 DOI: 10.1371/journal.pone.0237455] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/09/2020] [Accepted: 07/27/2020] [Indexed: 11/24/2022] Open
Abstract
De novo assembly of sequence reads from next-generation sequencing platforms is a common strategy for detecting the presence of viruses in biospecimens and sequencing them. Amplification artifacts and the presence of several related viruses in the same specimen can lead to the assembly of erroneous, chimeric sequences. We now report that such chimeras can also occur between viral and non-viral biological sequences that are incorrectly joined together, which may cause erroneous detection of viruses, highlighting the importance of performing a chimera-checking step in bioinformatics pipelines. Using Illumina NextSeq and metagenomic sequencing, we analyzed 80 consecutive non-melanoma skin cancers (NMSCs) from 11 immunosuppressed patients together with 11 NMSCs from patients who had only developed 1 NMSC. We aligned high-quality reads against a Human Papillomavirus (HPV) database and found HPV sequences in 9/91 specimens. A previous bioinformatic analysis of the same crude sequencing data from some of these samples had found an additional 3 specimens to be HPV-positive after performing de novo assembly. The reason for the discrepancy was investigated and found to be mostly caused by chimeric sequences containing both viral and non-viral sequences. Non-viral sequences were present in these 3 samples. To avoid erroneous detection of HPV when performing sequencing, we thus developed a novel script to identify HPV chimeric sequences.
|
91
|
Prayogo FA, Budiharjo A, Kusumaningrum HP, Wijanarka W, Suprihadi A, Nurhayati N. Metagenomic applications in exploration and development of novel enzymes from nature: a review. J Genet Eng Biotechnol 2020; 18:39. [PMID: 32749574 PMCID: PMC7403272 DOI: 10.1186/s43141-020-00043-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/15/2020] [Accepted: 06/10/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Microbial communities play an essential role in various fields, especially the industrial sector. Microbes produce metabolites in the form of enzymes, which are among the essential compounds for industrial processes. Unfortunately, numerous microbes still cannot be identified and cultivated because of the limitations of culture-based methods. The metagenomic approach offers researchers a way to overcome these problems. Metagenomics is a strategy used to analyze the genomes of microbial communities directly in the environment. Applying metagenomics to the exploration of novel enzymes is essential because it allows researchers to obtain data on microbial diversity, reaching 99%, and on various types of genes encoding enzymes that have not yet been identified. Basic methods in metagenomics have been developed and are commonly used in various studies. Researchers, especially young researchers, need a basic understanding of metagenomics to support the success of their research. SHORT CONCLUSION: This review was therefore written to provide a deep understanding of metagenomics. It also discusses the application and basic methods of metagenomics in the exploration of novel enzymes, especially in the latest research. Several types of enzymes that have been explored using metagenomics, such as cellulases, proteases, and lipases, are reviewed in this article.
Affiliation(s)
- Fitra Adi Prayogo
  - Department of Biology, Faculty of Science and Mathematics, Diponegoro University, Semarang City, 50275 Indonesia
- Anto Budiharjo
  - Biotechnology Study Program, Faculty of Science and Mathematics, Diponegoro University, Jl. Prof. Sudharto SH, Semarang, 50275 Indonesia
  - Molecular and Applied Microbiology Laboratory, Center Central Laboratory of Research and Service - Diponegoro University, Jl. Prof. Sudharto SH, Semarang, 50275 Indonesia
- Wijanarka Wijanarka
  - Biotechnology Study Program, Faculty of Science and Mathematics, Diponegoro University, Jl. Prof. Sudharto SH, Semarang, 50275 Indonesia
- Agung Suprihadi
  - Biotechnology Study Program, Faculty of Science and Mathematics, Diponegoro University, Jl. Prof. Sudharto SH, Semarang, 50275 Indonesia
- Nurhayati Nurhayati
  - Biotechnology Study Program, Faculty of Science and Mathematics, Diponegoro University, Jl. Prof. Sudharto SH, Semarang, 50275 Indonesia
|
92
|
Abstract
Shotgun metagenomic sequencing has revolutionized our ability to detect and characterize the diversity and function of complex microbial communities. In this review, we highlight the benefits of using metagenomics as well as the breadth of conclusions that can be made using currently available analytical tools, such as greater resolution of species and strains across phyla and functional content, while highlighting challenges of metagenomic data analysis. Major challenges remain in annotating function, given the dearth of functional databases for environmental bacteria compared to model organisms, and the technical difficulties of metagenome assembly and phasing in heterogeneous environmental samples. In the future, improvements and innovation in technology and methodology will lead to lowered costs. Data integration using multiple technological platforms will lead to a better understanding of how to harness metagenomes. Subsequently, we will be able not only to characterize complex microbiomes but also to manipulate communities to achieve prosperous outcomes for health, agriculture, and environmental sustainability.
Affiliation(s)
- Felicia N New
  - Meinig School of Biomedical Engineering, Cornell University, Ithaca, New York 14853, USA
- Ilana L Brito
  - Meinig School of Biomedical Engineering, Cornell University, Ithaca, New York 14853, USA
|
93
|
Frioux C, Singh D, Korcsmaros T, Hildebrand F. From bag-of-genes to bag-of-genomes: metabolic modelling of communities in the era of metagenome-assembled genomes. Comput Struct Biotechnol J 2020; 18:1722-1734. [PMID: 32670511 PMCID: PMC7347713 DOI: 10.1016/j.csbj.2020.06.028] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 03/19/2020] [Revised: 06/16/2020] [Accepted: 06/17/2020] [Indexed: 12/12/2022] Open
Abstract
Metagenomic sequencing of complete microbial communities has greatly enhanced our understanding of the taxonomic composition of microbiotas. This has led to breakthrough developments in bioinformatic disciplines such as assembly, gene clustering, metagenomic binning of species genomes and the discovery of an incredible, so far undiscovered, taxonomic diversity. However, annotating functions and estimating metabolic processes for single species, or for communities, is still challenging. Earlier approaches relied mostly on inferring the presence of key enzymes for metabolic pathways in the whole metagenome, ignoring the genomic context of such enzymes, resulting in the 'bag-of-genes' approach to estimate functional capacities of microbiotas. Here, we review recent developments in metagenomic bioinformatics, with a special focus on emerging technologies to simulate and estimate metabolic information that can be derived from metagenome-assembled genomes. Genome-scale metabolic models can be used to model the emergent properties of microbial consortia and whole communities, and the progress in this area is reviewed. While this subfield of metagenomics is still in its infancy, it is becoming evident that there is a dire need for further bioinformatic tools to address the complex combinatorial problems in modelling the metabolism of large communities as a 'bag-of-genomes'.
Affiliation(s)
- Clémence Frioux
  - Inria, CNRS, INRAE Bordeaux, France
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Dipali Singh
  - Microbes in the Food Chain, Quadram Institute Bioscience, Norwich, Norfolk, UK
- Tamas Korcsmaros
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
  - Digital Biology, Earlham Institute, Norwich, Norfolk, UK
- Falk Hildebrand
  - Gut Microbes and Health, Quadram Institute Bioscience, Norwich, Norfolk, UK
  - Digital Biology, Earlham Institute, Norwich, Norfolk, UK
|
94
|
Klitting R, Mehta SB, Oguzie JU, Oluniyi PE, Pauthner MG, Siddle KJ, Andersen KG, Happi CT, Sabeti PC. Lassa Virus Genetics. Curr Top Microbiol Immunol 2020. [PMID: 32418034 DOI: 10.1007/82_2020_212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
In a pattern repeated across a range of ecological niches, arenaviruses have evolved a compact four-gene genome to orchestrate a complex life cycle in a narrow range of susceptible hosts. A number of mammalian arenaviruses cross-infect humans, often causing a life-threatening viral hemorrhagic fever. Among this group of geographically bound zoonoses, Lassa virus has evolved a unique niche that leads to significant and sustained human morbidity and mortality. As a biosafety level 4 pathogen, direct study of the pathogenesis of Lassa virus is limited by the sparse availability, high operating costs, and technical restrictions of the high-level biocontainment laboratories required for safe experimentation. In this chapter, we introduce the relationship between genome structure and the life cycle of Lassa virus and outline reverse genetic approaches used to probe and describe functional elements of the Lassa virus genome. We then review the tools used to obtain viral genomic sequences used for phylogeny and molecular diagnostics, before shifting to a population perspective to assess the contributions of phylogenetic analysis in understanding the evolution and ecology of Lassa virus in West Africa. We finally consider the future outlook and clinical applications for genetic study of Lassa virus.
Affiliation(s)
- Raphaëlle Klitting
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Samar B Mehta
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Infectious Diseases, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Judith U Oguzie
  - African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State, Nigeria
  - Department of Biological Sciences, Faculty of Natural Sciences, Redeemer's University, Ede, Osun State, Nigeria
- Paul E Oluniyi
  - African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State, Nigeria
  - Department of Biological Sciences, Faculty of Natural Sciences, Redeemer's University, Ede, Osun State, Nigeria
- Matthias G Pauthner
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Kristian G Andersen
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
- Christian T Happi
  - African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer's University, Ede, Osun State, Nigeria
  - Department of Biological Sciences, Faculty of Natural Sciences, Redeemer's University, Ede, Osun State, Nigeria
- Pardis C Sabeti
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Systems Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Department of Immunology and Infectious Diseases, Harvard TH Chan School of Public Health, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
|
95
|
Wu R, Chai B, Cole JR, Gunturu SK, Guo X, Tian R, Gu JD, Zhou J, Tiedje JM. Targeted assemblies of cas1 suggest CRISPR-Cas's response to soil warming. ISME J 2020; 14:1651-1662. [PMID: 32221408 PMCID: PMC7305122 DOI: 10.1038/s41396-020-0635-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/16/2019] [Revised: 03/03/2020] [Accepted: 03/16/2020] [Indexed: 02/06/2023]
Abstract
There is increasing interest in the clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) system as a means to reveal potential virus–host dynamics. The universal and most conserved Cas protein, cas1, is an ideal marker to elucidate CRISPR-Cas ecology. We constructed eight Hidden Markov Models (HMMs) and assembled cas1 directly from metagenomes with a targeted-gene assembler, Xander, to improve detection capacity and resolve the diverse CRISPR-Cas systems. The eight HMMs were first validated by recovering all 17 cas1 subtypes from a simulated metagenome generated from 91 prokaryotic genomes across 11 phyla. We challenged the targeted method with 48 metagenomes from a tallgrass prairie in Central Oklahoma, recovering 3394 cas1. Among those, 88 were near full length, 5 times more than in de novo assemblies from the Oklahoma metagenomes. To validate the host assignment by cas1, the targeted-assembled cas1 was mapped to the de novo assembled contigs. All the phylum assignments of those mapped contigs were made independently of the CRISPR-Cas genes on the same contigs and were consistent with the host taxonomies predicted by the mapped cas1. We then investigated whether 8 years of soil warming altered cas1 prevalence within the communities. A shift in microbial abundances was observed during the year with the biggest temperature differential (mean 4.16 °C above ambient). Over the next 3 years, cas1 prevalence increased, even in the phyla with decreased microbial abundances, suggesting increasing virus–host interactions in response to soil warming. This targeted method provides an alternative means to effectively mine cas1 from metagenomes and uncover the host communities.
Affiliation(s)
- Ruonan Wu
  - Laboratory of Environmental Microbiology and Toxicology, School of Biological Sciences, Faculty of Science, The University of Hong Kong, Hong Kong SAR, China
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
- Benli Chai
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
- James R Cole
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, USA
- Santosh K Gunturu
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
- Xue Guo
  - Department of Microbiology & Plant Biology, Institute for Environmental Genomics, and School of Civil Engineering and Environmental Sciences, University of Oklahoma, Norman, OK, USA
  - State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
- Renmao Tian
  - Department of Microbiology & Plant Biology, Institute for Environmental Genomics, and School of Civil Engineering and Environmental Sciences, University of Oklahoma, Norman, OK, USA
  - Institute for Food Safety and Health, Illinois Institute of Technology, Chicago, IL, USA
- Ji-Dong Gu
  - Laboratory of Environmental Microbiology and Toxicology, School of Biological Sciences, Faculty of Science, The University of Hong Kong, Hong Kong SAR, China
- Jizhong Zhou
  - Department of Microbiology & Plant Biology, Institute for Environmental Genomics, and School of Civil Engineering and Environmental Sciences, University of Oklahoma, Norman, OK, USA
  - State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
  - Earth and Environmental Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- James M Tiedje
  - Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, USA
|
96
|
Abstract
Over the past decade, it has become exceedingly clear that the microbiome is a critical factor in human health and disease and thus should be investigated to develop innovative treatment strategies. The field of metagenomics has come a long way in leveraging the advances of next-generation sequencing technologies resulting in the capability to identify and quantify all microorganisms present in human specimens. However, the field of metagenomics is still in its infancy, specifically in regard to the limitations in computational analysis, statistical assessments, standardization, and validation due to vast variability in the cohorts themselves, experimental design, and bioinformatic workflows. This review summarizes the methods, technologies, computational tools, and model systems for characterizing and studying the microbiome. We also discuss important considerations investigators must make when interrogating the involvement of the microbiome in health and disease in order to establish robust results and mechanistic insights before moving into therapeutic design and intervention.
|
97
|
Bharti R, Grimm DG. Current challenges and best-practice protocols for microbiome analysis. Brief Bioinform 2019; 22:178-193. [PMID: 31848574 PMCID: PMC7820839 DOI: 10.1093/bib/bbz155] [Citation(s) in RCA: 215] [Impact Index Per Article: 43.0] [Received: 09/24/2019] [Revised: 10/23/2019] [Accepted: 11/06/2019] [Indexed: 12/15/2022] Open
Abstract
Analyzing the microbiome of diverse species and environments using next-generation sequencing techniques has significantly enhanced our understanding of the metabolic, physiological and ecological roles of environmental microorganisms. However, the analysis of the microbiome is affected by experimental conditions (e.g. sequencing errors and genomic repeats) and by computationally intensive and cumbersome downstream analysis (e.g. quality control, assembly, binning and statistical analyses). Moreover, the introduction of new sequencing technologies and protocols has led to a flood of new methodologies, which also have an immediate effect on the results of the analyses. The aim of this work is to review the most important workflows for 16S rRNA sequencing and shotgun and long-read metagenomics, as well as to provide best-practice protocols on experimental design, sample processing, sequencing, assembly, binning, annotation and visualization. To simplify and standardize the computational analysis, we provide a set of best-practice workflows for 16S rRNA and metagenomic sequencing data (available at https://github.com/grimmlab/MicrobiomeBestPracticeReview).
Affiliation(s)
- Richa Bharti
  - Weihenstephan-Triesdorf University of Applied Sciences and Technical University of Munich, TUM Campus Straubing for Biotechnology and Sustainability, Straubing, Germany
- Dominik G Grimm
  - Weihenstephan-Triesdorf University of Applied Sciences and Technical University of Munich, TUM Campus Straubing for Biotechnology and Sustainability, Straubing, Germany
|
98
|
Hester ER, Jetten MSM, Welte CU, Lücker S. Metabolic Overlap in Environmentally Diverse Microbial Communities. Front Genet 2019; 10:989. [PMID: 31681424 PMCID: PMC6811665 DOI: 10.3389/fgene.2019.00989] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 06/06/2019] [Accepted: 09/17/2019] [Indexed: 12/23/2022] Open
Abstract
The majority of microbial communities consist of hundreds to thousands of species, creating a massive network of organisms competing for available resources within an ecosystem. In natural microbial communities, it appears that many microbial species have highly redundant metabolisms and seemingly are capable of utilizing the same substrates. This is paradoxical, as theory indicates that species requiring a common resource should outcompete one another. To better understand why microbial species can coexist, we developed metabolic overlap (MO) as a new metric to survey the functional redundancy of microbial communities at the genome scale across a wide variety of ecosystems. Using metagenome-assembled genomes, we surveyed nearly 1,000 studies across nine ecosystem types. We found the highest MO in extreme (i.e., low pH/high temperature) and aquatic environments, while the lowest MO was observed in communities associated with animal hosts, the built/engineered environment, and soil. In addition, different metabolism subcategories were explored for their degree of MO. For instance, overlap in nitrogen metabolism was among the lowest in animal and engineered ecosystems, while species from the built environment had the highest overlap. Together, we present a metric that utilizes whole genome information to explore overlapping niches of microbes. This provides a detailed picture of potential metabolic competition and cooperation between species present in an ecosystem, indicates the main substrate types sustaining the community, and serves as a valuable tool to generate hypotheses for future research.
Affiliation(s)
- Eric R Hester
  - Department of Microbiology, Radboud University, Nijmegen, Netherlands
- Mike S M Jetten
  - Department of Microbiology, Radboud University, Nijmegen, Netherlands
- Cornelia U Welte
  - Department of Microbiology, Radboud University, Nijmegen, Netherlands
- Sebastian Lücker
  - Department of Microbiology, Radboud University, Nijmegen, Netherlands
|
99
|
Douglas GM, Langille MGI. Current and Promising Approaches to Identify Horizontal Gene Transfer Events in Metagenomes. Genome Biol Evol 2019; 11:2750-2766. [PMID: 31504488 PMCID: PMC6777429 DOI: 10.1093/gbe/evz184] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Accepted: 08/19/2019] [Indexed: 12/16/2022] Open
Abstract
High-throughput shotgun metagenomics sequencing has enabled the profiling of myriad natural communities. These data are commonly used to identify gene families and pathways that were potentially gained or lost in an environment and which may be involved in microbial adaptation. Despite the widespread interest in these events, there are no established best practices for identifying gene gain and loss in metagenomics data. Horizontal gene transfer (HGT) represents several mechanisms of gene gain that are especially of interest in clinical microbiology due to the rapid spread of antibiotic resistance genes in natural communities. Several additional mechanisms of gene gain and loss, including gene duplication, gene loss-of-function events, and de novo gene birth are also important to consider in the context of metagenomes but have been less studied. This review is largely focused on detecting HGT in prokaryotic metagenomes, but methods for detecting these other mechanisms are first discussed. For this article to be self-contained, we provide a general background on HGT and the different possible signatures of this process. Lastly, we discuss how improved assembly of genomes from metagenomes would be the most straight-forward approach for improving the inference of gene gain and loss events. Several recent technological advances could help improve metagenome assemblies: long-read sequencing, determining the physical proximity of contigs, optical mapping of short sequences along chromosomes, and single-cell metagenomics. The benefits and limitations of these advances are discussed and open questions in this area are highlighted.
Affiliation(s)
- Gavin M Douglas
  - Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
- Morgan G I Langille
  - Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
|
100
|
Liu J, Taft DH, Maldonado-Gomez MX, Johnson D, Treiber ML, Lemay DG, DePeters EJ, Mills DA. The fecal resistome of dairy cattle is associated with diet during nursing. Nat Commun 2019; 10:4406. [PMID: 31562300 PMCID: PMC6765000 DOI: 10.1038/s41467-019-12111-x] [Citation(s) in RCA: 94] [Impact Index Per Article: 18.8] [Received: 11/09/2018] [Accepted: 07/24/2019] [Indexed: 01/07/2023] Open
Abstract
Antimicrobial resistance is a global public health concern, and livestock play a significant role in selecting for resistance and maintaining such reservoirs. Here we study the succession of dairy cattle resistome during early life using metagenomic sequencing, as well as the relationship between resistome, gut microbiota, and diet. In our dataset, the gut of dairy calves serves as a reservoir of 329 antimicrobial resistance genes (ARGs) presumably conferring resistance to 17 classes of antibiotics, and the abundance of ARGs declines gradually during nursing. ARGs appear to co-occur with antibacterial biocide or metal resistance genes. Colostrum is a potential source of ARGs observed in calves at day 2. The dynamic changes in the resistome are likely a result of gut microbiota assembly, which is closely associated with diet transition in dairy calves. Modifications in the resistome may be possible via early-life dietary interventions to reduce overall antimicrobial resistance.
Affiliation(s)
- Jinxin Liu
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - Foods for Health Institute, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
- Diana H Taft
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - Foods for Health Institute, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
- Maria X Maldonado-Gomez
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - Foods for Health Institute, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
- Daisy Johnson
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - Foods for Health Institute, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
- Michelle L Treiber
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - USDA ARS Western Human Nutrition Research Center, 430 West Health Sciences Dr., Davis, CA, 95616, USA
- Danielle G Lemay
  - USDA ARS Western Human Nutrition Research Center, 430 West Health Sciences Dr., Davis, CA, 95616, USA
  - Genome Center, University of California, 451 Health Science Dr., Davis, CA, 95616, USA
  - Department of Nutrition, University of California, Davis, California, Davis, CA, 95616, USA
- Edward J DePeters
  - Department of Animal Science, University of California, Davis, California, Davis, CA, 95616, USA
- David A Mills
  - Department of Food Science and Technology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, One Shields Ave., Davis, CA, 95616, USA
  - Foods for Health Institute, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
  - Department of Viticulture and Enology, Robert Mondavi Institute for Wine and Food Science, University of California, Davis, California, One Shields Ave., Davis, CA, 95616, USA
|