1. Achudhan AB, Kannan P, Gupta A, Saleena LM. A Review of Web-Based Metagenomics Platforms for Analysing Next-Generation Sequence Data. Biochem Genet 2024; 62:621-632. [PMID: 37507643 DOI: 10.1007/s10528-023-10467-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
Metagenomics has evolved into a promising technology for understanding microbial populations in the environment. Through metagenomics, a number of extreme and complex environments have been explored for their microbial populations. Using this technology, researchers have identified novel genes and their potential characteristics, which have robust applications in food, pharmaceutical, scientific research, and other biotechnological fields. A sequencing platform can provide the sequences of the microbial population in any given environment. These sequences need to be analysed computationally to derive meaningful information. It is often presumed that only bioinformaticians with extensive computational skills can process sequencing data through to the downstream end. However, numerous open-source software tools and online servers, developed for biologists with limited computational skills, are available to analyse metagenomic data. This review focuses on bioinformatics tools such as Galaxy, CSI-NGS portal, ANASTASIA and SHAMAN, EBI-metagenomics, IDseq, and MG-RAST for analysing metagenomic data.
Affiliation(s)
- Arunmozhi Bharathi Achudhan
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Priya Kannan
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Annapurna Gupta
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Lilly M Saleena
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
2. Wang L, Ding R, He S, Wang Q, Zhou Y. A Pipeline for Constructing Reference Genomes for Large Cohort-Specific Metagenome Compression. Microorganisms 2023; 11:2560. [PMID: 37894218 PMCID: PMC10609127 DOI: 10.3390/microorganisms11102560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Revised: 09/16/2023] [Accepted: 09/18/2023] [Indexed: 10/29/2023] Open
Abstract
Metagenomic data compression is increasingly important as metagenomic projects now face larger data volumes per sample and more samples. Reference-based compression is a promising method for obtaining a high compression ratio. However, existing microbial reference genome databases are not suitable for direct use as compression references because of their large size and redundancy, and different metagenomic cohorts often differ in microbial composition. We present a novel pipeline that generates simplified and tailored reference genomes for large metagenomic cohorts, enabling the reference-based compression of metagenomic data. We constructed customized reference genomes, ranging from 2.4 to 3.9 GB, for 29 real metagenomic datasets and evaluated their compression performance. Reference-based compression achieved a compression ratio of over 20 for human whole-genome data and up to 33.8 for all samples, a 4.5-fold improvement over standard Gzip compression. Our method provides new insights into reference-based metagenomic data compression and has broad application potential for faster and cheaper data transfer, storage, and analysis.
Affiliation(s)
- Linqi Wang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200438, China
- Renpeng Ding
  - MGI Tech, Shenzhen 518083, China
- Shixu He
  - MGI Tech, Shenzhen 518083, China
- Qinyu Wang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200438, China
- Yan Zhou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200438, China
  - MGI Tech, Shenzhen 518083, China
3. Blumberg KL, Ponsero AJ, Bomhoff M, Wood-Charlson EM, DeLong EF, Hurwitz BL. Ontology-Enriched Specifications Enabling Findable, Accessible, Interoperable, and Reusable Marine Metagenomic Datasets in Cyberinfrastructure Systems. Front Microbiol 2021; 12:765268. [PMID: 34956127 PMCID: PMC8692764 DOI: 10.3389/fmicb.2021.765268] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/26/2021] [Accepted: 11/16/2021] [Indexed: 11/13/2022] Open
Abstract
Marine microbial ecology requires the systematic comparison of biogeochemical and sequence data to analyze environmental influences on the distribution and variability of microbial communities. With ever-increasing quantities of metagenomic data, there is a growing need to make datasets Findable, Accessible, Interoperable, and Reusable (FAIR) across diverse ecosystems. FAIR data is essential to developing analytical frameworks that integrate microbiological, genomic, ecological, oceanographic, and computational methods. Although community standards defining the minimal metadata required to accompany sequence data exist, they haven’t been consistently used across projects, precluding interoperability. Moreover, these data are not machine-actionable or discoverable by cyberinfrastructure systems. By making ‘omic and physicochemical datasets FAIR to machine systems, we can enable sequence data discovery and reuse based on machine-readable descriptions of environments or physicochemical gradients. In this work, we developed a novel technical specification for dataset encapsulation for the FAIR reuse of marine metagenomic and physicochemical datasets within cyberinfrastructure systems. This includes using Frictionless Data Packages enriched with terminology from environmental and life-science ontologies to annotate measured variables, their units, and the measurement devices used. This approach was implemented in Planet Microbe, a cyberinfrastructure platform and marine metagenomic web-portal. Here, we discuss the data properties built into the specification to make global ocean datasets FAIR within the Planet Microbe portal. We additionally discuss the selection of, and contributions to marine-science ontologies used within the specification. Finally, we use the system to discover data by which to answer various biological questions about environments, physicochemical gradients, and microbial communities in meta-analyses. 
This work represents a future direction in marine metagenomic research by proposing a specification for FAIR dataset encapsulation that, if adopted within cyberinfrastructure systems, would automate the discovery, exchange, and re-use of data needed to answer broader reaching questions than originally intended.
Affiliation(s)
- Kai L Blumberg
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
- Alise J Ponsero
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
- Matthew Bomhoff
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
- Elisha M Wood-Charlson
  - E.O. Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Berkeley, CA, United States
- Edward F DeLong
  - Daniel K. Inouye Center for Microbial Oceanography, University of Hawai'i, Honolulu, HI, United States
- Bonnie L Hurwitz
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
  - BIO5 Institute, University of Arizona, Tucson, AZ, United States
4. Shao L, Liao J, Qian J, Chen W, Fan X. MetaGeneBank: a standardized database to study deep sequenced metagenomic data from human fecal specimen. BMC Microbiol 2021; 21:263. [PMID: 34592929 PMCID: PMC8485520 DOI: 10.1186/s12866-021-02321-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2021] [Accepted: 08/23/2021] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Microbiome big data from population-scale cohorts holds the key to unleashing the power of microbiomes to overcome critical challenges in disease control, treatment and precision medicine. However, variations introduced during data generation and processing limit the comparability and interpretability of independent studies. Although multiple databases have been constructed as platforms for data reuse, they are of limited value since only raw sequencing files are considered. DESCRIPTION Here, we present MetaGeneBank, a standardized database that provides details on sample collection and sequencing, and abundances of genes, microbiota and molecular functions for 4470 raw sequencing files (over 12 TB) collected from 16 studies covering over 10 types of diseases and 14 countries using a unified data-processing pipeline. The incorporation of tools that enable browsing and searching by descriptive attributes, gene sequences, microbiota and functions makes the database user-friendly. We found that the source of the specimen contributes more than sequencing centers or platforms to variations in microbiota. Special attention should be paid when re-analyzing sequencing files from different countries. CONCLUSIONS Collectively, MetaGeneBank provides a gateway to utilize the untapped potential of gut metagenomic data in helping to fight human diseases. With the continuous updating of the database in terms of data volume, data types and sample types, MetaGeneBank would undoubtedly be the benchmarking database in the future in respect of data reuse, and would be valuable in translational science.
Affiliation(s)
- Li Shao
  - Hangzhou Normal University, Institute of Translational Medicine, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, 311121, Zhejiang, China
  - iMedicine Lab, Alibaba-Zhejiang University Joint Research Center for Future Digital Health, Hangzhou, 310018, Zhejiang, China
- Jie Liao
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310003, China
- Jingyang Qian
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310003, China
- Wenbin Chen
  - The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310003, Zhejiang, China
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310003, China
  - iMedicine Lab, Alibaba-Zhejiang University Joint Research Center for Future Digital Health, Hangzhou, 310018, Zhejiang, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Hangzhou, 310058, China
5. Zeng T, Yu X, Chen Z. Applying artificial intelligence in the microbiome for gastrointestinal diseases: A review. J Gastroenterol Hepatol 2021; 36:832-840. [PMID: 33880762 DOI: 10.1111/jgh.15503] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/09/2021] [Revised: 03/18/2021] [Accepted: 03/18/2021] [Indexed: 12/20/2022]
Abstract
Gut bacteria have long been recognized for their important roles in the occurrence and progression of gastrointestinal diseases such as colorectal cancer, and the ever-increasing amounts of microbiome data, combined with other high-quality clinical and imaging datasets, are leading the study of gastrointestinal diseases into an era of biomedical big data. The "omics" technologies used for microbiome analysis continue to evolve, and machine learning and artificial intelligence technologies are key to extracting the relevant information from microbiome data. This review provides a focused summary of recent research and applications of microbiome big data and discusses the use of artificial intelligence to combat gastrointestinal diseases.
Affiliation(s)
- Tao Zeng
  - CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, China
- Xiangtian Yu
  - Clinical Research Center, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Zhangran Chen
  - Institute for Microbial Ecology, School of Medicine, Xiamen University, Xiamen, China
6. Keller-Costa T, Lago-Lestón A, Saraiva JP, Toscan R, Silva SG, Gonçalves J, Cox CJ, Kyrpides N, Nunes da Rocha U, Costa R. Metagenomic insights into the taxonomy, function, and dysbiosis of prokaryotic communities in octocorals. Microbiome 2021; 9:72. [PMID: 33766108 PMCID: PMC7993494 DOI: 10.1186/s40168-021-01031-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 09/08/2020] [Accepted: 02/08/2021] [Indexed: 05/06/2023]
Abstract
BACKGROUND In octocorals (Cnidaria, Octocorallia), the functional relationship between host health and its symbiotic consortium has yet to be determined. Here, we employed comparative metagenomics to uncover the distinct functional and phylogenetic features of the microbiomes of healthy Eunicella gazella, Eunicella verrucosa, and Leptogorgia sarmentosa tissues, in contrast with the microbiomes found in seawater and sediments. We further explored how the octocoral microbiome shifts to a pathobiome state in E. gazella. RESULTS Multivariate analyses based on 16S rRNA genes, Clusters of Orthologous Groups of proteins (COGs), Protein families (Pfams), and secondary metabolite-biosynthetic gene clusters annotated from 20 Illumina-sequenced metagenomes each revealed separate clustering of the prokaryotic communities of healthy tissue samples of the three octocoral species from those of necrotic E. gazella tissue and surrounding environments. While the healthy octocoral microbiome was distinguished by so-far uncultivated Endozoicomonadaceae, Oceanospirillales, and Alteromonadales phylotypes in all host species, a pronounced increase of Flavobacteriaceae and Alphaproteobacteria, originating from seawater, was observed in necrotic E. gazella tissue. Increased abundances of eukaryotic-like proteins, exonucleases, restriction endonucleases, CRISPR/Cas proteins, and genes encoding heat-shock proteins, inorganic ion transport, and iron storage distinguished the prokaryotic communities of healthy octocoral tissue regardless of the host species. An increase of arginase and nitric oxide reductase genes, observed in necrotic E. gazella tissues, suggests the existence of a mechanism for suppression of nitric oxide production by which octocoral pathogens may overcome the host's immune system.
CONCLUSIONS This is the first study to employ primer-less, shotgun metagenome sequencing to unveil the taxonomic, functional, and secondary metabolism features of prokaryotic communities in octocorals. Our analyses reveal that the octocoral microbiome is distinct from those of the environmental surroundings, is host genus (but not species) specific, and undergoes large, complex structural changes in the transition to the dysbiotic state. Host-symbiont recognition, abiotic-stress response, micronutrient acquisition, and an antiviral defense arsenal comprising multiple restriction endonucleases, CRISPR/Cas systems, and phage lysogenization regulators are signatures of prokaryotic communities in octocorals. We argue that these features collectively contribute to the stabilization of symbiosis in the octocoral holobiont and constitute beneficial traits that can guide future studies on coral reef conservation and microbiome therapy.
Affiliation(s)
- T. Keller-Costa
  - Instituto de Bioengenharia e Biociências (iBB), Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisbon, Portugal
- A. Lago-Lestón
  - División de Biología Experimental y Aplicada (DBEA), Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Carr. Ensenada-Tijuana 3918, Zona Playitas, C.P 22860 Ensenada, Baja California, Mexico
- J. P. Saraiva
  - Helmholtz Centre for Environmental Research (UFZ), Leipzig, 04318 Germany
- R. Toscan
  - Helmholtz Centre for Environmental Research (UFZ), Leipzig, 04318 Germany
- S. G. Silva
  - Instituto de Bioengenharia e Biociências (iBB), Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisbon, Portugal
- J. Gonçalves
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, 8005-139 Faro, Portugal
- C. J. Cox
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, 8005-139 Faro, Portugal
- N. Kyrpides
  - Department of Energy, Joint Genome Institute, Berkeley, CA 94720 USA
- U. Nunes da Rocha
  - Helmholtz Centre for Environmental Research (UFZ), Leipzig, 04318 Germany
- R. Costa
  - Instituto de Bioengenharia e Biociências (iBB), Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisbon, Portugal
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, 8005-139 Faro, Portugal
  - Department of Energy, Joint Genome Institute, Berkeley, CA 94720 USA
7. Raimundo I, Silva R, Meunier L, Valente SM, Lago-Lestón A, Keller-Costa T, Costa R. Functional metagenomics reveals differential chitin degradation and utilization features across free-living and host-associated marine microbiomes. Microbiome 2021; 9:43. [PMID: 33583433 PMCID: PMC7883442 DOI: 10.1186/s40168-020-00970-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/06/2020] [Accepted: 10/18/2020] [Indexed: 06/01/2023]
Abstract
BACKGROUND Chitin ranks as the most abundant polysaccharide in the oceans, yet knowledge of shifts in the structure and diversity of chitin-degrading communities across marine niches is scarce. Here, we integrate cultivation-dependent and -independent approaches to shed light on the chitin processing potential within the microbiomes of marine sponges, octocorals, sediments, and seawater. RESULTS We found that cultivatable host-associated bacteria in the genera Aquimarina, Enterovibrio, Microbulbifer, Pseudoalteromonas, Shewanella, and Vibrio were able to degrade colloidal chitin in vitro. Congruent with enzymatic activity bioassays, genome-wide inspection of cultivated symbionts revealed that Vibrio and Aquimarina species, particularly, possess several endo- and exo-chitinase-encoding genes underlying their ability to cleave the large chitin polymer into oligomers and dimers. Conversely, Alphaproteobacteria species were more often found to specialize in the utilization of the chitin monomer N-acetylglucosamine. Phylogenetic assessments uncovered a high degree of within-genome diversification of multiple, full-length endo-chitinase genes for Aquimarina and Vibrio strains, suggestive of a versatile chitin catabolism aptitude. We then analyzed the abundance distributions of chitin metabolism-related genes across 30 Illumina-sequenced microbial metagenomes and found that the endosymbiotic consortium of Spongia officinalis is enriched in polysaccharide deacetylases, suggesting the ability of the marine sponge microbiome to convert chitin into its deacetylated, and biotechnologically versatile, form, chitosan. In contrast, the abundance of endo-chitinase and chitin-binding protein-encoding genes in healthy octocorals was comparable to that in the surrounding environment but was depleted in necrotic octocoral tissue.
Using cultivation-independent, taxonomic assignments of endo-chitinase encoding genes, we unveiled previously unsuspected richness and divergent structures of chitinolytic communities across host-associated and free-living biotopes, revealing putative roles for uncultivated Gammaproteobacteria and Chloroflexi symbionts in chitin processing within sessile marine invertebrates. CONCLUSIONS Our findings suggest that differential chitin degradation pathways, utilization, and turnover dictate the processing of chitin across marine micro-niches and support the hypothesis that inter-species cross-feeding could facilitate the co-existence of chitin utilizers within marine invertebrate microbiomes. We further identified chitin metabolism functions which may serve as indicators of microbiome integrity/dysbiosis in corals and revealed putative novel chitinolytic enzymes in the genus Aquimarina that may find applications in the blue biotechnology sector.
Affiliation(s)
- I. Raimundo
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
- R. Silva
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
- L. Meunier
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
  - Laboratory of Aquatic Systems Ecology, Université Libre de Bruxelles, Brussels, Belgium
- S. M. Valente
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
- A. Lago-Lestón
  - Department of Medical Innovation, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Mexico
- T. Keller-Costa
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
- R. Costa
  - Instituto de Bioengenharia e Biociências, Instituto Superior Técnico (IST), Universidade de Lisboa, Av. Rovisco Pais 1, Torre Sul, Piso 11, 11.6.11b, 1049-001 Lisbon, Portugal
  - Centro de Ciências do Mar (CCMAR), Universidade do Algarve, 8005-139 Faro, Portugal
  - Department of Energy, Joint Genome Institute, Berkeley, CA 94720 USA
  - Lawrence Berkeley National Laboratory, Berkeley, CA 94720 USA
8. Environmental prospecting of black yeast-like agents of human disease using culture-independent methodology. Sci Rep 2020; 10:14229. [PMID: 32848176 PMCID: PMC7450056 DOI: 10.1038/s41598-020-70915-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2019] [Accepted: 07/22/2020] [Indexed: 11/16/2022] Open
Abstract
Melanized fungi and black yeasts in the family Herpotrichiellaceae (order Chaetothyriales) are important agents of human and animal infectious diseases such as chromoblastomycosis and phaeohyphomycosis. The oligotrophic nature of these fungi enables them to survive in adverse environments where common saprobes are absent. Because of their slow growth, they are outcompeted by common saprobes, and isolation studies have therefore yielded low frequencies of clinically relevant species in the environmental habitats from which humans are thought to be infected. This problem can be solved with metagenomic techniques, which allow microorganisms to be recognized independently of culture. The present study aimed to identify species of the family Herpotrichiellaceae that are known to occur in Brazil by using molecular markers to screen public environmental metagenomic datasets from Brazil available in the Sequence Read Archive (SRA). Species characterization was performed by BLAST comparison with previously described barcodes and padlock probe sequences. A total of 18,329 sequences were collected, comprising the genera Cladophialophora, Exophiala, Fonsecaea, Rhinocladiella and Veronaea, with a focus on species related to chromoblastomycosis. The data obtained in this study demonstrated the presence of these opportunists in the investigated datasets. The techniques used contribute to our understanding of the environmental occurrence and epidemiology of black fungi.
9. Rodriguez-R LM, Tsementzi D, Luo C, Konstantinidis KT. Iterative subtractive binning of freshwater chronoseries metagenomes identifies over 400 novel species and their ecologic preferences. Environ Microbiol 2020; 22:3394-3412. [PMID: 32495495 DOI: 10.1111/1462-2920.15112] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 01/02/2020] [Revised: 04/26/2020] [Accepted: 05/31/2020] [Indexed: 01/22/2023]
Abstract
Recent advances in sequencing technology and bioinformatic pipelines have allowed unprecedented access to the genomes of yet-uncultivated microorganisms from diverse environments. However, the catalogue of freshwater genomes remains limited, and most genome recovery attempts in freshwater ecosystems have only targeted specific taxa. Here, we present a genome recovery pipeline incorporating iterative subtractive binning, and apply it to a time series of 100 metagenomic datasets from seven connected lakes and estuaries along the Chattahoochee River (Southeastern USA). Our set of metagenome-assembled genomes (MAGs) represents >400 yet-unnamed genomospecies, substantially increasing the number of high-quality MAGs from freshwater lakes. We propose names for two novel species: 'Candidatus Elulimicrobium humile' ('Ca. Elulimicrobiota', 'Patescibacteria') and 'Candidatus Aquidulcis frankliniae' ('Chloroflexi'). Collectively, our MAGs represented about half of the total microbial community at any sampling point. To evaluate the prevalence of these genomospecies in the chronoseries, we introduce methodologies to estimate relative abundance and habitat preference that control for uneven genome quality and sample representation. We demonstrate high degrees of habitat-specialization and endemicity for most genomospecies in the Chattahoochee lakes. Wider ecological ranges characterized smaller genomes with higher coding densities, indicating an overall advantage of smaller, more compact genomes for cosmopolitan distributions.
Affiliation(s)
- Luis M Rodriguez-R
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, 311 Ferst Dr NW, Atlanta, GA, 30332, USA
- Despina Tsementzi
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, 311 Ferst Dr NW, Atlanta, GA, 30332, USA
- Chengwei Luo
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, 311 Ferst Dr NW, Atlanta, GA, 30332, USA
- Konstantinos T Konstantinidis
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, 311 Ferst Dr NW, Atlanta, GA, 30332, USA
10. Buongermino Pereira M, Österlund T, Eriksson KM, Backhaus T, Axelson-Fisk M, Kristiansson E. A comprehensive survey of integron-associated genes present in metagenomes. BMC Genomics 2020; 21:495. [PMID: 32689930 PMCID: PMC7370490 DOI: 10.1186/s12864-020-06830-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/03/2019] [Accepted: 06/15/2020] [Indexed: 12/19/2022] Open
Abstract
Background Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. Previous studies have started to characterize the wide range of biological functions present in integrons. However, the efforts have so far mainly been limited to genomes from cultivable bacteria and amplicons generated by PCR, thus targeting only a small part of the total integron diversity. Metagenomic data, generated by direct sequencing of environmental and clinical samples, provides a more holistic and unbiased analysis of integron-associated genes. However, the fragmented nature of metagenomic data has previously made such analysis highly challenging. Results Here, we present a systematic survey of integron-associated genes in metagenomic data. The analysis was based on a newly developed computational method where integron-associated genes were identified by detecting their associated recombination sites. By processing contiguous sequences assembled from more than 10 terabases of metagenomic data, we were able to identify 13,397 unique integron-associated genes. Metagenomes from marine microbial communities had the highest occurrence of integron-associated genes with levels more than 100-fold higher than in the human microbiome. The identified genes had a large functional diversity spanning over several functional classes. Genes associated with defense mechanisms and mobility facilitators were most overrepresented and more than five times as common in integrons compared to other bacterial genes. As many as two thirds of the genes were found to encode proteins of unknown function. Less than 1% of the genes were associated with antibiotic resistance, of which several were novel, previously undescribed, resistance gene variants. 
Conclusions Our results highlight the large functional diversity maintained by integrons present in unculturable bacteria and significantly expand the number of described integron-associated genes.
Collapse
Affiliation(s)
- Mariana Buongermino Pereira
- Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden.,Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
| | - Tobias Österlund
- Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden.,Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
| | - K Martin Eriksson
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden.,Gothenburg Centre for Sustainable Development, Chalmers University of Technology, Gothenburg, Sweden
| | - Thomas Backhaus
- Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden.,Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Marina Axelson-Fisk
- Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
| | - Erik Kristiansson
- Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden.,Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
|
11
|
Kumar A, Dubey A. Rhizosphere microbiome: Engineering bacterial competitiveness for enhancing crop production. J Adv Res 2020; 24:337-352. [PMID: 32461810 PMCID: PMC7240055 DOI: 10.1016/j.jare.2020.04.014] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 02/11/2020] [Revised: 04/15/2020] [Accepted: 04/25/2020] [Indexed: 12/29/2022]
Abstract
Plants in nature are constantly exposed to a variety of abiotic and biotic stresses that limit their growth and production. Enhancing crop yield and production to feed an exponentially growing global population in a sustainable manner, with reduced chemical fertilization and agrochemicals, will be a big challenge. Recently, the targeted application of beneficial plant microbiomes and their cocktails to counteract abiotic and biotic stress has been gaining momentum and has become an exciting frontier of research. Advances in next generation sequencing (NGS) platforms, gene editing technologies, metagenomics and bioinformatics approaches allow us to unravel the entangled webs of interactions of holobionts and core microbiomes for efficiently deploying the microbiome to increase crop nutrient acquisition and resistance to abiotic and biotic stress. In this review, we focus on shaping the rhizosphere microbiome of a susceptible host plant from that of a resistant plant, which comprises a specific type of microbial community with multiple potential benefits, and on targeted CRISPR/Cas9-based strategies for the manipulation of susceptibility genes in crop plants for improving plant health. This review is significant in providing first-hand information to improve fundamental understanding of the processes that help shape the rhizosphere microbiome.
Affiliation(s)
- Ashwani Kumar
- Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (A Central University), Sagar 470003, M.P., India
| | - Anamika Dubey
- Metagenomics and Secretomics Research Laboratory, Department of Botany, Dr. Harisingh Gour University (A Central University), Sagar 470003, M.P., India
|
12
|
Meyer F, Bagchi S, Chaterji S, Gerlach W, Grama A, Harrison T, Paczian T, Trimble WL, Wilke A. MG-RAST version 4-lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis. Brief Bioinform 2020; 20:1151-1159. [PMID: 29028869 DOI: 10.1093/bib/bbx105] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Received: 05/25/2017] [Revised: 07/21/2017] [Indexed: 11/12/2022]
Abstract
As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large-volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements of the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners that will enable double barcoding and stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, both to facilitate development and efficient high-performance implementation of the community's data analysis tasks.
|
13
|
Dovrolis N, Kolios G, Spyrou GM, Maroulakou I. Computational profiling of the gut-brain axis: microflora dysbiosis insights to neurological disorders. Brief Bioinform 2020; 20:825-841. [PMID: 29186317 DOI: 10.1093/bib/bbx154] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/17/2017] [Revised: 10/17/2017] [Indexed: 12/14/2022]
Abstract
Almost 2500 years after Hippocrates' observations on health and its direct association to the gastrointestinal tract, a paradigm shift has recently occurred, making the gut and its symbionts (bacteria, fungi, archaea and viruses) a point of convergence for studies. It is nowadays well established that the gut microflora's compositional diversity regulates via its genes (the microbiome) the host's health and provides preliminary insights into disease progression and regulation. The microbiome's involvement is evident in immunological and physiological studies that link changes in its biodiversity to its contributions to the host's phenotype but also in neurological investigations, substantiating the aptly named gut-brain axis. The definitive mechanisms of this last bidirectional interaction will be our main focus because it presents researchers with a new conundrum. In this review, we prospect current literature for computational analysis methodologies that accommodate the need for better understanding of the microbiome-gut-brain interactions and neurological disorder onset and progression, through cross-disciplinary systems biology applications. We will present bioinformatics tools used in exploring these synergies that help build and interpret microbial 16S ribosomal RNA data sets, produced by shotgun and high-throughput sequencing of healthy and neurological disorder samples stored in biological databases. These approaches provide alternative means for researchers to form hypotheses to their inquests faster, cheaper and with precision. The goal of these studies relies on the integration of combined metagenomics and metabolomics assessments. An accurate characterization of the microbiome and its functionality can support new diagnostic, prognostic and therapeutic strategies for neurological disorders, customized for each individual host.
|
14
|
Reconstruction and in silico analysis of new Marinobacter adhaerens t76_800 with potential for long-chain hydrocarbon bioremediation associated with marine environmental lipases. Mar Genomics 2020. [DOI: 10.1016/j.margen.2019.04.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
|
15
|
Krishna SBN, Dubey A, Malla MA, Kothari R, Upadhyay CP, Adam JK, Kumar A. Integrating Microbiome Network: Establishing Linkages Between Plants, Microbes and Human Health. Open Microbiol J 2019. [DOI: 10.2174/1874285801913020330] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/29/2022]
Abstract
The trillions of microbes that colonize and live around us govern the health of both plants and animals through a cascade of direct and indirect mechanisms. Understanding this enormous and largely untapped microbial diversity has been the focus of microbial research for the past few decades. Amidst the advancements in sequencing technologies, significant progress has been made to taxonomically and functionally catalogue these microbes and also to establish their exact role in health and disease states. Like humans, plants are surrounded by a vast diversity of microbes that form complex ecological communities that affect plant growth and health through collective metabolic activities and interactions. This plant microbiome has a substantial influence on human health and the environment via its passage through the nasal route and digestive tract and is responsible for changing our gut microbiome. This review primarily focuses on the advances and challenges in microbiome research at the interface of plant and human, and on the role of the microbiome in different compartments of the body’s ecosystems along with their correlation to health and diseases. This review also highlights the potential therapies for modulating the gut microbiota and technologies for studying the microbiome.
|
16
|
Chemical and microbial diversity covary in fresh water to influence ecosystem functioning. Proc Natl Acad Sci U S A 2019; 116:24689-24695. [PMID: 31740592 DOI: 10.1073/pnas.1904896116] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Indexed: 11/18/2022]
Abstract
Invisible to the naked eye lies a tremendous diversity of organic molecules and organisms that make major contributions to important biogeochemical cycles. However, how the diversity and composition of these two communities are interlinked remains poorly characterized in fresh waters, despite the potential for chemical and microbial diversity to promote one another. Here we exploited gradients in chemodiversity within a common microbial pool to test how chemical and biological diversity covary and characterized the implications for ecosystem functioning. We found that both chemodiversity and genes associated with organic matter decomposition increased as more plant litterfall accumulated in experimental lake sediments, consistent with scenarios of future environmental change. Chemical and microbial diversity were also positively correlated, with dissolved organic matter having stronger effects on microbes than vice versa. Under our experimental scenarios that increased sediment organic matter from 5 to 25% or darkened overlying waters by 2.5 times, the resulting increases in chemodiversity could increase greenhouse gas concentrations in lake sediments by an average of 1.5 to 2.7 times, when all of the other effects of litterfall and water color were considered. Our results open a major new avenue for research in aquatic ecosystems by exposing connections between chemical and microbial diversity and their implications for the global carbon cycle in greater detail than ever before.
|
17
|
Paczian T, Trimble WL, Gerlach W, Harrison T, Wilke A, Meyer F. The MG-RAST API explorer: an on-ramp for RESTful query composition. BMC Bioinformatics 2019; 20:561. [PMID: 31703549 PMCID: PMC6842160 DOI: 10.1186/s12859-019-2993-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/11/2018] [Accepted: 07/11/2019] [Indexed: 12/04/2022]
Abstract
Background The MG-RAST API provides search capabilities and delivers organism and function data as well as raw or annotated sequence data via the web interface and its RESTful API. For casual users, however, RESTful APIs are hard to learn and work with. Results We created the graphical MG-RAST API explorer to help researchers more easily build and export API queries; understand the data abstractions and indices available in MG-RAST; and use the results presented in-browser for exploration, development, and debugging. Conclusions The API explorer lowers the barrier to entry for occasional or first-time MG-RAST API users.
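As a rough illustration of what "RESTful query composition" involves, the sketch below assembles a GET URL from a resource name, an optional identifier, and query options using only the Python standard library. The base URL, the `metagenome` resource, the `verbosity` option, and the placeholder ID `mgm0000000.0` are assumptions made for illustration and are not taken from the abstract; the MG-RAST API documentation should be consulted for the actual resources and parameters. No network request is made.

```python
from urllib.parse import urlencode

# Assumed base URL; verify against the MG-RAST API documentation before use.
API_BASE = "https://api.mg-rast.org"


def build_query(resource: str, identifier: str = "", **params: str) -> str:
    """Compose a RESTful GET URL from a resource, an optional ID, and query options."""
    # Join the non-empty path parts with single slashes.
    path = "/".join(part.strip("/") for part in (API_BASE, resource, identifier) if part)
    # Sort the options so the same inputs always yield the same URL.
    query = urlencode(sorted(params.items()))
    return f"{path}?{query}" if query else path


# Hypothetical resource, ID, and option, shown only to illustrate the URL shape.
print(build_query("metagenome", "mgm0000000.0", verbosity="minimal"))
```

A graphical explorer of the kind described in the abstract would presumably let a user pick these pieces from menus and export the resulting URL, rather than assembling it by hand.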
Affiliation(s)
- Tobias Paczian
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
| | - William L Trimble
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
| | - Wolfgang Gerlach
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
| | - Travis Harrison
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
| | - Andreas Wilke
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
| | - Folker Meyer
- Argonne National Laboratory, Lemont, IL, USA.,University of Chicago, Chicago, IL, USA
|
18
|
Bioinformatics for Marine Products: An Overview of Resources, Bottlenecks, and Perspectives. Mar Drugs 2019; 17:md17100576. [PMID: 31614509 PMCID: PMC6835618 DOI: 10.3390/md17100576] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/20/2019] [Revised: 10/01/2019] [Accepted: 10/02/2019] [Indexed: 12/13/2022]
Abstract
The sea represents a major source of biodiversity. It exhibits many different ecosystems in a huge variety of environmental conditions where marine organisms have evolved with extensive diversification of structures and functions, making the marine environment a treasure trove of molecules with potential for biotechnological applications and innovation in many different areas. Rapid progress of the omics sciences has revealed novel opportunities to advance the knowledge of biological systems, paving the way for an unprecedented revolution in the field and expanding marine research from model organisms to an increasing number of marine species. Multi-level approaches based on molecular investigations at genomic, metagenomic, transcriptomic, metatranscriptomic, proteomic, and metabolomic levels are essential to discover marine resources and further explore key molecular processes involved in their production and action. As a consequence, omics approaches, accompanied by the associated bioinformatic resources and computational tools for molecular analyses and modeling, are boosting the rapid advancement of biotechnologies. In this review, we provide an overview of the most relevant bioinformatic resources and major approaches, highlighting perspectives and bottlenecks for an appropriate exploitation of these opportunities for biotechnology applications from marine resources.
|
19
|
Gao NL, Zhang C, Zhang Z, Hu S, Lercher MJ, Zhao XM, Bork P, Liu Z, Chen WH. MVP: a microbe-phage interaction database. Nucleic Acids Res 2019; 46:D700-D707. [PMID: 29177508 PMCID: PMC5753265 DOI: 10.1093/nar/gkx1124] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 08/15/2017] [Accepted: 11/19/2017] [Indexed: 12/15/2022]
Abstract
Phages invade microbes, accomplish host lysis and are of vital importance in shaping the community structure of environmental microbiota. More importantly, most phages have very specific hosts; they are thus ideal tools to manipulate environmental microbiota at species-resolution. The main purpose of MVP (Microbe Versus Phage) is to provide a comprehensive catalog of phage–microbe interactions and assist users to select phage(s) that can target (and potentially to manipulate) specific microbes of interest. We first collected 50 782 viral sequences from various sources and clustered them into 33 097 unique viral clusters based on sequence similarity. We then identified 26 572 interactions between 18 608 viral clusters and 9245 prokaryotes (i.e. bacteria and archaea); we established these interactions based on 30 321 evidence entries that we collected from published datasets, public databases and re-analysis of genomic and metagenomic sequences. Based on these interactions, we calculated the host range for each of the phage clusters and accordingly grouped them into subgroups such as ‘species-’, ‘genus-’ and ‘family-’ specific phage clusters. MVP is equipped with a modern, responsive and intuitive interface, and is freely available at: http://mvp.medgenius.info.
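The grouping of phage clusters by host range can be pictured with a small sketch. The abstract does not give MVP's actual procedure, so the rule below is an assumption: a cluster is labelled with the most specific taxonomic rank shared by all of its recorded hosts. The lineages, the `host_range` function, and the `broad` and `unknown` labels are hypothetical and used only for illustration.

```python
from typing import List, Tuple

# Hypothetical host lineages written as (family, genus, species); not MVP data.
Lineage = Tuple[str, str, str]
RANKS = ("family", "genus", "species")


def host_range(hosts: List[Lineage]) -> str:
    """Return the most specific rank shared by all hosts of one phage cluster."""
    if not hosts:
        return "unknown"
    label = "broad"  # default when the hosts span more than one family
    for depth, rank in enumerate(RANKS):
        # All hosts must agree on every rank down to and including this one.
        if len({host[: depth + 1] for host in hosts}) == 1:
            label = f"{rank}-specific"
        else:
            break
    return label


hosts = [
    ("Enterobacteriaceae", "Escherichia", "Escherichia coli"),
    ("Enterobacteriaceae", "Escherichia", "Escherichia albertii"),
]
print(host_range(hosts))
```

Under this assumed rule, a cluster whose hosts all belong to one genus but to different species is labelled genus-specific, mirroring the ‘species-’, ‘genus-’ and ‘family-’ subgroups named in the abstract.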
Affiliation(s)
- Na L Gao
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology (HUST), 430074 Wuhan, Hubei, China.,Institute for Computer Science and Cluster of Excellence on Plant Sciences CEPLAS, Heinrich Heine University, 40225 Düsseldorf, Germany
| | - Chengwei Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS), No.7 Beitucheng West Road, Chaoyang District, 100029 Beijing, PR China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhanbing Zhang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology (HUST), 430074 Wuhan, Hubei, China
| | - Songnian Hu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS), No.7 Beitucheng West Road, Chaoyang District, 100029 Beijing, PR China
| | - Martin J Lercher
- Institute for Computer Science and Cluster of Excellence on Plant Sciences CEPLAS, Heinrich Heine University, 40225 Düsseldorf, Germany
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence (ISTBI), Fudan University, Office 2304, East Main Building of Guanghua Towers, 220 Handan Road, Shanghai 200433, China
| | - Peer Bork
- European molecular biology laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany.,Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69120 Heidelberg, Germany.,Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Straße 10, 13125 Berlin, Germany.,Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
| | - Zhi Liu
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology (HUST), 430074 Wuhan, Hubei, China
| | - Wei-Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology (HUST), 430074 Wuhan, Hubei, China
|
20
|
Klemetsen T, Raknes IA, Fu J, Agafonov A, Balasundaram SV, Tartari G, Robertsen E, Willassen NP. The MAR databases: development and implementation of databases specific for marine metagenomics. Nucleic Acids Res 2019; 46:D692-D699. [PMID: 29106641 PMCID: PMC5753341 DOI: 10.1093/nar/gkx1036] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 08/14/2017] [Accepted: 10/18/2017] [Indexed: 12/03/2022]
Abstract
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, which serves as a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/.
Affiliation(s)
- Terje Klemetsen
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Inge A Raknes
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Juan Fu
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Alexander Agafonov
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Sudhagar V Balasundaram
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Giacomo Tartari
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway.,Department of Information Technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Espen Robertsen
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
| | - Nils P Willassen
- Centre for Bioinformatics, Faculty of science and technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway
|
21
|
Santamaria M, Fosso B, Licciulli F, Balech B, Larini I, Grillo G, De Caro G, Liuni S, Pesole G. ITSoneDB: a comprehensive collection of eukaryotic ribosomal RNA Internal Transcribed Spacer 1 (ITS1) sequences. Nucleic Acids Res 2019; 46:D127-D132. [PMID: 29036529 PMCID: PMC5753230 DOI: 10.1093/nar/gkx855] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/08/2017] [Accepted: 09/18/2017] [Indexed: 01/21/2023]
Abstract
A holistic understanding of environmental communities is the new challenge of metagenomics. Accordingly, the amplicon-based or metabarcoding approach, largely applied to investigate bacterial microbiomes, is moving to the eukaryotic world too. Indeed, the analysis of metabarcoding data may provide a comprehensive assessment of both bacterial and eukaryotic composition in a variety of environments, including the human body. In this respect, whereas hypervariable regions of the 16S rRNA are the de facto standard barcode for bacteria, the Internal Transcribed Spacer 1 (ITS1) of the ribosomal RNA gene cluster has shown a high potential in discriminating eukaryotes at deep taxonomic levels. As metabarcoding data analysis relies on the availability of a well-curated barcode reference resource, a comprehensive collection of ITS1 sequences supplied with robust taxonomies is highly needed. To address this issue, we created ITSoneDB (available at http://itsonedb.cloud.ba.infn.it/), which in its current version hosts 985 240 ITS1 sequences spanning over 134 000 eukaryotic species. Each ITS1 is mapped on the NCBI reference taxonomy with its start and end positions precisely annotated. ITSoneDB has been developed in accordance with the FAIR guidelines by enabling users to query and download its content through a simple web interface and access relevant metadata by cross-linking to the European Nucleotide Archive.
Affiliation(s)
- Monica Santamaria
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Bruno Fosso
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Flavio Licciulli
- Institute of Biomedical Technologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Bachir Balech
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Ilaria Larini
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
| | - Giorgio Grillo
- Institute of Biomedical Technologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Giorgio De Caro
- Institute of Biomedical Technologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Sabino Liuni
- Institute of Biomedical Technologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Consiglio Nazionale delle Ricerche, Bari 70126, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
|
22
|
de Sousa AGG, Tomasino MP, Duarte P, Fernández-Méndez M, Assmy P, Ribeiro H, Surkont J, Leite RB, Pereira-Leal JB, Torgo L, Magalhães C. Diversity and Composition of Pelagic Prokaryotic and Protist Communities in a Thin Arctic Sea-Ice Regime. Microb Ecol 2019; 78:388-408. [PMID: 30623212 DOI: 10.1007/s00248-018-01314-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/26/2018] [Accepted: 12/25/2018] [Indexed: 06/09/2023]
Abstract
One of the most prominent manifestations of climate change is the changing Arctic sea-ice regime with a reduction in the summer sea-ice extent and a shift from thicker, perennial multiyear ice towards thinner, first-year ice. These changes in the physical environment are likely to impact microbial communities, a key component of Arctic marine food webs and biogeochemical cycles. During the Norwegian young sea ICE expedition (N-ICE2015) north of Svalbard, seawater samples were collected at the surface (5 m), subsurface (20 or 50 m), and mesopelagic (250 m) depths on 9 March, 27 April, and 16 June 2015. In addition, several physical and biogeochemical data were recorded to contextualize the collected microbial communities. Through the massively parallel sequencing of the small subunit ribosomal RNA amplicon and metagenomic data, this work allows studying the Arctic's microbial community structure during the late winter to early summer transition. Results showed that, at compositional level, Alpha- (30.7%) and Gammaproteobacteria (28.6%) are the most frequent taxa across the prokaryotic N-ICE2015 collection, and also the most phylogenetically diverse. Winter to early summer trends were quite evident since there was a high relative abundance of thaumarchaeotes in the under-ice water column in late winter while this group was nearly absent during early summer. Moreover, the emergence of Flavobacteria and the SAR92 clade in early summer might be associated with the degradation of a spring bloom of Phaeocystis. High relative abundance of hydrocarbonoclastic bacteria, particularly Alcanivorax (54.3%) and Marinobacter (6.3%), was also found. Richness showed different patterns along the depth gradient for prokaryotic (highest at mesopelagic depth) and protistan communities (higher at subsurface depths). The microbial N-ICE2015 collection analyzed in the present study provides comprehensive new knowledge about the pelagic microbiota below drifting Arctic sea-ice. 
The higher microbial diversity found in late winter/early spring communities reinforces the need to continue with further studies to properly characterize the winter microbial communities under the pack-ice.
Affiliation(s)
- António Gaspar G de Sousa
- CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208, Porto, Portugal.
- Department of Biology, Faculty of Sciences, University of Porto, Rua Campo Alegre s/n, 4169-007, Porto, Portugal.
| | - Maria Paola Tomasino
- CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208, Porto, Portugal
| | - Pedro Duarte
- Norwegian Polar Institute, Fram Centre, N-9296, Tromsø, Norway
| | | | - Philipp Assmy
- Norwegian Polar Institute, Fram Centre, N-9296, Tromsø, Norway
| | - Hugo Ribeiro
- CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208, Porto, Portugal
| | - Jaroslaw Surkont
- Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156, Oeiras, Portugal
| | - Ricardo B Leite
- Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156, Oeiras, Portugal
| | - José B Pereira-Leal
- Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156, Oeiras, Portugal
| | - Luís Torgo
- LIAAD - Laboratory of Artificial Intelligence and Decision Support, INESC Tec, Porto, Portugal
- Faculty of Computer Science, Dalhousie University, Halifax, Canada, USA
| | - Catarina Magalhães
- CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208, Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua Campo Alegre s/n, 4169-007, Porto, Portugal
|
23
|
Graells T, Ishak H, Larsson M, Guy L. The all-intracellular order Legionellales is unexpectedly diverse, globally distributed and lowly abundant. FEMS Microbiol Ecol 2019; 94:5110392. [PMID: 30973601 PMCID: PMC6167759 DOI: 10.1093/femsec/fiy185] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/20/2018] [Accepted: 09/08/2018] [Indexed: 12/14/2022]
Abstract
Legionellales is an order of the Gammaproteobacteria, only composed of host-adapted, intracellular bacteria, including the accidental human pathogens Legionella pneumophila and Coxiella burnetii. Although the diversity in terms of lifestyle is large across the order, only a few genera have been sequenced, owing to the difficulty to grow intracellular bacteria in pure culture. In particular, we know little about their global distribution and abundance. Here, we analyze 16/18S rDNA amplicons both from tens of thousands of published studies and from two separate sampling campaigns in and around ponds and in a silver mine. We demonstrate that the diversity of the order is much larger than previously thought, with over 450 uncultured genera. We show that Legionellales are found in about half of the samples from freshwater, soil and marine environments and quasi-ubiquitous in man-made environments. Their abundance is low, typically 0.1%, with few samples up to 1%. Most Legionellales OTUs are globally distributed, while many do not belong to a previously identified species. This study sheds a new light on the ubiquity and diversity of one major group of host-adapted bacteria. It also emphasizes the need to use metagenomics to better understand the role of host-adapted bacteria in all environments.
Affiliation(s)
- Tiscar Graells
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden.,Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Edifici C, Carrer de la Vall Moronta, 08193 Bellaterra, Spain
| | - Helena Ishak
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
| | - Madeleine Larsson
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
| | - Lionel Guy
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Box 582, 75123 Uppsala, Sweden
|
24
|
De Tender C, Mesuere B, Van der Jeugt F, Haegeman A, Ruttink T, Vandecasteele B, Dawyndt P, Debode J, Kuramae EE. Peat substrate amended with chitin modulates the N-cycle, siderophore and chitinase responses in the lettuce rhizobiome. Sci Rep 2019; 9:9890. [PMID: 31289280 PMCID: PMC6617458 DOI: 10.1038/s41598-019-46106-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/31/2018] [Accepted: 06/19/2019] [Indexed: 11/09/2022]
Abstract
Chitin is a valuable peat substrate amendment: it increases lettuce growth and reduces the survival of the zoonotic pathogen Salmonella enterica on lettuce leaves. The production of chitin-catabolic enzymes (chitinases) plays a crucial role and is mediated through the microbial community. A higher abundance of plant-growth-promoting microorganisms and of genera involved in N and chitin metabolism is present in a chitin-enriched substrate. In this study, we hypothesize that chitin addition to peat substrate stimulates microbial chitinase production. The degradation of chitin leads to nutrient release and the production of small chitin oligomers that are related to plant growth promotion and activation of the plant's defense response. First, a shotgun metagenomics approach was used to decipher the potential rhizosphere microbial functions; then the nutritional content of the peat substrate was measured. Our results show that chitin addition increases chitin-catabolic enzymes and bacterial ammonium-oxidizing and siderophore genes. Lettuce growth promotion can be explained by a cascade degradation of chitin to N-acetylglucosamine and eventually ammonium. The increased occurrence of ammonium-oxidizing bacteria, Nitrosospira, and amoA genes results in an elevated concentration of plant-available nitrate. In addition, the increase in chitinase and siderophore genes may have stimulated the plant's systemic resistance.
Affiliation(s)
- C De Tender
- Flanders Research Institute for Agriculture, Fisheries and Food, Plant Sciences Unit, Burgemeester Van Gansberghelaan 92, 9820, Merelbeke, Belgium.
- Ghent University, Department of Applied Mathematics, Computer Science and Statistics, Krijgslaan 281 S9, 9000, Ghent, Belgium.
| | - B Mesuere
- Ghent University, Department of Applied Mathematics, Computer Science and Statistics, Krijgslaan 281 S9, 9000, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, VIB, B-9000, Ghent, Belgium
| | - F Van der Jeugt
- Ghent University, Department of Applied Mathematics, Computer Science and Statistics, Krijgslaan 281 S9, 9000, Ghent, Belgium
| | - A Haegeman
- Flanders Research Institute for Agriculture, Fisheries and Food, Plant Sciences Unit, Burgemeester Van Gansberghelaan 92, 9820, Merelbeke, Belgium
| | - T Ruttink
- Flanders Research Institute for Agriculture, Fisheries and Food, Plant Sciences Unit, Burgemeester Van Gansberghelaan 92, 9820, Merelbeke, Belgium
| | - B Vandecasteele
- Flanders Research Institute for Agriculture, Fisheries and Food, Plant Sciences Unit, Burgemeester Van Gansberghelaan 92, 9820, Merelbeke, Belgium
| | - P Dawyndt
- Ghent University, Department of Applied Mathematics, Computer Science and Statistics, Krijgslaan 281 S9, 9000, Ghent, Belgium
| | - J Debode
- Flanders Research Institute for Agriculture, Fisheries and Food, Plant Sciences Unit, Burgemeester Van Gansberghelaan 92, 9820, Merelbeke, Belgium
| | - E E Kuramae
- Netherlands Institute of Ecology, Department of Microbial Ecology, Droevendaalsesteeg 10, 6708 PB, Wageningen, The Netherlands
|
25
|
Park YM, Squizzato S, Buso N, Gur T, Lopez R. The EBI search engine: EBI search as a service-making biological data accessible for all. Nucleic Acids Res 2019; 45:W545-W549. [PMID: 28472374 PMCID: PMC5570174 DOI: 10.1093/nar/gkx359] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 01/27/2017] [Accepted: 04/20/2017] [Indexed: 12/02/2022]
Abstract
We present an update of the EBI Search engine, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. The interconnectivity that exists between data resources at EMBL–EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types that include nucleotide and protein sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, as well as the life science literature. EBI Search provides a powerful RESTful API that enables its integration into third-party portals, thus providing ‘Search as a Service’ capabilities, which are the main topic of this article.
Affiliation(s)
- Young M Park
- European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK
| | - Silvano Squizzato
- European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK
| | - Nicola Buso
- European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK
| | - Tamer Gur
- European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK
| | - Rodrigo Lopez
- European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK
|
26
|
Dhariwal A, Chong J, Habib S, King IL, Agellon LB, Xia J. MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucleic Acids Res 2019; 45:W180-W188. [PMID: 28449106 PMCID: PMC5570177 DOI: 10.1093/nar/gkx295] [Citation(s) in RCA: 1041] [Impact Index Per Article: 208.2] [Received: 03/05/2017] [Accepted: 04/11/2017] [Indexed: 12/11/2022]
Abstract
The widespread application of next-generation sequencing technologies has revolutionized microbiome research by enabling high-throughput profiling of the genetic contents of microbial communities. How to analyze the resulting large complex datasets remains a key challenge in current microbiome studies. Over the past decade, powerful computational pipelines and robust protocols have been established to enable efficient raw data processing and annotation. The focus has shifted toward downstream statistical analysis and functional interpretation. Here, we introduce MicrobiomeAnalyst, a user-friendly tool that integrates recent progress in statistics and visualization techniques, coupled with novel knowledge bases, to enable comprehensive analysis of common data outputs produced from microbiome studies. MicrobiomeAnalyst contains four modules: the Marker Data Profiling module offers various options for community profiling, comparative analysis and functional prediction based on 16S rRNA marker gene data; the Shotgun Data Profiling module supports exploratory data analysis, functional profiling and metabolic network visualization of shotgun metagenomics or metatranscriptomics data; the Taxon Set Enrichment Analysis module helps interpret taxonomic signatures via enrichment analysis against >300 taxon sets manually curated from literature and public databases; finally, the Projection with Public Data module allows users to visually explore their data alongside public reference data for pattern discovery and biological insights. MicrobiomeAnalyst is freely available at http://www.microbiomeanalyst.ca.
Affiliation(s)
- Achal Dhariwal
- Department of Animal Science, McGill University, Quebec, Canada
| | - Jasmine Chong
- Institute of Parasitology, McGill University, Quebec, Canada
| | - Salam Habib
- School of Dietetics and Human Nutrition, McGill University, Quebec, Canada
| | - Irah L King
- Department of Microbiology and Immunology, McGill University, Quebec, Canada.,Microbiome and Disease Tolerance Center (MDTC), McGill University, Quebec, Canada
| | - Luis B Agellon
- School of Dietetics and Human Nutrition, McGill University, Quebec, Canada
| | - Jianguo Xia
- Department of Animal Science, McGill University, Quebec, Canada.,Institute of Parasitology, McGill University, Quebec, Canada.,Department of Microbiology and Immunology, McGill University, Quebec, Canada.,Microbiome and Disease Tolerance Center (MDTC), McGill University, Quebec, Canada
|
27
|
Pinnell LJ, Turner JW. Shotgun Metagenomics Reveals the Benthic Microbial Community Response to Plastic and Bioplastic in a Coastal Marine Environment. Front Microbiol 2019; 10:1252. [PMID: 31231339 PMCID: PMC6566015 DOI: 10.3389/fmicb.2019.01252] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 02/14/2019] [Accepted: 05/20/2019] [Indexed: 11/23/2022]
Abstract
Plastic is incredibly abundant in marine environments but little is known about its effects on benthic microbiota and biogeochemical cycling. This study reports the shotgun metagenomic sequencing of biofilms fouling plastic and bioplastic microcosms staged at the sediment–water interface of a coastal lagoon. Community composition analysis revealed that plastic biofilms were indistinguishable in comparison to a ceramic biofilm control. By contrast, bioplastic biofilms were distinct and dominated by sulfate-reducing microorganisms (SRM). Analysis of bioplastic gene pools revealed the enrichment of esterases, depolymerases, adenylyl sulfate reductases (aprBA), and dissimilatory sulfite reductases (dsrAB). The nearly 20-fold enrichment of a phylogenetically diverse polyhydroxybutyrate (PHB) depolymerase suggests this gene was distributed across a mixed microbial assemblage. The metagenomic reconstruction of genomes identified novel species of Desulfovibrio, Desulfobacteraceae, and Desulfobulbaceae among the abundant SRM, and these genomes contained genes integral to both bioplastic degradation and sulfate reduction. Findings indicate that bioplastic promoted a rapid and significant shift in benthic microbial diversity and gene pools, selecting for microbes that participate in bioplastic degradation and sulfate reduction. If plastic pollution is traded for bioplastic pollution and sedimentary inputs are large, the microbial response could unintentionally affect benthic biogeochemical activities through the stimulation of sulfate reducers.
Affiliation(s)
- Lee J Pinnell
- Department of Life Sciences, Texas A&M University - Corpus Christi, Corpus Christi, TX, United States
| | - Jeffrey W Turner
- Department of Life Sciences, Texas A&M University - Corpus Christi, Corpus Christi, TX, United States
|
28
|
Borsetto C, Amos GCA, da Rocha UN, Mitchell AL, Finn RD, Laidi RF, Vallin C, Pearce DA, Newsham KK, Wellington EMH. Microbial community drivers of PK/NRP gene diversity in selected global soils. Microbiome 2019; 7:78. [PMID: 31118083 PMCID: PMC6532259 DOI: 10.1186/s40168-019-0692-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/26/2018] [Accepted: 05/08/2019] [Indexed: 06/09/2023]
Abstract
BACKGROUND The emergence of antibiotic-resistant pathogens has created an urgent need for novel antimicrobial treatments. Advances in next-generation sequencing have opened new frontiers for discovery programmes for natural products allowing the exploitation of a larger fraction of the microbial community. Polyketide (PK) and non-ribosomal peptide (NRP) natural products have been reported to be related to compounds with antimicrobial and anticancer activities. We report here a new culture-independent approach to explore bacterial biosynthetic diversity and determine bacterial phyla in the microbial community associated with PK and NRP diversity in selected soils. RESULTS Through amplicon sequencing, we explored the microbial diversity (16S rRNA gene) of 13 soils from Antarctica, Africa, Europe and a Caribbean island and correlated this with the amplicon diversity of the adenylation (A) and ketosynthase (KS) domains within functional genes coding for non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), which are involved in the production of NRP and PK, respectively. Mantel and Procrustes correlation analyses with microbial taxonomic data identified not only the well-studied phyla Actinobacteria and Proteobacteria, but also, interestingly, the less biotechnologically exploited phyla Verrucomicrobia and Bacteroidetes, as potential sources harbouring diverse A and KS domains. Some soils, notably that from Antarctica, provided evidence of endemic diversity, whilst others, such as those from Europe, clustered together. In particular, the majority of the domain reads from Antarctica remained unmatched to known sequences, suggesting they could encode enzymes for potentially novel PK and NRP.
CONCLUSIONS The approach presented here highlights potential sources of metabolic novelty in the environment which will be a useful precursor to metagenomic biosynthetic gene cluster mining for PKs and NRPs which could provide leads for new antimicrobial metabolites.
Affiliation(s)
- Chiara Borsetto
- School of Life Sciences, University of Warwick, Coventry, UK
| | - Gregory C. A. Amos
- School of Life Sciences, University of Warwick, Coventry, UK
- Present addresses: G.C.A.A National Institute for Biological Standards and Control (NIBSC), Potters Bar, UK
| | - Ulisses Nunes da Rocha
- Department of Environmental Microbiology, Helmholtz Centre for Environmental Research—UFZ, Leipzig, Germany
| | - Alex L. Mitchell
- EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Robert D. Finn
- EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | | | - David A. Pearce
- Applied Sciences, Faculty of Health and Life Sciences, Northumbria University at Newcastle, Ellison Building, Northumberland Road, Newcastle, NE1 8ST UK
- Natural Environment Research Council, British Antarctic Survey, Cambridge, UK
| | - Kevin K. Newsham
- Natural Environment Research Council, British Antarctic Survey, Cambridge, UK
|
29
|
Karppinen EM, Mamet SD, Stewart KJ, Siciliano SD. The Charosphere Promotes Mineralization of 13C-Phenanthrene by Psychrotrophic Microorganisms in Greenland Soils. Journal of Environmental Quality 2019; 48:559-567. [PMID: 31180417 DOI: 10.2134/jeq2018.10.0370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
When soil is frozen, biochar promotes petroleum hydrocarbon (PHC) degradation, yet we still do not understand why. To investigate microbial biodegradation activity under frozen conditions, we placed 60-μm mesh bags containing 6% (v/v) biochar created from fishmeal, bonemeal, bone chip, or wood into PHC-contaminated soil, which was then frozen to -5°C. This created three soil niches: biochar particles, the charosphere (biochar-contiguous soil), and bulk soil outside of the bags. After 90 d, 13C-phenanthrene mineralization reached 55% in bonemeal biochar and 84% in bone chip biochar charosphere soil, compared with only 43% in bulk soil and 13% in bone chip biochar particles. Soil pH remained near neutral in bone chip and bonemeal biochar treatments, unlike wood biochar, which increased alkalinity and likely made phosphate unavailable for microorganisms. Generally, charosphere soil had higher aromatic degradative gene abundances than bulk soil, but gene abundance was not directly linked to 13C-phenanthrene mineralization. In bone chip biochar-amended soils, phosphate successfully predicted microbial community composition, and abundances of and increased in charosphere soil. Biochar effects on charosphere soil were dependent on feedstock material and suggest that optimizing the charosphere in bone-derived biochars may increase remediation success in northern regions.
|
30
|
Perz AI, Giles CB, Brown CA, Porter H, Roopnarinesingh X, Wren JD. MNEMONIC: MetageNomic Experiment Mining to create an OTU Network of Inhabitant Correlations. BMC Bioinformatics 2019; 20:96. [PMID: 30871469 PMCID: PMC6419333 DOI: 10.1186/s12859-019-2623-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/25/2022]
Abstract
Background: The number of publicly available metagenomic experiments in various environments has been rapidly growing, empowering the potential to identify similar shifts in species abundance between different experiments. This could be a potentially powerful way to interpret new experiments, by identifying common themes and causes behind changes in species abundance. Results: We propose a novel framework for comparing microbial shifts between conditions. Using data from one of the largest human metagenome projects to date, the American Gut Project (AGP), we obtain differential abundance vectors for microbes using experimental condition information provided with the AGP metadata, such as patient age, dietary habits, or health status. We show it can be used to identify similar and opposing shifts in microbial species, and infer putative interactions between microbes. Our results show that groups of shifts with similar effects on microbiome can be identified and that similar dietary interventions display similar microbial abundance shifts. Conclusions: Without comparison to prior data, it is difficult for experimentalists to know if their observed changes in species abundance have been observed by others, both in their conditions and in others they would never consider comparable. Yet, this can be a very important contextual factor in interpreting the significance of a shift. We’ve proposed and tested an algorithmic solution to this problem, which also allows for comparing the metagenomic signature shifts between conditions in the existing body of data.
Affiliation(s)
- Aleksandra I Perz
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.
| | - Cory B Giles
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.,Department of Geriatric Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
| | - Chase A Brown
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.,Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
| | - Hunter Porter
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.,Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
| | - Xiavan Roopnarinesingh
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.,Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
| | - Jonathan D Wren
- Arthritis and Clinical Immunology Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104-5005, USA.,Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.,Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.,Department of Geriatric Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
|
31
|
Online Interactive Microbial Classification and Geospatial Distributional Analysis Using BioAtlas. Methods Mol Biol 2019; 1807:21-35. [PMID: 30030801 DOI: 10.1007/978-1-4939-8561-6_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/06/2023]
Abstract
In recent decades, the accumulation of data on 16S ribosomal RNA genes has yielded free and public databases such as SILVA, GreenGenes, The Ribosomal Database Project, and IMG, handling massive amounts of raw data and meta information. The 16S rRNA gene contains hypervariable regions with great classification power. As a result, numerous classification tools have emerged, including state-of-the-art tools such as Mothur, Qiime, and the 16S classifier. However, there is a gap between the sequence databases, the taxonomy profiling tools and available meta information such as geo/body-location information. Here, we present BioAtlas, an interactive web tool for searching, exploring, and analyzing prokaryotic distributions by integrating various metagenomics database resources. In the following section we show how to use BioAtlas to (1) search and explore prokaryote occurrences across the geospatial map of the world, (2) investigate and hunt for occurrences across generic user-generated surface-specific maps, with an example map of a human female, with data from Bouslimani et al., and (3) classify a user-given sequence dataset through our online platform for visual exploration of the spatial abundances of the identified microbes.
|
32
|
Methods in Metagenomics and Environmental Biotechnology. Nanoscience and Biotechnology for Environmental Applications 2019. [DOI: 10.1007/978-3-319-97922-9_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
|
33
|
Maarala AI, Bzhalava Z, Dillner J, Heljanko K, Bzhalava D. ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads. Bioinformatics 2018; 34:928-935. [PMID: 29106455 DOI: 10.1093/bioinformatics/btx702] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/09/2017] [Accepted: 11/01/2017] [Indexed: 11/13/2022]
Abstract
Motivation: Next Generation Sequencing (NGS) technology enables identification of microbial genomes from massive amounts of human microbiomes more rapidly and cheaply than ever before. However, the traditional sequential genome analysis algorithms, tools, and platforms are inefficient for performing large-scale metagenomic studies on ever-growing sample data volumes. Currently, there is an urgent need for scalable analysis pipelines that harness the full power of parallel computation in computing clusters and in cloud computing environments. We propose ViraPipe, a scalable metagenome analysis pipeline that is able to analyze thousands of human microbiomes in parallel in tolerable time. The pipeline is tuned for analyzing viral metagenomes, and the software is applicable to other metagenomic analyses as well. ViraPipe integrates the parallel BWA-MEM read aligner, the MegaHit de novo assembler, and the BLAST and HMMER3 sequence search tools. We show the scalability of ViraPipe by running experiments on mining virus-related genomes from NGS datasets in a distributed Spark computing cluster. Results: ViraPipe analyses 768 human samples in 210 minutes on a Spark computing cluster comprising 23 nodes and 1288 cores in total. The speedup of ViraPipe executed on 23 nodes was 11x compared to the sequential analysis pipeline executed on a single node. The whole process includes parallel decompression, read interleaving, BWA-MEM read alignment, filtering and normalizing of non-human reads, de novo contig assembly, and searching of sequences with the BLAST and HMMER3 tools. Contact: ilari.maarala@aalto.fi. Availability and implementation: https://github.com/NGSeq/ViraPipe.
Affiliation(s)
- Altti Ilari Maarala
- Department of Computer Science, Aalto University, Espoo, Finland.,Helsinki Institute for Information Technology HIIT, Espoo, Finland
| | - Zurab Bzhalava
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Joakim Dillner
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Keijo Heljanko
- Department of Computer Science, Aalto University, Espoo, Finland.,Helsinki Institute for Information Technology HIIT, Espoo, Finland
| | - Davit Bzhalava
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
|
34
|
Fitch A, Orland C, Willer D, Emilson EJS, Tanentzap AJ. Feasting on terrestrial organic matter: Dining in a dark lake changes microbial decomposition. Global Change Biology 2018; 24:5110-5122. [PMID: 29998600 PMCID: PMC6220883 DOI: 10.1111/gcb.14391] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/26/2018] [Revised: 05/30/2018] [Accepted: 06/22/2018] [Indexed: 06/08/2023]
Abstract
Boreal lakes are major components of the global carbon cycle, partly because of sediment-bound heterotrophic microorganisms that decompose within-lake and terrestrially derived organic matter (t-OM). The ability of sediment bacteria to break down and alter t-OM may depend on environmental characteristics and community composition. However, the connection between these two potential drivers of decomposition is poorly understood. We tested how bacterial activity changed along experimental gradients in the quality and quantity of t-OM inputs into littoral sediments of two small boreal lakes, a dark and a clear lake, and measured the abundance of operational taxonomic units and functional genes to identify mechanisms underlying bacterial responses. We found that bacterial production (BP) decreased across lakes with aromatic dissolved organic matter (DOM) in sediment pore water, but the process underlying this pattern differed between lakes. Bacteria in the dark lake invested in the energetically costly production of extracellular enzymes as aromatic DOM increased in availability in the sediments. By contrast, bacteria in the clear lake may have lacked the nutrients and/or genetic potential to degrade aromatic DOM and instead mineralized photo-degraded OM into CO2. The two lakes differed in community composition, with concentrations of dissolved organic carbon and pH differentiating microbial assemblages. Furthermore, functional genes relating to t-OM degradation were relatively more abundant in the dark lake. Our results suggest that future changes in t-OM inputs to lake sediments will have different effects on carbon cycling depending on the potential for photo-degradation of OM and composition of resident bacterial communities.
Affiliation(s)
- Amelia Fitch
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Chloe Orland
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - David Willer
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Erik J. S. Emilson
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Natural Resources Canada, Great Lakes Forestry Centre, Sault Ste. Marie, Ontario
|
35
|
Ugarte A, Vicedomini R, Bernardes J, Carbone A. A multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling. Microbiome 2018; 6:149. [PMID: 30153857 PMCID: PMC6114274 DOI: 10.1186/s40168-018-0532-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/27/2018] [Accepted: 08/13/2018] [Indexed: 05/23/2023]
Abstract
BACKGROUND Biochemical and regulatory pathways have until recently been thought and modelled within one cell type, one organism and one species. This vision is being dramatically changed by the advent of whole microbiome sequencing studies, revealing the role of symbiotic microbial populations in fundamental biochemical functions. The new landscape we face requires the reconstruction of biochemical and regulatory pathways at the community level in a given environment. In order to understand how environmental factors affect the genetic material and the dynamics of the expression from one environment to another, we want to evaluate the quantity of gene protein sequences or transcripts associated to a given pathway by precisely estimating the abundance of protein domains, their weak presence or absence in environmental samples. RESULTS MetaCLADE is a novel profile-based domain annotation pipeline based on a multi-source domain annotation strategy. It applies directly to reads and improves identification of the catalog of functions in microbiomes. MetaCLADE is applied to simulated data and to more than ten metagenomic and metatranscriptomic datasets from different environments where it outperforms InterProScan in the number of annotated domains. It is compared to the state-of-the-art non-profile-based and profile-based methods, UProC and HMM-GRASPx, showing complementary predictions to UProC. A combination of MetaCLADE and UProC improves even further the functional annotation of environmental samples. CONCLUSIONS Learning about the functional activity of environmental microbial communities is a crucial step to understand microbial interactions and large-scale environmental impact. MetaCLADE has been explicitly designed for metagenomic and metatranscriptomic data and allows for the discovery of patterns in divergent sequences, thanks to its multi-source strategy. 
MetaCLADE highly improves current domain annotation methods and reaches a fine degree of accuracy in annotation of very different environments such as soil and marine ecosystems, ancient metagenomes and human tissues.
Affiliation(s)
- Ari Ugarte
- Sorbonne Université, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 4 Place Jussieu, Paris, 75005 France
| | - Riccardo Vicedomini
- Sorbonne Université, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 4 Place Jussieu, Paris, 75005 France
- Sorbonne Université, UPMC-Univ P6, CNRS, Institut des Sciences du Calcul et des Donnees, 4 Place Jussieu, Paris, 75005 France
| | - Juliana Bernardes
- Sorbonne Université, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 4 Place Jussieu, Paris, 75005 France
| | - Alessandra Carbone
- Sorbonne Université, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 4 Place Jussieu, Paris, 75005 France
- Institut Universitaire de France, Paris, 75005 France
| |
|
36
|
Patsch D, Vliet S, Marcantini LG, Johnson DR. Generality of associations between biological richness and the rates of metabolic processes across microbial communities. Environ Microbiol 2018; 20:4356-4368. [DOI: 10.1111/1462-2920.14352] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/23/2018] [Revised: 07/02/2018] [Accepted: 07/02/2018] [Indexed: 11/29/2022]
Affiliation(s)
- Deborah Patsch
- Department of Environmental Systems Science, ETH Zürich, 8092 Zürich, Switzerland
- Department of Environmental Microbiology, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - Simon Vliet
- Department of Environmental Systems Science, ETH Zürich, 8092 Zürich, Switzerland
- Department of Environmental Microbiology, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - Lorenzo Garbani Marcantini
- Department of Urban Water Management, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - David R. Johnson
- Department of Environmental Microbiology, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| |
|
37
|
Orland C, Emilson EJS, Basiliko N, Mykytczuk NCS, Gunn JM, Tanentzap AJ. Microbiome functioning depends on individual and interactive effects of the environment and community structure. ISME J 2018; 13:1-11. [PMID: 30042502 DOI: 10.1038/s41396-018-0230-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 02/28/2018] [Revised: 06/15/2018] [Accepted: 06/20/2018] [Indexed: 01/16/2023]
Abstract
How ecosystem functioning changes with microbial communities remains an open question in natural ecosystems. Both present-day environmental conditions and historical events, such as past differences in dispersal, can have a greater influence over ecosystem function than the diversity or abundance of both taxa and genes. Here, we estimated how individual and interactive effects of microbial community structure defined by diversity and abundance, present-day environmental conditions, and an indicator of historical legacies influenced ecosystem functioning in lake sediments. We studied sediments because they have strong gradients in all three of these ecosystem properties and deliver important functions worldwide. By characterizing bacterial community composition and functional traits at eight sites fed by discrete and contrasting catchments, we found that taxonomic diversity and the normalized abundance of oxidase-encoding genes explained as much variation in CO2 production as present-day gradients of pH and organic matter quantity and quality. Functional gene diversity was not linked to CO2 production rates. Surprisingly, the effects of taxonomic diversity and normalized oxidase abundance in the model predicting CO2 production were attributable to site-level differences in bacterial communities unrelated to the present-day environment, suggesting that colonization history rather than habitat-based filtering indirectly influenced ecosystem functioning. Our findings add to limited evidence that biodiversity and gene abundance explain patterns of microbiome functioning in nature. Yet we highlight, for one of the first times, how these relationships depend directly on present-day environmental conditions and indirectly on historical legacies, and so need to be contextualized with these other ecosystem properties.
Affiliation(s)
- Chloé Orland
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Downing Street, CB2 3EA, Cambridge, UK.
| | - Erik J S Emilson
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Downing Street, CB2 3EA, Cambridge, UK; Natural Resources Canada, Great Lakes Forestry Centre, 1219 Queen St. E., Sault Ste. Marie, ON, P6A 2E5, Canada
| | - Nathan Basiliko
- Vale Living with Lakes Centre, Laurentian University, 935 Ramsey Lake Road, Sudbury, ON, Canada, P3E 2C6
| | - Nadia C S Mykytczuk
- Vale Living with Lakes Centre, Laurentian University, 935 Ramsey Lake Road, Sudbury, ON, Canada, P3E 2C6
| | - John M Gunn
- Vale Living with Lakes Centre, Laurentian University, 935 Ramsey Lake Road, Sudbury, ON, Canada, P3E 2C6
| | - Andrew J Tanentzap
- Ecosystems and Global Change Group, Department of Plant Sciences, University of Cambridge, Downing Street, CB2 3EA, Cambridge, UK
| |
|
38
|
Li J, Tseng CS, Federico A, Ivankovic F, Huang YS, Ciccodicola A, Swanson MS, Yu P. SFMetaDB: a comprehensive annotation of mouse RNA splicing factor RNA-Seq datasets. Database (Oxford) 2018; 2017:4161772. [PMID: 29220461 PMCID: PMC5737203 DOI: 10.1093/database/bax071] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/13/2017] [Accepted: 08/15/2017] [Indexed: 02/07/2023]
Abstract
Although the number of RNA-Seq datasets deposited publicly has increased over the past few years, incomplete annotation of the associated metadata limits their potential use. Because of the importance of RNA splicing in diseases and biological processes, we constructed a database called SFMetaDB by curating datasets related to RNA splicing factors. Our effort focused on the RNA-Seq datasets in which splicing factors were knocked down, knocked out or overexpressed, leading to 75 datasets corresponding to 56 splicing factors. These datasets can be used in differential alternative splicing analysis for the identification of the potential targets of these splicing factors and other functional studies. Surprisingly, only ∼15% of all the splicing factors have been studied by loss- or gain-of-function experiments using RNA-Seq. In particular, splicing factors with domains from a few dominant Pfam domain families have not been studied. This suggests a significant gap that needs to be addressed to fully elucidate the splicing regulatory landscape. Indeed, there are already mouse models available for ∼20 of the unstudied splicing factors, and it can be a fruitful research direction to study these splicing factors in vitro and in vivo using RNA-Seq. Database URL: http://sfmetadb.ece.tamu.edu/
Affiliation(s)
- Jin Li
- Department of Electrical and Computer Engineering; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
| | - Ching-San Tseng
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Antonio Federico
- Institute of Genetics and Biophysics "Adriano Buzzati Traverso", CNR, Naples, Italy; Department of Science and Technology, University of Naples "Parthenope", Naples, Italy
| | - Franjo Ivankovic
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, College of Medicine, University of Florida, Gainesville, Florida, USA
| | - Yi-Shuian Huang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Alfredo Ciccodicola
- Institute of Genetics and Biophysics "Adriano Buzzati Traverso", CNR, Naples, Italy; Department of Science and Technology, University of Naples "Parthenope", Naples, Italy
| | - Maurice S Swanson
- Department of Molecular Genetics and Microbiology, Center for NeuroGenetics and the Genetics Institute, College of Medicine, University of Florida, Gainesville, Florida, USA
| | - Peng Yu
- Department of Electrical and Computer Engineering; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843, USA
| |
|
39
|
Ten Hoopen P, Finn RD, Bongo LA, Corre E, Fosso B, Meyer F, Mitchell A, Pelletier E, Pesole G, Santamaria M, Willassen NP, Cochrane G. The metagenomic data life-cycle: standards and best practices. Gigascience 2018. [PMID: 28637310 PMCID: PMC5737865 DOI: 10.1093/gigascience/gix047] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 11/24/2022]
Abstract
Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption is still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.
Affiliation(s)
- Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | - Erwan Corre
- CNRS-UPMC, FR 2424, Station Biologique, Roscoff 29680, France
| | - Bruno Fosso
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari 70126, Italy
| | - Folker Meyer
- Argonne National Laboratory, Argonne IL 60439, USA
| | - Alex Mitchell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Eric Pelletier
- Genoscope, CEA, Évry 91000, France; CNRS/UMR-8030, Évry 91000, France; Université Évry val d'Essonne, Évry 91000, France
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari 70126, Italy; Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari "A. Moro," Bari 70126, Italy
| | - Monica Santamaria
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari 70126, Italy
| | | | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
40
|
Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity. mSystems 2018; 3:mSystems00039-18. [PMID: 29657970 PMCID: PMC5893860 DOI: 10.1128/msystems.00039-18] [Citation(s) in RCA: 117] [Impact Index Per Article: 19.5] [Received: 03/22/2018] [Accepted: 03/23/2018] [Indexed: 01/15/2023]
Abstract
Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (Nd) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that Nd additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.
IMPORTANCE Estimation of the coverage provided by a metagenomic data set, i.e., what fraction of the microbial community was sampled by DNA sequencing, represents an essential first step of every culture-independent genomic study that aims to robustly assess the sequence diversity present in a sample. However, estimation of coverage remains elusive because of several technical limitations associated with high computational requirements and limiting statistical approaches to quantify diversity. Here we describe Nonpareil 3, a new bioinformatics algorithm that circumvents several of these limitations and thus can facilitate culture-independent studies in clinical or environmental settings, independent of the sequencing platform employed. In addition, we present a new metric of sequence diversity based on rarefied coverage and demonstrate its use in communities from diverse ecosystems.
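The abstract above uses the Shannon index on 16S rRNA gene profiles as the traditional alpha-diversity baseline. As a minimal sketch of that baseline only (not of Nonpareil or its Nd metric, which the abstract does not define), the standard formula H = -Σ p_i ln p_i can be computed from per-taxon counts as follows. The function name and the example counts are hypothetical and chosen purely for illustration.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` is a sequence of non-negative read counts, one per taxon
    (an assumed input layout, not one taken from the paper).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:  # taxa with zero reads contribute nothing to the sum
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical 16S read counts for two communities with four taxa each.
even_community = [25, 25, 25, 25]    # perfectly even: H equals ln(4)
skewed_community = [97, 1, 1, 1]     # dominated by a single taxon: lower H

print(shannon_index(even_community))
print(shannon_index(skewed_community))
```

An even community reaches the maximum value ln(S) for S taxa, while a community dominated by one taxon scores close to zero, which is why the index is commonly read as a combined measure of richness and evenness.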
|
41
|
Neville BA, Forster SC, Lawley TD. Commensal Koch's postulates: establishing causation in human microbiota research. Curr Opin Microbiol 2018; 42:47-52. [DOI: 10.1016/j.mib.2017.10.001] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 09/13/2017] [Revised: 10/07/2017] [Accepted: 10/09/2017] [Indexed: 12/16/2022]
|
42
|
Hornung B, Martins Dos Santos VAP, Smidt H, Schaap PJ. Studying microbial functionality within the gut ecosystem by systems biology. Genes Nutr 2018; 13:5. [PMID: 29556373 PMCID: PMC5840735 DOI: 10.1186/s12263-018-0594-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/22/2017] [Accepted: 02/13/2018] [Indexed: 12/13/2022]
Abstract
Humans are not autonomous entities. We are all living in a complex environment, interacting not only with our peers, but as true holobionts; we are also very much in interaction with our coexisting microbial ecosystems living on and especially within us, in the intestine. Intestinal microorganisms, often collectively referred to as intestinal microbiota, contribute significantly to our daily energy uptake by breaking down complex carbohydrates into simple sugars, which are fermented to short-chain fatty acids and subsequently absorbed by human cells. They also have an impact on our immune system, by suppressing or enhancing the growth of malevolent and beneficial microbes. Our lifestyle can have a large influence on this ecosystem. What and how much we consume can tip the ecological balance in the intestine. A "western diet" containing mainly processed food will have a different effect on our health than a balanced diet fortified with pre- and probiotics. In recent years, new technologies have emerged, which made a more detailed understanding of microbial communities and ecosystems feasible. This includes progress in the sequencing of PCR-amplified phylogenetic marker genes as well as the collective microbial metagenome and metatranscriptome, allowing us to determine with an increasing level of detail, which microbial species are in the microbiota, understand what these microorganisms do and how they respond to changes in lifestyle and diet. These new technologies also include the use of synthetic and in vitro systems, which allow us to study the impact of substrates and addition of specific microbes to microbial communities at a high level of detail, and enable us to gather quantitative data for modelling purposes. Here, we will review the current state of microbiome research, summarizing the computational methodologies in this area and highlighting possible outcomes for personalized nutrition and medicine.
Affiliation(s)
- Bastian Hornung
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
| | - Vitor A P Martins Dos Santos
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
| | - Hauke Smidt
- Laboratory of Microbiology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
| | - Peter J Schaap
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Stippeneng 4, 6708 WE Wageningen, the Netherlands
| |
|
43
|
Kumaresan D, Stephenson J, Doxey AC, Bandukwala H, Brooks E, Hillebrand-Voiculescu A, Whiteley AS, Murrell JC. Aerobic proteobacterial methylotrophs in Movile Cave: genomic and metagenomic analyses. Microbiome 2018; 6:1. [PMID: 29291746 PMCID: PMC5748958 DOI: 10.1186/s40168-017-0383-2] [Citation(s) in RCA: 148] [Impact Index Per Article: 24.7] [Received: 05/19/2017] [Accepted: 12/14/2017] [Indexed: 05/14/2023]
Abstract
BACKGROUND Movile Cave (Mangalia, Romania) is a unique ecosystem where the food web is sustained by microbial primary production, analogous to deep-sea hydrothermal vents. Specifically, chemoautotrophic microbes deriving energy from the oxidation of hydrogen sulphide and methane form the basis of the food web. RESULTS Here, we report the isolation of the first methane-oxidizing bacterium from the Movile Cave ecosystem, Candidatus Methylomonas sp. LWB, a new species and representative of Movile Cave microbial mat samples. While previous research has suggested a prevalence of anoxic conditions in deeper lake water and sediment, using small-scale shotgun metagenome sequencing, we show that metabolic genes encoding enzymes for aerobic methylotrophy are prevalent in sediment metagenomes possibly indicating the presence of microoxic conditions. Moreover, this study also indicates that members within the family Gallionellaceae (Sideroxydans and Gallionella) were the dominant taxa within the sediment microbial community, thus suggesting a major role for microaerophilic iron-oxidising bacteria in nutrient cycling within the Movile Cave sediments. CONCLUSIONS In this study, based on phylogenetic and metabolic gene surveys of metagenome sequences, the possibility of aerobic microbial processes (i.e., methylotrophy and iron oxidation) within the sediment is indicated. We also highlight significant gaps in our knowledge on biogeochemical cycles within the Movile Cave ecosystem, and the need to further investigate potential feedback mechanisms between microbial communities in both lake sediment and lake water.
Affiliation(s)
- Deepak Kumaresan
- School of Environmental Sciences, University of East Anglia, Norwich, UK
- School of Biological Sciences and Institute for Global Food Security, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7BL UK
| | | | - Andrew C. Doxey
- Department of Biology, University of Waterloo, Waterloo, Canada
| | - Hina Bandukwala
- Department of Biology, University of Waterloo, Waterloo, Canada
| | - Elliot Brooks
- School of Environmental Sciences, University of East Anglia, Norwich, UK
| | | | - Andrew S. Whiteley
- UWA School of Agriculture and Environment, University of Western Australia, Perth, Australia
| | - J Colin Murrell
- School of Environmental Sciences, University of East Anglia, Norwich, UK
| |
|
44
|
Abstract
The diversity and sheer volume of omics data have brought biology and biomedicine, in both research and application, into a big data era, much as happened in wider society a decade earlier. This shift poses a new challenge: moving from horizontal data ensembles (e.g., similar types of data collected from different labs or companies) to vertical data ensembles (e.g., different types of data collected for a group of persons with matched information). Meeting it requires integrative analysis in biology and biomedicine, and calls for the urgent development of data integration methods to address the change from population-guided to individual-guided investigations. Data integration is an effective concept for solving complex problems or understanding complicated systems. Several benchmark studies have revealed the heterogeneity and trade-offs that exist in the analysis of omics data. Integrative analysis can combine and investigate many datasets in a cost-effective, reproducible way. Current approaches to integrating biological data have two modes: a "bottom-up integration" mode with follow-up manual integration, and a "top-down integration" mode with follow-up in silico integration. This paper first summarizes combinatory analysis approaches, to suggest candidate protocols for designing biological experiments for effective integrative genomics studies, and then surveys data fusion approaches, to give guidance on developing computational models for detecting biological significance; these approaches have also provided new data resources and analysis tools to support precision medicine based on big biomedical data. Finally, problems and future directions for the integrative analysis of omics big data are highlighted.
Affiliation(s)
- Xiang-Tian Yu
- Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| | - Tao Zeng
- Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| |
|
45
|
Noronha MF, Lacerda Júnior GV, Gilbert JA, de Oliveira VM. Taxonomic and functional patterns across soil microbial communities of global biomes. Sci Total Environ 2017; 609:1064-1074. [PMID: 28787780 DOI: 10.1016/j.scitotenv.2017.07.159] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 04/30/2017] [Revised: 07/17/2017] [Accepted: 07/18/2017] [Indexed: 05/24/2023]
Affiliation(s)
- Melline Fontes Noronha
- Microbial Resources Division, Multidisciplinary Center for Chemistry, Biology and Agriculture Research (CPQBA), Campinas University, Brazil; Institute of Biology, Campinas University, Brazil.
| | - Gileno Vieira Lacerda Júnior
- Microbial Resources Division, Multidisciplinary Center for Chemistry, Biology and Agriculture Research (CPQBA), Campinas University, Brazil; Institute of Biology, Campinas University, Brazil
| | - Jack A Gilbert
- The Microbiome Center, Department of Surgery, University of Chicago, Chicago, IL, USA; The Microbiome Center, Bioscience Division, Argonne National Laboratory, Lemont, IL, USA
| | - Valéria Maia de Oliveira
- Microbial Resources Division, Multidisciplinary Center for Chemistry, Biology and Agriculture Research (CPQBA), Campinas University, Brazil
| |
|
46
|
Lassalle F, Spagnoletti M, Fumagalli M, Shaw L, Dyble M, Walker C, Thomas MG, Bamberg Migliano A, Balloux F. Oral microbiomes from hunter-gatherers and traditional farmers reveal shifts in commensal balance and pathogen load linked to diet. Mol Ecol 2017; 27:182-195. [PMID: 29165844 DOI: 10.1111/mec.14435] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 07/17/2017] [Accepted: 11/06/2017] [Indexed: 01/22/2023]
Abstract
Maladaptation to modern diets has been implicated in several chronic disorders. Given the higher prevalence of disease such as dental caries and chronic gum diseases in industrialized societies, we sought to investigate the impact of different subsistence strategies on oral health and physiology, as documented by the oral microbiome. To control for confounding variables such as environment and host genetics, we sampled saliva from three pairs of populations of hunter-gatherers and traditional farmers living in close proximity in the Philippines. Deep shotgun sequencing of salivary DNA generated high-coverage microbiomes along with human genomes. Comparing these microbiomes with publicly available data from individuals living on a Western diet revealed that abundance ratios of core species were significantly correlated with subsistence strategy, with hunter-gatherers and Westerners occupying either end of a gradient of Neisseria against Haemophilus, and traditional farmers falling in between. Species found preferentially in hunter-gatherers included microbes often considered as oral pathogens, despite their hosts' apparent good oral health. Discriminant analysis of gene functions revealed vitamin B5 autotrophy and urease-mediated pH regulation as candidate adaptations of the microbiome to the hunter-gatherer and Western diets, respectively. These results suggest that major transitions in diet selected for different communities of commensals and likely played a role in the emergence of modern oral pathogens.
Affiliation(s)
- Florent Lassalle
- University College London, UCL Genetics Institute, London, UK; Department of Infectious Disease Epidemiology, Imperial College London, London, UK
| | | | | | - Liam Shaw
- University College London, UCL Genetics Institute, London, UK
| | - Mark Dyble
- Department of Anthropology, University College London, London, UK; Department of Zoology, University of Cambridge, Cambridge, UK
| | | | - Mark G Thomas
- University College London, UCL Genetics Institute, London, UK
| | | | | |
|
47
|
Karimi E, Ramos M, Gonçalves JMS, Xavier JR, Reis MP, Costa R. Comparative Metagenomics Reveals the Distinctive Adaptive Features of the Spongia officinalis Endosymbiotic Consortium. Front Microbiol 2017; 8:2499. [PMID: 29312205 PMCID: PMC5735121 DOI: 10.3389/fmicb.2017.02499] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 07/10/2017] [Accepted: 11/30/2017] [Indexed: 12/14/2022]
Abstract
Current knowledge of sponge microbiome functioning derives mostly from comparative analyses with bacterioplankton communities. We employed a metagenomics-centered approach to unveil the distinct features of the Spongia officinalis endosymbiotic consortium in the context of its two primary environmental vicinities. Microbial metagenomic DNA samples (n = 10) from sponges, seawater, and sediments were subjected to Illumina HiSeq sequencing (c. 15 million 100 bp reads per sample). Totals of 10,272 InterPro (IPR) predicted protein entries and 784 rRNA gene operational taxonomic units (OTUs, 97% cut-off) were uncovered from all metagenomes. Despite the large divergence in microbial community assembly between the surveyed biotopes, the S. officinalis symbiotic community shared slightly greater similarity (p < 0.05), in terms of both taxonomy and function, to sediment than to seawater communities. The vast majority of the dominant S. officinalis symbionts (i.e., OTUs), representing several, so-far uncultivable lineages in diverse bacterial phyla, displayed higher residual abundances in sediments than in seawater. CRISPR-Cas proteins and restriction endonucleases presented much higher frequencies (accompanied by lower viral abundances) in sponges than in the environment. However, several genomic features sharply enriched in the sponge specimens, including eukaryotic-like repeat motifs (ankyrins, tetratricopeptides, WD-40, and leucine-rich repeats), and genes encoding for plasmids, sulfatases, polyketide synthases, type IV secretion proteins, and terpene/terpenoid synthases presented, to varying degrees, higher frequencies in sediments than in seawater. In contrast, much higher abundances of motility and chemotaxis genes were found in sediments and seawater than in sponges. Higher cell and surface densities, sponge cell shedding and particle uptake, and putative chemical signaling processes favoring symbiont persistence in particulate matrices all may act as mechanisms underlying the observed degrees of taxonomic connectivity and functional convergence between sponges and sediments. The reduced frequency of motility and chemotaxis genes in the sponge microbiome reinforces the notion of a prevalent mutualistic mode of living inside the host. This study highlights the S. officinalis “endosymbiome” as a distinct consortium of uncultured prokaryotes displaying a likely “sit-and-wait” strategy to nutrient foraging coupled to sophisticated anti-viral defenses, unique natural product biosynthesis, nutrient utilization and detoxification capacities, and both microbe–microbe and host–microbe gene transfer amenability.
Affiliation(s)
- Elham Karimi
- Microbial Ecology and Evolution Research Group, Centre of Marine Sciences, University of Algarve, Faro, Portugal
| | - Miguel Ramos
- Microbial Ecology and Evolution Research Group, Centre of Marine Sciences, University of Algarve, Faro, Portugal
| | - Jorge M S Gonçalves
- Fisheries, Biodiversity and Conservation Research Group, Centre of Marine Sciences, University of Algarve, Faro, Portugal
| | - Joana R Xavier
- Department of Biology and K.G. Jebsen Centre for Deep Sea Research, University of Bergen, Bergen, Norway
| | - Margarida P Reis
- Faculty of Science and Technology, University of Algarve, Faro, Portugal
| | - Rodrigo Costa
- Microbial Ecology and Evolution Research Group, Centre of Marine Sciences, University of Algarve, Faro, Portugal; Institute for Bioengineering and Biosciences, Department of Bioengineering, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
| |
|
48
|
|
49
|
Sczyrba A, Hofmann P, Belmann P, Koslicki D, Janssen S, Dröge J, Gregor I, Majda S, Fiedler J, Dahms E, Bremges A, Fritz A, Garrido-Oter R, Jørgensen TS, Shapiro N, Blood PD, Gurevich A, Bai Y, Turaev D, DeMaere MZ, Chikhi R, Nagarajan N, Quince C, Meyer F, Balvočiūtė M, Hansen LH, Sørensen SJ, Chia BKH, Denis B, Froula JL, Wang Z, Egan R, Don Kang D, Cook JJ, Deltel C, Beckstette M, Lemaitre C, Peterlongo P, Rizk G, Lavenier D, Wu YW, Singer SW, Jain C, Strous M, Klingenberg H, Meinicke P, Barton MD, Lingner T, Lin HH, Liao YC, Silva GGZ, Cuevas DA, Edwards RA, Saha S, Piro VC, Renard BY, Pop M, Klenk HP, Göker M, Kyrpides NC, Woyke T, Vorholt JA, Schulze-Lefert P, Rubin EM, Darling AE, Rattei T, McHardy AC. Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software. Nat Methods 2017; 14:1063-1071. [PMID: 28967888 DOI: 10.1101/099127] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/29/2016] [Accepted: 08/25/2017] [Indexed: 05/25/2023]
Abstract
Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.
Affiliation(s)
- Alexander Sczyrba
- Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
| | - Peter Hofmann
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Peter Belmann
- Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - David Koslicki
- Mathematics Department, Oregon State University, Corvallis, Oregon, USA
| | - Stefan Janssen
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Department of Pediatrics, University of California, San Diego, California, USA
- Department of Computer Science and Engineering, University of California, San Diego, California, USA
| | - Johannes Dröge
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Ivan Gregor
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Stephan Majda
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
| | - Jessika Fiedler
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
| | - Eik Dahms
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Andreas Bremges
- Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- German Center for Infection Research (DZIF), partner site Hannover-Braunschweig, Braunschweig, Germany
| | - Adrian Fritz
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Ruben Garrido-Oter
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Cluster of Excellence on Plant Sciences (CEPLAS)
| | - Tue Sparholt Jørgensen
- Department of Environmental Science, Section of Environmental microbiology and Biotechnology, Aarhus University, Roskilde, Denmark
- Department of Microbiology, University of Copenhagen, Copenhagen, Denmark
- Department of Science and Environment, Roskilde University, Roskilde, Denmark
| | - Nicole Shapiro
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Philip D Blood
- Pittsburgh Supercomputing Center, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
| | - Alexey Gurevich
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | - Yang Bai
- Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Dmitrij Turaev
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
| | - Matthew Z DeMaere
- The ithree institute, University of Technology Sydney, Sydney, New South Wales, Australia
| | - Rayan Chikhi
- Department of Computer Science, Research Center in Computer Science (CRIStAL), Signal and Automatic Control of Lille, Lille, France
- National Centre of the Scientific Research (CNRS), Rennes, France
| | - Niranjan Nagarajan
- Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore
| | - Christopher Quince
- Department of Microbiology and Infection, Warwick Medical School, University of Warwick, Coventry, UK
| | - Fernando Meyer
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
| | - Monika Balvočiūtė
- Department of Computer Science, University of Tuebingen, Tuebingen, Germany
| | - Lars Hestbjerg Hansen
- Department of Environmental Science, Section of Environmental microbiology and Biotechnology, Aarhus University, Roskilde, Denmark
| | - Søren J Sørensen
- Department of Microbiology, University of Copenhagen, Copenhagen, Denmark
| | - Burton K H Chia
- Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore
| | - Bertrand Denis
- Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore
| | - Jeff L Froula
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Zhong Wang
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Robert Egan
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Dongwan Don Kang
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | | | - Charles Deltel
- GenScale-Bioinformatics Research Team, Inria Rennes-Bretagne Atlantique Research Centre, Rennes, France
- Institute of Research in Informatics and Random Systems (IRISA), Rennes, France
| | - Michael Beckstette
- Department of Molecular Infection Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Claire Lemaitre
- GenScale-Bioinformatics Research Team, Inria Rennes-Bretagne Atlantique Research Centre, Rennes, France
- Institute of Research in Informatics and Random Systems (IRISA), Rennes, France
| | - Pierre Peterlongo
- GenScale-Bioinformatics Research Team, Inria Rennes-Bretagne Atlantique Research Centre, Rennes, France
- Institute of Research in Informatics and Random Systems (IRISA), Rennes, France
| | - Guillaume Rizk
- Institute of Research in Informatics and Random Systems (IRISA), Rennes, France
- Algorizk-IT consulting and software systems, Paris, France
| | - Dominique Lavenier
- National Centre of the Scientific Research (CNRS), Rennes, France
- Institute of Research in Informatics and Random Systems (IRISA), Rennes, France
| | - Yu-Wei Wu
- Joint BioEnergy Institute, Emeryville, California, USA
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
| | - Steven W Singer
- Joint BioEnergy Institute, Emeryville, California, USA
- Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Chirag Jain
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Marc Strous
- Energy Engineering and Geomicrobiology, University of Calgary, Calgary, Alberta, Canada
| | - Heiner Klingenberg
- Department of Bioinformatics, Institute for Microbiology and Genetics, University of Goettingen, Goettingen, Germany
| | - Peter Meinicke
- Department of Bioinformatics, Institute for Microbiology and Genetics, University of Goettingen, Goettingen, Germany
| | - Michael D Barton
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | | | - Hsin-Hung Lin
- Institute of Population Health Sciences, National Health Research Institutes, Zhunan Town, Taiwan
| | - Yu-Chieh Liao
- Institute of Population Health Sciences, National Health Research Institutes, Zhunan Town, Taiwan
| | | | - Daniel A Cuevas
- Computational Science Research Center, San Diego State University, San Diego, California, USA
| | - Robert A Edwards
- Computational Science Research Center, San Diego State University, San Diego, California, USA
| | - Surya Saha
- Boyce Thompson Institute for Plant Research, New York, New York, USA
| | - Vitor C Piro
- Research Group Bioinformatics (NG4), Robert Koch Institute, Berlin, Germany
- Coordination for the Improvement of Higher Education Personnel (CAPES) Foundation, Ministry of Education of Brazil, Brasília, Brazil
| | - Bernhard Y Renard
- Research Group Bioinformatics (NG4), Robert Koch Institute, Berlin, Germany
| | - Mihai Pop
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA
- Department of Computer Science, University of Maryland, College Park, Maryland, USA
| | - Hans-Peter Klenk
- School of Biology, Newcastle University, Newcastle upon Tyne, UK
| | - Markus Göker
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Nikos C Kyrpides
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Tanja Woyke
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | | | - Paul Schulze-Lefert
- Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Cluster of Excellence on Plant Sciences (CEPLAS)
| | - Edward M Rubin
- Department of Energy, Joint Genome Institute, Walnut Creek, California, USA
| | - Aaron E Darling
- The ithree institute, University of Technology Sydney, Sydney, New South Wales, Australia
| | - Thomas Rattei
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
| | - Alice C McHardy
- Formerly Department of Algorithmic Bioinformatics, Heinrich Heine University (HHU), Duesseldorf, Germany
- Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Cluster of Excellence on Plant Sciences (CEPLAS)
| |
|
50
|
Beisser D, Graupner N, Grossmann L, Timm H, Boenigk J, Rahmann S. TaxMapper: an analysis tool, reference database and workflow for metatranscriptome analysis of eukaryotic microorganisms. BMC Genomics 2017; 18:787. [PMID: 29037173 PMCID: PMC5644092 DOI: 10.1186/s12864-017-4168-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/13/2017] [Accepted: 10/05/2017] [Indexed: 12/17/2022]
Abstract
Background: High-throughput sequencing (HTS) technologies are increasingly applied to analyse complex microbial ecosystems by mRNA sequencing of whole communities, also known as metatranscriptome sequencing. This approach is at the moment largely limited to prokaryotic communities and communities of few eukaryotic species with sequenced genomes. For eukaryotes the analysis is hindered mainly by a low and fragmented coverage of the reference databases to infer the community composition, but also by lack of automated workflows for the task. Results: From the databases of the National Center for Biotechnology Information and Marine Microbial Eukaryote Transcriptome Sequencing Project, 142 references were selected in such a way that the taxa represent the main lineages within each of the seven supergroups of eukaryotes and possess predominantly complete transcriptomes or genomes. From these references, we created an annotated microeukaryotic reference database. We developed a tool called TaxMapper for reliable mapping of sequencing reads against this database and filtering of unreliable assignments. For filtering, a classifier was trained and tested on each of the following: sequences of taxa in the database, sequences of taxa related to those in the database, and random sequences. Additionally, TaxMapper is part of a metatranscriptomic Snakemake workflow developed to perform quality assessment, functional and taxonomic annotation and (multivariate) statistical analysis including environmental data. The workflow is provided and described in detail to empower researchers to apply it for metatranscriptome analysis of any environmental sample. Conclusions: TaxMapper shows superior performance compared to standard approaches, resulting in a higher number of true positive taxonomic assignments. Both the TaxMapper tool and the workflow are available as open-source code at Bitbucket under the MIT license: https://bitbucket.org/dbeisser/taxmapper and as a Bioconda package: https://bioconda.github.io/recipes/taxmapper/README.html. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-4168-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Daniela Beisser
- Biodiversity, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, Germany
| | - Nadine Graupner
- Biodiversity, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, Germany
| | - Lars Grossmann
- Biodiversity, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, Germany
| | - Henning Timm
- Genome Informatics, University of Duisburg-Essen, University Hospital Essen, Hufelandstr. 55, Essen, 45147, Germany
| | - Jens Boenigk
- Biodiversity, University of Duisburg-Essen, Universitätsstr. 5, Essen, 45141, Germany
| | - Sven Rahmann
- Genome Informatics, University of Duisburg-Essen, University Hospital Essen, Hufelandstr. 55, Essen, 45147, Germany
| |
|