1
Ryan MJ, Schloter M, Berg G, Kinkel LL, Eversole K, Macklin JA, Rybakova D, Sessitsch A. Towards a unified data infrastructure to support European and global microbiome research: a call to action. Environ Microbiol 2020; 23:372-375. [PMID: 33196130 PMCID: PMC7898335 DOI: 10.1111/1462-2920.15323] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/10/2020] [Accepted: 11/12/2020] [Indexed: 11/30/2022]
Abstract
High‐quality microbiome research relies on the integrity, management and quality of supporting data. Currently biobanks and culture collections have different formats and approaches to data management. This necessitates a standard data format to underpin research, particularly in line with the FAIR data standards of findability, accessibility, interoperability and reusability. We address the importance of a unified, coordinated approach that ensures compatibility of data between that needed by biobanks and culture collections, but also to ensure linkage between bioinformatic databases and the wider research community.
Affiliation(s)
- Michael Schloter
  - Helmholtz Zentrum München, National Research Center for Environmental Health, Research Unit for Comparative Microbiome Analysis, Oberschleissheim, Germany
- Gabriele Berg
  - Institute of Environmental Biotechnology, Graz University of Technology, Graz, Austria
- Linda L Kinkel
  - Department of Plant Pathology, University of Minnesota, Saint Paul, MN, USA
- Kellye Eversole
  - International Alliance for Phytobiomes Research, Lee's Summit, MO, USA
  - Eversole Associates, Bethesda, MD, USA
- James A Macklin
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
- Daria Rybakova
  - Institute of Environmental Biotechnology, Graz University of Technology, Graz, Austria
- Angela Sessitsch
  - AIT Austrian Institute of Technology, Center for Health and Bioresources, Bioresources Unit, Tulln, Austria
2
Xu P, Modavi C, Demaree B, Twigg F, Liang B, Sun C, Zhang W, Abate AR. Microfluidic automated plasmid library enrichment for biosynthetic gene cluster discovery. Nucleic Acids Res 2020; 48:e48. [PMID: 32095820 PMCID: PMC7192590 DOI: 10.1093/nar/gkaa131] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/26/2019] [Revised: 01/22/2020] [Accepted: 02/19/2020] [Indexed: 12/13/2022]
Abstract
Microbial biosynthetic gene clusters are a valuable source of bioactive molecules. However, because they typically represent a small fraction of genomic material in most metagenomic samples, it remains challenging to deeply sequence them. We present an approach to isolate and sequence gene clusters in metagenomic samples using microfluidic automated plasmid library enrichment. Our approach provides deep coverage of the target gene cluster, facilitating reassembly. We demonstrate the approach by isolating and sequencing type I polyketide synthase gene clusters from an Antarctic soil metagenome. Our method promotes the discovery of functional-related genes and biosynthetic pathways.
Affiliation(s)
- Peng Xu
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Cyrus Modavi
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Benjamin Demaree
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA
- Frederick Twigg
  - Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, CA, USA
- Benjamin Liang
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Chen Sun
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Wenjun Zhang
  - Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Adam R Abate
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
3
Metagenomic and chemical characterization of soil cobalamin production. ISME J 2019; 14:53-66. [PMID: 31492962 PMCID: PMC6908642 DOI: 10.1038/s41396-019-0502-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/18/2019] [Revised: 07/15/2019] [Accepted: 07/31/2019] [Indexed: 01/01/2023]
Abstract
Cobalamin (vitamin B12) is an essential enzyme cofactor for most branches of life. Despite the potential importance of this cofactor for soil microbial communities, the producers and consumers of cobalamin in terrestrial environments are still unknown. Here we provide the first metagenome-based assessment of soil cobalamin-producing bacteria and archaea, quantifying and classifying genes encoding proteins for cobalamin biosynthesis, transport, remodeling, and dependency in 155 soil metagenomes with profile hidden Markov models. We also measured several forms of cobalamin (CN-, Me-, OH-, Ado-B12) and the cobalamin lower ligand (5,6-dimethylbenzimidazole; DMB) in 40 diverse soil samples. Metagenomic analysis revealed that less than 10% of soil bacteria and archaea encode the genetic potential for de novo synthesis of this important enzyme cofactor. Predominant soil cobalamin producers were associated with the Proteobacteria, Actinobacteria, Firmicutes, Nitrospirae, and Thaumarchaeota. In contrast, a much larger proportion of abundant soil genera lacked cobalamin synthesis genes and instead were associated with gene sequences encoding cobalamin transport and cobalamin-dependent enzymes. The enrichment of DMB and corresponding DMB synthesis genes, relative to corrin ring synthesis genes, suggests an important role for cobalamin remodelers in terrestrial habitats. Together, our results indicate that microbial cobalamin production and repair serve as keystone functions that are significantly correlated with microbial community size, diversity, and biogeochemistry of terrestrial ecosystems.
4
Rahman MA, LaPierre N, Rangwala H, Barbara D. Metagenome sequence clustering with hash-based canopies. J Bioinform Comput Biol 2017; 15:1740006. [PMID: 29113561 DOI: 10.1142/s0219720017400066] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
Metagenomics is the collective sequencing of co-existing microbial communities, which are ubiquitous across various clinical and ecological environments. Due to the large volume and random short sequences (reads) obtained from community sequences, analyzing the diversity, abundance and functions of different organisms within these communities is a challenging task. We present a fast and scalable clustering algorithm for analyzing large-scale metagenome sequence data. Our approach achieves efficiency by partitioning the large number of sequence reads into groups (called canopies) using hashing. These canopies are then refined by using state-of-the-art sequence clustering algorithms. This canopy-clustering (CC) algorithm can be used as a pre-processing phase for computationally expensive clustering algorithms. We use and compare three hashing schemes for canopy construction with five popular and state-of-the-art sequence clustering methods. We evaluate our clustering algorithm on synthetic and real-world 16S and whole metagenome benchmarks. We demonstrate the ability of our proposed approach to determine meaningful Operational Taxonomic Units (OTU) and observe significant speedup with regard to run time when compared to different clustering algorithms. We also make our source code publicly available on GitHub.
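The two-stage idea in this abstract (a cheap hash-based split into canopies, then a costlier clustering pass inside each canopy) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the single min-hash key built from `zlib.crc32`, the k-mer Jaccard similarity, the greedy refinement step, and every function name and default parameter below are assumptions chosen for illustration. Reads are assumed to be at least `k` bases long.

```python
import zlib
from collections import defaultdict


def kmers(seq, k=4):
    """Return the set of k-mers in a read (assumes len(seq) >= k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def canopy_key(seq, k=4):
    """Hypothetical hashing scheme: smallest CRC32 value over the read's k-mers."""
    return min(zlib.crc32(km.encode()) for km in kmers(seq, k))


def build_canopies(reads, k=4):
    """Cheap pass: reads that share a min-hash key land in the same canopy."""
    canopies = defaultdict(list)
    for read in reads:
        canopies[canopy_key(read, k)].append(read)
    return canopies


def jaccard(a, b):
    """Jaccard similarity of two non-empty k-mer sets."""
    return len(a & b) / len(a | b)


def refine(canopy, k=4, threshold=0.5):
    """Stand-in for the expensive clusterer: greedy clustering within one canopy."""
    clusters = []  # each item is (seed k-mer set, member reads)
    for read in canopy:
        read_kmers = kmers(read, k)
        for seed, members in clusters:
            if jaccard(read_kmers, seed) >= threshold:
                members.append(read)
                break
        else:
            clusters.append((read_kmers, [read]))
    return [members for _, members in clusters]


def canopy_cluster(reads, k=4, threshold=0.5):
    """Partition reads into canopies, then refine each canopy independently."""
    clusters = []
    for canopy in build_canopies(reads, k).values():
        clusters.extend(refine(canopy, k, threshold))
    return clusters


reads = ["ACGTACGTAC", "ACGTACGTAC", "TTTTGGGGCC"]
clusters = canopy_cluster(reads)
```

Because the refinement step only compares reads inside the same canopy, the number of pairwise comparisons drops from all pairs of reads to pairs within each bucket, which is where a speedup of the kind described would come from.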
Affiliation(s)
- Nathan LaPierre
  - Department of Computer Science, University of California, Los Angeles, California, USA
- Huzefa Rangwala
  - Department of Computer Science, George Mason University, Fairfax, Virginia, USA
- Daniel Barbara
  - Department of Computer Science, George Mason University, Fairfax, Virginia, USA
5
Rahman SJ, Charles TC, Kaur P. Metagenomic Approaches to Identify Novel Organisms from the Soil Environment in a Classroom Setting. J Microbiol Biol Educ 2016; 17:423-429. [PMID: 28101269 PMCID: PMC5134946 DOI: 10.1128/jmbe.v17i3.1115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
Molecular Microbial Metagenomics is a research-based undergraduate course developed at Georgia State University. This semester-long course provides hands-on research experience in the area of microbial diversity and introduces molecular approaches to study diversity. Students are part of an ongoing research project that uses metagenomic approaches to isolate clones containing 16S ribosomal ribonucleic acid (rRNA) genes from a soil metagenomic library. These approaches not only provide a measure of microbial diversity in the sample but may also allow discovery of novel organisms. Metagenomic approaches differ from the traditional culturing methods in that they use molecular analysis of community deoxyribonucleic acid (DNA) instead of culturing individual organisms. Groups of students select a batch of 100 clones from a metagenomic library. Using universal primers to amplify 16S rRNA genes from the pool of DNA isolated from 100 clones, and a stepwise process of elimination, each group isolates individual clones containing 16S rRNA genes within their batch of 100 clones. The amplified 16S rRNA genes are sequenced and analyzed using bioinformatics tools to determine whether the rRNA gene belongs to a novel organism. This course provides avenues for active learning and enhances students' conceptual understanding of microbial diversity. Average scores on six assessment methods used during field testing indicated that success in achieving different learning objectives varied between 84% and 95%, with 65% of the students demonstrating complete grasp of the project based on the end-of-project lab report. The authentic research experience obtained in this course is also expected to result in more undergraduates choosing research-based graduate programs or careers.
Affiliation(s)
- Sadia J. Rahman
  - Department of Biology, Georgia State University, Atlanta, GA, 30303, USA
- Trevor C. Charles
  - Department of Biology, University of Waterloo, Waterloo, ON N2V 2P1, Canada
- Parjit Kaur
  - Department of Biology, Georgia State University, Atlanta, GA, 30303, USA
6
Open-Source Sequence Clustering Methods Improve the State of the Art. mSystems 2016; 1:mSystems00003-15. [PMID: 27822515 PMCID: PMC5069751 DOI: 10.1128/msystems.00003-15] [Citation(s) in RCA: 110] [Impact Index Per Article: 13.8] [Received: 10/04/2015] [Accepted: 01/10/2016] [Indexed: 02/07/2023]
Abstract
Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014, http://dx.doi.org/10.1186/s12915-014-0069-1). Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH’s most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. 
Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release.
7
Lam KN, Cheng J, Engel K, Neufeld JD, Charles TC. Current and future resources for functional metagenomics. Front Microbiol 2015; 6:1196. [PMID: 26579102 PMCID: PMC4625089 DOI: 10.3389/fmicb.2015.01196] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 08/12/2015] [Accepted: 10/14/2015] [Indexed: 11/18/2022]
Abstract
Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries—physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research.
Affiliation(s)
- Kathy N Lam
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Jiujun Cheng
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Katja Engel
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Josh D Neufeld
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Trevor C Charles
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
8
Lam KN, Charles TC. Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries. Microbiome 2015; 3:22. [PMID: 26056565 PMCID: PMC4459075 DOI: 10.1186/s40168-015-0086-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 12/09/2014] [Accepted: 05/01/2015] [Indexed: 05/24/2023]
Abstract
BACKGROUND: Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role.
RESULTS: To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ70 promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ70 consensus sequences in the genome than with simple GC content.
CONCLUSIONS: The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ70 consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.
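The two quantities compared in this abstract, GC content and the number of rpoD/σ70 consensus sequences, can be computed along the following lines. This is a hedged sketch rather than the authors' pipeline: it assumes the textbook E. coli σ70 consensus (a TTGACA −35 hexamer, a 16 to 18 bp spacer, and a TATAAT −10 hexamer), requires exact matches instead of the "consensus-like" scoring the abstract mentions, scans only the given strand, and uses hypothetical function names. Sequences are assumed to be non-empty.

```python
import re

# Assumed textbook sigma-70 consensus: -35 hexamer, 16-18 bp spacer, -10 hexamer.
# The lookahead lets overlapping sites be counted by their start position.
SIGMA70_CONSENSUS = re.compile(r"(?=TTGACA[ACGT]{16,18}TATAAT)")


def gc_content(seq):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_sigma70_sites(seq):
    """Count exact consensus promoter matches on the given strand only."""
    return len(SIGMA70_CONSENSUS.findall(seq.upper()))


example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
sites = count_sigma70_sites(example)
gc = gc_content("ATGC")
```

A fuller analysis would also scan the reverse complement and tolerate mismatches to the consensus, for example with a position weight matrix.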
Affiliation(s)
- Kathy N. Lam
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
9
Abstract
Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon (12C) or stable-isotope-labeled (13C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the 13C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. The ability to identify genes based on function, instead of sequence homology, allows the discovery of genes that would not be identified through sequence alone. 
This is arguably the most powerful application of metagenomics for the recovery of novel genes and a natural partner of the stable-isotope-probing approach for targeting active-yet-uncultured microorganisms. We expanded on previous efforts to combine stable-isotope probing and metagenomics, enriching microorganisms from multiple soils that were active in degrading plant-derived carbohydrates, followed by construction of a cellulose-based metagenomic library and recovery of glycoside hydrolases through functional metagenomics. The major advance of our study was the discovery of active-yet-uncultivated soil microorganisms and enrichment of their glycoside hydrolases. We recovered positive cosmid clones in a higher frequency than would be expected with direct metagenomic analysis of soil DNA. This study has generated an invaluable metagenomic resource that future research will exploit for genetic and enzymatic potential.
10
Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries. PLoS One 2014; 9:e98968. [PMID: 24911009 PMCID: PMC4049660 DOI: 10.1371/journal.pone.0098968] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/27/2013] [Accepted: 05/09/2014] [Indexed: 11/19/2022]
Abstract
High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.
11
Versatile broad-host-range cosmids for construction of high quality metagenomic libraries. J Microbiol Methods 2014; 99:27-34. [PMID: 24495694 DOI: 10.1016/j.mimet.2014.01.015] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/10/2013] [Revised: 01/22/2014] [Accepted: 01/25/2014] [Indexed: 12/18/2022]
Abstract
We constructed IncP broad-host-range Gateway® entry cosmids pJC8 and pJC24, which replicate in diverse Proteobacteria. We demonstrate the functionality of these vectors by extracting, purifying, and size-selecting metagenomic DNA from agricultural corn and wheat soils, followed by cloning into pJC8. Metagenomic DNA libraries of 8×10^4 (corn soil) and 9×10^6 (wheat soil) clones were generated for functional screening. The DNA cloned in these libraries can be transferred from these recombinant cosmids to Gateway® destination vectors for specialized screening purposes. Those library clones are available from the Canadian MetaMicroBiome Library project (http://www.cm2bl.org/).
12
Engel K, Ashby D, Brady SF, Cowan DA, Doemer J, Edwards EA, Fiebig K, Martens EC, McCormac D, Mead DA, Miyazaki K, Moreno-Hagelsieb G, O'Gara F, Reid A, Rose DR, Simonet P, Sjöling S, Smalla K, Streit WR, Tedman-Jones J, Valla S, Wellington EMH, Wu CC, Liles MR, Neufeld JD, Sessitsch A, Charles TC. Meeting report: 1st International Functional Metagenomics Workshop May 7-8, 2012, St. Jacobs, Ontario, Canada. Stand Genomic Sci 2013; 8:106-11. [PMID: 23961315 PMCID: PMC3739178 DOI: 10.4056/sigs.3406845] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
Abstract
This report summarizes the events of the 1st International Functional Metagenomics Workshop. The workshop was held on May 7 and 8, 2012, in St. Jacobs, Ontario, Canada and was focused on building an international functional metagenomics community, exploring strategic research areas, and identifying opportunities for future collaboration and funding. The workshop was initiated by researchers at the University of Waterloo with support from the Ontario Genomics Institute (OGI), Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Waterloo.
Affiliation(s)
- Katja Engel
  - University of Waterloo, Waterloo, ON, Canada
13
Targeted recovery of novel phylogenetic diversity from next-generation sequence data. ISME J 2012; 6:2067-77. [PMID: 22791239 PMCID: PMC3475379 DOI: 10.1038/ismej.2012.50] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 12/31/2022]
Abstract
Next-generation sequencing technologies have led to recognition of a so-called ‘rare biosphere’. These microbial operational taxonomic units (OTUs) are defined by low relative abundance and may be specifically adapted to maintaining low population sizes. We hypothesized that mining of low-abundance next-generation 16S ribosomal RNA (rRNA) gene data would lead to the discovery of novel phylogenetic diversity, reflecting microorganisms not yet discovered by previous sampling efforts. Here, we test this hypothesis by combining molecular and bioinformatic approaches for targeted retrieval of phylogenetic novelty within rare biosphere OTUs. We combined BLASTN network analysis, phylogenetics and targeted primer design to amplify 16S rRNA gene sequences from unique potential bacterial lineages, comprising part of the rare biosphere from a multi-million sequence data set from an Arctic tundra soil sample. Demonstrating the feasibility of the protocol developed here, three of seven recovered phylogenetic lineages represented extremely divergent taxonomic entities. These divergent target sequences correspond to (a) a previously unknown lineage within the BRC1 candidate phylum, (b) a sister group to the early diverging and currently recognized monospecific Cyanobacteria Gloeobacter, a genus containing multiple plesiomorphic traits and (c) a highly divergent lineage phylogenetically resolved within mitochondria. A comparison to twelve next-generation data sets from additional soils suggested persistent low-abundance distributions of these novel 16S rRNA genes. The results demonstrate this sequence analysis and retrieval pipeline as applicable for exploring underrepresented phylogenetic novelty and recovering taxa that may represent significant steps in bacterial evolution.