1
Santana-Pereira ALR. Identification of PKS Gene Clusters from Metagenomic Libraries Using a Next-Generation Sequencing Approach. Methods Mol Biol 2023; 2555:73-90. [PMID: 36306079 DOI: 10.1007/978-1-0716-2795-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Microbial secondary metabolites have been an important source of bioactive compounds with diverse applications from medicine to agriculture, notably those encoded by polyketide synthase (PKS) clusters due to their astounding chemical diversity. While most discovered compounds originate from culturable microorganisms, yet-to-be-cultured microbes represent a reservoir of previously inaccessible compounds. The advent and development of metagenomics have allowed the characterization not only of these microorganisms but also of their metabolic potential, making it feasible to prospect environmental PKSs for natural product discovery. Study of environmental PKSs often relies on the construction of metagenomic libraries and their mining, with clones containing PKS clusters identified via amplification of conserved domains and then screened for an activity of interest. Compounds produced by clones exhibiting the desired bioactivity can be isolated and characterized. However, these approaches can be less sensitive and biased against more divergent clusters, in addition to precluding the use of bioinformatics for cluster characterization prior to expression. While direct shotgun sequencing of metagenomes has identified and profiled a great number of PKSs from different environments and yet-to-be-cultured microorganisms, it does not lend itself well to heterologous expression, the crux of natural product discovery. Here, we describe a strategy for sequencing entire metagenomic libraries while maintaining correspondence between sequence and clone, allowing the full characterization and annotation of all clusters present in a library using bioinformatic tools and then seamlessly passing clones of interest for activity screening through heterologous expression. Once a library is sequenced, the methods herein can be adapted for the mining of any biosynthetic gene cluster of interest within a metagenomic library.
2
Santana-Pereira ALR, Sandoval-Powers M, Monsma S, Zhou J, Santos SR, Mead DA, Liles MR. Discovery of Novel Biosynthetic Gene Cluster Diversity From a Soil Metagenomic Library. Front Microbiol 2020; 11:585398. [PMID: 33365020 PMCID: PMC7750434 DOI: 10.3389/fmicb.2020.585398] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/20/2020] [Accepted: 11/16/2020] [Indexed: 12/31/2022] [Open Access]
Abstract
Soil microorganisms historically have been a rich resource for natural product discovery, yet the majority of these microbes remain uncultivated and their biosynthetic capacity is left underexplored. To identify the biosynthetic potential of soil microorganisms using a culture-independent approach, we constructed a large-insert metagenomic library in Escherichia coli from topsoil sampled from the Cullars Rotation (Auburn, AL, United States), a long-term crop rotation experiment. Library clones were screened for biosynthetic gene clusters (BGCs) using either PCR or a next-generation sequencing (NGS) multiplexed pooling strategy, coupled with bioinformatic analysis to identify contigs associated with each metagenomic clone. A total of 1,015 BGCs were detected from 19,200 clones, identifying 223 clones (1.2%) that carry a polyketide synthase (PKS) and/or a non-ribosomal peptide synthetase (NRPS) cluster, a dramatically improved hit rate compared to PCR screening that targeted type I polyketide ketosynthase (KS) domains. The NRPS and PKS clusters identified by NGS were distinct from known BGCs in the MIBiG database and from the PKS clusters identified by PCR. Likewise, 16S rRNA gene sequences obtained by NGS of the library included many representatives that were not recovered by PCR, in concordance with the same bias observed in KS amplicon screening. This study provides novel resources for natural product discovery and circumvents amplification bias to allow annotation of a soil metagenomic library for a more complete picture of its functional and phylogenetic diversity.
Affiliation(s)
- Scott Monsma
  - Lucigen Corporation, Middleton, WI, United States
- Jinglie Zhou
  - Department of Biological Sciences, Auburn University, Auburn, AL, United States
- Scott R. Santos
  - Department of Biological Sciences, Auburn University, Auburn, AL, United States
- David A. Mead
  - Varigen Biosciences Corporation, Madison, WI, United States
- Mark R. Liles
  - Department of Biological Sciences, Auburn University, Auburn, AL, United States
  - Varigen Biosciences Corporation, Madison, WI, United States
3
Chiara M, Placido A, Picardi E, Ceci LR, Horner DS, Pesole G. A-GAME: improving the assembly of pooled functional metagenomics sequence data. BMC Genomics 2018; 19:44. [PMID: 29329522 PMCID: PMC5767027 DOI: 10.1186/s12864-017-4369-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/27/2017] [Accepted: 12/08/2017] [Indexed: 02/06/2023] [Open Access]
Abstract
Background: Expression screening of environmental DNA (eDNA) libraries is a popular approach for the identification and characterization of novel microbial enzymes with promising biotechnological properties. In such “functional metagenomics” experiments, inserts selected on the basis of activity assays are sequenced with high-throughput sequencing technologies. Assembly is followed by gene prediction, annotation and identification of candidate genes that are subsequently evaluated for biotechnological applications. Results: Here we present A-GAME (A GAlaxy suite for functional MEtagenomics), a web service incorporating state-of-the-art tools and workflows for the analysis of eDNA sequence data. We illustrate the potential of A-GAME workflows using real functional metagenomics data, showing that they outperform alternative metagenomics assemblers. Dedicated tools available in A-GAME allow efficient analysis of pooled libraries and rapid identification of candidate genes, reducing sequencing costs and removing the need for laborious manual annotation. Conclusion: We believe A-GAME will constitute a valuable resource for the functional metagenomics community. A-GAME is publicly available at http://beaconlab.it/agame.
Affiliation(s)
- Matteo Chiara
  - Department of Biosciences, University of Milan, via Celoria 26, 20133, Milan, Italy
- Antonio Placido
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, Consiglio Nazionale delle Ricerche, via Amendola 165A, 70126, Bari, Italy
- Ernesto Picardi
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, Consiglio Nazionale delle Ricerche, via Amendola 165A, 70126, Bari, Italy
  - Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", via Orabona, 4, 70126, Bari, Italy
- Luigi Ruggiero Ceci
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, Consiglio Nazionale delle Ricerche, via Amendola 165A, 70126, Bari, Italy
- David Stephen Horner
  - Department of Biosciences, University of Milan, via Celoria 26, 20133, Milan, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, Consiglio Nazionale delle Ricerche, via Amendola 165A, 70126, Bari, Italy
- Graziano Pesole
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, Consiglio Nazionale delle Ricerche, via Amendola 165A, 70126, Bari, Italy
  - Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", via Orabona, 4, 70126, Bari, Italy
4
Eriksson P, Mourkas E, González-Acuna D, Olsen B, Ellström P. Evaluation and optimization of microbial DNA extraction from fecal samples of wild Antarctic bird species. Infect Ecol Epidemiol 2017; 7:1386536. [PMID: 29152162 PMCID: PMC5678435 DOI: 10.1080/20008686.2017.1386536] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/21/2017] [Accepted: 09/12/2017] [Indexed: 10/29/2022] [Open Access]
Abstract
Introduction: Advances in the development of nucleic acid-based methods have dramatically facilitated studies of host-microbial interactions. Fecal DNA analysis can provide information about the host's microbiota and gastrointestinal pathogen burden. Numerous studies have been conducted in mammals, yet birds are less well studied. Avian fecal DNA extraction has proved challenging, partly due to the mixture of fecal and urinary excretions and the deficiency of optimized protocols. This study presents an evaluation of the performance in avian fecal DNA extraction of six commercial kits from different bird species, focusing on penguins. Material and methods: Six DNA extraction kits were first tested according to the manufacturers' instructions using mallard feces. The kit giving the highest DNA yield was selected for further optimization and evaluation using Antarctic bird feces. Results: Penguin feces constitute a challenging sample type: most of the DNA extraction kits failed to yield acceptable amounts of DNA. The QIAamp cador Pathogen kit (Qiagen) performed the best in the initial investigation. Further optimization of the protocol resulted in good yields of high-quality DNA from seven bird species of different avian orders. Conclusion: This study presents an optimized approach to DNA extraction from challenging avian fecal samples.
Affiliation(s)
- Per Eriksson
  - Zoonosis Science Center, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  - Zoonosis Science Center, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Evangelos Mourkas
  - Zoonosis Science Center, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Björn Olsen
  - Zoonosis Science Center, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- Patrik Ellström
  - Zoonosis Science Center, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
  - Zoonosis Science Center, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
5
Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries. PLoS One 2014; 9:e98968. [PMID: 24911009 PMCID: PMC4049660 DOI: 10.1371/journal.pone.0098968] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/27/2013] [Accepted: 05/09/2014] [Indexed: 11/19/2022] [Open Access]
Abstract
High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.