1. Blin K, Shaw S, Steinke K, Villebro R, Ziemert N, Lee SY, Medema MH, Weber T. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res 2019; 47:W81-W87. [PMID: 31032519 PMCID: PMC6602434 DOI: 10.1093/nar/gkz310] [Citation(s) in RCA: 2090] [Impact Index Per Article: 348.3] [Received: 02/07/2019] [Revised: 04/02/2019] [Accepted: 04/17/2019] [Indexed: 12/13/2022]
Abstract
Secondary metabolites produced by bacteria and fungi are an important source of antimicrobials and other bioactive compounds. In recent years, genome mining has seen broad applications in identifying and characterizing new compounds as well as in metabolic engineering. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' (https://antismash.secondarymetabolites.org) has assisted researchers in this, both as a web server and a standalone tool. It has established itself as the most widely used tool for identifying and analysing biosynthetic gene clusters (BGCs) in bacterial and fungal genome sequences. Here, we present an entirely redesigned and extended version 5 of antiSMASH. antiSMASH 5 adds detection rules for clusters encoding the biosynthesis of acyl-amino acids, β-lactones, fungal RiPPs, RaS-RiPPs, polybrominated diphenyl ethers, C-nucleosides, PPY-like ketones and lipolanthines. For type II polyketide synthase-encoding gene clusters, antiSMASH 5 now offers more detailed predictions. The HTML output visualization has been redesigned to improve the navigation and visual representation of annotations. We have again improved the runtime of analysis steps, making it possible to deliver comprehensive annotations for bacterial genomes within a few minutes. A new output file in the standard JavaScript object notation (JSON) format is aimed at downstream tools that process antiSMASH results programmatically.
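For readers who want to process the JSON output mentioned in this abstract, a minimal Python sketch follows. The abstract does not describe the file layout, so the key names used here (`records`, `features`, `type`, `qualifiers`, `product`) and the sample data are assumptions made for illustration; check them against a real antiSMASH result file before relying on them.

```python
import json
from collections import Counter

# Hypothetical, hand-written stand-in for an antiSMASH JSON result.
# The key names are assumptions, not taken from the abstract.
SAMPLE_RESULT = """
{
  "records": [
    {
      "id": "contig_1",
      "features": [
        {"type": "region", "qualifiers": {"product": ["T2PKS"]}},
        {"type": "CDS", "qualifiers": {"locus_tag": ["gene_0001"]}},
        {"type": "region", "qualifiers": {"product": ["betalactone", "NRPS"]}}
      ]
    },
    {
      "id": "contig_2",
      "features": [
        {"type": "region", "qualifiers": {"product": ["NRPS"]}}
      ]
    }
  ]
}
"""


def count_region_products(result: dict) -> Counter:
    """Count predicted product types over all region features."""
    counts = Counter()
    for record in result.get("records", []):
        for feature in record.get("features", []):
            # Skip genes and other annotations; only regions carry products here.
            if feature.get("type") != "region":
                continue
            for product in feature.get("qualifiers", {}).get("product", []):
                counts[product] += 1
    return counts


if __name__ == "__main__":
    result = json.loads(SAMPLE_RESULT)
    for product, n in sorted(count_region_products(result).items()):
        print(f"{product}\t{n}")
```

To use it on a real file, replace `json.loads(SAMPLE_RESULT)` with `json.load(open(path))` and adjust the key names if the actual layout differs.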
Article type: research-article
2. Blin K, Shaw S, Kloosterman AM, Charlop-Powers Z, van Wezel GP, Medema MH, Weber T. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res 2021; 49:W29-W35. [PMID: 33978755 PMCID: PMC8262755 DOI: 10.1093/nar/gkab335] [Citation(s) in RCA: 1546] [Impact Index Per Article: 386.5] [Received: 02/22/2021] [Revised: 04/12/2021] [Accepted: 04/19/2021] [Indexed: 12/18/2022]
Abstract
Many microorganisms produce natural products that form the basis of antimicrobials, antivirals, and other drugs. Genome mining is routinely used to complement screening-based workflows to discover novel natural products. Since 2011, the "antibiotics and secondary metabolite analysis shell—antiSMASH" (https://antismash.secondarymetabolites.org/) has supported researchers in their microbial genome mining tasks, both as a free-to-use web server and as a standalone tool under an OSI-approved open-source license. It is currently the most widely used tool for detecting and characterising biosynthetic gene clusters (BGCs) in bacteria and fungi. Here, we present the updated version 6 of antiSMASH. antiSMASH 6 increases the number of supported cluster types from 58 to 71, displays the modular structure of multi-modular BGCs, adds a new BGC comparison algorithm, allows for the integration of results from other prediction tools, and more effectively detects tailoring enzymes in RiPP clusters.
Article type: Research Support, Non-U.S. Gov't
3. Weber T, Blin K, Duddela S, Krug D, Kim HU, Bruccoleri R, Lee SY, Fischbach MA, Müller R, Wohlleben W, Breitling R, Takano E, Medema MH. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res 2015; 43:W237-43. [PMID: 25948579 PMCID: PMC4489286 DOI: 10.1093/nar/gkv437] [Citation(s) in RCA: 1431] [Impact Index Per Article: 143.1] [Received: 02/25/2015] [Accepted: 04/23/2015] [Indexed: 12/01/2022]
Abstract
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.
Article type: Research Support, Non-U.S. Gov't
4. Medema MH, Blin K, Cimermancic P, de Jager V, Zakrzewski P, Fischbach MA, Weber T, Takano E, Breitling R. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res 2011; 39:W339-46. [PMID: 21672958 PMCID: PMC3125804 DOI: 10.1093/nar/gkr466] [Citation(s) in RCA: 1413] [Impact Index Per Article: 100.9] [Indexed: 11/14/2022]
Abstract
Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org.
Article type: Research Support, Non-U.S. Gov't
5. Blin K, Wolf T, Chevrette MG, Lu X, Schwalen CJ, Kautsar SA, Suarez Duran HG, de los Santos E, Kim HU, Nave M, Dickschat JS, Mitchell DA, Shelest E, Breitling R, Takano E, Lee SY, Weber T, Medema MH. antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification. Nucleic Acids Res 2017; 45:W36-W41. [PMID: 28460038 PMCID: PMC5570095 DOI: 10.1093/nar/gkx319] [Citation(s) in RCA: 899] [Impact Index Per Article: 112.4] [Received: 02/25/2017] [Revised: 04/07/2017] [Accepted: 04/13/2017] [Indexed: 02/07/2023]
Abstract
Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' has assisted researchers in efficiently performing this, both as a web server and a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features, including prediction of gene cluster boundaries using the ClusterFinder method or the newly integrated CASSIS algorithm, improved substrate specificity prediction for non-ribosomal peptide synthetase adenylation domains based on the new SANDPUMA algorithm, improved predictions for terpene and ribosomally synthesized and post-translationally modified peptides cluster products, reporting of sequence similarity to proteins encoded in experimentally characterized gene clusters on a per-protein basis and a domain-level alignment tool for comparative analysis of trans-AT polyketide synthase assembly line architectures. Additionally, several usability features have been updated and improved. Together, these improvements make antiSMASH up-to-date with the latest developments in natural product research and will further facilitate computational genome mining for the discovery of novel bioactive molecules.
Article type: Research Support, N.I.H., Extramural
6. Blin K, Shaw S, Augustijn HE, Reitz ZL, Biermann F, Alanjary M, Fetter A, Terlouw BR, Metcalf WW, Helfrich EJN, van Wezel GP, Medema MH, Weber T. antiSMASH 7.0: new and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res 2023:7151336. [PMID: 37140036 DOI: 10.1093/nar/gkad344] [Citation(s) in RCA: 831] [Impact Index Per Article: 415.5] [Received: 02/24/2023] [Revised: 03/31/2023] [Accepted: 04/26/2023] [Indexed: 05/05/2023]
Abstract
Microorganisms produce small bioactive compounds as part of their secondary or specialised metabolism. Often, such metabolites have antimicrobial, anticancer, antifungal, antiviral or other bio-activities and thus play an important role for applications in medicine and agriculture. In the past decade, genome mining has become a widely-used method to explore, access, and analyse the available biodiversity of these compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' (https://antismash.secondarymetabolites.org/) has supported researchers in their microbial genome mining tasks, both as a free to use web server and as a standalone tool under an OSI-approved open source licence. It is currently the most widely used tool for detecting and characterising biosynthetic gene clusters (BGCs) in archaea, bacteria, and fungi. Here, we present the updated version 7 of antiSMASH. antiSMASH 7 increases the number of supported cluster types from 71 to 81, as well as containing improvements in the areas of chemical structure prediction, enzymatic assembly-line visualisation and gene cluster regulation.
7. Blin K, Medema MH, Kazempour D, Fischbach MA, Breitling R, Takano E, Weber T. antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers. Nucleic Acids Res 2013; 41:W204-12. [PMID: 23737449 PMCID: PMC3692088 DOI: 10.1093/nar/gkt449] [Citation(s) in RCA: 639] [Impact Index Per Article: 53.3] [Indexed: 12/23/2022]
Abstract
Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that automates this process. Here, we present the highly improved antiSMASH 2.0 release, available at http://antismash.secondarymetabolites.org/. For the new version, antiSMASH was entirely re-designed using a plug-and-play concept that allows easy integration of novel predictor or output modules. antiSMASH 2.0 now supports input of multiple related sequences simultaneously (multi-FASTA/GenBank/EMBL), which allows the analysis of draft genomes comprising multiple contigs. Moreover, direct analysis of protein sequences is now possible. antiSMASH 2.0 has also been equipped with the capacity to detect additional classes of secondary metabolites, including oligosaccharide antibiotics, phenazines, thiopeptides, homo-serine lactones, phosphonates and furans. The algorithm for predicting the core structure of the cluster end product is now also covering lantipeptides, in addition to polyketides and non-ribosomal peptides. The antiSMASH ClusterBlast functionality has been extended to identify sub-clusters involved in the biosynthesis of specific chemical building blocks. The new features currently make antiSMASH 2.0 the most comprehensive resource for identifying and analyzing novel secondary metabolite biosynthetic pathways in microorganisms.
Article type: Research Support, Non-U.S. Gov't
8. Medema MH, Kottmann R, Yilmaz P, Cummings M, Biggins JB, Blin K, de Bruijn I, Chooi YH, Claesen J, Coates RC, Cruz-Morales P, Duddela S, Düsterhus S, Edwards DJ, Fewer DP, Garg N, Geiger C, Gomez-Escribano JP, Greule A, Hadjithomas M, Haines AS, Helfrich EJN, Hillwig ML, Ishida K, Jones AC, Jones CS, Jungmann K, Kegler C, Kim HU, Kötter P, Krug D, Masschelein J, Melnik AV, Mantovani SM, Monroe EA, Moore M, Moss N, Nützmann HW, Pan G, Pati A, Petras D, Reen FJ, Rosconi F, Rui Z, Tian Z, Tobias NJ, Tsunematsu Y, Wiemann P, Wyckoff E, Yan X, Yim G, Yu F, Xie Y, Aigle B, Apel AK, Balibar CJ, Balskus EP, Barona-Gómez F, Bechthold A, Bode HB, Borriss R, Brady SF, Brakhage AA, Caffrey P, Cheng YQ, Clardy J, Cox RJ, De Mot R, Donadio S, Donia MS, van der Donk WA, Dorrestein PC, Doyle S, Driessen AJM, Ehling-Schulz M, Entian KD, Fischbach MA, Gerwick L, Gerwick WH, Gross H, Gust B, Hertweck C, Höfte M, Jensen SE, Ju J, Katz L, Kaysser L, Klassen JL, Keller NP, Kormanec J, Kuipers OP, Kuzuyama T, Kyrpides NC, Kwon HJ, Lautru S, Lavigne R, Lee CY, Linquan B, Liu X, Liu W, Luzhetskyy A, Mahmud T, Mast Y, Méndez C, Metsä-Ketelä M, Micklefield J, Mitchell DA, Moore BS, Moreira LM, Müller R, Neilan BA, Nett M, Nielsen J, O'Gara F, Oikawa H, Osbourn A, Osburne MS, Ostash B, Payne SM, Pernodet JL, Petricek M, Piel J, Ploux O, Raaijmakers JM, Salas JA, Schmitt EK, Scott B, Seipke RF, Shen B, Sherman DH, Sivonen K, Smanski MJ, Sosio M, Stegmann E, Süssmuth RD, Tahlan K, Thomas CM, Tang Y, Truman AW, Viaud M, Walton JD, Walsh CT, Weber T, van Wezel GP, Wilkinson B, Willey JM, Wohlleben W, Wright GD, Ziemert N, Zhang C, Zotchev SB, Breitling R, Takano E, Glöckner FO. Minimum Information about a Biosynthetic Gene cluster. Nat Chem Biol 2015; 11:625-31. [PMID: 26284661 PMCID: PMC5714517 DOI: 10.1038/nchembio.1890] [Citation(s) in RCA: 588] [Impact Index Per Article: 58.8] [Indexed: 01/09/2023]
Article type: research-article
9. Röttig M, Medema MH, Blin K, Weber T, Rausch C, Kohlbacher O. NRPSpredictor2--a web server for predicting NRPS adenylation domain specificity. Nucleic Acids Res 2011; 39:W362-7. [PMID: 21558170 PMCID: PMC3125756 DOI: 10.1093/nar/gkr323] [Citation(s) in RCA: 485] [Impact Index Per Article: 34.6] [Indexed: 11/30/2022]
Abstract
The products of many bacterial non-ribosomal peptide synthetases (NRPS) are highly important secondary metabolites, including vancomycin and other antibiotics. The ability to predict substrate specificity of newly detected NRPS Adenylation (A-) domains by genome sequencing efforts is of great importance to identify and annotate new gene clusters that produce secondary metabolites. Prediction of A-domain specificity based on the sequence alone can be achieved through sequence signatures or, more accurately, through machine learning methods. We present an improved predictor, based on previous work (NRPSpredictor), that predicts A-domain specificity using Support Vector Machines on four hierarchical levels, ranging from gross physicochemical properties of an A-domain’s substrates down to single amino acid substrates. The three more general levels are predicted with an F-measure better than 0.89 and the most detailed level with an average F-measure of 0.80. We also modeled the applicability domain of our predictor to estimate for new A-domains whether they lie in the applicability domain. Finally, since there are also NRPS that play an important role in natural products chemistry of fungi, such as peptaibols and cephalosporins, we added a predictor for fungal A-domains, which predicts gross physicochemical properties with an F-measure of 0.84. The service is available at http://nrps.informatik.uni-tuebingen.de/.
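The F-measures quoted in this abstract summarise precision and recall in a single number. The abstract does not say which variant was used; the sketch below assumes the common balanced form (F1, the harmonic mean of precision and recall), and the counts in the example are made up purely to show the arithmetic.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1) from confusion counts; 0.0 when undefined."""
    if true_pos == 0:
        # No correct predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up counts, not taken from the paper.
    print(round(f_measure(true_pos=80, false_pos=20, false_neg=20), 2))
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a predictor cannot reach a high F-measure by maximising recall at the expense of precision, or the reverse.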
Article type: Research Support, Non-U.S. Gov't
10. Kautsar SA, Blin K, Shaw S, Navarro-Muñoz JC, Terlouw BR, van der Hooft JJJ, van Santen JA, Tracanna V, Suarez Duran HG, Pascal Andreu V, Selem-Mojica N, Alanjary M, Robinson SL, Lund G, Epstein SC, Sisto AC, Charkoudian LK, Collemare J, Linington RG, Weber T, Medema MH. MIBiG 2.0: a repository for biosynthetic gene clusters of known function. Nucleic Acids Res 2020; 48:D454-D458. [PMID: 31612915 PMCID: PMC7145714 DOI: 10.1093/nar/gkz882] [Citation(s) in RCA: 301] [Impact Index Per Article: 60.2] [Received: 08/19/2019] [Revised: 09/25/2019] [Accepted: 10/01/2019] [Indexed: 11/18/2022]
Abstract
Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.
Article type: Research Support, U.S. Gov't, Non-P.H.S.
11. Jiang X, Ellabaan MMH, Charusanti P, Munck C, Blin K, Tong Y, Weber T, Sommer MOA, Lee SY. Dissemination of antibiotic resistance genes from antibiotic producers to pathogens. Nat Commun 2017; 8:15784. [PMID: 28589945 PMCID: PMC5467266 DOI: 10.1038/ncomms15784] [Citation(s) in RCA: 244] [Impact Index Per Article: 30.5] [Received: 09/14/2016] [Accepted: 04/27/2017] [Indexed: 12/25/2022]
Abstract
It has been hypothesized that some antibiotic resistance genes (ARGs) found in pathogenic bacteria derive from antibiotic-producing actinobacteria. Here we provide bioinformatic and experimental evidence supporting this hypothesis. We identify genes in proteobacteria, including some pathogens, that appear to be closely related to actinobacterial ARGs known to confer resistance against clinically important antibiotics. Furthermore, we identify two potential examples of recent horizontal transfer of actinobacterial ARGs to proteobacterial pathogens. Based on this bioinformatic evidence, we propose and experimentally test a 'carry-back' mechanism for the transfer, involving conjugative transfer of a carrier sequence from proteobacteria to actinobacteria, recombination of the carrier sequence with the actinobacterial ARG, followed by natural transformation of proteobacteria with the carrier-sandwiched ARG. Our results support the existence of ancient and, possibly, recent transfers of ARGs from antibiotic-producing actinobacteria to proteobacteria, and provide evidence for a defined mechanism.
Article type: research-article
12. Terlouw BR, Blin K, Navarro-Muñoz JC, Avalon NE, Chevrette MG, Egbert S, Lee S, Meijer D, Recchia MJ, Reitz Z, van Santen J, Selem-Mojica N, Tørring T, Zaroubi L, Alanjary M, Aleti G, Aguilar C, Al-Salihi SA, Augustijn H, Avelar-Rivas J, Avitia-Domínguez L, Barona-Gómez F, Bernaldo-Agüero J, Bielinski VA, Biermann F, Booth T, Carrion Bravo V, Castelo-Branco R, Chagas F, Cruz-Morales P, Du C, Duncan K, Gavriilidou A, Gayrard D, Gutiérrez-García K, Haslinger K, Helfrich EN, van der Hooft JJ, Jati A, Kalkreuter E, Kalyvas N, Kang K, Kautsar S, Kim W, Kunjapur A, Li YX, Lin GM, Loureiro C, Louwen JR, Louwen NL, Lund G, Parra J, Philmus B, Pourmohsenin B, Pronk LU, Rego A, Rex D, Robinson S, Rosas-Becerra L, Roxborough E, Schorn M, Scobie D, Singh K, Sokolova N, Tang X, Udwary D, Vigneshwari A, Vind K, Vromans SJM, Waschulin V, Williams S, Winter J, Witte T, Xie H, Yang D, Yu J, Zdouc M, Zhong Z, Collemare J, Linington R, Weber T, Medema M. MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Res 2022; 51:D603-D610. [PMID: 36399496 PMCID: PMC9825592 DOI: 10.1093/nar/gkac1049] [Citation(s) in RCA: 202] [Impact Index Per Article: 67.3] [Received: 09/15/2022] [Revised: 10/07/2022] [Accepted: 10/21/2022] [Indexed: 11/19/2022]
Abstract
With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.
Article type: research-article
13. Kautsar SA, Suarez Duran HG, Blin K, Osbourn A, Medema MH. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res 2017; 45:W55-W63. [PMID: 28453650 PMCID: PMC5570173 DOI: 10.1093/nar/gkx305] [Citation(s) in RCA: 185] [Impact Index Per Article: 30.8] [Received: 02/13/2017] [Accepted: 04/12/2017] [Indexed: 12/18/2022]
Abstract
Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied on 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery. The plantiSMASH web server, precalculated results and source code are freely available from http://plantismash.secondarymetabolites.org.
Article type: Research Support, Non-U.S. Gov't
14. Blin K, Medema MH, Kottmann R, Lee SY, Weber T. The antiSMASH database, a comprehensive database of microbial secondary metabolite biosynthetic gene clusters. Nucleic Acids Res 2016; 45:D555-D559. [PMID: 27924032 PMCID: PMC5210647 DOI: 10.1093/nar/gkw960] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Received: 08/15/2016] [Revised: 10/01/2016] [Accepted: 10/11/2016] [Indexed: 12/28/2022]
Abstract
Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the currently 3907 bacterial genomes in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/.
Article type: Research Support, Non-U.S. Gov't
15. Blin K, Pascal Andreu V, de Los Santos ELC, Del Carratore F, Lee SY, Medema MH, Weber T. The antiSMASH database version 2: a comprehensive resource on secondary metabolite biosynthetic gene clusters. Nucleic Acids Res 2019; 47:D625-D630. [PMID: 30395294 PMCID: PMC6324005 DOI: 10.1093/nar/gky1060] [Citation(s) in RCA: 143] [Impact Index Per Article: 28.6] [Received: 09/15/2018] [Accepted: 10/17/2018] [Indexed: 11/29/2022]
Abstract
Natural products originating from microorganisms are frequently used in antimicrobial and anticancer drugs, pesticides, herbicides or fungicides. In the last years, the increasing availability of microbial genome data has made it possible to access the wealth of biosynthetic clusters responsible for the production of these compounds by genome mining. antiSMASH is one of the most popular tools in this field. The antiSMASH database provides pre-computed antiSMASH results for many publicly available microbial genomes and allows for advanced cross-genome searches. The current version 2 of the antiSMASH database contains annotations for 6200 full bacterial genomes and 18,576 bacterial draft genomes and is available at https://antismash-db.secondarymetabolites.org/.
Article type: Research Support, Non-U.S. Gov't
16. Alanjary M, Kronmiller B, Adamek M, Blin K, Weber T, Huson D, Philmus B, Ziemert N. The Antibiotic Resistant Target Seeker (ARTS), an exploration engine for antibiotic cluster prioritization and novel drug target discovery. Nucleic Acids Res 2017; 45:W42-W48. [PMID: 28472505 PMCID: PMC5570205 DOI: 10.1093/nar/gkx360] [Citation(s) in RCA: 132] [Impact Index Per Article: 22.0] [Received: 01/29/2017] [Accepted: 04/20/2017] [Indexed: 11/12/2022]
Abstract
With the rise of multi-drug resistant pathogens and the decline in number of potential new antibiotics in development there is a fervent need to reinvigorate the natural products discovery pipeline. Most antibiotics are derived from secondary metabolites produced by microorganisms and plants. To avoid suicide, an antibiotic producer harbors resistance genes often found within the same biosynthetic gene cluster (BGC) responsible for manufacturing the antibiotic. Existing mining tools are excellent at detecting BGCs or resistant genes in general, but provide little help in prioritizing and identifying gene clusters for compounds active against specific and novel targets. Here we introduce the 'Antibiotic Resistant Target Seeker' (ARTS) available at https://arts.ziemertlab.com. ARTS allows for specific and efficient genome mining for antibiotics with interesting and novel targets. The aim of this web server is to automate the screening of large amounts of sequence data and to focus on the most promising strains that produce antibiotics with new modes of action. ARTS integrates target directed genome mining methods, antibiotic gene cluster predictions and 'essential gene screening' to provide an interactive page for rapid identification of known and putative targets in BGCs.
Article type: Research Support, Non-U.S. Gov't
17. Kautsar SA, Blin K, Shaw S, Weber T, Medema MH. BiG-FAM: the biosynthetic gene cluster families database. Nucleic Acids Res 2021; 49:D490-D497. [PMID: 33010170 PMCID: PMC7778980 DOI: 10.1093/nar/gkaa812] [Citation(s) in RCA: 118] [Impact Index Per Article: 29.5] [Received: 08/24/2020] [Revised: 09/11/2020] [Accepted: 09/15/2020] [Indexed: 02/07/2023]
Abstract
Computational analysis of biosynthetic gene clusters (BGCs) has revolutionized natural product discovery by enabling the rapid investigation of secondary metabolic potential within microbial genome sequences. Grouping homologous BGCs into Gene Cluster Families (GCFs) facilitates mapping their architectural and taxonomic diversity and provides insights into the novelty of putative BGCs, through dereplication with BGCs of known function. While multiple databases exist for exploring BGCs from publicly available data, no public resources exist that focus on GCF relationships. Here, we present BiG-FAM, a database of 29,955 GCFs capturing the global diversity of 1,225,071 BGCs predicted from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs). The database offers rich functionalities, such as multi-criterion GCF searches, direct links to BGC databases such as antiSMASH-DB, and rapid GCF annotation of user-supplied BGCs from antiSMASH results. BiG-FAM can be accessed online at https://bigfam.bioinformatics.nl.
|
research-article |
4 |
118 |
18
|
Blin K, Kim HU, Medema MH, Weber T. Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Brief Bioinform 2020; 20:1103-1113. [PMID: 29112695 PMCID: PMC6781578 DOI: 10.1093/bib/bbx146] [Citation(s) in RCA: 113] [Impact Index Per Article: 22.6] [Received: 05/30/2017] [Revised: 10/10/2017] [Indexed: 01/06/2023] Open
Abstract
Many drugs are derived from small molecules produced by microorganisms and plants, so-called natural products. Natural products have diverse chemical structures, but the biosynthetic pathways producing those compounds are often organized as biosynthetic gene clusters (BGCs) and follow a highly conserved biosynthetic logic. This allows for the identification of core biosynthetic enzymes using genome mining strategies that are based on the sequence similarity of the involved enzymes/genes. However, mining for a variety of BGCs quickly approaches a complexity level where manual analyses are no longer possible and require the use of automated genome mining pipelines, such as the antiSMASH software. In this review, we discuss the principles underlying the predictions of antiSMASH and other tools and provide practical advice for their application. Furthermore, we discuss important caveats such as rule-based BGC detection, sequence and annotation quality and cluster boundary prediction, which all have to be considered while planning for, performing and analyzing the results of genome mining studies.
|
Research Support, Non-U.S. Gov't |
5 |
113 |
19
|
Blin K, Dieterich C, Wurmus R, Rajewsky N, Landthaler M, Akalin A. DoRiNA 2.0--upgrading the doRiNA database of RNA interactions in post-transcriptional regulation. Nucleic Acids Res 2014; 43:D160-7. [PMID: 25416797 PMCID: PMC4383974 DOI: 10.1093/nar/gku1180] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Indexed: 01/12/2023] Open
Abstract
The expression of almost all genes in animals is subject to post-transcriptional regulation by RNA binding proteins (RBPs) and microRNAs (miRNAs). The interactions between both RBPs and miRNAs with mRNA can be mapped on a whole-transcriptome level using experimental and computational techniques established in the past years. The combined action of RBPs and miRNAs is thought to form a post-transcriptional regulatory code. Here we present doRiNA 2.0, available at http://dorina.mdc-berlin.de. In this highly improved new version, we have completely reworked the user interface and expanded the database to improve the usability of the website. Taking into account user feedback over the past years, the input forms for both the simple and the combinatorial search function have been streamlined and combined into a single web page that will also display the search results. Especially, custom uploads is one of the key new features in doRiNA 2.0. To enable the inclusion of doRiNA into third-party analysis pipelines, all operations are accessible via a REST API. Alternatively, local installations can be queried using a Python API. Both the web application and the APIs are available under an OSI-approved Open Source license that allows research and commercial access and re-use.
|
Research Support, Non-U.S. Gov't |
11 |
102 |
20
|
Blin K, Pedersen LE, Weber T, Lee SY. CRISPy-web: An online resource to design sgRNAs for CRISPR applications. Synth Syst Biotechnol 2016; 1:118-121. [PMID: 29062934 PMCID: PMC5640694 DOI: 10.1016/j.synbio.2016.01.003] [Citation(s) in RCA: 98] [Impact Index Per Article: 10.9] [Received: 10/17/2015] [Revised: 01/05/2016] [Accepted: 01/10/2016] [Indexed: 11/27/2022] Open
Abstract
CRISPR/Cas9-based genome editing has been one of the major achievements of molecular biology, allowing the targeted engineering of a wide range of genomes. The system originally evolved in prokaryotes as an adaptive immune system against bacteriophage infections. It now sees widespread application in genome engineering workflows, especially using the Streptococcus pyogenes endonuclease Cas9. To utilize Cas9, so-called single guide RNAs (sgRNAs) need to be designed for each target gene. While there are many tools available to design sgRNAs for the popular model organisms, only few tools that allow designing sgRNAs for non-model organisms exist. Here, we present CRISPy-web (http://crispy.secondarymetabolites.org/), an easy to use web tool based on CRISPy to design sgRNAs for any user-provided microbial genome. CRISPy-web allows researchers to interactively select a region of their genome of interest to scan for possible sgRNAs. After checks for potential off-target matches, the resulting sgRNA sequences are displayed graphically and can be exported to text files. All steps and information are accessible from a web browser without the requirement to install and use command line scripts.
|
Journal Article |
9 |
98 |
21
|
Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023; 22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Accepted: 07/20/2023] [Indexed: 09/13/2023]
Abstract
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.
|
Review |
2 |
96 |
22
|
Blin K, Shaw S, Kautsar SA, Medema MH, Weber T. The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes. Nucleic Acids Res 2021; 49:D639-D643. [PMID: 33152079 PMCID: PMC7779067 DOI: 10.1093/nar/gkaa978] [Citation(s) in RCA: 94] [Impact Index Per Article: 23.5] [Received: 09/15/2020] [Revised: 10/08/2020] [Accepted: 10/10/2020] [Indexed: 11/23/2022] Open
Abstract
Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task. Here, we present version 3 of the antiSMASH database, providing a means to access and query precomputed antiSMASH-5.2-detected biosynthetic gene clusters from representative, publicly available, high-quality microbial genomes via an interactive graphical user interface. In version 3, the database contains 147 517 high quality BGC regions from 388 archaeal, 25 236 bacterial and 177 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/.
|
Research Support, Non-U.S. Gov't |
4 |
94 |
23
|
Mungan MD, Alanjary M, Blin K, Weber T, Medema MH, Ziemert N. ARTS 2.0: feature updates and expansion of the Antibiotic Resistant Target Seeker for comparative genome mining. Nucleic Acids Res 2020; 48:W546-W552. [PMID: 32427317 PMCID: PMC7319560 DOI: 10.1093/nar/gkaa374] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 02/28/2020] [Revised: 04/19/2020] [Accepted: 04/29/2020] [Indexed: 01/21/2023] Open
Abstract
Multi-drug resistant pathogens have become a major threat to human health and new antibiotics are urgently needed. Most antibiotics are derived from secondary metabolites produced by bacteria. In order to avoid suicide, these bacteria usually encode resistance genes, in some cases within the biosynthetic gene cluster (BGC) of the respective antibiotic compound. Modern genome mining tools enable researchers to computationally detect and predict BGCs that encode the biosynthesis of secondary metabolites. The major challenge now is the prioritization of the most promising BGCs encoding antibiotics with novel modes of action. A recently developed target-directed genome mining approach allows researchers to predict the mode of action of the encoded compound of an uncharacterized BGC based on the presence of resistant target genes. In 2017, we introduced the ‘Antibiotic Resistant Target Seeker’ (ARTS). ARTS allows for specific and efficient genome mining for antibiotics with interesting and novel targets by rapidly linking housekeeping and known resistance genes to BGC proximity, duplication and horizontal gene transfer (HGT) events. Here, we present ARTS 2.0 available at http://arts.ziemertlab.com. ARTS 2.0 now includes options for automated target directed genome mining in all bacterial taxa as well as metagenomic data. Furthermore, it enables comparison of similar BGCs from different genomes and their putative resistance genes.
|
Research Support, Non-U.S. Gov't |
5 |
92 |
24
|
Perez-Riverol Y, Gatto L, Wang R, Sachsenberg T, Uszkoreit J, Leprevost FDV, Fufezan C, Ternent T, Eglen SJ, Katz DS, Pollard TJ, Konovalov A, Flight RM, Blin K, Vizcaíno JA. Ten Simple Rules for Taking Advantage of Git and GitHub. PLoS Comput Biol 2016; 12:e1004947. [PMID: 27415786 PMCID: PMC4945047 DOI: 10.1371/journal.pcbi.1004947] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Indexed: 11/19/2022] Open
|
Editorial |
9 |
82 |
25
|
Blin K, Kazempour D, Wohlleben W, Weber T. Improved lanthipeptide detection and prediction for antiSMASH. PLoS One 2014; 9:e89420. [PMID: 24586765 PMCID: PMC3930743 DOI: 10.1371/journal.pone.0089420] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 10/02/2013] [Accepted: 01/21/2014] [Indexed: 11/18/2022] Open
Abstract
Lanthipeptides are a class of ribosomally synthesised and post-translationally modified peptide (RiPP) natural products from the bacterial secondary metabolism. Their name is derived from the characteristic lanthionine or methyl-lanthionine residues contained in the processed peptide. Lanthipeptides that possess an antibacterial activity are called lantibiotics. Whereas multiple tools exist to identify lanthipeptide gene clusters from genomic data, no programs are available to predict the post-translational modifications of lanthipeptides, such as the proteolytic cleavage of the leader peptide part or tailoring modifications based on the analysis of the gene cluster sequence. antiSMASH is a software pipeline for the identification of secondary metabolite biosynthetic clusters from genomic input and the prediction of products produced by the identified clusters. Here we present a novel antiSMASH module using a rule-based approach to combine signature motifs for biosynthetic enzymes and lanthipeptide-specific cleavage site motifs to identify lanthipeptide clusters in genomic data, assign the specific lanthipeptide class, predict prepeptide cleavage, tailoring reactions, and the processed molecular weight of the mature peptide products.
|
Research Support, Non-U.S. Gov't |
11 |
37 |