1
Toukach P. Carbohydrate Structure Database: current state and recent developments. Anal Bioanal Chem 2024:10.1007/s00216-024-05383-w. [PMID: 38914734 DOI: 10.1007/s00216-024-05383-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/18/2024] [Accepted: 05/28/2024] [Indexed: 06/26/2024]
Abstract
Carbohydrate Structure Database (CSDB) is a curated glycan data collection and a glycoinformatic platform. This report reviews its database, analytical, and other components that have appeared in recent years. The major improvements were achieving close-to-full coverage of glycans from microorganisms and launching modules for glycosyltransferases and saccharide conformations, an online glycan builder and 3D modeler, an NMR simulator, an NMR-based structure predictor, and other tools.
Affiliation(s)
- Philip Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow, Russia.
- Faculty of Chemistry, National Research University Higher School of Economics, Moscow, Russia.
2
Toukach PV. Supplementing the Carbohydrate Structure Database with glycoepitopes. Glycobiology 2023; 33:528-531. [PMID: 37306951 DOI: 10.1093/glycob/cwad043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 05/10/2023] [Accepted: 05/27/2023] [Indexed: 06/13/2023]
Abstract
Carbohydrate structures in the Carbohydrate Structure Database have been referenced to glycoepitopes from the Immune Epitope Database allowing users to explore the glycan structures and contained epitopes. Starting with an epitope, one can figure out the glycans from other organisms that share the same structural determinant, and retrieve the associated taxonomical, medical, and other data. This database mapping demonstrates the advantages of the integration of immunological and glycomic databases.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Laboratory of carbohydrate chemistry and biocides, Leninsky pr. 47, Moscow 119991, Russia
3
Examining the diversity of structural motifs in fungal glycome. Comput Struct Biotechnol J 2022; 20:5466-5476. [PMID: 36249563 PMCID: PMC9535381 DOI: 10.1016/j.csbj.2022.09.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 09/26/2022] [Accepted: 09/26/2022] [Indexed: 11/22/2022]
Abstract
In this paper, we present the results of a systematic statistical analysis of the fungal glycome in comparison with the prokaryotic and protistal glycomes as described in the scientific literature and presented in the Carbohydrate Structure Database (CSDB). The monomeric and dimeric compositions of glycans, their non-carbohydrate modifications, glycosidic linkages, sizes of structures, branching degree and net charge are assessed. The obtained information can help elucidate carbohydrate molecular markers for various fungal classes, which, in turn, may be in demand for the development of diagnostic tools and carbohydrate-based vaccines against pathogenic fungi. It can also be useful for revealing specific glycosyltransferases active in a particular fungal species.
4
Toukach PV, Shirkovskaya AI. Carbohydrate Structure Database and Other Glycan Databases as an Important Element of Glycoinformatics. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2022. [DOI: 10.1134/s1068162022030190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
5
Toukach PV, Egorova KS. Source files of the Carbohydrate Structure Database: the way to sophisticated analysis of natural glycans. Sci Data 2022; 9:131. [PMID: 35354826 PMCID: PMC8968703 DOI: 10.1038/s41597-022-01186-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/21/2021] [Accepted: 02/03/2022] [Indexed: 11/18/2022]
Abstract
The Carbohydrate Structure Database (CSDB, http://csdb.glycoscience.ru/ ) is a free curated repository storing various data on glycans of bacterial, fungal and plant origins. Currently, it maintains a close-to-full coverage on bacterial and fungal carbohydrates up to the year 2020. The CSDB web-interface provides free access to the database content and dedicated tools. Still, the number of these tools and the types of the corresponding analyses is limited, whereas the database itself contains data that can be used in a broader scope of analytical studies. In this paper, we present CSDB source data files and a self-contained SQL dump, and exemplify their possible application in glycan-related studies. By using CSDB in an SQL format, the user can gain access to the chain length distribution or charge distribution (as an example) in a given set of glycans defined according to specific structural, taxonomic, or other parameters, whereas the source text dump files can be imported to any dedicated database with a specific internal architecture differing from that of CSDB.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, 119991, Russia.
- Ksenia S Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, 119991, Russia.
6
Scherbinina SI, Frank M, Toukach PV. Carbohydrate structure database (CSDB) oligosaccharide conformation tool. Glycobiology 2022; 32:460-468. [PMID: 35275211 DOI: 10.1093/glycob/cwac011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/11/2021] [Revised: 02/17/2022] [Accepted: 03/04/2022] [Indexed: 11/13/2022]
Abstract
Population analysis in terms of glycosidic torsion angles is frequently used to reveal preferred conformers of glycans. However, due to high structural diversity and flexibility of carbohydrates, conformational characterization of complex glycans can be a challenging task. Herein we present a conformation module of oligosaccharide fragments occurring in natural glycan structures developed on the platform of the Carbohydrate Structure Database (CSDB). Currently, this module deposits free energy surface and conformer abundance maps plotted as a function of glycosidic torsions for 194 inter-residue bonds. Data are automatically and continuously derived from explicit-solvent molecular dynamics (MD) simulations. The module was also supplemented with high-temperature MD data of saccharides (2403 maps) provided by GlycoMapsDB (hosted by GLYCOSCIENCES.de project). Conformational data defined by up to four torsional degrees of freedom can be freely explored using a web interface of the module available at http://csdb.glycoscience.ru/database/core/search_conf.html.
Affiliation(s)
- S I Scherbinina
- Higher Chemical College, D. Mendeleev University of Chemical Technology of Russia, Miusskaya Square 9, 125047 Moscow, Russia
- M Frank
- Biognos AB, Box 8963, 40274 Göteborg, Sweden
- P V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia
7
Dealing with the Ambiguity of Glycan Substructure Search. MOLECULES (BASEL, SWITZERLAND) 2021; 27:molecules27010065. [PMID: 35011294 PMCID: PMC8746581 DOI: 10.3390/molecules27010065] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 12/17/2021] [Accepted: 12/17/2021] [Indexed: 01/15/2023]
Abstract
The level of ambiguity in describing glycan structure has significantly increased with the upsurge of large-scale glycomics and glycoproteomics experiments. Consequently, an ontology-based model appears as an appropriate solution for navigating these data. However, navigation is not sufficient and the model should also enable advanced search and comparison. A new ontology with a tree logical structure is introduced to represent glycan structures irrespective of the precision of molecular details. The model heavily relies on the GlycoCT encoding of glycan structures. Its implementation in the GlySTreeM knowledge base was validated with GlyConnect data and benchmarked with the Glycowork library. GlySTreeM is shown to be fast, consistent, reliable and more flexible than existing solutions for matching parts of or whole glycan structures. The model is also well suited for painless future expansion.
8
Bochkov AY, Toukach PV. CSDB/SNFG Structure Editor: An Online Glycan Builder with 2D and 3D Structure Visualization. J Chem Inf Model 2021; 61:4940-4948. [PMID: 34595926 DOI: 10.1021/acs.jcim.1c00917] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
This article describes the features, usage, and applications of the CSDB/SNFG Structure Editor, a new online tool for quick and intuitive input of carbohydrate and derivative structures using Symbol Nomenclature for Glycans (SNFG). The Editor is built on the platform of the Carbohydrate Structure Database (CSDB) and relies on its online services via the dedicated web-API. The Editor allows building of oligo- and polymeric glycan structures and supports most features of natural glycans, such as underdetermined structures, alternative branches, repeating subunits, SMILES specification of atypical monomers, and others. The vocabulary of building blocks contains 600+ monomeric residues, including 327 monosaccharides. Support for SMILES allows input and visualization of chemical structures of virtually unlimited complexity. At the same time, the interface follows the recognized GlycanBuilder style, which is easy for novice users. The export feature includes support for CSDB Linear, GlycoCT, WURCS, SweetDB, and Glycam notations, SMILES codes, MOL/PDB atomic coordinate formats, raster and vector SNFG images, and on-the-fly visualization as 2D structural formulas and 3D molecular models. Integration of the Editor into any web-based glycoinformatics project is straightforward, as with any other modern JavaScript application.
Affiliation(s)
- Andrei Y Bochkov
- Laboratory of Carbohydrate Chemistry, Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia
- Philip V Toukach
- Laboratory of Carbohydrate Chemistry, Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia.
- Faculty of Chemistry, National Research University Higher School of Economics, Vavilova 7, 117312 Moscow, Russia
9
Probiotic Bacteria with High Alpha-Gal Content Protect Zebrafish against Mycobacteriosis. Pharmaceuticals (Basel) 2021; 14:ph14070635. [PMID: 34208966 PMCID: PMC8308674 DOI: 10.3390/ph14070635] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/26/2021] [Revised: 06/28/2021] [Accepted: 06/28/2021] [Indexed: 12/22/2022]
Abstract
Mycobacteriosis affects wild fish and aquaculture worldwide, and alternatives to antibiotics are needed for an effective and environmentally sound control of infectious diseases. Probiotics have shown beneficial effects on fish growth, nutrient metabolism, immune responses, disease prevention and control, and gut microbiota with higher water quality. However, the identification and characterization of the molecules and mechanisms associated with probiotics is a challenge that requires investigation. To address this challenge, herein we used the zebrafish model for the study of the efficacy and mechanisms of probiotic interventions against tuberculosis. First, bacteria from fish gut microbiota were identified with high content of the surface glycotope Galα1-3Galβ1-(3)4GlcNAc-R (α-Gal) that has been shown to induce protective immune responses. The results showed that probiotics of selected bacteria with high α-Gal content, namely Aeromonas veronii and Pseudomonas entomophila, were biosafe and effective for the control of Mycobacterium marinum. Protective mechanisms regulating immunity and metabolism activated in response to α-Gal and probiotics with high α-Gal content included modification of gut microbiota composition, B-cell maturation, anti-α-Gal antibodies-mediated control of mycobacteria, induced innate immune responses, beneficial effects on nutrient metabolism and reduced oxidative stress. These results support the potential of probiotics with high α-Gal content for the control of fish mycobacteriosis and suggested the possibility of exploring the development of combined probiotic treatments alone and in combination with α-Gal for the control of infectious diseases.
10
Egorova KS, Smirnova NS, Toukach PV. CSDB_GT, a curated glycosyltransferase database with close-to-full coverage on three most studied nonanimal species. Glycobiology 2020; 31:524-529. [PMID: 33242091 DOI: 10.1093/glycob/cwaa107] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/03/2020] [Revised: 11/13/2020] [Accepted: 11/18/2020] [Indexed: 11/13/2022]
Abstract
We report the accomplishment of the first stage of the development of a novel manually curated database on glycosyltransferase (GT) activities, CSDB_GT. CSDB_GT (http://csdb.glycoscience.ru/gt.html) has been supplemented with GT activities from Saccharomyces cerevisiae. Now it provides the close-to-complete coverage on experimentally confirmed GTs from the three most studied model organisms from the three kingdoms: plantae (Arabidopsis thaliana, ca. 930 activities), bacteria (Escherichia coli, ca. 820 activities) and fungi (S. cerevisiae, ca. 270 activities).
Affiliation(s)
- Ksenia S Egorova
- Laboratory of Metal-Complex and Nano-Scale Catalysts, N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia
- Nadezhda S Smirnova
- Kurnakov Institute of General and Inorganic Chemistry, Russian Academy of Sciences, Leninsky prospect 31, Moscow 119991, Russia
- Philip V Toukach
- Laboratory of Carbohydrate Chemistry, N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia
11
Comparison of Methods for Bulk Automated Simulation of Glycosidic Bond Conformations. Int J Mol Sci 2020; 21:ijms21207626. [PMID: 33076365 PMCID: PMC7589101 DOI: 10.3390/ijms21207626] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/26/2020] [Revised: 10/10/2020] [Accepted: 10/10/2020] [Indexed: 02/08/2023]
Abstract
Six empirical force fields were tested for applicability to calculations for automated carbohydrate database filling. They were probed on eleven disaccharide molecules containing representative structural features from widespread classes of carbohydrates. The accuracy of each method was queried by predictions of nuclear Overhauser effects (NOEs) from conformational ensembles obtained from 50 to 100 ns molecular dynamics (MD) trajectories and their comparison to the published experimental data. Using various ranking schemes, it was concluded that explicit solvent MM3 MD yielded non-inferior NOE accuracy with newer GLYCAM-06, and ultimately PBE0-D3/def2-TZVP (Triple-Zeta Valence Polarized) Density Functional Theory (DFT) simulations. For seven of eleven molecules, at least one empirical force field with explicit solvent outperformed DFT in NOE prediction. The aggregate of characteristics (accuracy, speed, and compatibility) made MM3 dynamics with explicit solvent at 300 K the most favorable method for bulk generation of disaccharide conformation maps for massive database filling.
12
David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform 2020; 12:56. [PMID: 33431035 PMCID: PMC7495975 DOI: 10.1186/s13321-020-00460-5] [Citation(s) in RCA: 165] [Impact Index Per Article: 41.3] [Received: 01/11/2020] [Accepted: 09/05/2020] [Indexed: 02/08/2023]
Abstract
The technological advances of the past century, marked by the computer revolution and the advent of high-throughput screening technologies in drug discovery, opened the path to the computational analysis and visualization of bioactive molecules. For this purpose, it became necessary to represent molecules in a syntax that would be readable by computers and understandable by scientists of various fields. A large number of chemical representations have been developed over the years, their numerosity being due to the fast development of computers and the complexity of producing a representation that encompasses all structural and chemical characteristics. We present here some of the most popular electronic molecular and macromolecular representations used in drug discovery, many of which are based on graph representations. Furthermore, we describe applications of these representations in AI-driven drug discovery. Our aim is to provide a brief guide on structural representations that are essential to the practice of AI in drug discovery. This review serves as a guide for researchers who have little experience with the handling of chemical representations and plan to work on applications at the interface of these fields.
Affiliation(s)
- Laurianne David
- Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, Astrazeneca Gothenburg, Sweden.
- Amol Thakkar
- Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, Astrazeneca Gothenburg, Sweden
- Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Rocío Mercado
- Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, Astrazeneca Gothenburg, Sweden
- Ola Engkvist
- Hit Discovery, Discovery Sciences, BioPharmaceuticals R&D, Astrazeneca Gothenburg, Sweden