1. Yamada I, Campbell MP, Edwards N, Castro LJ, Lisacek F, Mariethoz J, Ono T, Ranzinger R, Shinmachi D, Aoki-Kinoshita KF. Corrigendum to: The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology 2021; 32:909. [PMID: 34379754 PMCID: PMC9487897 DOI: 10.1093/glycob/cwab065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2020] [Revised: 12/31/2020] [Accepted: 01/01/2021] [Indexed: 12/01/2022] Open
2. Navelkar R, Owen G, Mutherkrishnan V, Thiessen P, Cheng T, Bolerlton E, Edwards N, Tiemeyer M, Campbell MP, Martin M, Vora J, Kahsay R, Mazumder R. Enhancing the interoperability of glycan data flow between ChEBI, PubChem, and GlyGen. Glycobiology 2021; 31:1510-1519. [PMID: 34314492 DOI: 10.1093/glycob/cwab078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/26/2021] [Revised: 07/02/2021] [Accepted: 07/18/2021] [Indexed: 11/13/2022] Open
Abstract
Glycans play a vital role in health, disease, bioenergy, biomaterials, and bio-therapeutics. As a result, there is keen interest in identifying and increasing glycan data in bioinformatics databases like ChEBI and PubChem, and in connecting them to resources at the EMBL-EBI and NCBI to facilitate access to important annotations at a global level. GlyTouCan is a comprehensive archival database that contains glycans obtained primarily through batch upload from glycan repositories, glycoprotein databases, and individual laboratories. In many instances, the glycan structures deposited in GlyTouCan may not be fully defined or may lack supporting experimental evidence and citations. Databases like ChEBI and PubChem were designed to accommodate complete atomistic structures with well-defined chemical linkages. As a result, they cannot easily accommodate the structural ambiguity inherent in glycan databases. Consequently, there is a need to organize glycan data more coherently to enhance connectivity across the major NCBI, EMBL-EBI, and glycoscience databases. This paper outlines a workflow developed in collaboration between GlyGen, ChEBI, and PubChem to improve the visibility and connectivity of glycan data across these resources. GlyGen hosts a subset of glycans (~29,000) from the GlyTouCan database, has submitted valuable glycan annotations to the PubChem database, and has integrated over 10,500 glycans (including ambiguously defined ones) into the ChEBI database. The integrated glycans were prioritized based on links to PubChem and connectivity to glycoprotein data. The pipeline provides a blueprint for how glycan data can be harmonized between different resources. The current PubChem, ChEBI, and GlyTouCan mappings can be downloaded from GlyGen (https://data.glygen.org).
3. Yamada I, Campbell MP, Edwards N, Castro LJ, Lisacek F, Mariethoz J, Ono T, Ranzinger R, Shinmachi D, Aoki-Kinoshita KF. The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology 2021; 31:741-750. [PMID: 33677548 PMCID: PMC8351504 DOI: 10.1093/glycob/cwab013] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/03/2020] [Revised: 12/31/2020] [Accepted: 01/01/2021] [Indexed: 01/19/2023] Open
Abstract
Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustainable increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).
4. Kahsay R, Vora J, Navelkar R, Mousavi R, Fochtman BC, Holmes X, Pattabiraman N, Ranzinger R, Mahadik R, Williamson T, Kulkarni S, Agarwal G, Martin M, Vasudev P, Garcia L, Edwards N, Zhang W, Natale DA, Ross K, Aoki-Kinoshita KF, Campbell MP, York WS, Mazumder R. GlyGen data model and processing workflow. Bioinformatics 2020; 36:3941-3943. [PMID: 32324859 PMCID: PMC7320628 DOI: 10.1093/bioinformatics/btaa238] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/29/2020] [Revised: 03/31/2020] [Accepted: 04/16/2020] [Indexed: 11/18/2022] Open
Abstract
Summary: Glycoinformatics plays a major role in glycobiology research, and the development of a comprehensive glycoinformatics knowledgebase is critical. This application note describes the GlyGen data model, processing workflow and the data access interfaces featuring programmatic use case example queries based on specific biological questions. The GlyGen project is a data integration, harmonization and dissemination project for carbohydrate and glycoconjugate-related data retrieved from multiple international data sources including UniProtKB, GlyTouCan, UniCarbKB and other key resources. Availability and implementation: The GlyGen web portal is freely available at https://glygen.org. The data portal, web services, SPARQL endpoint and GitHub repository are also freely available at https://data.glygen.org, https://api.glygen.org, https://sparql.glygen.org and https://github.com/glygener, respectively. All code is released under the GNU General Public License version 3 (GNU GPLv3) and is available on GitHub at https://github.com/glygener. The datasets are made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Supplementary information: Supplementary data are available at Bioinformatics online.
5. Kellman BP, Zhang Y, Logomasini E, Meinhardt E, Godinez-Macias KP, Chiang AWT, Sorrentino JT, Liang C, Bao B, Zhou Y, Akase S, Sogabe I, Kouka T, Winzeler EA, Wilson IBH, Campbell MP, Neelamegham S, Krambeck FJ, Aoki-Kinoshita KF, Lewis NE. A consensus-based and readable extension of Linear Code for Reaction Rules (LiCoRR). Beilstein J Org Chem 2020; 16:2645-2662. [PMID: 33178355 PMCID: PMC7607430 DOI: 10.3762/bjoc.16.215] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/01/2020] [Accepted: 09/17/2020] [Indexed: 12/18/2022] Open
Abstract
Systems glycobiology aims to provide models and analysis tools that account for the biosynthesis, regulation, and interactions with glycoconjugates. To facilitate these methods, there is a need for a clear glycan representation accessible to both computers and humans. Linear Code, a linearized and readily parsable glycan structure representation, is such a language. For this reason, Linear Code was adapted to represent reaction rules, but the syntax has drifted from its original description to accommodate new and originally unforeseen challenges. Here, we delineate the consensuses and inconsistencies that have arisen through this adaptation. We recommend options for a consensus-based extension of Linear Code that can be used for reaction rule specification going forward. Through this extension and specification of Linear Code to reaction rules, we aim to minimize inconsistent symbology thereby making glycan database queries easier. With a clear guide for generating reaction rule descriptions, glycan synthesis models will be more interoperable and reproducible thereby moving glycoinformatics closer to compliance with FAIR standards. Here, we present Linear Code for Reaction Rules (LiCoRR), version 1.0, an unambiguous representation for describing glycosylation reactions in both literature and code.
6. York WS, Mazumder R, Ranzinger R, Edwards N, Kahsay R, Aoki-Kinoshita KF, Campbell MP, Cummings RD, Feizi T, Martin M, Natale DA, Packer NH, Woods RJ, Agarwal G, Arpinar S, Bhat S, Blake J, Castro LJG, Fochtman B, Gildersleeve J, Goldman R, Holmes X, Jain V, Kulkarni S, Mahadik R, Mehta A, Mousavi R, Nakarakommula S, Navelkar R, Pattabiraman N, Pierce MJ, Ross K, Vasudev P, Vora J, Williamson T, Zhang W. GlyGen: Computational and Informatics Resources for Glycoscience. Glycobiology 2020; 30:72-73. [PMID: 31616925 DOI: 10.1093/glycob/cwz080] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Received: 09/09/2019] [Revised: 09/19/2019] [Accepted: 09/19/2019] [Indexed: 11/13/2022] Open
7. Taherzadeh G, Dehzangi A, Golchin M, Zhou Y, Campbell MP. SPRINT-Gly: predicting N- and O-linked glycosylation sites of human and mouse proteins by using sequence and predicted structural properties. Bioinformatics 2020; 35:4140-4146. [PMID: 30903686 DOI: 10.1093/bioinformatics/btz215] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 10/23/2018] [Revised: 03/03/2019] [Accepted: 03/21/2019] [Indexed: 12/19/2022] Open
Abstract
Motivation: Protein glycosylation is one of the most abundant post-translational modifications and plays an important role in immune responses, intercellular signaling, inflammation and host-pathogen interactions. However, due to the poor ionization efficiency and microheterogeneity of glycopeptides, identifying glycosylation sites is a challenging task, and there is a demand for computational methods. Here, we constructed the largest dataset of human and mouse glycosylation sites to train deep learning neural networks and support vector machine classifiers to predict N-/O-linked glycosylation sites, respectively. Results: The method, called SPRINT-Gly, achieved consistent results between ten-fold cross validation and independent test for predicting human and mouse glycosylation sites. For N-glycosylation, a mouse-trained model performs equally well in human glycoproteins and vice versa; however, due to significant differences in O-linked sites, separate models were generated. Overall, SPRINT-Gly is 18% and 50% higher in Matthews correlation coefficient than the next best method compared in N-linked and O-linked sites, respectively. This improved performance is due to the inclusion of novel structure and sequence-based features. Availability and implementation: http://sparks-lab.org/server/SPRINT-Gly/. Supplementary information: Supplementary data are available at Bioinformatics online.
8. Abrahams JL, Taherzadeh G, Jarvas G, Guttman A, Zhou Y, Campbell MP. Recent advances in glycoinformatic platforms for glycomics and glycoproteomics. Curr Opin Struct Biol 2020. [PMID: 31874386 DOI: 10.1016/j.sbi.2019.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2023]
Abstract
Protein glycosylation is the most complex and prevalent post-translation modification in terms of the number of proteins modified and the diversity generated. To understand the functional roles of glycoproteins it is important to gain an insight into the repertoire of oligosaccharides present. The comparison and relative quantitation of glycoforms combined with site-specific identification and occupancy are necessary steps in this direction. Computational platforms have continued to mature assisting researchers with the interpretation of such glycomics and glycoproteomics data sets, but frequently support dedicated workflows and users rely on the manual interpretation of data to gain insights into the glycoproteome. The growth of site-specific knowledge has also led to the implementation of machine-learning algorithms to predict glycosylation which is now being integrated into glycoproteomics pipelines. This short review describes commercial and open-access databases and software with an emphasis on those that are actively maintained and designed to support current analytical workflows.
9. Jarvas G, Szigeti M, Campbell MP, Guttman A. Expanding the capillary electrophoresis-based glucose unit database of the GUcal app. Glycobiology 2020; 30:362-364. [PMID: 31829415 DOI: 10.1093/glycob/cwz102] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/03/2019] [Revised: 11/22/2019] [Accepted: 12/03/2019] [Indexed: 02/02/2023] Open
Abstract
GUcal is a standalone application that automatically calculates the glucose unit (GU) values for separated N-glycan components of interest in an electropherogram and suggests their tentative structures by utilizing an internal database. We have expanded the original database of GUcal by integrating all publicly available capillary electrophoresis (CE) data in the GlycoStore collection (https://www.glycostore.org) together with in-house measured GU values. The GUcal app is freely available online (https://www.gucal.hu) and readily facilitates CE-based high throughput GU value determination for first line structural elucidation.
10. Abrahams JL, Taherzadeh G, Jarvas G, Guttman A, Zhou Y, Campbell MP. Recent advances in glycoinformatic platforms for glycomics and glycoproteomics. Curr Opin Struct Biol 2019; 62:56-69. [PMID: 31874386 DOI: 10.1016/j.sbi.2019.11.009] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 09/22/2019] [Revised: 11/05/2019] [Accepted: 11/15/2019] [Indexed: 12/16/2022]
11. Campbell MP, Abrahams JL, Rapp E, Struwe WB, Costello CE, Novotny M, Ranzinger R, York WS, Kolarich D, Rudd PM, Kettner C. The minimum information required for a glycomics experiment (MIRAGE) project: LC guidelines. Glycobiology 2019; 29:349-354. [PMID: 30778580 DOI: 10.1093/glycob/cwz009] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/09/2019] [Revised: 02/11/2019] [Accepted: 02/13/2019] [Indexed: 11/13/2022] Open
Abstract
The Minimum Information Required for a Glycomics Experiment (MIRAGE) is an initiative created by experts in the fields of glycobiology, glycoanalytics and glycoinformatics to design guidelines that improve the reporting and reproducibility of glycoanalytical methods. Previously, the MIRAGE Commission has published guidelines for describing sample preparation methods and the reporting of glycan array and mass spectrometry techniques and data collections. Here, we present the first version of guidelines that aim to improve the quality of the reporting of liquid chromatography (LC) glycan data in the scientific literature. These guidelines cover all aspects of instrument setup and modality of data handling and manipulation and are cross-linked with other MIRAGE recommendations. The most recent version of the MIRAGE-LC guidelines is freely available at the MIRAGE project website doi:10.3762/mirage.4.
12. Zhao S, Walsh I, Abrahams JL, Royle L, Nguyen-Khuong T, Spencer D, Fernandes DL, Packer NH, Rudd PM, Campbell MP. GlycoStore: a database of retention properties for glycan analysis. Bioinformatics 2019; 34:3231-3232. [PMID: 29897488 DOI: 10.1093/bioinformatics/bty319] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 10/10/2017] [Accepted: 04/19/2018] [Indexed: 11/12/2022] Open
Abstract
Summary: GlycoStore is a curated chromatographic, electrophoretic and mass-spectrometry composition database of N-, O-, glycosphingolipid (GSL) glycans and free oligosaccharides associated with a range of glycoproteins, glycolipids and biotherapeutics. The database is built on publicly available experimental datasets from GlycoBase developed in the Oxford Glycobiology Institute and then the National Institute for Bioprocessing Research and Training (NIBRT). It has now been extended to include recently published and in-house data collections from the Bioprocessing Technology Institute (BTI) A*STAR, Macquarie University and Ludger Ltd. GlycoStore provides access to approximately 850 unique glycan structure entries supported by over 8500 retention positions determined by: (i) hydrophilic interaction chromatography (HILIC) ultra-high performance liquid chromatography (U/HPLC) and reversed phase (RP)-U/HPLC with fluorescent detection; (ii) porous graphitized carbon (PGC) chromatography in combination with ESI-MS/MS detection; and (iii) capillary electrophoresis with laser induced fluorescence detection (CE-LIF). GlycoStore enhances many features previously available in GlycoBase while addressing the limitations of the data collections and model of this popular resource. GlycoStore aims to support detailed glycan analysis by providing a resource that underpins current workflows. It will be regularly updated by expert annotation of published data and data obtained from the project partners. Availability and implementation: http://www.glycostore.org. Supplementary information: Supplementary data are available at Bioinformatics online.
13. Tiemeyer M, Aoki K, Paulson J, Cummings RD, York WS, Karlsson NG, Lisacek F, Packer NH, Campbell MP, Aoki NP, Fujita A, Matsubara M, Shinmachi D, Tsuchiya S, Yamada I, Pierce M, Ranzinger R, Narimatsu H, Aoki-Kinoshita KF. GlyTouCan: an accessible glycan structure repository. Glycobiology 2017; 27:915-919. [PMID: 28922742 PMCID: PMC5881658 DOI: 10.1093/glycob/cwx066] [Citation(s) in RCA: 103] [Impact Index Per Article: 14.7] [Received: 06/02/2017] [Revised: 07/06/2017] [Accepted: 07/07/2017] [Indexed: 11/12/2022] Open
Abstract
Rapid and continued growth in the generation of glycomic data has revealed the need for enhanced development of basic infrastructure for presenting and interpreting these datasets in a manner that engages the broader biomedical research community. Early in their growth, the genomic and proteomic fields implemented mechanisms for assigning unique gene and protein identifiers that were essential for organizing data presentation and for enhancing bioinformatic approaches to extracting knowledge. Similar unique identifiers are currently absent from glycomic data. In order to facilitate continued growth and expanded accessibility of glycomic data, the authors strongly encourage the glycomics community to coordinate the submission of their glycan structures to the GlyTouCan Repository and to make use of GlyTouCan identifiers in their communications and publications. The authors also deeply encourage journals to recommend a submission workflow in which submitted publications utilize GlyTouCan identifiers as a standard reference for explicitly describing glycan structures cited in manuscripts.
14. Abrahams JL, Campbell MP, Packer NH. Building a PGC-LC-MS N-glycan retention library and elution mapping resource. Glycoconj J 2017; 35:15-29. [PMID: 28905148 DOI: 10.1007/s10719-017-9793-4] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 05/01/2017] [Revised: 08/10/2017] [Accepted: 08/18/2017] [Indexed: 11/27/2022]
Abstract
Porous graphitised carbon-liquid chromatography (PGC-LC) has been proven to be a powerful technique for the analysis and characterisation of complex mixtures of isomeric and isobaric glycan structures. Here we evaluate the elution behaviour of N-glycans on PGC-LC and thereby show the potential of using chromatographic separation properties, together with mass spectrometry (MS) fragmentation, to determine glycan structure assignments more easily. We used previously reported N-glycan structures released from the purified glycoproteins Immunoglobulin G (IgG), Immunoglobulin A (IgA), lactoferrin, α1-acid glycoprotein, Ribonuclease B (RNase B), fetuin and ovalbumin to profile their behaviour on capillary PGC-LC-MS. Over 100 glycan structures were determined by MS/MS and, together with targeted exoglycosidase digestions, were used to create an N-glycan PGC retention library covering a full spectrum of biologically significant N-glycans from pauci mannose to sialylated tetra-antennary classes. The resultant PGC retention library (http://www.glycostore.org/showPgc) incorporates retention times and supporting fragmentation spectra including exoglycosidase digestion products, and provides detailed knowledge on the elution properties of N-glycans by PGC-LC. Consequently, this platform should serve as a valuable resource for facilitating the detailed analysis of the glycosylation of both purified recombinant, and complex mixtures of, glycoproteins using established workflows.
15. Campbell MP. A Review of Software Applications and Databases for the Interpretation of Glycopeptide Data. Trends Glycosci Glyc 2017. [DOI: 10.4052/tigg.1601.1e] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/20/2022]
16. Campbell MP, Peterson RA, Gasteiger E, Mariethoz J, Lisacek F, Packer NH. Navigating the Glycome Space and Connecting the Glycoproteome. Methods Mol Biol 2017; 1558:139-158. [PMID: 28150237 DOI: 10.1007/978-1-4939-6783-4_7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
UniCarbKB (http://unicarbkb.org) is a comprehensive resource for mammalian glycoprotein and annotation data. In particular, the database provides information on the oligosaccharides characterized from a glycoprotein at either the global or site-specific level. This evidence is accumulated from a peer-reviewed and manually curated collection of information on oligosaccharides derived from membrane and secreted glycoproteins purified from biological fluids and/or tissues. This information is further supplemented with experimental method descriptions that summarize important sample preparation and analytical strategies. A new release of UniCarbKB is published every three months; each includes a collection of curated data and improvements to database functionality. In this Chapter, we outline the objectives of UniCarbKB, and describe a selection of step-by-step workflows for navigating the information available. We also provide a short description of web services available and future plans for improving data access. The information presented in this Chapter supplements content available in our knowledgebase including regular updates on interface improvements, new features, and revisions to the database content (http://confluence.unicarbkb.org).
17. Lisacek F, Mariethoz J, Alocci D, Rudd PM, Abrahams JL, Campbell MP, Packer NH, Ståhle J, Widmalm G, Mullen E, Adamczyk B, Rojas-Macias MA, Jin C, Karlsson NG. Databases and Associated Tools for Glycomics and Glycoproteomics. Methods Mol Biol 2017; 1503:235-264. [PMID: 27743371 DOI: 10.1007/978-1-4939-6493-2_18] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/03/2022]
Abstract
The access to biodatabases for glycomics and glycoproteomics has proven to be essential for current glycobiological research. This chapter presents available databases that are devoted to different aspects of glycobioinformatics. This includes oligosaccharide sequence databases, experimental databases, 3D structure databases (of both glycans and glycorelated proteins) and association of glycans with tissue, disease, and proteins. Specific search protocols are also provided using tools associated with experimental databases for converting primary glycoanalytical data to glycan structural information. In particular, researchers using glycoanalysis methods by U/HPLC (GlycoBase), MS (GlycoWorkbench, UniCarb-DB, GlycoDigest), and NMR (CASPER) will benefit from this chapter. In addition we also include information on how to utilize glycan structural information to query databases that associate glycans with proteins (UniCarbKB) and with interactions with pathogens (SugarBind).
18. Campbell MP, McConville JPV, McQuaid RGP, Prabhakaran D, Kumar A, Gregg JM. Hall effect in charged conducting ferroelectric domain walls. Nat Commun 2016; 7:13764. [PMID: 27941794 PMCID: PMC5159852 DOI: 10.1038/ncomms13764] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 05/19/2016] [Accepted: 10/31/2016] [Indexed: 11/17/2022] Open
Abstract
Enhanced conductivity at specific domain walls in ferroelectrics is now an established phenomenon. Surprisingly, however, little is known about the most fundamental aspects of conduction. Carrier types, densities and mobilities have not been determined and transport mechanisms are still a matter of guesswork. Here we demonstrate that intermittent-contact atomic force microscopy (AFM) can detect the Hall effect in conducting domain walls. Studying YbMnO3 single crystals, we have confirmed that p-type conduction occurs in tail-to-tail charged domain walls. By calibration of the AFM signal, an upper estimate of ∼1 × 10¹⁶ cm⁻³ is calculated for the mobile carrier density in the wall, around four orders of magnitude below that required for complete screening of the polar discontinuity. A carrier mobility of ∼50 cm² V⁻¹ s⁻¹ is calculated, about an order of magnitude below equivalent carrier mobilities in p-type silicon, but sufficiently high to preclude carrier-lattice coupling associated with small polarons.
Conduction in ferroelectric domain walls is now an established phenomenon, yet fundamental aspects of transport physics remain elusive. Here, Campbell et al. report the type, density and mobility of carriers in conducting domain walls in ytterbium manganite using nanoscale Hall effect measurements.
19. Liu Y, McBride R, Stoll M, Palma AS, Silva L, Agravat S, Aoki-Kinoshita KF, Campbell MP, Costello CE, Dell A, Haslam SM, Karlsson NG, Khoo KH, Kolarich D, Novotny MV, Packer NH, Ranzinger R, Rapp E, Rudd PM, Struwe WB, Tiemeyer M, Wells L, York WS, Zaia J, Kettner C, Paulson JC, Feizi T, Smith DF. The minimum information required for a glycomics experiment (MIRAGE) project: improving the standards for reporting glycan microarray-based data. Glycobiology 2016; 27:280-284. [PMID: 27993942 PMCID: PMC5444268 DOI: 10.1093/glycob/cww118] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 10/24/2016] [Revised: 11/14/2016] [Accepted: 11/21/2016] [Indexed: 11/12/2022] Open
Abstract
MIRAGE (Minimum Information Required for A Glycomics Experiment) is an initiative that was created by experts in the fields of glycobiology, glycoanalytics and glycoinformatics to produce guidelines for reporting results from the diverse types of experiments and analyses used in structural and functional studies of glycans in the scientific literature. As a sequel to the guidelines for sample preparation (Struwe et al. 2016, Glycobiology, 26:907–910) and mass spectrometry data (Kolarich et al. 2013, Mol. Cell Proteomics, 12:991–995), here we present the first version of guidelines intended to improve the standards for reporting data from glycan microarray analyses. For each of eight areas in the workflow of a glycan microarray experiment, we provide guidelines for the minimal information that should be provided in reporting results. We hope that the MIRAGE glycan microarray guidelines proposed here will gain broad acceptance by the community, and will facilitate interpretation and reproducibility of the glycan microarray results with implications in comparison of data from different laboratories and eventual deposition of glycan microarray data in international databases.
20. Struwe WB, Agravat S, Aoki-Kinoshita KF, Campbell MP, Costello CE, Dell A, Ten Feizi, Haslam SM, Karlsson NG, Khoo KH, Kolarich D, Liu Y, McBride R, Novotny MV, Packer NH, Paulson JC, Rapp E, Ranzinger R, Rudd PM, Smith DF, Tiemeyer M, Wells L, York WS, Zaia J, Kettner C. The minimum information required for a glycomics experiment (MIRAGE) project: sample preparation guidelines for reliable reporting of glycomics datasets. Glycobiology 2016; 26:907-910. [PMID: 27654115 DOI: 10.1093/glycob/cww082] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 11/06/2015] [Accepted: 04/22/2016] [Indexed: 11/13/2022] Open
Abstract
The minimum information required for a glycomics experiment (MIRAGE) project was established in 2011 to provide guidelines to aid in data reporting from all types of experiments in glycomics research including mass spectrometry (MS), liquid chromatography, glycan arrays, data handling and sample preparation. MIRAGE is a concerted effort of the wider glycomics community that considers the adaptation of reporting guidelines as an important step towards critical evaluation and dissemination of datasets as well as broadening of experimental techniques worldwide. The MIRAGE Commission published reporting guidelines for MS data and here we outline guidelines for sample preparation. The sample preparation guidelines include all aspects of sample generation, purification and modification from biological and/or synthetic carbohydrate material. The application of MIRAGE sample preparation guidelines will lead to improved recording of experimental protocols and reporting of understandable and reproducible glycomics datasets.
|
21
|
Akune Y, Lin CH, Abrahams JL, Zhang J, Packer NH, Aoki-Kinoshita KF, Campbell MP. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database. Carbohydr Res 2016; 431:56-63. [PMID: 27318307 DOI: 10.1016/j.carres.2016.05.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/22/2015] [Revised: 05/23/2016] [Accepted: 05/29/2016] [Indexed: 02/06/2023]
Abstract
Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.
|
22
|
Campbell MP, Packer NH. UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations. Biochim Biophys Acta Gen Subj 2016; 1860:1669-75. [PMID: 26940363 DOI: 10.1016/j.bbagen.2016.02.016] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/07/2016] [Revised: 02/23/2016] [Accepted: 02/24/2016] [Indexed: 10/22/2022]
Abstract
BACKGROUND UniCarbKB aims to provide a resource for the representation of mammalian glycobiology knowledge by providing a curated database of structural and experimental data, supported by a web application that allows users to easily find and view richly annotated information. The database comprises two levels of annotation: (i) global-specific data of oligosaccharides released and characterised from single purified glycoproteins and (ii) information pertaining to site-specific glycan heterogeneity. Additional contextual information is provided, including structural, bibliographic, and taxonomic information for each entry. METHODS Since the launch of UniCarbKB in 2012, we have continued to improve the organisation of our data model. Recently, we have extended our pipeline to collate structural and abundance changes of oligosaccharides in different human disease states and experimental models to extend our coverage of the human glycome. RESULTS In this manuscript, we demonstrate the capability of UniCarbKB to store and query relative glycan abundance data using a set of published colorectal and prostate cancer cell lines as examples. Furthermore, we outline our strategy for managing large-scale glycoproteomics data, site-specific and glycan compositional data, and how this information is adding value to UniCarbKB. Finally, we summarise our efforts to improve the efficient representation of disease terms and associated changes in glycan heterogeneity by integrating the Disease Ontology. CONCLUSIONS Updates and improvements to UniCarbKB have introduced unique features for storing and displaying glycosylation features of mammalian glycoproteins. The integration of site-specific glycosylation data obtained from large-scale glycoproteomics and introduction of cell line studies will improve the analysis of glycoproteins and entire glycomes. GENERAL SIGNIFICANCE Continuing advancements in analytical technologies and new data types are advancing disease-related glycomics. It is increasingly necessary to ensure all the data are comprehensively annotated. UniCarbKB was established with the mission of providing a resource for human glycobiology by capturing a wide range of data with corresponding annotations. This article is part of a Special Issue entitled "Glycans in personalised medicine" Guest Editor: Professor Gordan Lauc.
|
23
|
Mariethoz J, Khatib K, Alocci D, Campbell MP, Karlsson NG, Packer NH, Mullen EH, Lisacek F. SugarBindDB, a resource of glycan-mediated host-pathogen interactions. Nucleic Acids Res 2016; 44:D1243-50. [PMID: 26578555 PMCID: PMC4702881 DOI: 10.1093/nar/gkv1247] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 08/27/2015] [Revised: 10/22/2015] [Accepted: 10/31/2015] [Indexed: 12/16/2022] Open
Abstract
The SugarBind Database (SugarBindDB) covers knowledge of glycan binding of human pathogen lectins and adhesins. It is a curated database; each glycan-protein binding pair is associated with at least one published reference. The core data element of SugarBindDB is a set of three inseparable components: the pathogenic agent, a lectin/adhesin and a glycan ligand. Each entity (agent, lectin or ligand) is described by a range of properties that are summarized in an entity-dedicated page. Several search, navigation and visualisation tools are implemented to investigate the functional role of glycans in pathogen binding. The database is cross-linked to protein and glycan-related resources such as UniProtKB and UniCarbKB. It is tightly bound to the latter via a substructure search tool that maps each ligand to full structures where it occurs. Thus, a glycan-lectin binding pair of SugarBindDB can lead to the identification of a glycan-mediated protein-protein interaction, that is, a lectin-glycoprotein interaction, via substructure search and the knowledge of site-specific glycosylation stored in UniCarbKB. SugarBindDB is accessible at: http://sugarbind.expasy.org.
|
24
|
Alocci D, Mariethoz J, Horlacher O, Bolleman JT, Campbell MP, Lisacek F. Property Graph vs RDF Triple Store: A Comparison on Glycan Substructure Search. PLoS One 2015; 10:e0144578. [PMID: 26656740 PMCID: PMC4684231 DOI: 10.1371/journal.pone.0144578] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 07/16/2015] [Accepted: 11/22/2015] [Indexed: 11/18/2022] Open
Abstract
Resource description framework (RDF) and Property Graph databases are emerging technologies that are used for storing graph-structured data. We compare these technologies through a molecular biology use case: glycan substructure search. Glycans are branched tree-like molecules composed of building blocks linked together by chemical bonds. The molecular structure of a glycan can be encoded into a directed acyclic graph where each node represents a building block and each edge serves as a chemical linkage between two building blocks. In this context, Graph databases are possible software solutions for storing glycan structures, and Graph query languages, such as SPARQL and Cypher, can be used to perform a substructure search. Glycan substructure searching is an important feature for querying structure and experimental glycan databases and retrieving biologically meaningful data. This applies, for example, to identifying a region of the glycan recognised by a glycan binding protein (GBP). In this study, 19,404 glycan structures were selected from GlycomeDB (www.glycome-db.org) and modelled for storage in an RDF triple store and a Property Graph. We then performed two different sets of searches and compared the query response times and the results from both technologies to assess performance and accuracy. The two implementations produced the same results, but interestingly we noted a difference in the query response times. Qualitative measures such as portability were also used to define further criteria for choosing the technology adapted to solving glycan substructure search and other comparable issues.
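The graph encoding described in this abstract can be illustrated with a minimal in-memory sketch. It is not the SPARQL or Cypher implementation compared in the paper: the `Residue` class, the `contains` function and the linkage labels such as "b1-4" are hypothetical names chosen for illustration, and the glycan is assumed to be a rooted tree with the reducing-end residue as root.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class Residue:
    """One building block; each child edge carries a linkage label."""
    name: str
    children: List[Tuple[str, "Residue"]] = field(default_factory=list)


def _assign(p_edges, t_edges, used: FrozenSet[int]) -> bool:
    """Map every pattern edge to a distinct target edge (backtracking)."""
    if not p_edges:
        return True
    (p_link, p_child), rest = p_edges[0], p_edges[1:]
    for i, (t_link, t_child) in enumerate(t_edges):
        if i in used or p_link != t_link:
            continue
        if matches_at(p_child, t_child) and _assign(rest, t_edges, used | {i}):
            return True
    return False


def matches_at(pattern: Residue, target: Residue) -> bool:
    """True if the pattern, rooted here, is embedded in the target node."""
    if pattern.name != target.name:
        return False
    return _assign(pattern.children, target.children, frozenset())


def contains(pattern: Residue, target: Residue) -> bool:
    """True if the pattern occurs at any node of the target tree."""
    if matches_at(pattern, target):
        return True
    return any(contains(pattern, child) for _, child in target.children)


# Assumed example: the N-glycan core, rooted at the reducing-end GlcNAc.
trimannose = Residue("Man", [("a1-3", Residue("Man")), ("a1-6", Residue("Man"))])
core = Residue("GlcNAc", [("b1-4", Residue("GlcNAc", [("b1-4", Residue(
    "Man", [("a1-3", Residue("Man")), ("a1-6", Residue("Man"))]))]))])
absent = Residue("Man", [("a1-2", Residue("Man"))])

print(contains(trimannose, core))  # trimannosyl branch point is in the core
print(contains(absent, core))      # no a1-2 linked mannose in the core
```

A graph database expresses the same check declaratively as a query pattern over nodes and edges; the recursive version here only makes explicit what such a query has to match, namely residue identity, linkage label and a one-to-one assignment of branches.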
|
25
|
Struwe WB, Pagel K, Benesch JLP, Harvey DJ, Campbell MP. GlycoMob: an ion mobility-mass spectrometry collision cross section database for glycomics. Glycoconj J 2015; 33:399-404. [PMID: 26314736 DOI: 10.1007/s10719-015-9613-7] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 06/11/2015] [Revised: 07/21/2015] [Accepted: 07/27/2015] [Indexed: 12/29/2022]
Abstract
Ion mobility mass spectrometry (IM-MS) is a promising analytical technique for glycomics that separates glycan ions based on their collision cross section (CCS) and provides glycan precursor and fragment masses. It has been shown that isomeric oligosaccharide species can be separated by IM and identified on the basis of their CCS and fragmentation. These results indicate that adding CCS information for glycans and glycan fragments to searchable databases and analysis pipelines will increase identification confidence and accuracy. We have developed a freely accessible database, GlycoMob (http://www.glycomob.org), containing over 900 CCS values of glycans, oligosaccharide standards and their fragments that will be continually updated. We have measured the absolute CCSs of calibration standards, biologically derived and synthetic N-glycans ionized with various adducts in positive and negative mode or as protonated (positive ion) and deprotonated (negative ion) ions.
|