1
Taujale R, Uchimiya M, Clendinen CS, Borges RM, Turck CW, Edison AS. PyINETA: Open-source platform for INADEQUATE-JRES integration in NMR metabolomics. bioRxiv 2024:2024.07.10.601875. [PMID: 39026850 PMCID: PMC11257532 DOI: 10.1101/2024.07.10.601875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Annotating compounds with high confidence is a critical element in metabolomics. The 13C-detection NMR experiment INADEQUATE (incredible natural abundance double-quantum transfer experiment) stands out as a powerful tool for structural elucidation, yet this valuable experiment is not often included in metabolomics studies. This is partly due to the lack of a community platform that provides structural information based on INADEQUATE. Also, a single study often uses various NMR experiments synergistically to improve the quality of information or balance total NMR experiment time, but there is no public platform that can integrate the outputs of INADEQUATE and other NMR experiments. Here, we introduce PyINETA (Python-based INADEQUATE network analysis). PyINETA is an open-source platform that provides structural information of molecules using INADEQUATE, conducts database searches, and integrates information from INADEQUATE and a complementary NMR experiment, the 13C J-resolved experiment (13C-JRES). These steps are carried out automatically, and PyINETA keeps track of all the pipeline parameters and outputs, ensuring the transparency of annotation in metabolomics. Our evaluation of PyINETA using a model mouse study showed that our pipeline successfully integrated INADEQUATE and 13C-JRES. The results showed that 13C-labeled amino acids that were fed to mice were transferred to different tissues and were also transformed into other metabolites. The distribution of those compounds was tissue-specific, showing enrichment of particular metabolites in liver, spleen, pancreas, muscle, or lung. The value of PyINETA was not limited to those known compounds; PyINETA also provided fragment information for unknown compounds. PyINETA is available on NMRbox.
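The kind of structural information INADEQUATE provides can be illustrated with a small sketch. It relies on the general property that two directly bonded 13C nuclei appear at the same double-quantum frequency, and it assumes the double-quantum axis is referenced so that this frequency equals the sum of the two single-quantum shifts. The function name, tolerance, and peak values are hypothetical and are not taken from PyINETA, whose actual implementation is not described in the abstract.

```python
def find_bonded_pairs(peaks, tol=0.5):
    """Pair INADEQUATE peaks that plausibly come from directly bonded carbons.

    peaks: list of (single_quantum_ppm, double_quantum_ppm) tuples.
    tol:   matching tolerance in ppm (hypothetical default).
    """
    pairs = []
    for i, (sq1, dq1) in enumerate(peaks):
        for sq2, dq2 in peaks[i + 1:]:
            # Both peaks of a bonded pair share one double-quantum shift ...
            same_dq = abs(dq1 - dq2) <= tol
            # ... and that shift equals the sum of the single-quantum shifts
            # (assumption about axis referencing, see lead-in).
            sum_rule = abs((sq1 + sq2) - (dq1 + dq2) / 2) <= tol
            if same_dq and sum_rule:
                pairs.append((sq1, sq2))
    return pairs


# Hypothetical peak list: a three-carbon chain plus one unpaired peak.
peaks = [
    (17.0, 68.3),
    (51.3, 68.3),
    (51.3, 227.8),
    (176.5, 227.8),
    (30.0, 95.0),
]
bonded = find_bonded_pairs(peaks)
```

Chaining pairs that share a carbon (here 17.0 to 51.3 to 176.5 ppm) gives a carbon network that could then be compared against a database.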
Affiliation(s)
- Rahil Taujale
  - Institute of Bioinformatics, University of Georgia, 120 E Green St, Athens, GA, USA
  - Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd., Athens, GA 30602, USA
- Mario Uchimiya
  - Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd., Athens, GA 30602, USA
- Chaevien S. Clendinen
  - The Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99354 USA
- Ricardo M. Borges
  - Instituto de Pesquisas de Produtos Naturais, Universidade Federal do Rio de Janeiro, 21941-902, Rio de Janeiro, RJ, Brazil
- Christoph W. Turck
  - Max Planck Institute of Psychiatry, Proteomics and Biomarkers, Kraepelinstr. 2-10, 80804 Munich, Germany
  - Key Laboratory of Animal Models and Human Disease Mechanisms of Yunnan Province, and KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
  - National Resource Center for Non-human Primates, and National Research Facility for Phenotypic & Genetic Analysis of Model Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650107, China
- Arthur S. Edison
  - Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd., Athens, GA 30602, USA
  - Department of Biochemistry and Molecular Biology, University of Georgia, 120 E Green St, Athens, GA 30602, USA
2
Jiang S, Xu W, Xia Q, Yi M, Zhou Y, Shang J, Cheng X. Application of machine learning in the study of cobalt-based oxide catalysts for antibiotic degradation: An innovative reverse synthesis strategy. Journal of Hazardous Materials 2024; 471:134309. [PMID: 38653133 DOI: 10.1016/j.jhazmat.2024.134309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/07/2024] [Accepted: 04/13/2024] [Indexed: 04/25/2024]
Abstract
This study addresses antibiotic pollution in global water bodies by integrating machine learning and optimization algorithms to develop a novel reverse synthesis strategy for inorganic catalysts. We meticulously analyzed data from 96 studies, ensuring quality through preprocessing steps. Employing the AdaBoost model, we achieved 90.57% accuracy in classification and an R² value of 0.93 in regression, showcasing strong predictive power. A key innovation is the Sparrow Search Algorithm (SSA), which optimizes catalyst selection and experimental setup tailored to specific antibiotics. Empirical experiments validated SSA's efficacy, with degradation rates of 94% for Levofloxacin and 97% for Norfloxacin, aligning closely with predictions within a 2% margin of error. This research advances theoretical understanding and offers practical applications in material science and environmental engineering, significantly enhancing catalyst design efficiency and accuracy through the fusion of advanced machine learning techniques and optimization algorithms.
Affiliation(s)
- Siyuan Jiang
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Wen Xu
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Qi Xia
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Ming Yi
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Yuerong Zhou
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Jiangwei Shang
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
- Xiuwen Cheng
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, PR China
3
Sajed T, Sayeeda Z, Lee BL, Berjanskii M, Wang F, Gautam V, Wishart DS. Accurate Prediction of 1H NMR Chemical Shifts of Small Molecules Using Machine Learning. Metabolites 2024; 14:290. [PMID: 38786767 PMCID: PMC11123270 DOI: 10.3390/metabo14050290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 05/11/2024] [Accepted: 05/16/2024] [Indexed: 05/25/2024] Open
Abstract
NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, "solvent-aware" experimental dataset can be used to predict 1H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor) can accurately (mean absolute error of <0.10 ppm) predict 1H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced "prosper") has also been used to predict 1H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.
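The accuracy figure quoted above is a mean absolute error (MAE) between predicted and measured chemical shifts. A minimal sketch of that metric follows; the shift values are invented for illustration and are not PROSPRE output.

```python
def mean_absolute_error(predicted, observed):
    """Mean absolute difference between two equal-length lists of shifts (ppm)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical 1H chemical shifts in ppm for three protons of one compound.
predicted_ppm = [1.20, 3.50, 7.30]
observed_ppm = [1.25, 3.40, 7.30]
mae = mean_absolute_error(predicted_ppm, observed_ppm)
```

With these made-up values the MAE is about 0.05 ppm, which would fall under the <0.10 ppm threshold mentioned in the abstract.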
Affiliation(s)
- Tanvir Sajed
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Zinat Sayeeda
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Brian L. Lee
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Mark Berjanskii
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Fei Wang
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
- Vasuk Gautam
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- David S. Wishart
  - Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
  - Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada
4
Gouveia GJ, Head T, Cheng LL, Clendinen CS, Cort JR, Du X, Edison AS, Fleischer CC, Hoch J, Mercaldo N, Pathmasiri W, Raftery D, Schock TB, Sumner LW, Takis PG, Copié V, Eghbalnia HR, Powers R. Perspective: use and reuse of NMR-based metabolomics data: what works and what remains challenging. Metabolomics 2024; 20:41. [PMID: 38480600 DOI: 10.1007/s11306-024-02090-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 01/12/2024] [Indexed: 04/20/2024]
Abstract
BACKGROUND The National Cancer Institute issued a Request for Information (RFI; NOT-CA-23-007) in October 2022, soliciting input on using and reusing metabolomics data. This RFI aimed to gather input on best practices for metabolomics data storage, management, and use/reuse. AIM OF REVIEW The nuclear magnetic resonance (NMR) Interest Group within the Metabolomics Association of North America (MANA) prepared a set of recommendations regarding the deposition, archiving, use, and reuse of NMR-based and, to a lesser extent, mass spectrometry (MS)-based metabolomics datasets. These recommendations were built on the collective experiences of metabolomics researchers within MANA who are generating, handling, and analyzing diverse metabolomics datasets spanning experimental (sample handling and preparation, NMR/MS metabolomics data acquisition, processing, and spectral analyses) to computational (automation of spectral processing, univariate and multivariate statistical analysis, metabolite prediction and identification, multi-omics data integration, etc.) studies. KEY SCIENTIFIC CONCEPTS OF REVIEW We provide a synopsis of our collective view regarding the use and reuse of metabolomics data and articulate several recommendations regarding best practices, which are aimed at encouraging researchers to strengthen efforts toward maximizing the utility of metabolomics data, multi-omics data integration, and enhancing the overall scientific impact of metabolomics studies.
Affiliation(s)
- Goncalo Jorge Gouveia
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, University of Maryland, Gudelsky Drive, Rockville, MD, 20850, USA
- Thomas Head
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - University of British Columbia, Kelowna, BC, V1V 1V7, Canada
- Leo L Cheng
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Pathology and Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Chaevien S Clendinen
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Earth and Biological Sciences Directorate, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- John R Cort
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Earth and Biological Sciences Directorate, Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
- Xiuxia Du
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9291 University City Blvd, Charlotte, NC, 28223, USA
- Arthur S Edison
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Biochemistry, University of Georgia, Athens, GA, USA
- Candace C Fleischer
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jeffrey Hoch
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, 06030-3305, USA
- Nathaniel Mercaldo
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Wimal Pathmasiri
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Nutrition, School of Public Health, Nutrition Research Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Daniel Raftery
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Anesthesia and Pain Medicine, University of Washington, Seattle, WA, 98109, USA
- Tracey B Schock
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Chemical Sciences Division, National Institute of Standards and Technology (NIST), Charleston, SC, 29412, USA
- Lloyd W Sumner
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Biochemistry, MU Metabolomics Center, Bond Life Sciences Center, Interdisciplinary Plant Group, University of Missouri, Columbia, MO, 65211, USA
- Panteleimon G Takis
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Section of Bioanalytical Chemistry, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, SW7 2AZ, UK
  - Department of Metabolism, Digestion and Reproduction, National Phenome Centre, Imperial College London, London, W12 0NN, UK
- Valérie Copié
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT, 59717-3400, USA
- Hamid R Eghbalnia
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, 06030-3305, USA
- Robert Powers
  - Metabolomics Association of North America (MANA), NMR Special Interest Group, Edmonton, Canada
  - Department of Chemistry, Nebraska Center for Integrated Biomolecular Communication, University of Nebraska-Lincoln, 722 Hamilton Hall, Lincoln, NE, 68588-0304, USA
5
Villalba H, Llambrich M, Gumà J, Brezmes J, Cumeras R. A Metabolites Merging Strategy (MMS): Harmonization to Enable Studies' Intercomparison. Metabolites 2023; 13:1167. [PMID: 38132849 PMCID: PMC10744506 DOI: 10.3390/metabo13121167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023] Open
Abstract
Metabolomics encounters challenges in cross-study comparisons due to diverse metabolite nomenclature and reporting practices. To bridge this gap, we introduce the Metabolites Merging Strategy (MMS), offering a systematic framework to harmonize multiple metabolite datasets for enhanced interstudy comparability. MMS has three steps. Step 1: translation and merging of the different datasets by employing InChIKeys for data integration, encompassing the translation of metabolite names (if needed). Step 2: retrieval of attributes from the InChIKey, including descriptors of name (title name from PubChem and RefMet name from Metabolomics Workbench), chemical properties (molecular weight and molecular formula), both systematic (InChI, InChIKey, SMILES) and non-systematic identifiers (PubChem, ChEBI, HMDB, KEGG, LipidMaps, DrugBank, Bin ID and CAS number), and their ontology. Step 3: a meticulous three-step curation process to rectify disparities for conjugated base/acid compounds (optional step), missing attributes, and synonym checking (duplicated information). The MMS procedure is exemplified through a case study of urinary asthma metabolites, where MMS facilitated the identification of significant pathways hidden when no dataset merging strategy was followed. This study highlights the need for standardized and unified metabolite datasets to enhance the reproducibility and comparability of metabolomics studies.
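The idea behind Step 1, merging records on InChIKey rather than on free-text names, can be sketched as follows. This is an illustrative reduction and not the MMS implementation: the function, record layout, and study labels are hypothetical, and the InChIKeys shown should be checked against PubChem before reuse.

```python
from collections import defaultdict


def merge_by_inchikey(datasets):
    """Merge metabolite records from several studies, keyed on InChIKey.

    datasets: dict mapping a study label to a list of
              {"name": ..., "inchikey": ...} records (hypothetical layout).
    """
    merged = defaultdict(lambda: {"names": set(), "studies": set()})
    for study, records in datasets.items():
        for rec in records:
            # Normalise case and whitespace so the same key always matches.
            key = rec["inchikey"].strip().upper()
            merged[key]["names"].add(rec["name"])
            merged[key]["studies"].add(study)
    return dict(merged)


# Two hypothetical studies reporting the same compound under different names.
datasets = {
    "study_A": [
        {"name": "L-Alanine", "inchikey": "QNAYBMKLOCPYGJ-REOHSDBMSA-N"},
        {"name": "Citric acid", "inchikey": "KRKNYBCHXYNGOX-UHFFFAOYSA-N"},
    ],
    "study_B": [
        {"name": "Alanine", "inchikey": " qnaybmklocpygj-reohsdbmsa-n"},
    ],
}
merged = merge_by_inchikey(datasets)
```

Here "L-Alanine" and "Alanine" collapse into one entry seen in both studies, which is the kind of name disparity the later curation steps are meant to resolve.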
Affiliation(s)
- Héctor Villalba
  - Department of Oncology, Hospital Universitari Sant Joan de Reus, Institut d’Investigació Sanitària Pere Virgili (IISPV), CERCA, 43204 Reus, Spain
- Maria Llambrich
  - Department of Electrical Electronic Engineering and Automation, University of Rovira i Virgili (URV), 43007 Tarragona, Spain
  - Department of Nutrition and Metabolism, Institut d’Investigació Sanitària Pere Virgili (IISPV), CERCA, 43204 Reus, Spain
- Josep Gumà
  - Department of Oncology, Hospital Universitari Sant Joan de Reus, Institut d’Investigació Sanitària Pere Virgili (IISPV), CERCA, 43204 Reus, Spain
  - Department of Medicine and Surgery, University of Rovira i Virgili (URV), 43007 Tarragona, Spain
- Jesús Brezmes
  - Department of Electrical Electronic Engineering and Automation, University of Rovira i Virgili (URV), 43007 Tarragona, Spain
  - Department of Nutrition and Metabolism, Institut d’Investigació Sanitària Pere Virgili (IISPV), CERCA, 43204 Reus, Spain
- Raquel Cumeras
  - Department of Oncology, Hospital Universitari Sant Joan de Reus, Institut d’Investigació Sanitària Pere Virgili (IISPV), CERCA, 43204 Reus, Spain
  - Department of Electrical Electronic Engineering and Automation, University of Rovira i Virgili (URV), 43007 Tarragona, Spain
6
Uchimiya M, Olofsson M, Powers MA, Hopkinson BM, Moran MA, Edison AS. 13C NMR metabolomics: J-resolved STOCSY meets INADEQUATE. Journal of Magnetic Resonance (San Diego, Calif. : 1997) 2023; 347:107365. [PMID: 36634594 DOI: 10.1016/j.jmr.2022.107365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 12/20/2022] [Accepted: 12/28/2022] [Indexed: 06/17/2023]
Abstract
Robust annotation of metabolites is a challenging task in metabolomics. Among available applications, the 13C NMR experiment INADEQUATE determines direct 13C-13C connectivity unambiguously, offering indispensable information on molecular structure. Despite its great utility, it is not always practical to collect INADEQUATE data on every sample in a large metabolomics study because of its relatively long experiment time. Here, we propose an alternative approach that maintains the quality of information but saves experiment time. In this approach, individual samples in a study are first screened by the 13C homonuclear J-resolved experiment (JRES). Next, JRES data are processed by statistical total correlation spectroscopy (STOCSY) to extract peaks that behave similarly among samples. Finally, INADEQUATE is collected on one internal pooled sample to select STOCSY peaks that originate from the same compound. We tested this concept using the 13C-labeled endometabolome of a model marine diatom strain incubated under various settings, intending to cover a range of metabolites produced under different external conditions. This scheme was able to extract the known diatom metabolites proline, 2,3-dihydroxypropane-1-sulfonate (DHPS), β-1,3-glucan, choline, and glutamate. This pipeline also detected unknown compounds with structural information, which is valuable in metabolomics where a priori knowledge of metabolites is not always available. The ability of this scheme was seen even in sugar regions, which are usually challenging in 1H NMR due to severe peak overlap. JRES and INADEQUATE were highly complementary; INADEQUATE provided directly-bonded 13C networks, whereas JRES linked INADEQUATE networks within the same compound but broken by nitrogen or sulfur atoms, highlighting the advantage of this integrated approach.
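The STOCSY step described above amounts to correlating peak intensities across samples and keeping the peaks that co-vary with a chosen driver peak. The sketch below uses a plain Pearson correlation; the peak labels, intensities, and the 0.9 threshold are hypothetical and do not reproduce the authors' processing.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def stocsy(intensities, driver, threshold=0.9):
    """Return peaks whose intensity across samples correlates with the driver.

    intensities: dict mapping a peak label to its intensity in each sample.
    threshold:   minimum correlation to report (hypothetical default).
    """
    reference = intensities[driver]
    return sorted(
        peak
        for peak, values in intensities.items()
        if peak != driver and pearson(reference, values) >= threshold
    )


# Hypothetical JRES peak intensities measured in four samples.
intensities = {
    "peak_a": [1.0, 2.0, 3.0, 4.0],
    "peak_b": [2.1, 3.9, 6.2, 8.0],  # rises with peak_a
    "peak_c": [4.0, 1.0, 3.0, 2.0],  # unrelated trend
}
correlated = stocsy(intensities, "peak_a")
```

In the scheme from the abstract, peaks grouped this way would still need the INADEQUATE data from the pooled sample to confirm that they belong to one compound.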
Affiliation(s)
- Mario Uchimiya
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602, USA
- Malin Olofsson
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Sweden
  - Department of Marine Sciences, University of Georgia, Athens, GA 30602, USA
- McKenzie A Powers
  - Department of Marine Sciences, University of Georgia, Athens, GA 30602, USA
- Brian M Hopkinson
  - Department of Marine Sciences, University of Georgia, Athens, GA 30602, USA
- Mary Ann Moran
  - Department of Marine Sciences, University of Georgia, Athens, GA 30602, USA
- Arthur S Edison
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602, USA
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602, USA
7
Hoch JC, Baskaran K, Burr H, Chin J, Eghbalnia H, Fujiwara T, Gryk M, Iwata T, Kojima C, Kurisu G, Maziuk D, Miyanoiri Y, Wedell J, Wilburn C, Yao H, Yokochi M. Biological Magnetic Resonance Data Bank. Nucleic Acids Res 2023; 51:D368-D376. [PMID: 36478084 PMCID: PMC9825541 DOI: 10.1093/nar/gkac1050] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Received: 10/04/2022] [Revised: 10/20/2022] [Accepted: 10/23/2022] [Indexed: 12/12/2022] Open
Abstract
The Biological Magnetic Resonance Data Bank (BMRB, https://bmrb.io) is the international open data repository for biomolecular nuclear magnetic resonance (NMR) data. Comprised of both empirical and derived data, BMRB has applications in the study of biomacromolecular structure and dynamics, biomolecular interactions, drug discovery, intrinsically disordered proteins, natural products, biomarkers, and metabolomics. Advances including GHz-class NMR instruments, national and trans-national NMR cyberinfrastructure, hybrid structural biology methods and machine learning are driving increases in the amount, type, and applications of NMR data in the biosciences. BMRB is a Core Archive and member of the World-wide Protein Data Bank (wwPDB).
Affiliation(s)
- Jeffrey C Hoch
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Kumaran Baskaran
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Harrison Burr
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- John Chin
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Hamid R Eghbalnia
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Toshimichi Fujiwara
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Michael R Gryk
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Takeshi Iwata
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Chojiro Kojima
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
  - Graduate School of Engineering Science, Yokohama National University, Yokohama 240-8501, Japan
- Genji Kurisu
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Dmitri Maziuk
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Yohei Miyanoiri
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Jonathan R Wedell
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Colin Wilburn
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Hongyang Yao
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Masashi Yokochi
  - Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
8
Dou J, Ilina P, Hemming J, Malinen K, Mäkkylä H, Oliveira de Farias N, Tammela P, de Aragão Umbuzeiro G, Räisänen R, Vuorinen T. Effect of Hybrid Type and Harvesting Season on Phytochemistry and Antibacterial Activity of Extracted Metabolites from Salix Bark. Journal of Agricultural and Food Chemistry 2022; 70:2948-2956. [PMID: 35200036 PMCID: PMC8915259 DOI: 10.1021/acs.jafc.1c08161] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/21/2021] [Revised: 02/09/2022] [Accepted: 02/10/2022] [Indexed: 06/14/2023]
Abstract
Hundreds of different fast-growing Salix hybrids have been developed mainly for energy crops. In this paper, we studied water extracts from the bark of 15 willow hybrids and species as potential antimicrobial additives. Treatment of ground bark in water under mild conditions extracted 12-25% of the dry material. Preparative high-performance liquid chromatography is proven here as a fast and highly efficient tool in the small-scale recovery of raffinose from Salix bark crude extracts for structural elucidation. Less than half of the dissolved material was assigned by chromatographic (gas chromatography and liquid chromatography) and spectroscopic (mass spectrometry and nuclear magnetic resonance spectroscopy) techniques for low-molecular-weight compounds, including mono- and oligosaccharides (sucrose, raffinose, and stachyose) and aromatic phytochemicals (triandrin, catechin, salicin, and picein). The composition of the extracts varied greatly depending on the hybrid or species and the harvesting season. This information generated new scientific knowledge on the variation in the content and composition of the extracts between Salix hybrids and harvesting season depending on the desired molecule. The extracts showed high antibacterial activity on Staphylococcus aureus with a minimal inhibitory concentration (MIC) of 0.6-0.8 mg/mL; however, no inhibition was observed against Escherichia coli, Enterococcus faecalis, and Salmonella typhimurium. MIC of triandrin (i.e., 1.25 mg/mL) is reported for the first time. Although antibacterial triandrin and (+)-catechin were present in extracts, clear correlation between the antibacterial effect and the chemical composition was not established, which indicates that antibacterial activity of the extracts mainly originates from some not yet elucidated substances. Aquatic toxicity and mutagenicity assessments showed the safe usage of Salix water extracts as possible antibacterial additives.
Affiliation(s)
- Jinze Dou
  - Department of Bioproducts and Biosystems, School of Chemical Engineering, Aalto University, Espoo 02150, Finland
- Polina Ilina
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00014, Finland
- Jarl Hemming
  - Johan Gadolin Process Chemistry Centre, c/o Laboratory of Natural Materials Technology, Åbo Akademi University, Turku 20500, Finland
- Kiia Malinen
  - Department of Bioproducts and Biosystems, School of Chemical Engineering, Aalto University, Espoo 02150, Finland
- Heidi Mäkkylä
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00014, Finland
- Natália Oliveira de Farias
  - Laboratory of Ecotoxicology and Genotoxicity (LAEG), School of Technology, University of Campinas, Campinas 13083-970, Brazil
- Päivi Tammela
  - Drug Research Program, Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, Helsinki 00014, Finland
- Gisela de Aragão Umbuzeiro
  - Laboratory of Ecotoxicology and Genotoxicity (LAEG), School of Technology, University of Campinas, Campinas 13083-970, Brazil
- Riikka Räisänen
  - HELSUS Helsinki Institute of Sustainability Science, Craft Studies, University of Helsinki, Helsinki 00014, Finland
- Tapani Vuorinen
  - Department of Bioproducts and Biosystems, School of Chemical Engineering, Aalto University, Espoo 02150, Finland
9
Tan H, Reed S. Metabolovigilance: Associating Drug Metabolites with Adverse Drug Reactions. Mol Inform 2022; 41:e2100261. [PMID: 34994061 DOI: 10.1002/minf.202100261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Accepted: 01/03/2022] [Indexed: 11/05/2022]
Abstract
The Metabolovigilance database (https://pharmacogenomics.clas.ucdenver.edu/pharmacogenomics/side-effect/) is a single repository of information on over 15,920 pharmaceuticals and the compounds expected to result from metabolism of these drugs. Metabolovigilance functions as both a web server, providing data directly to users and as a web application, applying user inputs to create logic statements that curate the data presented or downloaded. Using this tool, it is easy to collect information on drugs, their side effects, and the metabolites associated with specific side effects. Information on these compounds can be sorted based on physical properties of the drugs and their metabolites. All of this information can be viewed, sorted, and downloaded for use in other applications. This open-access tool will facilitate molecular studies on the causes of adverse drug reactions and is well suited to integrate with genomic data furthering the goals of personalized medicine.
Affiliation(s)
- Henry Tan
  - University of Colorado Denver, United States
- Scott Reed
  - University of Colorado Denver, United States
10
|
Yones SA, Csombordi R, Komorowski J, Diamanti K. MetaFetcheR: An R Package for Complete Mapping of Small-Compound Data. Metabolites 2021; 11:metabo11110743. [PMID: 34822401 PMCID: PMC8620779 DOI: 10.3390/metabo11110743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Revised: 10/22/2021] [Accepted: 10/27/2021] [Indexed: 11/20/2022]
Abstract
Small-compound databases contain a large amount of information for metabolites and metabolic pathways. However, the plethora of such databases and the redundancy of their information lead to major issues with analysis and standardization. A lack of preventive establishment of means of data access at the infant stages of a project might lead to mislabelled compounds, reduced statistical power, and large delays in delivery of results. We developed MetaFetcheR, an open-source R package that links metabolite data from several small-compound databases, resolves inconsistencies, and covers a variety of use-cases of data fetching. We showed that the performance of MetaFetcheR was superior to existing approaches and databases by benchmarking the performance of the algorithm in three independent case studies based on two published datasets.
Affiliation(s)
- Sara A. Yones
- Department of Cellular and Molecular Biology, Uppsala University, 751 24 Uppsala, Sweden
- Correspondence: (S.A.Y.); (K.D.); Tel.: +46-76-592-2512 (S.A.Y.); +46-73-926-7648 (K.D.)
- Rajmund Csombordi
- Department of Cellular and Molecular Biology, Uppsala University, 751 24 Uppsala, Sweden
- Jan Komorowski
- Department of Cellular and Molecular Biology, Uppsala University, 751 24 Uppsala, Sweden
- Institute of Computer Science, Polish Academy of Sciences, 01-248 Warsaw, Poland
- Washington National Primate Research Center, Seattle, WA 98121, USA
- Swedish Collegium for Advanced Study, 752 38 Uppsala, Sweden
- Klev Diamanti
- Department of Cellular and Molecular Biology, Uppsala University, 751 24 Uppsala, Sweden
- Department of Immunology, Genetics and Pathology, Uppsala University, 751 85 Uppsala, Sweden
- Correspondence: (S.A.Y.); (K.D.); Tel.: +46-76-592-2512 (S.A.Y.); +46-73-926-7648 (K.D.)
11
Yu L, Su Y, Liu Y, Zeng X. Review of unsupervised pretraining strategies for molecules representation. Brief Funct Genomics 2021; 20:323-332. [PMID: 34342611 DOI: 10.1093/bfgp/elab036] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/15/2021] [Revised: 07/07/2021] [Accepted: 07/08/2021] [Indexed: 11/14/2022]
Abstract
In recent years, computer-assisted techniques have made great progress in the field of drug discovery. Yet the problem of limited labeled data remains challenging and restricts the performance of these techniques in specific tasks, such as molecular property prediction, compound-protein interaction and de novo molecular generation. One effective solution is to utilize the experience and knowledge gained from other tasks to cope with related pursuits. Unsupervised pretraining is promising due to its capability of leveraging a vast number of unlabeled molecules and acquiring a more informative molecular representation for the downstream tasks. In particular, models trained on large-scale unlabeled molecules can capture generalizable features, and this ability can be employed to improve the performance of specific downstream tasks. Many relevant pretraining works have been recently proposed. Here, we provide an overview of molecular unsupervised pretraining and related applications in drug discovery. Challenges and possible solutions are also summarized.
12
Abstract
Chemical graph generators are software packages to generate computer representations of chemical structures adhering to certain boundary conditions. Their development is a research topic of cheminformatics. Chemical graph generators are used in areas such as virtual library generation in drug design, in molecular design with specified properties, called inverse QSAR/QSPR, as well as in organic synthesis design, retrosynthesis or in systems for computer-assisted structure elucidation (CASE). CASE systems again have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.
Affiliation(s)
- Mehmet Aziz Yirik
- Friedrich Schiller Universität Jena, Institute for Inorganic and Analytical Chemistry, Jena, Germany
- Christoph Steinbeck
- Friedrich Schiller Universität Jena, Institute for Inorganic and Analytical Chemistry, Jena, Germany
13
Chen CY, Lee W, Renhowe PA, Jung J, Montfort WR. Solution structures of the Shewanella woodyi H-NOX protein in the presence and absence of soluble guanylyl cyclase stimulator IWP-051. Protein Sci 2020; 30:448-463. [PMID: 33236796 DOI: 10.1002/pro.4005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/22/2020] [Revised: 11/05/2020] [Accepted: 11/24/2020] [Indexed: 12/14/2022]
Abstract
Heme-nitric oxide/oxygen binding (H-NOX) domains bind gaseous ligands for signal transduction in organisms spanning prokaryotic and eukaryotic kingdoms. In the bioluminescent marine bacterium Shewanella woodyi (Sw), H-NOX proteins regulate quorum sensing and biofilm formation. In higher animals, soluble guanylyl cyclase (sGC) binds nitric oxide with an H-NOX domain to induce cyclase activity and regulate vascular tone, wound healing and memory formation. sGC also binds stimulator compounds targeting cardiovascular disease. The molecular details of stimulator binding to sGC remain obscure but involve a binding pocket near an interface between H-NOX and coiled-coil domains. Here, we report the full NMR structure for CO-ligated Sw H-NOX in the presence and absence of stimulator compound IWP-051, and its backbone dynamics. Nonplanar heme geometry was retained using a semi-empirical quantum potential energy approach. Although IWP-051 binding is weak, a single binding conformation was found at the interface of the two H-NOX subdomains, near but not overlapping with sites identified in sGC. Binding leads to rotation of the subdomains and closure of the binding pocket. Backbone dynamics are similar across both domains except for two helix-connecting loops, which display increased dynamics that are further enhanced by compound binding. Structure-based sequence analyses indicate high sequence diversity in the binding pocket, but the pocket itself appears conserved among H-NOX proteins. The largest dynamical loop lies at the interface between Sw H-NOX and its binding partner as well as in the interface with the coiled coil in sGC, suggesting a critical role for the loop in signal transduction.
Affiliation(s)
- Cheng-Yu Chen
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona, USA
- Woonghee Lee
- National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, Madison, Wisconsin, USA; Department of Chemistry, University of Colorado Denver, Denver, Colorado, USA
- Joon Jung
- Cyclerion Therapeutics, Cambridge, Massachusetts, USA
- William R Montfort
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona, USA
14
Nichols FC, Clark RB, Maciejewski MW, Provatas AA, Balsbaugh JL, Dewhirst FE, Smith MB, Rahmlow A. A novel phosphoglycerol serine-glycine lipodipeptide of Porphyromonas gingivalis is a TLR2 ligand. J Lipid Res 2020; 61:1645-1657. [PMID: 32912852 PMCID: PMC7707167 DOI: 10.1194/jlr.ra120000951] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
Porphyromonas gingivalis is a Gram-negative anaerobic periodontal microorganism strongly associated with tissue-destructive processes in human periodontitis. Following oral infection with P. gingivalis, the periodontal bone loss in mice is reported to require the engagement of Toll-like receptor 2 (TLR2). Serine-glycine lipodipeptide or glycine aminolipid classes of P. gingivalis engage human and mouse TLR2, but a novel lipid class reported here is considerably more potent in engaging TLR2 and the heterodimer receptor TLR2/TLR6. The novel lipid class, termed Lipid 1256, consists of a diacylated phosphoglycerol moiety linked to a serine-glycine lipodipeptide previously termed Lipid 654. Lipid 1256 is approximately 50-fold more potent in engaging TLR2 than the previously reported serine-glycine lipid classes. Lipid 1256 also stimulates cytokine secretory responses from peripheral blood monocytes and is recovered in selected oral and intestinal Bacteroidetes organisms. Therefore, these findings suggest that Lipid 1256 may be a microbial TLR2 ligand relevant to chronic periodontitis in humans.
Affiliation(s)
- Frank C Nichols
- Department of Oral Health and Diagnostic Sciences, University of Connecticut School of Dental Medicine, Farmington, CT, USA
- Robert B Clark
- Department of Immunology, University of Connecticut School of Medicine, Farmington, CT, USA; Department of Medicine, University of Connecticut School of Medicine, Farmington, CT, USA
- Mark W Maciejewski
- Department of Molecular Biology and Biophysics, University of Connecticut School of Medicine, Farmington, CT, USA
- Anthony A Provatas
- Center for Environmental Sciences and Engineering, University of Connecticut, Storrs, CT, USA
- Jeremy L Balsbaugh
- Center for Open Research Resources and Equipment, University of Connecticut, Storrs, CT, USA
- Floyd E Dewhirst
- Department of Microbiology, The Forsyth Institute, Cambridge, MA, USA; Department of Oral Medicine, Harvard School of Dental Medicine, Boston, MA, USA
- Michael B Smith
- Department of Chemistry, University of Connecticut, Storrs, CT, USA
- Amanda Rahmlow
- Department of Oral Health and Diagnostic Sciences, University of Connecticut School of Dental Medicine, Farmington, CT, USA
15
Atom Identifiers Generated by a Neighborhood-Specific Graph Coloring Method Enable Compound Harmonization across Metabolic Databases. Metabolites 2020; 10:metabo10090368. [PMID: 32933023 PMCID: PMC7570338 DOI: 10.3390/metabo10090368] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/20/2020] [Revised: 09/04/2020] [Accepted: 09/08/2020] [Indexed: 02/06/2023]
Abstract
Metabolic flux analysis requires both a reliable metabolic model and reliable metabolic profiles in characterizing metabolic reprogramming. Advances in analytic methodologies enable production of high-quality metabolomics datasets capturing isotopic flux. However, useful metabolic models can be difficult to derive due to the lack of relatively complete atom-resolved metabolic networks for a variety of organisms, including human. Here, we developed a neighborhood-specific graph coloring method that creates unique identifiers for each atom in a compound facilitating construction of an atom-resolved metabolic network. What is more, this method is guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG (Kyoto Encyclopedia of Genes and Genomes) and MetaCyc compounds are detected, with 5451 of them confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission numbers (EC) of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction ECs. Moreover, we were able to detect various issues and errors with compound representation in KEGG and MetaCyc databases by compound coloring identifiers, demonstrating the usefulness of this methodology for database curation.
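The idea of deriving atom identifiers from atom neighborhoods can be illustrated with a generic iterative color-refinement sketch. This is not the authors' neighborhood-specific algorithm: the function names `refine_atom_colors` and `compound_identifier`, the hashing scheme, and the fixed number of rounds are assumptions made purely for illustration. The sketch shows two properties mentioned in the abstract: symmetric atoms receive the same identifier, and a compound-level identifier can be derived from the atom identifiers independently of atom ordering.

```python
import hashlib


def refine_atom_colors(elements, bonds, rounds=3):
    """Assign each atom an identifier from its element and its neighbors.

    elements: list of element symbols, one per atom (hypothetical input form).
    bonds: list of (i, j) index pairs; bond orders are ignored in this sketch.
    """
    neighbors = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    colors = list(elements)  # round 0: the color is just the element symbol
    for _ in range(rounds):
        new_colors = []
        for i in range(len(elements)):
            # Combine the atom's own color with the sorted neighbor colors,
            # so the result does not depend on neighbor ordering.
            neighborhood = sorted(colors[j] for j in neighbors[i])
            label = colors[i] + "(" + ",".join(neighborhood) + ")"
            new_colors.append(hashlib.sha1(label.encode()).hexdigest()[:8])
        colors = new_colors
    return colors


def compound_identifier(elements, bonds, rounds=3):
    """Build an order-independent compound identifier from atom colors."""
    return "-".join(sorted(refine_atom_colors(elements, bonds, rounds)))


# Heavy-atom skeleton of propane: the two terminal carbons are symmetric.
propane_colors = refine_atom_colors(["C", "C", "C"], [(0, 1), (1, 2)])
```

In this toy setting the two terminal carbons of propane share one identifier while the central carbon gets a different one. A production method would also need to account for bond orders, stereochemistry, and charges, which this sketch deliberately omits.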
16
Fragment screening targeting Ebola virus nucleoprotein C-terminal domain identifies lead candidates. Antiviral Res 2020; 180:104822. [PMID: 32446802 PMCID: PMC7894038 DOI: 10.1016/j.antiviral.2020.104822] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/10/2020] [Revised: 05/08/2020] [Accepted: 05/15/2020] [Indexed: 01/24/2023]
Abstract
The Ebola Virus is a causative agent of viral hemorrhagic fever outbreaks and a potential global health risk. The outbreak in West Africa (2013-2016) led to 11,000+ deaths and 30,000+ Ebola infected individuals. The current outbreak in the Democratic Republic of Congo (DRC) with 3000+ confirmed cases and 2000+ deaths attributed to Ebola virus infections provides a reminder that innovative countermeasures are still needed. Ebola virus encodes 7 open reading frames (ORFs). Of these, the nucleocapsid protein (eNP) encoded by the first ORF plays many significant roles, including a role in viral RNA synthesis. Here we describe efforts to target the C-terminal domain of eNP (eNP-CTD) that contains highly conserved residues 641-739 as a pan-Ebola antiviral target. Interactions of eNP-CTD with Ebola Viral Protein 30 (eVP30) and Viral Protein 40 (eVP40) have been shown to be crucial for viral RNA synthesis, virion formation, and virion transport. We used nuclear magnetic resonance (NMR)-based methods to screen the eNP-CTD against a fragment library. Perturbations of 1D 1H NMR spectra identified 48 of the 439 compounds screened as potential eNP-CTD interactors. Subsequent analysis of these compounds to measure chemical shift perturbations in 2D 1H,15N NMR spectra of 15N-labeled protein identified six with low millimolar affinities. All six perturbed an area consisting mainly of residues at or near the extreme C-terminus that we named "Site 1" while three other sites were perturbed by other compounds. Our findings here demonstrate the potential utility of eNP as a target, several fragment hits, and provide an experimental pipeline to validate viral-viral interactions as potential panfiloviral inhibitor targets.
17
Dashti H, Westler WM, Wedell JR, Demler OV, Eghbalnia HR, Markley JL, Mora S. Probabilistic identification of saccharide moieties in biomolecules and their protein complexes. Sci Data 2020; 7:210. [PMID: 32620933 PMCID: PMC7335193 DOI: 10.1038/s41597-020-0547-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/18/2020] [Accepted: 06/02/2020] [Indexed: 12/27/2022]
Abstract
The chemical composition of saccharide complexes underlies their biomedical activities as biomarkers for cardiometabolic disease, various types of cancer, and other conditions. However, because these molecules may undergo major structural modifications, distinguishing between compounds of saccharide and non-saccharide origin becomes a challenging computational problem that hinders the aggregation of information about their bioactive moieties. We have developed an algorithm and software package called "Cheminformatics Tool for Probabilistic Identification of Carbohydrates" (CTPIC) that analyzes the covalent structure of a compound to yield a probabilistic measure for distinguishing saccharides and saccharide-derivatives from non-saccharides. CTPIC analysis of the RCSB Ligand Expo (database of small molecules found to bind proteins in the Protein Data Bank) led to a substantial increase in the number of ligands characterized as saccharides. CTPIC analysis of Protein Data Bank identified 7.7% of the proteins as saccharide-binding. CTPIC is freely available as a webservice at (http://ctpic.nmrfam.wisc.edu).
Affiliation(s)
- Hesam Dashti
- Center for Lipid Metabolomics, Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02215, Massachusetts, USA
- Department of Biochemistry, National Magnetic Resonance Facility at Madison and BioMagResBank, University of Wisconsin Madison, Madison, 53706, Wisconsin, USA
- William M Westler
- Department of Biochemistry, National Magnetic Resonance Facility at Madison and BioMagResBank, University of Wisconsin Madison, Madison, 53706, Wisconsin, USA
- Jonathan R Wedell
- Department of Biochemistry, National Magnetic Resonance Facility at Madison and BioMagResBank, University of Wisconsin Madison, Madison, 53706, Wisconsin, USA
- Olga V Demler
- Center for Lipid Metabolomics, Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02215, Massachusetts, USA
- Hamid R Eghbalnia
- Department of Biochemistry, National Magnetic Resonance Facility at Madison and BioMagResBank, University of Wisconsin Madison, Madison, 53706, Wisconsin, USA
- John L Markley
- Department of Biochemistry, National Magnetic Resonance Facility at Madison and BioMagResBank, University of Wisconsin Madison, Madison, 53706, Wisconsin, USA
- Samia Mora
- Center for Lipid Metabolomics, Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02215, Massachusetts, USA
- Cardiovascular Division, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02215, Massachusetts, USA
18
Romero PR, Kobayashi N, Wedell JR, Baskaran K, Iwata T, Yokochi M, Maziuk D, Yao H, Fujiwara T, Kurusu G, Ulrich EL, Hoch JC, Markley JL. BioMagResBank (BMRB) as a Resource for Structural Biology. Methods Mol Biol 2020; 2112:187-218. [PMID: 32006287 DOI: 10.1007/978-1-0716-0270-6_14] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 01/01/2023]
Abstract
The Biological Magnetic Resonance Data Bank (BioMagResBank or BMRB), founded in 1988, serves as the archive for data generated by nuclear magnetic resonance (NMR) spectroscopy of biological systems. NMR spectroscopy is unique among biophysical approaches in its ability to provide a broad range of atomic and higher-level information relevant to the structural, dynamic, and chemical properties of biological macromolecules, as well as report on metabolite and natural product concentrations in complex mixtures and their chemical structures. BMRB became a core member of the Worldwide Protein Data Bank (wwPDB) in 2007, and the BMRB archive is now a core archive of the wwPDB. Currently, about 10% of the structures deposited into the PDB archive are based on NMR spectroscopy. BMRB stores experimental and derived data from biomolecular NMR studies. Newer BMRB biopolymer depositions are divided about evenly between those associated with structure determinations (atomic coordinates and supporting information archived in the PDB) and those reporting experimental information on molecular dynamics, conformational transitions, ligand binding, assigned chemical shifts, or other results from NMR spectroscopy. BMRB also provides resources for NMR studies of metabolites and other small molecules that are often macromolecular ligands and/or nonstandard residues. This chapter is directed to the structural biology community rather than the metabolomics and natural products community. Our goal is to describe various BMRB services offered to structural biology researchers and how they can be accessed and utilized. These services can be classified into four main groups: (1) data deposition, (2) data retrieval, (3) data analysis, and (4) services for NMR spectroscopists and software developers. The chapter also describes the NMR-STAR data format used by BMRB and the tools provided to facilitate its use. 
For programmers, BMRB offers an application programming interface (API) and libraries in the Python and R languages that enable users to develop their own BMRB-based tools for data analysis, visualization, and manipulation of NMR-STAR formatted files. BMRB also provides users with direct access tools through the NMRbox platform.
Affiliation(s)
- Pedro R Romero
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Naohiro Kobayashi
- PDBj-BMRB, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
- Jonathan R Wedell
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Kumaran Baskaran
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Takeshi Iwata
- PDBj-BMRB, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
- Masashi Yokochi
- PDBj-BMRB, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
- Dimitri Maziuk
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Hongyang Yao
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Toshimichi Fujiwara
- PDBj-BMRB, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
- Genji Kurusu
- PDBj-BMRB, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
- Eldon L Ulrich
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
- Jeffrey C Hoch
- BMRB, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, USA
- John L Markley
- BMRB, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
19
Tugui C, Tiron V, Dascalu M, Sacarescu L, Cazacu M. From ultra-high molecular weight polydimethylsiloxane to super-soft elastomer. Eur Polym J 2019. [DOI: 10.1016/j.eurpolymj.2019.109243] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
20
Dashti H, Wedell JR, Westler WM, Markley JL, Eghbalnia HR. Automated evaluation of consistency within the PubChem Compound database. Sci Data 2019; 6:190023. [PMID: 30778259 PMCID: PMC6380220 DOI: 10.1038/sdata.2019.23] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 09/20/2018] [Accepted: 01/14/2019] [Indexed: 12/14/2022]
Abstract
Identification of discrepant data in aggregated databases is a key step in data curation and remediation. We have applied the ALATIS approach, which is based on the International Chemical Identifier (InChI) model, to the full PubChem Compound database to generate unique and reproducible compound and atom identifiers for all entries for which three-dimensional structures were available. This exercise also served to identify entries with discrepancies between structures and chemical formulas or InChI strings. The use of unique compound identifiers and atom nomenclature should support more rigorous links between small-molecule databases including those containing atom-specific information of the type available from crystallography and spectroscopy. The comprehensive results from this analysis are publicly available through our webserver [http://alatis.nmrfam.wisc.edu/].
Affiliation(s)
- Hesam Dashti
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02215, USA; National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA
- Jonathan R Wedell
- National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA
- William M Westler
- National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA
- John L Markley
- National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA
- Hamid R Eghbalnia
- National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA
21
Nagana Gowda GA, Abell L, Tian R. Extending the Scope of 1H NMR Spectroscopy for the Analysis of Cellular Coenzyme A and Acetyl Coenzyme A. Anal Chem 2019; 91:2464-2471. [PMID: 30608643 DOI: 10.1021/acs.analchem.8b05286] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/24/2022]
Abstract
Coenzyme A (CoA) and acetyl-coenzyme A (acetyl-CoA) are ubiquitous cellular molecules, which mediate hundreds of anabolic and catabolic reactions including energy metabolism. Highly sensitive methods including absorption spectroscopy and mass spectrometry enable their analysis, albeit with many limitations. To date, however, NMR spectroscopy has not been used to analyze these important molecules. Building on our recent efforts, which enabled simultaneous analysis of a large number of metabolites in tissue and blood including many coenzymes and antioxidants (Anal. Chem. 2016, 88, 4817-24; ibid 2017, 89, 4620-4627), we describe here a new method for identification and quantitation of CoA and acetyl-CoA ex vivo in tissue. Using mouse heart, kidney, liver, brain, and skeletal tissue, we show that a simple 1H NMR experiment can simultaneously measure these molecules. Identification of the two species involved a comprehensive analysis of the different tissue types using 1D and 2D NMR, in combination with spectral databases for standards, as well as spiking with authentic compounds. Time dependent studies showed that while the acetyl-CoA levels remain unaltered, CoA levels diminish by more than 50% within 24 h, which indicates that CoA is labile in solution; however, degassing the sample with helium gas halted its oxidation. Further, interestingly, we also identified endogenous coenzyme A glutathione disulfide (CoA-S-S-G) in tissue for the first time by NMR and show that CoA, when oxidized in tissue extract, also forms the same disulfide metabolite. The ability to simultaneously visualize absolute concentrations of CoA, acetyl-CoA, and endogenous CoA-S-S-G along with redox coenzymes (NAD+, NADH, NADP+, NADPH), energy coenzymes (ATP, ADP, AMP), antioxidants (GSH, GSSG), and a vast pool of other metabolites using a single 1D NMR spectrum offers a new avenue in the metabolomics field for investigation of cellular function in health and disease.
22
Markley JL, Dashti H, Wedell JR, Westler WM, Eghbalnia HR. Tools for Enhanced NMR-Based Metabolomics Analysis. Methods Mol Biol 2019; 2037:413-427. [PMID: 31463858 PMCID: PMC7995344 DOI: 10.1007/978-1-4939-9690-2_23] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Metabolomics is the study of profiles of small molecules in biological fluids, cells, or organs. These profiles can be thought of as the "fingerprints" left behind from chemical processes occurring in biological systems. Because of its potential for groundbreaking applications in disease diagnostics, biomarker discovery, and systems biology, metabolomics has emerged as a rapidly growing area of research. Metabolomics investigations often, but not always, involve the identification and quantification of endogenous and exogenous metabolites in biological samples. Software tools and databases play a crucial role in advancing the rigor, robustness, reproducibility, and validation of these studies. Specifically, the establishment of a robust library of spectral signatures with unique compound descriptors and atom identities plays a key role in profiling studies based on data from nuclear magnetic resonance (NMR) spectroscopy. Here, we discuss developments leading to a rigorous basis for unique identification of compounds, reproducible numbering of atoms, the compact representation of NMR spectra of metabolites and small molecules, tools for improved compound identification, quantification and visualization, and approaches toward the goal of rigorous analysis of metabolomics data.
Affiliation(s)
- John L Markley
- Department of Biochemistry, University of Wisconsin Madison, Madison, WI, USA
- Hesam Dashti
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Jonathan R Wedell
- Department of Biochemistry, University of Wisconsin Madison, Madison, WI, USA
- William M Westler
- Department of Biochemistry, University of Wisconsin Madison, Madison, WI, USA
- Hamid R Eghbalnia
- Department of Biochemistry, University of Wisconsin Madison, Madison, WI, USA
23
Dashti H, Wedell JR, Westler WM, Tonelli M, Aceti D, Amarasinghe GK, Markley JL, Eghbalnia HR. Applications of Parametrized NMR Spin Systems of Small Molecules. Anal Chem 2018; 90:10646-10649. [PMID: 30125102 DOI: 10.1021/acs.analchem.8b02660] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/22/2022]
Abstract
We have developed technology for producing accurate spectral fingerprints of small molecules through modeling of NMR spin system matrices to encapsulate their chemical shifts and scalar couplings. We describe here how libraries of these spin systems utilizing unique and reproducible atom numbering can be used to improve NMR-based ligand screening and metabolomics studies. We introduce new Web services that facilitate the analysis of NMR spectra of mixtures of small molecules to yield their identification and quantification. The library of parametrized compounds has been expanded to cover simulations of 1H NMR spectra at a variety of magnetic fields of more than 1100 compounds, including many common metabolites and a library of drug-like molecular fragments used in ligand screening. The compound library and related Web services are freely available from http://gissmo.nmrfam.wisc.edu/.
Affiliation(s)
- Gaya K Amarasinghe
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri 63110, United States
24
Pupier M, Nuzillard JM, Wist J, Schlörer NE, Kuhn S, Erdelyi M, Steinbeck C, Williams AJ, Butts C, Claridge TD, Mikhova B, Robien W, Dashti H, Eghbalnia HR, Farès C, Adam C, Kessler P, Moriaud F, Elyashberg M, Argyropoulos D, Pérez M, Giraudeau P, Gil RR, Trevorrow P, Jeannerat D. NMReDATA, a standard to report the NMR assignment and parameters of organic compounds. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2018; 56:703-715. [PMID: 29656574 PMCID: PMC6226248 DOI: 10.1002/mrc.4737] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 01/22/2018] [Revised: 02/22/2018] [Accepted: 03/25/2018] [Indexed: 05/29/2023]
Abstract
Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open-source structural databases.
Affiliation(s)
- Marion Pupier
  - Department of Organic Chemistry, University of Geneva, 30 Quai E. Ansermet, 1211 Geneva 4, Switzerland
- Jean-Marc Nuzillard
  - Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, BP 1039, 51687, Reims Cedex 2, France
- Julien Wist
  - Chemistry Department, Universidad del Valle, 76001 Cali, Colombia
- Nils E. Schlörer
  - Department of Chemistry, University of Cologne, Greinstr. 4, 50939 Köln, Germany
- Stefan Kuhn
  - Department of Chemistry, University of Cologne, Greinstr. 4, 50939 Köln, Germany
- Mate Erdelyi
  - Department of Chemistry - BMC, Uppsala University, Husargatan 3, 752 37 Uppsala, Sweden
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, Lessingstr. 8, 07743 Jena, Germany
- Antony J. Williams
  - National Center for Computational Toxicology, Environmental Protection Agency, 109 T.W. Alexander Drive, Room D131I, Mail Drop D143-02, Research Triangle Park, NC 27711, USA
- Craig Butts
  - School of Chemistry, Bristol University, BS8 1TS Bristol, UK
- Tim D.W. Claridge
  - Department of Chemistry, University of Oxford, Chemistry Research Laboratory, Mansfield Road, Oxford OX1 3TA, UK
- Bozhana Mikhova
  - Institute of Organic Chemistry with Centre of Phytochemistry, Bulgarian Academy of Sciences, Akad. G. Bonchev Str. Bl.9, Sofia 1113, Bulgaria
- Wolfgang Robien
  - University of Vienna, Department of Organic Chemistry, Währingerstr. 38, 1090 Vienna, Austria
- Hesam Dashti
  - Department of Biochemistry, National Magnetic Resonance Facility at Madison (NMRFAM), 433 Babcock Drive, Madison, WI, USA
- Hamid R. Eghbalnia
  - Department of Biochemistry, National Magnetic Resonance Facility at Madison (NMRFAM), 433 Babcock Drive, Madison, WI, USA
- Christophe Farès
  - Max-Planck-Institut für Kohlenforschung, Abteilung NMR, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Christian Adam
  - Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Pavel Kessler
  - Bruker BioSpin GmbH, Silberstreifen, 76287 Rheinstetten, Germany
- Fabrice Moriaud
  - Bruker BioSpin AG, Industriestrasse 26, 8117 Fällanden, Switzerland
- Mikhail Elyashberg
  - Moscow Department, Advanced Chemistry Development, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation
- Dimitris Argyropoulos
  - Advanced Chemistry Development, Inc. (ACD/Labs), Venture House, Arlington Square, Downshire Way, Bracknell, Berkshire RG12 1WA, UK
- Manuel Pérez
  - Mestrelab Research, S.L., Feliciano Barrera 9B - Bajo, ES-15706 Santiago de Compostela, Spain
- Patrick Giraudeau
  - EBSI Team, Chimie et Interdisciplinarité: Synthèse, Analyse, Modélisation (CEISAM) CNRS, UMR 6230, Université de Nantes, 92208, 2 rue de la Houssinière, BP 44322 Nantes, France
  - Institut Universitaire de France, 1 rue Descartes, 75005 Paris Cedex 05, France
- Roberto R. Gil
  - Department of Chemistry, Carnegie Mellon University, 4400 Fifth Ave., Pittsburgh, PA 15213, USA
- Damien Jeannerat
  - Department of Organic Chemistry, University of Geneva, 30 Quai E. Ansermet, 1211 Geneva 4, Switzerland
|
25
|
Spin System Modeling of Nuclear Magnetic Resonance Spectra for Applications in Metabolomics and Small Molecule Screening. Anal Chem 2017; 89:12201-12208. [PMID: 29058410 DOI: 10.1021/acs.analchem.7b02884] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 01/27/2023]
Abstract
The exceptionally rich information content of nuclear magnetic resonance (NMR) spectra is routinely used to identify and characterize molecules and molecular interactions in a wide range of applications, including clinical biomarker discovery, drug discovery, environmental chemistry, and metabolomics. The set of peak positions and intensities from a reference NMR spectrum generally serves as the identifying signature for a compound. Reference spectra normally are collected under specific conditions of pH, temperature, and magnetic field strength, because changes in conditions can distort the identifying signatures of compounds. A spin system matrix that parametrizes chemical shifts and coupling constants among spins provides a much richer feature set for a compound than a spectral signature based on peak positions and intensities. Spin system matrices expand the applicability of NMR spectral libraries beyond the specific conditions under which data were collected. In addition to being able to simulate spectra at any field strength, spin parameters can be adjusted to systematically explore alterations in chemical shift patterns due to variations in other experimental conditions, such as compound concentration, pH, or temperature. We present methodology and software for efficient interactive optimization of spin parameters against experimental 1D-1H NMR spectra of small molecules. We have used the software to generate spin system matrices for a set of key mammalian metabolites and are also using the software to parametrize spectra of small molecules used in NMR-based ligand screening. The software, along with optimized spin system matrix data for a growing number of compounds, is available from http://gissmo.nmrfam.wisc.edu/ .
|
26
|
Le Guennec A, Tayyari F, Edison AS. Alternatives to Nuclear Overhauser Enhancement Spectroscopy Presat and Carr-Purcell-Meiboom-Gill Presat for NMR-Based Metabolomics. Anal Chem 2017; 89:8582-8588. [PMID: 28737383 PMCID: PMC5588096 DOI: 10.1021/acs.analchem.7b02354] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 06/18/2017] [Accepted: 07/24/2017] [Indexed: 01/01/2023]
Abstract
NMR metabolomics studies are primarily conducted with 1D nuclear Overhauser enhancement spectroscopy (NOESY) presat for water suppression and Carr-Purcell-Meiboom-Gill (CPMG) presat as a T2 filter to remove macromolecule signals. Other pulse sequences exist for these two objectives but are not often used in metabolomics studies, because they are less robust or unknown to the NMR metabolomics community. However, recent improvements to alternative pulse sequences make them attractive replacements for 1D NOESY presat and CPMG presat. We focus this perspective on PURGE, a water suppression technique, and PROJECT presat, a T2 filter. These two pulse sequences, when optimized, performed at least on par with 1D NOESY presat and CPMG presat, if not better. These pulse sequences were tested on common samples for metabolomics, human plasma, and urine.
Affiliation(s)
- Adrien Le Guennec
  - Complex Carbohydrate Research Center (CCRC), Departments of Genetics and Biochemistry & Molecular Biology, and Institute of Bioinformatics, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, United States
- Fariba Tayyari
  - Complex Carbohydrate Research Center (CCRC), Departments of Genetics and Biochemistry & Molecular Biology, and Institute of Bioinformatics, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, United States
- Arthur S. Edison
  - Complex Carbohydrate Research Center (CCRC), Departments of Genetics and Biochemistry & Molecular Biology, and Institute of Bioinformatics, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, United States
|