1
Zhu B, Li Z, Jin Z, Zhong Y, Lv T, Ge Z, Li H, Wang T, Lin Y, Liu H, Ma T, Wang S, Liao J, Fan X. Knowledge-based in silico fragmentation and annotation of mass spectra for natural products with MassKG. Comput Struct Biotechnol J 2024; 23:3327-3341. [PMID: 39310281 PMCID: PMC11415640 DOI: 10.1016/j.csbj.2024.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 09/04/2024] [Accepted: 09/04/2024] [Indexed: 09/25/2024] Open
Abstract
Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is a potent analytical technique utilized for identifying natural products from complex sources. However, due to the structural diversity of natural products, annotating their LC-MS/MS data efficiently remains challenging, hindering the discovery of novel active structures. Here, we introduce MassKG, an algorithm that combines a knowledge-based fragmentation strategy and a deep learning-based molecule generation model to aid in rapid dereplication and the discovery of novel natural product (NP) structures. Specifically, MassKG has compiled 407,720 known NP structures and, based on this, generated 266,353 new structures using chemical language models for the discovery of potential novel compounds. Furthermore, MassKG demonstrates exceptional performance in spectra annotation compared to state-of-the-art algorithms. To enhance usability, MassKG has been implemented as a web server for annotating tandem mass spectral data (MS/MS, MS2) with a user-friendly interface, automatic reporting, and fragment tree visualization. Lastly, the interpretive capability of MassKG is comprehensively validated through composition analysis and MS annotation of Panax notoginseng, Ginkgo biloba, Codonopsis pilosula, and Astragalus membranaceus. MassKG is now accessible at https://xomics.com.cn/masskg.
Affiliation(s)
- Bingjie Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Zhenhao Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - Zhang Boli Intelligent Health Innovation Lab, Hangzhou 311121, China
- Zehua Jin
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Yi Zhong
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Tianhang Lv
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Zhiwei Ge
  - Analysis Center of Agrobiology and Environmental Sciences, Zhejiang University, Hangzhou 310058, China
- Haoran Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Tianhao Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Yugang Lin
  - Department of Pharmacy, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua 321000, China
- Huihui Liu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Tianyi Ma
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Shufang Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Jie Liao
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
- Xiaohui Fan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing 314100, China
  - Zhang Boli Intelligent Health Innovation Lab, Hangzhou 311121, China
  - The Joint-laboratory of Clinical Multi-Omics Research between Zhejiang University and Ningbo Municipal Hospital of TCM, Ningbo Municipal Hospital of TCM, 315100 Ningbo, China
2
Arora S, Satija S, Mittal A, Solanki S, Mohanty SK, Srivastava V, Sengupta D, Rout D, Arul Murugan N, Borkar RM, Ahuja G. Unlocking The Mysteries of DNA Adducts with Artificial Intelligence. Chembiochem 2024; 25:e202300577. [PMID: 37874183 DOI: 10.1002/cbic.202300577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/18/2023] [Accepted: 10/23/2023] [Indexed: 10/25/2023]
Abstract
The cellular genome is considered a dynamic blueprint of a cell since it encodes genetic information that is temporally altered by various endogenous and exogenous insults. Largely, the extent of genomic dynamicity is controlled by the trade-off between DNA repair processes and the genotoxic potential of the causative agent (genotoxins or potential carcinogens). A subset of genotoxins form DNA adducts by covalently binding to the cellular DNA, triggering structural or functional changes that lead to significant alterations in cellular processes via genetic (e.g., mutations) or non-genetic (e.g., epigenome) routes. Identification, quantification, and characterization of DNA adducts are indispensable for their comprehensive understanding and could expedite the ongoing efforts in predicting carcinogenicity and their mode of action. In this review, we elaborate on using Artificial Intelligence (AI)-based modeling in adduct biology and present multiple computational strategies to advance the decoding of DNA adducts. The proposed AI-based strategies encompass predictive modeling of adduct formation via metabolic activation, identification of novel adducts, prediction of biochemical routes for adduct formation, prediction of adduct half-lives within biological ecosystems, and methods to predict the link between an adduct's chemistry and its location within the genomic DNA. In summary, we discuss some futuristic AI-based approaches in DNA adduct biology.
Affiliation(s)
- Sakshi Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Shiva Satija
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Aayushi Mittal
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Saveena Solanki
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Sanjay Kumar Mohanty
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Vaibhav Srivastava
  - Division of Glycoscience, Department of Chemistry CBH School, Royal Institute of Technology (KTH) AlbaNova University Center, 10691, Stockholm, Sweden
- Debarka Sengupta
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Diptiranjan Rout
  - Department of Transfusion Medicine National Cancer Institute, AIIMS, New Delhi, All India Institute of Medical Sciences, Ansari Nagar, New Delhi, 110608, India
- Natarajan Arul Murugan
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
- Roshan M Borkar
  - Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER)-Guwahati, Sila Katamur Halugurisuk P.O.: Changsari, Dist, Guwahati, Assam, 781101, India
- Gaurav Ahuja
  - Department of Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi) Okhla, Phase III, New Delhi, 110020, India
3
Karunaratne E, Hill DW, Dührkop K, Böcker S, Grant DF. Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification. Anal Chem 2023; 95:11901-11907. [PMID: 37540774 DOI: 10.1021/acs.analchem.3c00937] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/06/2023]
Abstract
The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
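The composite ranking described in this abstract, in which the final order is based on the average of each candidate's CSI:FingerID rank and DFT-IR rank, can be illustrated with a minimal sketch. The candidate names and rank values below are invented for illustration, and the tie-breaking rule is an assumption; the abstract does not give the scoring details beyond the averaging step.

```python
def composite_ranking(ms_ranks, ir_ranks):
    """Order candidates by the average of their two ranks (lower is better).

    ms_ranks and ir_ranks map a candidate identifier to its 1-based rank
    from each method. Only candidates ranked by both methods are kept.
    """
    shared = ms_ranks.keys() & ir_ranks.keys()
    avg = {c: (ms_ranks[c] + ir_ranks[c]) / 2 for c in shared}
    # Assumption: ties in the averaged rank are broken by candidate name
    # so that the output is deterministic.
    return sorted(shared, key=lambda c: (avg[c], c))


# Hypothetical ranks for three candidate structures.
ms_ranks = {"cand_A": 1, "cand_B": 2, "cand_C": 3}  # MS/MS-based ranks
ir_ranks = {"cand_A": 3, "cand_B": 1, "cand_C": 2}  # IR-based ranks

final_order = composite_ranking(ms_ranks, ir_ranks)
print(final_order)
```

Averaging ranks rather than raw scores avoids having to put two differently scaled scoring functions on a common scale, at the cost of discarding how far apart neighbouring candidates are.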
Affiliation(s)
- Erandika Karunaratne
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
- Dennis W Hill
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
- Kai Dührkop
  - Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
- Sebastian Böcker
  - Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
- David F Grant
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
4
Maia M, Figueiredo A, Cordeiro C, Sousa Silva M. FT-ICR-MS-based metabolomics: A deep dive into plant metabolism. Mass Spectrometry Reviews 2021. [PMID: 34545595 DOI: 10.1002/mas.21731] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/07/2021] [Revised: 08/30/2021] [Accepted: 09/09/2021] [Indexed: 06/13/2023]
Abstract
Metabolomics involves the identification and quantification of metabolites to unravel the chemical footprints behind cellular regulatory processes and to decipher metabolic networks, opening new insights into the correlation between genes and metabolites. Plants are estimated to contain hundreds of thousands of metabolites, the majority of which are still unknown. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is a powerful analytical technique to tackle such challenges. The resolving power and sensitivity of this ultrahigh mass accuracy mass analyzer are such that a complex mixture, such as a plant extract, can be analyzed and thousands of metabolite signals can be detected simultaneously and distinguished based on the naturally abundant elemental isotopes. In this review, FT-ICR-MS-based plant metabolomics studies are described, emphasizing the increasing applications of FT-ICR-MS in plant science through targeted and untargeted approaches, allowing for a better understanding of plant development, responses to biotic and abiotic stresses, and the discovery of new natural nutraceutical compounds. Improved metabolite extraction protocols compatible with FT-ICR-MS, metabolite analysis methods, and metabolite identification platforms are also explored, as well as new in silico approaches. The most recent advances in MS imaging are also discussed.
Affiliation(s)
- Marisa Maia
  - Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
  - Departamento de Biologia Vegetal, Faculdade de Ciências, Grapevine Pathogen Systems Lab (GPS Lab), Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Lisboa, Portugal
- Andreia Figueiredo
  - Departamento de Biologia Vegetal, Faculdade de Ciências, Grapevine Pathogen Systems Lab (GPS Lab), Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Lisboa, Portugal
- Carlos Cordeiro
  - Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
- Marta Sousa Silva
  - Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
5
Muthubharathi BC, Gowripriya T, Balamurugan K. Metabolomics: small molecules that matter more. Mol Omics 2021; 17:210-229. [PMID: 33598670 DOI: 10.1039/d0mo00176g] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Indexed: 12/14/2022]
Abstract
Metabolomics, an analytical approach based on high-throughput profiling, helps to understand interactions within a biological system. Small molecules of <1500 Da, called metabolites and collectively the metabolome, depict the status of a biological system in a different manner. Currently, there is a need to globally analyze the metabolome and the pathways involved in healthy as well as diseased conditions for possible therapeutic applications. Metabolome analysis has revealed high-abundance molecules during different conditions such as diet, environmental stress, microbiota, and disease and treatment states. As a result, it is hard to understand the complete and stable network of metabolites of a biological system. This review introduces readers to the techniques available for metabolomics alongside other major omics such as genomics, transcriptomics, and proteomics. It also discusses metabolomics in various pathological conditions and its importance in therapeutic applications.
6
Desmet S, Brouckaert M, Boerjan W, Morreel K. Seeing the forest for the trees: Retrieving plant secondary biochemical pathways from metabolome networks. Comput Struct Biotechnol J 2020; 19:72-85. [PMID: 33384856 PMCID: PMC7753198 DOI: 10.1016/j.csbj.2020.11.050] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/04/2020] [Revised: 11/26/2020] [Accepted: 11/28/2020] [Indexed: 02/06/2023] Open
Abstract
Over the last decade, a giant leap forward has been made in resolving the main bottleneck in metabolomics, i.e., the structural characterization of the many unknowns. This has led to the next challenge in this research field: retrieving biochemical pathway information from the various types of networks that can be constructed from metabolome data. Searching putative biochemical pathways, referred to as biotransformation paths, is complicated because several flaws occur during the construction of metabolome networks. Multiple network analysis tools have been developed to deal with these flaws, while in silico retrosynthesis is appearing as an alternative approach. In this review, the different types of metabolome networks, their flaws, and the various tools to trace these biotransformation paths are discussed.
Affiliation(s)
- Sandrien Desmet
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Marlies Brouckaert
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Wout Boerjan
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Kris Morreel
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
7
8
Erbilgin O, Rübel O, Louie KB, Trinh M, Raad MD, Wildish T, Udwary D, Hoover C, Deutsch S, Northen TR, Bowen BP. MAGI: A Method for Metabolite Annotation and Gene Integration. ACS Chem Biol 2019; 14:704-714. [PMID: 30896917 DOI: 10.1021/acschembio.8b01107] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/29/2022]
Abstract
Metabolomics is a widely used technology for obtaining direct measures of metabolic activities from diverse biological systems. However, ambiguous metabolite identifications are a common challenge and biochemical interpretation is often limited by incomplete and inaccurate genome-based predictions of enzyme activities (that is, gene annotations). Metabolite Annotation and Gene Integration (MAGI) generates a metabolite-gene association score using a biochemical reaction network. This is calculated by a method that emphasizes consensus between metabolites and genes via biochemical reactions. To demonstrate the potential of this method, we applied MAGI to integrate sequence data and metabolomics data collected from Streptomyces coelicolor A3(2), an extensively characterized bacterium that produces diverse secondary metabolites. Our findings suggest that coupling metabolomics and genomics data by scoring consensus between the two increases the quality of both metabolite identifications and gene annotations in this organism. MAGI also made biochemical predictions for poorly annotated genes that were consistent with the extensive literature on this important organism. This limited analysis suggests that using metabolomics data has the potential to improve annotations in sequenced organisms and also provides testable hypotheses for specific biochemical functions. MAGI is freely available for academic use both as an online tool at https://magi.nersc.gov and with source code available at https://github.com/biorack/magi .
Affiliation(s)
- Onur Erbilgin
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Oliver Rübel
  - Data Analytics and Visualization Group, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Katherine B. Louie
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Matthew Trinh
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Markus de Raad
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Tony Wildish
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
  - National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Daniel Udwary
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
  - National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Cindi Hoover
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Samuel Deutsch
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Trent R. Northen
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Benjamin P. Bowen
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
  - Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
9
Cohen Hubal EA, Wetmore BA, Wambaugh JF, El-Masri H, Sobus JR, Bahadori T. Advancing internal exposure and physiologically-based toxicokinetic modeling for 21st-century risk assessments. Journal of Exposure Science & Environmental Epidemiology 2019; 29:11-20. [PMID: 30116055 PMCID: PMC6760598 DOI: 10.1038/s41370-018-0046-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/17/2017] [Revised: 03/15/2018] [Accepted: 03/19/2018] [Indexed: 05/22/2023]
Abstract
Scientifically sound, risk-informed evaluation of chemicals is essential to protecting public health. Systematically leveraging information from exposure, toxicology, and epidemiology studies can provide a holistic understanding of how real-world exposure to chemicals may impact the health of populations, including sensitive and vulnerable individuals and life-stages. Increasingly, public health policy makers are employing toxicokinetic (TK) modeling tools to integrate these data streams and predict potential human health impact. Development of a suite of tools for predicting internal exposure, including physiologically-based toxicokinetic (PBTK) models, is being driven by needs to address large numbers of data-poor chemicals efficiently, translate bioactivity and mechanistic information from new in vitro test systems, and integrate multiple lines of evidence to enable scientifically sound, risk-informed decisions. New modeling approaches are being designed "fit for purpose" to inform specific decision contexts, with applications ranging from rapid screening of hundreds of chemicals to improved prediction of risks during sensitive stages of development. New data are being generated experimentally and computationally to support these models. Progress to meet the demand for internal exposure and PBTK modeling tools will require transparent publication of models and data to build credibility in results, as well as opportunities to partner with decision makers to evaluate and build confidence in the use of these tools for improved decisions that promote safe use of chemicals.
Affiliation(s)
- Barbara A Wetmore
  - National Exposure Research Laboratory (NERL), US EPA, Washington, USA
- John F Wambaugh
  - National Center for Computational Toxicology (NCCT), US EPA, Washington, USA
- Hisham El-Masri
  - National Health and Environmental Effects Laboratory (NHEERL), US EPA, Washington, USA
- Jon R Sobus
  - National Exposure Research Laboratory (NERL), US EPA, Washington, USA
- Tina Bahadori
  - National Center for Environmental Assessment (NCEA), US EPA, Washington, USA
10
Sobus JR, Wambaugh JF, Isaacs KK, Williams AJ, McEachran AD, Richard AM, Grulke CM, Ulrich EM, Rager JE, Strynar MJ, Newton SR. Integrating tools for non-targeted analysis research and chemical safety evaluations at the US EPA. Journal of Exposure Science & Environmental Epidemiology 2018; 28:411-426. [PMID: 29288256 PMCID: PMC6661898 DOI: 10.1038/s41370-017-0012-y] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 05/27/2017] [Revised: 08/04/2017] [Accepted: 08/25/2017] [Indexed: 05/18/2023]
Abstract
Tens of thousands of chemicals are registered in the U.S. for use in countless processes and products. Recent evidence suggests that many of these chemicals are measurable in environmental and/or biological systems, indicating the potential for widespread exposures. Traditional public health research tools, including in vivo studies and targeted analytical chemistry methods, have been unable to meet the needs of screening programs designed to evaluate chemical safety. As such, new tools have been developed to enable rapid assessment of potentially harmful chemical exposures and their attendant biological responses. One group of tools, known as "non-targeted analysis" (NTA) methods, allows the rapid characterization of thousands of never-before-studied compounds in a wide variety of environmental, residential, and biological media. This article discusses current applications of NTA methods, challenges to their effective use in chemical screening studies, and ways in which shared resources (e.g., chemical standards, databases, model predictions, and media measurements) can advance their use in risk-based chemical prioritization. A brief review is provided of resources and projects within EPA's Office of Research and Development (ORD) that provide benefit to, and receive benefits from, NTA research endeavors. A summary of EPA's Non-Targeted Analysis Collaborative Trial (ENTACT) is also given, which makes direct use of ORD resources to benefit the global NTA research community. Finally, a research framework is described that shows how NTA methods will bridge chemical prioritization efforts within ORD. This framework exists as a guide for institutions seeking to understand the complexity of chemical exposures, and the impact of these exposures on living systems.
Affiliation(s)
- Jon R Sobus
  - U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- John F Wambaugh
  - U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Kristin K Isaacs
  - U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Antony J Williams
  - U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Andrew D McEachran
  - Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Ann M Richard
  - U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Christopher M Grulke
  - U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Elin M Ulrich
  - U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Julia E Rager
  - Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
  - ToxStrategies, Inc., 9390 Research Blvd., Suite 100, Austin, TX, 78759, USA
- Mark J Strynar
  - U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
- Seth R Newton
  - U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27709, USA
11
Kind T, Tsugawa H, Cajka T, Ma Y, Lai Z, Mehta SS, Wohlgemuth G, Barupal DK, Showalter MR, Arita M, Fiehn O. Identification of small molecules using accurate mass MS/MS search. Mass Spectrometry Reviews 2018; 37:513-532. [PMID: 28436590 PMCID: PMC8106966 DOI: 10.1002/mas.21535] [Citation(s) in RCA: 266] [Impact Index Per Article: 44.3] [Received: 10/27/2016] [Revised: 03/17/2017] [Accepted: 03/18/2017] [Indexed: 05/03/2023]
Abstract
Tandem mass spectral library search (MS/MS) is the fastest way to correctly annotate MS/MS spectra from screening small molecules in fields such as environmental analysis, drug screening, lipid analysis, and metabolomics. The confidence in MS/MS-based annotation of chemical structures is impacted by instrumental settings and requirements, data acquisition modes including data-dependent and data-independent methods, library scoring algorithms, as well as post-curation steps. We critically discuss parameters that influence search results, such as mass accuracy, precursor ion isolation width, intensity thresholds, centroiding algorithms, and acquisition speed. A range of publicly and commercially available MS/MS databases such as NIST, MassBank, MoNA, LipidBlast, Wiley MSforID, and METLIN are surveyed. In addition, software tools including NIST MS Search, MS-DIAL, Mass Frontier, SmileMS, Mass++, and XCMS2 to perform fast MS/MS search are discussed. MS/MS scoring algorithms and challenges during compound annotation are reviewed. Advanced methods such as the in silico generation of tandem mass spectra using quantum chemistry and machine learning methods are covered. Community efforts for curation and sharing of tandem mass spectra that will allow for faster distribution of scientific discoveries are discussed.
Affiliation(s)
- Tobias Kind
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Tomas Cajka
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Yan Ma
- National Institute of Biological Sciences, Beijing, People’s Republic of China
| | - Zijuan Lai
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Masanori Arita
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Oliver Fiehn
- Genome Center, Metabolomics, UC Davis, Davis, California
- Faculty of Sciences, Department of Biochemistry, King Abdulaziz University, Jeddah, Saudi Arabia
|
12
|
Abstract
Disturbances in cardiac metabolism underlie most cardiovascular diseases. Metabolomics, one of the newer omics technologies, has emerged as a powerful tool for defining changes in both global and cardiac-specific metabolism that occur across a spectrum of cardiovascular disease states. Findings from metabolomics studies have contributed to better understanding of the metabolic changes that occur in heart failure and ischemic heart disease and have identified new cardiovascular disease biomarkers. As technologies advance, the metabolomics field continues to evolve rapidly. In this review, we will discuss the current state of metabolomics technologies, including consideration of various metabolomics platforms and elements of study design; the emerging utility of stable isotopes for metabolic flux studies; and the use of metabolomics to better understand specific cardiovascular diseases, with an emphasis on recent advances in the field.
Affiliation(s)
- Robert W McGarrah
- From the Sarah W. Stedman Nutrition and Metabolism Center and Duke Molecular Physiology Institute (R.W.M., S.B.C., G.F.Z., S.H.S., C.B.N.)
- Division of Cardiology (R.W.M., S.H.S.)
- Department of Medicine (R.W.M., G.F.Z., S.H.S., C.B.N.)
| | - Scott B Crown
- From the Sarah W. Stedman Nutrition and Metabolism Center and Duke Molecular Physiology Institute (R.W.M., S.B.C., G.F.Z., S.H.S., C.B.N.)
| | - Guo-Fang Zhang
- From the Sarah W. Stedman Nutrition and Metabolism Center and Duke Molecular Physiology Institute (R.W.M., S.B.C., G.F.Z., S.H.S., C.B.N.)
- Division of Endocrinology (G.F.Z., C.B.N.)
- Department of Medicine (R.W.M., G.F.Z., S.H.S., C.B.N.)
| | - Svati H Shah
- From the Sarah W. Stedman Nutrition and Metabolism Center and Duke Molecular Physiology Institute (R.W.M., S.B.C., G.F.Z., S.H.S., C.B.N.)
- Division of Cardiology (R.W.M., S.H.S.)
- Department of Medicine (R.W.M., G.F.Z., S.H.S., C.B.N.)
| | - Christopher B Newgard
- From the Sarah W. Stedman Nutrition and Metabolism Center and Duke Molecular Physiology Institute (R.W.M., S.B.C., G.F.Z., S.H.S., C.B.N.)
- Division of Endocrinology (G.F.Z., C.B.N.)
- Department of Medicine (R.W.M., G.F.Z., S.H.S., C.B.N.)
- Departments of Pharmacology and Cancer Biology (C.B.N.), Duke University Medical Center, Durham, NC
|
13
|
Godzien J, Gil de la Fuente A, Otero A, Barbas C. Metabolite Annotation and Identification. Comprehensive Analytical Chemistry 2018. [DOI: 10.1016/bs.coac.2018.07.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
|
14
|
Jana K, Bandyopadhyay T, Ganguly B. Designed inhibitors with hetero linkers for gastric proton pump H+,K+-ATPase: Steered molecular dynamics and metadynamics studies. J Mol Graph Model 2017; 78:129-138. [PMID: 29055186 DOI: 10.1016/j.jmgm.2017.10.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/08/2017] [Revised: 10/06/2017] [Accepted: 10/09/2017] [Indexed: 02/07/2023]
Abstract
Acid suppressant SCH28080 and its derivatives reversibly reduce acid secretion activity of the H+,K+-ATPase in a K+ competitive manner. The results on homologation of the SCH28080 by varying the linker chain length suggested the improvement in efficacy. However, the pharmacokinetic studies reveal that the hydrophobic nature of the CH2 linker units may not help it to function as a better acid suppressant. We have exploited the role of linker unit to enhance the efficacy of such reversible acid suppressant drug molecules using hetero linker, i.e., disulfide and peroxy linkers. The logarithm of partition coefficient defined for a drug molecule relates to the partition coefficient, which allows the optimum solubility characteristics to reach the active site. The logarithm of partition coefficient calculated for the designed inhibitors suggests that inhibitors would possibly reach the active site in sufficient concentration like in the case of SCH28080. The steered molecular dynamics studies have revealed that the Inhibitor-1 with disulfide linker unit is more stable at the active site due to greater noncovalent interactions compared to the SCH28080. Centre of mass distance analysis suggests that the Cysteine-813 amino acid residue selectively plays an important role in the inhibition of H+,K+-ATPase for Inhibitor-1. Furthermore, the quantum chemical calculations with M11L/6-31+G(d,p) level of theory have been performed to account for the noncovalent interactions responsible for the stabilization of inhibitor molecules in the active site gorge of the gastric proton pump at different time scale. The hydrogen bonding and hydrophobic interaction studies corroborate the center of mass distance analysis as well. Well-tempered metadynamics free energy surface and center of mass separation analysis for the Inhibitor-1 are in good agreement with the steered molecular dynamics results. The torsional angle of the linker units seems to be crucial for better efficacy of drug molecules. The torsional angle of linker units of SCH28080 (COCH2C) and of Inhibitor 1 (CSSC) prefers to lie within ∼60°-90° for a longer time during the simulations, whereas the peroxy linker (COOC) of Inhibitor 2 prefers to adopt ∼120°-160°. Therefore, it appears that the smaller torsion angle of linker units can achieve better interactions with the active site residues of H+,K+-ATPase to inhibit the acid secretion activity. The reversible drug molecules with disulfide linker unit would be a promising candidate as proton pump antagonist to H+,K+-ATPase.
Affiliation(s)
- Kalyanashis Jana
- Computation and Simulation Unit (Analytical Discipline and Centralized Instrument Facility), CSIR, Central Salt and Marine Chemicals Research Institute, Bhavnagar 364002, Gujarat, India; Academy of Scientific and Innovative Research, CSIR, CSMCRI, Bhavnagar 364002, Gujarat, India
| | - Tusar Bandyopadhyay
- Theoretical Chemistry Section, Bhabha Atomic Research Centre, Trombay, Mumbai 400 085, India.
| | - Bishwajit Ganguly
- Computation and Simulation Unit (Analytical Discipline and Centralized Instrument Facility), CSIR, Central Salt and Marine Chemicals Research Institute, Bhavnagar 364002, Gujarat, India; Academy of Scientific and Innovative Research, CSIR, CSMCRI, Bhavnagar 364002, Gujarat, India.
|
15
|
Hufsky F, Böcker S. Mining molecular structure databases: Identification of small molecules based on fragmentation mass spectrometry data. Mass Spectrom Rev 2017; 36:624-633. [PMID: 26763615 DOI: 10.1002/mas.21489] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 04/21/2015] [Accepted: 12/18/2015] [Indexed: 06/05/2023]
Abstract
Mass spectrometry (MS) is a key technology for the analysis of small molecules. For the identification and structural elucidation of novel molecules, new approaches beyond straightforward spectral comparison are required. In this review, we will cover computational methods that help with the identification of small molecules by analyzing fragmentation MS data. We focus on the four main approaches to mine a database of metabolite structures, that is rule-based fragmentation spectrum prediction, combinatorial fragmentation, competitive fragmentation modeling, and molecular fingerprint prediction. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:624-633, 2017.
Affiliation(s)
- Franziska Hufsky
- Lehrstuhl für Bioinformatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, Jena, 07743, Germany
- Bioinformatik für Hochdurchsatzverfahren, Friedrich-Schiller-Universität Jena, Leutragraben 1, Jena, 07743, Germany
| | - Sebastian Böcker
- Lehrstuhl für Bioinformatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, Jena, 07743, Germany
|
16
|
Ponting DJ, Murray E, Long A. Quantifying confidence in the reporting of metabolic biotransformations. Drug Discov Today 2017; 22:970-975. [PMID: 28088443 DOI: 10.1016/j.drudis.2017.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/05/2016] [Revised: 12/07/2016] [Accepted: 01/05/2017] [Indexed: 11/20/2022]
Abstract
How confident can we be in the assignment of metabolite structures? Are the analytical techniques used sufficient to support hypotheses about what is being formed? In this Feature, we discuss the results of an extensive survey into the analytical techniques used, and their value in the characterisation of metabolites. The survey covers the structures of over 16000 metabolites formed from 1732 query compounds, covering over 35 years of the literature and a variety of journals. The value of different characterisation techniques is considered, alongside or in the absence of synthetic standards. The changes in analytical techniques used over time are briefly considered, and a metric for the confidence that a claimed metabolite has been confirmed is proposed.
Affiliation(s)
- David J Ponting
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, UK.
| | - Ernest Murray
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, UK
| | - Anthony Long
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, UK
|
17
|
Metz TO, Baker ES, Schymanski EL, Renslow RS, Thomas DG, Causon TJ, Webb IK, Hann S, Smith RD, Teeguarden JG. Integrating ion mobility spectrometry into mass spectrometry-based exposome measurements: what can it add and how far can it go? Bioanalysis 2017; 9:81-98. [PMID: 27921453 PMCID: PMC5674211 DOI: 10.4155/bio-2016-0244] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 09/16/2016] [Accepted: 10/12/2016] [Indexed: 01/01/2023] Open
Abstract
Measuring the exposome remains a challenge due to the range and number of anthropogenic molecules that are encountered in our daily lives, as well as the complex systemic responses to these exposures. One option for improving the coverage, dynamic range and throughput of measurements is to incorporate ion mobility spectrometry (IMS) into current MS-based analytical methods. The implementation of IMS in exposomics studies will lead to more frequent observations of previously undetected chemicals and metabolites. LC-IMS-MS will provide increased overall measurement dynamic range, resulting in detections of lower abundance molecules. Alternatively, the throughput of IMS-MS alone will provide the opportunity to analyze many thousands of longitudinal samples over lifetimes of exposure, capturing evidence of transitory accumulations of chemicals or metabolites. The volume of data corresponding to these new chemical observations will almost certainly outpace the generation of reference data to enable their confident identification. In this perspective, we briefly review the state-of-the-art in measuring the exposome, and discuss the potential use for IMS-MS and the physico-chemical property of collisional cross section in both exposure assessment and molecular identification.
Affiliation(s)
- Thomas O Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Erin S Baker
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Emma L Schymanski
- Eawag, Swiss Federal Institute of Aquatic Science & Technology, Dübendorf, Switzerland
| | - Ryan S Renslow
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Dennis G Thomas
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Tim J Causon
- Division of Analytical Chemistry, Department of Chemistry, University of Natural Resources & Life Sciences (BOKU Vienna), Vienna, Austria
| | - Ian K Webb
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Stephan Hann
- Division of Analytical Chemistry, Department of Chemistry, University of Natural Resources & Life Sciences (BOKU Vienna), Vienna, Austria
| | - Richard D Smith
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Justin G Teeguarden
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
- Department of Environmental & Molecular Toxicology, Oregon State University, Corvallis, OR, USA
|
18
|
Allard PM, Péresse T, Bisson J, Gindro K, Marcourt L, Pham VC, Roussi F, Litaudon M, Wolfender JL. Integration of Molecular Networking and In-Silico MS/MS Fragmentation for Natural Products Dereplication. Anal Chem 2016; 88:3317-23. [DOI: 10.1021/acs.analchem.5b04804] [Citation(s) in RCA: 241] [Impact Index Per Article: 30.1] [Indexed: 12/14/2022]
Affiliation(s)
- Pierre-Marie Allard
- School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, Quai Ernest-Ansermet 30, CH-1211 Geneva 4, Switzerland
| | - Tiphaine Péresse
- Institut de Chimie des Substances Naturelles CNRS UPR 2301, University Paris-Saclay, 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette, France
| | - Jonathan Bisson
- Center for Natural Product Technologies, Department of Medicinal Chemistry and Pharmacognosy College of Pharmacy, University of Illinois at Chicago, 833 South Wood Street, Chicago, Illinois 60612, United States
| | - Katia Gindro
- Mycology and Biotechnology group, Institute for Plant Production Sciences IPS, Agroscope, Route de Duillier 50, P.O. Box 1012, 1260 Nyon, Switzerland
| | - Laurence Marcourt
- School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, Quai Ernest-Ansermet 30, CH-1211 Geneva 4, Switzerland
| | - Van Cuong Pham
- Institute of Marine Biochemistry of the Vietnam Academy of Science and Technology (VAST), 18 Hoang Quoc Viet road, Cau Giay Hanoi, Vietnam
| | - Fanny Roussi
- Institut de Chimie des Substances Naturelles CNRS UPR 2301, University Paris-Saclay, 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette, France
| | - Marc Litaudon
- Institut de Chimie des Substances Naturelles CNRS UPR 2301, University Paris-Saclay, 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette, France
| | - Jean-Luc Wolfender
- School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, Quai Ernest-Ansermet 30, CH-1211 Geneva 4, Switzerland
|
19
|
Van den Eede N, Cuykx M, Rodrigues RM, Laukens K, Neels H, Covaci A, Vanhaecke T. Metabolomics analysis of the toxicity pathways of triphenyl phosphate in HepaRG cells and comparison to oxidative stress mechanisms caused by acetaminophen. Toxicol In Vitro 2015; 29:2045-54. [DOI: 10.1016/j.tiv.2015.08.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/21/2015] [Revised: 05/15/2015] [Accepted: 08/14/2015] [Indexed: 12/21/2022]
|
20
|
Searching molecular structure databases with tandem mass spectra using CSI:FingerID. Proc Natl Acad Sci U S A 2015; 112:12580-5. [PMID: 26392543 DOI: 10.1073/pnas.1509788112] [Citation(s) in RCA: 596] [Impact Index Per Article: 66.2] [Indexed: 01/17/2023] Open
Abstract
Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin.
|
21
|
Jeffryes JG, Colastani RL, Elbadawi-Sidhu M, Kind T, Niehaus TD, Broadbelt LJ, Hanson AD, Fiehn O, Tyo KEJ, Henry CS. MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics. J Cheminform 2015; 7:44. [PMID: 26322134 PMCID: PMC4550642 DOI: 10.1186/s13321-015-0087-1] [Citation(s) in RCA: 135] [Impact Index Per Article: 15.0] [Received: 03/30/2015] [Accepted: 07/06/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. DESCRIPTION: Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectrum than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. CONCLUSIONS: MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.
Affiliation(s)
- James G Jeffryes
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA ; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
| | - Ricardo L Colastani
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
| | - Tobias Kind
- West Coast Metabolomics Center, University of California, Davis, CA USA
| | - Thomas D Niehaus
- Horticultural Sciences Department, University of Florida, Gainesville, FL USA
| | - Linda J Broadbelt
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA
| | - Andrew D Hanson
- Horticultural Sciences Department, University of Florida, Gainesville, FL USA
| | - Oliver Fiehn
- West Coast Metabolomics Center, University of California, Davis, CA USA ; Biochemistry Department, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Keith E J Tyo
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA
| | - Christopher S Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
|
22
|
Hamdalla MA, Rajasekaran S, Grant DF, Măndoiu II. Metabolic pathway predictions for metabolomics: a molecular structure matching approach. J Chem Inf Model 2015; 55:709-18. [PMID: 25668446 DOI: 10.1021/ci500517v] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 01/19/2023]
Abstract
Metabolic pathways are composed of a series of chemical reactions occurring within a cell. In each pathway, enzymes catalyze the conversion of substrates into structurally similar products. Thus, structural similarity provides a potential means for mapping newly identified biochemical compounds to known metabolic pathways. In this paper, we present TrackSM, a cheminformatics tool designed to associate a chemical compound to a known metabolic pathway based on molecular structure matching techniques. Validation experiments show that TrackSM is capable of associating 93% of tested structures to their correct KEGG pathway class and 88% to their correct individual KEGG pathway. This suggests that TrackSM may be a valuable tool to aid in associating previously unknown small molecules to known biochemical pathways and improve our ability to link metabolomics, proteomic, and genomic data sets. TrackSM is freely available at http://metabolomics.pharm.uconn.edu/?q=Software.html.
Affiliation(s)
- Mai A Hamdalla
- Computer Science Department, Helwan University, Cairo, Egypt
|
23
|
Rohloff J. Analysis of phenolic and cyclic compounds in plants using derivatization techniques in combination with GC-MS-based metabolite profiling. Molecules 2015; 20:3431-62. [PMID: 25690297 PMCID: PMC6272321 DOI: 10.3390/molecules20023431] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 12/18/2014] [Revised: 01/06/2015] [Accepted: 02/10/2015] [Indexed: 12/13/2022] Open
Abstract
Metabolite profiling has been established as a modern technology platform for the description of complex chemical matrices and compound identification in biological samples. Gas chromatography coupled with mass spectrometry (GC-MS) in particular is a fast and accurate method widely applied in diagnostics, functional genomics and for screening purposes. Following solvent extraction and derivatization, hundreds of metabolites from different chemical groups can be characterized in one analytical run. Besides sugars, acids, and polyols, diverse phenolic and other cyclic metabolites can be efficiently detected by metabolite profiling. The review describes the author's own results from plant research to exemplify the applicability of GC-MS profiling and the concurrent detection and identification of phenolics and other cyclic structures.
Affiliation(s)
- Jens Rohloff
- Department of Biology, Norwegian University of Science and Technology, Trondheim 7491, Norway.
|
24
|
Pertusi DA, Stine AE, Broadbelt LJ, Tyo KEJ. Efficient searching and annotation of metabolic networks using chemical similarity. Bioinformatics 2014; 31:1016-24. [PMID: 25417203 DOI: 10.1093/bioinformatics/btu760] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 07/30/2014] [Accepted: 11/11/2014] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The urgent need for efficient and sustainable biological production of fuels and high-value chemicals has elicited a wave of in silico techniques for identifying promising novel pathways to these compounds in large putative metabolic networks. To date, these approaches have primarily used general graph search algorithms, which are prohibitively slow as putative metabolic networks may exceed 1 million compounds. To alleviate this limitation, we report two methods, SimIndex (SI) and SimZyme, which use chemical similarity of 2D chemical fingerprints to efficiently navigate large metabolic networks and propose enzymatic connections between the constituent nodes. We also report a Byers-Waterman type pathway search algorithm for further paring down pertinent networks. RESULTS: Benchmarking tests run with SI show it can reduce the number of nodes visited in searching a putative network by 100-fold with a computational time improvement of up to 10^5-fold. Subsequent Byers-Waterman search application further reduces the number of nodes searched by up to 100-fold, while SimZyme demonstrates ∼90% accuracy in matching query substrates with enzymes. Using these modules, we have designed and annotated an alternative to the methylerythritol phosphate pathway to produce isopentenyl pyrophosphate with more favorable thermodynamics than the native pathway. These algorithms will have a significant impact on our ability to use large metabolic networks that lack annotation of promiscuous reactions. AVAILABILITY AND IMPLEMENTATION: Python files will be available for download at http://tyolab.northwestern.edu/tools/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dante A Pertusi
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Andrew E Stine
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Linda J Broadbelt
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Keith E J Tyo
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
|
25
|
Kwon H, Park J, An Y, Sim J, Park S. A smartphone metabolomics platform and its application to the assessment of cisplatin-induced kidney toxicity. Anal Chim Acta 2014; 845:15-22. [DOI: 10.1016/j.aca.2014.08.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 04/30/2014] [Revised: 07/25/2014] [Accepted: 08/05/2014] [Indexed: 11/30/2022]
|
26
|
Scalbert A, Brennan L, Manach C, Andres-Lacueva C, Dragsted LO, Draper J, Rappaport SM, van der Hooft JJJ, Wishart DS. The food metabolome: a window over dietary exposure. Am J Clin Nutr 2014; 99:1286-308. [PMID: 24760973 DOI: 10.3945/ajcn.113.076133] [Citation(s) in RCA: 346] [Impact Index Per Article: 34.6] [Indexed: 12/26/2022] Open
Abstract
The food metabolome is defined as the part of the human metabolome directly derived from the digestion and biotransformation of foods and their constituents. With >25,000 compounds known in various foods, the food metabolome is extremely complex, with a composition varying widely according to the diet. By its very nature it represents a considerable and still largely unexploited source of novel dietary biomarkers that could be used to measure dietary exposures with a high level of detail and precision. Most dietary biomarkers currently have been identified on the basis of our knowledge of food compositions by using hypothesis-driven approaches. However, the rapid development of metabolomics resulting from the development of highly sensitive modern analytic instruments, the availability of metabolite databases, and progress in (bio)informatics has made agnostic approaches more attractive as shown by the recent identification of novel biomarkers of intakes for fruit, vegetables, beverages, meats, or complex diets. Moreover, examples also show how the scrutiny of the food metabolome can lead to the discovery of bioactive molecules and dietary factors associated with diseases. However, researchers still face hurdles, which slow progress and need to be resolved to bring this emerging field of research to maturity. These limits were discussed during the First International Workshop on the Food Metabolome held in Glasgow. Key recommendations made during the workshop included more coordination of efforts; development of new databases, software tools, and chemical libraries for the food metabolome; and shared repositories of metabolomic data. Once achieved, major progress can be expected toward a better understanding of the complex interactions between diet and human health.
Affiliation(s)
- Augustin Scalbert
- Lorraine Brennan
- Claudine Manach
- Cristina Andres-Lacueva
- Lars O Dragsted
- John Draper
- Stephen M Rappaport
- Justin J J van der Hooft
- David S Wishart
- From the International Agency for Research on Cancer, Lyon, France (AS); University College Dublin, Dublin, Ireland (LB); the Institut National de la Recherche Agronomique, Clermont-Ferrand, France (CM); Clermont University, Clermont-Ferrand, France (CM); the University of Barcelona, Barcelona, Spain (CA-L); the University of Copenhagen, Frederiksberg, Denmark (LOD); Aberystwyth University, Aberystwyth, United Kingdom (JD); the University of California, Berkeley, CA (SMR); the University of Glasgow, Glasgow, United Kingdom (JJJvdH); and the University of Alberta, Edmonton, Canada (DSW)
|
27
|
Ridder L, van der Hooft JJJ, Verhoeven S, de Vos RCH, Vervoort J, Bino RJ. In silico prediction and automatic LC-MS(n) annotation of green tea metabolites in urine. Anal Chem 2014; 86:4767-74. [PMID: 24779709 DOI: 10.1021/ac403875b] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 01/05/2023]
Abstract
The colonic breakdown and human biotransformation of small molecules present in food can give rise to a large variety of potentially bioactive metabolites in the human body. However, the absence of reference data for many of these components limits their identification in complex biological samples, such as plasma and urine. We present an in silico workflow for automatic chemical annotation of metabolite profiling data from liquid chromatography coupled with multistage accurate mass spectrometry (LC-MS(n)), which we used to systematically screen for the presence of tea-derived metabolites in human urine samples after green tea consumption. Reaction rules for intestinal degradation and human biotransformation were systematically applied to chemical structures of 75 green tea components, resulting in a virtual library of 27,245 potential metabolites. All matching precursor ions in the urine LC-MS(n) data sets, as well as the corresponding fragment ions, were automatically annotated by in silico generated (sub)structures. The results were evaluated based on 74 previously identified urinary metabolites and led to the putative identification of 26 additional green tea-derived metabolites. A total of 77% of all annotated metabolites were not present in the PubChem database, demonstrating the benefit of in silico metabolite prediction for the automatic annotation of yet unknown metabolites in LC-MS(n) data from nutritional metabolite profiling experiments.
Affiliation(s)
- Lars Ridder
- Laboratory of Biochemistry, Wageningen University, Dreijenlaan 3, 6703 HA, Wageningen, The Netherlands
|