1
Sun X, Xia Y, Zhao X, Wang X, Zhang Y, Jia Z, Zheng F, Li Z, Zhang X, Zhao C, Lu X, Xu G. Deep Characterization of Serum Metabolome Based on the Segment-Optimized Spectral-Stitching Direct-Infusion Fourier Transform Ion Cyclotron Resonance Mass Spectrometry Approach. Anal Chem 2023. [PMID: 37406615 DOI: 10.1021/acs.analchem.2c04995] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/07/2023]
Abstract
Direct-infusion Fourier transform ion cyclotron resonance mass spectrometry (DI-FTICR MS) shows great promise for metabolomic analysis due to ultrahigh mass accuracy and resolution. However, most of the DI-FTICR MS approaches focused on high-throughput metabolomics analysis at the expense of sensitivity and resolution and the potential for metabolome characterization has not been fully explored. Here, we proposed a novel deep characterization approach of serum metabolome using a segment-optimized spectral-stitching DI-FTICR MS method integrated with high-confidence and database-independent formula assignments. With varied acquisition parameters for each segment, a highly efficient acquisition was achieved for the whole mass range with sub-ppm mass accuracy. In a pooled human serum sample, thousands of features were assigned with unambiguous formulas and possible candidates based on highly accurate mass measurements. Furthermore, a reaction network was used to select confidently unique formulas from possible candidates, which was constructed by unambiguous formulas and possible candidates connected by the formula differences resulting from biochemical and MS transformation. Compared with full-range and conventional segment acquisition, 8- and 1.2-fold increases in observed features were achieved, respectively. Assignment accuracy was 93-94% for both a standard mixture containing 190 metabolites and a spiked serum sample with the root mean square mass error of 0.15-0.16 ppm. In total, 3534 unequivocal neutral molecular formulas were assigned in the pooled serum sample, 35% of which are contained in the HMDB. This method offers great enhancement in the deep characterization of serum metabolome by DI-FTICR MS.
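The mass-accuracy figures quoted in this abstract (sub-ppm accuracy, root mean square mass error of 0.15-0.16 ppm) rest on the conventional definition of relative mass error. The following is a minimal sketch of that arithmetic only; the function names and the example m/z values are hypothetical and are not taken from the paper.

```python
import math


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def rms(values) -> float:
    """Root mean square of a sequence of numbers."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical (measured, theoretical) m/z pairs, for illustration only.
pairs = [(200.00002, 200.00000), (400.00004, 400.00008)]
errors = [ppm_error(measured, theoretical) for measured, theoretical in pairs]
rms_error = rms(errors)  # RMS of the per-peak ppm errors
```

Each hypothetical peak here is off by about 0.1 ppm in magnitude, so the RMS error is also about 0.1 ppm.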
Affiliation(s)
- Xiaoshan Sun
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Yueyi Xia
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Xinjie Zhao
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Xinxin Wang
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Yuqing Zhang
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Zhang Dayu School of Chemistry, Dalian University of Technology, Dalian 116024, P.R. China
- Zhen Jia
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Department of Cell Biology, College of Life Sciences, China Medical University, Shenyang 110122 Liaoning, P.R. China
- Fujian Zheng
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Zaifang Li
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Xiuqiong Zhang
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Chunxia Zhao
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Xin Lu
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
- Guowang Xu
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, P.R. China
- University of Chinese Academy of Sciences, Beijing 100049, P.R. China
- Liaoning Province Key Laboratory of Metabolomics, Dalian, Liaoning 116023, P.R. China
2
Morehouse NJ, Clark TN, McMann EJ, van Santen JA, Haeckl FPJ, Gray CA, Linington RG. Annotation of natural product compound families using molecular networking topology and structural similarity fingerprinting. Nat Commun 2023; 14:308. [PMID: 36658161 PMCID: PMC9852437 DOI: 10.1038/s41467-022-35734-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 12/20/2022] [Indexed: 01/20/2023] Open
Abstract
Spectral matching of MS2 fragmentation spectra has become a popular method for characterizing natural products libraries but identification remains challenging due to differences in MS2 fragmentation properties between instruments and the low coverage of current spectral reference libraries. To address this bottleneck we present Structural similarity Network Annotation Platform for Mass Spectrometry (SNAP-MS) which matches chemical similarity grouping in the Natural Products Atlas to grouping of mass spectrometry features from molecular networking. This approach assigns compound families to molecular networking subnetworks without the need for experimental or calculated reference spectra. We demonstrate SNAP-MS can accurately annotate subnetworks built from both reference spectra and an in-house microbial extract library, and correctly predict compound families from published molecular networks acquired on a range of MS instrumentation. Compound family annotations for the microbial extract library are validated by co-injection of standards or isolation and spectroscopic analysis. SNAP-MS is freely available at www.npatlas.org/discover/snapms .
Affiliation(s)
- Nicholas J Morehouse
- Department of Biological Sciences, University of New Brunswick, Saint John, NB, Canada
- Trevor N Clark
- Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- Emily J McMann
- Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- F P Jake Haeckl
- Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- Christopher A Gray
- Department of Biological Sciences, University of New Brunswick, Saint John, NB, Canada
- Department of Chemistry, University of New Brunswick, Fredericton, NB, Canada
- Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
3
An Open-Source Pipeline for Processing Direct Infusion Mass Spectrometry Data of the Human Plasma Metabolome. Metabolites 2022; 12:metabo12080768. [PMID: 36005640 PMCID: PMC9415960 DOI: 10.3390/metabo12080768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/25/2022] [Accepted: 08/18/2022] [Indexed: 11/30/2022] Open
Abstract
Direct infusion mass spectrometry (DIMS) is growing in popularity as an effective method for the screening of biological samples in clinical metabolomics. Although quick to execute, DIMS generally requires special skills to interpret the measurement results. Exploiting the similarities between two-dimensional electrospray ionization with quadrupole time-of-flight (ESI-QTOF) and matrix-assisted laser desorption/ionization (MALDI) mass spectra, we tested a pipeline for processing QTOF mass spectra with open-source packages (MALDIquant, MSnbase and MetaboAnalystR). Previously, all algorithmic workflows relied on software either provided by a vendor or privately developed by enthusiasts. Here, we computationally examined two ways of interpreting the DIMS results of human blood metabolomic profiling. The studied spectra were acquired using ESI-QTOF maXis Impact II (Bruker Daltonics, Billerica, MA, USA), then pre-processed using COMPASS/DataAnalysis commercial software and mapped onto the metabolites using in-lab-developed MatLab scripts. Alternatively, in this work we used the open-source packages MALDIquant, for spectrum pre-processing, and MetaboAnalystR, for data interpretation, instead of the low-availability commercial and home-made tools. Using a set of 100 plasma samples (20 from volunteers with normal body mass index and 80 from patients at different stages of obesity), we observed a high degree of concordance in annotated metabolic pathways between the proprietary DataAnalysis/MatLab pipeline and our freely available solution.
4
Chen L, Lu W, Wang L, Xing X, Chen Z, Teng X, Zeng X, Muscarella AD, Shen Y, Cowan A, McReynolds MR, Kennedy BJ, Lato AM, Campagna SR, Singh M, Rabinowitz JD. Metabolite discovery through global annotation of untargeted metabolomics data. Nat Methods 2021; 18:1377-1385. [PMID: 34711973 PMCID: PMC8733904 DOI: 10.1038/s41592-021-01303-3] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 01/07/2021] [Accepted: 09/16/2021] [Indexed: 11/08/2022]
Abstract
Liquid chromatography-high-resolution mass spectrometry (LC-MS)-based metabolomics aims to identify and quantify all metabolites, but most LC-MS peaks remain unidentified. Here we present a global network optimization approach, NetID, to annotate untargeted LC-MS metabolomics data. The approach aims to generate, for all experimentally observed ion peaks, annotations that match the measured masses, retention times and (when available) tandem mass spectrometry fragmentation patterns. Peaks are connected based on mass differences reflecting adduction, fragmentation, isotopes, or feasible biochemical transformations. Global optimization generates a single network linking most observed ion peaks, enhances peak assignment accuracy, and produces chemically informative peak-peak relationships, including for peaks lacking tandem mass spectrometry spectra. Applying this approach to yeast and mouse data, we identified five previously unrecognized metabolites (thiamine derivatives and N-glucosyl-taurine). Isotope tracer studies indicate active flux through these metabolites. Thus, NetID applies existing metabolomic knowledge and global optimization to substantially improve annotation coverage and accuracy in untargeted metabolomics datasets, facilitating metabolite discovery.
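The abstract describes connecting peaks by mass differences that reflect adducts, isotopes, or biochemical transformations. The sketch below illustrates only that candidate-edge step, under stated assumptions: it is not the NetID implementation (which also uses retention times, MS/MS data and global optimization), and the function name, the short difference table, the 5 ppm tolerance and the peak list are all hypothetical.

```python
from itertools import combinations

# Monoisotopic mass differences (Da) for a few common relationships.
# Illustrative selection only; real tools use much larger curated lists.
KNOWN_DIFFERENCES = {
    "13C isotope": 1.003355,
    "CH2": 14.015650,
    "H2O": 18.010565,
    "Na-H adduct exchange": 21.981945,
}


def candidate_edges(masses, tol_ppm=5.0):
    """Return (i, j, label) for peak pairs whose mass difference matches a
    known difference within tol_ppm of the heavier peak's mass."""
    edges = []
    for i, j in combinations(range(len(masses)), 2):
        lighter, heavier = sorted((masses[i], masses[j]))
        tol_da = heavier * tol_ppm * 1e-6  # ppm tolerance converted to Da
        for label, delta in KNOWN_DIFFERENCES.items():
            if abs((heavier - lighter) - delta) <= tol_da:
                edges.append((i, j, label))
    return edges


# Hypothetical peak list, for illustration only.
peaks = [180.063400, 181.066755, 198.073965]
edges = candidate_edges(peaks)
```

In this toy list, the first peak is linked to the second by a 13C isotope spacing and to the third by a water difference; the second and third peaks share no listed difference and stay unconnected.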
Affiliation(s)
- Li Chen
- Shanghai Key Laboratory of Metabolic Remodeling and Health, Institute of Metabolism and Integrative Biology, Fudan University, Shanghai, China
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Wenyun Lu
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Lin Wang
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Xi Xing
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Ziyang Chen
- Shanghai Key Laboratory of Metabolic Remodeling and Health, Institute of Metabolism and Integrative Biology, Fudan University, Shanghai, China
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Xin Teng
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Xianfeng Zeng
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Antonio D Muscarella
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Yihui Shen
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Alexis Cowan
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Melanie R McReynolds
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Brandon J Kennedy
- Lotus Separations, LLC, Department of Chemistry, Princeton University, Princeton, NJ, USA
- Ashley M Lato
- Department of Chemistry, The University of Tennessee at Knoxville, Knoxville, TN, USA
- Shawn R Campagna
- Department of Chemistry, The University of Tennessee at Knoxville, Knoxville, TN, USA
- Mona Singh
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Joshua D Rabinowitz
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Department of Chemistry, Princeton University, Princeton, NJ, USA
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Ludwig Institute for Cancer Research, Princeton Branch, Princeton, NJ, USA
5
Maia M, Figueiredo A, Cordeiro C, Sousa Silva M. FT-ICR-MS-based metabolomics: A deep dive into plant metabolism. Mass Spectrom Rev 2021. [PMID: 34545595 DOI: 10.1002/mas.21731] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/07/2021] [Revised: 08/30/2021] [Accepted: 09/09/2021] [Indexed: 06/13/2023]
Abstract
Metabolomics involves the identification and quantification of metabolites to unravel the chemical footprints behind cellular regulatory processes and to decipher metabolic networks, opening new insights into the correlation between genes and metabolites. Plants are estimated to contain hundreds of thousands of metabolites, most of which are still unknown. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is a powerful analytical technique to tackle such challenges. The resolving power and sensitivity of this ultrahigh mass accuracy mass analyzer are such that complex mixtures, such as plant extracts, can be analyzed and thousands of metabolite signals can be detected simultaneously and distinguished based on the naturally abundant elemental isotopes. In this review, FT-ICR-MS-based plant metabolomics studies are described, emphasizing the increasing applications of FT-ICR-MS in plant science through targeted and untargeted approaches, allowing for a better understanding of plant development, responses to biotic and abiotic stresses, and the discovery of new natural nutraceutical compounds. Improved metabolite extraction protocols compatible with FT-ICR-MS, metabolite analysis methods and metabolite identification platforms are also explored, as well as new in silico approaches. The most recent advances in MS imaging are also discussed.
Affiliation(s)
- Marisa Maia
- Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
- Departamento de Biologia Vegetal, Faculdade de Ciências, Grapevine Pathogen Systems Lab (GPS Lab), Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Lisboa, Portugal
- Andreia Figueiredo
- Departamento de Biologia Vegetal, Faculdade de Ciências, Grapevine Pathogen Systems Lab (GPS Lab), Biosystems and Integrative Sciences Institute (BioISI), Universidade de Lisboa, Lisboa, Portugal
- Carlos Cordeiro
- Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
- Marta Sousa Silva
- Departamento de Química e Bioquímica, Laboratório de FTICR e Espectrometria de Massa Estrutural, MARE-Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
6
Lokhov PG, Maslov DL, Lichtenberg S, Trifonova OP, Balashova EE. Holistic Metabolomic Laboratory-Developed Test (LDT): Development and Use for the Diagnosis of Early-Stage Parkinson's Disease. Metabolites 2020; 11:metabo11010014. [PMID: 33383698 PMCID: PMC7824177 DOI: 10.3390/metabo11010014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/09/2020] [Revised: 12/24/2020] [Accepted: 12/24/2020] [Indexed: 02/04/2023] Open
Abstract
A laboratory-developed test (LDT) is a type of in vitro diagnostic test that is developed and used within a single laboratory. The holistic metabolomic LDT integrating the currently available data on human metabolic pathways, changes in the concentrations of low-molecular-weight compounds in the human blood during diseases and other conditions, and their prevalent location in the body was developed. That is, the LDT uses all of the accumulated metabolic data relevant for disease diagnosis and high-resolution mass spectrometry with data processing by in-house software. In this study, the LDT was applied to diagnose early-stage Parkinson's disease (PD), which currently lacks available laboratory tests. The use of the LDT for blood plasma samples confirmed its ability for such diagnostics with 73% accuracy. The diagnosis was based on relevant data, such as the detection of overrepresented metabolite sets associated with PD and other neurodegenerative diseases. Additionally, the ability of the LDT to detect normal composition of low-molecular-weight compounds in blood was demonstrated, thus providing a definition of healthy at the molecular level. This LDT approach as a screening tool can be used for the further widespread testing for other diseases, since 'omics' tests, to which the metabolomic LDT belongs, cover a variety of them.
Affiliation(s)
- Petr G. Lokhov
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Correspondence:
- Dmitry L. Maslov
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Steven Lichtenberg
- Metabometrics, Inc, 651 N Broad St, Suite 205 #1370, Middletown, DE 19709, USA
- Oxana P. Trifonova
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Elena E. Balashova
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
7
Desmet S, Brouckaert M, Boerjan W, Morreel K. Seeing the forest for the trees: Retrieving plant secondary biochemical pathways from metabolome networks. Comput Struct Biotechnol J 2020; 19:72-85. [PMID: 33384856 PMCID: PMC7753198 DOI: 10.1016/j.csbj.2020.11.050] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/04/2020] [Revised: 11/26/2020] [Accepted: 11/28/2020] [Indexed: 02/06/2023] Open
Abstract
Over the last decade, a giant leap forward has been made in resolving the main bottleneck in metabolomics, i.e., the structural characterization of the many unknowns. This has led to the next challenge in this research field: retrieving biochemical pathway information from the various types of networks that can be constructed from metabolome data. Searching putative biochemical pathways, referred to as biotransformation paths, is complicated because several flaws occur during the construction of metabolome networks. Multiple network analysis tools have been developed to deal with these flaws, while in silico retrosynthesis is appearing as an alternative approach. In this review, the different types of metabolome networks, their flaws, and the various tools to trace these biotransformation paths are discussed.
Affiliation(s)
- Sandrien Desmet
- Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
- Marlies Brouckaert
- Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
- Wout Boerjan
- Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
- Kris Morreel
- Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
8
Ludwig M, Nothias LF, Dührkop K, Koester I, Fleischauer M, Hoffmann MA, Petras D, Vargas F, Morsy M, Aluwihare L, Dorrestein PC, Böcker S. Database-independent molecular formula annotation using Gibbs sampling through ZODIAC. Nat Mach Intell 2020. [DOI: 10.1038/s42256-020-00234-6] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/19/2022]
9
Lokhov PG, Trifonova OP, Maslov DL, Lichtenberg S, Balashova EE. Diagnosis of Parkinson's Disease by A Metabolomics-Based Laboratory-Developed Test (LDT). Diagnostics (Basel) 2020; 10:diagnostics10050332. [PMID: 32455603 PMCID: PMC7277951 DOI: 10.3390/diagnostics10050332] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/10/2020] [Revised: 04/29/2020] [Accepted: 05/19/2020] [Indexed: 01/02/2023] Open
Abstract
A laboratory-developed test (LDT) is a type of in vitro diagnostic test that is designed, manufactured and used in the same laboratory (i.e., an in-house test). In this study, a metabolomics-based LDT was developed. This test involves blood plasma preparation, direct-infusion mass spectrometry analysis with a high-resolution mass spectrometer, alignment and normalization of mass peaks using original algorithms, metabolite annotation by a biochemical context-driven algorithm, detection of overrepresented metabolic pathways, and visualization of the results as a cloud of pathway names. The LDT was applied to detect early-stage Parkinson’s disease (PD), the diagnosis of which currently requires great effort due to the lack of available laboratory tests. In a case–control study (n = 56), the LDT revealed a statistically sound pattern in the PD-relevant pathways. Applying the LDT to individuals confirmed its ability to reveal this pattern and thus diagnose PD at an early stage (stages 1–2.5 on the Hoehn and Yahr scale). The detection of this pattern by the LDT could diagnose PD with a specificity of 64%, sensitivity of 86% and an accuracy of 75%. Thus, this LDT can be used for further widespread testing.
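The reported specificity, sensitivity and accuracy are ratios of confusion-matrix counts. The sketch below shows that arithmetic. The abstract gives only n = 56 and the three percentages, so the split into 28 cases and 28 controls and the individual counts are assumptions, chosen solely because they reproduce the reported values; they should not be read as the study's actual numbers.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positives among cases
        "specificity": tn / (tn + fp),          # true negatives among controls
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# ASSUMPTION: 28 cases and 28 controls (n = 56). The abstract does not give
# the split or the counts; these were picked only to match its percentages.
metrics = diagnostic_metrics(tp=24, fn=4, tn=18, fp=10)
rounded = {name: round(value * 100) for name, value in metrics.items()}
```

With these hypothetical counts, the rounded percentages are 86 (sensitivity), 64 (specificity) and 75 (accuracy).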
Affiliation(s)
- Petr G. Lokhov
- Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Correspondence:
- Oxana P. Trifonova
- Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Dmitry L. Maslov
- Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Steven Lichtenberg
- Metabometrics, Inc., 651 N Broad St., Suite 205 #1370, Middletown, DE 19709, USA
- Elena E. Balashova
- Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya Street, 119121 Moscow, Russia
10
Hosseini R, Hassanpour N, Liu LP, Hassoun S. Pathway-Activity Likelihood Analysis and Metabolite Annotation for Untargeted Metabolomics Using Probabilistic Modeling. Metabolites 2020; 10:E183. [PMID: 32375258 PMCID: PMC7281100 DOI: 10.3390/metabo10050183] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/07/2020] [Revised: 04/19/2020] [Accepted: 04/27/2020] [Indexed: 12/22/2022] Open
Abstract
Motivation: Untargeted metabolomics comprehensively characterizes small molecules and elucidates activities of biochemical pathways within a biological sample. Despite computational advances, interpreting collected measurements and determining their biological role remains a challenge. Results: To interpret measurements, we present an inference-based approach, termed Probabilistic modeling for Untargeted Metabolomics Analysis (PUMA). Our approach captures metabolomics measurements and the biological network for the biological sample under study in a generative model and uses stochastic sampling to compute posterior probability distributions. PUMA predicts the likelihood of pathways being active, and then derives probabilistic annotations, which assign chemical identities to measurements. Unlike prior pathway analysis tools that analyze differentially active pathways, PUMA defines a pathway as active if the likelihood that the pathway generated the observed measurements is above a particular (user-defined) threshold. Due to the lack of "ground truth" metabolomics datasets, where all measurements are annotated and pathway activities are known, PUMA is validated on synthetic datasets that are designed to mimic cellular processes. PUMA, on average, outperforms pathway enrichment analysis by 8%. PUMA is applied to two case studies. PUMA suggests many biologically meaningful pathways as active. Annotation results were in agreement with those obtained using other tools that utilize additional information in the form of spectral signatures. Importantly, PUMA annotates many measurements, suggesting 23 chemical identities for metabolites that were previously only identified as isomers, and a significant number of additional putative annotations over spectral database lookups. For an experimentally validated 50-compound dataset, annotations using PUMA yielded 0.833 precision and 0.676 recall.
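Precision and recall, as reported at the end of this abstract, are often summarized by their harmonic mean (the F1 score). The sketch below shows the standard definitions; the F1 value is derived here from the two reported figures and is not stated in the abstract, and the toy counts passed to `precision_recall` are hypothetical.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision and recall from true-positive, false-positive and
    false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy counts, only to show how the two ratios are formed.
toy_precision, toy_recall = precision_recall(tp=5, fp=1, fn=3)

# F1 computed from the precision and recall quoted in the abstract.
f1_reported = f1_score(0.833, 0.676)
```

For the quoted precision of 0.833 and recall of 0.676, this gives an F1 of roughly 0.746.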
11
Lokhov PG, Balashova EE, Trifonova OP, Maslov DL, Ponomarenko EA, Archakov AI. Mass Spectrometry-Based Metabolomics Analysis of Obese Patients' Blood Plasma. Int J Mol Sci 2020; 21:E568. [PMID: 31952343 PMCID: PMC7014187 DOI: 10.3390/ijms21020568] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/25/2019] [Revised: 01/09/2020] [Accepted: 01/14/2020] [Indexed: 12/31/2022] Open
Abstract
Scientists currently use only a small portion of the information contained in the blood metabolome. The identification of metabolites is a huge challenge because only highly abundant and well-separated compounds can be easily identified in complex samples. However, new approaches that enhance the identification of compounds have emerged; among them, the identification of compounds based on their involvement in a particular biological context is a recent development. In this work, this approach was first applied to identify metabolites in complex samples and, together with metabolite set enrichment analysis, was used for the evaluation of blood plasma from obese patients. The proposed approach was found to provide a statistically sound overview of the biochemical pathways, thus presenting additional information on obesity. Obesity progression was demonstrated to be accompanied by marked alterations in steroidogenesis, androstenedione metabolism, and androgen and estrogen metabolism. The findings of this study suggest that the workflow used for blood analysis is sufficient to demonstrate obesity at the biochemical pathway level as well as to monitor the response to treatment. This workflow is also expected to be suitable for studying other metabolic diseases.
Affiliation(s)
- Petr G. Lokhov
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Metabometrics Inc, 651 N Broad St, Suite 205 #1370, Middletown, DE 19709, USA
- Elena E. Balashova
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Metabometrics Inc, 651 N Broad St, Suite 205 #1370, Middletown, DE 19709, USA
- Oxana P. Trifonova
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Dmitry L. Maslov
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Elena A. Ponomarenko
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
- Alexander I. Archakov
- Institute of Biomedical Chemistry, 10 Building 8, Pogodinskaya Street, 119121 Moscow, Russia
| |
|
12
|
Chevalier M, Ricart E, Hanozin E, Pupin M, Jacques P, Smargiasso N, De Pauw E, Lisacek F, Leclère V, Flahaut C. Kendrick Mass Defect Approach Combined to NORINE Database for Molecular Formula Assignment of Nonribosomal Peptides. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2019; 30:2608-2616. [PMID: 31659720 DOI: 10.1007/s13361-019-02314-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/12/2019] [Revised: 07/03/2019] [Accepted: 08/10/2019] [Indexed: 06/10/2023]
Abstract
The identification of known (dereplication) or unknown nonribosomal peptides (NRPs) produced by microorganisms is a time consuming, expensive, and challenging task where mass spectrometry and nuclear magnetic resonance play a key role. The first step of the identification process always involves the establishment of a molecular formula. Unfortunately, the number of potential molecular formulae increases significantly with higher molecular masses and the lower precision of their measurements. In the present article, we demonstrate that molecular formula assignment can be achieved by a combined approach using the regular Kendrick mass defect (RKMD) and NORINE, the reference curated database of NRPs. We observed that irrespective of the molecular formula, the addition and subtraction of a given atom or atom group always leads to the same RKMD variation and nominal Kendrick mass (NKM). Graphically, these variations translated into a vector mesh can be used to connect an unknown molecule to a known NRP of the NORINE database and establish its molecular formula. We explain and illustrate this concept through the high-resolution mass spectrometry analysis of a commercially available mixture composed of four surfactins. The Kendrick approach enriched with the NORINE database content is a fast, useful, and easy-to-use tool for molecular mass assignment of known and unknown NRP structures.
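The invariance described in this abstract can be illustrated with the classical CH2-based Kendrick rescaling. The sketch below is a minimal illustration under stated assumptions, not the authors' RKMD implementation: the rounding convention, the sign of the defect, and the example mass are all hypothetical choices made here.

```python
# Monoisotopic mass of the CH2 repeat unit (12C + 2 x 1H), in Da.
CH2_EXACT = 12.0 + 2 * 1.00782503
CH2_NOMINAL = 14.0


def kendrick_mass(mass: float) -> float:
    """Rescale an IUPAC mass so that CH2 weighs exactly 14."""
    return mass * CH2_NOMINAL / CH2_EXACT


def nominal_kendrick_mass(mass: float) -> int:
    """Nearest-integer Kendrick mass (one common convention; an assumption)."""
    return round(kendrick_mass(mass))


def kendrick_mass_defect(mass: float) -> float:
    """Nominal Kendrick mass minus Kendrick mass (sign convention assumed)."""
    return nominal_kendrick_mass(mass) - kendrick_mass(mass)


# Hypothetical neutral mass and its homologue carrying one extra CH2 group.
parent = 500.3
homologue = parent + CH2_EXACT

# Adding CH2 shifts the nominal Kendrick mass by 14 and leaves the defect unchanged.
shift_in_defect = kendrick_mass_defect(homologue) - kendrick_mass_defect(parent)
shift_in_nominal = nominal_kendrick_mass(homologue) - nominal_kendrick_mass(parent)
```

Homologues that differ only by CH2 units therefore line up at the same defect, which is what makes it possible to connect an unknown to a known compound by a fixed vector.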
Affiliation(s)
- Mickaël Chevalier
- Univ. Lille, INRA, ISA, Univ. Artois, Univ. Littoral Côte d'Opale, EA 7394-Institut Charles Viollette (ICV), F-59000, Lille, France
| | - Emma Ricart
- Proteome informatics Group, SIB Swiss Institute of Bioinformatics (SIB), and Computer Science Department, University of Geneva, Geneva, Switzerland
| | - Emeline Hanozin
- Mass Spectrometry Laboratory, Molecular Systems - MolSys Research Unit, University of Liège, Liège, Belgium
| | - Maude Pupin
- Univ. Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000, Lille, France
- Inria-Lille Nord Europe, Bonsai team, F-59655, Villeneuve d'Ascq Cedex, France
| | - Philippe Jacques
- TERRA Research Centre, Microbial Processes and Interactions (MiPI), Gembloux Agro-Bio Tech University of Liège, B-5030, Gembloux, Belgium
| | - Nicolas Smargiasso
- Mass Spectrometry Laboratory, Molecular Systems - MolSys Research Unit, University of Liège, Liège, Belgium
| | - Edwin De Pauw
- Mass Spectrometry Laboratory, Molecular Systems - MolSys Research Unit, University of Liège, Liège, Belgium
| | - Frédérique Lisacek
- Proteome informatics Group, SIB Swiss Institute of Bioinformatics (SIB), and Computer Science Department, University of Geneva, Geneva, Switzerland
| | - Valérie Leclère
- Univ. Lille, INRA, ISA, Univ. Artois, Univ. Littoral Côte d'Opale, EA 7394-Institut Charles Viollette (ICV), F-59000, Lille, France
| | - Christophe Flahaut
- Univ. Lille, INRA, ISA, Univ. Artois, Univ. Littoral Côte d'Opale, EA 7394-Institut Charles Viollette (ICV), F-59000, Lille, France.
| |
|
13
|
Del Carratore F, Schmidt K, Vinaixa M, Hollywood KA, Greenland-Bews C, Takano E, Rogers S, Breitling R. Integrated Probabilistic Annotation: A Bayesian-Based Annotation Method for Metabolomic Profiles Integrating Biochemical Connections, Isotope Patterns, and Adduct Relationships. Anal Chem 2019; 91:12799-12807. [PMID: 31509381 DOI: 10.1021/acs.analchem.9b02354] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/08/2023]
Abstract
In a typical untargeted metabolomics experiment, the huge amount of complex data generated by mass spectrometry necessitates automated tools for the extraction of useful biological information. Each metabolite generates numerous mass spectrometry features. The association of these experimental features to the underlying metabolites still represents one of the major bottlenecks in metabolomics data processing. While certain identification (e.g., by comparison to authentic standards) is always desirable, it is usually achievable only for a limited number of compounds, and scientists often deal with a significant amount of putatively annotated metabolites. The confidence in a specific annotation is usually assessed by considering different sources of information (e.g., isotope patterns, adduct formation, chromatographic retention times, and fragmentation patterns). IPA (integrated probabilistic annotation) offers a rigorous and reproducible method to automatically annotate metabolite profiles and evaluate the resulting confidence of the putative annotations. It is able to provide a rigorous measure of our confidence in any putative annotation and is also able to update and refine our beliefs (i.e., background prior knowledge) by incorporating different sources of information in the annotation process, such as isotope patterns, adduct formation and biochemical relations. The IPA package is freely available on GitHub ( https://github.com/francescodc87/IPA ), together with the related extensive documentation.
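The idea of updating prior beliefs with new evidence can be sketched with a single evidence source, the mass error. This is a simplified illustration and not the IPA package's model or API: the function name, the Gaussian error model, the `ppm_sigma` value, and the candidate masses are all assumptions introduced here.

```python
import math


def posterior_over_candidates(observed_mz, candidates, ppm_sigma=1.0):
    """Return normalized posterior probabilities for candidate annotations.

    candidates: iterable of (name, theoretical_mz, prior) tuples.
    ppm_sigma: assumed standard deviation of the mass error, in ppm.
    """
    weights = {}
    for name, theoretical_mz, prior in candidates:
        ppm_error = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
        # Gaussian likelihood of the observed mass error (unnormalized).
        likelihood = math.exp(-0.5 * (ppm_error / ppm_sigma) ** 2)
        weights[name] = prior * likelihood
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no candidate is compatible with the observed m/z")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical feature with two equally likely candidates a priori.
candidates = [
    ("candidate_A", 180.06339, 0.5),
    ("candidate_B", 180.06400, 0.5),
]
posterior = posterior_over_candidates(180.06345, candidates)
```

Further evidence, such as isotope patterns or adduct relationships, could be folded in the same way by multiplying in additional likelihood terms before normalizing.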
Affiliation(s)
- Francesco Del Carratore
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Kamila Schmidt
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Maria Vinaixa
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Katherine A Hollywood
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Caitlin Greenland-Bews
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Eriko Takano
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| | - Simon Rogers
- School of Computing Science, University of Glasgow, Glasgow, G12 8RZ, U.K.
| | - Rainer Breitling
- Manchester Institute of Biotechnology, Faculty of Science and Engineering, University of Manchester, Manchester, M1 7DN, U.K.
| |
|
14
|
Rashmi D, Barvkar VT, Nadaf A, Mundhe S, Kadoo NY. Integrative omics analysis in Pandanus odorifer (Forssk.) Kuntze reveals the role of Asparagine synthetase in salinity tolerance. Sci Rep 2019; 9:932. [PMID: 30700750 PMCID: PMC6353967 DOI: 10.1038/s41598-018-37039-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/12/2018] [Accepted: 11/30/2018] [Indexed: 11/12/2022] Open
Abstract
Pandanus odorifer (Forssk.) Kuntze grows naturally along coastal regions and withstands salt sprays as well as strong winds. A combination of omics approaches and enzyme activity studies was employed to understand the mechanistic basis of high salinity tolerance in P. odorifer. Young seedlings of P. odorifer were exposed to 1 M salt stress for up to three weeks and analyzed using RNA sequencing (RNAseq) and LC-MS. Integrative omics analysis revealed high expression of asparagine synthetase (AS) (EC 6.3.5.4) (8.95-fold) and remarkable levels of asparagine (Asn) (28.5-fold). This indicated that salt stress promoted Asn accumulation in P. odorifer. To understand this further, the Asn biosynthesis pathway was traced in P. odorifer. Seven genes involved in the Asn biosynthetic pathway, namely glutamine synthetase (GS) (EC 6.3.1.2), glutamate synthase (GOGAT) (EC 1.4.1.14), aspartate kinase (EC 2.7.2.4), pyruvate kinase (EC 2.7.1.40), aspartate aminotransferase (AspAT) (EC 2.6.1.1), phosphoenolpyruvate carboxylase (PEPC) (EC 4.1.1.31), and AS, were up-regulated under salt stress. AS transcripts were the most abundant, consistent with AS showing the highest activity and generating maximal Asn under salt stress. In addition, an up-regulated Na+/H+ antiporter (NHX1) facilitated compartmentalization of Na+ into vacuoles, suggesting that P. odorifer is a salt-accumulator species.
Affiliation(s)
- Deo Rashmi
- Department of Botany, Savitribai Phule Pune University, Pune, 411007, India
| | - Vitthal T Barvkar
- Department of Botany, Savitribai Phule Pune University, Pune, 411007, India.
| | - Altafhusain Nadaf
- Department of Botany, Savitribai Phule Pune University, Pune, 411007, India.
| | - Swapnil Mundhe
- Biochemical Sciences Division, CSIR-National Chemical Laboratory, Pune, 411008, India
| | - Narendra Y Kadoo
- Biochemical Sciences Division, CSIR-National Chemical Laboratory, Pune, 411008, India
| |
|
15
|
Ludwig M, Dührkop K, Böcker S. Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints. Bioinformatics 2018; 34:i333-i340. [PMID: 29949965 PMCID: PMC6022630 DOI: 10.1093/bioinformatics/bty245] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022] Open
Abstract
Motivation: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint. Results: We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points. Availability and implementation: The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/).
Affiliation(s)
- Marcus Ludwig
- Chair for Bioinformatics, Friedrich-Schiller-University, Jena, Germany
| | - Kai Dührkop
- Chair for Bioinformatics, Friedrich-Schiller-University, Jena, Germany
| | - Sebastian Böcker
- Chair for Bioinformatics, Friedrich-Schiller-University, Jena, Germany
| |
|
16
|
da Silva RR, Wang M, Nothias LF, van der Hooft JJJ, Caraballo-Rodríguez AM, Fox E, Balunas MJ, Klassen JL, Lopes NP, Dorrestein PC. Propagating annotations of molecular networks using in silico fragmentation. PLoS Comput Biol 2018; 14:e1006089. [PMID: 29668671 PMCID: PMC5927460 DOI: 10.1371/journal.pcbi.1006089] [Citation(s) in RCA: 189] [Impact Index Per Article: 31.5] [Received: 11/22/2017] [Revised: 04/30/2018] [Accepted: 03/13/2018] [Indexed: 12/19/2022] Open
Abstract
The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
Affiliation(s)
- Ricardo R. da Silva
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- NPPNS, Department of Physics and Chemistry, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, SP, Brazil
| | - Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
| | - Louis-Félix Nothias
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
| | - Justin J. J. van der Hooft
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- Bioinformatics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands
| | - Andrés Mauricio Caraballo-Rodríguez
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
| | - Evan Fox
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, United States of America
| | - Marcy J. Balunas
- Division of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Connecticut, Storrs, CT, United States of America
| | - Jonathan L. Klassen
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, United States of America
| | - Norberto Peporine Lopes
- NPPNS, Department of Physics and Chemistry, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, SP, Brazil
| | - Pieter C. Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
| |
|
17
|
Domingo-Almenara X, Montenegro-Burke JR, Benton HP, Siuzdak G. Annotation: A Computational Solution for Streamlining Metabolomics Analysis. Anal Chem 2018; 90:480-489. [PMID: 29039932 PMCID: PMC5750104 DOI: 10.1021/acs.analchem.7b03929] [Citation(s) in RCA: 105] [Impact Index Per Article: 17.5] [Indexed: 12/17/2022]
Abstract
Metabolite identification is still considered an imposing bottleneck in liquid chromatography mass spectrometry (LC/MS) untargeted metabolomics. The identification workflow usually begins with detecting relevant LC/MS peaks via peak-picking algorithms and retrieving putative identities based on accurate mass searching. However, accurate mass search alone provides poor evidence for metabolite identification. For this reason, computational annotation is used to reveal the underlying metabolites' monoisotopic masses, improving putative identification in addition to confirmation with tandem mass spectrometry. This review examines LC/MS data from a computational and analytical perspective, focusing on the occurrence of neutral losses and in-source fragments, to understand the challenges in computational annotation methodologies. Herein, we examine the state-of-the-art strategies for computational annotation including: (i) peak grouping or full scan (MS1) pseudo-spectra extraction, i.e., clustering all mass spectral signals stemming from each metabolite; (ii) annotation using ion adduction and mass distance among ion peaks; (iii) incorporation of biological knowledge such as biotransformations or pathways; (iv) tandem MS data; and (v) metabolite retention time calibration, usually achieved by prediction from molecular descriptors. Advantages and pitfalls of each of these strategies are discussed, as well as expected future trends in computational annotation.
Affiliation(s)
- Xavier Domingo-Almenara
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - J Rafael Montenegro-Burke
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - H Paul Benton
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Gary Siuzdak
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| |
|
18
|
Godzien J, Gil de la Fuente A, Otero A, Barbas C. Metabolite Annotation and Identification. COMPREHENSIVE ANALYTICAL CHEMISTRY 2018. [DOI: 10.1016/bs.coac.2018.07.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
|
19
|
Dowsey AW. The need for statistical contributions to bioinformatics at scale, with illustration to mass spectrometry. STAT MODEL 2017. [DOI: 10.1177/1471082x17708519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
In their article, Morris and Baladandayuthapani clearly evidence the influence of statisticians in recent methodological advances throughout the bioinformatics pipeline and advocate for the expansion of this role. The latest acquisition platforms, such as next generation sequencing (genomics/transcriptomics) and hyphenated mass spectrometry (proteomics/metabolomics), output raw datasets in the order of gigabytes; it is not unusual to acquire a terabyte or more of data per study. The increasing computational burden this brings is a further impediment against the use of statistically rigorous methodology in the pre-processing stages of the bioinformatics pipeline. In this discussion I describe the mass spectrometry pipeline and use it as an example to show that beneath this challenge lies a two-fold opportunity: (a) Biological complexity and dynamic range is still well beyond what is captured by current processing methodology; hence, potential biomarkers and mechanistic insights are consistently missed; (b) Statistical science could play a larger role in optimizing the acquisition process itself. Data rates will continue to increase as routine clinical omics analysis moves to large-scale facilities with systematic, standardized protocols. Key inferential gains will be achieved by borrowing strength across the sum total of all analyzed studies, a task best underpinned by appropriate statistical modelling.
Affiliation(s)
- Andrew W Dowsey
- School of Social & Community Medicine and School of Veterinary Sciences, Faculty of Health Sciences, University of Bristol, United Kingdom
| |
|
20
|
Perez de Souza L, Naake T, Tohge T, Fernie AR. From chromatogram to analyte to metabolite. How to pick horses for courses from the massive web resources for mass spectral plant metabolomics. Gigascience 2017; 6:1-20. [PMID: 28520864 PMCID: PMC5499862 DOI: 10.1093/gigascience/gix037] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 02/24/2017] [Revised: 05/08/2017] [Accepted: 05/12/2017] [Indexed: 01/19/2023] Open
Abstract
The grand challenge currently facing metabolomics is the expansion of the coverage of the metabolome from a minor percentage of the metabolic complement of the cell toward the level of coverage afforded by other post-genomic technologies such as transcriptomics and proteomics. In plants, this problem is exacerbated by the sheer diversity of chemicals that constitute the metabolome, with the number of metabolites in the plant kingdom generally considered to be in excess of 200 000. In this review, we focus on web resources that can be exploited in order to improve analyte and ultimately metabolite identification and quantification. There is a wide range of available software that not only aids in this but also in the related area of peak alignment; however, for the uninitiated, choosing which program to use is a daunting task. For this reason, we provide an overview of the pros and cons of the software as well as comments regarding the level of programing skills required to effectively exploit their basic functions. In addition, the torrent of available genome and transcriptome sequences that followed the advent of next-generation sequencing has opened up further valuable resources for metabolite identification. All things considered, we posit that only via a continued communal sharing of information such as that deposited in the databases described within the article are we likely to be able to make significant headway toward improving our coverage of the plant metabolome.
Affiliation(s)
- Leonardo Perez de Souza
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Thomas Naake
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Takayuki Tohge
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| |
|
21
|
Moritz F, Kaling M, Schnitzler JP, Schmitt-Kopplin P. Characterization of poplar metabotypes via mass difference enrichment analysis. PLANT, CELL & ENVIRONMENT 2017; 40:1057-1073. [PMID: 27943315 DOI: 10.1111/pce.12878] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 08/08/2016] [Revised: 11/29/2016] [Accepted: 12/01/2016] [Indexed: 06/06/2023]
Abstract
Instrumentation technology for metabolomics has advanced drastically in recent years in terms of sensitivity and specificity. Despite these technical advances, data analytical strategies are still in their infancy in comparison with other 'omics'. Plants are known to possess an immense diversity of secondary metabolites. Typically, more than 70% of metabolomics data are not amenable to systems biological interpretation because of poor database coverage. Here, we propose a new general strategy for mass-spectrometry-based metabolomics that incorporates all exact mass features with known sum formulas into the evaluation and interpretation of metabolomics studies. We extend the use of mass differences, commonly used for feature annotation, by redefining them as variables that reflect the remaining 'omic' domains. The strategy uses exact mass difference network analyses exemplified for the metabolomic description of two grey poplar (Populus × canescens) genotypes that differ in their capability to emit isoprene. This strategy established a direct connection between the metabotype and the non-isoprene-emitting phenotype, as mass differences pertaining to prenylation reactions were over-represented in non-isoprene-emitting poplars. Not only was the analysis of mass differences able to grasp the known chemical biology of poplar, but it also improved the interpretability of yet unknown biochemical relationships.
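A mass difference network of the kind described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the transformation table, the tolerance, the function name, and the feature masses are hypothetical values chosen here. Only the mention of prenylation comes from the abstract.

```python
from itertools import combinations

# Hypothetical transformation table: name -> exact mass difference in Da.
TRANSFORMATIONS = {
    "methylation (+CH2)": 14.01565,
    "oxidation (+O)": 15.99491,
    "hydrogenation (+H2)": 2.01565,
    "prenylation (+C5H8)": 68.06260,
}


def build_mass_difference_network(masses, transformations=TRANSFORMATIONS, tol_da=0.001):
    """Return edges (low_mass, high_mass, transformation) between feature pairs
    whose mass difference matches a known transformation within tol_da."""
    edges = []
    for low, high in combinations(sorted(masses), 2):
        delta = high - low
        for name, expected in transformations.items():
            if abs(delta - expected) <= tol_da:
                edges.append((low, high, name))
    return edges


# Hypothetical neutral masses of detected features.
features = [300.1000, 314.11565, 330.11056, 368.1626, 450.2000]
edges = build_mass_difference_network(features)

# Count how often each transformation occurs; over-representation testing
# between sample groups would build on counts like these.
edge_counts = {}
for _, _, name in edges:
    edge_counts[name] = edge_counts.get(name, 0) + 1
```

The pairwise loop is quadratic in the number of features, which is acceptable for a sketch; a sorted search per transformation would scale better on real data.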
Affiliation(s)
- Franco Moritz
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München (HMGU), Neuherberg, Germany
| | - Moritz Kaling
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München (HMGU), Neuherberg, Germany
- Research Unit Environmental Simulation, Institute of Biochemical Plant Pathology, Helmholtz Zentrum München (HMGU), Neuherberg, Germany
| | - Jörg-Peter Schnitzler
- Research Unit Environmental Simulation, Institute of Biochemical Plant Pathology, Helmholtz Zentrum München (HMGU), Neuherberg, Germany
| | - Philippe Schmitt-Kopplin
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München (HMGU), Neuherberg, Germany
- Chair of Analytical Food Chemistry, Technische Universität München (TUM), Freising, Germany
| |
|
22
|
Randazzo GM, Tonoli D, Strajhar P, Xenarios I, Odermatt A, Boccard J, Rudaz S. Enhanced metabolite annotation via dynamic retention time prediction: Steroidogenesis alterations as a case study. J Chromatogr B Analyt Technol Biomed Life Sci 2017; 1071:11-18. [PMID: 28479067 DOI: 10.1016/j.jchromb.2017.04.032] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 10/31/2016] [Revised: 03/07/2017] [Accepted: 04/20/2017] [Indexed: 10/19/2022]
Abstract
The development of metabolomics based on ultra-high pressure liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-HRMS) now allows hundreds to thousands of metabolites to be simultaneously monitored in biological matrices. In that context, bioinformatics and multivariate data analysis (MVA) play a crucial role in the detection of relevant alteration patterns. However, sound biological interpretations must necessarily be supported by metabolite identifications to be definitive or at least have a high degree of confidence. Each compound should be characterised by unique molecular properties. Among them, the exact mass and the chromatographic retention time are recognised as major and complementary criteria for compound identification. While the former is easily derived from the molecular structure, building generic and accurate retention time open databases still constitutes a critical issue because of the vast diversity of instruments, stationary phases and operating conditions in UHPLC-HRMS. Because several hits matching a molecular formula obtained from an exact mass and an isotopic pattern are often generated for each analyte, this methodology rarely allows a unique and unambiguous molecular identity to be gained. This work aims to provide a flexible solution to facilitate reliable compound annotation based on retention time in reversed-phase liquid chromatography (RPLC). It proposes an innovative approach based on the chromatographic linear solvent strength (LSS) theory, allowing retention times under any gradient conditions at fixed temperature, stationary phase and mobile phase type to be predicted. Starting from a subset of the Human Metabolome Database (HMDB), a new dynamic database involving LSS parameters was developed. A real case study involving steroidogenesis alterations due to forskolin exposure was conducted using the adrenal H295R OECD reference cell model for endocrine disruptor screening. The prediction of retention times was successfully achieved, facilitating steroid identification. An automated procedure which implements the compound annotation levels encouraged by the Metabolomics Standards Initiative (MSI) and the Coordination of Standards in Metabolomics (COSMOS) was also developed to speed up the process and enhance the data reusability.
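For readers unfamiliar with LSS theory, the textbook expression for retention under a linear gradient can be sketched as below. This is the standard LSS relation rather than the authors' code. It assumes log10 k = log_kw - s * phi and that the compound elutes within the gradient. The function name, parameter names, and numeric values are hypothetical.

```python
import math


def lss_gradient_retention_time(log_kw, s, phi_start, phi_end, t_gradient, t0, t_dwell=0.0):
    """Predict retention time under a linear gradient with the LSS model.

    log_kw: log10 retention factor extrapolated to pure water.
    s: solvent strength parameter of the compound.
    phi_start, phi_end: organic modifier fraction at gradient start and end.
    t_gradient, t0, t_dwell: gradient time, column dead time, dwell time
    (all in the same time unit).
    """
    k0 = 10 ** (log_kw - s * phi_start)             # retention factor at gradient start
    b = s * (phi_end - phi_start) * t0 / t_gradient  # gradient steepness
    return (t0 / b) * math.log10(math.log(10) * k0 * b + 1) + t0 + t_dwell


# Hypothetical compound parameters; the same (log_kw, s) pair serves any gradient.
params = dict(log_kw=2.0, s=4.0, phi_start=0.05, phi_end=0.95, t0=1.0)
t_fast = lss_gradient_retention_time(t_gradient=10.0, **params)
t_slow = lss_gradient_retention_time(t_gradient=30.0, **params)
t_flat = lss_gradient_retention_time(t_gradient=1e6, **params)
isocratic_limit = params["t0"] * (1 + 10 ** (params["log_kw"] - params["s"] * params["phi_start"]))
```

Storing the two compound-specific parameters rather than retention times is what lets one database serve different gradient programs. For a very shallow gradient, the prediction approaches the isocratic value t0 * (1 + k0).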
Affiliation(s)
- Giuseppe Marco Randazzo
- School of Pharmaceutical Sciences, University of Geneva and University of Lausanne, 1211 Geneva, Switzerland
| | - David Tonoli
- School of Pharmaceutical Sciences, University of Geneva and University of Lausanne, 1211 Geneva, Switzerland; Swiss Centre for Applied Human Toxicology (SCAHT), Universities of Basel and Geneva, 4055 Basel, Switzerland; Human Protein Sciences Department, University of Geneva, 1211 Geneva, Switzerland
| | - Petra Strajhar
- Swiss Centre for Applied Human Toxicology (SCAHT), Universities of Basel and Geneva, 4055 Basel, Switzerland; Division of Molecular and Systems Toxicology, Department of Pharmaceutical Sciences, University of Basel, 4056 Basel, Switzerland
| | - Ioannis Xenarios
- Vital-IT/Swiss-Prot Groups, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Alex Odermatt
- Swiss Centre for Applied Human Toxicology (SCAHT), Universities of Basel and Geneva, 4055 Basel, Switzerland; Division of Molecular and Systems Toxicology, Department of Pharmaceutical Sciences, University of Basel, 4056 Basel, Switzerland
| | - Julien Boccard
- School of Pharmaceutical Sciences, University of Geneva and University of Lausanne, 1211 Geneva, Switzerland
| | - Serge Rudaz
- School of Pharmaceutical Sciences, University of Geneva and University of Lausanne, 1211 Geneva, Switzerland.
| |
|
23
|
Aguilar-Mogas A, Sales-Pardo M, Navarro M, Guimerà R, Yanes O. iMet: A Network-Based Computational Tool To Assist in the Annotation of Metabolites from Tandem Mass Spectra. Anal Chem 2017; 89:3474-3482. [DOI: 10.1021/acs.analchem.6b04512] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 02/08/2023]
Affiliation(s)
- Antoni Aguilar-Mogas
- Departament d’Enginyeria Química, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
| | - Marta Sales-Pardo
- Departament d’Enginyeria Química, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
| | - Miriam Navarro
- Metabolomics Platform, Department of Electronic Engineering (DEEEA), Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
- Biomedical Research Center in Diabetes and Associated Metabolic Disorders (CIBERDEM), Monforte de Lemos 35, 28029 Madrid, Spain
| | - Roger Guimerà
- Departament d’Enginyeria Química, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Lluís Companys 23, 08010 Barcelona, Catalonia, Spain
| | - Oscar Yanes
- Metabolomics Platform, Department of Electronic Engineering (DEEEA), Universitat Rovira i Virgili, Av. Països Catalans 26, 43007 Tarragona, Catalonia, Spain
- Biomedical Research Center in Diabetes and Associated Metabolic Disorders (CIBERDEM), Monforte de Lemos 35, 28029 Madrid, Spain
| |
|
24
|
Yamamoto H, Sasaki K. Metabolomics-based approach for ranking the candidate structures of unidentified peaks in capillary electrophoresis time-of-flight mass spectrometry. Electrophoresis 2017; 38:1053-1059. [DOI: 10.1002/elps.201600328] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/15/2016] [Revised: 12/01/2016] [Accepted: 12/05/2016] [Indexed: 11/10/2022]
|
25
|
Uppal K, Walker DI, Jones DP. xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data. Anal Chem 2017; 89:1063-1067. [PMID: 27977166 DOI: 10.1021/acs.analchem.6b01214] [Citation(s) in RCA: 215] [Impact Index Per Article: 30.7] [Indexed: 12/20/2022]
Abstract
Improved analytical technologies and data extraction algorithms enable detection of >10 000 reproducible signals by liquid chromatography-high-resolution mass spectrometry, creating a bottleneck in chemical identification. In principle, measurement of more than one million chemicals would be possible if algorithms were available to facilitate utilization of the raw mass spectrometry data, especially low-abundance metabolites. Here we describe an automated computational framework to annotate ions for possible chemical identity using a multistage clustering algorithm in which metabolic pathway associations are used along with intensity profiles, retention time characteristics, mass defect, and isotope/adduct patterns. The algorithm uses high-resolution mass spectrometry data for a series of samples with common properties and publicly available chemical, metabolic, and environmental databases to assign confidence levels to annotation results. Evaluation results show that the algorithm achieves an F1-measure of 0.8 for a data set with known targets and is more robust than previously reported results for cases when database size is much greater than the actual number of metabolites. MS/MS evaluation of a set of randomly selected 210 metabolites annotated using xMSannotator in an untargeted metabolomics human data set shows that 80% of features with high or medium confidence scores have ion dissociation patterns consistent with the xMSannotator annotation. The algorithm has been incorporated into an R package, xMSannotator, which includes utilities for querying local or online databases such as ChemSpider, KEGG, HMDB, T3DB, and LipidMaps.
Affiliation(s)
- Karan Uppal
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States
| | - Douglas I Walker
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States; Department of Civil and Environmental Engineering, Tufts University, Medford, Massachusetts 02153, United States
| | - Dean P Jones
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States
| |
|
26
|
Mahieu NG, Spalding JL, Gelman SJ, Patti GJ. Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The Mz.unity Algorithm. Anal Chem 2016; 88:9037-46. [PMID: 27513885 PMCID: PMC6427821 DOI: 10.1021/acs.analchem.6b01702] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Indexed: 01/29/2023]
Abstract
Analysis of a single analyte by mass spectrometry can result in the detection of more than 100 degenerate peaks. These degenerate peaks complicate spectral interpretation and are challenging to annotate. In mass spectrometry-based metabolomics, this degeneracy leads to inflated false discovery rates, data sets containing an order of magnitude more features than analytes, and an inefficient use of resources during data analysis. Although software has been introduced to annotate spectral degeneracy, current approaches are unable to represent several important classes of peak relationships. These include heterodimers and higher complex adducts, distal fragments, relationships between peaks in different polarities, and complex adducts between features and background peaks. Here we outline sources of peak degeneracy in mass spectra that are not annotated by current approaches and introduce a software package called mz.unity to detect these relationships in accurate mass data. Using mz.unity, we find that data sets contain many more complex relationships than we anticipated. Examples include the adduct of glutamate and nicotinamide adenine dinucleotide (NAD), fragments of NAD detected in the same or opposite polarities, and the adduct of glutamate and a background peak. Further, the complex relationships we identify show that several assumptions commonly made when interpreting mass spectral degeneracy do not hold in general. These contributions provide new tools and insight to aid in the annotation of complex spectral relationships and provide a foundation for improved data set identification. Mz.unity is an R package and is freely available at https://github.com/nathaniel-mahieu/mz.unity as well as our laboratory Web site http://pattilab.wustl.edu/software/.
Affiliation(s)
- Nathaniel G. Mahieu
- Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, United States
| | - Jonathan L. Spalding
- Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
- Department of Genetics
| | - Susan J. Gelman
- Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, United States
| | - Gary J. Patti
- Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110, United States
| |
|
27
|
Yi L, Dong N, Yun Y, Deng B, Ren D, Liu S, Liang Y. Chemometric methods in data processing of mass spectrometry-based metabolomics: A review. Anal Chim Acta 2016; 914:17-34. [PMID: 26965324 DOI: 10.1016/j.aca.2016.02.001] [Citation(s) in RCA: 159] [Impact Index Per Article: 19.9] [Received: 08/25/2015] [Revised: 01/28/2016] [Accepted: 02/01/2016] [Indexed: 01/03/2023]
Abstract
This review focuses on recent and potential advances in chemometric methods in relation to data processing in metabolomics, especially for data generated from mass spectrometric techniques. Metabolomics is gradually being regarded as a valuable and promising biotechnology rather than an ambitious advancement. Herein, we outline significant developments in metabolomics, especially in combination with modern chemical analysis techniques and dedicated statistical and chemometric data-analysis strategies. Advanced skills in the preprocessing of raw data, identification of metabolites, variable selection, and modeling are illustrated. We believe that insights from these developments will help narrow the gap between the original dataset and current biological knowledge. We also discuss the limitations and perspectives of extracting information from high-throughput datasets.
Affiliation(s)
- Lunzhao Yi
- Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming, 650500, China.
| | - Naiping Dong
- Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hong Kong, 999077, China
| | - Yonghuan Yun
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, China
| | - Baichuan Deng
- College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
| | - Dabing Ren
- Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming, 650500, China
| | - Shao Liu
- Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Yizeng Liang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, China
| |
|
28
|
Abstract
BACKGROUND Untargeted metabolomics commonly uses liquid chromatography mass spectrometry to measure abundances of metabolites; subsequent tandem mass spectrometry is used to derive information about individual compounds. One of the bottlenecks in this experimental setup is the interpretation of fragmentation spectra to accurately and efficiently identify compounds. Fragmentation trees have become a powerful tool for the interpretation of tandem mass spectrometry data of small molecules. These trees are determined from the data using combinatorial optimization, and aim at explaining the experimental data via fragmentation cascades. Fragmentation tree computation does not require spectral or structural databases. To obtain biochemically meaningful trees, one needs an elaborate optimization function (scoring). RESULTS We present a new scoring for computing fragmentation trees, transforming the combinatorial optimization into a Maximum A Posteriori estimator. We demonstrate the superiority of the new scoring for two tasks: both for the de novo identification of molecular formulas of unknown compounds, and for searching a database for structurally similar compounds, our method, SIRIUS 3, performs significantly better than the previous version of our method, as well as other methods for this task. CONCLUSION SIRIUS 3 can be a part of an untargeted metabolomics workflow, allowing researchers to investigate unknowns using automated computational methods. GRAPHICAL ABSTRACT We present a new scoring for computing fragmentation trees from tandem mass spectrometry data based on Bayesian statistics. The best scoring fragmentation tree most likely explains the molecular formula of the measured parent ion.
Affiliation(s)
- Sebastian Böcker
- Friedrich-Schiller-University, Ernst-Abbe-Platz 2, 07743 Jena, Germany
| | - Kai Dührkop
- Friedrich-Schiller-University, Ernst-Abbe-Platz 2, 07743 Jena, Germany
| |
|
29
|
Frainay C, Jourdan F. Computational methods to identify metabolic sub-networks based on metabolomic profiles. Brief Bioinform 2016; 18:43-56. [PMID: 26822099 DOI: 10.1093/bib/bbv115] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 10/06/2015] [Revised: 12/16/2015] [Indexed: 11/13/2022] Open
Abstract
Untargeted metabolomics makes it possible to identify compounds that undergo significant changes in concentration in different experimental conditions. The resulting metabolomic profile characterizes the perturbation concerned, but does not explain the underlying biochemical mechanisms. Bioinformatics methods make it possible to interpret results in light of the whole metabolism. This knowledge is modelled into a network, which can be mined using algorithms that originate in graph theory. These algorithms can extract sub-networks related to the compounds identified. Several attempts have been made to adapt them to obtain more biologically meaningful results. However, there is still no consensus on this kind of analysis of metabolic networks. This review presents the main graph approaches used to interpret metabolomic data using metabolic networks. Their advantages and drawbacks are discussed, and the impacts of their parameters are emphasized. We also provide some guidelines for relevant sub-network extraction and also suggest a range of applications for most methods.
|
30
|
Wolfender JL, Marti G, Thomas A, Bertrand S. Current approaches and challenges for the metabolite profiling of complex natural extracts. J Chromatogr A 2015; 1382:136-64. [DOI: 10.1016/j.chroma.2014.10.091] [Citation(s) in RCA: 352] [Impact Index Per Article: 39.1] [Received: 07/25/2014] [Revised: 10/23/2014] [Accepted: 10/26/2014] [Indexed: 12/11/2022]
|
31
|
Lynn KS, Cheng ML, Chen YR, Hsu C, Chen A, Lih TM, Chang HY, Huang CJ, Shiao MS, Pan WH, Sung TY, Hsu WL. Metabolite Identification for Mass Spectrometry-Based Metabolomics Using Multiple Types of Correlated Ion Information. Anal Chem 2015; 87:2143-51. [DOI: 10.1021/ac503325c] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Indexed: 01/27/2023]
Affiliation(s)
- Ke-Shiuan Lynn
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
| | - Mei-Ling Cheng
- Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
| | - Yet-Ran Chen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
| | - Chin Hsu
- Department of Exercise Health Science, National Taiwan University of Physical Education and Sport, Taichung, Taiwan
| | - Ann Chen
- Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
| | - T. Mamie Lih
- Bioinformatics Program, TIGP, Institute of Information Science, Academia Sinica, Taipei, Taiwan
| | - Hui-Yin Chang
- Bioinformatics Program, TIGP, Institute of Information Science, Academia Sinica, Taipei, Taiwan
| | - Ching-jang Huang
- Department of Biochemical Science and Technology, National Taiwan University, Taipei, Taiwan
| | - Ming-Shi Shiao
- Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
| | - Wen-Harn Pan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Ting-Yi Sung
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
| | - Wen-Lian Hsu
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
| |
|
32
|
Suvitaival T, Rogers S, Kaski S. Stronger findings for metabolomics through Bayesian modeling of multiple peaks and compound correlations. Bioinformatics 2015; 30:i461-7. [PMID: 25161234 PMCID: PMC4147908 DOI: 10.1093/bioinformatics/btu455] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION Data analysis for metabolomics suffers from uncertainty because of the noisy measurement technology and the small sample size of experiments. Noise and the small sample size lead to a high probability of false findings. Further, individual compounds have natural variation between samples, which in many cases renders them unreliable as biomarkers. However, the levels of similar compounds are typically highly correlated, which is a phenomenon that we model in this work. RESULTS We propose a hierarchical Bayesian model for inferring differences between groups of samples more accurately in metabolomic studies, where the observed compounds are collinear. We discover that the method decreases the error of weak and non-existent covariate effects, and thereby reduces false-positive findings. To achieve this, the method makes use of the mass spectral peak data by clustering similar peaks into latent compounds, and by further clustering latent compounds into groups that respond in a coherent way to the experimental covariates. We demonstrate the method with three simulated studies and validate it with a metabolomic benchmark dataset. AVAILABILITY AND IMPLEMENTATION An implementation in R is available at http://research.ics.aalto.fi/mi/software/peakANOVA/.
Affiliation(s)
- Tommi Suvitaival
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| | - Simon Rogers
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| | - Samuel Kaski
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
| |
|
33
|
Yi L, Dong N, Yun Y, Deng B, Liu S, Zhang Y, Liang Y. WITHDRAWN: Recent advances in chemometric methods for plant metabolomics: A review. Biotechnol Adv 2014:S0734-9750(14)00183-9. [PMID: 25461504 DOI: 10.1016/j.biotechadv.2014.11.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/15/2014] [Revised: 11/17/2014] [Accepted: 11/18/2014] [Indexed: 12/17/2022]
Abstract
This article has been withdrawn at the request of the author(s) and/or editor. The Publisher apologizes for any inconvenience this may cause. The full Elsevier Policy on Article Withdrawal can be found at http://www.elsevier.com/locate/withdrawalpolicy.
Affiliation(s)
- Lunzhao Yi
- Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming 650500, China.
| | - Naiping Dong
- Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hong Kong 999077, Hong Kong, China
| | - Yonghuan Yun
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Baichuan Deng
- Department of Chemistry, University of Bergen, Bergen N-5007, Norway
| | - Shao Liu
- Xiangya Hospital, Central South University, Changsha 410008, China
| | - Yi Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Yizeng Liang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| |
|
34
|
Cho K, Evans BS, Wood BM, Kumar R, Erb TJ, Warlick BP, Gerlt JA, Sweedler JV. Integration of untargeted metabolomics with transcriptomics reveals active metabolic pathways. Metabolomics 2014; 2014. [PMID: 25705145 PMCID: PMC4334135 DOI: 10.1007/s11306-014-0713-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
While recent advances in metabolomic measurement technologies have been dramatic, extracting biological insight from complex metabolite profiles remains a challenge. We present an analytical strategy that uses data obtained from high resolution liquid chromatography-mass spectrometry and a bioinformatics toolset for detecting actively changing metabolic pathways upon external perturbation. We begin with untargeted metabolite profiling to nominate altered metabolites and identify pathway candidates, followed by validation of those pathways with transcriptomics. Using the model organisms Rhodospirillum rubrum and Bacillus subtilis, our results reveal metabolic pathways that are interconnected with methionine salvage. The rubrum-type methionine salvage pathway is interconnected with the active methyl cycle in which re-methylation, a key reaction for recycling methionine from homocysteine, is unexpectedly suppressed; instead, homocysteine is catabolized by the transsulfuration pathway. Notably, the non-mevalonate pathway is repressed, whereas the rubrum-type methionine salvage pathway contributes to isoprenoid biosynthesis upon 5'-methylthioadenosine feeding. In this process, glutathione functions as a coenzyme in vivo when 1-methylthio-d-xylulose 5-phosphate (MTXu 5-P) methylsulfurylase catalyzes dethiomethylation of MTXu 5-P. These results clearly show that our analytical approach enables unexpected metabolic pathways to be uncovered.
Affiliation(s)
- Kyuil Cho
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
| | - Bradley S. Evans
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
| | - B. McKay Wood
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - Ritesh Kumar
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - Tobias J. Erb
- Institute for Microbiology, Swiss Federal Institute of Technology (ETH) Zurich, CH-8093 Zurich, Switzerland
| | - Benjamin P. Warlick
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - John A. Gerlt
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
| | - Jonathan V. Sweedler
- Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
| |
|
35
|
Daly R, Rogers S, Wandy J, Jankevics A, Burgess KEV, Breitling R. MetAssign: probabilistic annotation of metabolites from LC-MS data using a Bayesian clustering approach. Bioinformatics 2014; 30:2764-71. [PMID: 24916385 PMCID: PMC4173012 DOI: 10.1093/bioinformatics/btu370] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 12/27/2022]
Abstract
Motivation: The use of liquid chromatography coupled to mass spectrometry has enabled the high-throughput profiling of the metabolite composition of biological samples. However, the large amount of data obtained can be difficult to analyse and often requires computational processing to understand which metabolites are present in a sample. This article looks at the dual problem of annotating peaks in a sample with a metabolite, together with putatively annotating whether a metabolite is present in the sample. The starting point of the approach is a Bayesian clustering of peaks into groups, each corresponding to putative adducts and isotopes of a single metabolite. Results: The Bayesian modelling introduced here combines information from the mass-to-charge ratio, retention time and intensity of each peak, together with a model of the inter-peak dependency structure, to increase the accuracy of peak annotation. The results inherently contain a quantitative estimate of confidence in the peak annotations and allow an accurate trade-off between precision and recall. Extensive validation experiments using authentic chemical standards show that this system is able to produce more accurate putative identifications than other state-of-the-art systems, while at the same time giving a probabilistic measure of confidence in the annotations. Availability and implementation: The software has been implemented as part of the mzMatch metabolomics analysis pipeline, which is available for download at http://mzmatch.sourceforge.net/. Contact: Ronan.Daly@glasgow.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rónán Daly
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| | - Simon Rogers
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| | - Joe Wandy
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| | - Andris Jankevics
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| | - Karl E V Burgess
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| | - Rainer Breitling
- School of Computing Science, University of Glasgow, Glasgow, Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, Manchester and Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
| |
|
36
|
Doerfler H, Sun X, Wang L, Engelmeier D, Lyon D, Weckwerth W. mzGroupAnalyzer--predicting pathways and novel chemical structures from untargeted high-throughput metabolomics data. PLoS One 2014; 9:e96188. [PMID: 24846183 PMCID: PMC4028198 DOI: 10.1371/journal.pone.0096188] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 10/29/2013] [Accepted: 04/04/2014] [Indexed: 01/09/2023] Open
Abstract
The metabolome is a highly dynamic entity and the final readout of the genotype x environment x phenotype (GxExP) relationship of an organism. Monitoring metabolite dynamics over time thus theoretically encrypts the whole range of possible chemical and biochemical transformations of small molecules involved in metabolism. The bottleneck is, however, the sheer number of unidentified structures in these samples. This represents the next challenge for metabolomics technology and is comparable with genome sequencing 30 years ago. At the same time it is impossible to handle the amount of data involved in a metabolomics analysis manually. Algorithms are therefore imperative to allow for automated m/z feature extraction and subsequent structure or pathway assignment. Here we provide an automated pathway inference strategy comprising measurements of metabolome time series using LC-MS with high resolution and high mass accuracy. An algorithm was developed, called mzGroupAnalyzer, to automatically explore the metabolome for the detection of metabolite transformations caused by biochemical or chemical modifications. Pathways are extracted directly from the data and putative novel structures can be identified. The detected m/z features can be mapped on a van Krevelen diagram according to their H/C and O/C ratios for pattern recognition and to visualize oxidative processes and biochemical transformations. This method was applied to Arabidopsis thaliana treated simultaneously with cold and high light. Due to a protective antioxidant response the plants turn from green to purple color via the accumulation of flavonoid structures. The detection of potential biochemical pathways resulted in 15 putatively new compounds involved in the flavonoid-pathway. These compounds were further validated by product ion spectra from the same data. The mzGroupAnalyzer is implemented in the graphical user interface (GUI) of the metabolomics toolbox COVAIN (Sun & Weckwerth, 2012, Metabolomics 8: 81-93). The strategy can be extended to any biological system.
Affiliation(s)
- Hannes Doerfler
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| | - Xiaoliang Sun
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| | - Lei Wang
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| | - Doris Engelmeier
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| | - David Lyon
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| | - Wolfram Weckwerth
- Department of Ecogenomics and Systems Biology, University of Vienna, Vienna, Austria
| |
|
37
|
Silva RR, Jourdan F, Salvanha DM, Letisse F, Jamin EL, Guidetti-Gonzalez S, Labate CA, Vêncio RZN. ProbMetab: an R package for Bayesian probabilistic annotation of LC-MS-based metabolomics. Bioinformatics 2014; 30:1336-7. [PMID: 24443383 PMCID: PMC3998140 DOI: 10.1093/bioinformatics/btu019] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 01/08/2023]
Abstract
Summary: We present ProbMetab, an R package that promotes substantial improvement in automatic probabilistic liquid chromatography–mass spectrometry-based metabolome annotation. The inference engine core is based on a Bayesian model implemented to (i) allow diverse sources of experimental data and metadata to be systematically incorporated into the model with alternative ways to calculate the likelihood function and (ii) allow sensitive selection of biologically meaningful biochemical reaction databases as Dirichlet-categorical prior distribution. Additionally, to ensure result interpretation by systems biologists, we display the annotation in a network where observed mass peaks are connected if their candidate metabolites are substrate/product of known biochemical reactions. This graph can be overlaid with other graph-based analysis, such as partial correlation networks, in a visualization scheme exported to Cytoscape, with web and stand-alone versions. Availability and implementation: ProbMetab was implemented in a modular manner to fit together with established upstream (xcms, CAMERA, AStream, mzMatch.R, etc.) and downstream R package tools (GeneNet, RCytoscape, DiffCorr, etc.). ProbMetab, along with extensive documentation and case studies, is freely available under GNU license at: http://labpib.fmrp.usp.br/methods/probmetab/. Contact: rvencio@usp.br Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ricardo R Silva
- LabPIB, Department of Computing and Mathematics FFCLRP-USP, University of Sao Paulo, Ribeirao Preto, Brazil, INRA UMR1331, Toxalim, Research Centre in Food Toxicology, Université de Toulouse, INSA, UPS, INP; LISBP, Toulouse, France, Institute for Systems Biology, Seattle, Washington, USA, CNRS, UMR5504, Toulouse, France, Department of Genetics ESALQ-USP, University of Sao Paulo, Piracicaba, Brazil and Laboratorio Nacional de Ciencia e Tecnologia do Bioetanol CTBE, Campinas, Brazil
|
38
|
Breitling R, Achcar F, Takano E. Modeling challenges in the synthetic biology of secondary metabolism. ACS Synth Biol 2013; 2:373-8. [PMID: 23659212 DOI: 10.1021/sb4000228] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The successful engineering of secondary metabolite production relies on the availability of detailed computational models of metabolism. In this brief review we discuss the types of models used for synthetic biology and their application for the engineering of metabolism. We then highlight some of the major modeling challenges, in particular the need to make informative model predictions based on incomplete and uncertain information. This issue is particularly pressing in the synthetic biology of secondary metabolism, due to the genetic diversity of microbial secondary metabolite producers, the difficulty of enzyme-kinetic characterization of the complex biosynthetic machinery, and the need for engineered pathways to function efficiently in heterologous hosts. We argue that an explicit quantitative consideration of the resulting uncertainty of metabolic models can lead to more informative predictions to guide the design of improved production hosts for bioactive secondary metabolites.
Affiliation(s)
- Rainer Breitling
- Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, 131 Princess Street, Manchester M1 7DN, United Kingdom
- Fiona Achcar
- Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, United Kingdom
- Eriko Takano
- Manchester Institute of Biotechnology, Faculty of Life Sciences, University of Manchester, 131 Princess Street, Manchester M1 7DN, United Kingdom
|
39
|
Li S, Park Y, Duraisingham S, Strobel FH, Khan N, Soltow QA, Jones DP, Pulendran B. Predicting network activity from high throughput metabolomics. PLoS Comput Biol 2013; 9:e1003123. [PMID: 23861661 PMCID: PMC3701697 DOI: 10.1371/journal.pcbi.1003123] [Citation(s) in RCA: 603] [Impact Index Per Article: 54.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2012] [Accepted: 05/15/2013] [Indexed: 12/26/2022] Open
Abstract
The functional interpretation of high throughput metabolomics by mass spectrometry is hindered by the identification of metabolites, a tedious and challenging task. We present a set of computational algorithms which, by leveraging the collective power of metabolic pathways and networks, predict functional activity directly from spectral feature tables without a priori identification of metabolites. The algorithms were experimentally validated on the activation of innate immune cells. Mass spectrometry based untargeted metabolomics can now profile several thousand metabolites simultaneously. However, these metabolites have to be identified before any biological meaning can be drawn from the data. Metabolite identification is a challenging and low throughput process, and has therefore become the bottleneck of the field. We report here a novel approach to predict biological activity directly from mass spectrometry data without a priori identification of metabolites. By unifying network analysis and metabolite prediction under the same computational framework, the organization of metabolic networks and pathways helps resolve the ambiguity in metabolite prediction to a large extent. We validated our algorithms on a set of activation experiments of innate immune cells. The predicted activities were confirmed by both gene expression and metabolite identification. This method shall greatly accelerate the application of high throughput metabolomics, as the tedious task of identifying hundreds of metabolites upfront can be shifted to a handful of validation experiments after our computational prediction.
Affiliation(s)
- Shuzhao Li
- Emory Vaccine Center, Emory University, Atlanta, Georgia, USA.
|
40
|
Abstract
The discovery, development and optimal utilization of pharmaceuticals can be greatly enhanced by knowledge of their modes of action. However, many drugs currently on the market act by unknown mechanisms. Untargeted metabolomics offers the potential to discover modes of action for drugs that perturb cellular metabolism. Development of high resolution LC-MS methods and improved data analysis software now allows rapid detection of drug-induced changes to cellular metabolism in an untargeted manner. Several studies have demonstrated the ability of untargeted metabolomics to provide unbiased target discovery for antimicrobial drugs, in particular for antiprotozoal agents. Furthermore, the utilization of targeted metabolomics techniques has enabled validation of existing hypotheses regarding antiprotozoal drug mechanisms. Metabolomics approaches are likely to assist with optimization of new drug candidates by identification of drug targets, and by allowing detailed characterization of modes of action and resistance of existing and novel antiprotozoal drugs.
|
41
|
Peironcely JE, Rojas-Chertó M, Tas A, Vreeken R, Reijmers T, Coulier L, Hankemeier T. Automated Pipeline for De Novo Metabolite Identification Using Mass-Spectrometry-Based Metabolomics. Anal Chem 2013; 85:3576-83. [DOI: 10.1021/ac303218u] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Affiliation(s)
- Julio E. Peironcely
- TNO Research Group Quality & Safety, P.O. Box 360, NL-3700 AJ Zeist, The Netherlands
- Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Miguel Rojas-Chertó
- Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Albert Tas
- TNO Research Group Quality & Safety, P.O. Box 360, NL-3700 AJ Zeist, The Netherlands
- Rob Vreeken
- Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Theo Reijmers
- Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Leon Coulier
- TNO Research Group Quality & Safety, P.O. Box 360, NL-3700 AJ Zeist, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Thomas Hankemeier
- Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Netherlands Metabolomics Centre, Einsteinweg 55, 2333 CC Leiden, The Netherlands
|
42
|
Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. J Cheminform 2013; 5:12. [PMID: 23453222 PMCID: PMC3648359 DOI: 10.1186/1758-2946-5-12] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2012] [Accepted: 02/01/2013] [Indexed: 12/29/2022] Open
Abstract
The identification of small molecules from mass spectrometry (MS) data remains a major challenge in the interpretation of MS data. This review covers the computational aspects of identifying small molecules, from the identification of a compound by searching a reference spectral library, to the structural elucidation of unknowns. In detail, we describe the basic principles and pitfalls of searching mass spectral reference libraries. Determining the molecular formula of the compound can serve as a basis for subsequent structural elucidation; consequently, we cover different methods for molecular formula identification, focussing on isotope pattern analysis. We then discuss automated methods to deal with mass spectra of compounds that are not present in spectral libraries, and provide an insight into de novo analysis of fragmentation spectra using fragmentation trees. In addition, this review briefly covers the reconstruction of metabolic networks using MS data. Finally, we list available software for different steps of the analysis pipeline.
Affiliation(s)
- Kerstin Scheubert
- Chair of Bioinformatics, Friedrich Schiller University, Ernst-Abbe-Platz 2, Jena, Germany.
|
43
|
Zhou B, Xiao JF, Ressom HW. Prioritization of putative metabolite identifications in LC-MS/MS experiments using a computational pipeline. Proteomics 2013; 13:248-60. [PMID: 23307777 DOI: 10.1002/pmic.201200306] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2012] [Revised: 10/14/2012] [Accepted: 10/24/2012] [Indexed: 01/12/2023]
Abstract
One of the major bottle-necks in current LC-MS-based metabolomic investigations is metabolite identification. An often-used approach is to first look up metabolites from databases through peak mass, followed by verification of the obtained putative identifications using MS/MS data. However, the mass-based search may provide inappropriate putative identifications when the observed peak is from isotopes, fragments, or adducts. In addition, a large fraction of peaks is often left with multiple putative identifications. To differentiate these putative identifications, manual verification of metabolites through comparison between biological samples and authentic compounds is necessary. However, such experiments are laborious, especially when multiple putative identifications are encountered. It is desirable to use computational approaches to obtain more reliable putative identifications and prioritize them before performing experimental verification of the metabolites. In this article, a computational pipeline is proposed to assist metabolite identification with improved metabolome coverage and prioritization capability. Multiple publicly available software tools and databases, along with in-house developed algorithms, are utilized to fully exploit the information acquired from LC-MS/MS experiments. The pipeline is successfully applied to identify metabolites on the basis of LC-MS as well as MS/MS data. Using accurate masses, retention time values, MS/MS spectra, and metabolic pathways/networks, more appropriate putative identifications are retrieved and prioritized to guide subsequent metabolite verification experiments.
Affiliation(s)
- Bin Zhou
- Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA
|
44
|
Abstract
The metabolome is sensitive to genetic and environmental factors contributing to complex diseases such as type 1 diabetes (T1D). Metabolomics is the study of biochemical and physiological processes involving metabolites. It is therefore one of the key platforms for the discovery and study of pathophysiological phenomena leading to T1D and the development of T1D-associated complications. Although the application of metabolomics in T1D research is still rare, metabolomic research has already advanced across the full spectrum, from disease progression to the development of diabetic complications. Metabolomic studies in T1D have contributed to an improved etiopathogenic understanding and demonstrated their potential in the clinic. For example, metabolomic data from recent T1D studies suggest that a specific metabolic profile, or metabotype, precedes islet autoimmunity and the development of overt T1D. These early metabolic changes are attributed to many biochemical pathways, thus suggesting a systemic change in metabolism which may be inborn. Based on this evidence, the role of the metabolome in the progression to T1D is therefore to facilitate specific biochemical processes associated with T1D, and to contribute to the development of a vulnerable state in which disease is more likely to be triggered. This may have important implications for the understanding of T1D pathophysiology and early disease detection and prevention.
Affiliation(s)
- Matej Oresic
- VTT Technical Research Centre of Finland, Tietotie 2, Espoo, FIN-02044 VTT, Finland.
|
45
|
Chokkathukalam A, Jankevics A, Creek DJ, Achcar F, Barrett MP, Breitling R. mzMatch-ISO: an R tool for the annotation and relative quantification of isotope-labelled mass spectrometry data. Bioinformatics 2012; 29:281-3. [PMID: 23162054 PMCID: PMC3546800 DOI: 10.1093/bioinformatics/bts674] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Stable isotope-labelling experiments have recently gained increasing popularity in metabolomics studies, providing unique insights into the dynamics of metabolic fluxes, beyond the steady-state information gathered by routine mass spectrometry. However, most liquid chromatography-mass spectrometry data analysis software lacks features that enable automated annotation and relative quantification of labelled metabolite peaks. Here, we describe mzMatch-ISO, a new extension to the metabolomics analysis pipeline mzMatch.R. RESULTS: Targeted and untargeted isotope profiling using mzMatch-ISO provides a convenient visual summary of the quality and quantity of labelling for every metabolite through four types of diagnostic plots that show (i) the chromatograms of the isotope peaks of each compound in each sample group; (ii) the ratio of mono-isotopic and labelled peaks indicating the fraction of labelling; (iii) the average peak area of mono-isotopic and labelled peaks in each sample group; and (iv) the trend in the relative amount of labelling in a predetermined isotopomer. To aid further statistical analyses, the values used for generating these plots are also provided as a tab-delimited file. We demonstrate the power and versatility of mzMatch-ISO by analysing a ¹³C-labelled metabolome dataset from trypanosomal parasites. AVAILABILITY: mzMatch.R and mzMatch-ISO are available free of charge from http://mzmatch.sourceforge.net and can be used on Linux and Windows platforms running the latest version of R. CONTACT: rainer.breitling@manchester.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Achuthanunni Chokkathukalam
- College of Medical Veterinary and Life Sciences, Institute of Molecular Cell and Systems Biology, University of Glasgow, Glasgow G12 8QQ, UK
|
46
|
Abstract
Metabolomics aims at identification and quantitation of small molecules involved in metabolic reactions. LC-MS has enjoyed a growing popularity as the platform for metabolomic studies due to its high throughput, soft ionization, and good coverage of metabolites. The success of a LC-MS-based metabolomic study often depends on multiple experimental, analytical, and computational steps. This review presents a workflow of a typical LC-MS-based metabolomic analysis for identification and quantitation of metabolites indicative of biological/environmental perturbations. Challenges and current solutions in each step of the workflow are reviewed. The review intends to help investigators understand the challenges in metabolomic studies and to determine appropriate experimental, analytical, and computational methods to address these challenges.
Affiliation(s)
- Bin Zhou
- Lombardi Comprehensive Cancer Center, Georgetown University, 4000 Reservoir Rd., NW, Washington, DC 20057, USA. Fax: 202-687-0227; Tel: 202-687-2283
- Jun Feng Xiao
- Lombardi Comprehensive Cancer Center, Georgetown University, 4000 Reservoir Rd., NW, Washington, DC 20057, USA. Fax: 202-687-0227; Tel: 202-687-2283
- Leepika Tuli
- Lombardi Comprehensive Cancer Center, Georgetown University, 4000 Reservoir Rd., NW, Washington, DC 20057, USA. Fax: 202-687-0227; Tel: 202-687-2283
- Habtom W. Ressom
- Lombardi Comprehensive Cancer Center, Georgetown University, 4000 Reservoir Rd., NW, Washington, DC 20057, USA. Fax: 202-687-0227; Tel: 202-687-2283
|
47
|
Xiao JF, Zhou B, Ressom HW. Metabolite identification and quantitation in LC-MS/MS-based metabolomics. Trends Analyt Chem 2012; 32:1-14. [PMID: 22345829 PMCID: PMC3278153 DOI: 10.1016/j.trac.2011.08.009] [Citation(s) in RCA: 331] [Impact Index Per Article: 27.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Metabolomics aims at detection and quantitation of all metabolites in biological samples. The presence of metabolites with a wide variety of physicochemical properties and different levels of abundance challenges existing analytical platforms used for identification and quantitation of metabolites. Significant efforts have been made to improve analytical and computational methods for metabolomics studies. This review focuses on the use of liquid chromatography with tandem mass spectrometry (LC-MS/MS) for quantitative and qualitative metabolomics studies. It illustrates recent developments in computational methods for metabolite identification, including ion annotation, spectral interpretation and spectral matching. We also review selected reaction monitoring and high-resolution MS for metabolite quantitation. We discuss current challenges in metabolite identification and quantitation as well as potential solutions.
Affiliation(s)
- Habtom W. Ressom
- Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University, 4000 Reservoir Rd., NW, Washington DC, 20057
|
48
|
Leader DP, Burgess K, Creek D, Barrett MP. Pathos: a web facility that uses metabolic maps to display experimental changes in metabolites identified by mass spectrometry. Rapid Commun Mass Spectrom 2011; 25:3422-3426. [PMID: 22002696 PMCID: PMC3509215 DOI: 10.1002/rcm.5245] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/23/2011] [Revised: 09/05/2011] [Accepted: 09/07/2011] [Indexed: 05/30/2023]
Abstract
This work describes a freely available web-based facility which can be used to analyse raw or processed mass spectrometric data from metabolomics experiments and display the metabolites identified (and changes in their experimental abundance) in the context of the metabolic pathways in which they occur. The facility, Pathos (http://motif.gla.ac.uk/Pathos/), employs Java servlets and is underpinned by a relational database populated from the Kyoto Encyclopaedia of Genes and Genomes (KEGG). Input files can contain either raw m/z values from experiments conducted in different modes, or KEGG or MetaCyc IDs assigned by the user on the basis of the m/z values and other criteria. The textual output lists the KEGG pathways on an XHTML page according to the number of metabolites or potential metabolites that they contain. Filtering by organism is also available. For metabolic pathways of interest, the user is able to retrieve a pathway map with identified metabolites highlighted. A particular feature of Pathos is its ability to process relative quantification data for metabolites identified under different experimental conditions, and to present this in an easily comprehensible manner. Results are colour-coded according to the degree of experimental change, and bar charts of the results can be generated interactively from either the text listings or the pathway maps. The visual presentation of the output from Pathos is designed to allow the rapid identification of metabolic areas of potential interest, after which particular results may be examined in detail.
Affiliation(s)
- David P Leader
- School of Life Sciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK.
|
49
|
Abstract
BACKGROUND: Metabolomics LC-MS experiments yield large numbers of peaks, few of which can be identified by database matching. Many of the remaining peaks correspond to derivatives of identified peaks (e.g., isotope peaks, adducts, fragments and multiply charged molecules). In this article, we present a data-reduction approach that automatically identifies these derivative peaks. RESULTS: Using data-driven clustering based on chromatographic peak shape correlation and intensity patterns across biological replicates, derivative peaks can be reliably identified. Using a test data set obtained from Leishmania donovani extracts, we achieved a 60% reduction of the number of peaks. After quality control filtering, almost 80% of the peaks could putatively be identified by database matching. CONCLUSION: Automated peak filtering substantially speeds up the data-interpretation process.
|
50
|
Brown M, Wedge DC, Goodacre R, Kell DB, Baker PN, Kenny LC, Mamas MA, Neyses L, Dunn WB. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. Bioinformatics 2011; 27:1108-12. [PMID: 21325300 PMCID: PMC3709197 DOI: 10.1093/bioinformatics/btr079] [Citation(s) in RCA: 138] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open
Abstract
MOTIVATION: The study of metabolites (metabolomics) is increasingly being applied to investigate microbial, plant, environmental and mammalian systems. One of the limiting factors is that of chemically identifying metabolites from mass spectrometric signals present in complex datasets. RESULTS: Three workflows have been developed to allow for the rapid, automated and high-throughput annotation and putative metabolite identification of electrospray LC-MS-derived metabolomic datasets. The collection of workflows is defined as PUTMEDID_LCMS and performs feature annotation, matching of accurate m/z to the accurate mass of neutral molecules and associated molecular formula and matching of the molecular formulae to a reference file of metabolites. The software is independent of the instrument and data pre-processing applied. The number of false positives is reduced by eliminating the inaccurate matching of many artifact, isotope, multiply charged and complex adduct peaks through complex interrogation of experimental data. AVAILABILITY: The workflows, standard operating procedure and further information are publicly available at http://www.mcisb.org/resources/putmedid.html. CONTACT: warwick.dunn@manchester.ac.uk.
Affiliation(s)
- Marie Brown
- School of Biomedicine, The University of Manchester, Manchester M13 9PT, UK
|