101. Wang X, Jones DR, Shaw TI, Cho JH, Wang Y, Tan H, Xie B, Zhou S, Li Y, Peng J. Target-Decoy-Based False Discovery Rate Estimation for Large-Scale Metabolite Identification. J Proteome Res 2018; 17:2328-2334. [PMID: 29790753 DOI: 10.1021/acs.jproteome.8b00019] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 11/28/2022]
Abstract
Metabolite identification is a crucial step in mass spectrometry (MS)-based metabolomics. However, it is still challenging to assess the confidence of assigned metabolites. We report a novel method for estimating the false discovery rate (FDR) of metabolite assignment with a target-decoy strategy, in which the decoys are generated through violating the octet rule of chemistry by adding small odd numbers of hydrogen atoms. The target-decoy strategy was integrated into JUMPm, an automated metabolite identification pipeline for large-scale MS analysis and was also evaluated with two other metabolomics tools, mzMatch and MZmine 2. The reliability of FDR calculation was examined by false data sets, which were simulated by altering MS1 or MS2 spectra. Finally, we used the JUMPm pipeline coupled to the target-decoy strategy to process unlabeled and stable-isotope-labeled metabolomic data sets. The results demonstrate that the target-decoy strategy is a simple and effective method for evaluating the confidence of high-throughput metabolite identification.
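The target-decoy idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact estimator, so the ratio of decoy hits to target hits used here is an assumption borrowed from common target-decoy practice, and the names `make_decoy_formula` and `estimate_fdr` are hypothetical, not part of JUMPm.

```python
from typing import Dict, List, Tuple


def make_decoy_formula(formula: Dict[str, int], extra_hydrogens: int = 1) -> Dict[str, int]:
    """Return a copy of `formula` with a small odd number of extra H atoms.

    Adding an odd number of hydrogens is the decoy rule described in the
    abstract; the dictionary representation is an assumption of this sketch.
    """
    if extra_hydrogens <= 0 or extra_hydrogens % 2 == 0:
        raise ValueError("extra_hydrogens must be a positive odd integer")
    decoy = dict(formula)
    decoy["H"] = decoy.get("H", 0) + extra_hydrogens
    return decoy


def estimate_fdr(matches: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR at a score threshold as decoy hits divided by target hits.

    Each match is a (score, is_decoy) pair. This estimator is an assumed
    convention, not necessarily the formula used in the paper.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted targets, so nothing to estimate
    return min(1.0, decoys / targets)


# Toy data: three target hits and one decoy hit score at or above 0.6.
matches = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
fdr_at_06 = estimate_fdr(matches, 0.6)
glucose_decoy = make_decoy_formula({"C": 6, "H": 12, "O": 6})
```

In practice the threshold would be lowered until the estimated FDR reaches a chosen cutoff (for example 1% or 5%), and only target assignments above that threshold would be reported.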
102. Blaženović I, Kind T, Ji J, Fiehn O. Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics. Metabolites 2018; 8:E31. [PMID: 29748461 PMCID: PMC6027441 DOI: 10.3390/metabo8020031] [Citation(s) in RCA: 402] [Impact Index Per Article: 67.0] [Received: 04/08/2018] [Revised: 04/26/2018] [Accepted: 05/06/2018] [Indexed: 01/17/2023] [Open Access]
Abstract
The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. We here critically discuss structured elucidation approaches and software that are designed to help during the annotation of unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases such as the coalition of MassBank repositories and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (critical assessment of small molecule identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included.
Affiliation(s)
- Ivana Blaženović
  - NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
- Tobias Kind
  - NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
- Jian Ji
  - State Key Laboratory of Food Science and Technology, School of Food Science of Jiangnan University, School of Food Science Synergetic Innovation Center of Food Safety and Nutrition, Wuxi 214122, China.
- Oliver Fiehn
  - NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
  - Department of Biochemistry, Faculty of Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia.
103. Nothias LF, Nothias-Esposito M, da Silva R, Wang M, Protsyuk I, Zhang Z, Sarvepalli A, Leyssen P, Touboul D, Costa J, Paolini J, Alexandrov T, Litaudon M, Dorrestein PC. Bioactivity-Based Molecular Networking for the Discovery of Drug Leads in Natural Product Bioassay-Guided Fractionation. J Nat Prod 2018; 81:758-767. [PMID: 29498278 DOI: 10.1021/acs.jnatprod.7b00737] [Citation(s) in RCA: 208] [Impact Index Per Article: 34.7] [Indexed: 06/08/2023]
Abstract
It is a common problem in natural product therapeutic lead discovery programs that despite good bioassay results in the initial extract, the active compound(s) may not be isolated during subsequent bioassay-guided purification. Herein, we present the concept of bioactive molecular networking to find candidate active molecules directly from fractionated bioactive extracts. By employing tandem mass spectrometry, it is possible to accelerate the dereplication of molecules using molecular networking prior to subsequent isolation of the compounds, and it is also possible to expose potentially bioactive molecules using bioactivity score prediction. Indeed, bioactivity score prediction can be calculated with the relative abundance of a molecule in fractions and the bioactivity level of each fraction. For that reason, we have developed a bioinformatic workflow able to map bioactivity score in molecular networks and applied it for discovery of antiviral compounds from a previously investigated extract of Euphorbia dendroides where the bioactive candidate molecules were not discovered following a classical bioassay-guided fractionation procedure. It can be expected that this approach will be implemented as a systematic strategy, not only in current and future bioactive lead discovery from natural extract collections but also for the reinvestigation of the untapped reservoir of bioactive analogues in previous bioassay-guided fractionation efforts.
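The abstract says a bioactivity score can be computed from a molecule's relative abundance across fractions and the bioactivity level of each fraction, but it does not name the statistic. The sketch below assumes a Pearson correlation between the two series as one plausible choice; the function name `bioactivity_score` and the toy numbers are hypothetical.

```python
import math
from typing import Sequence


def bioactivity_score(abundance: Sequence[float], activity: Sequence[float]) -> float:
    """Pearson correlation between a feature's abundance and fraction bioactivity.

    `abundance[i]` is the relative abundance of one molecule in fraction i and
    `activity[i]` is the measured bioactivity of that fraction. Using Pearson
    correlation is an assumption of this sketch.
    """
    n = len(abundance)
    if n != len(activity) or n < 2:
        raise ValueError("need two equal-length series with at least 2 fractions")
    mean_a = sum(abundance) / n
    mean_b = sum(activity) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(abundance, activity))
    var_a = sum((a - mean_a) ** 2 for a in abundance)
    var_b = sum((b - mean_b) ** 2 for b in activity)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_a * var_b)


# Toy example: abundance rises in step with activity across four fractions.
score_tracking = bioactivity_score([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
# Toy example: abundance falls as activity rises.
score_opposed = bioactivity_score([4.0, 3.0, 2.0, 1.0], [2.0, 4.0, 6.0, 8.0])
```

Features with a score close to 1 would be the candidates to highlight as nodes in the molecular network, while scores near 0 or below suggest the molecule does not track the observed activity.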
Affiliation(s)
- Louis-Félix Nothias
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
  - Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université Paris-Sud, 91198, Gif-sur-Yvette, France
- Mélissa Nothias-Esposito
  - Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université Paris-Sud, 91198, Gif-sur-Yvette, France
  - Laboratoire de Chimie des Produits Naturels, CNRS, UMR SPE 6134, University of Corsica, 20250, Corte, France
- Ricardo da Silva
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Mingxun Wang
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Ivan Protsyuk
  - European Molecular Biology Laboratory, EMBL, Heidelberg, Germany
- Zheng Zhang
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Abi Sarvepalli
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Pieter Leyssen
  - Laboratory for Virology and Experimental Chemotherapy, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- David Touboul
  - Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université Paris-Sud, 91198, Gif-sur-Yvette, France
- Jean Costa
  - Laboratoire de Chimie des Produits Naturels, CNRS, UMR SPE 6134, University of Corsica, 20250, Corte, France
- Julien Paolini
  - Laboratoire de Chimie des Produits Naturels, CNRS, UMR SPE 6134, University of Corsica, 20250, Corte, France
- Theodore Alexandrov
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
  - European Molecular Biology Laboratory, EMBL, Heidelberg, Germany
- Marc Litaudon
  - Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université Paris-Sud, 91198, Gif-sur-Yvette, France
- Pieter C Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
104. Hutchins PD, Russell JD, Coon JJ. LipiDex: An Integrated Software Package for High-Confidence Lipid Identification. Cell Syst 2018; 6:621-625.e5. [PMID: 29705063 DOI: 10.1016/j.cels.2018.03.011] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Received: 11/09/2017] [Revised: 01/30/2018] [Accepted: 03/14/2018] [Indexed: 12/31/2022]
Abstract
State-of-the-art proteomics software routinely quantifies thousands of peptides per experiment with minimal need for manual validation or processing of data. For the emerging field of discovery lipidomics via liquid chromatography-tandem mass spectrometry (LC-MS/MS), comparably mature informatics tools do not exist. Here, we introduce LipiDex, a freely available software suite that unifies and automates all stages of lipid identification, reducing hands-on processing time from hours to minutes for even the most expansive datasets. LipiDex utilizes flexible in silico fragmentation templates and lipid-optimized MS/MS spectral matching routines to confidently identify and track hundreds of lipid species and unknown compounds from diverse sample matrices. Unique spectral and chromatographic peak purity algorithms accurately quantify co-isolation and co-elution of isobaric lipids, generating identifications that match the structural resolution afforded by the LC-MS/MS experiment. During final data filtering, ionization artifacts are removed to significantly reduce dataset redundancy. LipiDex interfaces with several LC-MS/MS software packages, enabling robust lipid identification to be readily incorporated into pre-existing data workflows.
Affiliation(s)
- Paul D Hutchins
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jason D Russell
  - Morgridge Institute for Research, Madison, WI 53715, USA
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI 53706, USA
- Joshua J Coon
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Morgridge Institute for Research, Madison, WI 53715, USA
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
105. da Silva RR, Wang M, Nothias LF, van der Hooft JJJ, Caraballo-Rodríguez AM, Fox E, Balunas MJ, Klassen JL, Lopes NP, Dorrestein PC. Propagating annotations of molecular networks using in silico fragmentation. PLoS Comput Biol 2018; 14:e1006089. [PMID: 29668671 PMCID: PMC5927460 DOI: 10.1371/journal.pcbi.1006089] [Citation(s) in RCA: 189] [Impact Index Per Article: 31.5] [Received: 11/22/2017] [Revised: 04/30/2018] [Accepted: 03/13/2018] [Indexed: 12/19/2022] [Open Access]
Abstract
The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
Affiliation(s)
- Ricardo R. da Silva
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
  - NPPNS, Department of Physics and Chemistry, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, SP, Brazil
- Mingxun Wang
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- Louis-Félix Nothias
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- Justin J. J. van der Hooft
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands
- Andrés Mauricio Caraballo-Rodríguez
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
- Evan Fox
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, United States of America
- Marcy J. Balunas
  - Division of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Connecticut, Storrs, CT, United States of America
- Jonathan L. Klassen
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, United States of America
- Norberto Peporine Lopes
  - NPPNS, Department of Physics and Chemistry, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, SP, Brazil
- Pieter C. Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States of America
106. Advances in computational metabolomics and databases deepen the understanding of metabolisms. Curr Opin Biotechnol 2018; 54:10-17. [PMID: 29413746 DOI: 10.1016/j.copbio.2018.01.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 12/03/2017] [Revised: 01/06/2018] [Accepted: 01/09/2018] [Indexed: 01/13/2023]
Abstract
Mass spectrometry (MS)-based metabolomics is the popular platform for metabolome analyses. Computational techniques for the processing of MS raw data, for example, feature detection, peak alignment, and the exclusion of false-positive peaks, have been established. The next stage of untargeted metabolomics would be to decipher the mass fragmentation of small molecules for the global identification of human-, animal-, plant-, and microbiota metabolomes, resulting in a deeper understanding of metabolisms. This review is an update on the latest computational metabolomics including known/expected structure databases, chemical ontology classifications, and mass spectrometry cheminformatics for the interpretation of mass fragmentations and for the elucidation of unknown metabolites. The importance of metabolome 'databases' and 'repositories' is also discussed because novel biological discoveries are often attributable to the accumulation of data, to relational databases, and to their statistics. Lastly, a practical guide for metabolite annotations is presented as the summary of this review.