1
Ragland JM, Place BJ. A Portable and Reusable Database Infrastructure for Mass Spectrometry, and Its Associated Toolkit (The DIMSpec Project). Journal of the American Society for Mass Spectrometry 2024; 35:1282-1291. [PMID: 38704738 DOI: 10.1021/jasms.4c00073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2024]
Abstract
Nontargeted analysis (NTA) is a rapidly growing field of techniques that includes the identification of unknown chemical analytes in complex mixtures such as environmental, biological, and food matrices. The use of reference mass spectral databases is a key component of most NTA workflows, providing a high level of confidence for chemical identification when analytical standards are not available, yet effective interlaboratory sharing of research grade spectra remains challenging. The Database Infrastructure for Mass Spectrometry (DIMSpec) project focused on the creation of an open-source toolkit supporting storage and sharing of high-resolution mass spectra with attached sample and methodological metadata. As a demonstration of its utility, the DIMSpec toolkit was used to create a database of curated mass spectra for per- and polyfluoroalkyl substances (PFAS) generated from various sources. While the underlying toolkit is agnostic to analytical targets, this initial release (along with the database schema, mass spectral data, and database tools) should enable PFAS researchers to use these data for their own studies, including the identification of novel PFAS in the environment.
Affiliation(s)
- Jared M Ragland
  - National Institute of Standards and Technology, Material Measurement Laboratory, Chemical Sciences Division, Gaithersburg, Maryland 20899, United States
- Benjamin J Place
  - National Institute of Standards and Technology, Material Measurement Laboratory, Chemical Sciences Division, Gaithersburg, Maryland 20899, United States

2
Chingate E, Drewes JE, Farré MJ, Hübner U. OrbiFragsNets. A tool for automatic annotation of orbitrap MS2 spectra using networks grade as selection criteria. MethodsX 2023; 11:102257. [PMID: 37383622 PMCID: PMC10293764 DOI: 10.1016/j.mex.2023.102257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023] [Open access]
Abstract
We introduce OrbiFragsNets, a tool for automatic annotation of MS2 spectra generated by Orbitrap instruments, as well as the concepts of chemical consistency and fragments networks. OrbiFragsNets takes advantage of the specific confidence interval for each peak in every MS2 spectrum, an idea that remains unclear across the high-resolution mass spectrometry literature. The spectrum annotations are expressed as fragments networks, a set of networks with the possible combinations of annotations for the fragments. The model behind OrbiFragsNets is briefly described here and explained in detail in the constantly updated manual available in the GitHub repository. This new approach to de novo automatic annotation of MS2 spectra performed as well as well-established tools such as RMassBank and SIRIUS.
- A new approach to automatic annotation of Orbitrap MS2 spectra is introduced.
- Possible spectrum annotations are described as independent consistent networks, with annotations for each fragment as nodes and annotations for the mass difference between fragments as edges.
- The annotation process is described as the selection of the most connected fragments network.
Affiliation(s)
- Edwin Chingate
  - Chair of Urban Water Systems Engineering, Technical University of Munich, Am Coulombwall 3, Garching 85748, Germany
  - Catalan Institute for Water Research, Emili Grahit 101, Girona 17003, Spain
  - Universitat de Girona, Girona, Spain
- Jörg E. Drewes
  - Chair of Urban Water Systems Engineering, Technical University of Munich, Am Coulombwall 3, Garching 85748, Germany
- María José Farré
  - Catalan Institute for Water Research, Emili Grahit 101, Girona 17003, Spain
- Uwe Hübner
  - Chair of Urban Water Systems Engineering, Technical University of Munich, Am Coulombwall 3, Garching 85748, Germany

3
Kang Q, Fang P, Zhang S, Qiu H, Lan Z. Deep graph convolutional network for small-molecule retention time prediction. J Chromatogr A 2023; 1711:464439. [PMID: 37865024 DOI: 10.1016/j.chroma.2023.464439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 10/23/2023]
Abstract
The retention time (RT) is a crucial source of data for liquid chromatography-mass spectrometry (LCMS). A model that can accurately predict the RT for each molecule would make it possible to filter out candidates with similar spectra but differing RT in LCMS-based molecule identification. Recent research shows that graph neural networks (GNNs) outperform traditional machine learning algorithms in RT prediction. However, all of these models use relatively shallow GNNs. This study investigates for the first time how depth affects GNN performance on RT prediction. The results demonstrate that a notable improvement can be achieved by pushing the depth of GNNs to 16 layers through the adoption of residual connections. We also find that the graph convolutional network (GCN) model benefits from edge information. The developed deep graph convolutional network, DeepGCN-RT, significantly outperforms the previous state-of-the-art method and achieves the lowest mean absolute percentage error (MAPE) of 3.3% and the lowest mean absolute error (MAE) of 26.55 s on the SMRT test set. We also fine-tune DeepGCN-RT on seven datasets with various chromatographic conditions. The mean MAE across the seven datasets decreases by 30% compared to the previous state-of-the-art method. On the RIKEN-PlaSMA dataset, we also test the effectiveness of DeepGCN-RT in assisting molecular structure identification. By reducing the number of potential structures by 30%, DeepGCN-RT is able to improve top-1 accuracy by about 11%.
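The two error metrics quoted in this abstract, MAE and MAPE, can be illustrated with a short calculation. The sketch below is not the authors' evaluation code; the function names and the three retention times are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from typing import Sequence


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |observed - predicted|, in the unit of the inputs (seconds here)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_absolute_percentage_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |observed - predicted| / |observed|, expressed in percent."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical retention times in seconds (not taken from the SMRT data set).
observed_rt = [100.0, 200.0, 400.0]
predicted_rt = [110.0, 190.0, 420.0]

mae = mean_absolute_error(observed_rt, predicted_rt)              # (10 + 10 + 20) / 3
mape = mean_absolute_percentage_error(observed_rt, predicted_rt)  # (10% + 5% + 5%) / 3
print(f"MAE = {mae:.2f} s, MAPE = {mape:.2f} %")
```

MAE is reported in seconds and therefore depends on the length of the chromatographic run, whereas MAPE is relative to each observed RT, which is presumably why the abstract reports both.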
Affiliation(s)
- Qiyue Kang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Pengfei Fang
  - School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Shuai Zhang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Huachuan Qiu
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Zhenzhong Lan
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China

4
Arturi K, Hollender J. Machine Learning-Based Hazard-Driven Prioritization of Features in Nontarget Screening of Environmental High-Resolution Mass Spectrometry Data. Environmental Science & Technology 2023; 57:18067-18079. [PMID: 37279189 PMCID: PMC10666537 DOI: 10.1021/acs.est.3c00304] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/12/2023] [Revised: 05/15/2023] [Accepted: 05/15/2023] [Indexed: 06/08/2023]
Abstract
Nontarget high-resolution mass spectrometry screening (NTS HRMS/MS) can detect thousands of organic substances in environmental samples. However, new strategies are needed to focus time-intensive identification efforts on features with the highest potential to cause adverse effects instead of the most abundant ones. To address this challenge, we developed MLinvitroTox, a machine learning framework that uses molecular fingerprints derived from fragmentation spectra (MS2) for a rapid classification of thousands of unidentified HRMS/MS features as toxic/nontoxic based on nearly 400 target-specific and over 100 cytotoxic endpoints from ToxCast/Tox21. Model development results demonstrated that using customized molecular fingerprints and models, over a quarter of toxic endpoints and the majority of the associated mechanistic targets could be accurately predicted with sensitivities exceeding 0.95. Notably, SIRIUS molecular fingerprints and XGBoost (Extreme Gradient Boosting) models with SMOTE (Synthetic Minority Oversampling Technique) for handling data imbalance were a universally successful and robust modeling configuration. Validation of MLinvitroTox on MassBank spectra showed that toxicity could be predicted from molecular fingerprints derived from MS2 with an average balanced accuracy of 0.75. By applying MLinvitroTox to environmental HRMS/MS data, we confirmed the experimental results obtained with target analysis and narrowed the analytical focus from tens of thousands of detected signals to 783 features linked to potential toxicity, including 109 spectral matches and 30 compounds with confirmed toxic activity.
Affiliation(s)
- Katarzyna Arturi
  - Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Ueberlandstrasse 133, 8600 Dübendorf, Switzerland
- Juliane Hollender
  - Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Ueberlandstrasse 133, 8600 Dübendorf, Switzerland
  - Institute of Biogeochemistry and Pollution Dynamics, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Rämistrasse 101, 8092 Zürich, Switzerland

5
Kong F, Keshet U, Shen T, Rodriguez E, Fiehn O. LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers. Anal Chem 2023; 95:16810-16818. [PMID: 37939222 DOI: 10.1021/acs.analchem.3c02263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
Compound annotation using spectral-matching algorithms is vital for MS/MS-based metabolomics research, but is hindered by the lack of high-quality reference MS/MS library spectra. Finding and removing errors from libraries, including noise ions, is mostly done manually. This process is both error-prone and time-consuming. To address these challenges, we have developed an automated library curation pipeline, LibGen, to universally build novel spectral libraries. This pipeline corrects mass errors, denoises spectra by subformula assignments, and performs quality control of the reference spectra by calculating explained intensity and spectral entropy. We employed LibGen to generate three high-quality libraries with chemical standards of 2241 natural products. To this end, we used an IQ-X orbital ion trap mass spectrometer to generate 1947 classic high-energy collision dissociation spectra (HCD) as well as 1093 ultraviolet-photodissociation (UVPD) mass spectra. The third library was generated by an electron-activated collision dissociation (EAD) 7600 ZenoTOF mass spectrometer yielding 3244 MS/MS spectra. The natural compounds covered 140 chemical classes from prenol lipids to benzopyrans with >97% of the compounds showing <0.2 Tanimoto-similarity, demonstrating a very high structural variance. Mass spectra showed much higher information content for both UVPD- and EAD-mass spectra compared to classic HCD spectra when using spectral entropy calculations. We validated the denoising algorithm by acquiring MS/MS spectra from chemical standards at high concentration and at 13-fold dilution. At low concentrations, a higher proportion of spectra showed apparent fragment ions that could not be explained by subformula losses of the parent molecule. When more than 10% of the total intensity of MS/MS fragments was regarded as noise ions, spectra were considered as low quality and were not included in the libraries.
As the overall process is fully automated, LibGen can be utilized by all researchers who create or curate mass spectral libraries. The libraries we created here are publicly available at MassBank.us.
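The two quality-control quantities named in this abstract, spectral entropy and the share of intensity regarded as noise, can be sketched as follows. This is an illustration under stated assumptions, not LibGen's code: spectral entropy is taken here to be the Shannon entropy (natural logarithm) of the intensities normalized to sum to 1, the `explained` flags stand in for the outcome of a subformula assignment that is not implemented here, and the peak intensities are hypothetical.

```python
import math
from typing import Sequence


def spectral_entropy(intensities: Sequence[float]) -> float:
    """Shannon entropy (natural log) of peak intensities normalized to sum to 1."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must contain positive intensity")
    probabilities = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probabilities)


def noise_fraction(intensities: Sequence[float], explained: Sequence[bool]) -> float:
    """Fraction of total intensity carried by peaks without a subformula assignment."""
    if len(intensities) != len(explained):
        raise ValueError("one explained flag is required per peak")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must contain positive intensity")
    unexplained = sum(i for i, ok in zip(intensities, explained) if not ok)
    return unexplained / total


# Hypothetical four-peak spectrum; the last peak has no subformula assignment.
intensities = [50.0, 30.0, 15.0, 5.0]
explained = [True, True, True, False]

entropy = spectral_entropy(intensities)
noise = noise_fraction(intensities, explained)

# The abstract excludes spectra in which more than 10% of the intensity is noise.
keep_spectrum = noise <= 0.10
print(f"entropy = {entropy:.3f}, noise fraction = {noise:.2f}, keep = {keep_spectrum}")
```

A spectrum with a single peak has entropy 0, and a spectrum with n equally intense peaks has entropy ln(n), so higher values indicate intensity spread over more fragments.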
Affiliation(s)
- Fanzhou Kong
  - Chemistry Department, One Shields Avenue, University of California-Davis, Davis, California 95616, United States
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Uri Keshet
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Tong Shen
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Elys Rodriguez
  - Chemistry Department, One Shields Avenue, University of California-Davis, Davis, California 95616, United States
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States

6
Elapavalore A, Kondić T, Singh RR, Shoemaker BA, Thiessen PA, Zhang J, Bolton EE, Schymanski EL. Adding open spectral data to MassBank and PubChem using open source tools to support non-targeted exposomics of mixtures. Environmental Science. Processes & Impacts 2023; 25:1788-1801. [PMID: 37431591 PMCID: PMC10648001 DOI: 10.1039/d3em00181d] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/28/2023] [Accepted: 06/25/2023] [Indexed: 07/12/2023]
Abstract
The term "exposome" is defined as a comprehensive study of life-course environmental exposures and the associated biological responses. Humans are exposed to many different chemicals, which can pose a major threat to the well-being of humanity. Targeted or non-targeted mass spectrometry techniques are widely used to identify and characterize various environmental stressors when linking exposures to human health. However, identification remains challenging due to the huge chemical space applicable to exposomics, combined with the lack of sufficient relevant entries in spectral libraries. Addressing these challenges requires cheminformatics tools and database resources to share curated open spectral data on chemicals to improve the identification of chemicals in exposomics studies. This article describes efforts to contribute spectra relevant for exposomics to the open mass spectral library MassBank (https://www.massbank.eu) using various open source software efforts, including the R packages RMassBank and Shinyscreen. The experimental spectra were obtained from ten mixtures containing toxicologically relevant chemicals from the US Environmental Protection Agency (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). Following processing and curation, 5582 spectra from 783 of the 1268 ENTACT compounds were added to MassBank, and through this to other open spectral libraries (e.g., MoNA, GNPS) for community benefit. Additionally, an automated deposition and annotation workflow was developed with PubChem to enable the display of all MassBank mass spectra in PubChem, which is rerun with each MassBank release. The new spectral records have already been used in several studies to increase the confidence in identification in non-target small molecule identification workflows applied to environmental and exposomics research.
Affiliation(s)
- Anjana Elapavalore
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Todor Kondić
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Randolph R Singh
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
  - IFREMER (Institut Français de Recherche pour l'Exploitation de la Mer), Laboratoire Biogéochimie des Contaminants Organiques, Rue de l'Ile d'Yeu, BP 21105, Nantes Cedex 3, 44311, France
- Benjamin A Shoemaker
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Paul A Thiessen
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Jian Zhang
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Evan E Bolton
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Emma L Schymanski
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg

7
Codrean S, Kruit B, Meekel N, Vughs D, Béen F. Predicting the Diagnostic Information of Tandem Mass Spectra of Environmentally Relevant Compounds Using Machine Learning. Anal Chem 2023; 95:15810-15817. [PMID: 37812582 PMCID: PMC10603772 DOI: 10.1021/acs.analchem.3c03470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 09/21/2023] [Indexed: 10/11/2023]
Abstract
Acquisition and processing of informative tandem mass spectra (MS2) is crucial for numerous applications, including library-based (tentative) identification, feature prioritization, and prediction of chemical and toxicological characteristics. However, for environmentally relevant compounds, approaches to automatically assess the quality of the MS2 spectra are missing. This work focused on developing a machine learning-based approach to automatically evaluate the diagnostic information of MS2 spectra (e.g., number, distribution, and intensity of diagnostic fragments) of environmentally relevant compounds analyzed with electrospray ionization. For this, approximately 1400 MS2 spectra of 204 environmental contaminants, acquired with different collision energies using liquid chromatography coupled to high-resolution mass spectrometry, were used to train a random forest classifier to distinguish between spectra providing good or poor diagnostic information. Prior to training, validation, and testing, spectra were manually labeled based on criteria such as number, intensity, range of fragments present, molecular ion intensity, and noise levels. Subsequently, feature engineering and selection were applied to retrieve relevant variables from raw MS2 spectra as inputs for the classifier. The optimal set of features based on model performances was selected and used to train a final model, which showed an accuracy of 84%, a precision of 88%, and a recall of 75%. Results show that the combination of selected features and the machine learning model used here can effectively distinguish between MS2 spectra providing good or poor diagnostic information according to the defined criteria. The developed model has the potential to improve a broad range of applications that rely on MS2 data.
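The accuracy, precision, and recall figures in this abstract all derive from the four cells of a binary confusion matrix, with "good diagnostic information" treated as the positive class. The sketch below shows that relationship. The class name, function names, and counts are hypothetical and were not taken from the study, whose confusion matrix is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = good spectrum)."""
    tp: int  # good spectra classified as good
    fp: int  # poor spectra classified as good
    fn: int  # good spectra classified as poor
    tn: int  # poor spectra classified as poor


def accuracy(c: ConfusionCounts) -> float:
    """Share of all spectra that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    """Share of spectra predicted as good that really are good."""
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Share of truly good spectra that the classifier recovers."""
    return c.tp / (c.tp + c.fn)


# Hypothetical counts for 100 test spectra (illustration only).
counts = ConfusionCounts(tp=30, fp=5, fn=10, tn=55)
print(f"accuracy = {accuracy(counts):.2f}")
print(f"precision = {precision(counts):.2f}")
print(f"recall = {recall(counts):.2f}")
```

A precision higher than recall, as reported in the abstract, means the classifier rarely labels a poor spectrum as good but misses a larger share of the good ones.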
Affiliation(s)
- S. Codrean
  - Faculty of Science, Artificial Intelligence, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands
- B. Kruit
  - Faculty of Science, Artificial Intelligence, Vrije Universiteit Amsterdam, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands
- N. Meekel
  - KWR Water Research Institute, Groningenhaven 7, P.O. Box 1072, 3430 BB Nieuwegein, The Netherlands
- D. Vughs
  - KWR Water Research Institute, Groningenhaven 7, P.O. Box 1072, 3430 BB Nieuwegein, The Netherlands
- F. Béen
  - KWR Water Research Institute, Groningenhaven 7, P.O. Box 1072, 3430 BB Nieuwegein, The Netherlands
  - Chemistry for Environment and Health, Amsterdam Institute for Life and Environment (A-LIFE), Vrije Universiteit De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands

8
Souihi A, Mohai MP, Martin JW, Kruve A. Mobile phase and column chemistry selection for high sensitivity non-targeted LC/ESI/HRMS screening of water. Anal Chim Acta 2023; 1274:341573. [PMID: 37455083 DOI: 10.1016/j.aca.2023.341573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 05/23/2023] [Accepted: 06/28/2023] [Indexed: 07/18/2023]
Abstract
Systematic selection of mobile phase and column chemistry type can be critical for achieving optimal chromatographic separation, high sensitivity, and low detection limits in liquid chromatography electrospray high resolution mass spectrometry (LC/ESI/HRMS). However, the selection process is challenging for non-targeted screening where the compounds of interest are neither preselected nor available for method optimization. To provide general guidance, twenty different mobile phase compositions and four columns were compared for the analysis of 78 compounds with a wide range of physicochemical properties (logP range from -1.46 to 5.48), and analyte sensitivity was compared between methods. The pH, additive type, column, and organic modifier had significant effects on the analyte response factors, and acidic mobile phases (e.g. 0.1% formic acid) yielded the highest sensitivity. In some cases, the effect was attributable to the difference in organic modifier content at the time of elution, depending on the mobile phase and column chemistry. Based on these findings, 0.1% formic acid, 0.1% ammonia and 5.0 mM ammonium fluoride were further evaluated for their performance in non-targeted LC/ESI/HRMS analysis of wastewater treatment plant influent and effluent, using data-dependent MS2 acquisition and two different data processing workflows (MS-DIAL, patRoon 2.1) to compare number of detected features and sensitivity. Both data-processing workflows indicated that 0.1% formic acid yielded the highest number of features in the full-scan spectrum (MS1), as well as the highest number of features that triggered fragmentation spectra (MS2) when dynamic exclusion was used.
Affiliation(s)
- Amina Souihi
  - Department of Environmental and Materials Chemistry, Stockholm University, Svante Arrhenius väg 16, 106 91, Stockholm, Sweden
- Miklos Peter Mohai
  - Department of Environmental and Materials Chemistry, Stockholm University, Svante Arrhenius väg 16, 106 91, Stockholm, Sweden
- Jonathan W Martin
  - Department of Environmental Science, Stockholm University, Svante Arrhenius väg 8, 106 91, Stockholm, Sweden; Science for Life Laboratory, Department of Environmental Science, Stockholm University, Svante Arrhenius väg 8, 106 91, Stockholm, Sweden
- Anneli Kruve
  - Department of Environmental and Materials Chemistry, Stockholm University, Svante Arrhenius väg 16, 106 91, Stockholm, Sweden; Department of Environmental Science, Stockholm University, Svante Arrhenius väg 8, 106 91, Stockholm, Sweden

9
van Herwerden D, O’Brien JW, Lege S, Pirok BWJ, Thomas KV, Samanipour S. Cumulative Neutral Loss Model for Fragment Deconvolution in Electrospray Ionization High-Resolution Mass Spectrometry Data. Anal Chem 2023; 95:12247-12255. [PMID: 37549176 PMCID: PMC10448439 DOI: 10.1021/acs.analchem.3c00896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 07/03/2023] [Indexed: 08/09/2023]
Abstract
Clean high-resolution mass spectra (HRMS) are essential to a successful structural elucidation of an unknown feature during nontarget analysis (NTA) workflows. This is a crucial step, particularly for the spectra generated during data-independent acquisition or during direct infusion experiments. The most commonly available tools only take advantage of the time domain for spectral cleanup. Here, we present an algorithm that combines the time domain and mass domain information to perform spectral deconvolution. The algorithm employs a probability-based cumulative neutral loss (CNL) model for fragment deconvolution. The optimized model, with a mass tolerance of 0.005 Da and a scoreCNL threshold of 0.00, was able to achieve a true positive rate (TPr) of 95.0%, a false discovery rate (FDr) of 20.6%, and a reduction rate of 35.4%. Additionally, the CNL model was extensively tested on real samples containing predominantly pesticides at different concentration levels and with matrix effects. Overall, the model was able to obtain a TPr above 88.8% with FD rates between 33 and 79% and reduction rates between 9 and 45%. Finally, the CNL model was compared with the retention time difference method and peak shape correlation analysis, showing that a combination of correlation analysis and the CNL model was the most effective for fragment deconvolution, obtaining a TPr of 84.7%, an FDr of 54.4%, and a reduction rate of 51.0%.
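The mass-domain information this abstract refers to rests on neutral losses, that is, the mass differences between a precursor ion and its candidate fragments, compared within a mass tolerance. The sketch below shows only that elementary step with the 0.005 Da tolerance quoted above. It is not the probability-based CNL model, which is not specified here. The small loss table, the function name, and the m/z values are assumptions made for illustration, and singly charged ions are assumed so that m/z differences equal mass differences.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Monoisotopic masses (Da) of a few common neutral losses (illustrative subset).
COMMON_LOSSES: Dict[str, float] = {
    "H2O": 18.010565,
    "NH3": 17.026549,
    "CO2": 43.989829,
}


def annotate_losses(
    precursor_mz: float,
    fragment_mzs: Sequence[float],
    tolerance_da: float = 0.005,
) -> List[Tuple[float, Optional[str]]]:
    """Pair each fragment with the first listed loss matching within the tolerance."""
    annotated: List[Tuple[float, Optional[str]]] = []
    for mz in fragment_mzs:
        loss = precursor_mz - mz  # neutral loss for a singly charged ion
        label: Optional[str] = None
        for name, mass in COMMON_LOSSES.items():
            if abs(loss - mass) <= tolerance_da:
                label = name
                break
        annotated.append((mz, label))
    return annotated


# Hypothetical precursor and candidate fragments (m/z values invented).
precursor = 200.0000
fragments = [181.9894, 156.0102, 150.0000]

annotated = annotate_losses(precursor, fragments)
# Keep only fragments whose loss matches an entry in the table.
retained = [mz for mz, label in annotated if label is not None]
print(annotated)
print(retained)
```

In this toy filter a fragment is retained only when its loss matches a table entry, whereas the published model assigns a score to each candidate and applies a threshold to that score.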
Affiliation(s)
- Denice van Herwerden
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Jake W. O’Brien
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Sascha Lege
  - Agilent Technologies Deutschland GmbH, Waldbronn 76337, Germany
- Bob W. J. Pirok
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Kevin V. Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Saer Samanipour
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1012 WP, The Netherlands

10
da Silva KM, van de Lavoir M, Robeyns R, Iturrospe E, Verheggen L, Covaci A, van Nuijs ALN. Guidelines and considerations for building multidimensional libraries for untargeted MS-based metabolomics. Metabolomics 2022; 19:4. [PMID: 36576608 DOI: 10.1007/s11306-022-01965-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 12/05/2022] [Indexed: 12/29/2022]
Abstract
INTRODUCTION: Feature annotation is crucial in untargeted metabolomics but remains a major challenge. The large pool of metabolites collected under various instrumental conditions is underrepresented in publicly available databases. Retention time (RT) and collision cross section (CCS) measurements from liquid chromatography ion mobility high-resolution mass spectrometers can be employed in addition to MS/MS spectra to improve the confidence of metabolite annotation. Recent advancements in machine learning focus on improving the accuracy of predictions for CCS and RT values. High-quality experimental data are therefore crucial, either as training datasets or as a reference for high-confidence matching.

METHODS: This manuscript provides an easy-to-use workflow for the creation of an in-house metabolite library, offers an overview of alternative solutions, and discusses the challenges and advantages of using open-source software. A total of 100 metabolite standards from various classes were analyzed and subjected to the described workflow for library generation.

RESULTS AND DISCUSSION: The outcome was an openly available metabolite library in NIST format (.msp) with multidimensional information. The library was used to evaluate CCS prediction tools, MS/MS spectra heterogeneities (e.g., multiple adducts, in-source fragmentation, radical fragment ions using collision-induced dissociation), and the reporting of RT.
Affiliation(s)
- Katyeny Manuela da Silva
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
- Maria van de Lavoir
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
- Rani Robeyns
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
- Elias Iturrospe
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
  - Department of In Vitro Toxicology and Dermato-Cosmetology, Faculty of Medicine and Pharmacy, Campus Jette, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090, Brussels, Belgium
- Lisa Verheggen
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
- Adrian Covaci
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium
- Alexander L N van Nuijs
  - Department of Pharmaceutical Sciences, Faculty of Pharmaceutical, Toxicological Centre, Biomedical and Veterinary Sciences, Campus Drie Eiken, University of Antwerp, Universiteitsplein 1, 2610, Antwerp, Belgium

11
Mohammed Taha H, Aalizadeh R, Alygizakis N, Antignac JP, Arp HPH, Bade R, Baker N, Belova L, Bijlsma L, Bolton EE, Brack W, Celma A, Chen WL, Cheng T, Chirsir P, Čirka Ľ, D’Agostino LA, Djoumbou Feunang Y, Dulio V, Fischer S, Gago-Ferrero P, Galani A, Geueke B, Głowacka N, Glüge J, Groh K, Grosse S, Haglund P, Hakkinen PJ, Hale SE, Hernandez F, Janssen EML, Jonkers T, Kiefer K, Kirchner M, Koschorreck J, Krauss M, Krier J, Lamoree MH, Letzel M, Letzel T, Li Q, Little J, Liu Y, Lunderberg DM, Martin JW, McEachran AD, McLean JA, Meier C, Meijer J, Menger F, Merino C, Muncke J, Muschket M, Neumann M, Neveu V, Ng K, Oberacher H, O’Brien J, Oswald P, Oswaldova M, Picache JA, Postigo C, Ramirez N, Reemtsma T, Renaud J, Rostkowski P, Rüdel H, Salek RM, Samanipour S, Scheringer M, Schliebner I, Schulz W, Schulze T, Sengl M, Shoemaker BA, Sims K, Singer H, Singh RR, Sumarah M, Thiessen PA, Thomas KV, Torres S, Trier X, van Wezel AP, Vermeulen RCH, Vlaanderen JJ, von der Ohe PC, Wang Z, Williams AJ, Willighagen EL, Wishart DS, Zhang J, Thomaidis NS, Hollender J, Slobodnik J, Schymanski EL. The NORMAN Suspect List Exchange (NORMAN-SLE): facilitating European and worldwide collaboration on suspect screening in high resolution mass spectrometry. Environmental Sciences Europe 2022; 34:104. [PMID: 36284750 PMCID: PMC9587084 DOI: 10.1186/s12302-022-00680-6] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 07/27/2022] [Accepted: 09/24/2022] [Indexed: 06/16/2023]
Abstract
Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide.
Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of >40,000 unique views, >50,000 unique downloads and 40 citations (May 2022). NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA's CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101).
Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the "one substance, one assessment" approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/).
Supplementary Information: The online version contains supplementary material available at 10.1186/s12302-022-00680-6.
Affiliation(s)
- Hiba Mohammed Taha
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
| | - Reza Aalizadeh
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| | - Nikiforos Alygizakis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
| | | | - Hans Peter H. Arp
- Norwegian Geotechnical Institute (NGI), Ullevål Stadion, P.O. Box 3930, 0806 Oslo, Norway
- Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | - Richard Bade
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, QLD 4102 Australia
| | | | - Lidia Belova
- Toxicological Centre, University of Antwerp, Antwerp, Belgium
| | - Lubertus Bijlsma
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, Spain
| | - Evan E. Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Werner Brack
- UFZ, Helmholtz Centre for Environmental Research, Leipzig, Germany
- Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt Am Main, Germany
| | - Alberto Celma
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, Spain
- Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden
| | - Wen-Ling Chen
- Institute of Food Safety and Health, College of Public Health, National Taiwan University, 17 Xuzhou Rd., Zhongzheng Dist., Taipei, Taiwan
| | - Tiejun Cheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Parviel Chirsir
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
| | - Ľuboš Čirka
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
- Faculty of Chemical and Food Technology, Institute of Information Engineering, Automation, and Mathematics, Slovak University of Technology in Bratislava (STU), Radlinského 9, 812 37 Bratislava, Slovak Republic
| | - Lisa A. D’Agostino
- Science for Life Laboratory, Department of Environmental Science, Stockholm University, 10691 Stockholm, Sweden
| | | | - Valeria Dulio
- INERIS, National Institute for Environment and Industrial Risks, Verneuil en Halatte, France
| | - Stellan Fischer
- Swedish Chemicals Agency (KEMI), P.O. Box 2, 172 13 Sundbyberg, Sweden
| | - Pablo Gago-Ferrero
- Institute of Environmental Assessment and Water Research-Severo Ochoa Excellence Center (IDAEA), Spanish Council of Scientific Research (CSIC), Barcelona, Spain
| | - Aikaterini Galani
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| | - Birgit Geueke
- Food Packaging Forum Foundation, Staffelstrasse 10, 8045 Zurich, Switzerland
| | - Natalia Głowacka
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
| | - Juliane Glüge
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland
| | - Ksenia Groh
- Eawag, Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | - Sylvia Grosse
- Thermo Fisher Scientific, Dornierstrasse 4, 82110 Germering, Germany
| | - Peter Haglund
- Department of Chemistry, Chemical Biological Centre (KBC), Umeå University, Linnaeus Väg 6, 901 87 Umeå, Sweden
| | - Pertti J. Hakkinen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Sarah E. Hale
- Norwegian Geotechnical Institute (NGI), Ullevål Stadion, P.O. Box 3930, 0806 Oslo, Norway
| | - Felix Hernandez
- Environmental and Public Health Analytical Chemistry, Research Institute for Pesticides and Water, University Jaume I, Castelló, Spain
| | - Elisabeth M.-L. Janssen
- Eawag, Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | - Tim Jonkers
- Department Environment and Health, Amsterdam Institute for Life and Environment, Vrije Universiteit, Amsterdam, The Netherlands
| | - Karin Kiefer
- Eawag, Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | - Michal Kirchner
- Water Research Institute (WRI), Nábr. Arm. Gen. L. Svobodu 5, 81249 Bratislava, Slovak Republic
| | - Jan Koschorreck
- German Environment Agency (UBA), Wörlitzer Platz 1, Dessau-Roßlau, Germany
| | - Martin Krauss
- UFZ, Helmholtz Centre for Environmental Research, Leipzig, Germany
| | - Jessy Krier
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
| | - Marja H. Lamoree
- Department Environment and Health, Amsterdam Institute for Life and Environment, Vrije Universiteit, Amsterdam, The Netherlands
| | - Marion Letzel
- Bavarian Environment Agency, 86179 Augsburg, Germany
| | - Thomas Letzel
- Analytisches Forschungsinstitut Für Non-Target Screening GmbH (AFIN-TS), Am Mittleren Moos 48, 86167 Augsburg, Germany
| | - Qingliang Li
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - James Little
- Mass Spec Interpretation Services, 3612 Hemlock Park Drive, Kingsport, TN 37663 USA
| | - Yanna Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences (SKLECE, RCEES, CAS), No. 18 Shuangqing Road, Haidian District, Beijing, 100086 China
| | - David M. Lunderberg
- Hope College, Holland, MI 49422 USA
- University of California, Berkeley, CA USA
| | - Jonathan W. Martin
- Science for Life Laboratory, Department of Environmental Science, Stockholm University, 10691 Stockholm, Sweden
| | - Andrew D. McEachran
- Agilent Technologies, Inc., 5301 Stevens Creek Blvd, Santa Clara, CA 95051 USA
| | - John A. McLean
- Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235 USA
| | - Christiane Meier
- German Environment Agency (UBA), Wörlitzer Platz 1, Dessau-Roßlau, Germany
| | - Jeroen Meijer
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, The Netherlands
| | - Frank Menger
- Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden
| | - Carla Merino
- University Rovira i Virgili, Tarragona, Spain
- Biosfer Teslab, Reus, Spain
| | - Jane Muncke
- Food Packaging Forum Foundation, Staffelstrasse 10, 8045 Zurich, Switzerland
| | | | - Michael Neumann
- German Environment Agency (UBA), Wörlitzer Platz 1, Dessau-Roßlau, Germany
| | - Vanessa Neveu
- Nutrition and Metabolism Branch, International Agency for Research On Cancer (IARC), 150 Cours Albert Thomas, 69372 Lyon Cedex 08, France
| | - Kelsey Ng
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Herbert Oberacher
- Institute of Legal Medicine and Core Facility Metabolomics, Medical University of Innsbruck, Muellerstrasse 44, Innsbruck, Austria
| | - Jake O’Brien
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, QLD 4102 Australia
| | - Peter Oswald
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
| | - Martina Oswaldova
- Environmental Institute, Okružná 784/42, 972 41 Koš, Slovak Republic
| | - Jaqueline A. Picache
- Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235 USA
| | - Cristina Postigo
- Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden
- Technologies for Water Management and Treatment Research Group, Department of Civil Engineering, University of Granada, Campus de Fuentenueva S/N, 18071 Granada, Spain
| | - Noelia Ramirez
- University Rovira i Virgili, Tarragona, Spain
- Institute of Health Research Pere Virgili, Tarragona, Spain
| | | | - Justin Renaud
- Agriculture and Agri-Food Canada/Agriculture et Agroalimentaire Canada, 1391 Sandford Street, London, ON N5V 4T3 Canada
| | | | - Heinz Rüdel
- Fraunhofer Institute for Molecular Biology and Applied Ecology (Fraunhofer IME), Schmallenberg, Germany
| | - Reza M. Salek
- Nutrition and Metabolism Branch, International Agency for Research On Cancer (IARC), 150 Cours Albert Thomas, 69372 Lyon Cedex 08, France
| | - Saer Samanipour
- Van’t Hoff Institute for Molecular Sciences, University of Amsterdam, P.O. Box 94157, Amsterdam, 1090 GD The Netherlands
| | - Martin Scheringer
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Ivo Schliebner
- German Environment Agency (UBA), Wörlitzer Platz 1, Dessau-Roßlau, Germany
| | - Wolfgang Schulz
- Laboratory for Operation Control and Research, Zweckverband Landeswasserversorgung, Am Spitzigen Berg 1, 89129 Langenau, Germany
| | - Tobias Schulze
- UFZ, Helmholtz Centre for Environmental Research, Leipzig, Germany
| | - Manfred Sengl
- Bavarian Environment Agency, 86179 Augsburg, Germany
| | - Benjamin A. Shoemaker
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Kerry Sims
- Environment Agency, Horizon House, Deanery Road, Bristol, BS1 5AH UK
| | - Heinz Singer
- Eawag, Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | - Randolph R. Singh
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Chemical Contamination of Marine Ecosystems (CCEM) Unit, Institut Français de Recherche pour l’Exploitation de la Mer (IFREMER), Rue de l’Ile d’Yeu, BP 21105, 44311 Cedex 3, Nantes France
| | - Mark Sumarah
- Agriculture and Agri-Food Canada/Agriculture et Agroalimentaire Canada, 1391 Sandford Street, London, ON N5V 4T3 Canada
| | - Paul A. Thiessen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Kevin V. Thomas
- Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, QLD 4102 Australia
| | | | - Xenia Trier
- Section for Environmental Chemistry and Physics, Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871 Frederiksberg C, Denmark
| | - Annemarie P. van Wezel
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
| | - Roel C. H. Vermeulen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, The Netherlands
| | - Jelle J. Vlaanderen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, The Netherlands
| | | | - Zhanyun Wang
- Technology and Society Laboratory, Empa-Swiss Federal Laboratories for Materials Science and Technology, Lerchenfeldstrasse 5, 9014 St. Gallen, Switzerland
| | - Antony J. Williams
- Computational Chemistry and Cheminformatics Branch (CCCB), Chemical Characterization and Exposure Division (CCED), Center for Computational Toxicology and Exposure (CCTE), United States Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711 USA
| | - Egon L. Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | | | - Jian Zhang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Nikolaos S. Thomaidis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| | - Juliane Hollender
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland
- Eawag, Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | | | - Emma L. Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
|
12
|
Das S, Tanemura KA, Dinpazhoh L, Keng M, Schumm C, Leahy L, Asef CK, Rainey M, Edison AS, Fernández FM, Merz KM. In Silico Collision Cross Section Calculations to Aid Metabolite Annotation. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2022; 33:750-759. [PMID: 35378036 PMCID: PMC9277703 DOI: 10.1021/jasms.1c00315] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/28/2023]
Abstract
The interpretation of ion mobility coupled to mass spectrometry (IM-MS) data to predict unknown structures is challenging and depends on accurate theoretical estimates of the molecular ion collision cross section (CCS) against a buffer gas in a low or atmospheric pressure drift chamber. The sensitivity and reliability of computational prediction of CCS values depend on accurately modeling the molecular state over accessible conformations. In this work, we developed an efficient CCS computational workflow using a machine learning model in conjunction with standard DFT methods and CCS calculations. Furthermore, we have performed Traveling Wave IM-MS (TWIMS) experiments to validate the extant experimental values and assess uncertainties in experimentally measured CCS values. The developed workflow yielded accurate structural predictions and provides unique insights into the likely preferred conformation analyzed using IM-MS experiments. The complete workflow makes the computation of CCS values tractable for a large number of conformationally flexible metabolites with complex molecular structures.
Affiliation(s)
- Susanta Das
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Kiyoto Aramis Tanemura
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Laleh Dinpazhoh
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Mithony Keng
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Christina Schumm
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Lydia Leahy
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
| | - Carter K Asef
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Markace Rainey
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Arthur S Edison
- Departments of Genetics and Biochemistry, Institute of Bioinformatics and Complex Carbohydrate Center, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, United States
| | - Facundo M Fernández
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Kenneth M Merz
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
|
13
|
Paszkiewicz M, Godlewska K, Lis H, Caban M, Białk-Bielińska A, Stepnowski P. Advances in suspect screening and non-target analysis of polar emerging contaminants in the environmental monitoring. Trends Analyt Chem 2022. [DOI: 10.1016/j.trac.2022.116671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
14
|
Souihi A, Mohai MP, Palm E, Malm L, Kruve A. MultiConditionRT: Predicting liquid chromatography retention time for emerging contaminants for a wide range of eluent compositions and stationary phases. J Chromatogr A 2022; 1666:462867. [DOI: 10.1016/j.chroma.2022.462867] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 01/29/2022] [Accepted: 01/29/2022] [Indexed: 12/25/2022]
|
15
|
Natumi R, Dieziger C, Janssen EML. Cyanobacterial Toxins and Cyanopeptide Transformation Kinetics by Singlet Oxygen and pH-Dependence in Sunlit Surface Waters. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:15196-15205. [PMID: 34714625 DOI: 10.1021/acs.est.1c04194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
To assess the risks associated with cyanobacterial blooms, the persistence and fate processes of cyanotoxins and other bioactive cyanobacterial metabolites need to be evaluated. Here, we investigated the reaction with photochemically produced singlet oxygen (¹O₂) for 30 cyanopeptides synthesized by Dolichospermum flos aquae, including 9 anabaenopeptins, 18 microcystins, 2 cyanopeptolins, and 1 cyclamide. All compounds were stable in UVA light alone, but in the presence of a photosensitizer we observed compound-specific degradation. A strong pH effect on the decay was observed for 18 cyanopeptides that all contained tyrosine or structurally related moieties. We can attribute this effect to the reaction with ¹O₂ and triplet sensitizers that preferentially react with the deprotonated form of tyrosine moieties. The contribution of ¹O₂ to indirect phototransformation ranged from 12 to 39%, and second-order rate constants for 9 tyrosine-containing cyanopeptides were assessed. Including the pH dependence of the reaction and system-independent second-order rate constants with ¹O₂ will improve the estimation of half-lives for multiclass cyanopeptides in surface waters. Our data further indicate that naturally occurring triplet sensitizers are likely to oxidize deprotonated tyrosine moieties of cyanopeptides, and the specific reactivity and its pH dependence need to be investigated in future studies.
Affiliation(s)
- Regiane Natumi
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - Christoph Dieziger
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - Elisabeth M-L Janssen
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
|
16
|
Schollée JE, Hollender J, McArdell CS. Characterization of advanced wastewater treatment with ozone and activated carbon using LC-HRMS based non-target screening with automated trend assignment. WATER RESEARCH 2021; 200:117209. [PMID: 34102384 DOI: 10.1016/j.watres.2021.117209] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 12/15/2020] [Revised: 04/05/2021] [Accepted: 04/26/2021] [Indexed: 06/12/2023]
Abstract
Advanced treatment is increasingly being applied to improve abatement of micropollutants in wastewater effluent and reduce their load to surface waters. In this study, non-target screening of high-resolution mass spectrometry (HRMS) data, collected at three Swiss wastewater treatment plants (WWTPs), was used to evaluate different advanced wastewater treatment setups, including (1) granular activated carbon (GAC) filtration alone, (2) pre-ozonation followed by GAC filtration, and (3) pre-ozonation followed by powdered activated carbon (PAC) dosed onto a sand filter. Samples were collected at each treatment step of the WWTP and analyzed with reverse-phase liquid chromatography coupled to HRMS. Each WWTP received a portion of industrial wastewater and a prioritization method was applied to select non-target features potentially resulting from industrial activities. Approximately 37,000 non-target features were found in the influents of the WWTPs. A number of non-target features (1207) were prioritized as likely of industrial origin and 54 were identified through database spectral matching. The fates of all detected non-target features were assessed through a novel automated trend assignment method. A trend was assigned to each non-target feature based on the normalized intensity profile for each sampling date. Results showed that 73±4% of influent non-target features and the majority of industrial features (89%) were well-removed (i.e., >80% intensity reduction) during biological treatment in all three WWTPs. Advanced treatment removed, on average, an additional 11% of influent non-target features, with no significant differences observed among the different advanced treatment settings. 
In contrast, when considering a subset of 66 known micropollutants, advanced treatment was necessary to adequately abate these compounds, and higher abatement was observed in fresh GAC (7,000-8,000 bed volumes (BVs)) compared to older GAC (18,000-48,000 BVs) (80% vs 56% of micropollutants were well-removed, respectively). Approximately half of the features detected in the WWTP effluents were features newly formed during the various treatment steps. In ozonation, between 1108 and 3579 features were classified as potential non-target ozonation transformation products (OTPs). No difference could be observed for their removal in GAC filters at the BVs investigated (70% of OTPs were well-removed on average). A similar removal (67%) was observed with PAC (7.7-13.6 mg/L) dosed onto a sand filter, demonstrating that a post-treatment with activated carbon is efficient for the removal of OTPs.
Affiliation(s)
- Jennifer E Schollée
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Duebendorf 8600, Switzerland.
| | - Juliane Hollender
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Duebendorf 8600, Switzerland; ETH Zurich, Institute of Biogeochemistry and Pollutant Dynamics, Zurich 8092, Switzerland
| | - Christa S McArdell
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Duebendorf 8600, Switzerland
|
17
|
Meijer J, Lamoree M, Hamers T, Antignac JP, Hutinet S, Debrauwer L, Covaci A, Huber C, Krauss M, Walker DI, Schymanski EL, Vermeulen R, Vlaanderen J. An annotation database for chemicals of emerging concern in exposome research. ENVIRONMENT INTERNATIONAL 2021; 152:106511. [PMID: 33773387 DOI: 10.1016/j.envint.2021.106511] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/01/2020] [Revised: 02/03/2021] [Accepted: 03/06/2021] [Indexed: 05/18/2023]
Abstract
BACKGROUND: Chemicals of Emerging Concern (CECs) include a very wide group of chemicals that are suspected to be responsible for adverse effects on health, but for which very limited information is available. Chromatographic techniques coupled with high-resolution mass spectrometry (HRMS) can be used for non-targeted screening and detection of CECs, by using comprehensive annotation databases. Establishing a database focused on the annotation of CECs in human samples will provide new insight into the distribution and extent of exposures to a wide range of CECs in humans.
OBJECTIVES: This study describes an approach for the aggregation and curation of an annotation database (CECscreen) for the identification of CECs in human biological samples.
METHODS: The approach consists of three main parts. First, CECs compound lists from various sources were aggregated and duplications and inorganic compounds were removed. Subsequently, the list was curated by standardization of structures to create "MS-ready" and "QSAR-ready" SMILES, as well as calculation of exact masses (monoisotopic and adducts) and molecular formulas. The second step included the simulation of Phase I metabolites. The third and final step included the calculation of QSAR predictions related to physicochemical properties, environmental fate, toxicity and Absorption, Distribution, Metabolism, Excretion (ADME) processes and the retrieval of information from the US EPA CompTox Chemicals Dashboard.
RESULTS: All CECscreen database and property files are publicly available (DOI: https://doi.org/10.5281/zenodo.3956586). In total, 145,284 entries were aggregated from various CECs data sources. After elimination of duplicates and curation, the pipeline produced 70,397 unique "MS-ready" structures and 66,071 unique "QSAR-ready" structures, corresponding with 69,526 CAS numbers. Simulation of Phase I metabolites resulted in 306,279 unique metabolites. QSAR predictions could be performed for 64,684 of the QSAR-ready structures, whereas information was retrieved from the CompTox Chemicals Dashboard for 59,739 CAS numbers out of 69,526 inquiries. CECscreen is incorporated in the in silico fragmentation approach MetFrag.
DISCUSSION: The CECscreen database can be used to prioritize annotation of CECs measured in non-targeted HRMS, facilitating the large-scale detection of CECs in human samples for exposome research. Large-scale detection of CECs can be further improved by integrating the present database with resources that contain CECs (metabolites) and meta-data measurements, further expansion towards in silico and experimental (e.g., MassBank) generation of MS/MS spectra, and development of bioinformatics approaches capable of using correlation patterns in the measured chemical features.
Affiliation(s)
- Jeroen Meijer
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands; Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
| | - Marja Lamoree
- Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
| | - Timo Hamers
- Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
| | | | | | - Laurent Debrauwer
- Toxalim (Research Centre in Food Toxicology), Toulouse University, INRAE, ENVT, INP-Purpan, Toulouse, France; Metatoul-AXIOM Platform, National Infrastructure for Metabolomics and Fluxomics: MetaboHUB, Toxalim, INRAE, Toulouse, France
| | - Adrian Covaci
- Toxicological Center, University of Antwerp, Belgium
| | - Carolin Huber
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
| | - Martin Krauss
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
| | - Douglas I Walker
- Department of Environmental Medicine and Public Health, Icahn School of Medicine, Mount Sinai, New York, NY, USA
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
| | - Roel Vermeulen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands
| | - Jelle Vlaanderen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands.
|
18
|
Kiefer K, Du L, Singer H, Hollender J. Identification of LC-HRMS nontarget signals in groundwater after source related prioritization. WATER RESEARCH 2021; 196:116994. [PMID: 33773453 DOI: 10.1016/j.watres.2021.116994] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 11/25/2020] [Revised: 02/26/2021] [Accepted: 02/28/2021] [Indexed: 05/12/2023]
Abstract
Groundwater is a major drinking water resource but its quality with regard to organic micropollutants (MPs) is insufficiently assessed. Therefore, we aimed to investigate Swiss groundwater more comprehensively using liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS). First, samples from 60 sites were classified as having high or low urban or agricultural influence based on 498 target compounds associated with either urban or agricultural sources. Second, all LC-HRMS signals were related to their potential origin (urban, urban and agricultural, agricultural, or not classifiable) based on their occurrence and intensity in the classified samples. A considerable fraction of estimated concentrations associated with urban and/or agricultural sources could not be explained by the 139 detected targets. The most intense nontarget signals were automatically annotated with structure proposals using MetFrag and SIRIUS4/CSI:FingerID with a list of >988,000 compounds. Additionally, suspect screening was performed for 1162 compounds with predicted high groundwater mobility from primarily urban sources. Finally, 12 nontargets and 11 suspects were identified unequivocally (Level 1), while 17 further compounds were tentatively identified (Level 2a/3). Amongst these were 13 pollutants thus far not reported in groundwater, such as the industrial chemicals 2,5-dichlorobenzenesulfonic acid (19 detections, up to 100 ng L⁻¹), phenylphosphonic acid (10 detections, up to 50 ng L⁻¹), triisopropanolamine borate (2 detections, up to 40 ng L⁻¹), O-des[2-aminoethyl]-O-carboxymethyl dehydroamlodipine, a transformation product (TP) of the blood pressure regulator amlodipine (17 detections), and the TP SYN542490 of the herbicide metolachlor (Level 3, 33 detections, estimated concentrations up to 100-500 ng L⁻¹). One monitoring site was far more contaminated than other sites based on estimated total concentrations of potential MPs, which was supported by the elucidation of site-specific nontarget signals such as the carcinogen chlorendic acid, and various naphthalenedisulfonic acids. Many compounds remained unknown, but overall, source related prioritisation proved an effective approach to support identification of compounds in groundwater.
Affiliation(s)
- Karin Kiefer
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland
| | - Letian Du
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland
| | - Heinz Singer
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
| | - Juliane Hollender
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland.
| |
|
19
|
Mairinger T, Loos M, Hollender J. Characterization of water-soluble synthetic polymeric substances in wastewater using LC-HRMS/MS. WATER RESEARCH 2021; 190:116745. [PMID: 33360422 DOI: 10.1016/j.watres.2020.116745] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/20/2020] [Revised: 11/19/2020] [Accepted: 12/11/2020] [Indexed: 06/12/2023]
Abstract
Synthetic water-soluble polymeric materials are widely employed in e.g. cleaning detergents, personal care products, paints or textiles. Accordingly, these compounds reach sewage treatment plants and may enter receiving waters and the aquatic environment. Characteristically, these molecules show a polydisperse molecular weight distribution, comprising multiple repeating units, i.e. a homologous series (HS). Their analysis in environmentally relevant samples has received some attention over the last two decades; however, the majority of previous studies focused on surfactants and a molecular weight range <1000 Da. To capture a wider range on the mass versus polarity plane and extend towards less polar contaminants, a workflow was established using three different ionization strategies, namely conventional electrospray ionization, atmospheric pressure photoionization and atmospheric pressure chemical ionization. The data evaluation consisted of suspect screening of ca. 1200 suspect entries and a non-target screening of HS with pre-defined accurate mass differences using ca. 400 molecular formulas of repeating units of HS as input and repeating retention time shifts as HS indicator. To study the fate of these water-soluble polymeric substances in the wastewater treatment process, the different stages, i.e. after primary and secondary clarifier, and after ozonation followed by sand filtration, were sampled at a Swiss wastewater treatment plant. Remaining with two different ionization interfaces, ESI and APPI, in both polarities, a non-targeted screening approach led to a total number of 146 HS (each with a minimum number of 4 members), with a molecular mass of up to 1200 detected in the final effluent. Of the 146 HS, ca. 15% could be associated with suspect hits and approximately 25% with transformation products of suspects. Tentative characterization or probable chemical structure could be assigned to almost half of the findings. In positive ionization mode, various sugar derivatives with differing side chains could be characterized; in negative mode, structures with sulfonic acids. The number of detected HS decreased significantly over the three treatment stages. For HS detectable also in the biological and oxidative treatment stages, a change in HS distribution towards a lower mass range was often observed.
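The non-target screening for homologous series described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function name, the 0.002 m/z tolerance, and the example repeating unit (C2H4O, monoisotopic mass 44.02621) are assumptions chosen for illustration, and the retention-time-shift criterion mentioned in the abstract is left out. Only the minimum of 4 members per series is taken from the abstract.

```python
from typing import List


def find_homologous_series(mzs: List[float], unit_mass: float,
                           tol: float = 0.002,
                           min_members: int = 4) -> List[List[float]]:
    """Group m/z values into chains whose neighbours differ by unit_mass (+/- tol).

    Hypothetical helper: tolerance and greedy chaining are illustrative choices.
    """
    values = sorted(mzs)
    used = set()          # indices already assigned to an accepted series
    series = []
    for i, start in enumerate(values):
        if i in used:
            continue
        chain = [i]
        current = start
        while True:
            target = current + unit_mass
            # Look for the next unused value that sits one repeating unit higher.
            nxt = next((j for j in range(chain[-1] + 1, len(values))
                        if j not in used and abs(values[j] - target) <= tol),
                       None)
            if nxt is None:
                break
            chain.append(nxt)
            current = values[nxt]
        # Keep the chain only if it has enough members to count as a series.
        if len(chain) >= min_members:
            used.update(chain)
            series.append([values[j] for j in chain])
    return series


if __name__ == "__main__":
    # Five members spaced by one C2H4O unit, plus two unrelated signals.
    signals = [200.1 + n * 44.02621 for n in range(5)] + [150.05, 310.2]
    for s in find_homologous_series(signals, 44.02621):
        print(len(s), "members:", [round(v, 4) for v in s])
```

A fuller version would loop over a list of repeating-unit masses and also require a consistent retention time shift between neighbouring members, as the abstract indicates.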
Affiliation(s)
- Teresa Mairinger
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Duebendorf, Switzerland.
| | | | - Juliane Hollender
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Duebendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Zurich, Switzerland.
| |
|
20
|
Wang S, Matt M, Murphy BL, Perkins M, Matthews DA, Moran SD, Zeng T. Organic Micropollutants in New York Lakes: A Statewide Citizen Science Occurrence Study. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:13759-13770. [PMID: 33064942 DOI: 10.1021/acs.est.0c04775] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 05/06/2023]
Abstract
The widespread occurrence of organic micropollutants (OMPs) is a challenge for aquatic ecosystem management, and closing the gaps in risk assessment of OMPs requires a data-driven approach. One promising tool for increasing the spatiotemporal coverage of OMP data sets is through the active involvement of citizen volunteers to expand the scale of OMP monitoring. Working collaboratively with volunteers from the Citizens Statewide Lake Assessment Program (CSLAP), we conducted the first statewide study on OMP occurrence in surface waters of New York lakes. Samples collected by CSLAP volunteers were analyzed for OMPs by a suspect screening method based on mixed-mode solid-phase extraction and liquid chromatography-high resolution mass spectrometry. Sixty-five OMPs were confirmed and quantified in samples from 111 lakes across New York. Hierarchical clustering of OMP occurrence data revealed the relevance of 11 most frequently detected OMPs for classifying the contamination status of lakes. Partial least squares regression and multiple linear regression analyses prioritized three water quality parameters linked to agricultural and developed land uses (i.e., total dissolved nitrogen, specific conductance, and a wastewater-derived fluorescent organic matter component) as the best combination of predictors that partly explained the interlake variability in OMP occurrence. Lastly, the exposure-activity ratio approach identified the potential for biological effects associated with detected OMPs that warrant further biomonitoring studies. Overall, this work demonstrated the feasibility of incorporating citizen science approaches into the regional impact assessment of OMPs.
Affiliation(s)
- Shiru Wang
- Department of Civil and Environmental Engineering, Syracuse University, 151 Link Hall, Syracuse, New York 13244, United States
| | - Monica Matt
- Upstate Freshwater Institute, 224 Midler Park Drive, Syracuse, New York 13206, United States
| | - Bethany L Murphy
- Department of Civil and Environmental Engineering, Syracuse University, 151 Link Hall, Syracuse, New York 13244, United States
| | - MaryGail Perkins
- Upstate Freshwater Institute, 224 Midler Park Drive, Syracuse, New York 13206, United States
| | - David A Matthews
- Upstate Freshwater Institute, 224 Midler Park Drive, Syracuse, New York 13206, United States
| | - Sharon D Moran
- Department of Environmental Studies, SUNY College of Environmental Science and Forestry, 1 Forestry Drive, Syracuse, New York 13210, United States
| | - Teng Zeng
- Department of Civil and Environmental Engineering, Syracuse University, 151 Link Hall, Syracuse, New York 13244, United States
| |
|
21
|
Ludwig M, Nothias LF, Dührkop K, Koester I, Fleischauer M, Hoffmann MA, Petras D, Vargas F, Morsy M, Aluwihare L, Dorrestein PC, Böcker S. Database-independent molecular formula annotation using Gibbs sampling through ZODIAC. NAT MACH INTELL 2020. [DOI: 10.1038/s42256-020-00234-6] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/19/2022]
|
22
|
Gatto L, Gibb S, Rainer J. MSnbase, Efficient and Elegant R-Based Processing and Visualization of Raw Mass Spectrometry Data. J Proteome Res 2020; 20:1063-1069. [PMID: 32902283 DOI: 10.1021/acs.jproteome.0c00313] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 12/18/2022]
Abstract
We present version 2 of the MSnbase R/Bioconductor package. MSnbase provides infrastructure for the manipulation, processing, and visualization of mass spectrometry data. We focus on the new on-disk infrastructure that allows the handling of large raw mass spectrometry experiments on commodity hardware, and illustrate how the package is used for elegant data processing, method development, and visualization.
Affiliation(s)
- Laurent Gatto
- Computational Biology Unit, de Duve Institute, Université catholique de Louvain, Brussels, 1200, Belgium
| | - Sebastian Gibb
- Department of Anaesthesiology and Intensive Care, University Medicine Greifswald, University of Greifswald, 17475 Greifswald, Germany
| | - Johannes Rainer
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy
| |
|
23
|
Aron AT, Gentry EC, McPhail KL, Nothias LF, Nothias-Esposito M, Bouslimani A, Petras D, Gauglitz JM, Sikora N, Vargas F, van der Hooft JJJ, Ernst M, Kang KB, Aceves CM, Caraballo-Rodríguez AM, Koester I, Weldon KC, Bertrand S, Roullier C, Sun K, Tehan RM, Boya P CA, Christian MH, Gutiérrez M, Ulloa AM, Tejeda Mora JA, Mojica-Flores R, Lakey-Beitia J, Vásquez-Chaves V, Zhang Y, Calderón AI, Tayler N, Keyzers RA, Tugizimana F, Ndlovu N, Aksenov AA, Jarmusch AK, Schmid R, Truman AW, Bandeira N, Wang M, Dorrestein PC. Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nat Protoc 2020; 15:1954-1991. [PMID: 32405051 DOI: 10.1038/s41596-020-0317-5] [Citation(s) in RCA: 303] [Impact Index Per Article: 75.8] [Received: 06/05/2019] [Accepted: 03/03/2020] [Indexed: 02/06/2023]
Abstract
Global Natural Product Social Molecular Networking (GNPS) is an interactive online small molecule-focused tandem mass spectrometry (MS2) data curation and analysis infrastructure. It is intended to provide as much chemical insight as possible into an untargeted MS2 dataset and to connect this chemical insight to the user's underlying biological questions. This can be performed within one liquid chromatography (LC)-MS2 experiment or at the repository scale. GNPS-MassIVE is a public data repository for untargeted MS2 data with sample information (metadata) and annotated MS2 spectra. These publicly accessible data can be annotated and updated with the GNPS infrastructure keeping a continuous record of all changes. This knowledge is disseminated across all public data; it is a living dataset. Molecular networking, one of the main analysis tools used within the GNPS platform, creates a structured data table that reflects the molecular diversity captured in tandem mass spectrometry experiments by computing the relationships of the MS2 spectra as spectral similarity. This protocol provides step-by-step instructions for creating reproducible, high-quality molecular networks. For training purposes, the reader is led through a 90- to 120-min procedure that starts by recalling an example public dataset and its sample information and proceeds to creating and interpreting a molecular network. Each data analysis job can be shared or cloned to disseminate the knowledge gained, thus propagating information that can lead to the discovery of molecules, metabolic pathways, and ecosystem/community interactions.
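As an illustration of what "computing the relationships of the MS2 spectra as spectral similarity" can look like, the sketch below scores two peak lists with a plain binned cosine similarity. This is an assumption made for illustration: the abstract does not specify the scoring function, and the function names and the 0.01 m/z bin width are hypothetical rather than part of any GNPS interface.

```python
import math
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_spectrum(peaks: List[Peak], bin_width: float = 0.01) -> Dict[int, float]:
    """Sum intensities of peaks that fall into the same m/z bin."""
    binned: Dict[int, float] = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(a: List[Peak], b: List[Peak],
                      bin_width: float = 0.01) -> float:
    """Cosine of the angle between two binned intensity vectors (0.0 to 1.0)."""
    va, vb = bin_spectrum(a, bin_width), bin_spectrum(b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    spec_a = [(100.0, 50.0), (150.0, 100.0), (200.0, 25.0)]
    spec_b = [(100.0, 40.0), (150.0, 90.0), (250.0, 30.0)]
    print(round(cosine_similarity(spec_a, spec_b), 3))
```

In a network, spectra would become nodes and pairs scoring above a chosen threshold would be connected by edges; the threshold itself is a user parameter and is not given in the abstract.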
Affiliation(s)
- Allegra T Aron
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Emily C Gentry
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kerry L McPhail
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, OR, USA
| | - Louis-Félix Nothias
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Mélissa Nothias-Esposito
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Amina Bouslimani
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Daniel Petras
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
| | - Julia M Gauglitz
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Nicole Sikora
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Fernando Vargas
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | | | - Madeleine Ernst
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kyo Bin Kang
- College of Pharmacy, Sookmyung Women's University, Seoul, Korea
| | - Christine M Aceves
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | | | - Irina Koester
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
| | - Kelly C Weldon
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Center of Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
| | - Samuel Bertrand
- Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, Nantes, France
- ThalassOMICS Metabolomics Facility, Plateforme Corsaire, Biogenouest, Nantes, France
| | - Catherine Roullier
- Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, Nantes, France
- ThalassOMICS Metabolomics Facility, Plateforme Corsaire, Biogenouest, Nantes, France
| | - Kunyang Sun
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Richard M Tehan
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, OR, USA
| | - Cristopher A Boya P
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
- Department of Biotechnology, Acharya Nagarjuna University, Guntur, Nagarjuna Nagar, India
| | - Martin H Christian
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
| | - Marcelino Gutiérrez
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
| | | | | | - Randy Mojica-Flores
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
- Departamento de Química, Universidad Autónoma de Chiriquí (UNACHI), David, Chiriquí, Panama
| | - Johant Lakey-Beitia
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
| | - Victor Vásquez-Chaves
- Centro de Investigaciones en Productos Naturales (CIPRONA), Universidad de Costa Rica, San José, Costa Rica
| | - Yilue Zhang
- Department of Drug Discovery and Development, Harrison School of Pharmacy, Auburn University, Auburn, AL, USA
| | - Angela I Calderón
- Department of Drug Discovery and Development, Harrison School of Pharmacy, Auburn University, Auburn, AL, USA
| | - Nicole Tayler
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama City, Panama
- Department of Biotechnology, Acharya Nagarjuna University, Guntur, Nagarjuna Nagar, India
| | - Robert A Keyzers
- School of Chemical & Physical Sciences, Victoria University of Wellington, Wellington, New Zealand
| | - Fidele Tugizimana
- Centre for Plant Metabolomics Research, Department of Biochemistry, University of Johannesburg, Auckland Park, South Africa
- International R&D Division, Omnia Group (Pty) Ltd., Johannesburg, South Africa
| | - Nombuso Ndlovu
- Centre for Plant Metabolomics Research, Department of Biochemistry, University of Johannesburg, Auckland Park, South Africa
| | - Alexander A Aksenov
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Alan K Jarmusch
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Robin Schmid
- Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
| | - Andrew W Truman
- Department of Molecular Microbiology, John Innes Centre, Norwich, UK
| | - Nuno Bandeira
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA.
| | - Mingxun Wang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.
- Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, CA, USA.
- Department of Pharmacology, University of California, San Diego, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
| |
|
24
|
Natumi R, Janssen EML. Cyanopeptide Co-Production Dynamics beyond Microcystins and Effects of Growth Stages and Nutrient Availability. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:6063-6072. [PMID: 32302105 DOI: 10.1021/acs.est.9b07334] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 06/11/2023]
Abstract
Intensified cyanobacterial bloom events are of increasing global concern because of adverse effects associated with the release of bioactive compounds, including toxic cyanopeptides. Cyanobacteria can produce a variety of cyanopeptides, yet our knowledge about their abundance and co-production remains limited. We applied a suspect-screening approach, including 700 structurally known cyanopeptides, and identified 11 cyanopeptides in Microcystis aeruginosa and 17 in Dolichospermum flos-aquae. Total cyanopeptide concentrations ranged from high nmol to μmol g⁻¹ (dry weight) with slightly higher cell quotas in the mid-exponential growth phase. Relative cyanopeptide profiles were unchanged throughout the growth cycle. We demonstrate that quantification based on microcystin-LR equivalents can introduce an error of up to 6-fold and recommend a class-equivalent approach instead. In M. aeruginosa, rarely studied cyclamides dominated (>80%) over cyanopeptolins and microcystins. While all nutrient reductions caused less growth, only lowering phosphorus and micronutrients reduced cyanopeptide production by M. aeruginosa. Similar trends were observed for D. flos-aquae and only lowering nitrogen decreased cyanopeptide production while the relative abundance of individual cyanopeptides remained stable. The synchronized production of other cyanopeptides along with microcystins emphasizes the need to make them available as reference standards to encourage more studies on their occurrence in blooms, persistence, and potential toxicity.
Affiliation(s)
- Regiane Natumi
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| | - Elisabeth M-L Janssen
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), 8600 Dübendorf, Switzerland
| |
|
25
|
Quantification for non-targeted LC/MS screening without standard substances. Sci Rep 2020; 10:5808. [PMID: 32242073 PMCID: PMC7118164 DOI: 10.1038/s41598-020-62573-z] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 06/20/2019] [Accepted: 03/16/2020] [Indexed: 01/27/2023] Open
Abstract
Non-targeted and suspect analyses with liquid chromatography/electrospray/high-resolution mass spectrometry (LC/ESI/HRMS) are gaining importance as they enable identification of hundreds or even thousands of compounds in a single sample. Here, we present an approach to address the challenge to quantify compounds identified from LC/HRMS data without authentic standards. The approach uses random forest regression to predict the response of the compounds in ESI/HRMS with a mean error of 2.2 and 2.0 times for ESI positive and negative mode, respectively. We observe that the predicted responses can be transferred between different instruments via a regression approach. Furthermore, we applied the predicted responses to estimate the concentration of the compounds without the standard substances. The approach was validated by quantifying pesticides and mycotoxins in six different cereal samples. For applicability, the accuracy of the concentration prediction needs to be compatible with the effect (e.g. toxicology) predictions. We achieved the average quantification error of 5.4 times, which is well compatible with the accuracy of the toxicology predictions.
|
26
|
Jewell KS, Kunkel U, Ehlig B, Thron F, Schlüsener M, Dietrich C, Wick A, Ternes TA. Comparing mass, retention time and tandem mass spectra as criteria for the automated screening of small molecules in aqueous environmental samples analyzed by liquid chromatography/quadrupole time-of-flight tandem mass spectrometry. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2020; 34:e8541. [PMID: 31364212 DOI: 10.1002/rcm.8541] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/09/2019] [Revised: 06/27/2019] [Accepted: 07/22/2019] [Indexed: 06/10/2023]
Abstract
RATIONALE: The adoption of database screening using high-resolution liquid chromatography/mass spectrometry data is promising as a river water monitoring and surveillance tool but depends on the ability to perform reliable data processing on a large number of samples in a unified workflow. Strategies to minimize errors have been proposed but automated procedures are rare. METHODS: High-resolution LC/ESI-QTOFMS/MS in data-dependent MS2 acquisition mode was performed for the analysis of surface water samples by direct injection. Data processing was achieved with software tools written in R. A database containing MS2 spectra of 693 compounds formed the basis of the workflow. Standard mixes and a time series of 361 samples of river water were analyzed and processed with the optimized workflow. RESULTS: Using the database and a mix of 70 standards for testing, it was found that an identification strategy including (i) mass, (ii) retention time, and (iii) MS2 spectral matching achieved a two- to three-fold improvement in the fraction of false positives compared with using only two criteria, while the number of false negatives remained low. The optimized workflow was applied to the sample series of river water. In total, 135 compounds were identified by a library match. CONCLUSIONS: The developed automated database screening approach minimizes the proportion of false positives, while still allowing for the screening of hundreds of water samples for hundreds of compounds in a single run.
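The three-criteria identification strategy in this abstract (mass, retention time, MS2 spectral match) can be sketched as a simple filter. The sketch is in Python rather than the R tools the authors used, and every name and threshold in it (5 ppm, 0.3 min, a similarity score of 0.7, the `LibraryEntry` and `Feature` classes) is a hypothetical placeholder, not a value from the paper. The MS2 score is assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    name: str
    mz: float        # reference precursor m/z
    rt_min: float    # reference retention time in minutes


@dataclass
class Feature:
    mz: float        # observed precursor m/z
    rt_min: float    # observed retention time in minutes


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def is_match(feature: Feature, entry: LibraryEntry, ms2_score: float,
             ppm_tol: float = 5.0, rt_tol_min: float = 0.3,
             min_ms2_score: float = 0.7) -> bool:
    """Accept a candidate only if all three criteria agree (assumed thresholds)."""
    mass_ok = abs(ppm_error(feature.mz, entry.mz)) <= ppm_tol
    rt_ok = abs(feature.rt_min - entry.rt_min) <= rt_tol_min
    ms2_ok = ms2_score >= min_ms2_score
    return mass_ok and rt_ok and ms2_ok


if __name__ == "__main__":
    entry = LibraryEntry(name="compound_A", mz=237.1022, rt_min=5.0)
    feature = Feature(mz=237.1025, rt_min=5.1)
    print(is_match(feature, entry, ms2_score=0.9))  # all three criteria pass
    print(is_match(feature, entry, ms2_score=0.4))  # rejected by the MS2 criterion
```

The second call shows the point the abstract makes: a candidate that agrees on mass and retention time alone is rejected once the MS2 criterion is applied, which is how a third criterion lowers the fraction of false positives.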
Affiliation(s)
- Kevin S Jewell
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Uwe Kunkel
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Björn Ehlig
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Franziska Thron
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Michael Schlüsener
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Christian Dietrich
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Arne Wick
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| | - Thomas A Ternes
- Federal Institute of Hydrology, Am Mainzer Tor 1, 56068, Koblenz, Germany
| |
|
27
|
Overview of Tandem Mass Spectral and Metabolite Databases for Metabolite Identification in Metabolomics. Methods Mol Biol 2020; 2104:139-148. [PMID: 31953816 DOI: 10.1007/978-1-0716-0239-3_8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022]
Abstract
Liquid chromatography-mass spectrometry (LC-MS) is one of the most popular technologies in metabolomics. The large-scale and unambiguous identification of metabolite structures remains a challenging task in LC-MS based metabolomics. Tandem mass spectral databases provide experimental and in silico MS/MS spectra to facilitate the identification of both known and unknown metabolites, which has become a gold standard method in metabolomics. In addition, metabolite knowledge databases offer valuable biological and pathway information of metabolites. In this chapter, we have briefly reviewed the most common and important tandem mass spectral and metabolite databases, and illustrated how they could be used for metabolite identification.
|
28
|
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022] Open
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
- Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
| | - Corey D Broeckling
- Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
| | - Nils Hoffmann
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
| | - Ewy Mathé
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
| | - Thomas Naake
- Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
| | - Luca Nicolotti
- The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
| | - Kristian Peters
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Johannes Rainer
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
| | - Reza M Salek
- The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
| | - Tobias Schulze
- Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland.
| | - Etienne A Thévenot
- CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
| | - Hendrik Treutler
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Ralf J M Weber
- Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
| | - Michael Witting
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
- Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
| | - Steffen Neumann
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
| |
|
29
|
Albergamo V, Schollée JE, Schymanski EL, Helmus R, Timmer H, Hollender J, de Voogt P. Nontarget Screening Reveals Time Trends of Polar Micropollutants in a Riverbank Filtration System. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2019; 53:7584-7594. [PMID: 31244084 PMCID: PMC6610556 DOI: 10.1021/acs.est.9b01750] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Indexed: 05/06/2023]
Abstract
The historic emissions of polar micropollutants in a natural drinking water source were investigated by nontarget screening with high-resolution mass spectrometry and open cheminformatics tools. The study area consisted of a riverbank filtration transect fed by the river Lek, a branch of the lower Rhine, and exhibiting up to 60-year travel time. More than 18,000 profiles were detected. Hierarchical clustering revealed that 43% of the 15 most populated clusters were characterized by intensity trends with maxima in the 1990s, reflecting intensified human activities, wastewater treatment plant upgrades and regulation in the Rhine riparian countries. Tentative structure annotation was performed using automated in silico fragmentation. Candidate structures retrieved from ChemSpider were scored based on the fit of the in silico fragments to the experimental tandem mass spectra, similarity to openly accessible accurate mass spectra, associated metadata, and presence in a suspect list. Sixty-seven unique structures (72 over both ionization modes) were tentatively identified, 25 of which were confirmed and included contaminants so far unknown to occur in bank filtrate or in natural waters at all, such as tetramethylsulfamide. This study demonstrates that many classes of hydrophilic organics enter riverbank filtration systems, persisting and migrating for decades if biogeochemical conditions are stable.
Affiliation(s)
- Vittorio Albergamo
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
| | - Jennifer E. Schollée
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
| | - Emma L. Schymanski
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
| | - Harrie Timmer
- Oasen, Nieuwe Gouwe O.Z 3, 2801 SB Gouda, The Netherlands
| | - Juliane Hollender
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, Universitätstrasse 16, 8092 Zürich, Switzerland
| | - Pim de Voogt
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- KWR Watercycle Research Institute, Groningenhaven 7, 3430 BB, Nieuwegein, The Netherlands
| |
|
30
|
Ruttkies C, Schymanski EL, Strehmel N, Hollender J, Neumann S, Williams AJ, Krauss M. Supporting non-target identification by adding hydrogen deuterium exchange MS/MS capabilities to MetFrag. Anal Bioanal Chem 2019; 411:4683-4700. [PMID: 31209548 PMCID: PMC6611743 DOI: 10.1007/s00216-019-01885-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/25/2019] [Revised: 04/08/2019] [Accepted: 04/30/2019] [Indexed: 01/02/2023]
Abstract
Liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) is increasingly popular for the non-targeted exploration of complex samples, where tandem mass spectrometry (MS/MS) is used to characterize the structure of unknown compounds. However, mass spectra do not always contain sufficient information to unequivocally identify the correct structure. This study investigated how much additional information can be gained using hydrogen deuterium exchange (HDX) experiments. The exchange of “easily exchangeable” hydrogen atoms (connected to heteroatoms), with predominantly [M+D]+ ions in positive mode and [M-D]− in negative mode was observed. To enable high-throughput processing, new scoring terms were incorporated into the in silico fragmenter MetFrag. These were initially developed on small datasets and then tested on 762 compounds of environmental interest. Pairs of spectra (normal and deuterated) were found for 593 of these substances (506 positive mode, 155 negative mode spectra). The new scoring terms resulted in 29 additional correct identifications (78 vs 49) for positive mode and an increase in top 10 rankings from 80 to 106 in negative mode. Compounds with dual functionality (polar head group, long apolar tail) exhibited dramatic retention time (RT) shifts of up to several minutes, compared with an average 0.04 min RT shift. For a smaller dataset of 80 metabolites, top 10 rankings improved from 13 to 24 (positive mode, 57 spectra) and from 14 to 31 (negative mode, 63 spectra) when including HDX information. The results of standard measurements were confirmed using targets and tentatively identified surfactant species in an environmental sample collected from the river Danube near Novi Sad (Serbia). 
The changes to MetFrag have been integrated into the command line version available at http://c-ruttkies.github.io/MetFrag and all resulting spectra and compounds are available in online resources and in the Electronic Supplementary Material (ESM).
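The mass shift that an HDX experiment produces can be estimated with simple arithmetic. The following is a minimal sketch, not the MetFrag scoring terms themselves: it assumes complete exchange of every labile hydrogen and a singly charged ion, and the function name `hdx_mz` and the constants (standard monoisotopic masses) are illustrative choices that do not come from the abstract.

```python
# Standard monoisotopic masses in Da (assumed reference values, not from the abstract).
MASS_H = 1.00782503
MASS_D = 2.01410178
MASS_ELECTRON = 0.00054858


def hdx_mz(neutral_mass, n_exchangeable, mode="positive"):
    """Expected m/z of a fully exchanged, singly charged ion.

    Hypothetical helper: assumes all `n_exchangeable` hydrogens are replaced
    by deuterium and that the ion is [M+D]+ (positive) or [M-D]- (negative).
    """
    # Each H -> D exchange adds the D/H mass difference to the neutral molecule.
    shift = n_exchangeable * (MASS_D - MASS_H)
    # A deuteron is a deuterium atom minus one electron.
    deuteron = MASS_D - MASS_ELECTRON
    if mode == "positive":
        return neutral_mass + shift + deuteron
    if mode == "negative":
        if n_exchangeable < 1:
            raise ValueError("[M-D]- needs at least one exchangeable hydrogen")
        return neutral_mass + shift - deuteron
    raise ValueError("mode must be 'positive' or 'negative'")
```

Comparing the value for the deuterated measurement with the ordinary [M+H]+ or [M-H]− m/z gives the number of exchangeable hydrogens that a candidate structure would have to explain.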
Affiliation(s)
- Christoph Ruttkies
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 avenue du Swing, 4367, Belvaux, Luxembourg; Eawag: Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600, Dübendorf, Switzerland.
| | - Nadine Strehmel
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
| | - Juliane Hollender
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600, Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092, Zürich, Switzerland
| | - Steffen Neumann
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany; German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Deutscher Platz 5e, 04103, Leipzig, Germany
| | - Antony J Williams
- National Centre for Computational Toxicity (NCCT), United States Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
| | - Martin Krauss
- Helmholtz Centre for Environmental Research - UFZ, Permoserstr. 15, 04318, Leipzig, Germany.
| |
|
31
|
Fox Ramos AE, Evanno L, Poupon E, Champy P, Beniddir MA. Natural products targeting strategies involving molecular networking: different manners, one goal. Nat Prod Rep 2019; 36:960-980. [PMID: 31140509 DOI: 10.1039/c9np00006b] [Citation(s) in RCA: 133] [Impact Index Per Article: 26.6] [Indexed: 12/12/2022]
Abstract
Covering: up to 2019. Landmark advances in bioinformatics tools have recently enhanced the field of natural products research, putting today's natural product chemists in the enviable position of being able to perform the efficient targeting/discovery of previously undescribed molecules by expediting the prioritization of the isolation workflow. Among these advances, MS/MS molecular networking has appeared as a promising approach to dereplicate complex natural product mixtures, leading to a real revolution in the "art of natural product isolation" by accelerating the pace of research of this field. This review illustrates through selected cornerstone studies the new thinking in natural product isolation by drawing a parallel between the different underlying philosophies behind the use of molecular networking in targeting natural products.
Affiliation(s)
- Alexander E Fox Ramos
- Équipe "Pharmacognosie-Chimie des Substances Naturelles", BioCIS, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 5 rue J.-B. Clément, 92290, Châtenay-Malabry, France.
| | - Laurent Evanno
- Équipe "Pharmacognosie-Chimie des Substances Naturelles", BioCIS, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 5 rue J.-B. Clément, 92290, Châtenay-Malabry, France.
| | - Erwan Poupon
- Équipe "Pharmacognosie-Chimie des Substances Naturelles", BioCIS, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 5 rue J.-B. Clément, 92290, Châtenay-Malabry, France.
| | - Pierre Champy
- Équipe "Pharmacognosie-Chimie des Substances Naturelles", BioCIS, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 5 rue J.-B. Clément, 92290, Châtenay-Malabry, France.
| | - Mehdi A Beniddir
- Équipe "Pharmacognosie-Chimie des Substances Naturelles", BioCIS, Univ. Paris-Sud, CNRS, Université Paris-Saclay, 5 rue J.-B. Clément, 92290, Châtenay-Malabry, France.
| |
|
32
|
Carpenter CMG, Wong LYJ, Johnson CA, Helbling DE. Fall Creek Monitoring Station: Highly Resolved Temporal Sampling to Prioritize the Identification of Nontarget Micropollutants in a Small Stream. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2019; 53:77-87. [PMID: 30472836 DOI: 10.1021/acs.est.8b05320] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 05/06/2023]
Abstract
The goal of this research was to comprehensively characterize the occurrence and temporal dynamics of target and nontarget micropollutants in a small stream. We established the Fall Creek Monitoring Station in March 2017 and collected daily composite samples for one year. We measured water samples by means of high-resolution mass spectrometry and developed and optimized a postacquisition data processing workflow to screen for 162 target micropollutants and group all mass spectral (MS) features into temporal profiles. We used hierarchical clustering analysis to prioritize nontarget MS features based on their similarity to target micropollutant profiles and developed a high-throughput pipeline to elucidate the structures of prioritized nontarget MS features. Our analyses resulted in the identification of 31 target micropollutants and 59 nontarget micropollutants with varying levels of confidence. Temporal profiles of the 90 identified micropollutants revealed unexpected concentration-discharge relationships that depended on the source of the micropollutant and hydrological features of the watershed. Several of the nontarget micropollutants have not been previously reported, including pharmaceutical metabolites, rubber vulcanization accelerators, plasticizers, and flame retardants. Our data provide novel insights on the temporal dynamics of micropollutant occurrence in small streams. Further, our approach to nontarget analysis is general and not restricted to highly resolved temporal data acquisitions or samples collected from surface water systems.
Affiliation(s)
- Corey M G Carpenter
- School of Civil and Environmental Engineering, Cornell University, Ithaca, New York 14853, United States
| | - Lok Yee J Wong
- Department of Biological and Environmental Engineering, Cornell University, Ithaca, New York 14853, United States
| | - Catherine A Johnson
- School of Civil and Environmental Engineering, Cornell University, Ithaca, New York 14853, United States
| | - Damian E Helbling
- School of Civil and Environmental Engineering, Cornell University, Ithaca, New York 14853, United States
| |
|
33
|
Oberacher H, Reinstadler V, Kreidl M, Stravs MA, Hollender J, Schymanski EL. Annotating Nontargeted LC-HRMS/MS Data with Two Complementary Tandem Mass Spectral Libraries. Metabolites 2018; 9:metabo9010003. [PMID: 30583579 PMCID: PMC6359582 DOI: 10.3390/metabo9010003] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/04/2018] [Revised: 12/17/2018] [Accepted: 12/21/2018] [Indexed: 12/15/2022] Open
Abstract
Tandem mass spectral databases are indispensable for fast and reliable compound identification in nontargeted analysis with liquid chromatography–high resolution tandem mass spectrometry (LC-HRMS/MS), which is applied to a wide range of scientific fields. While many articles now review and compare spectral libraries, in this manuscript we investigate two high-quality and specialized collections from our respective institutes, recorded on different instruments (quadrupole time-of-flight or QqTOF vs. Orbitrap). The optimal range of collision energies for spectral comparison was evaluated using 233 overlapping compounds between the two libraries, revealing that spectra in the range of CE 20–50 eV on the QqTOF and 30–60 nominal collision energy units on the Orbitrap provided optimal matching results for these libraries. Applications to complex samples from the respective institutes revealed that the libraries, combined with a simple data mining approach to retrieve all spectra with precursor and fragment information, could confirm many validated target identifications and yield several new Level 2a (spectral match) identifications. While the results presented are not surprising in many ways, this article adds new results to the debate on the comparability of Orbitrap and QqTOF data and the application of spectral libraries to yield rapid and high-confidence tentative identifications in complex human and environmental samples.
Affiliation(s)
- Herbert Oberacher
- Institute of Legal Medicine and Core Facility Metabolomics, Medical University of Innsbruck, 6020 Innsbruck, Austria.
| | - Vera Reinstadler
- Institute of Legal Medicine and Core Facility Metabolomics, Medical University of Innsbruck, 6020 Innsbruck, Austria.
| | - Marco Kreidl
- Institute of Legal Medicine and Core Facility Metabolomics, Medical University of Innsbruck, 6020 Innsbruck, Austria.
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
| | - Juliane Hollender
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, 8092 Zurich, Switzerland.
| | - Emma L Schymanski
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4367 Belvaux, Luxembourg.
| |
|
34
|
Assessing the Antioxidant Properties of Larrea tridentata Extract as a Potential Molecular Therapy against Oxidative Stress. Molecules 2018; 23:molecules23071826. [PMID: 30041415 PMCID: PMC6099408 DOI: 10.3390/molecules23071826] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/19/2018] [Revised: 07/13/2018] [Accepted: 07/18/2018] [Indexed: 02/07/2023] Open
Abstract
Oxidative stress has been linked to neurodegenerative diseases such as Huntington's, Parkinson's and Alzheimer's diseases and amyotrophic lateral sclerosis. Larrea tridentata (LT), also known as creosote bush, is an evergreen shrub found in the Chihuahuan desert which has been used medicinally by Native American tribes in southwestern North America and the Amerindians of South America. However, studies of the antioxidant capacity of the crude extract of LT towards the discovery of novel molecular therapies bearing antioxidant and drug-like properties are lacking. In this study, we assessed the antioxidant properties of Larrea tridentata, collected specifically from the Chihuahuan desert in the region of El Paso del Norte, TX, USA. LT phytochemicals were obtained from three different extracts (ethanol; ethanol:water (60:40); and water). The extracts were then evaluated in eight different assays (DPPH, ABTS, superoxide, FRAP activity, nitric oxide, phenolic content, UV-visible absorption and cytotoxicity in non-cancerous HS27 cells). None of the three extracts affected the HS27 cells at concentrations up to 120 µg/mL. Among the three extracts, the ethanol:water (60:40) LT extract had the most efficient antioxidant properties (IC50 (DPPH at 30 min) = 111.7 ± 3.8 μg/mL; IC50 (ABTS) = 8.49 ± 2.28 μg/mL; IC50 (superoxide) = 0.43 ± 0.17 μg/mL; IC50 (NO) = 230.4 ± 130.4 μg/mL; and the highest phenolic content was estimated at 212.46 ± 7.05 mg GAE/L). In addition, there was a strong correlation between phenolic content and the free-radical scavenging activity assays. An HPLC-MS study identified nine compounds in the LT ethanol:water extract, including Justicidin B and Beta peltain, which have been previously reported as secondary metabolites of Larrea tridentata.
|
35
|
Albergamo V, Helmus R, de Voogt P. Direct injection analysis of polar micropollutants in natural drinking water sources with biphenyl liquid chromatography coupled to high-resolution time-of-flight mass spectrometry. J Chromatogr A 2018; 1569:53-61. [PMID: 30017221 DOI: 10.1016/j.chroma.2018.07.036] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/04/2018] [Revised: 06/30/2018] [Accepted: 07/06/2018] [Indexed: 12/31/2022]
Abstract
A method for the trace analysis of polar micropollutants (MPs) by direct injection of surface water and groundwater was validated with ultrahigh-performance liquid chromatography using a core-shell biphenyl stationary phase coupled to time-of-flight high-resolution mass spectrometry. The validation was successfully conducted with 33 polar MPs representative for several classes of emerging contaminants. Identification and quantification were achieved by semi-automated processing of full-scan and data-independent acquisition MS/MS spectra. In most cases good linearity (R2 ≥ 0.99), recovery (75% to 125%) and intra-day (RSD < 20%) and inter-day precision (RSD < 10%) values were observed. Detection limits of 9 to 83 ng/L and 9 to 93 ng/L could be achieved in riverbank filtrate and surface water, respectively. A solid-phase extraction was additionally validated to screen samples from full-scale reverse osmosis drinking water treatment at sub-ng/L levels and overall satisfactory analytical performance parameters were observed for RBF and reverse osmosis permeate. Applicability of the direct injection method is shown for surface water and riverbank filtrate samples from an actual drinking water source. Several targets linkable to incomplete removal in wastewater treatment and farming activities were detected and quantified in concentrations between 28 ng/L for saccharine in riverbank filtrate and up to 1 μg/L for acesulfame in surface water. The solid phase extraction method applied to samples from full-scale reverse osmosis drinking water treatment plant led to quantification of 8 targets between 6 and 57 ng/L in the feed water, whereas only diglyme was detected and quantified in reverse osmosis permeate. 
Our study shows that combining the chromatographic resolution of biphenyl stationary phase with the performance of time-of-flight high-resolution tandem mass spectrometry resulted in a fast, accurate and robust method to monitor polar MPs in source waters by direct injection with high efficiency.
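The validation figures quoted in this abstract (recovery and relative standard deviation) follow from standard formulas. The sketch below shows one way to compute them; the function names and the example numbers are hypothetical, and only the acceptance windows (recovery 75% to 125%, intra-day RSD below 20%) are taken from the abstract.

```python
import statistics


def recovery_percent(measured, spiked):
    """Recovery as a percentage of the spiked (nominal) concentration."""
    return 100.0 * measured / spiked


def rsd_percent(replicates):
    """Relative standard deviation (sample standard deviation / mean), in percent."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def passes_intraday_criteria(replicates, spiked):
    """Check replicates against the recovery and intra-day RSD windows in the abstract."""
    recovery = recovery_percent(statistics.mean(replicates), spiked)
    return 75.0 <= recovery <= 125.0 and rsd_percent(replicates) < 20.0


# Hypothetical replicate measurements (ng/L) of a sample spiked at 100 ng/L.
example = [92.0, 98.0, 104.0]
print(recovery_percent(statistics.mean(example), 100.0), rsd_percent(example))
```

The same `rsd_percent` helper could be applied to inter-day replicates with the stricter 10% limit reported in the abstract.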
Affiliation(s)
- Vittorio Albergamo
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, The Netherlands.
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, The Netherlands
| | - Pim de Voogt
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, The Netherlands; KWR Watercycle Research Institute, Nieuwegein, The Netherlands
| |
|
36
|
Kind T, Tsugawa H, Cajka T, Ma Y, Lai Z, Mehta SS, Wohlgemuth G, Barupal DK, Showalter MR, Arita M, Fiehn O. Identification of small molecules using accurate mass MS/MS search. MASS SPECTROMETRY REVIEWS 2018; 37:513-532. [PMID: 28436590 PMCID: PMC8106966 DOI: 10.1002/mas.21535] [Citation(s) in RCA: 259] [Impact Index Per Article: 43.2] [Received: 10/27/2016] [Revised: 03/17/2017] [Accepted: 03/18/2017] [Indexed: 05/03/2023]
Abstract
Tandem mass spectral library search (MS/MS) is the fastest way to correctly annotate MS/MS spectra from screening small molecules in fields such as environmental analysis, drug screening, lipid analysis, and metabolomics. The confidence in MS/MS-based annotation of chemical structures is impacted by instrumental settings and requirements, data acquisition modes including data-dependent and data-independent methods, library scoring algorithms, as well as post-curation steps. We critically discuss parameters that influence search results, such as mass accuracy, precursor ion isolation width, intensity thresholds, centroiding algorithms, and acquisition speed. A range of publicly and commercially available MS/MS databases such as NIST, MassBank, MoNA, LipidBlast, Wiley MSforID, and METLIN are surveyed. In addition, software tools including NIST MS Search, MS-DIAL, Mass Frontier, SmileMS, Mass++, and XCMS2 to perform fast MS/MS search are discussed. MS/MS scoring algorithms and challenges during compound annotation are reviewed. Advanced methods such as the in silico generation of tandem mass spectra using quantum chemistry and machine learning methods are covered. Community efforts for curation and sharing of tandem mass spectra that will allow for faster distribution of scientific discoveries are discussed.
Affiliation(s)
- Tobias Kind
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Tomas Cajka
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Yan Ma
- National Institute of Biological Sciences, Beijing, People’s Republic of China
| | - Zijuan Lai
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Masanori Arita
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Oliver Fiehn
- Genome Center, Metabolomics, UC Davis, Davis, California
- Faculty of Sciences, Department of Biochemistry, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
37
|
Blaženović I, Kind T, Ji J, Fiehn O. Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics. Metabolites 2018; 8:E31. [PMID: 29748461 PMCID: PMC6027441 DOI: 10.3390/metabo8020031] [Citation(s) in RCA: 402] [Impact Index Per Article: 67.0] [Received: 04/08/2018] [Revised: 04/26/2018] [Accepted: 05/06/2018] [Indexed: 01/17/2023] Open
Abstract
The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. We here critically discuss structured elucidation approaches and software that are designed to help during the annotation of unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases such as the coalition of MassBank repositories and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (critical assessment of small molecule identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included.
Affiliation(s)
- Ivana Blaženović
- NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
| | - Tobias Kind
- NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
| | - Jian Ji
- State Key Laboratory of Food Science and Technology, School of Food Science of Jiangnan University, School of Food Science Synergetic Innovation Center of Food Safety and Nutrition, Wuxi 214122, China.
| | - Oliver Fiehn
- NIH West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA 95616, USA.
- Department of Biochemistry, Faculty of Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia.
| |
|
38
|
Bruderer T, Varesio E, Hidasi AO, Duchoslav E, Burton L, Bonner R, Hopfgartner G. Metabolomic spectral libraries for data-independent SWATH liquid chromatography mass spectrometry acquisition. Anal Bioanal Chem 2018; 410:1873-1884. [PMID: 29411086 DOI: 10.1007/s00216-018-0860-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 11/03/2017] [Revised: 12/12/2017] [Accepted: 01/08/2018] [Indexed: 11/27/2022]
Abstract
High-quality mass spectral libraries have become crucial in mass spectrometry-based metabolomics. Here, we investigate a workflow to generate accurate mass discrete and composite spectral libraries for metabolite identification and for SWATH mass spectrometry data processing. Discrete collision energy (5-100 eV) accurate mass spectra were collected for 532 metabolites from the human metabolome database (HMDB) by flow injection analysis and compiled into composite spectra over a large collision energy range (e.g., 10-70 eV). Full scan response factors were also calculated. Software tools based on accurate mass and predictive fragmentation were specially developed and found to be essential for construction and quality control of the spectral library. First, elemental compositions constrained by the elemental composition of the precursor ion were calculated for all fragments. Secondly, all possible fragments were generated from the compound structure and were filtered based on their elemental compositions. From the discrete spectra, it was possible to analyze the specific fragment form at each collision energy and it was found that a relatively large collision energy range (10-70 eV) gives informative MS/MS spectra for library searches. From the composite spectra, it was possible to characterize specific neutral losses as radical losses using in silico fragmentation. Radical losses (generating radical cations) were found to be more prominent than expected. From 532 metabolites, 489 provided a signal in positive mode [M+H]+ and 483 in negative mode [M-H]-. MS/MS spectra were obtained for 399 compounds in positive mode and for 462 in negative mode; 329 metabolites generated suitable spectra in both modes. 
Using the spectral library, LC retention times, and response factors to analyze data-independent LC-SWATH-MS data allowed the identification of 39 (positive mode) and 72 (negative mode) metabolites in a plasma pool sample (92 metabolites in total), 81 of which were previously reported in HMDB to be found in plasma. Graphical abstract: Library generation workflow for LC-SWATH MS, using collision energy spread, accurate mass, and fragment annotation.
Affiliation(s)
- Tobias Bruderer
- Life Sciences Mass Spectrometry, Department of Inorganic and Analytical Chemistry, University of Geneva, 24, Quai Ernest Ansermet, 1211, Geneva 4, Switzerland
| | - Emmanuel Varesio
- School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, Rue Michel-Servet 1, 1211, Geneva 4, Switzerland
| | - Anita O Hidasi
- Life Sciences Mass Spectrometry, Department of Inorganic and Analytical Chemistry, University of Geneva, 24, Quai Ernest Ansermet, 1211, Geneva 4, Switzerland
| | - Eva Duchoslav
- Sciex, 71 Four Valley Drive, Concord, ON, L4K 4V8, Canada
| | - Lyle Burton
- Sciex, 71 Four Valley Drive, Concord, ON, L4K 4V8, Canada
| | - Ron Bonner
- Ron Bonner Consulting, Newmarket, ON, L3Y 3C7, Canada
| | - Gérard Hopfgartner
- Life Sciences Mass Spectrometry, Department of Inorganic and Analytical Chemistry, University of Geneva, 24, Quai Ernest Ansermet, 1211, Geneva 4, Switzerland.
| |
|
39
|
Hu M, Müller E, Schymanski EL, Ruttkies C, Schulze T, Brack W, Krauss M. Performance of combined fragmentation and retention prediction for the identification of organic micropollutants by LC-HRMS. Anal Bioanal Chem 2018; 410:1931-1941. [DOI: 10.1007/s00216-018-0857-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/29/2017] [Revised: 12/17/2017] [Accepted: 01/05/2018] [Indexed: 12/28/2022]
|
40
|
Shahaf N, Aharoni A, Rogachev I. A Complete Pipeline for Generating a High-Resolution LC-MS-Based Reference Mass Spectra Library. Methods Mol Biol 2018; 1778:193-206. [PMID: 29761440 DOI: 10.1007/978-1-4939-7819-9_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Databases containing mass spectrometry (MS) spectral data (i.e., MS reference libraries) are currently the most reliable and widely accepted approach to annotate unknown features in MS-based metabolomics. While for gas chromatography (GC)-MS data, a strategy for collecting, storing, and comparing to raw data has been established, this is not the case for liquid chromatography (LC)-MS data. Here, we present our approach for high-throughput data collection and automated MS reference library generation, as applied recently in the WEIZMASS library of plant metabolites. Methodologies to experimentally generate pools of chemical standards and computationally convert them into a unique source of reference data are detailed.
Affiliation(s)
- Nir Shahaf
- Department of Plant and Environmental Sciences, Faculty of Biochemistry, Weizmann Institute of Science, Rehovot, Israel
| | - Asaph Aharoni
- Department of Plant and Environmental Sciences, Faculty of Biochemistry, Weizmann Institute of Science, Rehovot, Israel
| | - Ilana Rogachev
- Department of Plant and Environmental Sciences, Faculty of Biochemistry, Weizmann Institute of Science, Rehovot, Israel.
| |
|
41
|
Schollée JE, Schymanski EL, Stravs MA, Gulde R, Thomaidis NS, Hollender J. Similarity of High-Resolution Tandem Mass Spectrometry Spectra of Structurally Related Micropollutants and Transformation Products. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2017; 28:2692-2704. [PMID: 28952028 DOI: 10.1007/s13361-017-1797-6] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 06/02/2017] [Revised: 08/23/2017] [Accepted: 08/23/2017] [Indexed: 06/07/2023]
Abstract
High-resolution tandem mass spectrometry (HRMS2) with electrospray ionization is frequently applied to study polar organic molecules such as micropollutants. Fragmentation provides structural information to confirm structures of known compounds or propose structures of unknown compounds. Similarity of HRMS2 spectra between structurally related compounds has been suggested to facilitate identification of unknown compounds. To test this hypothesis, the similarity of reference standard HRMS2 spectra was calculated for 243 pairs of micropollutants and their structurally related transformation products (TPs); for comparison, spectral similarity was also calculated for 219 pairs of unrelated compounds. Spectra were measured on Orbitrap and QTOF mass spectrometers and similarity was calculated with the dot product. The influence of different factors on spectral similarity [e.g., normalized collision energy (NCE), merging fragments from all NCEs, and shifting fragments by the mass difference of the pair] was considered. Spectral similarity increased at higher NCEs and highest similarity scores for related pairs were obtained with merged spectra including measured fragments and shifted fragments. Removal of the monoisotopic peak was critical to reduce false positives. Using a spectral similarity score threshold of 0.52, 40% of related pairs and 0% of unrelated pairs were above this value. Structural similarity was estimated with the Tanimoto coefficient and pairs with higher structural similarity generally had higher spectral similarity. Pairs where one or both compounds contained heteroatoms such as sulfur often resulted in dissimilar spectra. This work demonstrates that HRMS2 spectral similarity may indicate structural similarity and that spectral similarity can be used in the future to screen complex samples for related compounds such as micropollutants and TPs, assisting in the prioritization of non-target compounds.
Affiliation(s)
- Jennifer E Schollée
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland.
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092, Zürich, Switzerland.
| | - Emma L Schymanski
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092, Zürich, Switzerland
| | - Rebekka Gulde
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland
| | - Nikolaos S Thomaidis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, 157 71, Athens, Greece
| | - Juliane Hollender
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland
- Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092, Zürich, Switzerland
| |
|
42
|
van der Hooft JJJ, Wandy J, Young F, Padmanabhan S, Gerasimidis K, Burgess KEV, Barrett MP, Rogers S. Unsupervised Discovery and Comparison of Structural Families Across Multiple Samples in Untargeted Metabolomics. Anal Chem 2017. [PMID: 28621528 PMCID: PMC5524435 DOI: 10.1021/acs.analchem.7b01391] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Indexed: 01/03/2023]
Abstract
In untargeted metabolomics approaches, the inability to structurally annotate relevant features and map them to biochemical pathways is hampering the full exploitation of many metabolomics experiments. Furthermore, variable metabolic content across samples result in sparse feature matrices that are statistically hard to handle. Here, we introduce MS2LDA+ that tackles both above-mentioned problems. Previously, we presented MS2LDA, which extracts biochemically relevant molecular substructures (“Mass2Motifs”) from a collection of fragmentation spectra as sets of co-occurring molecular fragments and neutral losses, thereby recognizing building blocks of metabolomics. Here, we extend MS2LDA to handle multiple metabolomics experiments in one analysis, resulting in MS2LDA+. By linking Mass2Motifs across samples, we expose the variability in prevalence of structurally related metabolite families. We validate the differential prevalence of substructures between two distinct samples groups and apply it to fecal samples. Subsequently, within one sample group of urines, we rank the Mass2Motifs based on their variance to assess whether xenobiotic-derived substructures are among the most-variant Mass2Motifs. Indeed, we could ascribe 22 out of the 30 most-variant Mass2Motifs to xenobiotic-derived substructures including paracetamol/acetaminophen mercapturate and dimethylpyrogallol. In total, we structurally characterized 101 Mass2Motifs with biochemically or chemically relevant substructures. Finally, we combined the discovered metabolite families with full scan feature intensity information to obtain insight into core metabolites present in most samples and rare metabolites present in small subsets now linked through their common substructures. We conclude that by biochemical grouping of metabolites across samples MS2LDA+ aids in structural annotation of metabolites and guides prioritization of analysis by using Mass2Motif prevalence.
Affiliation(s)
- Justin J J van der Hooft
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
- Institute of Cardiovascular and Medical Sciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, United Kingdom
| | - Joe Wandy
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
| | - Francesca Young
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
| | - Sandosh Padmanabhan
- Institute of Cardiovascular and Medical Sciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, United Kingdom
| | - Konstantinos Gerasimidis
- Human Nutrition, School of Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, New Lister Building, Glasgow Royal Infirmary, Glasgow G31 2ER, United Kingdom
| | - Karl E V Burgess
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
| | - Michael P Barrett
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
- Wellcome Centre for Molecular Parasitology, Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow G12 8TA, United Kingdom
| | - Simon Rogers
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1HQ, United Kingdom
- School of Computing Science, University of Glasgow, Glasgow G12 8RZ, United Kingdom
| |
|
43
|
Perez de Souza L, Naake T, Tohge T, Fernie AR. From chromatogram to analyte to metabolite. How to pick horses for courses from the massive web resources for mass spectral plant metabolomics. Gigascience 2017; 6:1-20. [PMID: 28520864 PMCID: PMC5499862 DOI: 10.1093/gigascience/gix037] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 02/24/2017] [Revised: 05/08/2017] [Accepted: 05/12/2017] [Indexed: 01/19/2023] Open
Abstract
The grand challenge currently facing metabolomics is the expansion of the coverage of the metabolome from a minor percentage of the metabolic complement of the cell toward the level of coverage afforded by other post-genomic technologies such as transcriptomics and proteomics. In plants, this problem is exacerbated by the sheer diversity of chemicals that constitute the metabolome, with the number of metabolites in the plant kingdom generally considered to be in excess of 200 000. In this review, we focus on web resources that can be exploited in order to improve analyte and ultimately metabolite identification and quantification. There is a wide range of available software that not only aids in this but also in the related area of peak alignment; however, for the uninitiated, choosing which program to use is a daunting task. For this reason, we provide an overview of the pros and cons of the software as well as comments regarding the level of programing skills required to effectively exploit their basic functions. In addition, the torrent of available genome and transcriptome sequences that followed the advent of next-generation sequencing has opened up further valuable resources for metabolite identification. All things considered, we posit that only via a continued communal sharing of information such as that deposited in the databases described within the article are we likely to be able to make significant headway toward improving our coverage of the plant metabolome.
Affiliation(s)
- Leonardo Perez de Souza
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Thomas Naake
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Takayuki Tohge
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| |
|
44
|
Stravs MA, Pomati F, Hollender J. Exploring micropollutant biotransformation in three freshwater phytoplankton species. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2017; 19:822-832. [PMID: 28485428 DOI: 10.1039/c7em00100b] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 06/07/2023]
Abstract
Phytoplankton constitute an important component of surface water ecosystems; however little is known about their contribution to biotransformation of organic micropollutants. To elucidate biotransformation processes, batch experiments with two cyanobacterial species (Microcystis aeruginosa and Synechococcus sp.) and one green algal species (Chlamydomonas reinhardtii) were conducted. Twenty-four micropollutants were studied, including 15 fungicides and 9 pharmaceuticals. Online solid phase extraction (SPE) coupled with liquid chromatography (LC)-high resolution tandem mass spectrometry (HRMS/MS) was used together with suspect and nontarget screening to identify transformation products (TPs). 14 TPs were identified for 9 micropollutants, formed by cytochrome P450-mediated oxidation, conjugation and methylation reactions. The observed transformation pathways included reactions likely mediated by promiscuous enzymes, such as glutamate conjugation to mefenamic acid and pterin conjugation of sulfamethoxazole. For 15 compounds, including all azole fungicides tested, no TPs were identified. Environmentally relevant concentrations of chemical stressors had no influence on the transformation types and rates.
Affiliation(s)
- Michael A Stravs
- Eawag Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland.
| | | | | |
|
45
|
Schymanski EL, Ruttkies C, Krauss M, Brouard C, Kind T, Dührkop K, Allen F, Vaniya A, Verdegem D, Böcker S, Rousu J, Shen H, Tsugawa H, Sajed T, Fiehn O, Ghesquière B, Neumann S. Critical Assessment of Small Molecule Identification 2016: automated methods. J Cheminform 2017; 9:22. [PMID: 29086042 PMCID: PMC5368104 DOI: 10.1186/s13321-017-0207-1] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 12/14/2016] [Accepted: 03/13/2017] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification. RESULTS The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways. CONCLUSIONS The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.
Affiliation(s)
- Emma L Schymanski
- Eawag: Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600, Dübendorf, Switzerland.
| | - Christoph Ruttkies
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
| | - Martin Krauss
- Department of Effect-Directed Analysis, UFZ: Helmholtz Centre for Environmental Research, Permoserstrasse 15, 04318, Leipzig, Germany
| | - Céline Brouard
- Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland
- Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland
| | - Tobias Kind
- West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA
| | - Kai Dührkop
- Chair of Bioinformatics, Friedrich-Schiller-University, Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
| | - Felicity Allen
- Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E9, Canada
| | - Arpana Vaniya
- West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA
- Department of Chemistry, University of California Davis, One Shields Avenue, Davis, CA, 95616, USA
| | - Dries Verdegem
- Metabolomics Expertise Center, Vesalius Research Center (VRC), VIB, KU Leuven - University of Leuven, 3000, Louvain, Belgium
| | - Sebastian Böcker
- Chair of Bioinformatics, Friedrich-Schiller-University, Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany
| | - Juho Rousu
- Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland
- Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland
| | - Huibin Shen
- Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland
- Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science (CSRS), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
| | - Tanvir Sajed
- Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E9, Canada
| | - Oliver Fiehn
- West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA
- Department of Biochemistry, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Bart Ghesquière
- Metabolomics Expertise Center, Vesalius Research Center (VRC), VIB, KU Leuven - University of Leuven, 3000, Louvain, Belgium
| | - Steffen Neumann
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
| |
|
46
|
Spicer R, Salek RM, Moreno P, Cañueto D, Steinbeck C. Navigating freely-available software tools for metabolomics analysis. Metabolomics 2017; 13:106. [PMID: 28890673 PMCID: PMC5550549 DOI: 10.1007/s11306-017-1242-7] [Citation(s) in RCA: 142] [Impact Index Per Article: 20.3] [Received: 04/11/2017] [Accepted: 07/25/2017] [Indexed: 12/21/2022]
Abstract
INTRODUCTION The field of metabolomics has expanded greatly over the past two decades, both as an experimental science with applications in many areas, as well as in regards to data standards and bioinformatics software tools. The diversity of experimental designs and instrumental technologies used for metabolomics has led to the need for distinct data analysis methods and the development of many software tools. OBJECTIVES To compile a comprehensive list of the most widely used freely available software and tools that are used primarily in metabolomics. METHODS The most widely used tools were selected for inclusion in the review by either ≥ 50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey. Tools were then categorised by the type of instrumental data (i.e. LC-MS, GC-MS or NMR) and the functionality (i.e. pre- and post-processing, statistical analysis, workflow and other functions) they are designed for. RESULTS A comprehensive list of the most used tools was compiled. Each tool is discussed within the context of its application domain and in relation to comparable tools of the same domain. An extended list including additional tools is available at https://github.com/RASpicer/MetabolomicsTools which is classified and searchable via a simple controlled vocabulary. CONCLUSION This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. As future work, we suggest a direct comparison of tools' abilities to perform specific data analysis tasks e.g. peak picking.
Affiliation(s)
- Rachel Spicer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Reza M. Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Pablo Moreno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Daniel Cañueto
- Metabolomics Platform, IISPV, DEEEA, Universitat Rovira i Virgili, Campus Sescelades, Carretera de Valls, s/n, 43007 Tarragona, Catalonia Spain
| | - Christoph Steinbeck
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Friedrich-Schiller-University Jena, Lessingstr. 8, Jena, 07743 Germany
| |
|
47
|
van der Hooft JJJ, Wandy J, Barrett MP, Burgess KEV, Rogers S. Topic modeling for untargeted substructure exploration in metabolomics. Proc Natl Acad Sci U S A 2016; 113:13738-13743. [PMID: 27856765 PMCID: PMC5137707 DOI: 10.1073/pnas.1608041113] [Citation(s) in RCA: 196] [Impact Index Per Article: 24.5] [Indexed: 01/05/2023] Open
Abstract
The potential of untargeted metabolomics to answer important questions across the life sciences is hindered because of a paucity of computational tools that enable extraction of key biochemically relevant information. Available tools focus on using mass spectrometry fragmentation spectra to identify molecules whose behavior suggests they are relevant to the system under study. Unfortunately, fragmentation spectra cannot identify molecules in isolation but require authentic standards or databases of known fragmented molecules. Fragmentation spectra are, however, replete with information pertaining to the biochemical processes present, much of which is currently neglected. Here, we present an analytical workflow that exploits all fragmentation data from a given experiment to extract biochemically relevant features in an unsupervised manner. We demonstrate that an algorithm originally used for text mining, latent Dirichlet allocation, can be adapted to handle metabolomics datasets. Our approach extracts biochemically relevant molecular substructures ("Mass2Motifs") from spectra as sets of co-occurring molecular fragments and neutral losses. The analysis allows us to isolate molecular substructures, whose presence allows molecules to be grouped based on shared substructures regardless of classical spectral similarity. These substructures, in turn, support putative de novo structural annotation of molecules. Combining this spectral connectivity to orthogonal correlations (e.g., common abundance changes under system perturbation) significantly enhances our ability to provide mechanistic explanations for biological behavior.
Affiliation(s)
- Justin Johan Jozias van der Hooft
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1QH, United Kingdom
- Institute of Infection, Immunity, and Inflammation, College of Medical, Veterinary, and Life Sciences, University of Glasgow, Glasgow G12 8TA, United Kingdom
| | - Joe Wandy
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1QH, United Kingdom
- School of Computing Science, University of Glasgow, Glasgow G12 8RZ, United Kingdom
| | - Michael P Barrett
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1QH, United Kingdom
- Wellcome Trust Centre for Molecular Parasitology, Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow G12 8TA, United Kingdom
| | - Karl E V Burgess
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1QH, United Kingdom
| | - Simon Rogers
- Glasgow Polyomics, University of Glasgow, Glasgow G61 1QH, United Kingdom
- School of Computing Science, University of Glasgow, Glasgow G12 8RZ, United Kingdom
| |
|
48
|
Wohlgemuth G, Mehta SS, Mejia RF, Neumann S, Pedrosa D, Pluskal T, Schymanski EL, Willighagen EL, Wilson M, Wishart DS, Arita M, Dorrestein PC, Bandeira N, Wang M, Schulze T, Salek RM, Steinbeck C, Nainala VC, Mistrik R, Nishioka T, Fiehn O. SPLASH, a hashed identifier for mass spectra. Nat Biotechnol 2016; 34:1099-1101. [PMID: 27824832 PMCID: PMC5515539 DOI: 10.1038/nbt.3689] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 12/27/2022]
Affiliation(s)
- Gert Wohlgemuth
- West Coast Metabolomics Center and Genome Center University of California Davis, Davis, CA, USA
| | - Sajjan S. Mehta
- West Coast Metabolomics Center and Genome Center University of California Davis, Davis, CA, USA
| | - Ramon F. Mejia
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Steffen Neumann
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
| | - Diego Pedrosa
- West Coast Metabolomics Center and Genome Center University of California Davis, Davis, CA, USA
| | - Tomáš Pluskal
- Whitehead Institute for Biomedical Research, Nine Cambridge Center, Cambridge, MA, USA
| | - Emma L. Schymanski
- Eawag: Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
| | - Egon L. Willighagen
- Department of Bioinformatics - BiGCaT, Maastricht University, Maastricht, The Netherlands
| | - Michael Wilson
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
| | - David S. Wishart
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
| | - Masanori Arita
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
- National Institute of Genetics, Mishima, Shizuoka, Japan
| | - Pieter C. Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, CA, USA
- Departments of Pharmacology and Pediatrics, School of Medicine, UC San Diego, La Jolla, CA, USA
| | - Nuno Bandeira
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, CA, USA
- Computer Science and Engineering, UC San Diego, La Jolla, CA, USA
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, CA, USA
| | - Mingxun Wang
- Computer Science and Engineering, UC San Diego, La Jolla, CA, USA
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, CA, USA
| | - Tobias Schulze
- Department of Effect-Directed Analysis, UFZ Helmholtz Centre for Environmental Research GmbH, Leipzig, Germany
| | - Reza M. Salek
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Christoph Steinbeck
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Venkata Chandrasekhar Nainala
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Takaaki Nishioka
- Graduate School of Agriculture, Kyoto University, Kitashirakawa Oiwake-cho, Kyoto, Japan
| | - Oliver Fiehn
- West Coast Metabolomics Center and Genome Center University of California Davis, Davis, CA, USA
- Biochemistry Department, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
49
|
Meusel M, Hufsky F, Panter F, Krug D, Müller R, Böcker S. Predicting the Presence of Uncommon Elements in Unknown Biomolecules from Isotope Patterns. Anal Chem 2016; 88:7556-66. [DOI: 10.1021/acs.analchem.6b01015] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 01/02/2023]
Affiliation(s)
- Marvin Meusel
- Chair for Bioinformatics, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Franziska Hufsky
- Chair for Bioinformatics, Friedrich Schiller University Jena, 07743 Jena, Germany
- RNA Bioinformatics and High Throughput Analysis, Friedrich Schiller University Jena, 07743 Jena, Germany
| | - Fabian Panter
- Department of Microbial Natural Products, Helmholtz-Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research and Pharmaceutical Biotechnology, Saarland University, 66123 Saarbrücken, Germany
| | - Daniel Krug
- Department of Microbial Natural Products, Helmholtz-Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research and Pharmaceutical Biotechnology, Saarland University, 66123 Saarbrücken, Germany
| | - Rolf Müller
- Department of Microbial Natural Products, Helmholtz-Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research and Pharmaceutical Biotechnology, Saarland University, 66123 Saarbrücken, Germany
| | - Sebastian Böcker
- Chair for Bioinformatics, Friedrich Schiller University Jena, 07743 Jena, Germany
| |
|
50
|
Vinaixa M, Schymanski EL, Neumann S, Navarro M, Salek RM, Yanes O. Mass spectral databases for LC/MS- and GC/MS-based metabolomics: State of the field and future prospects. Trends Analyt Chem 2016. [DOI: 10.1016/j.trac.2015.09.005] [Citation(s) in RCA: 325] [Impact Index Per Article: 40.6] [Indexed: 01/22/2023]
|