1
Lai Y, Koelmel JP, Walker DI, Price EJ, Papazian S, Manz KE, Castilla-Fernández D, Bowden JA, Nikiforov V, David A, Bessonneau V, Amer B, Seethapathy S, Hu X, Lin EZ, Jbebli A, McNeil BR, Barupal D, Cerasa M, Xie H, Kalia V, Nandakumar R, Singh R, Tian Z, Gao P, Zhao Y, Froment J, Rostkowski P, Dubey S, Coufalíková K, Seličová H, Hecht H, Liu S, Udhani HH, Restituito S, Tchou-Wong KM, Lu K, Martin JW, Warth B, Godri Pollitt KJ, Klánová J, Fiehn O, Metz TO, Pennell KD, Jones DP, Miller GW. High-Resolution Mass Spectrometry for Human Exposomics: Expanding Chemical Space Coverage. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:12784-12822. [PMID: 38984754 PMCID: PMC11271014 DOI: 10.1021/acs.est.4c01156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/11/2024]
Abstract
In the modern "omics" era, measurement of the human exposome is a critical missing link between genetic drivers and disease outcomes. High-resolution mass spectrometry (HRMS), routinely used in proteomics and metabolomics, has emerged as a leading technology to broadly profile chemical exposure agents and related biomolecules for accurate mass measurement, high sensitivity, rapid data acquisition, and increased resolution of chemical space. Non-targeted approaches are increasingly accessible, supporting a shift from conventional hypothesis-driven, quantitation-centric targeted analyses toward data-driven, hypothesis-generating chemical exposome-wide profiling. However, HRMS-based exposomics encounters unique challenges. New analytical and computational infrastructures are needed to expand the analysis coverage through streamlined, scalable, and harmonized workflows and data pipelines that permit longitudinal chemical exposome tracking, retrospective validation, and multi-omics integration for meaningful health-oriented inferences. In this article, we survey the literature on state-of-the-art HRMS-based technologies, review current analytical workflows and informatic pipelines, and provide an up-to-date reference on exposomic approaches for chemists, toxicologists, epidemiologists, care providers, and stakeholders in health sciences and medicine. We propose efforts to benchmark fit-for-purpose platforms for expanding coverage of chemical space, including gas/liquid chromatography-HRMS (GC-HRMS and LC-HRMS), and discuss opportunities, challenges, and strategies to advance the burgeoning field of the exposome.
Affiliation(s)
- Yunjia Lai
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
- Jeremy P. Koelmel
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
- Douglas I. Walker
  - Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
- Elliott J. Price
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Stefano Papazian
  - Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
  - National Facility for Exposomics, Metabolomics Platform, Science for Life Laboratory, Stockholm University, Solna 171 65, Sweden
- Katherine E. Manz
  - Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
- Delia Castilla-Fernández
  - Department of Food Chemistry and Toxicology, Faculty of Chemistry, University of Vienna, 1010 Vienna, Austria
- John A. Bowden
  - Center for Environmental and Human Toxicology, Department of Physiological Sciences, College of Veterinary Medicine, University of Florida, Gainesville, Florida 32611, United States
- Arthur David
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) − UMR_S, 1085 Rennes, France
- Vincent Bessonneau
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) − UMR_S, 1085 Rennes, France
- Bashar Amer
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- Xin Hu
  - Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
- Elizabeth Z. Lin
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
- Akrem Jbebli
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Brooklynn R. McNeil
  - Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
- Dinesh Barupal
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Marina Cerasa
  - Institute of Atmospheric Pollution Research, Italian National Research Council, 00015 Monterotondo, Rome, Italy
- Hongyu Xie
  - Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Vrinda Kalia
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
- Renu Nandakumar
  - Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
- Randolph Singh
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
- Zhenyu Tian
  - Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
- Peng Gao
  - Department of Environmental and Occupational Health, and Department of Civil and Environmental Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
  - UPMC Hillman Cancer Center, Pittsburgh, Pennsylvania 15232, United States
- Yujia Zhao
  - Institute for Risk Assessment Sciences, Utrecht University, Utrecht 3584CM, The Netherlands
- Saurabh Dubey
  - Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
- Kateřina Coufalíková
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Hana Seličová
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Helge Hecht
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Sheng Liu
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
- Hanisha H. Udhani
  - Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
- Sophie Restituito
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
- Kam-Meng Tchou-Wong
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
- Kun Lu
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Jonathan W. Martin
  - Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
  - National Facility for Exposomics, Metabolomics Platform, Science for Life Laboratory, Stockholm University, Solna 171 65, Sweden
- Benedikt Warth
  - Department of Food Chemistry and Toxicology, Faculty of Chemistry, University of Vienna, 1010 Vienna, Austria
- Krystal J. Godri Pollitt
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
- Jana Klánová
  - RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California−Davis, Davis, California 95616, United States
- Thomas O. Metz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Kurt D. Pennell
  - School of Engineering, Brown University, Providence, Rhode Island 02912, United States
- Dean P. Jones
  - Department of Medicine, School of Medicine, Emory University, Atlanta, Georgia 30322, United States
- Gary W. Miller
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
2
Meyer C, Stravs MA, Hollender J. How Wastewater Reflects Human Metabolism─Suspect Screening of Pharmaceutical Metabolites in Wastewater Influent. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:9828-9839. [PMID: 38785362 PMCID: PMC11154963 DOI: 10.1021/acs.est.4c00968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/13/2024] [Accepted: 04/17/2024] [Indexed: 05/25/2024]
Abstract
Pharmaceuticals and their human metabolites are contaminants of emerging concern in the aquatic environment. Most monitoring studies focus on a limited set of parent compounds and even fewer metabolites. However, more than 50% of the most consumed pharmaceuticals are excreted in higher amounts as metabolites than as parents, as confirmed by a literature analysis within this study. Hence, we applied a wide-scope suspect screening approach to identify human pharmaceutical metabolites in wastewater influent from three Swiss treatment plants. Based on consumption amounts and human metabolism data, a suspect list comprising 268 parent compounds and over 1500 metabolites was compiled. Online solid phase extraction combined with liquid chromatography coupled to high-resolution tandem mass spectrometry was used to analyze the samples. Data processing, annotation, and structure elucidation were achieved with various tools, including molecular networking as well as SIRIUS/CSI:FingerID and MetFrag for MS2 spectra rationalization. We confirmed 37 metabolites with reference standards and 16 by human liver S9 incubation experiments. More than 25 metabolites were detected for the first time in influent wastewater. Semiquantification with MS2Quant showed that metabolite to parent concentration ratios were generally lower than expected from the literature, probably due to further metabolite transformation in the sewer system or limitations in metabolite detection. Nonetheless, metabolites make up a large fraction of the total pharmaceutical load in wastewater, highlighting the need for metabolite inclusion in chemical risk assessment.
Affiliation(s)
- Corina Meyer
  - Eawag: Swiss Federal Institute of Aquatic Science and Technology, Ueberlandstrasse 133, 8600 Duebendorf, Switzerland
  - Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Universitaetstrasse 16, 8092 Zurich, Switzerland
- Michael A. Stravs
  - Eawag: Swiss Federal Institute of Aquatic Science and Technology, Ueberlandstrasse 133, 8600 Duebendorf, Switzerland
- Juliane Hollender
  - Eawag: Swiss Federal Institute of Aquatic Science and Technology, Ueberlandstrasse 133, 8600 Duebendorf, Switzerland
  - Institute of Biogeochemistry and Pollutant Dynamics, ETH Zurich, Universitaetstrasse 16, 8092 Zurich, Switzerland
3
Mitchell JM, Chi Y, Thapa M, Pang Z, Xia J, Li S. Common data models to streamline metabolomics processing and annotation, and implementation in a Python pipeline. PLoS Comput Biol 2024; 20:e1011912. [PMID: 38843301 PMCID: PMC11185459 DOI: 10.1371/journal.pcbi.1011912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 06/18/2024] [Accepted: 05/20/2024] [Indexed: 06/18/2024]
Abstract
To standardize metabolomics data analysis and facilitate future computational developments, it is essential to have a set of well-defined templates for common data structures. Here we describe a collection of data structures involved in metabolomics data processing and illustrate how they are utilized in a full-featured Python-centric pipeline. We demonstrate the performance of the pipeline, and the details in annotation and quality control using large-scale LC-MS metabolomics and lipidomics data and LC-MS/MS data. Multiple previously published datasets are also reanalyzed to showcase its utility in biological data analysis. This pipeline allows users to streamline data processing, quality control, annotation, and standardization in an efficient and transparent manner. This work fills a major gap in the Python ecosystem for computational metabolomics.
Affiliation(s)
- Joshua M. Mitchell
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- Yuanye Chi
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- Maheshwor Thapa
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- Zhiqiang Pang
  - Institute of Parasitology, McGill University, Montreal, Quebec, Canada
- Jianguo Xia
  - Institute of Parasitology, McGill University, Montreal, Quebec, Canada
- Shuzhao Li
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
  - University of Connecticut School of Medicine, Farmington, Connecticut, United States of America
4
Perez de Souza L, Fernie AR. Computational methods for processing and interpreting mass spectrometry-based metabolomics. Essays Biochem 2024; 68:5-13. [PMID: 37999335 PMCID: PMC11065554 DOI: 10.1042/ebc20230019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/10/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023]
Abstract
Metabolomics has emerged as an indispensable tool for exploring complex biological questions, providing the ability to investigate a substantial portion of the metabolome. However, the vast complexity and structural diversity intrinsic to metabolites impose a great challenge for data analysis and interpretation. Liquid chromatography mass spectrometry (LC-MS) stands out as a versatile technique offering extensive metabolite coverage. In this mini-review, we address some of the hurdles posed by the complex nature of LC-MS data, providing a brief overview of computational tools designed to help tackle these challenges. Our focus centers on two major steps that are essential to most metabolomics investigations: the translation of raw data into quantifiable features, and the extraction of structural insights from mass spectra to facilitate metabolite identification. By exploring current computational solutions, we aim to provide a critical overview of the capabilities and constraints of mass spectrometry-based metabolomics, while introducing some of the most recent trends in data processing and analysis within the field.
Affiliation(s)
- Leonardo Perez de Souza
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
- Alisdair R Fernie
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
  - Center for Plant Systems Biology and Biotechnology, 4000 Plovdiv, Bulgaria
5
Mitchell JM, Chi Y, Thapa M, Pang Z, Xia J, Li S. Common data models to streamline metabolomics processing and annotation, and implementation in a Python pipeline. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.13.580048. [PMID: 38405981 PMCID: PMC10888883 DOI: 10.1101/2024.02.13.580048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
To standardize metabolomics data analysis and facilitate future computational developments, it is essential to have a set of well-defined templates for common data structures. Here we describe a collection of data structures involved in metabolomics data processing and illustrate how they are utilized in a full-featured Python-centric pipeline. We demonstrate the performance of the pipeline, and the details in annotation and quality control using large-scale LC-MS metabolomics and lipidomics data and LC-MS/MS data. Multiple previously published datasets are also reanalyzed to showcase its utility in biological data analysis. This pipeline allows users to streamline data processing, quality control, annotation, and standardization in an efficient and transparent manner. This work fills a major gap in the Python ecosystem for computational metabolomics.
Affiliation(s)
- Joshua M. Mitchell
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Yuanye Chi
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Maheshwor Thapa
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Zhiqiang Pang
  - Institute of Parasitology, McGill University, Montreal, Quebec, Canada
- Jianguo Xia
  - Institute of Parasitology, McGill University, Montreal, Quebec, Canada
- Shuzhao Li
  - The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
  - University of Connecticut School of Medicine, Farmington, CT 06032, USA
6
Shen XJ, Zhang JQ, An YL, Yang L, Li XL, Hu YS, Sha F, Yao CL, Bi QR, Qu H, Guo DA. MATLAB language assisted data acquisition and processing in liquid chromatography Orbitrap mass spectrometry: Application to the identification and differentiation of Radix Bupleuri from its adulterants. J Chromatogr A 2024; 1714:464544. [PMID: 38142618 DOI: 10.1016/j.chroma.2023.464544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/20/2023] [Accepted: 11/27/2023] [Indexed: 12/26/2023]
Abstract
Comprehensive and rapid analysis of secondary metabolites like saponins remains challenging. This study aimed to establish a semi-automated workflow for filtration, identification, and characterization of saikosaponins in six Bupleurum species. Radix Bupleuri, a high-sales herbal medicine, is often adulterated, restricting its quality control and applications. Two authentic Radix Bupleuri species and four major adulterants were analyzed through UHPLC-LTQ-Orbitrap-MS for targeted saikosaponin analysis. To reveal trace saikosaponins and obtain quality fragment data, a MATLAB-based process that automatically enumerates "sugar chain + aglycone + side chain" combinations and removes duplicates generated a predicted saikosaponin database covering all possible saikosaponins, which served as a precursor ion list for comprehensive targeted acquisition. To focus on informative ions and reduce MS analysis workload, we used MATLAB to automatically filter out false-positive ions based on MS1 and MS2 spectra. The newly established MATLAB-assisted data acquisition approach exhibited a 50% improvement in characterization of targeted saikosaponins. Furthermore, positive and negative ionization workflows were designed for accurate saikosaponin characterization based on fragmentation rules. In total, 707 saikosaponins were characterized, including over 500 potential new compounds and previously unreported C29 aglycones. We identified 25 saikosaponins present in both authentic species but absent in adulterants as potential markers. This unprecedented comprehensive multi-origin species differentiation demonstrates the promise of MATLAB-assisted acquisition and processing to advance saponin identification and standardize the Radix Bupleuri market.
Affiliation(s)
- Xuan-Jing Shen
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Jian-Qing Zhang
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Ya-Ling An
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Lin Yang
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Xiao-Lan Li
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Yun-Shu Hu
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Fei Sha
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Chang-Liang Yao
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Qi-Rui Bi
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- Hua Qu
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China
- De-An Guo
  - National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Haike Road #501, Shanghai 201203, China; Zhongshan Institute for Drug Discovery, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Zhongshan 528400, China
7
Ly R, Torres LC, Ly N, Britz-McKibbin P. Expanding Lipidomic Coverage in Multisegment Injection-Nonaqueous Capillary Electrophoresis-Mass Spectrometry via a Convenient and Quantitative Methylation Strategy. Anal Chem 2023; 95:17513-17524. [PMID: 37991882 DOI: 10.1021/acs.analchem.3c02605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2023]
Abstract
Orthogonal separation techniques coupled to high-resolution mass spectrometry are required for characterizing the human lipidome, given its inherent chemical and structural complexity. However, electrophoretic separations remain largely unrecognized in contemporary lipidomics research compared to established chromatographic and ion mobility methods. Herein, we introduce a novel derivatization protocol based on 3-methyl-1-p-tolyltriazene (MTT) as a safer alternative to diazomethane for quantitative phospholipid (PL) methylation (∼90%), which enables their rapid analysis by multisegment injection-nonaqueous capillary electrophoresis-mass spectrometry (MSI-NACE-MS). Isobaric interferences and ion suppression effects were minimized by performing an initial reaction using 9-fluorenylmethyloxycarbonyl chloride prior to MTT and a subsequent back extraction in hexane. This charge-switch derivatization strategy expands lipidome coverage when using MSI-NACE-MS under positive ion mode with improved resolution, greater sensitivity, and higher throughput (∼3.5 min/sample), notably for zwitterionic PLs that are analyzed as their cationic phosphate methyl esters. Our method was validated by analyzing methyl-tert-butyl ether extracts of reference human plasma, which enabled a direct comparison of 48 phosphatidylcholine and 27 sphingomyelin species previously reported in an interlaboratory lipidomics harmonization study. The potential for plasma PL quantification by MSI-NACE-MS via a serial dilution of NIST SRM-1950 was also demonstrated based on estimation of relative response factors using their reported consensus concentrations. Moreover, lipid identification was supported by modeling predictable changes in the electrophoretic mobility for cationic PLs in conjunction with MS/MS. Overall, this work offers a practical derivatization protocol to expand lipidome coverage in CE-MS beyond the analysis of hydrophilic/polar metabolites under aqueous buffer conditions.
Affiliation(s)
- Ritchie Ly
  - Department of Chemistry and Chemical Biology, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S 4M1
- Lucas Christian Torres
  - Department of Chemistry and Chemical Biology, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S 4M1
- Nicholas Ly
  - Department of Chemistry and Chemical Biology, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S 4M1
- Philip Britz-McKibbin
  - Department of Chemistry and Chemical Biology, McMaster University, 1280 Main Street West, Hamilton, Ontario, Canada L8S 4M1
8
Arturi K, Hollender J. Machine Learning-Based Hazard-Driven Prioritization of Features in Nontarget Screening of Environmental High-Resolution Mass Spectrometry Data. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:18067-18079. [PMID: 37279189 PMCID: PMC10666537 DOI: 10.1021/acs.est.3c00304] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/12/2023] [Revised: 05/15/2023] [Accepted: 05/15/2023] [Indexed: 06/08/2023]
Abstract
Nontarget high-resolution mass spectrometry screening (NTS HRMS/MS) can detect thousands of organic substances in environmental samples. However, new strategies are needed to focus time-intensive identification efforts on features with the highest potential to cause adverse effects instead of the most abundant ones. To address this challenge, we developed MLinvitroTox, a machine learning framework that uses molecular fingerprints derived from fragmentation spectra (MS2) for a rapid classification of thousands of unidentified HRMS/MS features as toxic/nontoxic based on nearly 400 target-specific and over 100 cytotoxic endpoints from ToxCast/Tox21. Model development results demonstrated that using customized molecular fingerprints and models, over a quarter of toxic endpoints and the majority of the associated mechanistic targets could be accurately predicted with sensitivities exceeding 0.95. Notably, SIRIUS molecular fingerprints and XGBoost (Extreme Gradient Boosting) models with SMOTE (Synthetic Minority Oversampling Technique) for handling data imbalance were a universally successful and robust modeling configuration. Validation of MLinvitroTox on MassBank spectra showed that toxicity could be predicted from molecular fingerprints derived from MS2 with an average balanced accuracy of 0.75. By applying MLinvitroTox to environmental HRMS/MS data, we confirmed the experimental results obtained with target analysis and narrowed the analytical focus from tens of thousands of detected signals to 783 features linked to potential toxicity, including 109 spectral matches and 30 compounds with confirmed toxic activity.
Affiliation(s)
- Katarzyna Arturi
  - Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Ueberlandstrasse 133, 8600 Dübendorf, Switzerland
- Juliane Hollender
  - Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Ueberlandstrasse 133, 8600 Dübendorf, Switzerland
  - Institute of Biogeochemistry and Pollution Dynamics, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Rämistrasse 101, 8092 Zürich, Switzerland
9
Bartmanski BJ, Rocha M, Zimmermann-Kogadeeva M. Recent advances in data- and knowledge-driven approaches to explore primary microbial metabolism. Curr Opin Chem Biol 2023; 75:102324. [PMID: 37207402 PMCID: PMC10410306 DOI: 10.1016/j.cbpa.2023.102324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 05/21/2023]
Abstract
With the rapid progress in metabolomics and sequencing technologies, more data on the metabolome of single microbes and their communities become available, revealing the potential of microorganisms to metabolize a broad range of chemical compounds. The analysis of microbial metabolomics datasets remains challenging since it inherits the technical challenges of metabolomics analysis, such as compound identification and annotation, while harboring challenges in data interpretation, such as distinguishing metabolite sources in mixed samples. This review outlines the recent advances in computational methods to analyze primary microbial metabolism: knowledge-based approaches that take advantage of metabolic and molecular networks and data-driven approaches that employ machine/deep learning algorithms in combination with large-scale datasets. These methods aim at improving metabolite identification and disentangling reciprocal interactions between microbes and metabolites. We also discuss the perspective of combining these approaches and further developments required to advance the investigation of primary metabolism in mixed microbial samples.
Affiliation(s)
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, Campus of Gualtar, Braga, Portugal
10
Li S, Siddiqa A, Thapa M, Chi Y, Zheng S. Trackable and scalable LC-MS metabolomics data processing using asari. Nat Commun 2023; 14:4113. [PMID: 37433854 DOI: 10.1038/s41467-023-39889-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 06/04/2022] [Accepted: 06/30/2023] [Indexed: 07/13/2023]
Abstract
Significant challenges remain in the computational processing of data from liquid chromatography-mass spectrometry (LC-MS)-based metabolomic experiments into metabolite features. In this study, we examine the issues of provenance and reproducibility using the current software tools. Inconsistency among the tools examined is attributed to deficiencies in mass alignment and in controls of feature quality. To address these issues, we develop the open-source software tool asari for LC-MS metabolomics data processing. Asari is designed with a specific set of algorithmic frameworks and data structures, and all steps are explicitly trackable. Asari compares favorably to other tools in feature detection and quantification. It offers substantial improvement in computational performance over current tools, and it is highly scalable.
Affiliation(s)
- Shuzhao Li
- Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA.
- University of Connecticut School of Medicine, Farmington, CT, USA.
| | - Amnah Siddiqa
- Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA
| | - Maheshwor Thapa
- Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA
| | - Yuanye Chi
- Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA
| | - Shujian Zheng
- Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA
|
11
|
Ebbels TMD, van der Hooft JJJ, Chatelaine H, Broeckling C, Zamboni N, Hassoun S, Mathé EA. Recent advances in mass spectrometry-based computational metabolomics. Curr Opin Chem Biol 2023; 74:102288. [PMID: 36966702 PMCID: PMC11075003 DOI: 10.1016/j.cbpa.2023.102288] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 10/24/2022] [Revised: 02/16/2023] [Accepted: 02/21/2023] [Indexed: 04/03/2023]
Abstract
The computational metabolomics field brings together computer scientists, bioinformaticians, chemists, clinicians, and biologists to maximize the impact of metabolomics across a wide array of scientific and medical disciplines. The field continues to expand as modern instrumentation produces datasets with increasing complexity, resolution, and sensitivity. These datasets must be processed, annotated, modeled, and interpreted to enable biological insight. Techniques for visualization, integration (within or between omics), and interpretation of metabolomics data have evolved along with innovation in the databases and knowledge resources required to aid understanding. In this review, we highlight recent advances in the field and reflect on opportunities and innovations in response to the most pressing challenges. This review was compiled from discussions from the 2022 Dagstuhl seminar entitled "Computational Metabolomics: From Spectra to Knowledge".
Affiliation(s)
- Timothy M D Ebbels
- Section of Bioinformatics, Department of Metabolism, Digestion & Reproduction, Imperial College London, Burlington Danes Building, Hammersmith Hospital, Du Cane Road, London W12 0NN, UK.
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, Wageningen 6708 PB, the Netherlands; Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
| | - Haley Chatelaine
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA
| | - Corey Broeckling
- Bioanalysis and Omics Center, Analytical Resources Core, Colorado State University, Fort Collins, CO, USA
| | - Nicola Zamboni
- Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Soha Hassoun
- Department of Computer Science, Tufts University, Medford, MA, USA; Department of Chemical and Biological Engineering, Tufts University, Medford, MA, USA
| | - Ewy A Mathé
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA.
|
12
|
Novák J, Schug KA, Havlíček V. Quantitation of small molecules from liquid chromatography-mass spectrometric accurate mass datasets using CycloBranch. European Journal of Mass Spectrometry (Chichester, England) 2023; 29:102-110. [PMID: 37000628 DOI: 10.1177/14690667231164766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Gaussian and exponentially modified Gaussian functions were incorporated into integrating algorithms used by an open-source, cross-platform tool called CycloBranch. The quantitation is demonstrated on bacterial pyoverdines separated by fine isotope features. Using our algorithm, we can separate the m/z values 694.25802 and 694.26731 (a 0.009 Da difference), where the former belongs to the most intense peak of pyoverdine D (PvdD), and the latter to the second most intense peak of pyoverdine E (PvdE) in the respective isotopic clusters of [M + Fe-H]2+ ions. The areas under chromatographic curves of standards were analyzed for the limit of detection (LOD), limit of quantitation (LOQ), and regression coefficient calculations. The quantitative module returned a LOD and LOQ of 1.4 and 4.3 ng/mL, respectively, for both PvdD and PvdE in human urine. If present and detected in mass spectra, the intensities of user-defined [M + H]+, [M + Na]+, [M + K]+, [M + Fe-H]2+, or other ion types can be accumulated and used for quantitation. The quantitation result is returned by CycloBranch in seconds or minutes, in contrast to an hours-long manual approach that is prone to user-introduced errors arising from the copying required between various software environments. Native Bruker, Waters, Thermo, txt, mgf, mzML, and mzXML data formats are supported in CycloBranch, which is freely available at https://ms.biomed.cas.cz/cyclobranch.
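To put the quoted mass separation in context, the arithmetic can be sketched in a few lines of Python. This is only an illustration built from the two m/z values given in the abstract; the function names are hypothetical and are not part of CycloBranch, and the m/Δm figure is a rough rule-of-thumb estimate rather than an instrument specification.

```python
def ppm_difference(mz_a: float, mz_b: float) -> float:
    """Relative difference between two m/z values, in ppm of mz_a."""
    return abs(mz_b - mz_a) / mz_a * 1e6


def required_resolving_power(mz_a: float, mz_b: float) -> float:
    """Rough m / delta-m estimate of the resolving power needed to
    distinguish two neighbouring m/z values (rule of thumb only)."""
    return mz_a / abs(mz_b - mz_a)


# m/z values quoted in the abstract for the [M + Fe-H]2+ clusters.
PVDD_MZ = 694.25802  # most intense isotopic peak of pyoverdine D
PVDE_MZ = 694.26731  # second most intense isotopic peak of pyoverdine E

if __name__ == "__main__":
    print(f"delta m/z : {PVDE_MZ - PVDD_MZ:.5f}")
    print(f"delta ppm : {ppm_difference(PVDD_MZ, PVDE_MZ):.2f}")
    print(f"m/delta-m : {required_resolving_power(PVDD_MZ, PVDE_MZ):.0f}")
```

Under these assumptions the two ions differ by roughly 13 ppm, which suggests a resolving power on the order of 75,000 at this m/z before any peak-shape modelling is considered.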
Affiliation(s)
- Jiří Novák
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Faculty of Information Technology, Czech Technical University in Prague, Prague, Czech Republic
| | - Kevin A Schug
- Department of Chemistry and Biochemistry, The University of Texas Arlington, Arlington, TX, USA
| | - Vladimír Havlíček
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
|
13
|
Guo J, Huan T. Mechanistic Understanding of the Discrepancies between Common Peak Picking Algorithms in Liquid Chromatography–Mass Spectrometry-Based Metabolomics. Anal Chem 2023; 95:5894-5902. [PMID: 36972195 DOI: 10.1021/acs.analchem.2c04887] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/29/2023]
Abstract
Inconsistent peak picking outcomes are a critical concern in processing liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics data. This work systematically studied the mechanisms behind the discrepancies among five commonly used peak picking algorithms, including CentWave in XCMS, linear-weighted moving average in MS-DIAL, automated data analysis pipeline (ADAP) in MZmine 2, Savitzky-Golay in El-MAVEN, and FeatureFinderMetabo in OpenMS. We first collected 10 public metabolomics datasets representing various LC-MS analytical conditions. We then incorporated several novel strategies to (i) acquire the optimal peak picking parameters of each algorithm for a fair comparison, (ii) automatically recognize false metabolic features with poor chromatographic peak shapes, and (iii) evaluate the real metabolic features that are missed by the algorithms. By applying these strategies, we compared the true, false, and undetected metabolic features in each data processing outcome. Our results show that linear-weighted moving average consistently outperforms the other peak picking algorithms. To facilitate a mechanistic understanding of the differences, we proposed six peak attributes: ideal slope, sharpness, peak height, mass deviation, peak width, and scan number. We also developed an R program to automatically measure these attributes for detected and undetected true metabolic features. From the results of the 10 datasets, we concluded that four peak attributes, including ideal slope, scan number, peak width, and mass deviation, are critical for the detectability of a peak. For instance, the focus on ideal slope critically hinders the extraction of true metabolic features with low ideal slope scores in linear-weighted moving average, Savitzky-Golay, and ADAP. The relationships between peak picking algorithms and peak attributes were also visualized in a principal component analysis biplot. 
Overall, the clear comparison and explanation of the differences between peak picking algorithms can lead to the design of better peak picking strategies in the future.
|
14
|
Stancliffe E, Schwaiger-Haber M, Sindelar M, Murphy MJ, Soerensen M, Patti GJ. An Untargeted Metabolomics Workflow that Scales to Thousands of Samples for Population-Based Studies. Anal Chem 2022; 94:17370-17378. [PMID: 36475608 PMCID: PMC11018270 DOI: 10.1021/acs.analchem.2c01270] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/12/2022]
Abstract
The success of precision medicine relies upon collecting data from many individuals at the population level. Although advancing technologies have made such large-scale studies increasingly feasible in some disciplines such as genomics, the standard workflows currently implemented in untargeted metabolomics were developed for small sample numbers and are limited by the processing of liquid chromatography/mass spectrometry data. Here we present an untargeted metabolomics workflow that is designed to support large-scale projects with thousands of biospecimens. Our strategy is to first evaluate a reference sample created by pooling aliquots of biospecimens from the cohort. The reference sample captures the chemical complexity of the biological matrix in a small number of analytical runs, which can subsequently be processed with conventional software such as XCMS. Although this generates thousands of so-called features, most do not correspond to unique compounds from the samples and can be filtered with established informatics tools. The features remaining represent a comprehensive set of biologically relevant reference chemicals that can then be extracted from the entire cohort's raw data on the basis of m/z values and retention times by using Skyline. To demonstrate applicability to large cohorts, we evaluated >2000 human plasma samples with our workflow. We focused our analysis on 360 identified compounds, but we also profiled >3000 unknowns from the plasma samples. As part of our workflow, we tested 14 different computational approaches for batch correction and found that a random forest-based approach outperformed the others. The corrected data revealed distinct profiles that were associated with the geographic location of participants.
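The final extraction step described above, pulling a fixed list of reference features out of each sample by m/z and retention time, can be sketched as follows. This is a minimal Python illustration and not Skyline's algorithm: the `Target` class, the `integrate_target` function, the tolerance defaults, and the toy data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Target:
    """A reference feature defined by m/z and retention time (hypothetical)."""
    name: str
    mz: float
    rt: float  # seconds


def integrate_target(
    points: Iterable[Tuple[float, float, float]],
    target: Target,
    ppm_tol: float = 10.0,
    rt_tol: float = 15.0,
) -> float:
    """Sum the intensities of (rt, mz, intensity) points that fall inside
    the target's m/z window (in ppm) and retention-time window (in seconds)."""
    mz_window = target.mz * ppm_tol / 1e6
    total = 0.0
    for rt, mz, intensity in points:
        if abs(rt - target.rt) <= rt_tol and abs(mz - target.mz) <= mz_window:
            total += intensity
    return total


# Toy sample: three points on the target, one at another m/z, one at another RT.
sample = [
    (100.0, 180.0634, 500.0),
    (102.0, 180.0635, 1500.0),
    (104.0, 180.0633, 700.0),
    (102.0, 181.0700, 900.0),
    (300.0, 180.0634, 400.0),
]
target = Target(name="hypothetical_feature", mz=180.0634, rt=102.0)

if __name__ == "__main__":
    print(integrate_target(sample, target))
```

Because the target list is fixed once from the pooled reference sample, the same function can be applied to every sample in a cohort independently, which is what lets this kind of extraction scale with sample count.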
Affiliation(s)
- Ethan Stancliffe
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Michaela Schwaiger-Haber
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Miriam Sindelar
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Matthew J. Murphy
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Mette Soerensen
- Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Odense, Denmark
| | - Gary J. Patti
- Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, Missouri 63130, United States
|
15
|
Stochastic dynamic quantitative and 3D structural matrix assisted laser desorption/ionization mass spectrometric analyses of mixture of nucleosides. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.132701] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
|
16
|
El Abiead Y, Milford M, Schoeny H, Rusz M, Salek RM, Koellensperger G. Power of mzRAPP-Based Performance Assessments in MS1-Based Nontargeted Feature Detection. Anal Chem 2022; 94:8588-8595. [PMID: 35671103 PMCID: PMC9218958 DOI: 10.1021/acs.analchem.1c05270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 05/24/2022] [Indexed: 11/29/2022]
Abstract
When performing chromatography-mass spectrometry-based nontargeted metabolomics, or exposomics, one of the key steps in the analysis is to obtain MS1-based feature tables. Inapt parameter settings in feature detection will result in missing or wrong quantitative values and might ultimately lead to downstream incorrect biological interpretations. However, until recently, no strategies to assess the completeness and abundance accuracy of feature tables were available. Here, we show that mzRAPP enables the generation of benchmark peak lists by using an internal set of known molecules in the analyzed data set. Using the benchmark, the completeness and abundance accuracy of feature tables can be assessed in an automated pipeline. We demonstrate that our approach adds to other commonly applied quality assurance methods such as manual or automatized parameter optimization techniques or removal of false-positive signals. Moreover, we show that as few as 10 benchmark molecules can already allow for representative performance metrics to further improve quantitative biological understanding.
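The idea of scoring a feature table against a benchmark peak list can be illustrated with a small completeness calculation. The Python sketch below is not mzRAPP's implementation: the `completeness` function, its matching rule, the tolerance defaults, and the example values are all assumptions chosen to show the principle.

```python
from typing import Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, retention time in seconds)


def completeness(
    benchmark: Sequence[Peak],
    features: Sequence[Peak],
    ppm_tol: float = 5.0,
    rt_tol: float = 6.0,
) -> float:
    """Fraction of benchmark peaks that have at least one feature within
    ppm_tol (m/z) and rt_tol (retention time). Returns 0.0 if the
    benchmark is empty."""
    if not benchmark:
        return 0.0
    found = 0
    for b_mz, b_rt in benchmark:
        if any(
            abs(f_mz - b_mz) / b_mz * 1e6 <= ppm_tol and abs(f_rt - b_rt) <= rt_tol
            for f_mz, f_rt in features
        ):
            found += 1
    return found / len(benchmark)


# Toy benchmark of four known peaks and a feature table that misses one
# of them (the third feature is about 33 ppm away from its benchmark peak).
benchmark_peaks = [(100.0, 60.0), (200.0, 120.0), (300.0, 180.0), (400.0, 240.0)]
feature_table = [(100.0003, 61.0), (200.0005, 119.0), (300.0100, 180.0), (400.0010, 242.0)]

if __name__ == "__main__":
    print(completeness(benchmark_peaks, feature_table))
```

A fuller assessment along the lines the abstract describes would also compare reported abundances with benchmark abundances; that part is left out here because the abstract does not specify how it is computed.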
Affiliation(s)
- Yasin El Abiead
- Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
| | - Maximilian Milford
- Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
| | - Harald Schoeny
- Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
| | - Mate Rusz
- Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
- Department of Inorganic Chemistry, University of Vienna, Vienna 1090, Austria
| | - Reza M. Salek
- International Agency for Research on Cancer, Section of Nutrition and Metabolism, Lyon 96008, France
|
17
|
Minkus S, Bieber S, Letzel T. Spotlight on mass spectrometric non-target screening analysis: Advanced data processing methods recently communicated for extracting, prioritizing and quantifying features. Analytical Science Advances 2022; 3:103-112. [PMID: 38715638 PMCID: PMC10989605 DOI: 10.1002/ansa.202200001] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/01/2022] [Revised: 03/22/2022] [Accepted: 03/24/2022] [Indexed: 06/13/2024]
Abstract
Non-target screening of trace organic compounds complements routine monitoring of water bodies. So-called features need to be extracted from the raw data that preferably represent a chemical compound. Relevant features need to be prioritized and further be interpreted, for instance by identifying them. Finally, quantitative data is required to assess the risks of a detected compound. This review presents recent and noteworthy contributions to the processing of non-target screening (NTS) data, prioritization of features as well as (semi-) quantitative methods that do not require analytical standards. The focus lies on environmental water samples measured by liquid chromatography, electrospray ionization and high-resolution mass spectrometry. Examples for fully-integrated data processing workflows are given with options for parameter optimization and choosing between different feature extraction algorithms to increase feature coverage. The regions of interest-multivariate curve resolution method is reviewed which combines a data compression alternative with chemometric feature extraction. Furthermore, prioritization strategies based on a confined chemical space for annotation, guidance by targeted analysis and signal intensity are presented. Exploiting the retention time (RT) as diagnostic evidence for NTS investigations is highlighted by discussing RT indexing and prediction using quantitative structure-retention relationship models. Finally, a seminal technology for quantitative NTS is discussed without the need for analytical standards based on predicting ionization efficiencies.
Collapse
Affiliation(s)
- Susanne Minkus
- AFIN-TS GmbH, Augsburg, Germany
- Technical University of Munich (Chair of Urban Water Systems Engineering), Munich, Germany
| | - Thomas Letzel
- AFIN-TS GmbH, Augsburg, Germany
- Technical University of Munich (Chair of Urban Water Systems Engineering), Munich, Germany
|