1. Martínez S, Fernández-García M, Londoño-Osorio S, Barbas C, Gradillas A. Highly reliable LC-MS lipidomics database for efficient human plasma profiling based on NIST SRM 1950. J Lipid Res 2024:100671. [PMID: 39395790 DOI: 10.1016/j.jlr.2024.100671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 10/04/2024] [Accepted: 10/07/2024] [Indexed: 10/14/2024] Open
Abstract
Liquid chromatography coupled to high resolution mass spectrometry (LC-HRMS)-based methods have become the gold standard methodology for the comprehensive profiling of the human plasma lipidome. However, both the complexity of lipid chemistry and LC-HRMS-associated data pose challenges to the characterization of this biological matrix. In accordance with the current consensus of quality requirements for LC-HRMS lipidomics data, we aimed to characterize the NIST® Standard Reference Material for Human Plasma (SRM 1950) using an LC-ESI(+/-)-MS method compatible with high-throughput lipidome profiling. We generated a highly curated lipid database with increased coverage, quality, and consistency, including additional quality assurance procedures involving adduct formation, within-method m/z evaluation, retention behavior of species within lipid chain isomers, and expert-driven resolution of isomeric and isobaric interferences. As a proof-of-concept, we showed the utility of our in-house LC-MS lipidomic database (consisting of 592 lipid entries) for the fast, comprehensive, and reliable lipidomic profiling of plasma from healthy human volunteers. We are confident that the implementation of this robust resource and methodology will have a significant impact by reducing data redundancy and the current delays and bottlenecks in untargeted plasma lipidomic studies.
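The adduct and m/z checks mentioned in this abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' procedure: the adduct list, the function names, and the example lipid (PC 34:1, C42H82NO8P) are assumptions chosen for illustration.

```python
# Illustrative sketch only; the adduct list and the example lipid are
# assumptions and are not taken from the cited database.
PROTON = 1.007276466    # mass of a proton (Da)
ELECTRON = 0.000548580  # mass of an electron (Da)

# Mass shift added to the neutral monoisotopic mass M for singly charged adducts.
ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M+NH4]+": 18.034374 - ELECTRON,   # NH4 minus one electron
    "[M+Na]+": 22.989769 - ELECTRON,    # Na minus one electron
    "[M-H]-": -PROTON,
    "[M+HCOO]-": 44.997654 + ELECTRON,  # CHO2 plus one electron
}


def adduct_mz(neutral_mass, adduct):
    """Theoretical m/z of a singly charged adduct of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFT[adduct]


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical example: PC 34:1, C42H82NO8P, monoisotopic mass 759.577805 Da.
pc_34_1 = 759.577805
mz_h = adduct_mz(pc_34_1, "[M+H]+")
error = ppm_error(760.5866, mz_h)  # 760.5866 is a made-up observed value
print(f"[M+H]+ = {mz_h:.4f}, error = {error:.1f} ppm")
```

A database entry would typically be accepted only when the error falls inside a tolerance window chosen for the instrument; the window itself is not specified in the abstract.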
Affiliation(s)
- Sara Martínez
  - Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain
- Miguel Fernández-García
  - Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain; Departamento de Ciencias Médicas Básicas, Facultad de Medicina, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain
- Sara Londoño-Osorio
  - Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain
- Coral Barbas
  - Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain
- Ana Gradillas
  - Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28660 Boadilla del Monte, Madrid, Spain

2. Mildau K, Büschl C, Zanghellini J, van der Hooft JJJ. Combined LC-MS/MS feature grouping, statistical prioritization, and interactive networking in msFeaST. Bioinformatics 2024; 40:btae584. [PMID: 39348165 PMCID: PMC11471276 DOI: 10.1093/bioinformatics/btae584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/09/2024] [Accepted: 09/26/2024] [Indexed: 10/01/2024] Open
Abstract
SUMMARY Computational metabolomics workflows have revolutionized the untargeted metabolomics field. However, the organization and prioritization of metabolite features remains a laborious process. Organizing metabolomics data is often done through mass fragmentation-based spectral similarity grouping, resulting in feature sets that also represent an intuitive and scientifically meaningful first stage of analysis in untargeted metabolomics. Exploiting such feature sets, feature-set testing has emerged as an approach that is widely used in genomics and targeted metabolomics pathway enrichment analyses. It allows for formally combining groupings with statistical testing into more meaningful pathway enrichment conclusions. Here, we present msFeaST (mass spectral Feature Set Testing), a feature-set testing and visualization workflow for LC-MS/MS untargeted metabolomics data. Feature-set testing involves statistically assessing differential abundance patterns for groups of features across experimental conditions. We developed msFeaST to make use of spectral similarity-based feature groupings generated using k-medoids clustering, where the resulting clusters serve as a proxy for grouping structurally similar features with potential biosynthesis pathway relationships. Spectral clustering done in this way allows for feature group-wise statistical testing using the globaltest package, which provides high power to detect small concordant effects via joint modeling and reduced multiplicity adjustment penalties. Hence, msFeaST provides interactive integration of the semi-quantitative experimental information with mass-spectral structural similarity information, enhancing the prioritization of features and feature sets during exploratory data analysis. AVAILABILITY AND IMPLEMENTATION The msFeaST workflow is freely available through https://github.com/kevinmildau/msFeaST and built to work on MacOS and Linux systems.
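The grouping step described in this abstract (k-medoids clustering on spectral similarity) can be sketched in a few lines. This is a simplified alternating k-medoids heuristic written for illustration, not the msFeaST implementation; the similarity matrix, the function name `k_medoids`, and the initial medoids are all hypothetical.

```python
# Simplified alternating k-medoids; an assumption-laden sketch, not msFeaST code.
def k_medoids(dist, init_medoids, max_iter=100):
    """Cluster items given a symmetric distance matrix and starting medoids."""
    n = len(dist)
    medoids = list(init_medoids)

    def assign(current):
        # Each item joins the cluster of its nearest medoid.
        return [min(range(len(current)), key=lambda c: dist[i][current[c]])
                for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i in range(n) if labels[i] == c]
            if not members:          # keep the old medoid for an empty cluster
                new_medoids.append(old)
                continue
            # New medoid: member with the smallest total distance to the others.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:   # converged
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Hypothetical pairwise spectral similarities for four features (1 = identical).
similarity = [
    [1.00, 0.90, 0.10, 0.20],
    [0.90, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.80],
    [0.20, 0.10, 0.80, 1.00],
]
distance = [[1.0 - s for s in row] for row in similarity]
medoids, labels = k_medoids(distance, init_medoids=[0, 2])
print(medoids, labels)
```

Each resulting cluster would then serve as one feature set for group-wise statistical testing. The result of this heuristic depends on the starting medoids, so a practical workflow would try several initializations.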
Affiliation(s)
- Kevin Mildau
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University & Research, Radix Building, Droevendaalsesteeg 1, Wageningen, 6708PB, the Netherlands
  - Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
  - Doctoral School in Chemistry (DOSCHEM), University of Vienna, Vienna 1090, Austria
- Christoph Büschl
  - Department of Agrobiotechnology, Institute of Bioanalytics and Agro-Metabolomics, University of Natural Resources and Life Sciences, Konrad-Lorenz-Straße, Lower Austria 3430, Austria
- Jürgen Zanghellini
  - Department of Analytical Chemistry, University of Vienna, Vienna 1090, Austria
- Justin J J van der Hooft
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University & Research, Radix Building, Droevendaalsesteeg 1, Wageningen, 6708PB, the Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, Gauteng Province 2006, South Africa

3. Pakkir Shah AK, Walter A, Ottosson F, Russo F, Navarro-Diaz M, Boldt J, Kalinski JCJ, Kontou EE, Elofson J, Polyzois A, González-Marín C, Farrell S, Aggerbeck MR, Pruksatrakul T, Chan N, Wang Y, Pöchhacker M, Brungs C, Cámara B, Caraballo-Rodríguez AM, Cumsille A, de Oliveira F, Dührkop K, El Abiead Y, Geibel C, Graves LG, Hansen M, Heuckeroth S, Knoblauch S, Kostenko A, Kuijpers MCM, Mildau K, Papadopoulos Lambidis S, Portal Gomes PW, Schramm T, Steuer-Lodd K, Stincone P, Tayyab S, Vitale GA, Wagner BC, Xing S, Yazzie MT, Zuffa S, de Kruijff M, Beemelmanns C, Link H, Mayer C, van der Hooft JJJ, Damiani T, Pluskal T, Dorrestein P, Stanstrup J, Schmid R, Wang M, Aron A, Ernst M, Petras D. Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nat Protoc 2024:10.1038/s41596-024-01046-3. [PMID: 39304763 DOI: 10.1038/s41596-024-01046-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 07/02/2024] [Indexed: 09/22/2024]
Abstract
Feature-based molecular networking (FBMN) is a popular analysis approach for liquid chromatography-tandem mass spectrometry-based non-targeted metabolomics data. While processing liquid chromatography-tandem mass spectrometry data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. Especially users new to statistical analysis struggle to effectively handle and analyze complex data matrices. Here we provide a comprehensive guide for the statistical analysis of FBMN results, focusing on the downstream analysis of the FBMN output table. We explain the data structure and principles of data cleanup and normalization, as well as uni- and multivariate statistical analysis of FBMN results. We provide explanations and code in two scripting languages (R and Python) as well as the QIIME2 framework for all protocol steps, from data clean-up to statistical analysis. All code is shared in the form of Jupyter Notebooks ( https://github.com/Functional-Metabolomics-Lab/FBMN-STATS ). Additionally, the protocol is accompanied by a web application with a graphical user interface ( https://fbmn-statsguide.gnps2.org/ ) to lower the barrier of entry for new users and for educational purposes. Finally, we also show users how to integrate their statistical results into the molecular network using the Cytoscape visualization tool. Throughout the protocol, we use a previously published environmental metabolomics dataset for demonstration purposes. Together, the protocol, code and web application provide a complete guide and toolbox for FBMN data integration, cleanup and advanced statistical analysis, enabling new users to uncover molecular insights from their non-targeted metabolomics data. Our protocol is tailored for the seamless analysis of FBMN results from Global Natural Products Social Molecular Networking and can be easily adapted to other mass spectrometry feature detection, annotation and networking tools.
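Two of the cleanup and normalization steps named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the FBMN-STATS notebooks: the feature table, the sample and blank names, the 0.3 cutoff, and both function names are hypothetical.

```python
# Illustrative sketch; table contents, names and the 0.3 cutoff are assumptions.
def blank_removal(table, samples, blanks, cutoff=0.3):
    """Keep features whose mean blank intensity is small relative to samples."""
    kept = {}
    for feature, row in table.items():
        mean_sample = sum(row[s] for s in samples) / len(samples)
        mean_blank = sum(row[b] for b in blanks) / len(blanks)
        if mean_sample > 0 and mean_blank / mean_sample < cutoff:
            kept[feature] = row
    return kept


def tic_normalize(table, samples):
    """Divide each intensity by the summed intensity of its sample."""
    totals = {s: sum(row[s] for row in table.values()) for s in samples}
    return {
        feature: {s: (row[s] / totals[s] if totals[s] else 0.0) for s in samples}
        for feature, row in table.items()
    }


# Hypothetical feature table: feature -> {file name -> peak area}.
feature_table = {
    "f1": {"S1": 100.0, "S2": 300.0, "B1": 10.0},
    "f2": {"S1": 50.0, "S2": 50.0, "B1": 40.0},   # mostly background
    "f3": {"S1": 100.0, "S2": 100.0, "B1": 0.0},
}
cleaned = blank_removal(feature_table, samples=["S1", "S2"], blanks=["B1"])
normalized = tic_normalize(cleaned, samples=["S1", "S2"])
print(sorted(cleaned), normalized["f1"])
```

Blank removal is applied before normalization here so that background features do not contribute to the per-sample totals; other orderings and normalization schemes are possible.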
Affiliation(s)
- Abzer K Pakkir Shah
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Axel Walter
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
- Filip Ottosson
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Francesco Russo
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Marcelo Navarro-Diaz
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Judith Boldt
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research, Partner Site Braunschweig-Hannover, Braunschweig, Germany
- Jarmo-Charles J Kalinski
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Eftychia Eva Kontou
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - The Novo Nordisk Foundation for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- James Elofson
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Alexandros Polyzois
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Carolina González-Marín
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Universidad EAFIT, Medellín, Antioquia, Colombia
- Shane Farrell
  - Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, USA
  - School of Marine Sciences, Darling Marine Center, University of Maine, Walpole, ME, USA
- Marie R Aggerbeck
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Environmental Science, Aarhus University, Roskilde, Denmark
- Thapanee Pruksatrakul
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Thailand Science Park, Pathum Thani, Thailand
- Nathan Chan
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Yunshu Wang
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Magdalena Pöchhacker
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria
- Corinna Brungs
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Beatriz Cámara
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Andres Cumsille
  - Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Fernanda de Oliveira
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
  - Department of Biotechnology, Engineering School of Lorena, University of São Paulo, Lorena, São Paulo, Brazil
- Kai Dührkop
  - Department of Bioinformatics, University of Jena, Jena, Germany
- Yasin El Abiead
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Christian Geibel
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Lana G Graves
  - Department of Environmental Systems Analysis, University of Tübingen, Tübingen, Germany
  - Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
- Martin Hansen
  - Department of Environmental Science, Aarhus University, Roskilde, Denmark
- Steffen Heuckeroth
  - Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
- Simon Knoblauch
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Anastasiia Kostenko
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Mirte C M Kuijpers
  - Department of Ecology, Behavior and Evolution, University of California San Diego, San Diego, CA, USA
- Kevin Mildau
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Analytical Chemistry, University of Vienna, Vienna, Austria
  - Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
- Paulo Wender Portal Gomes
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Tilman Schramm
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
- Karoline Steuer-Lodd
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
- Paolo Stincone
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Sibgha Tayyab
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Giovanni Andrea Vitale
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Berenike C Wagner
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Shipei Xing
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Marquis T Yazzie
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Simone Zuffa
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Martinus de Kruijff
  - Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
- Christine Beemelmanns
  - Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
  - Saarland University, Saarbrücken, Germany
- Hannes Link
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Christoph Mayer
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Justin J J van der Hooft
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
- Tito Damiani
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Pieter Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Jan Stanstrup
  - Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg C, Denmark
- Robin Schmid
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Mingxun Wang
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
- Allegra Aron
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Madeleine Ernst
  - Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
- Daniel Petras
  - Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
  - University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA

4. Hupatz H, Rahu I, Wang WC, Peets P, Palm EH, Kruve A. Critical review on in silico methods for structural annotation of chemicals detected with LC/HRMS non-targeted screening. Anal Bioanal Chem 2024:10.1007/s00216-024-05471-x. [PMID: 39138659 DOI: 10.1007/s00216-024-05471-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/22/2024] [Accepted: 07/24/2024] [Indexed: 08/15/2024]
Abstract
Non-targeted screening with liquid chromatography coupled to high-resolution mass spectrometry (LC/HRMS) is increasingly leveraging in silico methods, including machine learning, to obtain candidate structures for structural annotation of LC/HRMS features and their further prioritization. Candidate structures are commonly retrieved based on the tandem mass spectral information either from spectral or structural databases; however, the vast majority of the detected LC/HRMS features remain unannotated, constituting what we refer to as a part of the unknown chemical space. Recently, the exploration of this chemical space has become accessible through generative models. Furthermore, the evaluation of the candidate structures benefits from the complementary empirical analytical information such as retention time, collision cross section values, and ionization type. In this critical review, we provide an overview of the current approaches for retrieving and prioritizing candidate structures. These approaches come with their own set of advantages and limitations, as we showcase in the example of structural annotation of ten known and ten unknown LC/HRMS features. We emphasize that these limitations stem from both experimental and computational considerations. Finally, we highlight three key considerations for the future development of in silico methods.
Affiliation(s)
- Henrik Hupatz
  - Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
  - Stockholm University Center for Circular and Sustainable Systems (SUCCeSS), Stockholm University, 106 91, Stockholm, Sweden
- Ida Rahu
  - Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Wei-Chieh Wang
  - Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Pilleriin Peets
  - Institute of Biodiversity, Faculty of Biological Science, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743, Jena, Germany
- Emma H Palm
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Anneli Kruve
  - Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
  - Stockholm University Center for Circular and Sustainable Systems (SUCCeSS), Stockholm University, 106 91, Stockholm, Sweden
  - Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 114 18, Stockholm, Sweden

5. Li X, Zhou Chen Y, Kalia A, Zhu H, Liu LP, Hassoun S. An Ensemble Spectral Prediction (ESP) model for metabolite annotation. Bioinformatics 2024; 40:btae490. [PMID: 39180771 PMCID: PMC11344591 DOI: 10.1093/bioinformatics/btae490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 06/25/2024] [Indexed: 08/26/2024] Open
Abstract
MOTIVATION A key challenge in metabolomics is annotating measured spectra from a biological sample with chemical identities. Currently, only a small fraction of measurements can be assigned identities. Two complementary computational approaches have emerged to address the annotation problem: mapping candidate molecules to spectra, and mapping query spectra to molecular candidates. In essence, the candidate molecule with the spectrum that best explains the query spectrum is recommended as the target molecule. Despite candidate ranking being fundamental in both approaches, limited prior works incorporated rank learning tasks in determining the target molecule. RESULTS We propose a novel machine learning model, Ensemble Spectral Prediction (ESP), for metabolite annotation. ESP takes advantage of prior neural network-based annotation models that utilize multilayer perceptron (MLP) networks and Graph Neural Networks (GNNs). Based on the ranking results of the MLP- and GNN-based models, ESP learns a weighting for the outputs of MLP and GNN spectral predictors to generate a spectral prediction for a query molecule. Importantly, training data is stratified by molecular formula to provide candidate sets during model training. Further, baseline MLP and GNN models are enhanced by considering peak dependencies through label mixing and multi-tasking on spectral topic distributions. When trained on the NIST 2020 dataset and evaluated on the relevant candidate sets from PubChem, ESP improves average rank by 23.7% and 37.2% over the MLP and GNN baselines, respectively, demonstrating performance gain over state-of-the-art neural network approaches. However, MLP approaches remain strong contenders when considering top five ranks. Importantly, we show that annotation performance is dependent on the training dataset, the number of molecules in the candidate set and candidate similarity to the target molecule. 
AVAILABILITY AND IMPLEMENTATION The ESP code, a trained model, and a Jupyter notebook that guides users through the ESP tool are available at https://github.com/HassounLab/ESP.
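The core idea in this abstract, combining two predicted spectra with a weight and ranking candidates against the query spectrum, can be sketched as follows. This is a toy illustration and not the ESP model: ESP learns its weighting, whereas the fixed weight `w`, the three-bin spectra, the candidate names, and the use of cosine similarity here are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ensemble(mlp_pred, gnn_pred, w):
    """Weighted combination of two predicted spectra (w is fixed here)."""
    return [w * m + (1.0 - w) * g for m, g in zip(mlp_pred, gnn_pred)]


def rank_of_target(query, candidates, target, w=0.5):
    """1-based rank of the target when candidates are sorted by similarity."""
    scores = {
        name: cosine(query, ensemble(mlp, gnn, w))
        for name, (mlp, gnn) in candidates.items()
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(target) + 1


# Hypothetical binned spectra: candidate -> (MLP prediction, GNN prediction).
query_spectrum = [1.0, 0.0, 0.5]
candidates = {
    "A": ([0.8, 0.1, 0.2], [0.9, 0.0, 0.7]),
    "B": ([0.1, 0.9, 0.1], [0.2, 0.8, 0.0]),
    "C": ([0.5, 0.5, 0.5], [0.6, 0.3, 0.4]),
}
rank = rank_of_target(query_spectrum, candidates, target="A")
print(rank)
```

Averaging this rank over many query spectra gives an average-rank metric of the kind reported in the abstract, where lower is better.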
Affiliation(s)
- Xinmeng Li
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
- Yan Zhou Chen
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
- Apurva Kalia
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
- Hao Zhu
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
- Li-ping Liu
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
- Soha Hassoun
  - Department of Computer Science, Tufts University, Medford, MA, 02155, United States
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA, 02155, United States

6. Coutinho ID, Facchinatto WM, Mertz-Henning LM, Viana AC, Marin SR, Santagneli SH, Nepomuceno AL, Colnago LA. NMR Fingerprinting of Conventional and Genetically Modified Soybean Plants with AtAREB1 Transcription Factors. ACS Omega 2024; 9:32651-32661. [PMID: 39100338 PMCID: PMC11292650 DOI: 10.1021/acsomega.4c01796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 06/09/2024] [Accepted: 06/12/2024] [Indexed: 08/06/2024]
Abstract
Drought stress impacts soybean yields and physiological processes. However, the insertion of the activated form of the AtAREB1 gene in the soybean cultivar BR16, which is sensitive to water deficit, improved the drought response of the genetically modified plants. Thus, in this study, we used 1H NMR in solution and solid-state NMR to investigate the response of genetically modified soybean overexpressing AtAREB1 under water deficiency conditions. We found that drought-tolerant soybean has a higher content of the amino acids isoleucine, leucine, threonine, valine, proline, glutamate, aspartate, asparagine, tyrosine, and phenylalanine after 12 days of drought stress than drought-sensitive soybean under the same conditions. Specific target compounds, including sugars, organic acids, and phenolic compounds, were identified as involved in controlling sensitive soybean during the vegetative stage. Solid-state NMR was used to study the impact of drought stress on starch and cellulose contents in different soybean genotypes. The findings provide insights into the metabolic adjustments of soybean overexpressing AREB transcription factors in adapting to dry climates. This study presents NMR techniques for investigating the metabolome of transgenic soybean plants in response to water deficit. The approach allowed for the identification of physiological and morphological changes in drought-resistant and drought-tolerant soybean tissues. The findings indicate that drought stress significantly alters micro- and macromolecular metabolism in soybean plants. Differential responses were observed among roots and leaves as well as drought-tolerant and drought-sensitive cultivars, highlighting the complex interplay between overexpressed transcription factors and drought stress in soybean plants.
Affiliation(s)
- Isabel Duarte Coutinho
  - Embrapa Instrumentation, Brazilian Agricultural Research Corporation, St. XV de Novembro 1452, P.O. Box 741, 13560-970 São Carlos, São Paulo, Brazil
- William Marcondes Facchinatto
  - Embrapa Instrumentation, Brazilian Agricultural Research Corporation, St. XV de Novembro 1452, P.O. Box 741, 13560-970 São Carlos, São Paulo, Brazil
- Liliane Marcia Mertz-Henning
  - Embrapa Soybean, Brazilian Agricultural Research Corporation, HWY Carlos João Strass, Warta District, P.O. Box 4006, 86085-981 Londrina, Paraná, Brazil
- Américo José Carvalho Viana
  - Embrapa Soybean, Brazilian Agricultural Research Corporation, HWY Carlos João Strass, Warta District, P.O. Box 4006, 86085-981 Londrina, Paraná, Brazil
- Silvana Regina Rockenbach Marin
  - Embrapa Soybean, Brazilian Agricultural Research Corporation, HWY Carlos João Strass, Warta District, P.O. Box 4006, 86085-981 Londrina, Paraná, Brazil
- Silvia Helena Santagneli
  - Institute of Chemistry, São Paulo State University (UNESP), Avenue Francisco Degni 55, CEP 14800-060 Araraquara, São Paulo, Brazil
- Alexandre Lima Nepomuceno
  - Embrapa Soybean, Brazilian Agricultural Research Corporation, HWY Carlos João Strass, Warta District, P.O. Box 4006, 86085-981 Londrina, Paraná, Brazil
- Luiz Alberto Colnago
  - Embrapa Instrumentation, Brazilian Agricultural Research Corporation, St. XV de Novembro 1452, P.O. Box 741, 13560-970 São Carlos, São Paulo, Brazil

7. de Jonge NF, Hecht H, Strobel M, Wang M, van der Hooft JJJ, Huber F. Reproducible MS/MS library cleaning pipeline in matchms. J Cheminform 2024; 16:88. [PMID: 39075613 PMCID: PMC11285329 DOI: 10.1186/s13321-024-00878-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Accepted: 07/09/2024] [Indexed: 07/31/2024] Open
Abstract
Mass spectral libraries have proven to be essential for mass spectrum annotation, both for library matching and training new machine learning algorithms. A key step in training machine learning models is the availability of high-quality training data. Public libraries of mass spectrometry data that are open to user submission often suffer from limited metadata curation and harmonization. The resulting variability in data quality makes training of machine learning models challenging. Here we present a library cleaning pipeline designed for cleaning tandem mass spectrometry library data. The pipeline is designed with ease of use, flexibility, and reproducibility as leading principles.
Scientific contribution: This pipeline will result in cleaner public mass spectral libraries that will improve library searching and the quality of machine-learning training datasets in mass spectrometry. This pipeline builds on previous work by adding new functionality for curating and correcting annotated libraries, by validating structure annotations. Due to the high quality of our software, the reproducibility, and improved logging, we think our new pipeline has the potential to become the standard in the field for cleaning tandem mass spectrometry libraries.
Collapse
Affiliation(s)
- Niek F de Jonge
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands.
| | - Helge Hecht
- Faculty of Science, RECETOX, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Michael Strobel
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, CA, 92521, USA
| | - Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, CA, 92521, USA
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands.
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg, 2006, South Africa.
| | - Florian Huber
- Centre for Digitalisation and Digitality, Düsseldorf University of Applied Sciences, 40476, Düsseldorf, Germany.
| |
|
8
|
Mitchell JM, Chi Y, Thapa M, Pang Z, Xia J, Li S. Common data models to streamline metabolomics processing and annotation, and implementation in a Python pipeline. PLoS Comput Biol 2024; 20:e1011912. [PMID: 38843301 PMCID: PMC11185459 DOI: 10.1371/journal.pcbi.1011912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 06/18/2024] [Accepted: 05/20/2024] [Indexed: 06/18/2024] Open
Abstract
To standardize metabolomics data analysis and facilitate future computational developments, it is essential to have a set of well-defined templates for common data structures. Here we describe a collection of data structures involved in metabolomics data processing and illustrate how they are utilized in a full-featured Python-centric pipeline. We demonstrate the performance of the pipeline, and the details in annotation and quality control using large-scale LC-MS metabolomics and lipidomics data and LC-MS/MS data. Multiple previously published datasets are also reanalyzed to showcase its utility in biological data analysis. This pipeline allows users to streamline data processing, quality control, annotation, and standardization in an efficient and transparent manner. This work fills a major gap in the Python ecosystem for computational metabolomics.
Affiliation(s)
- Joshua M. Mitchell
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Yuanye Chi
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Maheshwor Thapa
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
| | - Zhiqiang Pang
- Institute of Parasitology, McGill University, Montreal, Quebec, Canada
| | - Jianguo Xia
- Institute of Parasitology, McGill University, Montreal, Quebec, Canada
| | - Shuzhao Li
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, United States of America
- University of Connecticut School of Medicine, Farmington, Connecticut, United States of America
| |
|
9
|
Quiros-Guerrero LM, Allard PM, Nothias LF, David B, Grondin A, Wolfender JL. Comprehensive mass spectrometric metabolomic profiling of a chemically diverse collection of plants of the Celastraceae family. Sci Data 2024; 11:415. [PMID: 38649352 PMCID: PMC11035674 DOI: 10.1038/s41597-024-03094-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 02/27/2024] [Indexed: 04/25/2024] Open
Abstract
Natural products exhibit interesting structural features and significant biological activities. The discovery of new bioactive molecules is a complex process that requires high-quality metabolite profiling data to properly target the isolation of compounds of interest and enable their complete structural characterization. The same metabolite profiling data can also be used to better understand chemotaxonomic links between species. This Data Descriptor details a dataset resulting from the untargeted liquid chromatography-mass spectrometry metabolite profiling of 76 natural extracts of the Celastraceae family. The spectral annotation results and related chemical and taxonomic metadata are shared, along with proposed examples of data reuse. This data can be further studied by researchers exploring the chemical diversity of natural products. This can serve as a reference sample set for deep metabolome investigation of this chemically rich plant family.
Affiliation(s)
- Luis-Manuel Quiros-Guerrero
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, CMU, 1211, Geneva, Switzerland.
- School of Pharmaceutical Sciences, University of Geneva, CMU, 1211, Geneva, Switzerland.
| | | | - Louis-Felix Nothias
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, CMU, 1211, Geneva, Switzerland
- School of Pharmaceutical Sciences, University of Geneva, CMU, 1211, Geneva, Switzerland
| | - Bruno David
- Green Mission Department, Herbal Products Laboratory, Pierre Fabre Research Institute, Toulouse, France
| | - Antonio Grondin
- Green Mission Department, Herbal Products Laboratory, Pierre Fabre Research Institute, Toulouse, France
| | - Jean-Luc Wolfender
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, CMU, 1211, Geneva, Switzerland.
- School of Pharmaceutical Sciences, University of Geneva, CMU, 1211, Geneva, Switzerland.
| |
|
10
|
Mildau K, Ehlers H, Oesterle I, Pristner M, Warth B, Doppler M, Bueschl C, Zanghellini J, van der Hooft JJJ. Tailored Mass Spectral Data Exploration Using the SpecXplore Interactive Dashboard. Anal Chem 2024; 96:5798-5806. [PMID: 38564584 PMCID: PMC11024886 DOI: 10.1021/acs.analchem.3c04444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 03/21/2024] [Accepted: 03/21/2024] [Indexed: 04/04/2024]
Abstract
Untargeted metabolomics promises comprehensive characterization of small molecules in biological samples. However, the field is hampered by low annotation rates and abstract spectral data. Despite recent advances in computational metabolomics, manual annotations and manual confirmation of in-silico annotations remain important in the field. Here, exploratory data analysis methods for mass spectral data provide overviews, prioritization, and structural hypothesis starting points to researchers facing large quantities of spectral data. In this research, we propose a fluid means of dealing with mass spectral data using specXplore, an interactive Python dashboard providing interactive and complementary visualizations facilitating mass spectral similarity matrix exploration. Specifically, specXplore provides a two-dimensional t-distributed stochastic neighbor embedding as a jumping board for local connectivity exploration using complementary interactive visualizations in the form of partial network drawings, similarity heatmaps, and fragmentation overview maps. SpecXplore makes use of state-of-the-art ms2deepscore pairwise spectral similarities as a quantitative backbone while allowing fast changes of threshold and connectivity limitation settings, providing flexibility in adjusting settings to suit the localized node environment being explored. We believe that specXplore can become an integral part of mass spectral data exploration efforts and assist users in the generation of structural hypotheses for compounds of interest.
Affiliation(s)
- Kevin Mildau
- Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria
- Austrian Centre of Industrial Biotechnology (ACIB GmbH), 8010 Graz, Austria
- Doctoral School in Chemistry, University of Vienna, 1090 Vienna, Austria
| | - Henry Ehlers
- Institute of Visual Computing and Human-Centered Technology, TU Wien, 1040 Vienna, Austria
| | - Ian Oesterle
- Doctoral School in Chemistry, University of Vienna, 1090 Vienna, Austria
- Department of Food Chemistry and Toxicology, University of Vienna, 1090 Vienna, Austria
- Department of Biophysical Chemistry, University of Vienna, 1090 Vienna, Austria
| | - Manuel Pristner
- Doctoral School in Chemistry, University of Vienna, 1090 Vienna, Austria
- Department of Food Chemistry and Toxicology, University of Vienna, 1090 Vienna, Austria
| | - Benedikt Warth
- Department of Food Chemistry and Toxicology, University of Vienna, 1090 Vienna, Austria
| | - Maria Doppler
- University of Natural Resources and Life Sciences (BOKU), 3430 Tulln, Austria
| | - Christoph Bueschl
- University of Natural Resources and Life Sciences (BOKU), 3430 Tulln, Austria
| | - Jürgen Zanghellini
- Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria
| | - Justin J. J. van der Hooft
- Bioinformatics Group, Wageningen University, 6708PB Wageningen, The Netherlands
- Department of Biochemistry, University of Johannesburg, 2006 Johannesburg, South Africa
| |
|
11
|
Cajka T, Hricko J, Rakusanova S, Brejchova K, Novakova M, Rudl Kulhava L, Hola V, Paucova M, Fiehn O, Kuda O. Hydrophilic Interaction Liquid Chromatography-Hydrogen/Deuterium Exchange-Mass Spectrometry (HILIC-HDX-MS) for Untargeted Metabolomics. Int J Mol Sci 2024; 25:2899. [PMID: 38474147 DOI: 10.3390/ijms25052899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 02/17/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024] Open
Abstract
Liquid chromatography with mass spectrometry (LC-MS)-based metabolomics detects thousands of molecular features (retention time-m/z pairs) in biological samples per analysis, yet the metabolite annotation rate remains low, with 90% of signals classified as unknowns. To enhance the metabolite annotation rates, researchers employ tandem mass spectral libraries and challenging in silico fragmentation software. Hydrogen/deuterium exchange mass spectrometry (HDX-MS) may offer an additional layer of structural information in untargeted metabolomics, especially for identifying specific unidentified metabolites that are revealed to be statistically significant. Here, we investigate the potential of hydrophilic interaction liquid chromatography (HILIC)-HDX-MS in untargeted metabolomics. Specifically, we evaluate the effectiveness of two approaches using hypothetical targets: the post-column addition of deuterium oxide (D2O) and the on-column HILIC-HDX-MS method. To illustrate the practical application of HILIC-HDX-MS, we apply this methodology using the in silico fragmentation software MS-FINDER to an unknown compound detected in various biological samples, including plasma, serum, tissues, and feces during HILIC-MS profiling, subsequently identified as N1-acetylspermidine.
Affiliation(s)
- Tomas Cajka
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Jiri Hricko
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Stanislava Rakusanova
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Kristyna Brejchova
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Michaela Novakova
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Lucie Rudl Kulhava
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Veronika Hola
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Michaela Paucova
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| | - Oliver Fiehn
- West Coast Metabolomics Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
| | - Ondrej Kuda
- Institute of Physiology of the Czech Academy of Sciences, Videnska 1083, 14200 Prague, Czech Republic
| |
|
12
|
Cooper B, Yang R. An assessment of AcquireX and Compound Discoverer software 3.3 for non-targeted metabolomics. Sci Rep 2024; 14:4841. [PMID: 38418855 PMCID: PMC10902394 DOI: 10.1038/s41598-024-55356-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 02/22/2024] [Indexed: 03/02/2024] Open
Abstract
We used the Exploris 240 mass spectrometer for non-targeted metabolomics on Saccharomyces cerevisiae strain BY4741 and tested AcquireX software for increasing the number of detectable compounds and Compound Discoverer 3.3 software for identifying compounds by MS2 spectral library matching. AcquireX increased the number of potentially identifiable compounds by 50% through six iterations of MS2 acquisition. On the basis of high-scoring MS2 matches made by Compound Discoverer, there were 483 compounds putatively identified from nearly 8000 candidate spectra. Comparisons to 20 amino acid standards, however, revealed instances whereby compound matches could be incorrect despite strong scores. Situations included the candidate with the top score not being the correct compound, matching the same compound at two different chromatographic peaks, assigning the highest score to a library compound much heavier than the mass for the parent ion, and grouping MS2 isomers to a single parent ion. Because the software does not calculate false positive and false discovery rates at these multiple levels where such errors can propagate, we conclude that manual examination of findings will be required post software analysis. These results will interest scientists who may use this platform for metabolomics research in diverse disciplines including medical science, environmental science, and agriculture.
Affiliation(s)
- Bret Cooper
- Soybean Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, 20705, USA.
| | - Ronghui Yang
- Soybean Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, 20705, USA
| |
|
13
|
Yan Y, Hemmler D, Schmitt-Kopplin P. Discovery of Glycation Products: Unraveling the Unknown Glycation Space Using a Mass Spectral Library from In Vitro Model Systems. Anal Chem 2024; 96:3569-3577. [PMID: 38346319 PMCID: PMC10902809 DOI: 10.1021/acs.analchem.3c05540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
The nonenzymatic reaction between amino acids (AAs) and reducing sugars, also known as the Maillard reaction, is the primary source of free glycation products (GPs) in vivo and in vitro. The limited number of MS/MS records for GPs in public libraries hinders the annotation and investigation of nonenzymatic glycation. To address this issue, we present a mass spectral library containing the experimental MS/MS spectra of diverse GPs from model systems. Based on the conceptual reaction processes and structural characteristics of products, we classified GPs into common GPs (CGPs) and modified AAs (MAAs). A workflow for annotating GPs was established based on the structural and fragmentation patterns of each GP type. The final spectral library contains 157 CGPs, 499 MAAs, and 2426 GP spectra with synthetic model system information, retention time, precursor m/z, MS/MS, and annotations. As a proof-of-concept, we demonstrated the use of the library for screening GPs in unidentified spectra of human plasma and urine. The AAs with the C6H10O5 modification, fructosylation from Amadori rearrangement, were the most frequently found GPs. With the help of the model system, we confirmed the existence of C6H10O5-modified valine in human plasma by matching retention time, MS1, and MS/MS without reference standards. In summary, our GP library can serve as an online resource to quickly screen possible GPs in an untargeted metabolomics workflow, with the model system serving as a practical synthesis method to confirm their identity.
Affiliation(s)
- Yingfei Yan
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany
| | - Daniel Hemmler
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Chair of Analytical Food Chemistry, Technical University of Munich, Freising 85354, Germany
| | - Philippe Schmitt-Kopplin
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Chair of Analytical Food Chemistry, Technical University of Munich, Freising 85354, Germany
| |
|
14
|
Mitchell JM, Chi Y, Thapa M, Pang Z, Xia J, Li S. Common data models to streamline metabolomics processing and annotation, and implementation in a Python pipeline. bioRxiv 2024:2024.02.13.580048. [PMID: 38405981 PMCID: PMC10888883 DOI: 10.1101/2024.02.13.580048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
To standardize metabolomics data analysis and facilitate future computational developments, it is essential to have a set of well-defined templates for common data structures. Here we describe a collection of data structures involved in metabolomics data processing and illustrate how they are utilized in a full-featured Python-centric pipeline. We demonstrate the performance of the pipeline, and the details in annotation and quality control using large-scale LC-MS metabolomics and lipidomics data and LC-MS/MS data. Multiple previously published datasets are also reanalyzed to showcase its utility in biological data analysis. This pipeline allows users to streamline data processing, quality control, annotation, and standardization in an efficient and transparent manner. This work fills a major gap in the Python ecosystem for computational metabolomics.
Affiliation(s)
- Joshua M. Mitchell
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Yuanye Chi
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Maheshwor Thapa
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
| | - Zhiqiang Pang
- Institute of Parasitology, McGill University, Montreal, Quebec, Canada
| | - Jianguo Xia
- Institute of Parasitology, McGill University, Montreal, Quebec, Canada
| | - Shuzhao Li
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- University of Connecticut School of Medicine, Farmington, CT 06032, USA
| |
|
15
|
Olkowicz M, Ramadan K, Rosales-Solano H, Yu M, Wang A, Cypel M, Pawliszyn J. Mapping the metabolic responses to oxaliplatin-based chemotherapy with in vivo spatiotemporal metabolomics. J Pharm Anal 2024; 14:196-210. [PMID: 38464782 PMCID: PMC10921245 DOI: 10.1016/j.jpha.2023.08.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 07/14/2023] [Accepted: 08/07/2023] [Indexed: 03/12/2024] Open
Abstract
Adjuvant chemotherapy improves the survival outlook for patients undergoing operations for lung metastases caused by colorectal cancer (CRC). However, a multidisciplinary approach that evaluates several factors related to patient and tumor characteristics is necessary for managing chemotherapy treatment in metastatic CRC patients with lung disease, as such factors dictate the timing and drug regimen, which may affect treatment response and prognosis. In this study, we explore the potential of spatial metabolomics for evaluating metabolic phenotypes and therapy outcomes during the local delivery of the anticancer drug oxaliplatin to the lung. Twelve male Yorkshire pigs underwent a 3 h left lung in vivo lung perfusion (IVLP) with various doses of oxaliplatin (7.5, 10, 20, 40, and 80 mg/L), which were administered to the perfusion circuit reservoir as a bolus. Biocompatible solid-phase microextraction (SPME) microprobes were combined with global metabolite profiling to obtain spatiotemporal information about the activity of the drug, determine toxic doses that exceed therapeutic efficacy, and conduct a mechanistic exploration of associated lung injury. Mild and subclinical lung injury was observed at 40 mg/L of oxaliplatin, and significant compromise of the hemodynamic lung function was found at 80 mg/L. This result was associated with massive alterations in metabolic patterns of lung tissue and perfusate, resulting in a total of 139 discriminant compounds. Uncontrolled inflammatory response, abnormalities in energy metabolism, and mitochondrial dysfunction, next to accelerated kynurenine and aldosterone production, were recognized as distinct features of a dysregulated metabolipidome. Spatial pharmacometabolomics may be a promising tool for identifying pathological responses to chemotherapy.
Affiliation(s)
- Mariola Olkowicz
- Department of Chemistry, University of Waterloo, Waterloo, ON, Canada
- Jagiellonian Centre for Experimental Therapeutics (JCET), Jagiellonian University, Krakow, Poland
| | - Khaled Ramadan
- Latner Thoracic Surgery Research Laboratories, Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
| | | | - Miao Yu
- The Jackson Laboratory, JAX Genomic Medicine, Farmington, CT, USA
| | - Aizhou Wang
- Latner Thoracic Surgery Research Laboratories, Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
| | - Marcelo Cypel
- Latner Thoracic Surgery Research Laboratories, Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
- Division of Thoracic Surgery, Department of Surgery, University Health Network, University of Toronto, Toronto Lung Transplant Program, Toronto, ON, Canada
| | - Janusz Pawliszyn
- Department of Chemistry, University of Waterloo, Waterloo, ON, Canada
| |
|
16
|
Matey JM, Zapata F, Menéndez-Quintanal LM, Montalvo G, García-Ruiz C. Identification of new psychoactive substances and their metabolites using non-targeted detection with high-resolution mass spectrometry through diagnosing fragment ions/neutral loss analysis. Talanta 2023; 265:124816. [PMID: 37423179 DOI: 10.1016/j.talanta.2023.124816] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/30/2023] [Revised: 04/24/2023] [Accepted: 06/12/2023] [Indexed: 07/11/2023]
Affiliation(s)
- José Manuel Matey
- Department of Chemistry and Drugs, National Institute of Toxicology and Forensic Sciences, C/ José Echegaray Nº4, 28232, Las Rozas de Madrid, Madrid, Spain; Universidad de Alcalá, Instituto Universitario de Investigación en Ciencias Policiales (IUICP), calle Libreros 27, 28801, Alcalá de Henares, Madrid, Spain; Chemical and Forensic Sciences (CINQUIFOR) Research Group, University of Alcalá, Ctra. Madrid-Barcelona km 33.600, 28871, Alcalá de Henares, Madrid, Spain.
| | - Félix Zapata
- Department of Analytical Chemistry, University of Murcia, Campus Espinardo, 30100, Murcia, Spain.
| | - Luis Manuel Menéndez-Quintanal
- Department of Chemistry and Drugs, National Institute of Toxicology and Forensic Sciences, Campus de Ciencias de la Salud, La Cuesta, 38320, La Laguna (Sta. Cruz de Tenerife), Spain.
| | - Gemma Montalvo
- Universidad de Alcalá, Instituto Universitario de Investigación en Ciencias Policiales (IUICP), calle Libreros 27, 28801, Alcalá de Henares, Madrid, Spain; Chemical and Forensic Sciences (CINQUIFOR) Research Group, University of Alcalá, Ctra. Madrid-Barcelona km 33.600, 28871, Alcalá de Henares, Madrid, Spain; Universidad de Alcalá, Departamento de Química Analítica, Química Física e Ingeniería Química, Ctra. Madrid-Barcelona km 33,6, 28871 Alcalá de Henares, Madrid, Spain.
| | - Carmen García-Ruiz
- Universidad de Alcalá, Instituto Universitario de Investigación en Ciencias Policiales (IUICP), calle Libreros 27, 28801, Alcalá de Henares, Madrid, Spain; Chemical and Forensic Sciences (CINQUIFOR) Research Group, University of Alcalá, Ctra. Madrid-Barcelona km 33.600, 28871, Alcalá de Henares, Madrid, Spain; Universidad de Alcalá, Departamento de Química Analítica, Química Física e Ingeniería Química, Ctra. Madrid-Barcelona km 33,6, 28871 Alcalá de Henares, Madrid, Spain.
| |
|
17
|
Hu G, Qiu M. Machine learning-assisted structure annotation of natural products based on MS and NMR data. Nat Prod Rep 2023; 40:1735-1753. [PMID: 37519196 DOI: 10.1039/d3np00025g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
Covering: up to March 2023. Machine learning (ML) has emerged as a popular tool for analyzing the structures of natural products (NPs). This review presents a summary of the recent advancements in ML-assisted mass spectrometry (MS) and nuclear magnetic resonance (NMR) data analysis to establish the chemical structures of NPs. First, ML-based MS/MS analyses that rely on library matching are discussed, which involve the use of ML algorithms to calculate similarity, predict MS/MS fragments, and generate molecular fingerprints. Then, ML-assisted MS/MS structural annotation without library matching is reviewed. Furthermore, the use of ML algorithms in assisting structural studies of NPs based on NMR is discussed from four perspectives: NMR prediction, functional group identification, structural categorization, and quantum chemical calculation. Finally, the review concludes with a discussion of the challenges and trends associated with the structural establishment of NPs based on ML algorithms.
Affiliation(s)
- Guilin Hu
- State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China.
- University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
| | - Minghua Qiu
- State Key Laboratory of Phytochemistry and Plant Resources in West China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China.
- University of the Chinese Academy of Sciences, Beijing 100049, People's Republic of China
| |
|
18
|
Dean DA, Roach J, Ulrich vonBargen R, Xiong Y, Kane SS, Klechka L, Wheeler K, Jimenez Sandoval M, Lesani M, Hossain E, Katemauswa M, Schaefer M, Harris M, Barron S, Liu Z, Pan C, McCall LI. Persistent Biofluid Small-Molecule Alterations Induced by Trypanosoma cruzi Infection Are Not Restored by Parasite Elimination. ACS Infect Dis 2023; 9:2173-2189. [PMID: 37883691 PMCID: PMC10842590 DOI: 10.1021/acsinfecdis.3c00261] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/28/2023]
Abstract
Chagas disease (CD), caused by Trypanosoma cruzi (T. cruzi) protozoa, is a complicated parasitic illness with inadequate medical measures for diagnosing infection and monitoring treatment success. To address this gap, we analyzed changes in the metabolome of T. cruzi-infected mice via liquid chromatography tandem mass spectrometry of clinically accessible biofluids: saliva, urine, and plasma. Urine was the most indicative of infection status across mouse and parasite genotypes. Metabolites perturbed by infection in urine include kynurenate, acylcarnitines, and threonylcarbamoyladenosine. Based on these results, we sought to implement urine as a tool for the assessment of CD treatment success. Strikingly, it was found that mice with parasite clearance following benznidazole antiparasitic treatment had an overall urine metabolome comparable to that of mice that failed to clear parasites. These results provide a complementary hypothesis to explain clinical trial data in which benznidazole treatment did not improve patient outcomes in late-stage disease, even in patients with successful parasite clearance. Overall, this study provides insights into new small-molecule-based CD diagnostic methods and a new approach to assess functional responses to treatment.
Affiliation(s)
- Danya A. Dean
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
| | - Jarrod Roach
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
| | | | - Yi Xiong
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, 73019, USA
| | - Shelley S. Kane
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
| | - London Klechka
- Department of Biology, University of Oklahoma, Norman, OK, 73019, USA
| | - Kate Wheeler
- Department of Biology, University of Oklahoma, Norman, OK, 73019, USA
| | | | - Mahbobeh Lesani
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, 73019, USA
| | - Ekram Hossain
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
| | - Mitchelle Katemauswa
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
| | - Miranda Schaefer
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
| | - Morgan Harris
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
| | - Sayre Barron
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
| | - Zongyuan Liu
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
| | - Chongle Pan
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, 73019, USA
| | - Laura-Isobel McCall
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, 73019, USA
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, 73019, USA
| |
|
19
|
Liu Z, Ulrich vonBargen R, Kendricks AL, Wheeler K, Leão AC, Sankaranarayanan K, Dean DA, Kane SS, Hossain E, Pollet J, Bottazzi ME, Hotez PJ, Jones KM, McCall LI. Localized cardiac small molecule trajectories and persistent chemical sequelae in experimental Chagas disease. Nat Commun 2023; 14:6769. [PMID: 37880260 PMCID: PMC10600178 DOI: 10.1038/s41467-023-42247-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 10/04/2023] [Indexed: 10/27/2023] Open
Abstract
Post-infectious conditions present major health burdens but remain poorly understood. In Chagas disease (CD), caused by Trypanosoma cruzi parasites, antiparasitic agents that successfully clear T. cruzi do not always improve clinical outcomes. In this study, we reveal differential small molecule trajectories between cardiac regions during chronic T. cruzi infection, matching with characteristic CD apical aneurysm sites. Incomplete, region-specific, cardiac small molecule restoration is observed in animals treated with the antiparasitic benznidazole. In contrast, superior restoration of the cardiac small molecule profile is observed for a combination treatment of reduced-dose benznidazole plus an immunotherapy, even with less parasite burden reduction. Overall, these results reveal molecular mechanisms of CD treatment based on simultaneous effects on the pathogen and on host small molecule responses, and expand our understanding of clinical treatment failure in CD. This link between infection and subsequent persistent small molecule perturbation broadens our understanding of infectious disease sequelae.
Affiliation(s)
- Zongyuan Liu
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Rebecca Ulrich vonBargen
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Department of Biomedical Engineering, University of Oklahoma, Norman, OK, USA
- Kate Wheeler
- Department of Biology, University of Oklahoma, Norman, OK, USA
- Ana Carolina Leão
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Krithivasan Sankaranarayanan
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA
- Danya A Dean
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Shelley S Kane
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Ekram Hossain
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Jeroen Pollet
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Maria Elena Bottazzi
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
- Peter J Hotez
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
- Kathryn M Jones
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
- Laura-Isobel McCall
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA
- Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA, USA
20
Redick MA, Cummings ME, Neuhaus GF, Ardor Bellucci LM, Thurber AR, McPhail KL. Integration of Untargeted Metabolomics and Microbial Community Analyses to Characterize Distinct Deep-Sea Methane Seeps. Front Mar Sci 2023; 10:1197338. [PMID: 39268414 PMCID: PMC11392061 DOI: 10.3389/fmars.2023.1197338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
Deep-sea methane seeps host highly diverse microbial communities whose biological diversity is distinct from other marine habitats. Coupled with microbial community analysis, untargeted metabolomics of environmental samples using high resolution tandem mass spectrometry provides unprecedented access to the unique specialized metabolisms of these chemosynthetic microorganisms. In addition, the diverse microbial natural products are of broad interest due to their potential applications for human and environmental health and well-being. In this exploratory study, sediment cores were collected from two methane seeps (-1000 m water depth) with very different gross geomorphologies, as well as a non-seep control site. Cores were subjected to parallel metabolomic and microbial community analyses to assess the feasibility of representative metabolite detection and identify congruent patterns between metabolites and microbes. Metabolomes generated using high resolution liquid chromatography tandem mass spectrometry were annotated with predicted structure classifications of the majority of mass features using SIRIUS and CANOPUS. The microbiome was characterized by analysis of 16S rRNA genes and analyzed both at the whole community level, as well as the small subgroup of Actinobacteria, which are known to produce societally useful compounds. Overall, the younger Dagorlad seep possessed a greater abundance of metabolites while there was more variation in abundance, number, and distribution of metabolites between samples at the older Emyn Muil seep. Lipid and lipid-like molecules displayed the greatest variation between sites and accounted for a larger proportion of metabolites found at the older seep. Overall, significant differences in composition of the microbial community mirrored the patterns of metabolite diversity within the samples; both varied greatly as a function of distance from methane seep, indicating a deterministic role of seepage. 
Interdisciplinary research to understand microbial and metabolic diversity is essential for understanding the processes and role of ubiquitous methane seeps in global systems and here we increase understanding of these systems by visualizing some of the chemical diversity that seeps add to marine systems.
Affiliation(s)
- Margaret A Redick
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, Oregon, USA
- Milo E Cummings
- Department of Microbiology, College of Science, Oregon State University, Corvallis, Oregon, USA
- George F Neuhaus
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, Oregon, USA
- Lila M Ardor Bellucci
- College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, Oregon, USA
- Andrew R Thurber
- Department of Microbiology, College of Science, Oregon State University, Corvallis, Oregon, USA
- College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, Oregon, USA
- Kerry L McPhail
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, Oregon, USA
21
Baker JL. Illuminating the oral microbiome and its host interactions: recent advancements in omics and bioinformatics technologies in the context of oral microbiome research. FEMS Microbiol Rev 2023; 47:fuad051. [PMID: 37667515 PMCID: PMC10503653 DOI: 10.1093/femsre/fuad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 08/02/2023] [Accepted: 09/01/2023] [Indexed: 09/06/2023] Open
Abstract
The oral microbiota has an enormous impact on human health, with oral dysbiosis now linked to many oral and systemic diseases. Recent advancements in sequencing, mass spectrometry, bioinformatics, computational biology, and machine learning are revolutionizing oral microbiome research, enabling analysis at an unprecedented scale and level of resolution using omics approaches. This review contains a comprehensive perspective of the current state-of-the-art tools available to perform genomics, metagenomics, phylogenomics, pangenomics, transcriptomics, proteomics, metabolomics, lipidomics, and multi-omics analysis on (all) microbiomes, and then provides examples of how the techniques have been applied to research of the oral microbiome, specifically. Key findings of these studies and remaining challenges for the field are highlighted. Although the methods discussed here are placed in the context of their contributions to oral microbiome research specifically, they are pertinent to the study of any microbiome, and the intended audience of this review includes researchers who would simply like to get an introduction to microbial omics and/or an update on the latest omics methods. Continued research of the oral microbiota using omics approaches is crucial and will lead to dramatic improvements in human health, longevity, and quality of life.
Affiliation(s)
- Jonathon L Baker
- Department of Oral Rehabilitation & Biosciences, School of Dentistry, Oregon Health & Science University, 3181 Sam Jackson Park Road, Portland, OR 97202, United States
- Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA 92037, United States
- Department of Pediatrics, UC San Diego School of Medicine, La Jolla, CA 92093, United States
22
Boruta T. Computation-aided studies related to the induction of specialized metabolite biosynthesis in microbial co-cultures: An introductory overview. Comput Struct Biotechnol J 2023; 21:4021-4029. [PMID: 37649711 PMCID: PMC10462793 DOI: 10.1016/j.csbj.2023.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 08/14/2023] [Accepted: 08/14/2023] [Indexed: 09/01/2023] Open
Abstract
Co-cultivation is an effective method of inducing the production of specialized metabolites (SMs) in microbial strains. By mimicking the ecological interactions that take place in the natural environment, this approach makes it possible to trigger the biosynthesis of molecules which are not formed under monoculture conditions. Importantly, microbial co-cultivation may lead to the discovery of novel chemical entities of pharmaceutical interest. The experimental efforts aimed at the induction of SMs are greatly facilitated by computational techniques. The aim of this overview is to highlight the relevance of computational methods for the investigation of SM induction via microbial co-cultivation. The concepts related to the induction of SMs in microbial co-cultures are briefly introduced by addressing four areas associated with the SM induction workflows, namely the detection of SMs formed exclusively under co-culture conditions, the annotation of induced SMs, the identification of SM producer strains, and the optimization of fermentation conditions. The computational infrastructure associated with these areas, including the tools of multivariate data analysis, molecular networking, genome mining and mathematical optimization, is discussed in relation to the experimental results described in recent literature. A perspective on future developments in the field, mainly in relation to microbiome-related research, is also provided.
Affiliation(s)
- Tomasz Boruta
- Lodz University of Technology, Faculty of Process and Environmental Engineering, Department of Bioprocess Engineering, ul. Wólczańska 213, 93-005 Łódź, Poland
23
Karunaratne E, Hill DW, Dührkop K, Böcker S, Grant DF. Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification. Anal Chem 2023; 95:11901-11907. [PMID: 37540774 DOI: 10.1021/acs.analchem.3c00937] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/06/2023]
Abstract
The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
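The rank-averaging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `composite_ranking`, the score dictionaries, the convention that a higher score is better, and the tie-breaking rule (prefer the better mass-spectral rank) are all assumptions made for illustration. Only the shortlist size of 1000 and the averaging of the two rankings come from the abstract.

```python
def composite_ranking(ms_scores, ir_scores, top_n=1000):
    """Order candidates by the average of two per-method ranks (1 = best).

    ms_scores / ir_scores: hypothetical dicts mapping candidate id -> score,
    where a higher score means a better match (an assumption of this sketch).
    """
    def to_ranks(scores):
        # Sort candidates from best to worst score and number them from 1.
        ordered = sorted(scores, key=scores.get, reverse=True)
        return {cand: pos for pos, cand in enumerate(ordered, start=1)}

    ms_rank = to_ranks(ms_scores)
    # Only the top_n candidates from the first ranking go on to the IR step.
    shortlist = [c for c in ms_rank if ms_rank[c] <= top_n]
    ir_rank = to_ranks({c: ir_scores[c] for c in shortlist})

    # Composite score: plain average of the two rank positions.
    composite = {c: (ms_rank[c] + ir_rank[c]) / 2 for c in shortlist}
    # Ties are broken by the first ranking (assumed rule, not from the source).
    return sorted(shortlist, key=lambda c: (composite[c], ms_rank[c]))


# Toy example with four made-up candidates.
ms = {"A": 0.9, "B": 0.8, "C": 0.4, "D": 0.3}
ir = {"A": 0.3, "B": 0.9, "C": 0.1, "D": 0.8}
print(composite_ranking(ms, ir))
```

Averaging rank positions rather than raw scores avoids having to put the two scoring methods on a common scale, which is presumably part of the appeal of such a composite.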
Affiliation(s)
- Erandika Karunaratne
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
- Dennis W Hill
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
- Kai Dührkop
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
- Sebastian Böcker
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
- David F Grant
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
24
Zdouc MM, van der Hooft JJJ, Medema MH. Metabolome-guided genome mining of RiPP natural products. Trends Pharmacol Sci 2023; 44:532-541. [PMID: 37391295 DOI: 10.1016/j.tips.2023.06.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/24/2023] [Revised: 06/12/2023] [Accepted: 06/12/2023] [Indexed: 07/02/2023]
Abstract
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a chemically diverse class of metabolites. Many RiPPs show potent biological activities that make them attractive starting points for drug development. A promising approach for the discovery of new classes of RiPPs is genome mining. However, the accuracy of genome mining is hampered by the lack of signature genes shared across different RiPP classes. One way to reduce false-positive predictions is by complementing genomic information with metabolomics data. In recent years, several new approaches addressing such integrative genomics and metabolomics analyses have been developed. In this review, we provide a detailed discussion of RiPP-compatible software tools that integrate paired genomics and metabolomics data. We highlight current challenges in data integration and identify opportunities for further developments targeting new classes of bioactive RiPPs.
Affiliation(s)
- Mitja M Zdouc
- Bioinformatics Group, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands; Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
- Marnix H Medema
- Bioinformatics Group, Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
25
Ebbels TMD, van der Hooft JJJ, Chatelaine H, Broeckling C, Zamboni N, Hassoun S, Mathé EA. Recent advances in mass spectrometry-based computational metabolomics. Curr Opin Chem Biol 2023; 74:102288. [PMID: 36966702 PMCID: PMC11075003 DOI: 10.1016/j.cbpa.2023.102288] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 10/24/2022] [Revised: 02/16/2023] [Accepted: 02/21/2023] [Indexed: 04/03/2023]
Abstract
The computational metabolomics field brings together computer scientists, bioinformaticians, chemists, clinicians, and biologists to maximize the impact of metabolomics across a wide array of scientific and medical disciplines. The field continues to expand as modern instrumentation produces datasets with increasing complexity, resolution, and sensitivity. These datasets must be processed, annotated, modeled, and interpreted to enable biological insight. Techniques for visualization, integration (within or between omics), and interpretation of metabolomics data have evolved along with innovation in the databases and knowledge resources required to aid understanding. In this review, we highlight recent advances in the field and reflect on opportunities and innovations in response to the most pressing challenges. This review was compiled from discussions from the 2022 Dagstuhl seminar entitled "Computational Metabolomics: From Spectra to Knowledge".
Affiliation(s)
- Timothy M D Ebbels
- Section of Bioinformatics, Department of Metabolism, Digestion & Reproduction, Imperial College London, Burlington Danes Building, Hammersmith Hospital, Du Cane Road, London W12 0NN, UK
- Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, Wageningen 6708 PB, the Netherlands; Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
- Haley Chatelaine
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA
- Corey Broeckling
- Bioanalysis and Omics Center, Analytical Resources Core, Colorado State University, Fort Collins, CO, USA
- Nicola Zamboni
- Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Soha Hassoun
- Department of Computer Science, Tufts University, Medford, MA, USA; Department of Chemical and Biological Engineering, Tufts University, Medford, MA, USA
- Ewy A Mathé
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA
26
de Jonge NF, Louwen JJR, Chekmeneva E, Camuzeaux S, Vermeir FJ, Jansen RS, Huber F, van der Hooft JJJ. MS2Query: reliable and scalable MS2 mass spectra-based analogue search. Nat Commun 2023; 14:1752. [PMID: 36990978 PMCID: PMC10060387 DOI: 10.1038/s41467-023-37446-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 08/10/2022] [Accepted: 03/15/2023] [Indexed: 03/31/2023] Open
Abstract
Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules which are not exact matches but display a high chemical similarity. However, current analogue search implementations are not yet very reliable and are relatively slow. Here, we present MS2Query, a machine learning-based tool that integrates mass spectral embedding-based chemical similarity predictors (Spec2Vec and MS2Deepscore) as well as detected precursor masses to rank potential analogues and exact matches. Benchmarking MS2Query on reference mass spectra and experimental case studies demonstrates improved reliability and scalability. Thereby, MS2Query offers exciting opportunities to further increase the annotation rate of metabolomics profiles of complex metabolite mixtures and to discover new biology.
Affiliation(s)
- Niek F de Jonge
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands
- Joris J R Louwen
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands
- Elena Chekmeneva
- National Phenome Centre, Section of Bioanalytical Chemistry, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London, W12 0NN, UK
- Stephane Camuzeaux
- National Phenome Centre, Section of Bioanalytical Chemistry, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London, W12 0NN, UK
- Femke J Vermeir
- Department of Microbiology, Radboud Institute for Biological and Environmental Sciences, Radboud University, 6525ED, Nijmegen, the Netherlands
- Robert S Jansen
- Department of Microbiology, Radboud Institute for Biological and Environmental Sciences, Radboud University, 6525ED, Nijmegen, the Netherlands
- Florian Huber
- Centre for Digitalization and Digitality (ZDD), University of Applied Sciences Düsseldorf, Düsseldorf, Germany
- Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg, 2006, South Africa