1
Chi J, Shu J, Li M, Mudappathi R, Jin Y, Lewis F, Boon A, Qin X, Liu L, Gu H. Artificial Intelligence in Metabolomics: A Current Review. Trends Analyt Chem 2024; 178:117852. [PMID: 39071116 PMCID: PMC11271759 DOI: 10.1016/j.trac.2024.117852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Metabolomics and artificial intelligence (AI) form a synergistic partnership. Metabolomics generates large datasets comprising hundreds to thousands of metabolites with complex relationships. AI, aiming to mimic human intelligence through computational modeling, possesses extraordinary capabilities for big data analysis. In this review, we provide an up-to-date overview of the methodologies and applications of AI in metabolomics studies in the context of systems biology and human health. We first introduce the AI concept, history, and key algorithms for machine learning and deep learning, summarizing their strengths and weaknesses. We then discuss studies that have successfully used AI across different aspects of metabolomic analysis, including analytical detection, data preprocessing, biomarker discovery, predictive modeling, and multi-omics data integration. Lastly, we discuss the existing challenges and future perspectives in this rapidly evolving field. Despite limitations and challenges, the combination of metabolomics and AI holds great promise for revolutionary advancements in enhancing human health.
Affiliation(s)
- Jinhua Chi
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Jingmin Shu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Ming Li
  - Phoenix VA Health Care System, Phoenix, AZ 85012, USA
  - University of Arizona College of Medicine, Phoenix, AZ 85004, USA
- Rekha Mudappathi
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Yan Jin
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Freeman Lewis
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Alexandria Boon
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Xiaoyan Qin
  - College of Liberal Arts and Sciences, Arizona State University, Tempe, AZ 85281, USA
- Li Liu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Haiwei Gu
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
  - Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
2
Beck A, Muhoberac M, Randolph CE, Beveridge CH, Wijewardhane PR, Kenttämaa HI, Chopra G. Recent Developments in Machine Learning for Mass Spectrometry. ACS Measurement Science Au 2024; 4:233-246. [PMID: 38910862 PMCID: PMC11191731 DOI: 10.1021/acsmeasuresciau.3c00060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 12/27/2023] [Accepted: 01/22/2024] [Indexed: 06/25/2024]
Abstract
Statistical analysis and modeling of mass spectrometry (MS) data have a long and rich history with several modern MS-based applications using statistical and chemometric methods. Recently, machine learning (ML) has experienced a renaissance due to advents in computational hardware and the development of new algorithms for artificial neural networks (ANN) and deep learning architectures. Moreover, recent successes of new ANN and deep learning architectures in several areas of science, engineering, and society have further strengthened the ML field. Importantly, modern ML methods and architectures have enabled new approaches for tasks related to MS that are now widely adopted in several popular MS-based subdisciplines, such as mass spectrometry imaging and proteomics. Herein, we aim to provide an introductory summary of the practical aspects of ML methodology relevant to MS. Additionally, we seek to provide an up-to-date review of the most recent developments in ML integration with MS-based techniques while also providing critical insights into the future direction of the field.
Affiliation(s)
- Armen G. Beck
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Matthew Muhoberac
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Caitlin E. Randolph
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Connor H. Beveridge
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Prageeth R. Wijewardhane
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Hilkka I. Kenttämaa
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
- Gaurav Chopra
  - Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
  - Department of Computer Science (by courtesy), Purdue University, West Lafayette, Indiana 47907, United States
  - Purdue Institute for Drug Discovery, Purdue Institute for Cancer Research, Regenstrief Center for Healthcare Engineering, Purdue Institute for Inflammation, Immunology and Infectious Disease, Purdue Institute for Integrative Neuroscience, West Lafayette, Indiana 47907, United States
3
Luong KD, Singh A. Application of Transformers in Cheminformatics. J Chem Inf Model 2024; 64:4392-4409. [PMID: 38815246 PMCID: PMC11167597 DOI: 10.1021/acs.jcim.3c02070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 04/05/2024] [Accepted: 05/06/2024] [Indexed: 06/01/2024]
Abstract
By accelerating time-consuming processes with high efficiency, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns within chemical data and utilize this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires complex machine learning architectures with great learning power. Recently, learning models based on transformer architectures have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing endeavors in adopting these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitate a comprehensive summarization of existing works. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion based on chemical representations. Specifically, we highlight the strengths and weaknesses of each representation, the current progress of adapting transformer architectures, and future directions.
Affiliation(s)
- Kha-Dinh Luong
  - Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, United States
- Ambuj Singh
  - Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, United States
4
Cuahtecontzi Delint R, Jaffery H, Ishak MI, Nobbs AH, Su B, Dalby MJ. Mechanotransducive surfaces for enhanced cell osteogenesis, a review. Biomaterials Advances 2024; 160:213861. [PMID: 38663159 DOI: 10.1016/j.bioadv.2024.213861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 03/31/2024] [Accepted: 04/12/2024] [Indexed: 05/04/2024]
Abstract
Novel strategies employing mechano-transducing materials eliciting biological outcomes have recently emerged for controlling cellular behaviour. Targeted cellular responses are achieved by manipulating physical, chemical, or biochemical modification of material properties. Advances in techniques such as nanopatterning, chemical modification, biochemical molecule embedding, force-tuneable materials, and artificial extracellular matrices are helping understand cellular mechanotransduction. Collectively, these strategies manipulate cellular sensing and regulate signalling cascades including focal adhesions, YAP-TAZ transcription factors, and multiple osteogenic pathways. In this minireview, we are providing a summary of the influence that these materials, particularly titanium-based orthopaedic materials, have on cells. We also highlight recent complementary methodological developments including, but not limited to, the use of metabolomics for identification of active biomolecules that drive cellular differentiation.
Affiliation(s)
- Rosalia Cuahtecontzi Delint
  - Centre for the Cellular Microenvironment, Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Hussain Jaffery
  - Centre for the Cellular Microenvironment, Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Mohd I Ishak
  - Bristol Dental School, University of Bristol, Lower Maudlin Street, Bristol BS1 2LY, UK
- Angela H Nobbs
  - Bristol Dental School, University of Bristol, Lower Maudlin Street, Bristol BS1 2LY, UK
- Bo Su
  - Bristol Dental School, University of Bristol, Lower Maudlin Street, Bristol BS1 2LY, UK
- Matthew J Dalby
  - Centre for the Cellular Microenvironment, Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
5
Pretorius E, Kell DB. A Perspective on How Fibrinaloid Microclots and Platelet Pathology May be Applied in Clinical Investigations. Semin Thromb Hemost 2024; 50:537-551. [PMID: 37748515 PMCID: PMC11105946 DOI: 10.1055/s-0043-1774796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]
Abstract
Microscopy imaging has enabled us to establish the presence of fibrin(ogen) amyloid (fibrinaloid) microclots in a range of chronic, inflammatory diseases. Microclots may also be induced by a variety of purified substances, often at very low concentrations. These molecules include bacterial inflammagens, serum amyloid A, and the S1 spike protein of severe acute respiratory syndrome coronavirus 2. Here, we explore which of the properties of these microclots might be used to contribute to differential clinical diagnoses and prognoses of the various diseases with which they may be associated. Such properties include distributions in their size and number before and after the addition of exogenous thrombin, their spectral properties, the diameter of the fibers of which they are made, their resistance to proteolysis by various proteases, their cross-seeding ability, and the concentration dependence of their ability to bind small molecules including fluorogenic amyloid stains. Measuring these microclot parameters, together with microscopy imaging itself, along with methodologies like proteomics and imaging flow cytometry, as well as more conventional assays such as those for cytokines, might open up the possibility of a much finer use of these microclot properties in generative methods for a future where personalized medicine will be standard procedures in all clotting pathology disease diagnoses.
Affiliation(s)
- Etheresia Pretorius
  - Department of Physiological Sciences, Faculty of Science, Stellenbosch University, Stellenbosch, Matieland, South Africa
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, United Kingdom
- Douglas B. Kell
  - Department of Physiological Sciences, Faculty of Science, Stellenbosch University, Stellenbosch, Matieland, South Africa
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, United Kingdom
  - The Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Lyngby, Denmark
6
Bui-Thi D, Liu Y, Lippens JL, Laukens K, De Vijlder T. TransExION: a transformer based explainable similarity metric for comparing IONS in tandem mass spectrometry. J Cheminform 2024; 16:61. [PMID: 38807166 PMCID: PMC11134763 DOI: 10.1186/s13321-024-00858-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 05/12/2024] [Indexed: 05/30/2024]
Abstract
Small molecule identification is a crucial task in analytical chemistry and life sciences. One of the most commonly used technologies to elucidate small molecule structures is mass spectrometry. Spectral library search of product ion spectra (MS/MS) is a popular strategy to identify or find structural analogues. This approach relies on the assumption that spectral similarity and structural similarity are correlated. However, popular spectral similarity measures, usually calculated based on identical fragment matches between the MS/MS spectra, do not always accurately reflect the structural similarity. In this study, we propose TransExION, a Transformer based Explainable similarity metric for IONS. TransExION detects related fragments between MS/MS spectra through their mass difference and uses these to estimate spectral similarity. These related fragments can be nearly identical, but can also share a substructure. TransExION also provides a post-hoc explanation of its estimation, which can be used to support scientists in evaluating the spectral library search results and thus in structure elucidation of unknown molecules. Our model has a Transformer based architecture and it is trained on the data derived from GNPS MS/MS libraries. The experimental results show that it improves on existing spectral similarity measures in searching and interpreting structural analogues as well as in molecular networking. SCIENTIFIC CONTRIBUTION: We propose a transformer-based spectral similarity metric that improves the comparison of small molecule tandem mass spectra. We provide a post hoc explanation that can serve as a good starting point for unknown spectra annotation based on database spectra.
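To illustrate the baseline this abstract argues against (similarity computed from identical fragment matches), here is a minimal sketch of a greedy cosine score over peaks matched within an m/z tolerance. It is not TransExION and not any library's implementation; the function name `cosine_similarity`, the tolerance value, and the toy spectra are assumptions made for illustration only.

```python
import math


def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Greedy cosine score between two spectra given as lists of (mz, intensity).

    Hypothetical helper: only peaks whose m/z values agree within `tol`
    are allowed to contribute, which is the "identical fragment match" idea.
    """
    # Collect every candidate pair of peaks that match within the tolerance.
    pairs = [
        (int_a * int_b, i, j)
        for i, (mz_a, int_a) in enumerate(spec_a)
        for j, (mz_b, int_b) in enumerate(spec_b)
        if abs(mz_a - mz_b) <= tol
    ]
    # Greedily keep the strongest pairs, using each peak at most once.
    pairs.sort(reverse=True)
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = math.sqrt(sum(inten * inten for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten * inten for _, inten in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy spectra (made up): the second is the first with every fragment
# shifted by the same mass difference of 14.0.
parent = [(100.0, 1.0), (150.0, 0.5)]
shifted_analogue = [(114.0, 1.0), (164.0, 0.5)]

print(cosine_similarity(parent, parent))
print(cosine_similarity(parent, shifted_analogue))
```

In this toy case no peaks in the shifted spectrum match within the tolerance, so the identical-match score gives no credit for the shared, uniformly shifted fragmentation pattern. That is the kind of gap a mass-difference-aware metric is meant to address.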
Affiliation(s)
- Danh Bui-Thi
  - Computer Science Department, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium
- Youzhong Liu
  - Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Jennifer L Lippens
  - Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Kris Laukens
  - Computer Science Department, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium
- Thomas De Vijlder
  - Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
7
Lu XY, Wu HP, Ma H, Li H, Li J, Liu YT, Pan ZY, Xie Y, Wang L, Ren B, Liu GK. Deep Learning-Assisted Spectrum-Structure Correlation: State-of-the-Art and Perspectives. Anal Chem 2024; 96:7959-7975. [PMID: 38662943 DOI: 10.1021/acs.analchem.4c01639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Spectrum-structure correlation is playing an increasingly crucial role in spectral analysis and has undergone significant development in recent decades. With the advancement of spectrometers, high-throughput detection has triggered explosive growth of spectral data, and the extension of research from small molecules to biomolecules brings with it a massive chemical space. Facing the evolving landscape of spectrum-structure correlation, conventional chemometrics has become ill-equipped, and deep learning-assisted chemometrics is rapidly emerging as a flourishing approach with a superior ability to extract latent features and make precise predictions. In this review, the molecular and spectral representations and fundamental knowledge of deep learning are first introduced. We then summarize how deep learning has helped establish the correlation between spectrum and molecular structure over the past 5 years, by empowering spectral prediction (i.e., forward structure-spectrum correlation) and further enabling library matching and de novo molecular generation (i.e., inverse spectrum-structure correlation). Finally, we highlight the most important open issues that persist, with corresponding potential solutions. With the fast development of deep learning, an ultimate solution for establishing spectrum-structure correlation is expected soon, which would trigger substantial development of various disciplines.
Affiliation(s)
- Xin-Yu Lu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hao-Ping Wu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
- Hao Ma
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hui Li
  - Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen 361005, P. R. China
- Jia Li
  - Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, P. R. China
- Yan-Ti Liu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Zheng-Yan Pan
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yi Xie
  - School of Informatics, Xiamen University, Xiamen 361005, P. R. China
- Lei Wang
  - Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen 361005, P. R. China
- Bin Ren
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
8
Yang Y, Sun S, Yang S, Yang Q, Lu X, Wang X, Yu Q, Huo X, Qian X. Structural annotation of unknown molecules in a miniaturized mass spectrometer based on a transformer enabled fragment tree method. Commun Chem 2024; 7:109. [PMID: 38740942 DOI: 10.1038/s42004-024-01189-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 04/26/2024] [Indexed: 05/16/2024]
Abstract
Structural annotation of small molecules in tandem mass spectrometry has always been a central challenge in mass spectrometry analysis, especially using a miniaturized mass spectrometer for on-site testing. Here, we propose the Transformer enabled Fragment Tree (TeFT) method, which combines various types of fragmentation tree models and a deep learning Transformer module. It is aimed to generate the specific structure of molecules de novo solely from mass spectrometry spectra. The evaluation results on different open-source databases indicated that the proposed model achieved remarkable results in that the majority of molecular structures of compounds in the test can be successfully recognized. Also, the TeFT has been validated on a miniaturized mass spectrometer with low-resolution spectra for 16 flavonoid alcohols, achieving complete structure prediction for 8 substances. Finally, TeFT confirmed the structure of the compound contained in a Chinese medicine substance called the Anweiyang capsule. These results indicate that the TeFT method is suitable for annotating fragmentation peaks with clear fragmentation rules, particularly when applied to on-site mass spectrometry with lower mass resolution.
Affiliation(s)
- Yiming Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Shuang Sun
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Shuyuan Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Qin Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Xinqiong Lu
  - CHIN Instrument (Hefei) Co., Ltd., Hefei, 231200, China
- Xiaohao Wang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Quan Yu
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Xinming Huo
  - Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, China
- Xiang Qian
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
9
Ziaikin E, Tello E, Peterson DG, Niv MY. BitterMasS: Predicting Bitterness from Mass Spectra. Journal of Agricultural and Food Chemistry 2024; 72:10537-10547. [PMID: 38685906 PMCID: PMC11082931 DOI: 10.1021/acs.jafc.3c09767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 04/18/2024] [Accepted: 04/18/2024] [Indexed: 05/02/2024]
Abstract
Bitter compounds are common in nature and among drugs. Previously, machine learning tools were developed to predict bitterness from the chemical structure. However, known structures are estimated to represent only 5-10% of the metabolome, and the rest remain unassigned or "dark". We present BitterMasS, a Random Forest classifier that was trained on 5414 experimental mass spectra of bitter and nonbitter compounds, achieving precision = 0.83 and recall = 0.90 for an internal test set. Next, the model was tested against spectra newly extracted from the literature for 106 bitter and nonbitter compounds and against additional spectra measured for 26 compounds. For these external test cases, BitterMasS exhibited 67% precision and 93% recall for the first and 58% accuracy and 99% recall for the second. The spectrum-bitterness prediction strategy was more effective than the spectrum-structure-bitterness prediction strategy and covered more compounds. These encouraging results suggest that BitterMasS can be used to predict bitter compounds in the metabolome without the need for structural assignment of individual molecules. This may enable identification of bitter compounds from metabolomics analyses, comparison of potential bitterness levels obtained by different treatments of samples, and monitoring of bitterness changes over time.
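The abstract reports precision, recall, and (for one external set) accuracy, which answer different questions. As a reminder of how the three relate, here is a minimal sketch that computes them from binary labels. The helper name `binary_metrics` and the toy labels are assumptions for illustration; they are not data from the BitterMasS study.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, accuracy) for binary labels, 1 = bitter.

    Hypothetical helper for illustration only.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Precision: of the compounds predicted bitter, how many are bitter.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the bitter compounds, how many were predicted bitter.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that are correct.
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return precision, recall, accuracy


# Toy labels (made up): 4 bitter and 6 nonbitter compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall, accuracy = binary_metrics(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
# prints "precision=0.60 recall=0.75 accuracy=0.70"
```

A classifier that labels most inputs as bitter can reach very high recall while precision or accuracy stays modest, which is one way to read external results that pair high recall with a lower second metric.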
Affiliation(s)
- Evgenii Ziaikin
  - Food Science and Nutrition, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Institute of Biochemistry, Food and Nutrition, The Hebrew University of Jerusalem, 76100 Rehovot, Israel
- Edisson Tello
  - Department of Food Science and Technology, College of Food, Agriculture, and Environmental Sciences, The Ohio State University, Columbus, Ohio 43210, United States
- Devin G. Peterson
  - Department of Food Science and Technology, College of Food, Agriculture, and Environmental Sciences, The Ohio State University, Columbus, Ohio 43210, United States
- Masha Y. Niv
  - Food Science and Nutrition, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Institute of Biochemistry, Food and Nutrition, The Hebrew University of Jerusalem, 76100 Rehovot, Israel
10
Kalinski JCJ, Noundou XS, Petras D, Matcher GF, Polyzois A, Aron AT, Gentry EC, Bornman TG, Adams JB, Dorrington RA. Urban and agricultural influences on the coastal dissolved organic matter pool in the Algoa Bay estuaries. Chemosphere 2024; 355:141782. [PMID: 38548083 DOI: 10.1016/j.chemosphere.2024.141782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 02/28/2024] [Accepted: 03/22/2024] [Indexed: 04/08/2024]
Abstract
While anthropogenic pollution is a major threat to aquatic ecosystem health, our knowledge of the presence of xenobiotics in coastal Dissolved Organic Matter (DOM) is still relatively poor. This is especially true for water bodies in the Global South with limited information gained mostly from targeted studies that rely on comparison with authentic standards. In recent years, non-targeted tandem mass spectrometry has emerged as a powerful tool to collectively detect and identify pollutants and biogenic DOM components in the environment, but this approach has yet to be widely utilized for monitoring ecologically important aquatic systems. In this study we compared the DOM composition of Algoa Bay, Eastern Cape, South Africa, and its two estuaries. The Swartkops Estuary is highly urbanized and severely impacted by anthropogenic pollution, while the Sundays Estuary is impacted by commercial agriculture in its catchment. We employed solid-phase extraction followed by liquid chromatography tandem mass spectrometry to annotate more than 200 pharmaceuticals, pesticides, urban xenobiotics, and natural products based on spectral matching. The identification with authentic standards confirmed the presence of methamphetamine, carbamazepine, sulfamethoxazole, N-acetylsulfamethoxazole, imazapyr, caffeine and hexa(methoxymethyl)melamine, and allowed semi-quantitative estimations for annotated xenobiotics. The Swartkops Estuary DOM composition was strongly impacted by features annotated as urban pollutants including pharmaceuticals such as melamines and antiretrovirals. By contrast, the Sundays Estuary exhibited significant enrichment of molecules annotated as agrochemicals widely used in the citrus farming industry, with predicted concentrations for some of them exceeding predicted no-effect concentrations. This study provides new insight into anthropogenic impact on the Algoa Bay system and demonstrates the utility of non-targeted tandem mass spectrometry as a sensitive tool for assessing the health of ecologically important coastal ecosystems and will serve as a valuable foundation for strategizing long-term monitoring efforts.
Affiliation(s)
- Xavier Siwe Noundou
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa; Department of Pharmaceutical Sciences, Sefako Makgatho Health Sciences University, Pretoria, South Africa
- Daniel Petras
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, USA; Department of Biochemistry, University of California Riverside, Riverside, USA; CMFI Cluster of Excellence, Interfaculty Institute of Microbiology and Medicine, University of Tuebingen, Tuebingen, Germany
- Gwynneth F Matcher
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa; South African Institute for Aquatic Biodiversity, 6139, Makhanda, South Africa
- Alexandros Polyzois
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa; Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, 14853, United States
- Allegra T Aron
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, USA; Department of Chemistry and Biochemistry, University of Denver, Denver, CO, 80210, United States
- Emily C Gentry
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, USA; Department of Chemistry, Virginia Tech, Blacksburg, VA, 24061, United States
- Thomas G Bornman
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa; South African Environmental Observation Network SAEON, Elwandle Coastal Node, Gqeberha, South Africa; Institute for Coastal and Marine Research, Nelson Mandela University, Gqeberha, South Africa
- Janine B Adams
  - DSI/NRF Research Chair, Shallow Water Ecosystems, Department of Botany and Institute for Coastal and Marine Research, Nelson Mandela University, Gqeberha, South Africa; Department of Botany, Institute for Coastal and Marine Research CMR, Nelson Mandela University, Gqeberha, South Africa
- Rosemary A Dorrington
  - Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa; South African Institute for Aquatic Biodiversity, 6139, Makhanda, South Africa
11
Perez de Souza L, Fernie AR. Computational methods for processing and interpreting mass spectrometry-based metabolomics. Essays Biochem 2024; 68:5-13. [PMID: 37999335 PMCID: PMC11065554 DOI: 10.1042/ebc20230019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/10/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023]
Abstract
Metabolomics has emerged as an indispensable tool for exploring complex biological questions, providing the ability to investigate a substantial portion of the metabolome. However, the vast complexity and structural diversity intrinsic to metabolites impose a great challenge for data analysis and interpretation. Liquid chromatography mass spectrometry (LC-MS) stands out as a versatile technique offering extensive metabolite coverage. In this mini-review, we address some of the hurdles posed by the complex nature of LC-MS data, providing a brief overview of computational tools designed to help tackle these challenges. Our focus centers on two major steps that are essential to most metabolomics investigations: the translation of raw data into quantifiable features, and the extraction of structural insights from mass spectra to facilitate metabolite identification. By exploring current computational solutions, we aim to provide a critical overview of the capabilities and constraints of mass spectrometry-based metabolomics, while introducing some of the most recent trends in data processing and analysis within the field.
Affiliation(s)
- Leonardo Perez de Souza
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
- Center for Plant Systems Biology and Biotechnology, 4000 Plovdiv, Bulgaria
| |
|
12
|
Xue X, Sun H, Yang M, Liu X, Hu HY, Deng Y, Wang X. Advances in the Application of Artificial Intelligence-Based Spectral Data Interpretation: A Perspective. Anal Chem 2023; 95:13733-13745. [PMID: 37688541 DOI: 10.1021/acs.analchem.3c02540] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/11/2023]
Abstract
The interpretation of spectral data, including mass, nuclear magnetic resonance, infrared, and ultraviolet-visible spectra, is critical for obtaining molecular structural information. The development of advanced sensing technology has multiplied the amount of available spectral data. Chemical experts must use basic principles corresponding to the spectral information generated by molecular fragments and functional groups. This is a time-consuming process that requires a solid professional knowledge base. In recent years, the rapid development of computer science and its applications in cheminformatics and the emergence of computer-aided expert systems have greatly reduced the difficulty in analyzing large quantities of data. For expert systems, however, the problem-solving strategy must be known in advance or extracted by human experts and translated into algorithms. Gratifyingly, the development of artificial intelligence (AI) methods has shown great promise for solving such problems. Traditional algorithms, including the latest neural network algorithms, have shown great potential for both extracting useful information and processing massive quantities of data. This Perspective highlights recent innovations covering all of the emerging AI-based spectral interpretation techniques. In addition, the main limitations and current obstacles are presented, and the corresponding directions for further research are proposed. Moreover, this Perspective gives the authors' personal outlook on the development and future applications of spectral interpretation.
Affiliation(s)
- Xi Xue
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Hanyu Sun
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Xue Liu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
| | - Hai-Yu Hu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
| | - Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
- Department of Automation, Tsinghua University, Beijing 100084, China
| | - Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
| |
|
13
|
Ebbels TMD, van der Hooft JJJ, Chatelaine H, Broeckling C, Zamboni N, Hassoun S, Mathé EA. Recent advances in mass spectrometry-based computational metabolomics. Curr Opin Chem Biol 2023; 74:102288. [PMID: 36966702 PMCID: PMC11075003 DOI: 10.1016/j.cbpa.2023.102288] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 10/24/2022] [Revised: 02/16/2023] [Accepted: 02/21/2023] [Indexed: 04/03/2023]
Abstract
The computational metabolomics field brings together computer scientists, bioinformaticians, chemists, clinicians, and biologists to maximize the impact of metabolomics across a wide array of scientific and medical disciplines. The field continues to expand as modern instrumentation produces datasets with increasing complexity, resolution, and sensitivity. These datasets must be processed, annotated, modeled, and interpreted to enable biological insight. Techniques for visualization, integration (within or between omics), and interpretation of metabolomics data have evolved along with innovation in the databases and knowledge resources required to aid understanding. In this review, we highlight recent advances in the field and reflect on opportunities and innovations in response to the most pressing challenges. This review was compiled from discussions from the 2022 Dagstuhl seminar entitled "Computational Metabolomics: From Spectra to Knowledge".
Affiliation(s)
- Timothy M D Ebbels
- Section of Bioinformatics, Department of Metabolism, Digestion & Reproduction, Imperial College London, Burlington Danes Building, Hammersmith Hospital, Du Cane Road, London W12 0NN, UK.
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, Wageningen 6708 PB, the Netherlands; Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
| | - Haley Chatelaine
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA
| | - Corey Broeckling
- Bioanalysis and Omics Center, Analytical Resources Core, Colorado State University, Fort Collins, CO, USA
| | - Nicola Zamboni
- Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Soha Hassoun
- Department of Computer Science, Tufts University, Medford, MA, USA; Department of Chemical and Biological Engineering, Tufts University, Medford, MA, USA
| | - Ewy A Mathé
- Informatics Core, Division of Preclinical Innovation, National Center for Advancing Translational Sciences, Rockville, MD, USA.
| |
|
14
|
de Jonge NF, Louwen JJR, Chekmeneva E, Camuzeaux S, Vermeir FJ, Jansen RS, Huber F, van der Hooft JJJ. MS2Query: reliable and scalable MS2 mass spectra-based analogue search. Nat Commun 2023; 14:1752. [PMID: 36990978 PMCID: PMC10060387 DOI: 10.1038/s41467-023-37446-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 08/10/2022] [Accepted: 03/15/2023] [Indexed: 03/31/2023] Open
Abstract
Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules which are not exact matches but display a high chemical similarity. However, current analogue search implementations are not yet very reliable and are relatively slow. Here, we present MS2Query, a machine learning-based tool that integrates mass spectral embedding-based chemical similarity predictors (Spec2Vec and MS2DeepScore) as well as detected precursor masses to rank potential analogues and exact matches. Benchmarking MS2Query on reference mass spectra and experimental case studies demonstrates improved reliability and scalability. Thereby, MS2Query offers exciting opportunities to further increase the annotation rate of metabolomics profiles of complex metabolite mixtures and to discover new biology.
Affiliation(s)
- Niek F de Jonge
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands.
| | - Joris J R Louwen
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands
| | - Elena Chekmeneva
- National Phenome Centre, Section of Bioanalytical Chemistry, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London, W12 0NN, UK
| | - Stephane Camuzeaux
- National Phenome Centre, Section of Bioanalytical Chemistry, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London, W12 0NN, UK
| | - Femke J Vermeir
- Department of Microbiology, Radboud Institute for Biological and Environmental Sciences, Radboud University, 6525ED, Nijmegen, the Netherlands
| | - Robert S Jansen
- Department of Microbiology, Radboud Institute for Biological and Environmental Sciences, Radboud University, 6525ED, Nijmegen, the Netherlands
| | - Florian Huber
- Centre for Digitalization and Digitality (ZDD), University of Applied Sciences Düsseldorf, Düsseldorf, Germany.
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands.
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg, 2006, South Africa.
| |
|
15
|
Aharoni A, Goodacre R, Fernie AR. Plant and microbial sciences as key drivers in the development of metabolomics research. Proc Natl Acad Sci U S A 2023; 120:e2217383120. [PMID: 36930598 PMCID: PMC10041103 DOI: 10.1073/pnas.2217383120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 03/18/2023] Open
Abstract
This year marks the 25th anniversary of the coinage of the term metabolome [S. G. Oliver et al., Trends Biotech. 16, 373-378 (1998)]. As the field rapidly advances, it is important to take stock of the progress which has been made to best inform the discipline's future. While a medical-centric perspective on metabolomics has recently been published [M. Giera et al., Cell Metab. 34, 21-34 (2022)], this largely ignores the pioneering contributions made by the plant and microbial science communities. In this perspective, we provide a contemporary overview of all fields in which metabolomics is employed with particular emphasis on both methodological and application breakthroughs made in plant and microbial sciences that have shaped this evolving research discipline from the very early days of its establishment. This will not cover all types of metabolomics assays currently employed but will focus mainly on those utilizing mass spectrometry-based measurements since they are currently by far the most prominent. Having established the historical context of metabolomics, we will address the key challenges currently facing metabolomics and offer potential approaches by which these can be faced. Most salient among these is the fact that the vast majority of mass features are as yet not annotated with high confidence; what we may refer to as definitive identification. We discuss the potential of both standard compound libraries and artificial intelligence technologies to address this challenge and the use of natural variance-based approaches such as genome-wide association studies in an attempt to assign specific functions to the myriad of structurally similar and complex specialized metabolites. We conclude by stating our contention that these challenges are epic and that they will need far greater cooperative efforts from biologists, chemists, and computer scientists with an interest in all kingdoms of life than have been made to date.
Ultimately, a better linkage of metabolome and genome data will likely also be needed particularly considering the Earth BioGenome Project.
Affiliation(s)
- Asaph Aharoni
- Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Royston Goodacre
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7BE, UK
| | - Alisdair R. Fernie
- Max-Planck-Institute for Molecular Plant Physiology, Potsdam 14476, Germany
| |
|
16
|
de Jonge NF, Mildau K, Meijer D, Louwen JJR, Bueschl C, Huber F, van der Hooft JJJ. Good practices and recommendations for using and benchmarking computational metabolomics metabolite annotation tools. Metabolomics 2022; 18:103. [PMID: 36469190 PMCID: PMC9722809 DOI: 10.1007/s11306-022-01963-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Untargeted metabolomics approaches based on mass spectrometry obtain comprehensive profiles of complex biological samples. However, on average only 10% of the molecules can be annotated. This low annotation rate hampers biochemical interpretation and effective comparison of metabolomics studies. Furthermore, de novo structural characterization of mass spectral data remains a complicated and time-intensive process. Recently, the field of computational metabolomics has gained traction and novel methods have started to enable large-scale and reliable metabolite annotation. Molecular networking and machine learning-based in-silico annotation tools have been shown to greatly assist metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery. AIM OF REVIEW We highlight recent advances in computational metabolite annotation workflows with a special focus on their evaluation and comparison with other tools. Whilst the progress is substantial and promising, we also argue that inconsistencies in benchmarking different tools hamper users from selecting the most appropriate and promising method for their research. We summarize benchmarking strategies of the different tools and outline several recommendations for benchmarking and comparing novel tools. KEY SCIENTIFIC CONCEPTS OF REVIEW This review focuses on recent advances in mass spectral library-based and machine learning-supported metabolite annotation workflows. We discuss large-scale library matching and analogue search, the current bloom of mass spectral similarity scores, and how molecular networking has changed the field. In addition, the potentials and challenges of machine learning-supported metabolite annotation workflows are highlighted. 
Overall, recent developments in computational metabolomics have started to fundamentally change metabolomics workflows, and we expect that as a community we will be able to overcome current method performance ambiguities and annotation bottlenecks.
Affiliation(s)
- Niek F. de Jonge
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Kevin Mildau
- Department of Analytical Chemistry, Biochemical Network Analysis Lab, University of Vienna, Vienna, Austria
| | - David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Joris J. R. Louwen
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Christoph Bueschl
- Department of Analytical Chemistry, Biochemical Network Analysis Lab, University of Vienna, Vienna, Austria
| | - Florian Huber
- Centre for Digitalization and Digitality (ZDD), University of Applied Sciences Düsseldorf, Düsseldorf, Germany
| | - Justin J. J. van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
| |
|
17
|
TransG-net: transformer and graph neural network based multi-modal data fusion network for molecular properties prediction. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04351-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
|
18
|
Bittremieux W, Wang M, Dorrestein PC. The critical role that spectral libraries play in capturing the metabolomics community knowledge. Metabolomics 2022; 18:94. [PMID: 36409434 DOI: 10.1007/s11306-022-01947-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/01/2022] [Accepted: 10/19/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Spectral library searching is currently the most common approach for compound annotation in untargeted metabolomics. Spectral libraries applicable to liquid chromatography mass spectrometry have grown in size over the past decade to include hundreds of thousands to millions of mass spectra and tens of thousands of compounds, forming an essential knowledge base for the interpretation of metabolomics experiments. AIM OF REVIEW We describe existing spectral library resources, highlight different strategies for compiling spectral libraries, and discuss quality considerations that should be taken into account when interpreting spectral library searching results. Finally, we describe how spectral libraries are empowering the next generation of machine learning tools in computational metabolomics, and discuss several opportunities for using increasingly accessible large spectral libraries. KEY SCIENTIFIC CONCEPTS OF REVIEW This review focuses on the current state of spectral libraries for untargeted LC-MS/MS based metabolomics. We show how the number of entries in publicly accessible spectral libraries has increased more than 60-fold in the past eight years to aid molecular interpretation and we discuss how the role of spectral libraries in untargeted metabolomics will evolve in the near future.
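As an illustration of the spectral library searching described in this abstract, the following minimal sketch scores a query MS/MS spectrum against library entries with a binned cosine similarity after a precursor m/z filter. It is not taken from the review: the function names, the toy library, the bin width, and the tolerance and score thresholds are all hypothetical choices made for the example.

```python
from math import sqrt

def binned(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    vec = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec

def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    q, r = binned(query, bin_width), binned(reference, bin_width)
    dot = sum(v * r.get(k, 0.0) for k, v in q.items())
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in r.values()))
    return dot / norm if norm else 0.0

def search_library(query, precursor_mz, library, tol=0.5, min_score=0.7):
    """Return (name, score) hits sorted by descending score.

    Entries whose precursor m/z differs by more than `tol` are skipped,
    and hits scoring below `min_score` are discarded.
    """
    hits = []
    for entry in library:
        if abs(entry["precursor_mz"] - precursor_mz) > tol:
            continue
        score = cosine_score(query, entry["peaks"])
        if score >= min_score:
            hits.append((entry["name"], score))
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical toy library; the names and peak lists are made up for illustration.
LIBRARY = [
    {"name": "compound_A", "precursor_mz": 180.06,
     "peaks": [(60.0, 10.0), (90.0, 50.0), (120.0, 100.0)]},
    {"name": "compound_B", "precursor_mz": 180.10,
     "peaks": [(55.0, 100.0), (83.0, 40.0)]},
    {"name": "compound_C", "precursor_mz": 250.00,
     "peaks": [(60.0, 10.0), (90.0, 50.0), (120.0, 100.0)]},
]

query_peaks = [(60.0, 12.0), (90.0, 48.0), (120.0, 100.0)]
hits = search_library(query_peaks, precursor_mz=180.07, library=LIBRARY)
```

In this toy case only compound_A is returned: compound_B shares no fragment bins with the query, and compound_C is excluded by the precursor filter despite having identical fragments. Production tools typically use tolerance-based peak matching and intensity weighting rather than fixed bins.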
Affiliation(s)
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
| | - Mingxun Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, 92507, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA.
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
| |
|
19
|
Consonni V, Gosetti F, Termopoli V, Todeschini R, Valsecchi C, Ballabio D. Multi-Task Neural Networks and Molecular Fingerprints to Enhance Compound Identification from LC-MS/MS Data. Molecules 2022; 27:5827. [PMID: 36144564 PMCID: PMC9502453 DOI: 10.3390/molecules27185827] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2022] [Accepted: 09/05/2022] [Indexed: 11/27/2022] Open
Abstract
Mass spectrometry (MS) is widely used for the identification of chemical compounds by matching the experimentally acquired mass spectrum against a database of reference spectra. However, this approach suffers from a limited coverage of the existing databases causing a failure in the identification of a compound not present in the database. Among the computational approaches for mining metabolite structures based on MS data, one option is to predict molecular fingerprints from the mass spectra by means of chemometric strategies and then use them to screen compound libraries. This can be carried out by calibrating multi-task artificial neural networks from large datasets of mass spectra, used as inputs, and molecular fingerprints as outputs. In this study, we prepared a large LC-MS/MS dataset from an on-line open repository. These data were used to train and evaluate deep-learning-based approaches to predict molecular fingerprints and retrieve the structure of unknown compounds from their LC-MS/MS spectra. Effects of data sparseness and the impact of different strategies of data curing and dimensionality reduction on the output accuracy have been evaluated. Moreover, extensive diagnostics have been carried out to evaluate modelling advantages and drawbacks as a function of the explored chemical space.
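The screening step described in this abstract (predict a molecular fingerprint from a spectrum, then use it to rank a compound library) can be sketched as follows. This is only an illustration under stated assumptions: the per-bit probabilities stand in for the output of a trained multi-task network, which is not reproduced here, and the 0.5 threshold, the Tanimoto ranking, and the candidate names are hypothetical choices, not the authors' procedure.

```python
def binarize(probabilities, threshold=0.5):
    """Turn per-bit probabilities into the set of 'on' fingerprint bit indices."""
    return {i for i, p in enumerate(probabilities) if p >= threshold}

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def screen(predicted_bits, candidates):
    """Rank candidate structures by fingerprint similarity to the prediction."""
    ranked = [(name, tanimoto(predicted_bits, bits))
              for name, bits in candidates.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Assumed model output for one spectrum (placeholder values, 8-bit fingerprint).
predicted_probabilities = [0.9, 0.1, 0.8, 0.7, 0.2, 0.05, 0.6, 0.3]

# Hypothetical candidate library: name -> set of 'on' fingerprint bits.
CANDIDATES = {
    "candidate_1": {0, 2, 3, 6},
    "candidate_2": {0, 2, 4},
    "candidate_3": {1, 5, 7},
}

predicted_bits = binarize(predicted_probabilities)
ranking = screen(predicted_bits, CANDIDATES)
```

Because the unknown compound never has to be present in a spectral database, only in a structure library, this kind of screening addresses the coverage limitation the abstract describes.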
Affiliation(s)
- Davide Ballabio
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milano, Italy
| |
|
20
|
Tian Z, Liu F, Li D, Fernie AR, Chen W. Strategies for structure elucidation of small molecules based on LC–MS/MS data from complex biological samples. Comput Struct Biotechnol J 2022; 20:5085-5097. [PMID: 36187931 PMCID: PMC9489805 DOI: 10.1016/j.csbj.2022.09.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/18/2022] [Revised: 09/03/2022] [Accepted: 09/03/2022] [Indexed: 11/06/2022] Open
Abstract
LC–MS/MS is a major analytical platform for metabolomics, which has become a recent hotspot in the research fields of life and environmental sciences. However, structure elucidation of small molecules based on LC–MS/MS data remains a major challenge in the chemical and biological interpretation of untargeted metabolomics datasets. In recent years, several strategies for structure elucidation using LC–MS/MS data from complex biological samples have been proposed. These strategies can be broadly categorized into two types: one based on structure annotation of mass spectra and the other on retention time prediction. These strategies have helped many scientists conduct research in metabolite-related fields and are indispensable for the development of future tools. Here, we summarize the characteristics of the current tools and strategies for structure elucidation of small molecules based on LC–MS/MS data, and further discuss the directions and perspectives for improving the power of these tools and strategies.
|
21
|
Roberts I, Wright Muelas M, Taylor JM, Davison AS, Xu Y, Grixti JM, Gotts N, Sorokin A, Goodacre R, Kell DB. Untargeted metabolomics of COVID-19 patient serum reveals potential prognostic markers of both severity and outcome. Metabolomics 2021; 18:6. [PMID: 34928464 PMCID: PMC8686810 DOI: 10.1007/s11306-021-01859-3] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 08/20/2021] [Accepted: 11/29/2021] [Indexed: 12/15/2022]
Abstract
INTRODUCTION The diagnosis of COVID-19 is normally based on the qualitative detection of viral nucleic acid sequences. Properties of the host response are not measured but are key in determining outcome. Although metabolic profiles are well suited to capture host state, most metabolomics studies are either underpowered, measure only a restricted subset of metabolites, compare infected individuals against uninfected control cohorts that are not suitably matched, or do not provide a compact predictive model. OBJECTIVES Here we provide a well-powered, untargeted metabolomics assessment of 120 COVID-19 patient samples acquired at hospital admission. The study aims to predict the patient's infection severity (i.e., mild or severe) and potential outcome (i.e., discharged or deceased). METHODS High resolution untargeted UHPLC-MS/MS analysis was performed on patient serum using both positive and negative ionization modes. A subset of 20 intermediary metabolites predictive of severity or outcome was selected based on univariate statistical significance and a multiple-predictor Bayesian logistic regression model was created. RESULTS The predictors were selected for their relevant biological function and include deoxycytidine and ureidopropionate (indirectly reflecting viral load), kynurenine (reflecting host inflammatory response), and multiple short chain acylcarnitines (energy metabolism) among others. Currently, this approach predicts outcome and severity with a Monte Carlo cross-validated area under the ROC curve of 0.792 (SD 0.09) and 0.793 (SD 0.08), respectively. A blind validation study on an additional 90 patients predicted outcome and severity with ROC AUCs of 0.83 (CI 0.74-0.91) and 0.76 (CI 0.67-0.86). CONCLUSION Prognostic tests based on the markers discussed in this paper could allow improvement in the planning of COVID-19 patient treatment.
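The Monte Carlo cross-validated ROC AUC reported in this abstract (a mean with an SD) can be illustrated with a short sketch: the data are repeatedly split at random into training and test sets, a model is fitted on the training part, and the AUC is computed on the held-out part. The sketch below is not the study's pipeline. It swaps the Bayesian logistic regression for a simple class-centroid scorer to stay dependency-free, and the data are synthetic; `toy_data`, `centroid_scorer`, the 30% test fraction, and the 100 repeats are all assumptions made for the example.

```python
import random
from statistics import mean, stdev

def auc(scores, labels):
    """ROC AUC as the probability a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(X, y):
    """Toy stand-in model: project samples onto the difference of class means."""
    n_feat = len(X[0])

    def class_mean(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_feat)]

    w = [a - b for a, b in zip(class_mean(1), class_mean(0))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def monte_carlo_cv(X, y, n_repeats=100, test_fraction=0.3, seed=0):
    """Mean and SD of test-set AUC over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_test = max(2, int(len(y) * test_fraction))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Skip splits in which either part lacks one of the two classes.
        if len({y[i] for i in test}) < 2 or len({y[i] for i in train}) < 2:
            continue
        score = centroid_scorer([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([score(X[i]) for i in test], [y[i] for i in test]))
    return mean(aucs), stdev(aucs)

def toy_data(n=60, seed=1):
    """Synthetic two-class data: three features shifted by one SD between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(label * 1.0, 1.0) for _ in range(3)])
        y.append(label)
    return X, y

X, y = toy_data()
mean_auc, sd_auc = monte_carlo_cv(X, y)
```

Reporting the SD alongside the mean, as the abstract does, conveys how much the estimate depends on which samples happen to land in the test split.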
Affiliation(s)
- Ivayla Roberts
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Marina Wright Muelas
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK.
| | - Joseph M Taylor
- Department of Clinical Biochemistry and Metabolic Medicine, Liverpool Clinical Laboratories, Royal Liverpool University Hospitals Trust, Liverpool, UK
| | - Andrew S Davison
- Department of Clinical Biochemistry and Metabolic Medicine, Liverpool Clinical Laboratories, Royal Liverpool University Hospitals Trust, Liverpool, UK
| | - Yun Xu
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Centre for Metabolomics Research (CMR), Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Justine M Grixti
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Nigel Gotts
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Centre for Metabolomics Research (CMR), Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Anatolii Sorokin
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Royston Goodacre
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Centre for Metabolomics Research (CMR), Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
| | - Douglas B Kell
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK.
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Chemitorvet, 2000, Kgs Lyngby, Denmark.
| |
|