1
Aggerbeck MR, Frøkjær EE, Johansen A, Ellegaard-Jensen L, Hansen LH, Hansen M. Non-target analysis of Danish wastewater treatment plant effluent: Statistical analysis of chemical fingerprinting as a step toward a future monitoring tool. Environmental Research 2024; 257:119242. [PMID: 38821457 DOI: 10.1016/j.envres.2024.119242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 04/25/2024] [Accepted: 05/26/2024] [Indexed: 06/02/2024]
Abstract
In an attempt to discover and characterize the plethora of xenobiotic substances, this study investigates chemical compounds released into the environment with wastewater effluents. A novel non-targeted screening methodology based on ultra-high resolution Orbitrap mass spectrometry and nanoflow ultra-high performance liquid chromatography, together with a newly optimized data-processing pipeline, was applied to effluent samples from two state-of-the-art wastewater treatment facilities and one small facility. In total, 785 molecular structures were obtained, of which 38 were identified as single compounds, while 480 structures were identified at a putative level. Most of these substances were therapeutics and drugs, present as parent compounds and metabolites. Using the R packages Phyloseq and MetacodeR, originally developed for bioinformatics, significant differences in xenobiotic presence in the wastewater effluents between the three sites were demonstrated.
Affiliation(s)
- Marie Rønne Aggerbeck
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark; Aarhus University Centre for Water Technology (WATEC), Aarhus University, Vejlsøvej 25, 8600, Silkeborg, Denmark.
- Emil Egede Frøkjær
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark
- Anders Johansen
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark; Aarhus University Centre for Water Technology (WATEC), Aarhus University, Vejlsøvej 25, 8600, Silkeborg, Denmark; Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871, Frederiksberg, Denmark; Aarhus University Centre for Circular Bioeconomy, Aarhus University, 8830 Tjele, Denmark
- Lea Ellegaard-Jensen
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark; Aarhus University Centre for Water Technology (WATEC), Aarhus University, Vejlsøvej 25, 8600, Silkeborg, Denmark
- Lars Hestbjerg Hansen
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, 1871, Frederiksberg, Denmark
- Martin Hansen
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark; Aarhus University Centre for Water Technology (WATEC), Aarhus University, Vejlsøvej 25, 8600, Silkeborg, Denmark.
2
Chi J, Shu J, Li M, Mudappathi R, Jin Y, Lewis F, Boon A, Qin X, Liu L, Gu H. Artificial Intelligence in Metabolomics: A Current Review. Trends Analyt Chem 2024; 178:117852. [PMID: 39071116 PMCID: PMC11271759 DOI: 10.1016/j.trac.2024.117852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Metabolomics and artificial intelligence (AI) form a synergistic partnership. Metabolomics generates large datasets comprising hundreds to thousands of metabolites with complex relationships. AI, aiming to mimic human intelligence through computational modeling, possesses extraordinary capabilities for big data analysis. In this review, we provide a recent overview of the methodologies and applications of AI in metabolomics studies in the context of systems biology and human health. We first introduce the AI concept, history, and key algorithms for machine learning and deep learning, summarizing their strengths and weaknesses. We then discuss studies that have successfully used AI across different aspects of metabolomic analysis, including analytical detection, data preprocessing, biomarker discovery, predictive modeling, and multi-omics data integration. Lastly, we discuss the existing challenges and future perspectives in this rapidly evolving field. Despite limitations and challenges, the combination of metabolomics and AI holds great promise for revolutionary advancements in enhancing human health.
Affiliation(s)
- Jinhua Chi
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Jingmin Shu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Ming Li
- Phoenix VA Health Care System, Phoenix, AZ 85012, USA
- University of Arizona College of Medicine, Phoenix, AZ 85004, USA
- Rekha Mudappathi
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Yan Jin
- Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Freeman Lewis
- Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Alexandria Boon
- Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
- Xiaoyan Qin
- College of Liberal Arts and Sciences, Arizona State University, Tempe, AZ 85281, USA
- Li Liu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Haiwei Gu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Center for Translational Science, Florida International University, Port St. Lucie, FL 34987, USA
3
Chou L, Zhang S, Luo W, Zhu W, Guo J, Tu K, Tan H, Wang C, Wei S, Yu H, Zhang X, Shi W. Identification of Key Toxic Substances Considering Metabolic Activation: A Combination of Transcriptome and Nontarget Analysis. Environmental Science & Technology 2024. [PMID: 39120612 DOI: 10.1021/acs.est.4c03683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
Abstract
There have been numerous studies using effect-directed analysis (EDA) to identify key toxic substances present in source and drinking water, but none of these studies have considered the effects of metabolic activation. This study developed a comprehensive method including a pretreatment process based on an in vitro metabolic activation system, a comprehensive biological effect evaluation based on concentration-dependent transcriptome (CDT), and a chemical feature identification based on nontarget chemical analysis (NTA), to evaluate the changes in the toxic effects and differences in the chemical composition after metabolism. Models for matching metabolites and precursors as well as data-driven identification methods were further constructed to identify toxic metabolites and key toxic precursor substances in drinking water samples from the Yangtze River. After metabolism, the metabolic samples showed a general trend of reduced toxicity in terms of overall biological potency (mean: 3.2-fold). However, metabolic activation led to an increase in some types of toxic effects, including pathways such as excision repair, mismatch repair, protein processing in endoplasmic reticulum, nucleotide excision repair, and DNA replication. Meanwhile, metabolic samples showed a decrease (17.8%) in the number of peaks and average peak area after metabolism, while overall polarity, hydrophilicity, and average molecular weight increased slightly (10.3%). Based on the models for matching metabolites and precursors and the data-driven identification methods, 32 chemicals were efficiently identified as key toxic substances and main contributors explaining the different transcriptome-level biological effects, such as those related to cellular component, development, and DNA damage. These included 15 industrial compounds, 7 PPCPs, 6 pesticides, and 4 natural products. This study avoids the process of structure elucidation of toxic metabolites and can trace them directly to the precursors based on MS spectra, providing a new approach for the identification of key toxic pollutants of metabolites.
Affiliation(s)
- Liben Chou
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Shaoqing Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Wenrui Luo
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Wenxuan Zhu
- Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, Minnesota 55105, United States
- Jing Guo
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Keng Tu
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Haoyue Tan
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Chang Wang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, Institute of Environment and Health, Jianghan University, Wuhan 430056, China
- Si Wei
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, China
- Hongxia Yu
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, China
- Xiaowei Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, China
- Wei Shi
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, China
4
Metz TO, Chang CH, Gautam V, Anjum A, Tian S, Wang F, Colby SM, Nunez JR, Blumer MR, Edison AS, Fiehn O, Jones DP, Li S, Morgan ET, Patti GJ, Ross DH, Shapiro MR, Williams AJ, Wishart DS. Introducing 'identification probability' for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies. bioRxiv 2024:2024.07.30.605945. [PMID: 39131324 PMCID: PMC11312557 DOI: 10.1101/2024.07.30.605945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort in 2007 focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence - the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high resolution mass spectrometry studies. Further refinements in identification levels have occurred, for example to accommodate use of ion mobility spectrometry in metabolomics workflows, and alternate approaches to communicate compound identification confidence also have been developed based on identification points schema. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered, are easily automated, or are transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a reference library or chemical space that match an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement or retention time accuracy, etc. We demonstrate the utility of identification probability in an in silico analysis of multi-property reference libraries constructed from the Human Metabolome Database and computational property predictions, provide guidance to the community in transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
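The 1/N definition in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LibraryEntry` type, the function name, the tolerance defaults, the retention times, and the choice to return 0.0 when nothing matches are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    """One reference compound (hypothetical structure for this sketch)."""
    name: str
    mass: float            # monoisotopic mass in Da
    retention_time: float  # minutes (illustrative values only)


def identification_probability(measured_mass, measured_rt, library,
                               ppm_tol=5.0, rt_tol=0.5):
    """Return (1/N, matches), where N counts library entries that fall
    within both the mass tolerance (ppm) and the retention time tolerance.

    Assumption: an empty match list is reported as probability 0.0.
    """
    matches = [
        entry for entry in library
        if abs(entry.mass - measured_mass) / measured_mass * 1e6 <= ppm_tol
        and abs(entry.retention_time - measured_rt) <= rt_tol
    ]
    n = len(matches)
    probability = 1.0 / n if n else 0.0
    return probability, matches


# Leucine and isoleucine are isomers with the same monoisotopic mass,
# so mass alone cannot separate them; the retention times are made up.
library = [
    LibraryEntry("leucine", 131.09463, 4.2),
    LibraryEntry("isoleucine", 131.09463, 4.0),
    LibraryEntry("valine", 117.07898, 3.1),
]

loose, _ = identification_probability(131.0946, 4.1, library, rt_tol=0.5)
tight, hits = identification_probability(131.0946, 4.2, library, rt_tol=0.05)
```

With the loose retention time window both isomers match, so the probability is 0.5. Tightening the window leaves only leucine and the probability rises to 1.0, which shows how adding a more precise orthogonal measurement reduces ambiguity.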
5
Vaz-Rodrigues R, Mazuecos L, Villar M, Contreras M, González-García A, Bonini P, Scimeca RC, Mulenga A, de la Fuente J. Tick salivary proteome and lipidome with low glycan content correlate with allergic type reactions in the zebrafish model. Int J Parasitol 2024:S0020-7519(24)00139-5. [PMID: 39074655 DOI: 10.1016/j.ijpara.2024.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 07/01/2024] [Accepted: 07/24/2024] [Indexed: 07/31/2024]
Abstract
Ticks, as hematophagous ectoparasites, can manipulate host immune and metabolic processes, causing tick-borne allergies such as α-Gal syndrome (AGS). Glycolipids with bound galactose-alpha-1-3-galactose (α-Gal) are potential allergenic molecules associated with AGS. Nevertheless, proteins and lipids lacking α-Gal modifications may contribute to tick salivary allergies and be linked to AGS. In this study, we characterized the effect of deglycosylated tick salivary proteins without lipids on treated zebrafish fed with dog food formulated with mammalian (beef, lamb, pork) meat by quantitative proteomics analysis of intestinal samples. The characterization and functional annotation of tick salivary lipids with low representation of glycolipids were conducted using a lipidomics approach. Results showed a significant effect of treatment with saliva and saliva deglycosylated protein fraction on zebrafish abnormal or no feeding (p < 0.005). Treatment with this fraction affected multiple metabolic pathways, defense responses to pathogens and protein metabolism, which correlated with abnormal or no feeding. Lipidomics analysis identified 23 lipid classes with low representation of glycolipids (0.70% of identified lipids). The lipid class with highest representation was phosphatidylcholine (PC; 26.66%) and for glycolipids it corresponded to diacylglycerol (DG; 0.48%). Qualitative analysis of PC antibodies revealed that individuals bitten by ticks were more likely to produce PC-IgG antibodies (p < 0.001). DG levels were significantly higher in tick salivary glands (p < 0.05) compared with tick saliva and salivary fractions. The α-Gal content was higher in tick saliva than in deglycosylated saliva and lipid fractions. These results support a possible role for tick salivary proteins and lipids without α-Gal modifications in AGS.
Affiliation(s)
- Rita Vaz-Rodrigues
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain
- Lorena Mazuecos
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain
- Margarita Villar
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain; Biochemistry Section, Faculty of Science and Chemical Technologies, University of Castilla-La Mancha, 13071 Ciudad Real, Spain
- Marinela Contreras
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain
- Almudena González-García
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain
- Paolo Bonini
- oloBion SL, Av. Dr. Marañón 8, 08028 Barcelona, Spain
- Ruth C Scimeca
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Oklahoma State University, Stillwater, OK 74078, USA
- Albert Mulenga
- Department of Veterinary Pathobiology, School of Veterinary Medicine & Biomedical Sciences, Texas A&M University, College Station, TX 77843, USA
- José de la Fuente
- SaBio, Instituto de Investigación en Recursos Cinegéticos (IREC, CSIC-UCLM-JCCM), Ronda de Toledo 12, 13071 Ciudad Real, Spain; Department of Veterinary Pathobiology, College of Veterinary Medicine, Oklahoma State University, Stillwater, OK 74078, USA.
6
Kong F, Shen T, Li Y, Bashar A, Bird SS, Fiehn O. Denoising Search doubles the number of metabolite and exposome annotations in human plasma using an Orbitrap Astral mass spectrometer. Research Square 2024:rs.3.rs-4758843. [PMID: 39108483 PMCID: PMC11302682 DOI: 10.21203/rs.3.rs-4758843/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Chemical exposures may impact human metabolism and contribute to the etiology of neurodegenerative disorders like Alzheimer's Disease (AD). Identifying these small metabolites involves matching experimental spectra to reference spectra in databases. However, environmental chemicals or physiologically active metabolites are usually present at low concentrations in human specimens. The presence of noise ions can significantly degrade spectral quality, leading to false negatives and reduced identification rates. In response to this challenge, the Spectral Denoising algorithm removes both chemical and electronic noise. Spectral Denoising outperformed alternative methods in benchmarking studies on 240 tested metabolites. It improved high-confidence compound identifications at, on average, 35-fold lower concentrations than previously achievable. Spectral Denoising proved highly robust against varying levels of both chemical and electronic noise, even with >150-fold higher intensity of noise ions than true fragment ions. For human plasma samples of AD patients that were analyzed on the Orbitrap Astral mass spectrometer, Denoising Search detected 2.3-fold more annotated compounds compared to the Exploris 240 Orbitrap instrument, including drug metabolites, household and industrial chemicals, and pesticides. This combination of advanced instrumentation with a superior denoising algorithm opens the door for precision medicine in exposome research.
Affiliation(s)
- Fanzhou Kong
- Chemistry Department, One Shields Avenue, University of California Davis, Davis, CA, 95616, USA
- West Coast Metabolomics Center, University of California Davis, Davis, CA, 95616, USA
- Tong Shen
- West Coast Metabolomics Center, University of California Davis, Davis, CA, 95616, USA
- Yuanyue Li
- West Coast Metabolomics Center, University of California Davis, Davis, CA, 95616, USA
- Amer Bashar
- Thermo Fisher Scientific, 355 River Oaks Pkwy, San Jose, CA 95134, USA
- Susan S Bird
- Thermo Fisher Scientific, 355 River Oaks Pkwy, San Jose, CA 95134, USA
- Oliver Fiehn
- West Coast Metabolomics Center, University of California Davis, Davis, CA, 95616, USA
7
Han Y, Hu LX, Liu T, Dong LL, Liu YS, Zhao JL, Ying GG. Discovering transformation products of pharmaceuticals in domestic wastewaters and receiving rivers by using non-target screening and machine learning approaches. The Science of the Total Environment 2024; 948:174715. [PMID: 39002592 DOI: 10.1016/j.scitotenv.2024.174715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 07/07/2024] [Accepted: 07/09/2024] [Indexed: 07/15/2024]
Abstract
Wastewater treatment plants (WWTPs) are an important source of pharmaceuticals in surface water, but information about their transformation products (TPs) is very limited. Here, we investigated occurrence and transformation of pharmaceuticals and TPs in WWTPs and receiving rivers by using suspect and non-target analysis as well as target analysis. Results showed identification of 113 pharmaceuticals and 399 TPs, including mammalian metabolites (n = 100), environmental microbial degradation products (n = 250), photodegradation products (n = 44) and hydrolysis products (n = 5). The predominant parent pharmaceuticals (n = 37) and transformation products (n = 68) were mainly derived from antimicrobials, accounting for 32.7 % and 17.0 %, respectively. The identified compounds were found in the influent (387-428) and effluent (227-400) of WWTPs, as well as upstream (290-451) and downstream (322-416) of receiving rivers, most predominantly from antimicrobials, followed by analgesic and antipyretic drugs. A total of 399 identified TPs were transformed by 110 pathways, of which the oxidation reaction was predominant (27.0 %), followed by photodegradation reaction (10.7 %). Of the 399 TPs, 49 (with lower PNECs) were predicted to be more toxic than their parents. Compounds with potential high risks (hazard quotient >1 and risk index (RI) > 0.1) were found in the WWTP influent (126), effluent (53) and river (61), and the majority were from the antimicrobial and antihypertensive classes. In particular, the potential risks (RI) of TPs from roxithromycin and irbesartan were found higher than those for their corresponding parents. The findings from this study highlight the need to monitor TPs from pharmaceuticals in the environment.
Affiliation(s)
- Yu Han
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China
- Li-Xin Hu
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China.
- Ting Liu
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China
- Liang-Li Dong
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China
- You-Sheng Liu
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China
- Jian-Liang Zhao
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China
- Guang-Guo Ying
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety & MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; School of Environment, South China Normal University, University Town, Guangzhou 510006, China.
8
Nash W, Ngere JB, Najdekr L, Dunn WB. Characterization of Electrospray Ionization Complexity in Untargeted Metabolomic Studies. Anal Chem 2024; 96:10935-10942. [PMID: 38917347 PMCID: PMC11238156 DOI: 10.1021/acs.analchem.4c00966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 05/31/2024] [Accepted: 06/11/2024] [Indexed: 06/27/2024]
Abstract
The annotation of metabolites detected in LC-MS-based untargeted metabolomics studies routinely applies accurate m/z of the intact metabolite (MS1) as well as chromatographic retention time and MS/MS data. Electrospray ionization and transfer of ions through the mass spectrometer can result in the generation of multiple "features" derived from the same metabolite with different m/z values but the same retention time. The complexity of the different charged and neutral adducts, in-source fragments, and charge states has not previously been characterized in depth. In this paper, we report the first large-scale characterization using publicly available data sets derived from different research groups, instrument manufacturers, LC assays, sample types, and ion modes. 271 m/z differences relating to different metabolite feature pairs were reported, and 209 were annotated. The results show a wide range of different features being observed, with only a core of 32 m/z differences reported in >50% of the data sets investigated. There were no patterns reporting specific m/z differences that were observed in relation to ion mode, instrument manufacturer, LC assay type, and mammalian sample type, although some m/z differences were related to study group (mammal, microbe, plant) and mobile phase composition. The results provide the metabolomics community with recommendations of adducts, in-source fragments, and charge states to apply in metabolite annotation workflows.
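The idea of pairing features that share a retention time and annotating their m/z difference can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' workflow: the function name, the tolerances, the two-entry lookup table, and the example feature list are assumptions. The two mass differences used are the standard values for sodium versus proton adduct exchange and for one 13C isotope.

```python
from itertools import combinations

# Small illustrative lookup of known m/z differences (Da).
KNOWN_DIFFS = {
    "Na-H": 21.981942,         # [M+Na]+ versus [M+H]+
    "13C isotope": 1.003355,   # one 13C replacing 12C
}


def pair_features(features, rt_tol=0.05, mz_tol=0.005):
    """Pair features (mz, rt) that co-elute within rt_tol minutes and
    label each pair's m/z difference when it matches KNOWN_DIFFS.

    Returns a list of (mz_low, mz_high, difference, label or None).
    """
    pairs = []
    # Sorting by m/z keeps every reported difference non-negative.
    for (mz1, rt1), (mz2, rt2) in combinations(sorted(features), 2):
        if abs(rt1 - rt2) > rt_tol:
            continue  # not co-eluting, so unlikely to share a metabolite
        diff = mz2 - mz1
        label = next(
            (name for name, known in KNOWN_DIFFS.items()
             if abs(diff - known) <= mz_tol),
            None,
        )
        pairs.append((mz1, mz2, round(diff, 4), label))
    return pairs


# Hypothetical feature list: three co-eluting glucose-like ions
# and one unrelated feature at a different retention time.
features = [(181.0707, 2.10), (203.0526, 2.11), (182.0741, 2.10), (300.0, 5.0)]
pairs = pair_features(features)
```

In this example the unrelated feature at 5.0 minutes is never paired, and one of the three co-eluting pairs stays unlabeled because its difference is not in the lookup table. A real annotation workflow would use a much larger table of adducts, in-source fragments, and charge states.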
Affiliation(s)
- William J. Nash
- School of Biosciences, University of Birmingham, Birmingham, West Midlands B15 2TT, United Kingdom
- Judith B. Ngere
- School of Biosciences, University of Birmingham, Birmingham, West Midlands B15 2TT, United Kingdom
- Lukas Najdekr
- Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
- Warwick B. Dunn
- School of Biosciences, University of Birmingham, Birmingham, West Midlands B15 2TT, United Kingdom
- Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular, and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
9
Anderson BG, Raskind A, Hissong R, Dougherty MK, McGill SK, Gulati AS, Theriot CM, Kennedy RT, Evans CR. Offline Two-Dimensional Liquid Chromatography-Mass Spectrometry for Deep Annotation of the Fecal Metabolome Following Fecal Microbiota Transplantation. J Proteome Res 2024. [PMID: 38752739 DOI: 10.1021/acs.jproteome.4c00022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2024]
Abstract
Biological interpretation of untargeted LC-MS-based metabolomics data depends on accurate compound identification, but current techniques fall short of identifying most features that can be detected. The human fecal metabolome is complex, variable, incompletely annotated, and serves as an ideal matrix to evaluate novel compound identification methods. We devised an experimental strategy for compound annotation using multidimensional chromatography and semiautomated feature alignment and applied these methods to study the fecal metabolome in the context of fecal microbiota transplantation (FMT) for recurrent C. difficile infection. Pooled fecal samples were fractionated using semipreparative liquid chromatography and analyzed by an orthogonal LC-MS/MS method. The resulting spectra were searched against commercial, public, and local spectral libraries, and annotations were vetted using retention time alignment and prediction. Multidimensional chromatography yielded more than a 2-fold improvement in identified compounds compared to conventional LC-MS/MS and successfully identified several rare and previously unreported compounds, including novel fatty-acid conjugated bile acid species. Using an automated software-based feature alignment strategy, most metabolites identified by the new approach could be matched to features that were detected but not identified in single-dimensional LC-MS/MS data. Overall, our approach represents a powerful strategy to enhance compound identification and biological insight from untargeted metabolomics data.
Affiliation(s)
- Brady G Anderson
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Michigan Compound Identification Development Core, University of Michigan, Ann Arbor, Michigan 48109, United States
- Alexander Raskind
- Michigan Compound Identification Development Core, University of Michigan, Ann Arbor, Michigan 48109, United States
- Biomedical Research Core Facilities, University of Michigan, Ann Arbor, Michigan 48109, United States
- Rylan Hissong
- Michigan Compound Identification Development Core, University of Michigan, Ann Arbor, Michigan 48109, United States
- Biomedical Research Core Facilities, University of Michigan, Ann Arbor, Michigan 48109, United States
- Michael K Dougherty
- Department of Medicine, Division of Gastroenterology and Hepatology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Sarah K McGill
- Department of Medicine, Division of Gastroenterology and Hepatology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Ajay S Gulati
- Department of Medicine, Division of Gastroenterology and Hepatology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Casey M Theriot
- Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, North Carolina 27606, United States
| | - Robert T Kennedy
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Michigan Compound Identification Development Core, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Pharmacology, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Charles R Evans
- Michigan Compound Identification Development Core, University of Michigan, Ann Arbor, Michigan 48109, United States
- Biomedical Research Core Facilities, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
10
|
Zhao Y, Lai Y, Konijnenberg H, Huerta JM, Vinagre-Aragon A, Sabin JA, Hansen J, Petrova D, Sacerdote C, Zamora-Ros R, Pala V, Heath AK, Panico S, Guevara M, Masala G, Lill CM, Miller GW, Peters S, Vermeulen R. Association of Coffee Consumption and Prediagnostic Caffeine Metabolites With Incident Parkinson Disease in a Population-Based Cohort. Neurology 2024; 102:e209201. [PMID: 38513162 PMCID: PMC11175631 DOI: 10.1212/wnl.0000000000209201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 12/14/2023] [Indexed: 03/23/2024] Open
Abstract
BACKGROUND AND OBJECTIVES Inverse associations between caffeine intake and Parkinson disease (PD) have been frequently implicated in human studies. However, no studies have quantified biomarkers of caffeine intake years before PD onset and investigated whether and which caffeine metabolites are related to PD.
METHODS Associations between self-reported total coffee consumption and future PD risk were examined in the EPIC4PD study, a prospective population-based cohort including 6 European countries. Cases with PD were identified through medical records and reviewed by expert neurologists. Hazard ratios (HRs) and 95% CIs for coffee consumption and PD incidence were estimated using Cox proportional hazards models. A case-control study nested within the EPIC4PD was conducted, recruiting cases with incident PD and matching each case with a control by age, sex, study center, and fasting status at blood collection. Caffeine metabolites were quantified by high-resolution mass spectrometry in baseline collected plasma samples. Using conditional logistic regression models, odds ratios (ORs) and 95% CIs were estimated for caffeine metabolites and PD risk.
RESULTS In the EPIC4PD cohort (comprising 184,024 individuals), the multivariable-adjusted HR comparing the highest coffee intake with nonconsumers was 0.63 (95% CI 0.46-0.88, p = 0.006). In the nested case-control study, which included 351 cases with incident PD and 351 matched controls, prediagnostic caffeine and its primary metabolites, paraxanthine and theophylline, were inversely associated with PD risk. The ORs were 0.80 (95% CI 0.67-0.95, p = 0.009), 0.82 (95% CI 0.69-0.96, p = 0.015), and 0.78 (95% CI 0.65-0.93, p = 0.005), respectively. Adjusting for smoking and alcohol consumption did not substantially change these results.
DISCUSSION This study demonstrates that the neuroprotection of coffee on PD is attributed to caffeine and its metabolites by detailed quantification of plasma caffeine and its metabolites years before diagnosis.
Affiliation(s)
- Yujia Zhao
- From the Institute for Risk Assessment Sciences (Y.Z., H.K., S. Peters, R.V.), Utrecht University, the Netherlands; Department of Environmental Health Sciences (Y.L., G.W.M.), Mailman School of Public Health, Columbia University, New York, NY; Department of Epidemiology (J.M.H.), Murcia Regional Health Council-IMIB, Murcia; CIBER Epidemiología y Salud Pública (CIBERESP) (J.M.H., M.G.), Madrid; Movement Disorders Unit (A.V.-A.), Department of Neurology, University Hospital Donostia; BioDonostia Health Research Institute (A.V.-A.), Neurodegenerative Diseases Area, San Sebastián, Spain; Division of Cancer Epidemiology (J.A.S.), German Cancer Research Center (DKFZ), Heidelberg, Germany; Danish Cancer Institute (J.H.), Danish Cancer Society, Copenhagen, Denmark; Escuela Andaluza de Salud Pública (EASP) (D.P.); Instituto de Investigación Biosanitaria-ibs.GRANADA (D.P.), Granada; Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP) (D.P.), Madrid, Spain; Unit of Cancer Epidemiology (C.S.), Città della Salute e della Scienza University-Hospital, Turin, Italy; Unit of Nutrition and Cancer (R.Z.-R.), Cancer Epidemiology Research Programme, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain; Epidemiology and Prevention Unit (V.P.), Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Italy; Department of Epidemiology and Biostatistics (A.K.H., M.G.), School of Public Health, Imperial College London, United Kingdom; School of Medicine (S. Panico), Federico II University, Naples, Italy; de Salud Pública y Laboral de Navarra (M.G.), Pamplona; Navarra Institute for Health Research (IdiSNA) (M.G.), Pamplona, Spain; Institute for Cancer Research (G.M.), Prevention and Clinical Network (ISPRO), Florence, Italy; Institute of Epidemiology and Social Medicine (C.M.L.), University of Münster, Germany; Ageing Epidemiology Research Unit (AGE) (C.M.L.), School of Public Health, Imperial College London, United Kingdom; and University Medical Centre Utrecht (R.V.), the Netherlands
| | - Yunjia Lai
| | - Hilde Konijnenberg
| | - José María Huerta
| | - Ana Vinagre-Aragon
| | - Jara Anna Sabin
| | - Johnni Hansen
| | - Dafina Petrova
| | - Carlotta Sacerdote
| | - Raul Zamora-Ros
| | - Valeria Pala
| | - Alicia K Heath
| | - Salvatore Panico
| | - Marcela Guevara
- Giovanna Masala
- Christina M Lill
- Gary W Miller
- Susan Peters
- Roel Vermeulen
11
Song D, Tang T, Wang R, Liu H, Xie D, Zhao B, Dang Z, Lu G. Enhancing compound confidence in suspect and non-target screening through machine learning-based retention time prediction. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 347:123763. [PMID: 38492749 DOI: 10.1016/j.envpol.2024.123763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 02/26/2024] [Accepted: 03/09/2024] [Indexed: 03/18/2024]
Abstract
The retention time (RT) of contaminants of emerging concern (CECs) in liquid chromatography-high-resolution mass spectrometry (LC-HRMS) is crucial for database matching in non-targeted screening (NTS) analysis. In this study, we developed a machine learning (ML) model to predict RTs of CECs in NTS analysis. Using 1051 CEC standards, we evaluated Random Forest (RF), XGBoost, Support Vector Regression (SVR), and Artificial Neural Network (ANN) with molecular fingerprints and chemical descriptors to establish an optimal model. The SVR model utilizing chemical descriptors resulted in good predictive capacity with R²ext = 0.850 and r² = 0.925. The model was further validated through laboratory NTS compound characterization. When applied to examine CEC occurrence in a large wastewater treatment plant, we identified 40 level S1 CECs (confirmed structure by reference standard) and 234 level S2 compounds (probable structure by library spectrum match). The model predicted RTs for level S2 compounds, leading to the classification of 153 level S2 compounds with high confidence (ΔRT < 2 min). The model served as a robust filtering mechanism within the analytical framework. This study emphasizes the importance of predicted RTs in NTS analysis and highlights the potential of prediction models. Our research introduces a workflow that enhances NTS analysis by utilizing RT prediction models to determine compound confidence levels.
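The ΔRT < 2 min filtering step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the `Candidate` record, its field names, the `filter_by_delta_rt` helper, and the example values are all hypothetical, and the predicted RTs are assumed to come from some previously fitted model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # hypothetical compound label
    observed_rt: float   # measured retention time, minutes
    predicted_rt: float  # model-predicted retention time, minutes


def filter_by_delta_rt(candidates, max_delta_min=2.0):
    """Split candidates into high- and low-confidence lists.

    A candidate is treated as high confidence when the absolute difference
    between observed and predicted RT is below max_delta_min (assumed 2 min).
    """
    high, low = [], []
    for c in candidates:
        delta = abs(c.observed_rt - c.predicted_rt)
        (high if delta < max_delta_min else low).append(c)
    return high, low


# Made-up example values, for illustration only.
candidates = [
    Candidate("compound_A", 5.10, 5.90),
    Candidate("compound_B", 12.40, 9.75),
    Candidate("compound_C", 3.00, 4.50),
]
high, low = filter_by_delta_rt(candidates)
```

In this toy input, `compound_B` falls outside the 2 min window and would be kept only as a lower-confidence annotation.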
Affiliation(s)
- Dehao Song
- School of Environment and Energy, South China University of Technology, Guangzhou, 510006, China
- Ting Tang
- School of Environment and Energy, South China University of Technology, Guangzhou, 510006, China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou, 510006, China
- Rui Wang
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou, 510655, China; Guangxi Key Laboratory of Emerging Contaminants Monitoring, Early Warning and Environmental Health Risk Assessment, Nanning, 530000, China
- He Liu
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou, 510655, China; Guangxi Key Laboratory of Emerging Contaminants Monitoring, Early Warning and Environmental Health Risk Assessment, Nanning, 530000, China
- Danping Xie
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou, 510655, China; Guangxi Key Laboratory of Emerging Contaminants Monitoring, Early Warning and Environmental Health Risk Assessment, Nanning, 530000, China
- Bo Zhao
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment, Guangzhou, 510655, China; Guangxi Key Laboratory of Emerging Contaminants Monitoring, Early Warning and Environmental Health Risk Assessment, Nanning, 530000, China
- Zhi Dang
- School of Environment and Energy, South China University of Technology, Guangzhou, 510006, China; Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Recycling, South China University of Technology, Guangzhou, 510006, China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou, 510006, China
- Guining Lu
- School of Environment and Energy, South China University of Technology, Guangzhou, 510006, China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou, 510006, China
12
Obradović D, Stavrianidi A, Fedorova E, Bogojević A, Shpigun O, Buryak A, Lazović S. A comparative study of the predictive performance of different descriptor calculation tools: Molecular-based elution order modeling and interpretation of retention mechanism for isomeric compounds from METLIN database. J Chromatogr A 2024; 1719:464731. [PMID: 38377661 DOI: 10.1016/j.chroma.2024.464731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/08/2024] [Accepted: 02/09/2024] [Indexed: 02/22/2024]
Abstract
In the pharmaceutical industry, the need for analytical standards is a bottleneck for comprehensive evaluation and quality control of intermediate and end products. These are complex mixtures containing structurally related molecules. In this regard, chromatographic peak annotation, especially for critical pairs of isomers and closest structural analogs, can be supported by using a Quantitative Structure Retention Relationship (QSRR) approach. In our study, we investigated the fundamental basis of the reversed-phase (RP) retention mechanism for 1141 isomeric compounds from the METLIN SMRT dataset. Nine different descriptor calculation tools combined with different feature selection methods (genetic algorithm (GA), stepwise, Boruta) and machine learning (ML) approaches (support vector machine (SVM), multiple linear regression (MLR), random forest (RF), XGBoost) were applied to provide a reliable molecular structure-based interpretation of RP retention behaviour of the isomeric compounds. Strict internal and external validation metrics were used to select models with the best predictive capabilities (rtest > 0.73, order of elution > 60 %). For the developed models, mean absolute errors were in the range of 60 to 110 s. Stepwise and GA showed the most suitable performance as descriptor selection methods, while SVM and XGBoost modeling gave satisfactory predictive characteristics in most cases. Validation performed on the published experimental data for structurally related pharmaceutical compounds confirmed the best accuracy of MLR modeling in combination with GA feature selection of general physico-chemical properties. The resulting models will be useful for the prediction of separation and identification of structurally related compounds in pharmaceutical analysis, providing a simultaneous understanding of the interaction mechanisms leading to their retention under RP conditions.
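The two validation metrics quoted in this abstract, mean absolute error and the percentage of correctly predicted elution order, can be sketched as below. This is an illustration under assumptions: the abstract does not define how "order of elution" is scored, so the pairwise definition used here is a guess, and the function names and retention times are hypothetical.

```python
from itertools import combinations


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted RTs."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def elution_order_accuracy(observed, predicted):
    """Fraction of compound pairs whose predicted order matches the observed order.

    Assumed definition: every pair with distinct observed RTs is checked once.
    """
    pairs = [
        (i, j)
        for i, j in combinations(range(len(observed)), 2)
        if observed[i] != observed[j]
    ]
    correct = sum(
        1
        for i, j in pairs
        if (observed[i] < observed[j]) == (predicted[i] < predicted[j])
    )
    return correct / len(pairs)


# Made-up retention times in seconds, for illustration only.
observed_s = [300.0, 420.0, 510.0, 660.0]
predicted_s = [380.0, 350.0, 600.0, 700.0]

mae = mean_absolute_error(observed_s, predicted_s)
order_acc = elution_order_accuracy(observed_s, predicted_s)
```

A pairwise score of this kind rewards a model that ranks isomers correctly even when its absolute RT error is large, which is why the two metrics are usually reported together.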
Affiliation(s)
- Darija Obradović
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
- Andrey Stavrianidi
- Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia; A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Elizaveta Fedorova
- A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Aleksandar Bogojević
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
- Oleg Shpigun
- Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia
- Aleksey Buryak
- A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
- Saša Lazović
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
13
Leporino M, Rouphael Y, Bonini P, Colla G, Cardarelli M. Protein hydrolysates enhance recovery from drought stress in tomato plants: phenomic and metabolomic insights. FRONTIERS IN PLANT SCIENCE 2024; 15:1357316. [PMID: 38533405 PMCID: PMC10963501 DOI: 10.3389/fpls.2024.1357316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Accepted: 02/09/2024] [Indexed: 03/28/2024]
Abstract
Introduction: High-throughput phenotyping technologies together with metabolomics analysis can speed up the development of highly efficient and effective biostimulants for enhancing crop tolerance to drought stress. The aim of this study was to examine the morphophysiological and metabolic changes in tomato plants foliarly treated with two protein hydrolysates obtained by enzymatic hydrolysis of vegetal proteins from Malvaceae (PH1) or Fabaceae (PH2) in comparison with a control treatment, as well as to investigate the mechanisms involved in the enhancement of plant resistance to repeated drought stress cycles. Methods: A phenotyping device was used for daily monitoring of morphophysiological traits, while untargeted metabolomics analysis was carried out on leaves of the best-performing treatment based on phenotypic results. Results: PH1 treatment was the most effective in enhancing plant resistance to water stress due to the better recovery of digital biomass and 3D leaf area after each water stress event, while PH2 was effective in mitigating water stress only during the recovery period after the first drought stress event. Metabolomics data indicated that PH1 modified primary metabolism by increasing the concentration of dipeptides and fatty acids in comparison with untreated control, as well as secondary metabolism by regulating several compounds like phenols. In contrast, hormones and compounds involved in detoxification or signal molecules against reactive oxygen species were downregulated in comparison with untreated control. Conclusion: The above findings demonstrated the advantages of a combined phenomics-metabolomics approach for elucidating the relationship between metabolic and morphophysiological changes associated with a biostimulant-mediated increase of crop resistance to repeated water stress events.
Affiliation(s)
- Marzia Leporino
- Department of Agriculture and Forest Sciences, University of Tuscia, Viterbo, Italy
- Youssef Rouphael
- Department of Agricultural Sciences at the University of Naples, Portici, Italy
- Paolo Bonini
- oloBion SL, Barcelona, Spain
- Arcadia s.r.l., Rivoli Veronese, Italy
- Giuseppe Colla
- Department of Agriculture and Forest Sciences, University of Tuscia, Viterbo, Italy
- Arcadia s.r.l., Rivoli Veronese, Italy
14
Zhang Y, Liu F, Li XQ, Gao Y, Li KC, Zhang QH. Generic and accurate prediction of retention times in liquid chromatography by post-projection calibration. Commun Chem 2024; 7:54. [PMID: 38459241 PMCID: PMC10923921 DOI: 10.1038/s42004-024-01135-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 02/21/2024] [Indexed: 03/10/2024]
Abstract
Retention time predictions from molecule structures in liquid chromatography (LC) are increasingly used in MS-based targeted and untargeted analyses, providing supplementary evidence for molecule annotation and reducing experimental measurements. Nevertheless, different LC setups (e.g., differences in gradient, column, and/or mobile phase) give rise to many prediction models that can only accurately predict retention times for a specific chromatographic method (CM). Here, a generic and accurate method is presented to predict retention times across different CMs, by introducing the concept of post-projection calibration. This concept builds on the direct projections of retention times between different CMs and uses 35 external calibrants to eliminate the impact of LC setups on projection accuracy. Results showed that post-projection calibration consistently achieved a median projection error below 3.2% of the elution time. The ranking results of putative candidates reached similar levels among different CMs. This work opens up broad possibilities for coordinating retention times between different laboratories and developing extensive retention databases.
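The general idea of mapping retention times between two chromatographic methods with shared calibrants can be sketched as follows. This is a generic calibrant-based interpolation, not the post-projection calibration algorithm of the paper, whose details are not given in the abstract. The function names and calibrant values are hypothetical, and the sketch assumes at least two calibrants with distinct retention times on method A.

```python
from bisect import bisect_right
from statistics import median


def project_rt(rt_a, calibrants):
    """Map an RT from method A to method B.

    calibrants is a list of (rt_on_A, rt_on_B) pairs for compounds measured
    on both methods. The mapping is piecewise linear through the calibrants;
    values outside their range are extrapolated from the nearest segment.
    """
    pts = sorted(calibrants)
    xs = [x for x, _ in pts]
    # Index of the segment whose left endpoint is at or below rt_a, clamped.
    i = min(max(bisect_right(xs, rt_a) - 1, 0), len(pts) - 2)
    (x0, y0), (x1, y1) = pts[i], pts[i + 1]
    return y0 + (rt_a - x0) * (y1 - y0) / (x1 - x0)


def median_relative_error_pct(true_rts, projected_rts):
    """Median of |projected - true| / true, expressed in percent."""
    return median(abs(p - t) / t * 100 for t, p in zip(true_rts, projected_rts))


# Made-up calibrant RTs in minutes, for illustration only.
calibrants = [(1.0, 2.0), (5.0, 8.0), (10.0, 12.0)]
projected = [project_rt(rt, calibrants) for rt in (3.0, 7.5)]
error_pct = median_relative_error_pct([5.0, 10.0], [5.0, 10.5])
```

The median relative error computed here is the same kind of figure as the "below 3.2% of the elution time" result quoted above, although the paper's exact error definition may differ.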
Affiliation(s)
- Yan Zhang
- Key Laboratory of Groundwater Conservation of MWR, China University of Geosciences, Beijing, 100083, People's Republic of China
- Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing, 100029, People's Republic of China
- Key Laboratory of Chemical Metrology and Applications on Nutrition and Health for State Market Regulation, Beijing, 100029, China
- Fei Liu
- Key Laboratory of Groundwater Conservation of MWR, China University of Geosciences, Beijing, 100083, People's Republic of China
- Xiu Qin Li
- Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing, 100029, People's Republic of China
- Key Laboratory of Chemical Metrology and Applications on Nutrition and Health for State Market Regulation, Beijing, 100029, China
- Yan Gao
- Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing, 100029, People's Republic of China
- Key Laboratory of Chemical Metrology and Applications on Nutrition and Health for State Market Regulation, Beijing, 100029, China
- Kang Cong Li
- Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing, 100029, People's Republic of China
- Key Laboratory of Chemical Metrology and Applications on Nutrition and Health for State Market Regulation, Beijing, 100029, China
- Qing He Zhang
- Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing, 100029, People's Republic of China
- Key Laboratory of Chemical Metrology and Applications on Nutrition and Health for State Market Regulation, Beijing, 100029, China
15
Xue J, Wang B, Ji H, Li W. RT-Transformer: retention time prediction for metabolite annotation to assist in metabolite identification. Bioinformatics 2024; 40:btae084. [PMID: 38402516 PMCID: PMC10914443 DOI: 10.1093/bioinformatics/btae084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 01/14/2024] [Accepted: 02/22/2024] [Indexed: 02/26/2024]
Abstract
Motivation: Liquid chromatography retention time prediction can assist in metabolite identification, which is a critical task and challenge in nontargeted metabolomics. However, different chromatographic conditions may result in different retention times for the same metabolite. Current retention time prediction methods lack sufficient scalability to transfer from one specific chromatographic method to another. Results: Therefore, we present RT-Transformer, a novel deep neural network model coupled with graph attention network and 1D-Transformer, which can predict retention times under any chromatographic method. First, we obtain a pre-trained model by training RT-Transformer on the large small molecule retention time dataset containing 80 038 molecules, and then transfer the resulting model to different chromatographic methods based on transfer learning. When tested on the small molecule retention time dataset, as other authors did, the mean absolute error reached 27.30 after removing non-retained molecules and 33.41 when no samples were removed. The pre-trained RT-Transformer was further transferred to 5 datasets corresponding to different chromatographic conditions and fine-tuned. According to the experimental results, RT-Transformer achieves competitive performance compared to state-of-the-art methods. In addition, RT-Transformer was applied to 41 external molecular retention time datasets. Extensive evaluations indicate that RT-Transformer has excellent scalability in predicting retention times for liquid chromatography and improves the accuracy of metabolite identification. Availability and implementation: The source code for the model is available at https://github.com/01dadada/RT-Transformer. The web server is available at https://huggingface.co/spaces/Xue-Jun/RT-Transformer.
Affiliation(s)
- Jun Xue
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650500, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Bingyi Wang
- Yunnan Police College, Kunming, Yunnan 650223, China
- Key Laboratory of Smart Drugs Control (Yunnan Police College), Ministry of Education, Kunming, Yunnan 650223, China
- Hongchao Ji
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- WeiHua Li
- School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650500, China
16
Bland GD, Abrahamsson D, Wang M, Zlatnik MG, Morello-Frosch R, Park JS, Sirota M, Woodruff TJ. Exploring applications of non-targeted analysis in the characterization of the prenatal exposome. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 912:169458. [PMID: 38142008 PMCID: PMC10947484 DOI: 10.1016/j.scitotenv.2023.169458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 12/15/2023] [Accepted: 12/15/2023] [Indexed: 12/25/2023]
Abstract
Capturing the breadth of chemical exposures in utero is critical in understanding their long-term health effects for mother and child. We explored methodological adaptations in a Non-Targeted Analysis (NTA) pipeline and evaluated the effects on chemical annotation and discovery for maternal and infant exposure. We focus on lesser-known/underreported chemicals in maternal and umbilical cord serum analyzed with liquid chromatography-quadrupole time-of-flight mass spectrometry (LC-QTOF/MS). The samples were collected from a demographically diverse cohort of 296 maternal-cord pairs (n = 592) recruited in the San Francisco Bay Area. We developed and evaluated two data processing pipelines, primarily differing by detection frequency cut-off, to extract chemical features from the NTA data. We annotated the detected chemical features by matching with EPA CompTox Chemicals Dashboard (n = 860,000 chemicals) and Human Metabolome Database (n = 3140 chemicals) and applied a Kendrick Mass Defect filter to detect homologous series. We collected fragmentation spectra (MS/MS) on a subset of serum samples and matched them to an experimental MS/MS database within the MS-Dial website and to other experimental MS/MS spectra collected from standards in our lab. We annotated ~72 % of the features (total features = 32,197, levels 1-4). We confirmed 22 compounds with analytical standards, tentatively identified 88 compounds with MS/MS spectra, and annotated 4862 exogenous chemicals with an in-house developed annotation algorithm. We detected 36 chemicals that appear not to have been previously reported in human blood and 9 chemicals that were reported in fewer than five studies. Our findings underline the importance of NTA in the discovery of lesser-known/unreported chemicals important to characterize human exposures.
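The Kendrick Mass Defect (KMD) filter mentioned in this abstract can be illustrated with a short sketch. It assumes CH2 as the repeating unit and one common sign convention (rounded Kendrick mass minus Kendrick mass); the abstract does not state which unit or convention the authors used. The function names, the rounding-based grouping, and the example masses are hypothetical.

```python
from collections import defaultdict

CH2_EXACT = 14.01565  # exact mass of CH2 on the IUPAC scale
CH2_NOMINAL = 14.0    # CH2 is defined as exactly 14 on the Kendrick scale


def kendrick_mass_defect(mass):
    """KMD for a CH2 base: rounded Kendrick mass minus Kendrick mass."""
    kendrick_mass = mass * CH2_NOMINAL / CH2_EXACT
    return round(kendrick_mass) - kendrick_mass


def find_homologous_series(masses, decimals=3, min_members=3):
    """Group masses whose KMD agrees after rounding to `decimals` places.

    Binning by rounded KMD is a simplification: values close to a bin edge
    can be split, so a tolerance-based comparison may be preferable.
    """
    groups = defaultdict(list)
    for m in masses:
        groups[round(kendrick_mass_defect(m), decimals)].append(m)
    return [sorted(g) for g in groups.values() if len(g) >= min_members]


# Made-up masses: three members spaced by CH2 plus one unrelated feature.
base = 200.12
masses = [base, base + CH2_EXACT, base + 2 * CH2_EXACT, 250.05]
series = find_homologous_series(masses)
```

Members of a CH2 homologous series share the same KMD because each added CH2 shifts the Kendrick mass by exactly 14. A fuller filter would also check that grouped masses are spaced by multiples of the repeating unit.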
Affiliation(s)
- Garret D Bland
- Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, CA, United States
- Dimitri Abrahamsson
- Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, CA, United States
- Miaomiao Wang
- Department of Toxic Substances Control, California Environmental Protection Agency, Berkeley, CA, United States
- Marya G Zlatnik
- Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, CA, United States
- Rachel Morello-Frosch
- Department of Environmental Science, Policy and Management, School of Public Health, University of California Berkeley, Berkeley, CA, United States
- June-Soo Park
- Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, CA, United States; Department of Toxic Substances Control, California Environmental Protection Agency, Berkeley, CA, United States
- Marina Sirota
- Bakar Computational Health Sciences Institute, Department of Pediatrics, University of California San Francisco, San Francisco 94158, CA, United States
- Tracey J Woodruff
- Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California San Francisco, San Francisco, CA, United States
17
Balcells C, Xu Y, Gil-Solsona R, Maitre L, Gago-Ferrero P, Keun HC. Blurred lines: Crossing the boundaries between the chemical exposome and the metabolome. Curr Opin Chem Biol 2024; 78:102407. [PMID: 38086287 DOI: 10.1016/j.cbpa.2023.102407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/22/2023] [Accepted: 11/09/2023] [Indexed: 02/09/2024]
Abstract
The aetiology of every human disease lies in a combination of genetic and environmental factors, each contributing in varying proportions. While genomics investigates the former, a comparable holistic paradigm was proposed for environmental exposures in 2005, marking the onset of exposome research. Since then, the exposome definition has broadened to include a wide array of physical, chemical, and psychosocial factors that interact with the human body and potentially alter the epigenome, the transcriptome, the proteome, and the metabolome. The chemical exposome, deeply intertwined with the metabolome, includes all small molecules originating from diet as well as pharmaceuticals, personal care and consumer products, or pollutants in air and water. The set of techniques used to interrogate these exposures, primarily mass spectrometry and nuclear magnetic resonance spectroscopy, is also extensively used in metabolomics. Recent advances in untargeted metabolomics using high resolution mass spectrometry have paved the way for the development of methods able to provide in-depth characterisation of both the internal chemical exposome and the endogenous metabolome simultaneously. Herein we review the tools, databases, and workflows currently available for such work, and discuss how these can bridge the gap between the study of the metabolome and the exposome.
Affiliation(s)
- Cristina Balcells
  - Institute of Developmental and Reproductive Biology (IRDB), Division of Cancer, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK.
- Yitao Xu
  - Institute of Developmental and Reproductive Biology (IRDB), Division of Cancer, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
- Rubén Gil-Solsona
  - Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
- Léa Maitre
  - ISGlobal, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
- Pablo Gago-Ferrero
  - Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
- Hector C Keun
  - Institute of Developmental and Reproductive Biology (IRDB), Division of Cancer, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK.
18
Kehl N, Gessner A, Maas R, Fromm MF, Taudte RV. A supervised machine-learning approach for the efficient development of a multi method (LC-MS) for a large number of drugs and subsets thereof: focus on oral antitumor agents. Clin Chem Lab Med 2024; 62:293-302. [PMID: 37606251 DOI: 10.1515/cclm-2023-0468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 07/31/2023] [Indexed: 08/23/2023]
Abstract
OBJECTIVES: Accumulating evidence argues for a more widespread use of therapeutic drug monitoring (TDM) to support individualized medicine, especially for therapies where toxicity and efficacy are critical issues, such as in oncology. However, development of TDM assays struggles to keep pace with the rapid introduction of new drugs. Therefore, novel approaches for faster assay development are needed that also allow effortless inclusion of newly approved drugs as well as customization to smaller subsets if scientific or clinical situations require. METHODS: We applied and evaluated two machine-learning approaches, i.e., a regression-based approach and an artificial neural network (ANN), to retention time (RT) prediction for efficient development of a liquid chromatography mass spectrometry (LC-MS) method quantifying 73 oral antitumor drugs (OADs) and five active metabolites. Individual steps included training, evaluation, comparison, and application of the superior approach to RT prediction, followed by stipulation of the optimal gradient. RESULTS: Both approaches showed excellent results for RT prediction (mean difference ± standard deviation: 2.08 % ± 9.44 % ANN; 1.78 % ± 1.93 % regression-based approach). Using the regression-based approach, the optimum gradient (4.91 % MeOH/min) was predicted with a total run time of 17.92 min. The associated method was fully validated following FDA and EMA guidelines. Exemplary modification and application of the regression-based approach to a subset of 14 uro-oncological agents resulted in a considerably shortened run time of 9.29 min. CONCLUSIONS: Using a regression-based approach, a multi drug LC-MS assay for RT prediction was efficiently developed, which can be easily expanded to newly approved OADs and customized to smaller subsets if required.
Affiliation(s)
- Niklas Kehl
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Arne Gessner
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Renke Maas
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
  - FAU NeW - Research Center for New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Martin F Fromm
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
  - FAU NeW - Research Center for New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- R Verena Taudte
  - Institute of Experimental and Clinical Pharmacology and Toxicology, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
  - Core Facility for Metabolomics, Department of Medicine, Philipps-Universität Marburg, 35043 Marburg, Germany
19
Torigoe T, Takahashi M, Heravizadeh O, Ikeda K, Nakatani K, Bamba T, Izumi Y. Predicting Retention Time in Unified-Hydrophilic-Interaction/Anion-Exchange Liquid Chromatography High-Resolution Tandem Mass Spectrometry (Unified-HILIC/AEX/HRMS/MS) for Comprehensive Structural Annotation of Polar Metabolome. Anal Chem 2024; 96:1275-1283. [PMID: 38186224 DOI: 10.1021/acs.analchem.3c04618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The accuracy of the structural annotation of unidentified peaks obtained in metabolomic analysis using liquid chromatography/tandem mass spectrometry (LC/MS/MS) can be enhanced using retention time (RT) information as well as precursor and product ions. Unified-hydrophilic-interaction/anion-exchange liquid chromatography high-resolution tandem mass spectrometry (unified-HILIC/AEX/HRMS/MS) has been recently developed as an innovative method ideal for nontargeted polar metabolomics. However, the RT prediction for unified-HILIC/AEX has not been developed because of the complex separation mechanism characterized by the continuous transition of the separation modes from HILIC to AEX. In this study, we propose an RT prediction model of unified-HILIC/AEX/HRMS/MS, which enables the comprehensive structural annotation of polar metabolites. With training data for 203 polar metabolites, we ranked the feature importance using a random forest among 12,420 molecular descriptors (MDs) and constructed an RT prediction model with 26 selected MDs. The accuracy of the RT model was evaluated using test data for 51 polar metabolites, and 86.3% of the ΔRTs (difference between measured and predicted RTs) were within ±1.50 min, with a mean absolute error of 0.80 min, indicating high RT prediction accuracy. Nontargeted metabolomic data from the NIST SRM 1950-Metabolites in frozen human plasma were analyzed using the developed RT model and in silico MS/MS prediction, resulting in a successful structural estimation of 216 polar metabolites, in addition to the 62 identified based on standards. The proposed model can help accelerate the structural annotation of unknown hydrophilic metabolites, which is a key issue in metabolomic research.
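The two accuracy figures quoted in this abstract (mean absolute error and the share of ΔRTs inside a ±1.50 min window) can be computed as in the following minimal sketch. The function name and the example retention times are hypothetical; only the metric definitions and the 1.50 min window come from the abstract.

```python
def rt_prediction_metrics(measured, predicted, window=1.5):
    """Return (mean absolute error, fraction of |delta RT| <= window).

    `measured` and `predicted` are equal-length sequences of retention
    times in minutes; `window` is the acceptance half-width in minutes."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    # Delta RT is defined as measured minus predicted, as in the abstract.
    deltas = [m - p for m, p in zip(measured, predicted)]
    mae = sum(abs(d) for d in deltas) / len(deltas)
    within = sum(abs(d) <= window for d in deltas) / len(deltas)
    return mae, within


# Made-up retention times (min) for four test compounds.
mae, within = rt_prediction_metrics([5.0, 10.0, 15.0, 20.0],
                                    [5.5, 9.0, 17.0, 20.0])
print(f"MAE = {mae:.3f} min, within window = {within:.0%}")
```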
Affiliation(s)
- Taihei Torigoe
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Masatomo Takahashi
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
  - Division of Metabolomics/Mass Spectrometry Center, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Omidreza Heravizadeh
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Kazuki Ikeda
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Kohta Nakatani
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
  - Division of Metabolomics/Mass Spectrometry Center, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Takeshi Bamba
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
  - Division of Metabolomics/Mass Spectrometry Center, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
- Yoshihiro Izumi
  - Department of Systems Life Sciences, Graduate School of Systems Life Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
  - Division of Metabolomics/Mass Spectrometry Center, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan
20
Parinet J, Makni Y, Diallo T, Guérin T. Liquid chromatographic retention time prediction models to secure and improve the feature annotation process in high-resolution mass spectrometry. Talanta 2024; 267:125214. [PMID: 37734288 DOI: 10.1016/j.talanta.2023.125214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/07/2023] [Accepted: 09/14/2023] [Indexed: 09/23/2023]
Abstract
The development of quantitative structure-retention relationship (QSRR) models has, until recently, required an adequate selection of molecular descriptors necessarily obtained based on a known chemical structure. However, these complex descriptors are not always available nor calculable when the high-resolution mass spectrometry (HRMS) annotation process is underway. Depending on the level of annotation, many structures or even various molecular formulas could be candidates. To secure and improve the annotation process and to save time, a QSRR model (using only 0D molecular descriptors) to predict retention times in reverse-phase liquid chromatography (RPLC) based on the molecular formula was developed, and a general QSRR annotation-based methodology was also proposed.
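To make the idea of 0D descriptors concrete, the sketch below derives a few of them (element counts, total atom count, average molecular weight) from nothing but a molecular formula string. The descriptor names, the element table, and the parser are illustrative assumptions; the abstract does not list which 0D descriptors the published QSRR model uses.

```python
import re

# Standard average atomic masses for a few common elements (assumed subset).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
                "S": 32.06, "P": 30.974, "Cl": 35.45}


def formula_descriptors(formula):
    """Compute simple 0D descriptors from a flat formula such as 'C8H10N4O2'.

    Limitation: parentheses, charges and isotope labels are not handled."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count (e.g. the 'N' in 'CHN') means one atom.
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    unknown = sorted(set(counts) - set(AVERAGE_MASS))
    if unknown:
        raise ValueError(f"no mass tabulated for: {unknown}")
    descriptors = {f"n_{element}": n for element, n in counts.items()}
    descriptors["n_atoms"] = sum(counts.values())
    descriptors["mol_weight"] = sum(AVERAGE_MASS[e] * n for e, n in counts.items())
    return descriptors


print(formula_descriptors("C8H10N4O2"))  # caffeine
```

Because such descriptors need no structure, they can be computed for every candidate formula during annotation and passed to whatever regression model is trained on them.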
Affiliation(s)
- Julien Parinet
  - ANSES, Laboratory for Food Safety, 94701, Maisons-Alfort, France.
- Yassine Makni
  - ANSES, Laboratory for Food Safety, 94701, Maisons-Alfort, France
- Thierno Diallo
  - ANSES, Laboratory for Food Safety, 94701, Maisons-Alfort, France
- Thierry Guérin
  - ANSES, Strategy and Programmes Department, 94701, Maisons-Alfort, France
21
Cardarelli M, El Chami A, Rouphael Y, Ciriello M, Bonini P, Erice G, Cirino V, Basile B, Corrado G, Choi S, Kim HJ, Colla G. Plant biostimulants as natural alternatives to synthetic auxins in strawberry production: physiological and metabolic insights. FRONTIERS IN PLANT SCIENCE 2024; 14:1337926. [PMID: 38264017 PMCID: PMC10803581 DOI: 10.3389/fpls.2023.1337926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 12/12/2023] [Indexed: 01/25/2024]
Abstract
The demand for high-quality strawberries continues to grow, emphasizing the need for innovative agricultural practices to enhance both yield and fruit quality. In this context, the utilization of natural products, such as biostimulants, has emerged as a promising avenue for improving strawberry production while aligning with sustainable and eco-friendly agricultural approaches. This study explores the influence of a bacterial filtrate (BF), a vegetal-derived protein hydrolysate (PH), and a standard synthetic auxin (SA) on strawberry, investigating their effects on yield, fruit quality, mineral composition and metabolomics of leaves and fruits. The agronomic trial revealed that SA and BF significantly enhanced early fruit yield due to their positive influence on flowering and fruit set, while PH treatment favored a gradual and prolonged fruit set, associated with an increased shoot biomass and sustained production. Fruit quality analysis showed that PH-treated fruits exhibited increased firmness and soluble solids content, whereas SA-treated fruits displayed lower firmness and soluble solids content. The ionomic analysis of leaves and fruits indicated that all treatments provided sufficient nutrients, with heavy metals within regulatory limits. Metabolomics indicated that PH stimulated primary metabolites, while SA and BF directly affected flavonoid and anthocyanin biosynthesis, and PH increased fruit quality through enhanced production of beneficial metabolites. This research offers valuable insights for optimizing strawberry production and fruit quality by harnessing the potential of natural biostimulants as a viable alternative to synthetic compounds.
Affiliation(s)
- Antonio El Chami
  - Department of Agriculture and Forest Sciences, University of Tuscia, Viterbo, Italy
- Youssef Rouphael
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Michele Ciriello
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Gorka Erice
  - Atens - Agrotecnologías Naturales, La Riera de Gaià, Spain
- Boris Basile
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Giandomenico Corrado
  - Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy
- Seunghyun Choi
  - Texas A&M AgriLife Research and Extension Center, Texas A&M University, Uvalde, TX, United States
- Hye-Ji Kim
  - Agri-tech and Food Innovation Department, Urban Food Solutions Division, Singapore Food Agency, Singapore, Singapore
- Giuseppe Colla
  - Department of Agriculture and Forest Sciences, University of Tuscia, Viterbo, Italy
22
Mahajan P, Fiehn O, Barupal D. IDSL.GOA: Gene Ontology Analysis for Interpreting Metabolomic datasets. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.25.534225. [PMID: 37034715 PMCID: PMC10081191 DOI: 10.1101/2023.03.25.534225] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 04/17/2023]
Abstract
Biological interpretation of metabolomic datasets often ends at a pathway analysis step to find the over-represented metabolic pathways in the list of statistically significant metabolites. However, definitions of biochemical pathways and metabolite coverage vary among different curated databases, leading to missed interpretations. For the lists of genes, transcripts and proteins, Gene Ontology (GO) term over-representation analysis has become a standardized approach for biological interpretation. However, GO analysis has not been achieved for metabolomic datasets. We present a new knowledgebase (KB) and online tool, Gene Ontology Analysis by the Integrated Data Science Laboratory for Metabolomics and Exposomics (IDSL.GOA), to conduct GO over-representation analysis for a metabolite list. The IDSL.GOA KB covers 2,393 metabolic GO terms and associated 3,144 genes, 1,492 EC annotations, and 2,621 metabolites. IDSL.GOA analysis of a case study of older vs young female brain cortex metabolome highlighted 82 GO terms as significantly overrepresented (FDR <0.05). We showed how IDSL.GOA identified key and relevant GO metabolic processes that were not yet covered in other pathway databases. Overall, we suggest that interpretation of metabolite lists should not be limited to pathway maps and can also leverage GO terms. IDSL.GOA provides a useful tool for this purpose, allowing for a more comprehensive and accurate analysis of metabolite pathway data. The IDSL.GOA tool can be accessed at https://goa.idsl.me/.
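Over-representation analysis with an FDR cut-off, as described in this abstract, is commonly carried out with a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment. The sketch below shows that generic recipe; whether IDSL.GOA uses exactly these statistics is not stated in the abstract, and the function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, selected, hits):
    """P(X >= hits): chance of drawing at least `hits` term-annotated
    metabolites when `selected` are sampled without replacement from
    `total` metabolites, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    tail = sum(comb(annotated, i) * comb(total - annotated, selected - i)
               for i in range(hits, upper + 1))
    return tail / comb(total, selected)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank 1 = smallest p-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example: 10 metabolites in the background, 3 annotated to a term,
# 3 significant metabolites, all 3 of them annotated.
p = hypergeom_enrichment_p(total=10, annotated=3, selected=3, hits=3)
print(p, benjamini_hochberg([p, 0.04, 0.03, 0.5]))
```

A term would then be reported as over-represented when its adjusted p-value falls below the chosen FDR threshold, for example 0.05.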
Affiliation(s)
- Priyanka Mahajan
  - Integrated Data Science Laboratory for Metabolomics and Exposomics, Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, USA 10954
- Oliver Fiehn
  - NIH-West Coast Metabolomics Center, University of California, Davis, California, 95616, USA
- Dinesh Barupal
  - Integrated Data Science Laboratory for Metabolomics and Exposomics, Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, USA 10954
23
Kensert A, Desmet G, Cabooter D. A perspective on the use of deep deterministic policy gradient reinforcement learning for retention time modeling in reversed-phase liquid chromatography. J Chromatogr A 2024; 1713:464570. [PMID: 38101304 DOI: 10.1016/j.chroma.2023.464570] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2023] [Revised: 12/04/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023]
Abstract
Artificial intelligence and machine learning techniques are increasingly used for different tasks related to method development in liquid chromatography. In this study, the possibilities of a reinforcement learning algorithm, more specifically a deep deterministic policy gradient algorithm, are evaluated for the selection of scouting runs for retention time modeling. As a theoretical exercise, it is investigated whether such an algorithm can be trained to select scouting runs for any compound of interest so that its correct retention parameters for the three-parameter Neue-Kuss retention model can be retrieved. It is observed that three scouting runs are generally sufficient to retrieve the retention parameters with an accuracy (mean relative percentage error, MRPE) of 1 % or less. Giving the agent the opportunity to select additional scouting runs does not lead to a significantly improved accuracy. It is also observed that the agent tends to give preference to isocratic scouting runs for retention time modeling, and is only motivated towards selecting gradient scouting runs when penalized (strongly) for large analysis/gradient times. This seems to reinforce the general power and usefulness of isocratic scouting runs for retention time modeling. Finally, the best results (lowest MRPE) are obtained when the agent manages to retrieve retention time data for % ACN at elution of the compound under consideration that span the entire relevant range of ACN (5 % ACN to 95 % ACN) as well as possible, i.e., resulting in retention data at a low, intermediate and high % ACN. Based on the obtained results, we believe reinforcement learning holds great potential to automate and rationalize method development in liquid chromatography in the future.
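For orientation, the three-parameter Neue-Kuss model named in this abstract is usually written as k = k0 (1 + S2 φ)² exp(−S1 φ / (1 + S2 φ)), where φ is the organic modifier fraction. The sketch below evaluates that expression and the resulting isocratic retention time; the parameter values and function names are made up for illustration and are not taken from the paper.

```python
from math import exp


def neue_kuss_k(phi, k0, s1, s2):
    """Retention factor at organic modifier fraction `phi` (0 to 1).

    k0 is the retention factor in pure water, s1 the slope and s2 the
    curvature parameter; s2 = 0 recovers the linear solvent strength model."""
    return k0 * (1.0 + s2 * phi) ** 2 * exp(-s1 * phi / (1.0 + s2 * phi))


def isocratic_retention_time(phi, t0, k0, s1, s2):
    """Retention time for an isocratic run with column dead time `t0`."""
    return t0 * (1.0 + neue_kuss_k(phi, k0, s1, s2))


# Hypothetical compound probed at low, intermediate and high % ACN,
# mirroring the three scouting runs discussed in the abstract.
for phi in (0.05, 0.50, 0.95):
    print(phi, round(isocratic_retention_time(phi, t0=1.0, k0=50.0, s1=20.0, s2=1.5), 3))
```

With three unknowns (k0, S1, S2), three well-spread isocratic measurements are in principle enough to determine the curve, which is consistent with the abstract's observation about three scouting runs.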
Affiliation(s)
- Alexander Kensert
  - University of Leuven (KU Leuven), Department for Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, Herestraat 49, 3000 Leuven, Belgium; Vrije Universiteit Brussel, Department of Chemical Engineering, Pleinlaan 2, 1050 Brussel, Belgium
- Gert Desmet
  - Vrije Universiteit Brussel, Department of Chemical Engineering, Pleinlaan 2, 1050 Brussel, Belgium
- Deirdre Cabooter
  - University of Leuven (KU Leuven), Department for Pharmaceutical and Pharmacological Sciences, Pharmaceutical Analysis, Herestraat 49, 3000 Leuven, Belgium.
24
Chaker J, Gilles E, Monfort C, Chevrier C, Lennon S, David A. Scannotation: A Suspect Screening Tool for the Rapid Pre-Annotation of the Human LC-HRMS-Based Chemical Exposome. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:19253-19262. [PMID: 37968235 DOI: 10.1021/acs.est.3c04764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2023]
Abstract
In an increasingly chemically polluted environment, rapidly characterizing the human chemical exposome (i.e., chemical mixtures accumulating in humans) at the population scale is critical to understand its impact on health. High-resolution mass spectrometry (HRMS) profiling of complex biological matrices can theoretically provide a comprehensive picture of chemical exposures. However, annotating the detected chemical features, particularly low-abundant ones, remains a significant obstacle to implementing such approaches at a large scale. We present Scannotation (https://github.com/scannotation/Scannotation_software), an automated and user-friendly suspect screening tool for the rapid pre-annotation of HRMS preprocessed data sets. This software tool combines several MS1 chemical predictors, i.e., m/z, experimental and predicted retention times, isotopic patterns, and neutral loss patterns, to score the proximity between features and suspects, thus efficiently prioritizing tentative annotations to verify. Scannotation and MS-DIAL4 were used to annotate blood serum samples of 75 Breton adolescents. Scannotation's combination of MS1-based chemical predictors allowed us to annotate 89 chemically diverse environmental compounds with high confidence (confirmed by MS2 when available). These compounds included 62% of emerging molecules, for which no toxicological or human biomonitoring data are reported in the literature. The complementarity observed with MS-DIAL4 results demonstrates the relevance of Scannotation for the efficient pre-annotation of large-scale exposomics data sets.
Affiliation(s)
- Jade Chaker
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Erwann Gilles
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Christine Monfort
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Cécile Chevrier
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Sarah Lennon
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Arthur David
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
25
Kang Q, Fang P, Zhang S, Qiu H, Lan Z. Deep graph convolutional network for small-molecule retention time prediction. J Chromatogr A 2023; 1711:464439. [PMID: 37865024 DOI: 10.1016/j.chroma.2023.464439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 10/23/2023]
Abstract
The retention time (RT) is a crucial source of data for liquid chromatography-mass spectrometry (LCMS). A model that can accurately predict the RT for each molecule would make it possible to filter out candidates with similar spectra but differing RT in LCMS-based molecule identification. Recent research shows that graph neural networks (GNNs) outperform traditional machine learning algorithms in RT prediction. However, all of these models use relatively shallow GNNs. This study investigates for the first time how depth affects GNNs' performance on RT prediction. The results demonstrate that a notable improvement can be achieved by pushing the depth of GNNs to 16 layers through the adoption of residual connections. Additionally, we find that the graph convolutional network (GCN) model benefits from edge information. The developed deep graph convolutional network, DeepGCN-RT, significantly outperforms the previous state-of-the-art method and achieves the lowest mean absolute percentage error (MAPE) of 3.3% and the lowest mean absolute error (MAE) of 26.55 s on the SMRT test set. We also fine-tune DeepGCN-RT on seven datasets with various chromatographic conditions. The mean MAE over the seven datasets decreases by 30% compared with the previous state-of-the-art method. On the RIKEN-PlaSMA dataset, we also test the effectiveness of DeepGCN-RT in assisting molecular structure identification. By reducing the number of potential structures by 30%, DeepGCN-RT is able to improve top-1 accuracy by about 11%.
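The candidate-filtering use of predicted RTs described in this abstract can be sketched as follows. The tolerance of twice the reported MAE, the candidate names, and the function name are hypothetical choices for illustration; the abstract does not specify how DeepGCN-RT's filter threshold is set.

```python
def filter_candidates_by_rt(candidates, observed_rt, tolerance):
    """Keep candidate structures whose predicted RT lies within `tolerance`
    seconds of the observed RT, ordered from closest to farthest.

    `candidates` is a list of (name, predicted_rt_seconds) pairs."""
    kept = [(name, rt) for name, rt in candidates
            if abs(rt - observed_rt) <= tolerance]
    return sorted(kept, key=lambda item: abs(item[1] - observed_rt))


# Three isomeric candidates with made-up predicted RTs (s).
candidates = [("candidate_A", 300.0), ("candidate_B", 420.0), ("candidate_C", 330.0)]
# Assumed tolerance: twice the MAE quoted in the abstract (2 x 26.55 s).
print(filter_candidates_by_rt(candidates, observed_rt=320.0, tolerance=2 * 26.55))
```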
Affiliation(s)
- Qiyue Kang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
- Pengfei Fang
  - School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
- Shuai Zhang
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Huachuan Qiu
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
- Zhenzhong Lan
  - School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
26
Reddy CS, Natarajan P, Nimmakayala P, Hankins GR, Reddy UK. From Fruit Waste to Medical Insight: The Comprehensive Role of Watermelon Rind Extract on Renal Adenocarcinoma Cellular and Transcriptomic Dynamics. Int J Mol Sci 2023; 24:15615. [PMID: 37958599 PMCID: PMC10647773 DOI: 10.3390/ijms242115615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023]
Abstract
Cancer researchers are fascinated by the chemistry of diverse natural products that show exciting potential as anticancer agents. In this study, we aimed to investigate the anticancer properties of watermelon rind extract (WRE) by examining its effects on cell proliferation, apoptosis, senescence, and global gene expression in human renal cell adenocarcinoma cells (HRAC-769-P) in vitro. Our metabolome data analysis of WRE exhibited untargeted phyto-constituents and targeted citrulline (22.29 µg/mg). HRAC-769-P cells were cultured in RPMI-1640 media and treated with 22.4, 44.8, 67.2, 88.6, 112, 134.4, and 156.8 mg·mL⁻¹ of WRE for 24, 48, and 72 h. At 24 h after treatment with 88.6 mg·mL⁻¹ of WRE, cell proliferation was significantly reduced, by more than 34% compared with the control. Cell viability decreased to 45% and 37% at 48 and 72 h after treatment, respectively. We also examined poly caspase, SA-beta-galactosidase (SA-beta-gal), and wound healing activities using WRE. All treatments induced an early poly caspase response and a significant reduction in cell migration. Further, we analyzed the transcript profile of the cells grown at 44.8 mg·mL⁻¹ of WRE after 6 h using RNA sequencing (RNAseq) analysis. We identified 186 differentially expressed genes (DEGs), including 149 upregulated genes and 37 downregulated genes, in cells treated with WRE compared with the control. The differentially expressed genes were associated with NF-Kappa B signaling and TNF pathways. Crucial apoptosis-related genes such as BMF, NPTX1, NFKBIA, NFKBIE, and NFKBID might induce intrinsic and extrinsic apoptosis. Another possible mechanism is that a high quantity of citrulline may induce apoptosis through increased production of nitric oxide. Hence, our study suggests the potential anticancer properties of WRE and provides insights into its effects on cellular processes and gene expression in HRAC-769-P cells.
Affiliation(s)
- Gerald R. Hankins
  - Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112, USA; (C.S.R.); (P.N.); (P.N.)
- Umesh K. Reddy
  - Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112, USA; (C.S.R.); (P.N.); (P.N.)
27
Egede Frøkjær E, Rüsz Hansen H, Hansen M. Non-targeted and suspect screening analysis using ion exchange chromatography-Orbitrap tandem mass spectrometry reveals polar and very mobile xenobiotics in Danish drinking water. CHEMOSPHERE 2023; 339:139745. [PMID: 37558003 DOI: 10.1016/j.chemosphere.2023.139745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 06/21/2023] [Accepted: 08/04/2023] [Indexed: 08/11/2023]
Abstract
Non-targeted and suspect screening analysis is gaining approval across the scientific and regulatory community to monitor the chemical status in the environment and thus environmental quality. These holistic screening analyses provide the means to perform suspect screening and to go beyond it to discover previously undescribed chemical pollutants in environmental samples. In a case study, we developed and optimized a high-resolution tandem mass spectrometry platform hyphenated with anion exchange chromatography to screen drinking water samples in Denmark. The optimized non-targeted screening method was able to detect anionic and polar compounds and was successfully applied to drinking water from two drinking water facilities. Following a data analysis pipeline optimization, anionic pesticide residues and other environmental contaminants, such as dimethachlor ESA, mecoprop, and dichlorprop, were detected in drinking water at confidence identification level 1. In addition to these three substances, it was possible to detect another 1662 compounds, of which 97 were annotated at confidence identification level 2. More research is urgently needed to prioritize the detected substances by health risk and to determine their concentrations.
Affiliation(s)
- Emil Egede Frøkjær
- Environmental Metabolomics Lab, Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark.
| | - Helle Rüsz Hansen
- Danish Environmental Protection Agency, Tolderlundsvej 5, 5000, Odense C, Denmark
| | - Martin Hansen
- Environmental Metabolomics Lab, Department of Environmental Science, Aarhus University, Frederiksborgvej 399, 4000, Roskilde, Denmark.
| |
|
28
|
Kartowikromo KY, Olajide OE, Hamid AM. Collision cross section measurement and prediction methods in omics. JOURNAL OF MASS SPECTROMETRY : JMS 2023; 58:e4973. [PMID: 37620034 PMCID: PMC10530098 DOI: 10.1002/jms.4973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 06/26/2023] [Accepted: 07/20/2023] [Indexed: 08/26/2023]
Abstract
Omics studies such as metabolomics, lipidomics, and proteomics have become important for understanding the mechanisms in living organisms. However, the compounds detected are structurally different and contain isomers, with each structure or isomer leading to a different result in terms of the role they play in the cell or tissue in the organism. Therefore, it is important to detect, characterize, and elucidate the structures of these compounds. Liquid chromatography and mass spectrometry have been utilized for decades in the structure elucidation of key compounds. While prediction models of parameters (such as retention time and fragmentation pattern) have also been developed for these separation techniques, they have some limitations. Moreover, ion mobility has become one of the most promising techniques to give a fingerprint to these compounds by determining their collision cross section (CCS) values, which reflect their shape and size. Obtaining accurate CCS enables its use as a filter for potential analyte structures. These CCS values can be measured experimentally using calibrant-independent and calibrant-dependent approaches. Identification of compounds based on experimental CCS values in untargeted analysis typically requires CCS references from standards, which are currently limited and, if available, would require a large amount of time for experimental measurements. Therefore, researchers use theoretical tools to predict CCS values for untargeted and targeted analysis. In this review, an overview of the different methods for the experimental and theoretical estimation of CCS values is given where theoretical prediction tools include computational and machine modeling type approaches. Moreover, the limitations of the current experimental and theoretical approaches and their potential mitigation methods were discussed.
Affiliation(s)
| | - Orobola E Olajide
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama, USA
| | - Ahmed M Hamid
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama, USA
| |
|
29
|
Choi E, Yoo WJ, Jang HY, Kim TY, Lee SK, Oh HB. Machine learning liquid chromatography retention time prediction model augments the dansylation strategy for metabolite analysis of urine samples. J Chromatogr A 2023; 1705:464167. [PMID: 37348224 DOI: 10.1016/j.chroma.2023.464167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2023] [Revised: 06/10/2023] [Accepted: 06/15/2023] [Indexed: 06/24/2023]
Abstract
Herein, standalone software equipped with a graphical user interface (GUI) is developed to predict liquid chromatography mass spectrometry (LC-MS) retention times (RTs) of dansylated metabolites. The dansylation metabolomics strategy developed by Li et al. narrows down a vast chemical space of metabolites into the metabolites containing amines and phenolic hydroxyls. Combined with differential isotope labeling, e.g., 12C-reagent labeled individual samples spiked with a 13C-reagent labeled reference or pooled sample, LC-MS analysis of the dansylated samples enables accurate relative quantification of all labeled metabolites. Herein, the LC-RTs for dansylated metabolites are predicted using an artificial neural network (ANN) machine-learning model. For the ANN modeling, 315 dansylated urine metabolites obtained from the DnsID database are used. The ANN LC-RT prediction model was reliable, with a mean absolute deviation of 0.74 min for the 30 min LC run. In the RT model, a deviation of more than 2 min was observed in only 3.2% of the total 315 metabolites, while a deviation of 1.5 min or more was observed in 11% of the metabolites. Furthermore, it was found that the LC-RT prediction was also reliable even for metabolites containing both amine and phenolic functional groups that can undergo dansylation on either one of the two functional groups, resulting in the generation of two isomeric forms. This RT-prediction model is embedded into a user-friendly GUI and can be used for identifying nontargeted dansylated metabolites with unknown RTs, along with accurate mass measurements. Furthermore, it is demonstrated that the developed software can help identify metabolites from a urine sample of an anonymous healthy pregnant woman.
Affiliation(s)
- Eunwoo Choi
- Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
| | - Won Jun Yoo
- Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
| | - Hwa-Yong Jang
- Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
| | - Tae-Young Kim
- School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
| | - Sung Ki Lee
- Department of Obstetrics and Gynecology, College of Medicine, Konyang University, Daejeon 35365, Republic of Korea.
| | - Han Bin Oh
- Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea.
| |
|
30
|
Trostel L, Coll C, Fenner K, Hafner J. Combining predictive and analytical methods to elucidate pharmaceutical biotransformation in activated sludge. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2023; 25:1322-1336. [PMID: 37539453 DOI: 10.1039/d3em00161j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/05/2023]
Abstract
While man-made chemicals in the environment are ubiquitous and a potential threat to human health and ecosystem integrity, the environmental fate of chemical contaminants such as pharmaceuticals is often poorly understood. Biodegradation processes driven by microbial communities convert chemicals into transformation products (TPs) that may themselves have adverse ecological effects. The detection of TPs formed during biodegradation has been continuously improved thanks to the development of TP prediction algorithms and analytical workflows. Here, we contribute to this advance by (i) reviewing past applications of TP identification workflows, (ii) applying an updated workflow for TP prediction to 42 pharmaceuticals in biodegradation experiments with activated sludge, and (iii) benchmarking 5 different pathway prediction models, comprising 4 prediction models trained on different datasets provided by enviPath, and the state-of-the-art EAWAG pathway prediction system. Using the updated workflow, we could tentatively identify 79 transformation products for 31 pharmaceutical compounds. Compared to previous works, we have further automatized several steps that were previously performed by hand. By benchmarking the enviPath prediction system on experimental data, we demonstrate the usefulness of the pathway prediction tool to generate suspect lists for screening, and we propose new avenues to improve their accuracy. Moreover, we provide a well-documented workflow that can be (i) readily applied to detect transformation products in activated sludge and (ii) potentially extended to other environmental studies.
Affiliation(s)
- Leo Trostel
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
| | - Claudia Coll
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
| | - Kathrin Fenner
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
- Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland
| | - Jasmin Hafner
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
- Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland
| |
|
31
|
Karunaratne E, Hill DW, Dührkop K, Böcker S, Grant DF. Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification. Anal Chem 2023; 95:11901-11907. [PMID: 37540774 DOI: 10.1021/acs.analchem.3c00937] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/06/2023]
Abstract
The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
Affiliation(s)
- Erandika Karunaratne
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| | - Dennis W Hill
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| | - Kai Dührkop
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
| | - Sebastian Böcker
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
| | - David F Grant
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| |
|
32
|
Bartmanski BJ, Rocha M, Zimmermann-Kogadeeva M. Recent advances in data- and knowledge-driven approaches to explore primary microbial metabolism. Curr Opin Chem Biol 2023; 75:102324. [PMID: 37207402 PMCID: PMC10410306 DOI: 10.1016/j.cbpa.2023.102324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 05/21/2023]
Abstract
With the rapid progress in metabolomics and sequencing technologies, more data on the metabolome of single microbes and their communities become available, revealing the potential of microorganisms to metabolize a broad range of chemical compounds. The analysis of microbial metabolomics datasets remains challenging since it inherits the technical challenges of metabolomics analysis, such as compound identification and annotation, while harboring challenges in data interpretation, such as distinguishing metabolite sources in mixed samples. This review outlines the recent advances in computational methods to analyze primary microbial metabolism: knowledge-based approaches that take advantage of metabolic and molecular networks and data-driven approaches that employ machine/deep learning algorithms in combination with large-scale datasets. These methods aim at improving metabolite identification and disentangling reciprocal interactions between microbes and metabolites. We also discuss the perspective of combining these approaches and further developments required to advance the investigation of primary metabolism in mixed microbial samples.
Affiliation(s)
| | - Miguel Rocha
- Centre of Biological Engineering, University of Minho, Campus of Gualtar, Braga, Portugal
| | | |
|
33
|
Muhamadali H, Winder CL, Dunn WB, Goodacre R. Unlocking the secrets of the microbiome: exploring the dynamic microbial interplay with humans through metabolomics and their manipulation for synthetic biology applications. Biochem J 2023; 480:891-908. [PMID: 37378961 PMCID: PMC10317162 DOI: 10.1042/bcj20210534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Revised: 06/12/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023]
Abstract
Metabolomics is a powerful research discovery tool with the potential to measure hundreds to low thousands of metabolites. In this review, we discuss the application of GC-MS and LC-MS in discovery-based metabolomics research, we define metabolomics workflows and we highlight considerations that need to be addressed in order to generate robust and reproducible data. We stress that metabolomics is now routinely applied across the biological sciences to study microbiomes from relatively simple microbial systems to their complex interactions within consortia in the host and the environment and highlight this in a range of biological species and mammalian systems including humans. However, challenges do still exist that need to be overcome to maximise the potential for metabolomics to help us understand biological systems. To demonstrate the potential of the approach we discuss the application of metabolomics in two broad research areas: (1) synthetic biology to increase the production of high-value fine chemicals and reduction in secondary by-products and (2) gut microbial interaction with the human host. While burgeoning in importance, the latter is still in its infancy and will benefit from the development of tools to detangle host-gut-microbial interactions and their impact on human health and diseases.
Affiliation(s)
- Howbeer Muhamadali
- Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K
| | - Catherine L. Winder
- Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K
| | - Warwick B. Dunn
- Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K
| | - Royston Goodacre
- Centre for Metabolomics Research, Department of Biochemistry, Cell and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K
| |
|
34
|
Anderson BG, Raskind A, Hissong R, Dougherty MK, McGill SK, Gulati A, Theriot CM, Kennedy RT, Evans CR. Offline Two-dimensional Liquid Chromatography-Mass Spectrometry for Deep Annotation of the Fecal Metabolome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.31.543178. [PMID: 37333153 PMCID: PMC10274728 DOI: 10.1101/2023.05.31.543178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/20/2023]
Abstract
Compound identification is an essential task in the workflow of untargeted metabolomics since the interpretation of the data in a biological context depends on the correct assignment of chemical identities to the features it contains. Current techniques fall short of identifying all or even most observable features in untargeted metabolomics data, even after rigorous data cleaning approaches to remove degenerate features are applied. Hence, new strategies are required to annotate the metabolome more deeply and accurately. The human fecal metabolome, which is the focus of substantial biomedical interest, is a more complex, more variable, yet lesser-investigated sample matrix compared to widely studied sample types like human plasma. This manuscript describes a novel experimental strategy using multidimensional chromatography to facilitate compound identification in untargeted metabolomics. Pooled fecal metabolite extract samples were fractionated using offline semi-preparative liquid chromatography. The resulting fractions were analyzed by an orthogonal LC-MS/MS method, and the data were searched against commercial, public, and local spectral libraries. Multidimensional chromatography yielded more than a 3-fold improvement in identified compounds compared to the typical single-dimensional LC-MS/MS approach and successfully identified several rare and novel compounds, including atypical conjugated bile acid species. Most features identified by the new approach could be matched to features that were detectable but not identifiable in the original single-dimension LC-MS data. Overall, our approach represents a powerful strategy for deeper annotation of the metabolome that can be implemented with commercially-available instrumentation, and should apply to any dataset requiring deeper annotation of the metabolome.
|
35
|
Wu L, Xiao F, Luo X, Yun K, Wen D, Lin J, Yang S, Li T, Xiang P, Shi Y. Predicting the retention time of Synthetic Cannabinoids using a combinatorial QSAR approach. Heliyon 2023; 9:e16671. [PMID: 37484220 PMCID: PMC10360586 DOI: 10.1016/j.heliyon.2023.e16671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2022] [Revised: 05/23/2023] [Accepted: 05/24/2023] [Indexed: 07/25/2023] Open
Abstract
Background: Abuse of Synthetic Cannabinoids (SCs) has become a serious threat to public health. Because criminals modify various structural and chemical groups, their detection is a major challenge in forensic toxicological identification. Therefore, rapid and efficient identification of SCs is important for forensic toxicology and drug bans. The prediction of an analyte's retention time in liquid chromatography is an important index for the qualitative analysis of compounds and can provide informatics solutions for the interpretation of chromatographic data. Methods: In this study, experimental data from high-resolution mass spectrometry (HRMS) are used to construct a regression model for predicting the retention time of SCs using machine learning methods. The prediction ability of the model is improved by adopting a strategy that combines different descriptors in different independent machine-learning methods. Results: The best model was obtained with a method that combined Substructure Fingerprint Count and Finger printer features and the support vector regression (SVR) method, as it exhibited an R² value of 0.81 for the validation set and 0.83 for the test set. In addition, 4 new SCs were predicted by the optimized model, with a prediction error within 3%. Conclusions: Our study provides a model that can predict the retention time of compounds and it can be used as a filter to reduce false-positive candidates when used in combination with LC-HRMS, especially in the absence of reference standards. This can improve the confidence of identification in non-targeted analysis and the reliability of identifying unknown substances.
Affiliation(s)
- Lina Wu
- Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
- Shanxi Medical University, Jinzhong 030600, PR China
| | - Fu Xiao
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, PR China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Science, 555 Zuchongzhi Road, Shanghai 201203, PR China
| | - Xiaomin Luo
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, PR China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Science, 555 Zuchongzhi Road, Shanghai 201203, PR China
| | - Keming Yun
- Shanxi Medical University, Jinzhong 030600, PR China
| | - Di Wen
- Hebei Medical University, Shijiazhuang 050017, PR China
| | - Jiaman Lin
- Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
- Shanxi Medical University, Jinzhong 030600, PR China
| | - Shuo Yang
- Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
| | - Tianle Li
- Shanxi Medical University, Jinzhong 030600, PR China
| | - Ping Xiang
- Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
| | - Yan Shi
- Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
| |
|
36
|
Folz J, Culver RN, Morales JM, Grembi J, Triadafilopoulos G, Relman DA, Huang KC, Shalon D, Fiehn O. Human metabolome variation along the upper intestinal tract. Nat Metab 2023; 5:777-788. [PMID: 37165176 PMCID: PMC10229427 DOI: 10.1038/s42255-023-00777-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/24/2022] [Accepted: 03/03/2023] [Indexed: 05/12/2023]
Abstract
Most processing of the human diet occurs in the small intestine. Metabolites in the small intestine originate from host secretions, plus the ingested exposome and microbial transformations. Here we probe the spatiotemporal variation of upper intestinal luminal contents during routine daily digestion in 15 healthy male and female participants. For this, we use a non-invasive, ingestible sampling device to collect and analyse 274 intestinal samples and 60 corresponding stool homogenates by combining five mass spectrometry assays and 16S rRNA sequencing. We identify 1,909 metabolites, including sulfonolipids and fatty acid esters of hydroxy fatty acids (FAHFA) lipids. We observe that stool and intestinal metabolomes differ dramatically. Food metabolites display trends in dietary biomarkers, unexpected increases in dicarboxylic acids along the intestinal tract and a positive association between luminal keto acids and fruit intake. Diet-derived and microbially linked metabolites account for the largest inter-individual differences. Notably, two individuals who had taken antibiotics within 6 months before sampling show large variation in levels of bioactive FAHFAs and sulfonolipids and other microbially related metabolites. From inter-individual variation, we identify Blautia species as a candidate to be involved in FAHFA metabolism. In conclusion, non-invasive, in vivo sampling of the human small intestine and ascending colon under physiological conditions reveals links between diet, host and microbial metabolism.
Affiliation(s)
- Jacob Folz
- West Coast Metabolomics Center, University of California, Davis, CA, USA
| | - Rebecca Neal Culver
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Jessica Grembi
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | | | - David A Relman
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Infectious Diseases Section, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA, USA
| | - Kerwyn Casey Huang
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | | | - Oliver Fiehn
- West Coast Metabolomics Center, University of California, Davis, CA, USA.
| |
|
37
|
Yang J, Zhao F, Zheng J, Wang Y, Fei X, Xiao Y, Fang M. An automated toxicity based prioritization framework for fast chemical characterization in non-targeted analysis. JOURNAL OF HAZARDOUS MATERIALS 2023; 448:130893. [PMID: 36746086 DOI: 10.1016/j.jhazmat.2023.130893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 01/13/2023] [Accepted: 01/26/2023] [Indexed: 06/18/2023]
Abstract
Identification of environmental pollutants with harmful effects is commonly conducted by non-targeted analysis (NTA) using liquid chromatography coupled with high-resolution mass spectrometry. Prioritization of possible candidates is important yet challenging because of the large number of candidates from MS acquisitions. We aimed to prioritize candidates to the exposure potential of organic chemicals by their toxicity and identification evidence in the matrix. We have developed an R package application, "NTAprioritization.R", for fast prioritization of suspect lists. In this workflow, the identification levels of candidates were first rated according to spectral matching and retention time prediction. The toxicity levels were rated according to candidates' toxicity of different endpoints or ToxPi score. Finally, the various levels of candidates were identified as Tier 1 - 5 descending in priority. For validation, we used this workflow to identify pollutants in a sludge water sample spiked with 28 environmental pollutants. The workflow reduced the candidate list of over 6,982 candidates to a final list of 2,779 compounds and prioritized them to 5 tiers (Tier 1 - 5), including 21 out of 28 spiked standards. Overall, this study shows the added value of an automated prioritization R package for the fast screening of environmental pollutants based on the NTA method.
Affiliation(s)
- Junjie Yang
- School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore; Singapore Phenome Center, Lee Kong Chian School of Medicine, Nanyang Technological University, 636921, Singapore
| | - Fanrong Zhao
- School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore
| | - Jie Zheng
- Singapore Phenome Center, Lee Kong Chian School of Medicine, Nanyang Technological University, 636921, Singapore
| | - Yulan Wang
- Singapore Phenome Center, Lee Kong Chian School of Medicine, Nanyang Technological University, 636921, Singapore
| | - Xunchang Fei
- School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore
| | - Yongjun Xiao
- International Food & Water Research Centre, Waters Pacific Pte Ltd, 117528, Singapore.
| | - Mingliang Fang
- School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore; Department of Environmental Science and Engineering, Fudan University, Shanghai 200433, China; Institute of Eco-Chongming, 3663 Zhongshan Road, Shanghai 200062, China.
| |
|
38
|
Jia M, Li J, Zhang J, Wei N, Yin Y, Chen H, Yan S, Wang Y. Identification and validation of cuproptosis related genes and signature markers in bronchopulmonary dysplasia disease using bioinformatics analysis and machine learning. BMC Med Inform Decis Mak 2023; 23:69. [PMID: 37060021 PMCID: PMC10105406 DOI: 10.1186/s12911-023-02163-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Accepted: 03/31/2023] [Indexed: 04/16/2023] Open
Abstract
BACKGROUND Bronchopulmonary Dysplasia (BPD) has a high incidence and affects the health of preterm infants. Cuproptosis is a novel form of cell death, but its mechanism of action in the disease is not yet clear. Machine learning, the latest tool for the analysis of biological samples, is still relatively rarely used for in-depth analysis and prediction of diseases. METHODS AND RESULTS First, the differential expression of cuproptosis-related genes (CRGs) in the GSE108754 dataset was extracted, and the heat map showed that the expression of the NFE2L2 gene was significantly higher in the control group, whereas the expression of the GLS gene was significantly higher in the treatment group. Chromosome location analysis showed that the two genes were positively correlated and both associated with chromosome 2. The results of immune infiltration and immune cell differential analysis showed differences in four immune cell types, significantly so in monocytes. Five new pathways were analyzed through two subgroups based on consistent clustering of CRG expression. Weighted correlation network analysis (WGCNA) set the screening condition to the top 25% to obtain the disease signature genes. Four machine learning algorithms, Generalized Linear Models (GLM), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGB), were used to screen the disease signature genes, yielding the final five marker genes for disease prediction. The models constructed by the GLM method proved to be more accurate in the validation of two datasets, GSE190215 and GSE188944. CONCLUSION We eventually identified two copper death-associated genes, NFE2L2 and GLS. A machine learning model (GLM) was constructed to predict the prevalence of BPD disease, and five disease signature genes, NFATC3, ERMN, PLA2G4A, MTMR9LP and LOC440700, were identified. These genes, identified through bioinformatics analysis, could be potential targets for the identification and treatment of BPD.
Affiliation(s)
| | - Jieyi Li
- Shanghai Literature Institute of Traditional Chinese Medicine, Shanghai, 200000, China
| | - Jingying Zhang
- Shanghai Literature Institute of Traditional Chinese Medicine, Shanghai, 200000, China
| | - Ningjing Wei
- ChengZheng Wisdom (Shanghai) Health Sciences and Technology Co., Ltd, Shanghai, 200000, China
| | - Yating Yin
- ChengZheng Wisdom (Shanghai) Health Sciences and Technology Co., Ltd, Shanghai, 200000, China
| | - Hui Chen
- Shanghai Literature Institute of Traditional Chinese Medicine, Shanghai, 200000, China
| | - Shixing Yan
- Shanghai Daosh Medical Technology Co., Ltd, Shanghai, 200000, China
| | - Yong Wang
- Shanghai Literature Institute of Traditional Chinese Medicine, Shanghai, 200000, China.
| |
Collapse
|
39. Yen NTH, Anh NK, Jayanti RP, Phat NK, Vu DH, Ghim JL, Ahn S, Shin JG, Oh JY, Phuoc Long N, Kim DH. Multimodal plasma metabolomics and lipidomics in elucidating metabolic perturbations in tuberculosis patients with concurrent type 2 diabetes. Biochimie 2023:S0300-9084(23)00086-X. [PMID: 37062470 DOI: 10.1016/j.biochi.2023.04.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/14/2023] [Revised: 04/13/2023] [Accepted: 04/13/2023] [Indexed: 04/18/2023]
Abstract
Type 2 diabetes mellitus (DM) poses a major burden for the treatment and control of tuberculosis (TB). Characterization of the underlying metabolic perturbations in DM patients with TB infection would yield insights into the pathophysiology of TB-DM, thus potentially leading to improvements in TB treatment. In this study, a multimodal metabolomics and lipidomics workflow was applied to investigate plasma metabolic profiles of patients with TB and TB-DM. Significantly different biological processes and biomarkers in TB-DM vs. TB were identified using a data-driven, knowledge-based framework. Changes in metabolic and signaling pathways related to carbohydrate and amino acid metabolism were mainly captured by amide HILIC column metabolomics analysis, while perturbations in lipid metabolism were identified by the C18 metabolomics and lipidomics analysis. Compared to TB, TB-DM exhibited elevated levels of bile acids and molecules related to carbohydrate metabolism, as well as the depletion of glutamine, retinol, lysophosphatidylcholine, and phosphatidylcholine. Moreover, arachidonic acid metabolism was determined as a potential important factor in the interaction between TB and DM pathophysiology. In a correlation network of the significantly altered molecules, among the central nodes, chenodeoxycholic acid was robustly associated with TB and DM. Fatty acid (22:4) was a component of all significant modules. In conclusion, the integration of multimodal metabolomics and lipidomics provides a thorough picture of the metabolic changes associated with TB-DM. The results obtained from this comprehensive profiling of TB patients with DM advance the current understanding of DM comorbidity in TB infection and contribute to the development of more effective treatment.
Affiliation(s)
- Nguyen Thi Hai Yen
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea
- Nguyen Ky Anh
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea
- Rannissa Puspita Jayanti
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea
- Nguyen Ky Phat
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea
- Dinh Hoa Vu
  - The National Centre of Drug Information and Adverse Drug Reaction Monitoring, Hanoi University of Pharmacy, Hanoi, Viet Nam
- Jong-Lyul Ghim
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Department of Clinical Pharmacology, Inje University Busan Paik Hospital, Busan, Republic of Korea
- Sangzin Ahn
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea
- Jae-Gook Shin
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea; Department of Clinical Pharmacology, Inje University Busan Paik Hospital, Busan, Republic of Korea
- Jee Youn Oh
  - Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Korea University Guro Hospital, Seoul, Republic of Korea
- Nguyen Phuoc Long
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea; Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Republic of Korea
- Dong Hyun Kim
  - Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Republic of Korea

40. Xing S, Shen S, Xu B, Li X, Huan T. BUDDY: molecular formula discovery via bottom-up MS/MS interrogation. Nat Methods 2023:10.1038/s41592-023-01850-x. [PMID: 37055660 DOI: 10.1038/s41592-023-01850-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 08/03/2022] [Accepted: 03/15/2023] [Indexed: 04/15/2023]
Abstract
A substantial fraction of metabolic features remains undetermined in mass spectrometry (MS)-based metabolomics, and molecular formula annotation is the starting point for unraveling their chemical identities. Here we present bottom-up tandem MS (MS/MS) interrogation, a method for de novo formula annotation. Our approach prioritizes MS/MS-explainable formula candidates, implements machine-learned ranking and offers false discovery rate estimation. Compared with the mathematically exhaustive formula enumeration, our approach shrinks the formula candidate space by 42.8% on average. Method benchmarking on annotation accuracy was systematically carried out on reference MS/MS libraries and real metabolomics datasets. Applied on 155,321 recurrent unidentified spectra, our approach confidently annotated >5,000 novel molecular formulae absent from chemical databases. Beyond the level of individual metabolic features, we combined bottom-up MS/MS interrogation with global optimization to refine formula annotations while revealing peak interrelationships. This approach allowed the systematic annotation of 37 fatty acid amide molecules in human fecal data. All bioinformatics pipelines are available in a standalone software, BUDDY ( https://github.com/HuanLab/BUDDY ).
Affiliation(s)
- Shipei Xing
  - Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver, British Columbia, Canada
- Sam Shen
  - Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver, British Columbia, Canada
- Banghua Xu
  - Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver, British Columbia, Canada
- Xiaoxiao Li
  - Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Tao Huan
  - Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver, British Columbia, Canada

41. Brookhart A, Arora M, McCullagh M, Wilson ID, Plumb RS, Vissers JP, Tanna N. Understanding mobile phase buffer composition and chemical structure effects on electrospray ionization mass spectrometry response. J Chromatogr A 2023; 1696:463966. [PMID: 37054638 DOI: 10.1016/j.chroma.2023.463966] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/04/2023] [Revised: 04/01/2023] [Accepted: 04/03/2023] [Indexed: 04/15/2023]
Abstract
Mobile phase selection is of critical importance in liquid chromatography - mass spectrometry (LC-MS) based studies, since it affects retention, chromatographic selectivity, ionization, limits of detection and quantification, and linear dynamic range. Generalized LC-MS mobile phase selection criteria, suitable for a broad class of chemical compounds, do not exist thus far. Here we have performed a large-scale qualitative assessment of the effect of solvent composition used for reversed-phase LC separations on electrospray ionization (ESI) response for 240 small molecular weight drugs, representing various chemical compound classes. Of these 240 analytes 224 were detectable using ESI. The main chemical structural features affecting ESI response were found to all be surface area or surface charge-related. Mobile phase composition was found to be less differentiating, although for some compounds a pH effect was noted. Unsurprisingly, chemical structure was found to be the dominant factor for ESI response for the majority of the investigated analytes, representing about 85% of the replicating detectable complement of the sample data set. A weak correlation between ESI response and structure complexity was observed. Solvents based on isopropanol, and those containing phosphoric or di- and trifluoracetic acids, performed relatively poorly in terms of chromatographic or ESI response, whilst the best performing 'generic' LC solvents were based on methanol, acetonitrile using formic acid and ammonium acetate as buffer components, consistent with current practice in many laboratories.
Affiliation(s)
- Allison Brookhart
  - Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, MA
- Mahika Arora
  - Manning College of Information and Computer Sciences, University of Massachusetts Amherst, MA
- Ian D Wilson
  - Computational & Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College, United Kingdom

42. Luo M, Yin Y, Zhou Z, Zhang H, Chen X, Wang H, Zhu ZJ. A mass spectrum-oriented computational method for ion mobility-resolved untargeted metabolomics. Nat Commun 2023; 14:1813. [PMID: 37002244 PMCID: PMC10066191 DOI: 10.1038/s41467-023-37539-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/30/2022] [Accepted: 03/17/2023] [Indexed: 04/03/2023] [Open Access]
Abstract
Ion mobility (IM) adds a new dimension to liquid chromatography-mass spectrometry-based untargeted metabolomics which significantly enhances coverage, sensitivity, and resolving power for analyzing the metabolome, particularly metabolite isomers. However, the high dimensionality of IM-resolved metabolomics data presents a great challenge to data processing, restricting its widespread applications. Here, we develop a mass spectrum-oriented bottom-up assembly algorithm for IM-resolved metabolomics that utilizes mass spectra to assemble four-dimensional peaks in a reverse order of multidimensional separation. We further develop the end-to-end computational framework Met4DX for peak detection, quantification and identification of metabolites in IM-resolved metabolomics. Benchmarking and validation of Met4DX demonstrates superior performance compared to existing tools with regard to coverage, sensitivity, peak fidelity and quantification precision. Importantly, Met4DX successfully detects and differentiates co-eluted metabolite isomers with small differences in the chromatographic and IM dimensions. Together, Met4DX advances metabolite discovery in biological organisms by deciphering the complex 4D metabolomics data.
Affiliation(s)
- Mingdu Luo
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
  - University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- Yandong Yin
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
- Zhiwei Zhou
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
- Haosong Zhang
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
  - University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- Xi Chen
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
  - University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- Hongmiao Wang
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
  - University of Chinese Academy of Sciences, Beijing, 100049, P. R. China
- Zheng-Jiang Zhu
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China
  - Shanghai Key Laboratory of Aging Studies, Shanghai, 201210, P. R. China

43. Wang X, Zheng F, Sheng M, Xu G, Lin X. Retention time prediction for small samples based on integrating molecular representations and adaptive network. J Chromatogr B Analyt Technol Biomed Life Sci 2023; 1217:123624. [PMID: 36780745 DOI: 10.1016/j.jchromb.2023.123624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 01/13/2023] [Accepted: 01/27/2023] [Indexed: 02/07/2023]
Abstract
Retention time (RT) can provide orthogonal information different from that of mass spectrometry and contribute to identifying compounds. Many machine learning methods have been developed and applied to RT prediction. In application, the training data size is usually small in most chromatography systems. To enhance the performance of RT prediction, this study proposes an RT prediction method based on multi-data combinations and an adaptive neural network (MDC-ANN). MDC-ANN establishes the RT prediction model for the target chromatographic system through transfer learning and a base deep learning model trained on a big dataset. It selects the optimal molecular representation combination from the multiple input candidates and automatically determines the neural network structure according to the determined input combination. MDC-ANN was compared with two new efficient deep learning methods, three transferring methods and four popular machine learning methods on 14 small datasets and showed advantages in MAE, MedAE, MRE and R2 in most cases. The experimental results illustrated that integrating multiple molecular representations can provide more information, improve the performance of RT prediction and contribute to compound annotation; different chromatographic systems may use different molecular representation combinations to obtain good RT prediction performance. Hence, MDC-ANN, which automatically determines the best combination of molecular representations for a specific system, is promising for predicting RTs accurately in real applications.
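The four evaluation metrics named in this abstract can be computed as in the minimal sketch below. It is an illustration only, not code from the paper: the function name `rt_metrics` is hypothetical, and MRE is assumed here to mean the mean relative error, mean(|y - y_hat| / |y|); some authors use the median instead.

```python
from statistics import mean, median


def rt_metrics(y_true, y_pred):
    """Return MAE, MedAE, MRE and R2 for measured vs. predicted retention times.

    Assumptions (not taken from the paper): MRE is the *mean* relative error,
    measured retention times are non-zero, and they are not all identical.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    # Absolute error of each prediction.
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    # Residual and total sums of squares for the coefficient of determination.
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "MAE": mean(abs_err),
        "MedAE": median(abs_err),
        "MRE": mean([e / abs(t) for e, t in zip(abs_err, y_true)]),
        "R2": 1.0 - ss_res / ss_tot,
    }
```

Lower MAE, MedAE and MRE and a higher R2 indicate better agreement between predicted and measured retention times.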
Affiliation(s)
- Xiaoxiao Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Fujian Zheng
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, Liaoning, China
- Meizhen Sheng
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
- Guowang Xu
  - CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, Liaoning, China
- Xiaohui Lin
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China

44. Lenski M, Maallem S, Zarcone G, Garçon G, Lo-Guidice JM, Anthérieu S, Allorge D. Prediction of a Large-Scale Database of Collision Cross-Section and Retention Time Using Machine Learning to Reduce False Positive Annotations in Untargeted Metabolomics. Metabolites 2023; 13:metabo13020282. [PMID: 36837901 PMCID: PMC9962007 DOI: 10.3390/metabo13020282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 02/07/2023] [Accepted: 02/12/2023] [Indexed: 02/18/2023] [Open Access]
Abstract
Metabolite identification in untargeted metabolomics is complex, with the risk of false positive annotations. This work aims to use machine learning to successively predict the retention time (Rt) and the collision cross-section (CCS) of an open-access database to accelerate the interpretation of metabolomic results. Standards of metabolites were tested using liquid chromatography coupled with high-resolution mass spectrometry. In CCSBase and QSRR predictor machine learning models, experimental results were used to generate predicted CCS and Rt of the Human Metabolome Database. From 542 standards, 266 and 301 compounds were detected in positive and negative electrospray ionization mode, respectively, corresponding to 380 different metabolites. CCS and Rt were then predicted using machine learning tools for almost 114,000 metabolites. R2 score of the linear regression between predicted and measured data achieved 0.938 and 0.898 for CCS and Rt, respectively, demonstrating the models' reliability. A CCS and Rt index filter of mean error ± 2 standard deviations could remove most misidentifications. Its application to data generated from a toxicology study on tobacco cigarettes reduced hits by 76%. Regarding the volume of data produced by metabolomics, the practical workflow provided allows for the implementation of valuable large-scale databases to improve the biological interpretation of metabolomics data.
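The "mean error ± 2 standard deviations" filter described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names `tolerance_window` and `keep_candidate` are hypothetical, the errors are assumed to be (measured - predicted) values from reference standards in whatever unit the workflow uses, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def tolerance_window(errors, k=2.0):
    """Return (low, high) = mean error -/+ k standard deviations.

    `errors` are prediction errors observed on reference standards.
    Assumption: the sample SD is used; the population SD is an equally
    plausible choice and would give a slightly narrower window.
    """
    m = mean(errors)
    s = stdev(errors)
    return (m - k * s, m + k * s)


def keep_candidate(ccs_error, rt_error, ccs_window, rt_window):
    """Keep an annotation only if both its CCS and Rt errors lie in range."""
    return (ccs_window[0] <= ccs_error <= ccs_window[1]
            and rt_window[0] <= rt_error <= rt_window[1])
```

A candidate annotation is rejected as soon as either its CCS error or its Rt error falls outside the corresponding window, which is how such a filter reduces the number of hits.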
Affiliation(s)
- Marie Lenski
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
  - CHU Lille, Unité Fonctionnelle de Toxicologie, F-59037 Lille, France
- Saïd Maallem
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
- Gianni Zarcone
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
- Guillaume Garçon
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
- Jean-Marc Lo-Guidice
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
- Sébastien Anthérieu
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
- Delphine Allorge
  - ULR 4483, IMPECS—IMPact de l’Environnement Chimique sur la Santé humaine, CHU Lille, Institut Pasteur de Lille, Université de Lille, F-59000 Lille, France
  - CHU Lille, Unité Fonctionnelle de Toxicologie, F-59037 Lille, France

45. Stevens NC, Brown VJ, Domanico MC, Edwards PC, Van Winkle LS, Fiehn O. Alteration of glycosphingolipid metabolism by ozone is associated with exacerbation of allergic asthma characteristics in mice. Toxicol Sci 2023; 191:79-89. [PMID: 36331340 PMCID: PMC9887677 DOI: 10.1093/toxsci/kfac117] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 11/06/2022] [Open Access]
Abstract
Asthma is a common chronic respiratory disease exacerbated by multiple environmental factors. Acute ozone exposure has previously been implicated in airway inflammation, airway hyperreactivity, and other characteristics of asthma, which may be attributable to altered sphingolipid metabolism. This study tested the hypothesis that acute ozone exposure alters sphingolipid metabolism within the lung, which contributes to exacerbations in characteristics of asthma in allergen-sensitized mice. Adult male and female BALB/c mice were sensitized intranasally to house dust mite (HDM) allergen on days 1, 3, and 5 and challenged on days 12-14. Mice were exposed to ozone following each HDM challenge for 6 h/day. Bronchoalveolar lavage, lung lobes, and microdissected lung airways were collected for metabolomics analysis (N = 8/sex/group). Another subset of mice underwent methacholine challenge using a forced oscillation technique to measure airway resistance (N = 6/sex/group). Combined HDM and ozone exposure in male mice synergistically increased airway hyperreactivity that was not observed in females and was accompanied by increased airway inflammation and eosinophilia relative to control mice. Importantly, glycosphingolipids were significantly increased following combined HDM and ozone exposure relative to controls in both male and female airways, which was also associated with both airway resistance and eosinophilia. However, 15 glycosphingolipid species were increased in females compared with only 6 in males, which was concomitant with significant associations between glycosphingolipids and airway resistance that ranged from R2 = 0.33-0.51 for females and R2 = 0.20-0.34 in male mice. These observed sex differences demonstrate that glycosphingolipids potentially serve to mitigate exacerbations in characteristics of allergic asthma.
Affiliation(s)
- Veneese J Brown
  - Center for Health and the Environment, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA
- Morgan C Domanico
  - Center for Health and the Environment, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA
- Patricia C Edwards
  - Center for Health and the Environment, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA
- Laura S Van Winkle
  - Center for Health and the Environment, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA
  - Department of Anatomy, Physiology and Cell Biology, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA
- Oliver Fiehn
  - Genome Center, University of California Davis, Davis, California 95616, USA

46. Cajka T, Hricko J, Rudl Kulhava L, Paucova M, Novakova M, Kuda O. Optimization of Mobile Phase Modifiers for Fast LC-MS-Based Untargeted Metabolomics and Lipidomics. Int J Mol Sci 2023; 24:ijms24031987. [PMID: 36768308 PMCID: PMC9916776 DOI: 10.3390/ijms24031987] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 12/30/2022] [Revised: 01/13/2023] [Accepted: 01/18/2023] [Indexed: 01/21/2023] [Open Access]
Abstract
Liquid chromatography-mass spectrometry (LC-MS) is the method of choice for the untargeted profiling of biological samples. A multiplatform LC-MS-based approach is needed to screen polar metabolites and lipids comprehensively. Different mobile phase modifiers were tested to improve the electrospray ionization process during metabolomic and lipidomic profiling. For polar metabolites, hydrophilic interaction LC using a mobile phase with 10 mM ammonium formate/0.125% formic acid provided the best performance for amino acids, biogenic amines, sugars, nucleotides, acylcarnitines, and sugar phosphate, while reversed-phase LC (RPLC) with 0.1% formic acid outperformed for organic acids. For lipids, RPLC using a mobile phase with 10 mM ammonium formate or 10 mM ammonium formate with 0.1% formic acid permitted the high signal intensity of various lipid classes ionized in ESI(+) and robust retention times. For ESI(-), the mobile phase with 10 mM ammonium acetate with 0.1% acetic acid represented a reasonable compromise regarding the signal intensity of the detected lipids and the stability of retention times compared to 10 mM ammonium acetate alone or 0.02% acetic acid. Collectively, we show that untargeted methods should be evaluated not only on the total number of features but also based on common metabolites detected by a specific platform along with the long-term stability of retention times.
47. Damiani T, Bonciarelli S, Thallinger GG, Koehler N, Krettler CA, Salihoğlu AK, Korf A, Pauling JK, Pluskal T, Ni Z, Goracci L. Software and Computational Tools for LC-MS-Based Epilipidomics: Challenges and Solutions. Anal Chem 2023; 95:287-303. [PMID: 36625108 PMCID: PMC9835057 DOI: 10.1021/acs.analchem.2c04406] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 01/11/2023]
Affiliation(s)
- Tito Damiani
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Praha 6, Czech Republic
- Stefano Bonciarelli
  - Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Gerhard G. Thallinger
  - Institute of Biomedical Informatics, Graz University of Technology, 8010 Graz, Austria
- Nikolai Koehler
  - LipiTUM, Chair of Experimental Bioinformatics, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
- Arif K. Salihoğlu
  - Department of Physiology, Faculty of Medicine and Institute of Health Sciences, Karadeniz Technical University, 61080 Trabzon, Turkey
- Ansgar Korf
  - Bruker Daltonics GmbH & Co. KG, Fahrenheitstraße 4, 28359 Bremen, Germany
- Josch K. Pauling
  - LipiTUM, Chair of Experimental Bioinformatics, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Praha 6, Czech Republic
- Zhixu Ni
  - Center of Membrane Biochemistry and Lipid Research, University Hospital and Faculty of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy
- Laura Goracci
  - Department of Chemistry, Biology and Biotechnology, University of Perugia, Via Elce di Sotto 8, 06123 Perugia, Italy

48. Hissong R, Evans KR, Evans CR. Compound Identification Strategies in Mass Spectrometry-Based Metabolomics and Pharmacometabolomics. Handb Exp Pharmacol 2023; 277:43-71. [PMID: 36409330 DOI: 10.1007/164_2022_617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
The metabolome is composed of a vast array of molecules, including endogenous metabolites and lipids, diet- and microbiome-derived substances, pharmaceuticals and supplements, and exposome chemicals. Correct identification of compounds from this diversity of classes is essential to derive biologically relevant insights from metabolomics data. In this chapter, we aim to provide a practical overview of compound identification strategies for mass spectrometry-based metabolomics, with a particular eye toward pharmacologically-relevant studies. First, we describe routine compound identification strategies applicable to targeted metabolomics. Next, we discuss both experimental (data acquisition-focused) and computational (software-focused) strategies used to identify unknown compounds in untargeted metabolomics data. We then discuss the importance of, and methods for, assessing and reporting the level of confidence of compound identifications. Throughout the chapter, we discuss how these steps can be implemented using today's technology, but also highlight research underway to further improve accuracy and certainty of compound identification. For readers interested in interpreting metabolomics data already collected, this chapter will supply important context regarding the origin of the metabolite names assigned to features in the data and help them assess the certainty of the identifications. For those planning new data acquisition, the chapter supplies guidance for designing experiments and selecting analysis methods to enable accurate compound identification, and it will point the reader toward best-practice data analysis and reporting strategies to allow sound biological and pharmacological interpretation.
49. Rehfeldt TG, Krawczyk K, Echers SG, Marcatili P, Palczynski P, Röttger R, Schwämmle V. Variability analysis of LC-MS experimental factors and their impact on machine learning. Gigascience 2022; 12:giad096. [PMID: 37983748 PMCID: PMC10659119 DOI: 10.1093/gigascience/giad096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 08/23/2023] [Accepted: 10/11/2023] [Indexed: 11/22/2023] [Open Access]
Abstract
BACKGROUND Machine learning (ML) technologies, especially deep learning (DL), have gained increasing attention in predictive mass spectrometry (MS) for enhancing the data-processing pipeline from raw data analysis to end-user predictions and rescoring. ML models need large-scale datasets for training and repurposing, which can be obtained from a range of public data repositories. However, applying ML to public MS datasets on larger scales is challenging, as they vary widely in terms of data acquisition methods, biological systems, and experimental designs. RESULTS We aim to facilitate ML efforts in MS data by conducting a systematic analysis of the potential sources of variability in public MS repositories. We also examine how these factors affect ML performance and perform a comprehensive transfer learning to evaluate the benefits of current best practice methods in the field for transfer learning. CONCLUSIONS Our findings show significantly higher levels of homogeneity within a project than between projects, which indicates that it is important to construct datasets most closely resembling future test cases, as transferability is severely limited for unseen datasets. We also found that transfer learning, although it did increase model performance, did not increase model performance compared to a non-pretrained model.
Collapse
Affiliation(s)
- Tobias Greisager Rehfeldt
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Konrad Krawczyk
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Paolo Marcatili
  - Department of Health Technology, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Pawel Palczynski
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
|
50
|
Stancliffe E, Schwaiger-Haber M, Sindelar M, Murphy MJ, Soerensen M, Patti GJ. An Untargeted Metabolomics Workflow that Scales to Thousands of Samples for Population-Based Studies. Anal Chem 2022; 94:17370-17378. [PMID: 36475608 PMCID: PMC11018270 DOI: 10.1021/acs.analchem.2c01270] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022]
Abstract
The success of precision medicine relies upon collecting data from many individuals at the population level. Although advancing technologies have made such large-scale studies increasingly feasible in some disciplines such as genomics, the standard workflows currently implemented in untargeted metabolomics were developed for small sample numbers and are limited by the processing of liquid chromatography/mass spectrometry data. Here we present an untargeted metabolomics workflow that is designed to support large-scale projects with thousands of biospecimens. Our strategy is to first evaluate a reference sample created by pooling aliquots of biospecimens from the cohort. The reference sample captures the chemical complexity of the biological matrix in a small number of analytical runs, which can subsequently be processed with conventional software such as XCMS. Although this generates thousands of so-called features, most do not correspond to unique compounds from the samples and can be filtered with established informatics tools. The features remaining represent a comprehensive set of biologically relevant reference chemicals that can then be extracted from the entire cohort's raw data on the basis of m/z values and retention times by using Skyline. To demonstrate applicability to large cohorts, we evaluated >2000 human plasma samples with our workflow. We focused our analysis on 360 identified compounds, but we also profiled >3000 unknowns from the plasma samples. As part of our workflow, we tested 14 different computational approaches for batch correction and found that a random forest-based approach outperformed the others. The corrected data revealed distinct profiles that were associated with the geographic location of participants.
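The extraction step described in the abstract (pulling reference features out of each sample's raw data by m/z value and retention time) can be illustrated with a minimal sketch. This is not Skyline's or the authors' implementation: the `ReferenceFeature` type, the `extract_feature` helper, the peak tuple layout, and the 10 ppm and 0.2 min tolerances are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceFeature:
    """A feature retained from the pooled reference sample (hypothetical type)."""
    name: str
    mz: float      # mass-to-charge ratio
    rt_min: float  # retention time in minutes


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of an observed m/z relative to the reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def extract_feature(peaks, feature, ppm_tol=10.0, rt_tol=0.2):
    """Sum the intensity of all peaks inside the m/z and retention-time windows.

    `peaks` is an iterable of (mz, rt_min, intensity) tuples; the tolerance
    defaults are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for mz, rt, intensity in peaks:
        in_mz_window = abs(ppm_error(mz, feature.mz)) <= ppm_tol
        in_rt_window = abs(rt - feature.rt_min) <= rt_tol
        if in_mz_window and in_rt_window:
            total += intensity
    return total


# Toy data: two peaks fall inside both windows, two fall outside.
sample_peaks = [
    (180.0634, 5.02, 1000.0),  # matches in m/z and retention time
    (180.0640, 5.10, 500.0),   # about 3.3 ppm off, still inside the window
    (180.0700, 5.00, 300.0),   # about 37 ppm off, rejected
    (180.0634, 7.00, 200.0),   # right mass, wrong retention time, rejected
]
feature = ReferenceFeature(name="example_feature", mz=180.0634, rt_min=5.0)
extracted_intensity = extract_feature(sample_peaks, feature)
```

Applying the same fixed list of reference features to every sample is what lets this kind of targeted extraction scale: each sample is processed independently, without re-running feature detection across the whole cohort.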
Affiliation(s)
- Ethan Stancliffe
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Michaela Schwaiger-Haber
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Miriam Sindelar
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Matthew J. Murphy
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Mette Soerensen
  - Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Odense, Denmark
- Gary J. Patti
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing at Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Siteman Cancer Center, Washington University in St. Louis, St. Louis, Missouri 63130, United States
|