1
Metz TO, Chang CH, Gautam V, Anjum A, Tian S, Wang F, Colby SM, Nunez JR, Blumer MR, Edison AS, Fiehn O, Jones DP, Li S, Morgan ET, Patti GJ, Ross DH, Shapiro MR, Williams AJ, Wishart DS. Introducing 'identification probability' for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies. bioRxiv: The Preprint Server for Biology 2024:2024.07.30.605945. [PMID: 39131324 PMCID: PMC11312557 DOI: 10.1101/2024.07.30.605945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort, in 2007, focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence: the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high-resolution mass spectrometry studies. Further refinements of the identification levels have occurred, for example to accommodate the use of ion mobility spectrometry in metabolomics workflows, and alternative approaches to communicating compound identification confidence have also been developed based on identification points schema. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered, are easily automated, or are transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a reference library or chemical space that match an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement accuracy or retention time accuracy. We demonstrate the utility of identification probability in an in silico analysis of multi-property reference libraries constructed from the Human Metabolome Database and computational property predictions, provide guidance to the community on transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
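The 1/N definition in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LibraryEntry` type, the function name, the tolerance defaults, and the retention times in the toy library are all hypothetical, and returning `None` when nothing matches is an assumption, since the abstract does not say how N = 0 should be handled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LibraryEntry:
    """One reference compound (hypothetical record layout)."""
    name: str
    mass: float            # monoisotopic mass in Da
    retention_time: float  # minutes; illustrative values only


def identification_probability(
    measured_mass: float,
    measured_rt: float,
    library: List[LibraryEntry],
    ppm_tol: float = 5.0,   # assumed mass tolerance in ppm
    rt_tol: float = 0.5,    # assumed retention time tolerance in minutes
) -> Tuple[Optional[float], List[LibraryEntry]]:
    """Return (1/N, matches), where N counts library entries that fall
    within both tolerances. Returns (None, []) when nothing matches,
    which is an assumption of this sketch."""
    matches = [
        entry for entry in library
        if abs(entry.mass - measured_mass) / measured_mass * 1e6 <= ppm_tol
        and abs(entry.retention_time - measured_rt) <= rt_tol
    ]
    if not matches:
        return None, []
    return 1.0 / len(matches), matches


# Toy library: three isomers share one mass; retention times are made up.
LIBRARY = [
    LibraryEntry("Leucine", 131.09463, 5.2),
    LibraryEntry("Isoleucine", 131.09463, 5.0),
    LibraryEntry("Norleucine", 131.09463, 6.1),
    LibraryEntry("Glucose", 180.06339, 1.2),
]

prob, hits = identification_probability(131.0946, 5.1, LIBRARY)
print(prob, [h.name for h in hits])
```

In this toy case mass alone leaves three candidates (probability 1/3), and adding the retention time constraint narrows them to two (probability 1/2), which is the kind of ambiguity the 1/N measure is meant to make explicit.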
Affiliation(s)
- Thomas O. Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Christine H. Chang
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Vasuk Gautam
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Afia Anjum
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Siyang Tian
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
- Fei Wang
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
- Alberta Machine Intelligence Institute, Edmonton, AB, Canada
- Sean M. Colby
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Jamie R. Nunez
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Madison R. Blumer
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Arthur S. Edison
- Department of Biochemistry & Molecular Biology, Complex Carbohydrate Research Center and Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Oliver Fiehn
- West Coast Metabolomics Center, University of California Davis, Davis, CA, USA
- Dean P. Jones
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia, USA
- Shuzhao Li
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Edward T. Morgan
- Department of Pharmacology and Chemical Biology, Emory University School of Medicine, Atlanta, Georgia, USA
- Gary J. Patti
- Center for Mass Spectrometry and Metabolic Tracing, Department of Chemistry, Department of Medicine, Washington University, Saint Louis, Missouri, USA
- Dylan H. Ross
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA USA
- Madelyn R. Shapiro
- Artificial Intelligence & Data Analytics Division, Pacific Northwest National Laboratory, Richland, WA USA
- Antony J. Williams
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), Research Triangle Park, NC USA
- David S. Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
2
Elapavalore A, Kondić T, Singh RR, Shoemaker BA, Thiessen PA, Zhang J, Bolton EE, Schymanski EL. Adding open spectral data to MassBank and PubChem using open source tools to support non-targeted exposomics of mixtures. Environmental Science: Processes & Impacts 2023; 25:1788-1801. [PMID: 37431591 PMCID: PMC10648001 DOI: 10.1039/d3em00181d] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/28/2023] [Accepted: 06/25/2023] [Indexed: 07/12/2023]
Abstract
The term "exposome" is defined as a comprehensive study of life-course environmental exposures and the associated biological responses. Humans are exposed to many different chemicals, which can pose a major threat to the well-being of humanity. Targeted or non-targeted mass spectrometry techniques are widely used to identify and characterize various environmental stressors when linking exposures to human health. However, identification remains challenging due to the huge chemical space applicable to exposomics, combined with the lack of sufficient relevant entries in spectral libraries. Addressing these challenges requires cheminformatics tools and database resources to share curated open spectral data on chemicals to improve the identification of chemicals in exposomics studies. This article describes efforts to contribute spectra relevant for exposomics to the open mass spectral library MassBank (https://www.massbank.eu) using various open source software efforts, including the R packages RMassBank and Shinyscreen. The experimental spectra were obtained from ten mixtures containing toxicologically relevant chemicals from the US Environmental Protection Agency (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). Following processing and curation, 5582 spectra from 783 of the 1268 ENTACT compounds were added to MassBank, and through this to other open spectral libraries (e.g., MoNA, GNPS) for community benefit. Additionally, an automated deposition and annotation workflow was developed with PubChem to enable the display of all MassBank mass spectra in PubChem, which is rerun with each MassBank release. The new spectral records have already been used in several studies to increase the confidence in identification in non-target small molecule identification workflows applied to environmental and exposomics research.
Affiliation(s)
- Anjana Elapavalore
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg.
- Todor Kondić
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg.
- Randolph R Singh
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg.
- IFREMER (Institut Français de Recherche pour l'Exploitation de la Mer), Laboratoire Biogéochimie des Contaminants Organiques, Rue de l'Ile d'Yeu, BP 21105, Nantes Cedex 3, 44311, France
- Benjamin A Shoemaker
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Paul A Thiessen
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Jian Zhang
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Evan E Bolton
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg.
3
Boyce M, Favela KA, Bonzo JA, Chao A, Lizarraga LE, Moody LR, Owens EO, Patlewicz G, Shah I, Sobus JR, Thomas RS, Williams AJ, Yau A, Wambaugh JF. Identifying xenobiotic metabolites with in silico prediction tools and LCMS suspect screening analysis. Frontiers in Toxicology 2023; 5:1051483. [PMID: 36742129 PMCID: PMC9889941 DOI: 10.3389/ftox.2023.1051483] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/23/2022] [Accepted: 01/03/2023] [Indexed: 01/19/2023]
Abstract
Understanding the metabolic fate of a xenobiotic substance can help inform its potential health risks and allow for the identification of signature metabolites associated with exposure. The need to characterize metabolites of poorly studied or novel substances has shifted exposure studies towards non-targeted analysis (NTA), which often aims to profile many compounds within a sample using high-resolution liquid-chromatography mass-spectrometry (LCMS). Here we evaluate the suitability of suspect screening analysis (SSA) liquid-chromatography mass-spectrometry to inform xenobiotic chemical metabolism. Given a lack of knowledge of true metabolites for most chemicals, predictive tools were used to generate potential metabolites as suspect screening lists to guide the identification of selected xenobiotic substances and their associated metabolites. Thirty-three substances were selected to represent a diverse array of pharmaceutical, agrochemical, and industrial chemicals from the Environmental Protection Agency's ToxCast chemical library. The compounds were incubated in a metabolically active in vitro assay using primary hepatocytes, and the resulting supernatant and lysate fractions were analyzed with high-resolution LCMS. Metabolites were simulated for each compound structure using software and then combined to serve as the suspect screening list. The exact masses of the predicted metabolites were then used to select LCMS features for fragmentation via tandem mass spectrometry (MS/MS). Of the starting chemicals, 12 were measured in at least one sample in either positive or negative ion mode, and a subset of these was used to develop the analysis workflow. We implemented a screening-level workflow for background subtraction and the incorporation of time-varying kinetics into the identification of likely metabolites. We used haloperidol as a case study to perform an in-depth analysis, which resulted in identifying five known metabolites and five molecular features that represent potential novel metabolites, two of which were assigned discrete structures based on in silico predictions. This workflow was applied to five additional test chemicals, and 15 molecular features were selected as either reported metabolites, predicted metabolites, or potential metabolites without a structural assignment. This study demonstrates that in some, but not all, cases, suspect screening analysis methods provide a means to rapidly identify and characterize metabolites of xenobiotic chemicals.
Affiliation(s)
- Matthew Boyce
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Jessica A. Bonzo
- Thermo Fisher Scientific, South San Francisco, CA, United States
- Alex Chao
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Lucina E. Lizarraga
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Cincinnati, OH, United States
- Laura R. Moody
- Thermo Fisher Scientific, South San Francisco, CA, United States
- Elizabeth O. Owens
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Cincinnati, OH, United States
- Grace Patlewicz
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Imran Shah
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Jon R. Sobus
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Russell S. Thomas
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Antony J. Williams
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Alice Yau
- Southwest Research Institute, San Antonio, TX, United States
- John F. Wambaugh
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States. Correspondence: John F. Wambaugh
4
Rocco K, Margoum C, Richard L, Coquery M. Enhanced database creation with in silico workflows for suspect screening of unknown tebuconazole transformation products in environmental samples by UHPLC-HRMS. Journal of Hazardous Materials 2022; 440:129706. [PMID: 35961075 DOI: 10.1016/j.jhazmat.2022.129706] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/04/2022] [Revised: 07/12/2022] [Accepted: 07/30/2022] [Indexed: 06/15/2023]
Abstract
The search for and identification of organic contaminants in agricultural watersheds has become a crucial effort to better characterize watershed contamination by pesticides. The past decade has brought a more holistic view of watershed contamination via the deployment of powerful analytical strategies, such as non-target and suspect screening analysis, that can search for more contaminants and their transformation products. However, suspect screening analysis remains broadly confined to known molecules, primarily due to the lack of analytical standards and suspect databases for unknowns such as pesticide transformation products. Here we developed a novel workflow by cross-comparing the results of various in silico prediction tools against literature data to create an enhanced database for suspect screening of pesticide transformation products. This workflow was applied to tebuconazole, used here as a model pesticide, and resulted in a suspect screening database containing 291 transformation products. The chromatographic retention times and tandem mass spectra were predicted for each of these compounds using 6 models based on multilinear regression and more complex machine-learning algorithms. This comprehensive approach to the investigation and identification of tebuconazole transformation products was applied retrospectively to environmental samples, and 6 transformation products were identified for the first time in river water samples.
Affiliation(s)
- Kevin Rocco
- INRAE, UR RiverLy, 69625 Villeurbanne, France.
5
Sinclair G, Thillainadarajah I, Meyer B, Samano V, Sivasupramaniam S, Adams L, Willighagen EL, Richard AM, Walker M, Williams AJ. Wikipedia on the CompTox Chemicals Dashboard: Connecting Resources to Enrich Public Chemical Data. J Chem Inf Model 2022; 62:4888-4905. [PMID: 36215146 PMCID: PMC9597659 DOI: 10.1021/acs.jcim.2c00886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The online encyclopedia Wikipedia aggregates a large amount of data on chemistry, encompassing well over 20,000 individual Wikipedia pages and serves the general public as well as the chemistry community. Many other chemical databases and services utilize these data, and previous projects have focused on methods to index, search, and extract it for review and use. We present a comprehensive effort that combines bulk automated data extraction over tens of thousands of pages, semiautomated data extraction over hundreds of pages, and fine-grained manual extraction of individual lists and compounds of interest. We then correlate these data with the existing contents of the U.S. Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database. This was performed with a number of intentions including ensuring as complete a mapping as possible between the Dashboard and Wikipedia so that relevant snippets of the article are loaded for the user to review. Conflicts between Dashboard content and Wikipedia in terms of, for example, identifiers such as chemical registry numbers, names, and InChIs and structure-based collisions such as SMILES were identified and used as the basis of curation of both DSSTox and Wikipedia. This work also allowed us to evaluate available data for sets of chemicals of interest to the Agency, such as synthetic cannabinoids, and expand the content in DSSTox as appropriate. This work also led to improved bidirectional linkage of the detailed chemistry and usage information from Wikipedia with expert-curated structure and identifier data from DSSTox for a new list of nearly 20,000 chemicals. All of this work ultimately enhances the data mappings that allow for the display of the introduction of the Wikipedia article in the community-accessible web-based EPA CompTox Chemicals Dashboard, enhancing the user experience for the thousands of users per day accessing the resource.
Affiliation(s)
- Gabriel Sinclair
- ORAU Student Services Contractor to Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Inthirany Thillainadarajah
- Senior Environmental Employment Program, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Brian Meyer
- Senior Environmental Employment Program, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Vicente Samano
- Senior Environmental Employment Program, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Sakuntala Sivasupramaniam
- Senior Environmental Employment Program, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Linda Adams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Egon L Willighagen
- Department of Bioinformatics - BiGCaT, Maastricht University, 6229 ER Maastricht, The Netherlands
- Ann M Richard
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Martin Walker
- SUNY Potsdam - Chemistry, 44 Pierrepont Avenue, Potsdam, New York 13676, United States
- Antony J Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
6
Bremer PL, Vaniya A, Kind T, Wang S, Fiehn O. How Well Can We Predict Mass Spectra from Structures? Benchmarking Competitive Fragmentation Modeling for Metabolite Identification on Untrained Tandem Mass Spectra. J Chem Inf Model 2022; 62:4049-4056. [PMID: 36043939 DOI: 10.1021/acs.jcim.2c00936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Competitive Fragmentation Modeling for Metabolite Identification (CFM-ID) is a machine learning tool to predict in silico tandem mass spectra (MS/MS) for known or suspected metabolites for which chemical reference standards are not available. As a machine learning tool, it relies on both an underlying statistical model and an explicit training set that encompasses experimental mass spectra for specific compounds. Such mass spectra depend on specific parameters such as collision energies, instrument types, and adducts which are accumulated in libraries. Yet, ultimately prediction tools that are meant to cover wide expanses of entities must be validated on cases that were not included in the initial training and testing sets. Hence, we here benchmarked the performance of CFM-ID 4.0 to correctly predict MS/MS spectra for spectra that were not included in the CFM-ID training set and for different mass spectrometry conditions. We used 609,456 experimental tandem spectra from the NIST20 mass spectral library that were newly added to the previous NIST17 library version. We found that CFM-ID's highest energy prediction output would maximize the capacity for library generation. Matching the experimental collision energy with CFM-ID's prediction energy produced the best results, even for HCD-Orbitrap instruments. For benzenoids, better MS/MS predictions were achieved than for heterocyclic compounds. However, when exploring CFM-ID's performance on 8,305 compounds at 40 eV HCD-Orbitrap collision energy, >90% of the 20/80 split test compounds showed <700 MS/MS similarity score. Instead of a stand-alone tool, CFM-ID 4.0 might be useful to boost candidate structures in the greater context of identification workflows.
Affiliation(s)
- Parker Ladd Bremer
- Department of Chemistry, University of California Davis, Davis, California 95616, United States
- Arpana Vaniya
- West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
- Tobias Kind
- West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
- Shunyang Wang
- Department of Chemistry, University of California Davis, Davis, California 95616, United States
- Oliver Fiehn
- West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California Davis, Davis, California 95616, United States
7
Chao A, Grossman J, Carberry C, Lai Y, Williams AJ, Minucci JM, Purucker ST, Szilagyi J, Lu K, Boggess K, Fry RC, Sobus JR, Rager JE. Integrative exposomic, transcriptomic, epigenomic analyses of human placental samples links understudied chemicals to preeclampsia. Environment International 2022; 167:107385. [PMID: 35952468 PMCID: PMC9552572 DOI: 10.1016/j.envint.2022.107385] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/31/2022] [Revised: 06/22/2022] [Accepted: 06/27/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Environmental health research has recently undergone a dramatic shift, with ongoing technological advancements allowing for broader coverage of exposure and molecular biology signatures. Approaches to integrate such measures are still needed to increase understanding between systems-level exposure and biology. OBJECTIVES We address this gap by evaluating placental tissues to identify novel chemical-biological interactions associated with preeclampsia. This study tests the hypothesis that understudied chemicals are present in the human placenta and associated with preeclampsia-relevant disruptions, including overall case status (preeclamptic vs. normotensive patients) and underlying transcriptomic/epigenomic signatures. METHODS A non-targeted analysis based on high-resolution mass spectrometry was used to analyze placental tissues from a cohort of 35 patients with preeclampsia (n = 18) and normotensive (n = 17) pregnancies. Molecular feature data were prioritized for confirmation based on association with preeclampsia case status and confidence of chemical identification. All molecular features were evaluated for relationships to mRNA, microRNA, and CpG methylation (i.e., multi-omic) signature alterations involved in preeclampsia. RESULTS A total of 183 molecular features were identified with significantly differentiated abundance in placental extracts of preeclamptic patients; these features clustered into distinct chemical groupings using unsupervised methods. Of these features, 53 were identified (mapping to 40 distinct chemicals) using chemical standards, fragmentation spectra, and chemical metadata. In general, human metabolites had the largest feature intensities and strongest associations with preeclampsia-relevant multi-omic changes. Exogenous drugs were second most abundant and had fewer associations with multi-omic changes. Other exogenous chemicals (non-drugs) were least abundant and had the fewest associations with multi-omic changes. 
CONCLUSIONS These global data trends suggest that human metabolites are heavily intertwined with biological processes involved in preeclampsia etiology, while exogenous chemicals may still impact select transcriptomic/epigenomic processes. This study serves as a demonstration of merging systems exposures with systems biology to better understand chemical-disease relationships.
Affiliation(s)
- Alex Chao
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Chemical Characterization and Exposure Division, Research Triangle Park, NC, USA
- Celeste Carberry
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yunjia Lai
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Antony J. Williams
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Chemical Characterization and Exposure Division, Research Triangle Park, NC, USA
- Jeffrey M. Minucci
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Public Health and Environmental Assessment, Public Health and Environmental Systems Division, Research Triangle Park, NC, USA
- S. Thomas Purucker
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Great Lakes Toxicology and Ecology Division, Research Triangle Park, NC, USA
- John Szilagyi
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Kun Lu
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Curriculum in Toxicology and Environmental Medicine, School of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Kim Boggess
- Department of Obstetrics and Gynecology, Division of Maternal Fetal Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Rebecca C. Fry
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Curriculum in Toxicology and Environmental Medicine, School of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jon R. Sobus
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Chemical Characterization and Exposure Division, Research Triangle Park, NC, USA
- Julia E. Rager
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Curriculum in Toxicology and Environmental Medicine, School of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
8
Yan S, Bhawal R, Yin Z, Thannhauser TW, Zhang S. Recent advances in proteomics and metabolomics in plants. Molecular Horticulture 2022; 2:17. [PMID: 37789425 PMCID: PMC10514990 DOI: 10.1186/s43897-022-00038-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/24/2022] [Accepted: 06/20/2022] [Indexed: 10/05/2023]
Abstract
Over the past decade, systems biology and plant-omics have increasingly become mainstream in plant biology research. New developments in mass spectrometry and bioinformatics tools, together with methodological schema for integrating multi-omics data, have driven recent advances in proteomics and metabolomics. This progress is driving a rapid evolution in the field of plant research, greatly facilitating our understanding of the mechanistic aspects of plant metabolism and the interactions of plants with their external environment. Here, we review recent progress in MS-based proteomics and metabolomics tools and workflows, with a special focus on their applications to plant biology research, using several case studies related to mechanistic understanding of stress response, gene/protein function characterization, exploration of metabolic and signaling pathways, and natural product discovery. We also present a projection concerning future perspectives in MS-based proteomics and metabolomics development, including their applications to and challenges for systems biology. This review is intended to provide readers with an overview of how advanced MS technology and the integrated application of proteomics and metabolomics can be used to advance plant systems biology research.
Affiliation(s)
- Shijuan Yan
- Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Ruchika Bhawal
- Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA
- Zhibin Yin
- Guangdong Key Laboratory for Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Sheng Zhang
- Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, 139 Biotechnology Building, 526 Campus Road, Ithaca, NY, 14853, USA.

9
Zweigle J, Bugsel B, Zwiener C. FindPFΔS: Non-Target Screening for PFAS - Comprehensive Data Mining for MS2 Fragment Mass Differences. Anal Chem 2022; 94:10788-10796. [PMID: 35866933 DOI: 10.1021/acs.analchem.2c01521] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/30/2022]
Abstract
The limited availability of analytical reference standards makes non-target screening approaches based on high-resolution mass spectrometry increasingly important for the efficient identification of unknown PFAS (per- and polyfluoroalkyl substances) and their transformation products (TPs). We developed and optimized a vendor-independent open-source Python-based algorithm (FindPFΔS = FindPolyFluoroDeltas) to search for distinct fragment mass differences in MS/MS raw data (.ms2-files). Optimization with PFAS standards and two pre-characterized paper and soil samples (iterative data-dependent acquisition) revealed Δ(CF2)n, ΔHF, ΔCnH3F2n-3, ΔCnH2F2n-4, ΔCnHF2n-5, ΔCnF2nSO3, ΔCF3, and ΔCF2O as relevant and selective fragment differences depending on applied collision energies. In a PFAS standard mix, 94% (36 of 38 compounds from 10 compound classes) could be found by FindPFΔS. The use of fragment differences was applicable to a wide range of PFAS classes and appears as a promising new approach for PFAS identification. The influence of mass tolerance and intensity threshold on the identification efficiency and on the detection of false positives was systematically evaluated with the use of selected HR-MS2-spectra (20,998) from MassBank. To this end, with the use of FindPFΔS, we could identify different unknown PFAS homologues in the paper extracts. FindPFΔS is freely available as both Python source code on GitHub (https://github.com/JonZwe/FindPFAS) and as an executable Windows application (https://doi.org/10.5281/zenodo.6797353) with a graphical user interface on Zenodo.
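The idea of searching an MS/MS fragment list for characteristic mass differences can be illustrated with a short Python sketch. This is not the FindPFΔS implementation: the function name `find_fragment_differences`, the 0.002 Da default tolerance, and the example fragment list are assumptions made for illustration, and only four of the fragment differences named in the abstract (ΔCF2, ΔHF, ΔCF3, ΔCF2O) are included.

```python
from itertools import combinations

# Monoisotopic atomic masses in Da
MASS = {"C": 12.0, "H": 1.00782503, "F": 18.99840316, "O": 15.99491462}

# A small, illustrative subset of fragment mass differences (Da)
DELTAS = {
    "CF2": MASS["C"] + 2 * MASS["F"],
    "HF": MASS["H"] + MASS["F"],
    "CF3": MASS["C"] + 3 * MASS["F"],
    "CF2O": MASS["C"] + 2 * MASS["F"] + MASS["O"],
}


def find_fragment_differences(mzs, tol=0.002):
    """Return (lower m/z, higher m/z, label) for every fragment pair whose
    mass difference matches an entry in DELTAS within +/- tol (Da)."""
    hits = []
    for lo, hi in combinations(sorted(mzs), 2):
        diff = hi - lo
        for label, delta in DELTAS.items():
            if abs(diff - delta) <= tol:
                hits.append((lo, hi, label))
    return hits


# Hypothetical fragment list resembling a perfluoroalkyl anion series
example = [118.9926, 168.9894, 218.9862]
hits = find_fragment_differences(example)
for lo, hi, label in hits:
    print(f"{lo:.4f} -> {hi:.4f}: delta {label}")
```

A real workflow would also need to parse spectra from raw files, apply an intensity threshold, and handle multiples such as Δ(CF2)n; the abstract notes that both mass tolerance and intensity threshold affect the false-positive rate.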
Affiliation(s)
- Jonathan Zweigle
- Environmental Analytical Chemistry, Center for Applied Geoscience, University of Tübingen, Schnarrenbergstraße 94-96, Tübingen 72076, Germany
- Boris Bugsel
- Environmental Analytical Chemistry, Center for Applied Geoscience, University of Tübingen, Schnarrenbergstraße 94-96, Tübingen 72076, Germany
- Christian Zwiener
- Environmental Analytical Chemistry, Center for Applied Geoscience, University of Tübingen, Schnarrenbergstraße 94-96, Tübingen 72076, Germany

10
Phillips AL, Williams AJ, Sobus JR, Ulrich EM, Gundersen J, Langlois-Miller C, Newton SR. A Framework for Utilizing High-Resolution Mass Spectrometry and Nontargeted Analysis in Rapid Response and Emergency Situations. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY 2022; 41:1117-1130. [PMID: 34416028 PMCID: PMC9280853 DOI: 10.1002/etc.5196] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/28/2021] [Revised: 07/26/2021] [Accepted: 08/17/2021] [Indexed: 05/03/2023]
Abstract
Unknown chemical releases constitute a large portion of the rapid response situations to which the US Environmental Protection Agency is called on to respond. Workflows used to address unknown chemical releases currently involve screening for a large array of known compounds using many different targeted methods. When matches are not found, expert analytical chemistry knowledge is used to propose possible candidates from the available data, which generally includes low-resolution mass spectra and situational clues such as the location of the release, nearby industrial operations, and other field-reported facts. The past decade has witnessed dramatic improvements in capabilities for identifying unknown compounds using high-resolution mass spectrometry (HRMS) and nontargeted analysis (NTA) approaches. Complementary developments in cheminformatics tools have further enabled an increase in NTA throughput and identification confidence. Together with the expanding availability of HRMS instrumentation in monitoring laboratories, these advancements make NTA highly relevant to rapid response scenarios. In this article, we introduce the concept of NTA as it relates to rapid response needs and describe how it can be applied to address unknown chemical releases. We advocate for the consideration of HRMS-based NTA approaches to support future rapid response scenarios. Environ Toxicol Chem 2022;41:1117-1130. Published 2021. This article is a U.S. Government work and is in the public domain in the USA.
Affiliation(s)
- Allison L. Phillips
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Public Health and Environmental Assessment, Research Triangle Park, NC 27711
- Antony J. Williams
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC 27711
- Jon R. Sobus
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC 27711
- Elin M. Ulrich
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC 27711
- Jennifer Gundersen
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Environmental Measurement and Modeling, Narragansett, RI 02882
- Christina Langlois-Miller
- U.S. Environmental Protection Agency, Office of Land and Emergency Management, Office of Emergency Management, Washington D.C. 20460
- Seth R. Newton
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC 27711
- Corresponding author contact information: Seth R. Newton, Mail: 109 T.W. Alexander Drive E205-05, RTP, NC 27711

11
Reiter A, Asgari J, Wiechert W, Oldiges M. Metabolic Footprinting of Microbial Systems Based on Comprehensive In Silico Predictions of MS/MS Relevant Data. Metabolites 2022; 12:257. [PMID: 35323700 PMCID: PMC8949988 DOI: 10.3390/metabo12030257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2022] [Revised: 03/08/2022] [Accepted: 03/12/2022] [Indexed: 12/12/2022] Open
Abstract
Metabolic footprinting represents a holistic approach to gathering large-scale metabolomic information of a given biological system and is, therefore, a driving force for systems biology and bioprocess development. The ongoing development of automated cultivation platforms increases the need for a comprehensive and rapid profiling tool to cope with the cultivation throughput. In this study, we implemented a workflow to provide and select relevant metabolite information from a genome-scale model to automatically build an organism-specific comprehensive metabolome analysis method. Based on in-house literature and predicted metabolite information, the deduced metabolite set was distributed in stackable methods for a chromatography-free dilute-and-shoot flow-injection analysis multiple-reaction monitoring profiling approach. The workflow was used to create a method specific for Saccharomyces cerevisiae, covering 252 metabolites with 7 min/sample. The method was validated with a commercially available yeast metabolome standard, identifying up to 74.2% of the listed metabolites. As a first case study, three commercially available yeast extracts were screened, with 118 metabolites passing quality control thresholds for statistical analysis, allowing discriminating metabolites to be identified. The presented methodology provides metabolite screening in a time-optimised way by scaling analysis time to metabolite coverage and is open to other microbial systems simply starting from genome-scale model information.
Affiliation(s)
- Alexander Reiter
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Biotechnology, RWTH Aachen University, 52062 Aachen, Germany
- Jian Asgari
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Biotechnology, RWTH Aachen University, 52062 Aachen, Germany
- Wolfgang Wiechert
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Computational Systems Biotechnology, RWTH Aachen University, 52062 Aachen, Germany
- Marco Oldiges
- Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Biotechnology, RWTH Aachen University, 52062 Aachen, Germany

12
Sussman EM, Oktem B, Isayeva IS, Liu J, Wickramasekara S, Chandrasekar V, Nahan K, Shin HY, Zheng J. Chemical Characterization and Non-targeted Analysis of Medical Device Extracts: A Review of Current Approaches, Gaps, and Emerging Practices. ACS Biomater Sci Eng 2022; 8:939-963. [PMID: 35171560 DOI: 10.1021/acsbiomaterials.1c01119] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 12/11/2022]
Abstract
The developers of medical devices evaluate the biocompatibility of their device prior to FDA's review and subsequent introduction to the market. Chemical characterization, described in ISO 10993-18:2020, can generate information for toxicological risk assessment and is an alternative approach for addressing some biocompatibility end points (e.g., systemic toxicity, genotoxicity, carcinogenicity, reproductive/developmental toxicity) that can reduce the time and cost of testing and the need for animal testing. Additionally, chemical characterization can be used to determine whether modifications to the materials and manufacturing processes alter the chemistry of a patient-contacting device to an extent that could impact device safety. Extractables testing is one approach to chemical characterization that employs combinations of non-targeted analysis, non-targeted screening, and/or targeted analysis to establish the identities and quantities of the various chemical constituents that can be released from a device. Due to the difficulty in obtaining a priori information on all the constituents in finished devices, information generation strategies in the form of analytical chemistry testing are often used. Identified and quantified extractables are then assessed using toxicological risk assessment approaches to determine if reported quantities are sufficiently low to overcome the need for further chemical analysis, biological evaluation of select end points, or risk control. For extractables studies to be useful as a screening tool, comprehensive and reliable non-targeted methods are needed. Although non-targeted methods have been adopted by many laboratories, they are laboratory-specific and require expensive analytical instruments and advanced technical expertise to perform. 
In this Perspective, we describe the elements of extractables studies and provide an overview of the current practices, identified gaps, and emerging practices that may be adopted on a wider scale in the future. This Perspective is outlined according to the steps of an extractables study: information gathering, extraction, extract sample processing, system selection, qualification, quantification, and identification.
Affiliation(s)
- Eric M Sussman
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Berk Oktem
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Irada S Isayeva
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Jinrong Liu
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Samanthi Wickramasekara
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Vaishnavi Chandrasekar
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Keaton Nahan
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Hainsworth Y Shin
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States
- Jiwen Zheng
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States

13
Jansen LJM, Nijssen R, Bolck YJC, Wegh RS, van de Schans MGM, Berendsen BJA. Systematic assessment of acquisition and data-processing parameters in the suspect screening of veterinary drugs in archive matrices using LC-HRMS. Food Addit Contam Part A Chem Anal Control Expo Risk Assess 2021; 39:272-284. [PMID: 34854800 DOI: 10.1080/19440049.2021.1999507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Monitoring strategies for veterinary drugs in products of animal origin are shifting towards a more risk-based approach. Such strategies not only target a limited number of predefined substances but also facilitate detection of unexpected substances. By combining the use of archive matrices such as feather meal with suspect-screening methods, early detection of new hazards in the food and feed industry can be achieved. Effective application of such strategies is hampered by complex data interpretation and therefore, targeted data analysis is commonly applied. In this study, the performance of a suspect-screening data processing workflow using a suspect list or the online spectral database mzCloud™ was explored to facilitate detection of veterinary drugs in archive matrices. Data evaluation parameters specifically investigated for application of a suspect list were mass tolerance and the addition or omission of retention times. Application of a mass tolerance of 1.5 ppm leads to an increase in the number of false positives, as does omission of retention times in the suspect list. Different acquisition modes yielding different qualities of MS2 data were studied and proved to be a critical factor, where data-dependent acquisition is preferred when matching to the mzCloud™ database. Using this approach, it is possible to search for compounds on a dedicated suspect list based on the exact mass and retention times and, at the same time, detect unexpected compounds without a priori information. A pilot study was conducted and fourteen different antibiotics were detected (and confirmed by MS/MS). Three of these antibiotics were not included in the suspect list. The optimised suspect-screening method proved to be fit for the purpose of finding veterinary drugs in feather meal, which are not in the scope of the current monitoring methods and therefore, it gives added value in the perspective of a risk-based monitoring.
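The interplay between mass tolerance and retention time in suspect-list matching can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not the workflow used in the study: the `Suspect` class, the `match_suspects` function, the default tolerances, and the two suspect entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suspect:
    name: str
    mz: float                       # theoretical m/z of the expected ion
    rt_min: Optional[float] = None  # expected retention time (min), if known


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_suspects(measured_mz, measured_rt, suspects, ppm_tol=5.0, rt_tol=None):
    """Return names of suspects within the ppm tolerance and, when rt_tol is
    given and the suspect has a retention time, within the RT window too."""
    matches = []
    for s in suspects:
        if abs(ppm_error(measured_mz, s.mz)) > ppm_tol:
            continue
        if rt_tol is not None and s.rt_min is not None:
            if abs(measured_rt - s.rt_min) > rt_tol:
                continue
        matches.append(s.name)
    return matches


# Two hypothetical suspects with nearly identical m/z but different retention times
suspects = [
    Suspect("suspect_A", 300.1000, rt_min=4.2),
    Suspect("suspect_B", 300.1003, rt_min=7.9),
]

mass_only = match_suspects(300.1001, 4.25, suspects)
mass_and_rt = match_suspects(300.1001, 4.25, suspects, rt_tol=0.2)
print(mass_only, mass_and_rt)
```

In this toy case, matching on mass alone returns both candidates, while adding the retention time window leaves only one, which mirrors the abstract's observation that omitting retention times increases false positives.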
Affiliation(s)
- Larissa J M Jansen
- Authenticity & Veterinary Drugs, Wageningen Food Safety Research, Wageningen, The Netherlands
- Rosalie Nijssen
- Contaminants & Toxicology, Wageningen Food Safety Research, Wageningen, The Netherlands
- Yvette J C Bolck
- Authenticity & Veterinary Drugs, Wageningen Food Safety Research, Wageningen, The Netherlands
- Robin S Wegh
- Authenticity & Veterinary Drugs, Wageningen Food Safety Research, Wageningen, The Netherlands
- Milou G M van de Schans
- Authenticity & Veterinary Drugs, Wageningen Food Safety Research, Wageningen, The Netherlands
- Bjorn J A Berendsen
- Authenticity & Veterinary Drugs, Wageningen Food Safety Research, Wageningen, The Netherlands

14
Gastroprotective effects and metabolomic profiling of Chasteberry fruits against indomethacin-induced gastric injury in rats. J Funct Foods 2021. [DOI: 10.1016/j.jff.2021.104732] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022] Open

15
David A, Chaker J, Price EJ, Bessonneau V, Chetwynd AJ, Vitale CM, Klánová J, Walker DI, Antignac JP, Barouki R, Miller GW. Towards a comprehensive characterisation of the human internal chemical exposome: Challenges and perspectives. ENVIRONMENT INTERNATIONAL 2021; 156:106630. [PMID: 34004450 DOI: 10.1016/j.envint.2021.106630] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 02/07/2021] [Revised: 04/15/2021] [Accepted: 05/03/2021] [Indexed: 05/18/2023]
Abstract
The holistic characterisation of the human internal chemical exposome using high-resolution mass spectrometry (HRMS) would be a step forward to investigate the environmental ætiology of chronic diseases with an unprecedented precision. HRMS-based methods are currently operational to reproducibly profile thousands of endogenous metabolites as well as externally-derived chemicals and their biotransformation products in a large number of biological samples from human cohorts. These approaches provide a solid ground for the discovery of unrecognised biomarkers of exposure and metabolic effects associated with many chronic diseases. Nevertheless, some limitations remain and have to be overcome so that chemical exposomics can provide unbiased detection of chemical exposures affecting disease susceptibility in epidemiological studies. Some of these limitations include (i) the lack of versatility of analytical techniques to capture the wide diversity of chemicals; (ii) the lack of analytical sensitivity that prevents the detection of exogenous (and endogenous) chemicals occurring at (ultra) trace levels from restricted sample amounts, and (iii) the lack of automation of the annotation/identification process. In this article, we discuss a number of technological and methodological limitations hindering applications of HRMS-based methods and propose initial steps to push towards a more comprehensive characterisation of the internal chemical exposome. We also discuss other challenges including the need for harmonisation and the difficulty inherent in assessing the dynamic nature of the internal chemical exposome, as well as the need for establishing a strong international collaboration, high level networking, and sustainable research infrastructure. 
A great amount of research, technological development and innovative bio-informatics tools are still needed to profile and characterise the "invisible" (not profiled), "hidden" (not detected) and "dark" (not annotated) components of the internal chemical exposome and concerted efforts across numerous research fields are paramount.
Affiliation(s)
- Arthur David
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France.
- Jade Chaker
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Elliott J Price
- Faculty of Sports Studies, Masaryk University, Brno, Czech Republic; RECETOX Centre, Masaryk University, Brno, Czech Republic
- Vincent Bessonneau
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35000 Rennes, France
- Andrew J Chetwynd
- School of Geography Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Jana Klánová
- RECETOX Centre, Masaryk University, Brno, Czech Republic
- Douglas I Walker
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Robert Barouki
- Unité UMR-S 1124 Inserm-Université Paris Descartes "Toxicologie Pharmacologie et Signalisation Cellulaire", Paris, France
- Gary W Miller
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA

16
High-confidence structural annotation of metabolites absent from spectral libraries. Nat Biotechnol 2021; 40:411-421. [PMID: 34650271 PMCID: PMC8926923 DOI: 10.1038/s41587-021-01045-9] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 04/09/2021] [Accepted: 08/04/2021] [Indexed: 12/14/2022]
Abstract
Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but, typically, only a small fraction of spectra can be matched. Previous in silico methods search in structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow that combines in silico structure database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse datasets, COSMIC annotates a substantial number of hits at low false discovery rates and outperforms spectral library search. To demonstrate that COSMIC can annotate structures never reported before, we annotated 12 natural bile acids. The annotation of nine structures was confirmed by manual evaluation and two structures using synthetic standards. In human samples, we annotated and manually validated 315 molecular structures currently absent from the Human Metabolome Database. Application of COSMIC to data from 17,400 metabolomics experiments led to 1,715 high-confidence structural annotations that were absent from spectral libraries.

17
Williams AJ, Lambert JC, Thayer K, Dorne JLCM. Sourcing data on chemical properties and hazard data from the US-EPA CompTox Chemicals Dashboard: A practical guide for human risk assessment. ENVIRONMENT INTERNATIONAL 2021; 154:106566. [PMID: 33934018 PMCID: PMC9667884 DOI: 10.1016/j.envint.2021.106566] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 12/14/2020] [Revised: 04/04/2021] [Accepted: 04/05/2021] [Indexed: 05/19/2023]
Abstract
For the past six decades, human health risk assessment of chemicals has relied on in vivo data from human epidemiological and experimental animal toxicological studies to inform the derivation of non-cancer toxicity values. The ongoing evolution of this risk assessment paradigm in an environmental landscape of data-poor chemicals has highlighted the need to develop and implement non-testing methods, so-called New Approach Methodologies (NAMs). NAMs include a growing number of in silico and in vitro data streams designed to inform hazard properties of chemicals, including kinetics and dynamics at different levels of biological organization, environmental fate and transport, and exposure. NAMs provide a fit-for-purpose science-basis for human hazard and risk characterization of chemicals ranging from data-gap filling applications to broad evidence-based decision-making. Systematic assembly and delivery of empirical and predicted data for chemicals are paramount to advancing chemical evaluation, and software tools serve an essential role in delivering these data to the scientific community. The CompTox Chemicals Dashboard (from here on referred to as the "Dashboard") is one such tool and is a publicly available web-based application developed by the US Environmental Protection Agency to provide access to chemistry, toxicity and exposure information for ~900,000 chemicals. The Dashboard is increasingly becoming a valuable resource for assessors tasked with the evaluation of potential human health risks associated with chemical exposures. 
In this context, the significant amount of information present in the Dashboard facilitates: 1) assembly of information on physicochemical properties and environmental fate and transport and exposure parameters and metrics; 2) identification of cancer and non-cancer health effects from extant human and experimental animal studies in the public domain and/or information not available in the public domain (i.e., "grey literature"); 3) systematic literature searching and review for developing cancer and non-cancer hazard evidence bases; and 4) access to mechanistic information that can aid or augment the analysis of traditional toxicology evidence bases, or potentially, serve as the primary basis for informing hazard identification and dose-response when traditional bioassay data are lacking. Finally, in silico predictive tools developed to conduct structure-activity or read-across analyses are also available within the Dashboard. This practical tutorial is intended to address key questions from the human health risk assessment community dealing with chemicals in both food and in the environment. Perspectives for future development or refinement of the Dashboard highlight foreseen activities to further support the research and risk assessment community in cancer and non-cancer chemical evaluations.
Affiliation(s)
- Antony J Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, NC, USA.
- Jason C Lambert
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, NC, USA
- Kris Thayer
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, NC, USA
- Jean-Lou C M Dorne
- Scientific Committee and Emerging Risks Unit, Department of Risk Assessment and Scientific Assistance, European Food Safety Authority, 43126 Parma, Italy

18
Meijer J, Lamoree M, Hamers T, Antignac JP, Hutinet S, Debrauwer L, Covaci A, Huber C, Krauss M, Walker DI, Schymanski EL, Vermeulen R, Vlaanderen J. An annotation database for chemicals of emerging concern in exposome research. ENVIRONMENT INTERNATIONAL 2021; 152:106511. [PMID: 33773387 DOI: 10.1016/j.envint.2021.106511] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/01/2020] [Revised: 02/03/2021] [Accepted: 03/06/2021] [Indexed: 05/18/2023]
Abstract
BACKGROUND: Chemicals of Emerging Concern (CECs) include a very wide group of chemicals that are suspected to be responsible for adverse effects on health, but for which very limited information is available. Chromatographic techniques coupled with high-resolution mass spectrometry (HRMS) can be used for non-targeted screening and detection of CECs, by using comprehensive annotation databases. Establishing a database focused on the annotation of CECs in human samples will provide new insight into the distribution and extent of exposures to a wide range of CECs in humans. OBJECTIVES: This study describes an approach for the aggregation and curation of an annotation database (CECscreen) for the identification of CECs in human biological samples. METHODS: The approach consists of three main parts. First, CECs compound lists from various sources were aggregated and duplications and inorganic compounds were removed. Subsequently, the list was curated by standardization of structures to create "MS-ready" and "QSAR-ready" SMILES, as well as calculation of exact masses (monoisotopic and adducts) and molecular formulas. The second step included the simulation of Phase I metabolites. The third and final step included the calculation of QSAR predictions related to physicochemical properties, environmental fate, toxicity and Absorption, Distribution, Metabolism, Excretion (ADME) processes and the retrieval of information from the US EPA CompTox Chemicals Dashboard. RESULTS: All CECscreen database and property files are publicly available (DOI: https://doi.org/10.5281/zenodo.3956586). In total, 145,284 entries were aggregated from various CECs data sources. After elimination of duplicates and curation, the pipeline produced 70,397 unique "MS-ready" structures and 66,071 unique QSAR-ready structures, corresponding with 69,526 CAS numbers. Simulation of Phase I metabolites resulted in 306,279 unique metabolites. QSAR predictions could be performed for 64,684 of the QSAR-ready structures, whereas information was retrieved from the CompTox Chemicals Dashboard for 59,739 CAS numbers out of 69,526 inquiries. CECscreen is incorporated in the in silico fragmentation approach MetFrag. DISCUSSION: The CECscreen database can be used to prioritize annotation of CECs measured in non-targeted HRMS, facilitating the large-scale detection of CECs in human samples for exposome research. Large-scale detection of CECs can be further improved by integrating the present database with resources that contain CECs (metabolites) and meta-data measurements, further expansion towards in silico and experimental (e.g., MassBank) generation of MS/MS spectra, and development of bioinformatics approaches capable of using correlation patterns in the measured chemical features.
Affiliation(s)
- Jeroen Meijer
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands; Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
- Marja Lamoree
- Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
- Timo Hamers
- Department Environment & Health, Vrije Universiteit, Amsterdam, the Netherlands
- Laurent Debrauwer
- Toxalim (Research Centre in Food Toxicology), Toulouse University, INRAE, ENVT, INP-Purpan, Toulouse, France; Metatoul-AXIOM Platform, National Infrastructure for Metabolomics and Fluxomics: MetaboHUB, Toxalim, INRAE, Toulouse, France
- Adrian Covaci
- Toxicological Center, University of Antwerp, Belgium
- Carolin Huber
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Martin Krauss
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Douglas I Walker
- Department of Environmental Medicine and Public Health, Icahn School of Medicine, Mount Sinai, New York, NY, USA
- Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Roel Vermeulen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands
- Jelle Vlaanderen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Utrecht, the Netherlands.

19
Odenkirk MT, Reif DM, Baker ES. Multiomic Big Data Analysis Challenges: Increasing Confidence in the Interpretation of Artificial Intelligence Assessments. Anal Chem 2021; 93:7763-7773. [PMID: 34029068 PMCID: PMC8465926 DOI: 10.1021/acs.analchem.0c04850] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 02/08/2023]
Abstract
The need for holistic molecular measurements to better understand disease initiation, development, diagnosis, and therapy has led to an increasing number of multiomic analyses. The wealth of information available from multiomic assessments, however, requires both the evaluation and interpretation of extremely large data sets, limiting analysis throughput and ease of adoption. Computational methods utilizing artificial intelligence (AI) provide the most promising way to address these challenges, yet despite the conceptual benefits of AI and its successful application in singular omic studies, the widespread use of AI in multiomic studies remains limited. Here, we discuss present and future capabilities of AI techniques in multiomic studies while introducing analytical checks and balances to validate the computational conclusions.
Affiliation(s)
- Melanie T Odenkirk
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
- David M Reif
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27606, United States
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27606, United States
- Erin S Baker
- Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27606, United States
|
20
|
Taylor M, Lukowski JK, Anderton CR. Spatially Resolved Mass Spectrometry at the Single Cell: Recent Innovations in Proteomics and Metabolomics. J Am Soc Mass Spectrom 2021; 32:872-894. [PMID: 33656885 PMCID: PMC8033567 DOI: 10.1021/jasms.0c00439] [Citation(s) in RCA: 131] [Impact Index Per Article: 43.7] [Received: 12/02/2020] [Revised: 01/20/2021] [Accepted: 01/25/2021] [Indexed: 05/02/2023]
Abstract
Biological systems are composed of heterogeneous populations of cells that intercommunicate to form a functional living tissue. Biological function varies greatly across populations of cells, as each single cell has a unique transcriptome, proteome, and metabolome that translates to functional differences within single species and across kingdoms. Over the past decade, substantial advancements in our ability to characterize omic profiles on a single cell level have occurred, including in multiple spectroscopic and mass spectrometry (MS)-based techniques. Of these technologies, spatially resolved mass spectrometry approaches, including mass spectrometry imaging (MSI), have shown the most progress for single cell proteomics and metabolomics. For example, reporter-based methods using heavy metal tags have allowed for targeted MS investigation of the proteome at the subcellular level, and the development of technologies such as laser ablation electrospray ionization mass spectrometry (LAESI-MS) now means that dynamic metabolomics can be performed in situ. In this Perspective, we showcase advancements in single cell spatial metabolomics and proteomics over the past decade and highlight important aspects related to high-throughput screening, data analysis, and more, which are vital to achieving proteomic and metabolomic profiling at the single cell scale. Finally, using this broad literature summary, we provide a perspective on how the next decade may unfold in the area of single cell MS-based proteomics and metabolomics.
Affiliation(s)
- Michael J. Taylor
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Jessica K. Lukowski
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Christopher R. Anderton
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
|
21
|
Krettler CA, Thallinger GG. A map of mass spectrometry-based in silico fragmentation prediction and compound identification in metabolomics. Brief Bioinform 2021; 22:6184408. [PMID: 33758925 DOI: 10.1093/bib/bbab073] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/26/2020] [Revised: 01/29/2021] [Accepted: 02/12/2021] [Indexed: 12/27/2022]
Abstract
Metabolomics, the comprehensive study of the metabolome, and lipidomics, the large-scale study of pathways and networks of cellular lipids, are major driving forces in enabling personalized medicine. Complicated and error-prone data analysis remains a bottleneck, however, especially for identifying novel metabolites. Comparing experimental mass spectra to curated databases containing reference spectra has been the gold standard for identification of compounds, but constructing such databases is a costly and time-demanding task. Many software applications try to circumvent this process by using cutting-edge advances in computational methods, including quantum chemistry and machine learning, to simulate mass spectra by performing theoretical, so-called in silico, fragmentations of compounds. Other solutions concentrate directly on experimental spectra and try to identify structural properties by investigating recurring patterns and the relationships between them. The considerable progress made in the field allows recent approaches to provide valuable clues to expedite annotation of experimental mass spectra. This review sheds light on the individual strengths and weaknesses of these tools and attempts to evaluate them, especially in view of lipidomics, considering the complex mixtures found in biological samples as well as mass spectrometer inter-instrument variability.
Affiliation(s)
- Christoph A Krettler
- Institute of Biomedical Informatics, Graz University of Technology, Stremayrgasse 16/I, 8010, Graz, Austria; Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010, Graz, Austria
- Gerhard G Thallinger
- Institute of Biomedical Informatics, Graz University of Technology, Stremayrgasse 16/I, 8010, Graz, Austria; Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010, Graz, Austria
|
22
|
Pleil JD, Lowe CN, Wallace MAG, Williams AJ. Using the US EPA CompTox Chemicals Dashboard to interpret targeted and non-targeted GC-MS analyses from human breath and other biological media. J Breath Res 2021; 15:025001. [PMID: 33734097 DOI: 10.1088/1752-7163/abdb03] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
The U.S. EPA CompTox Chemicals Dashboard is a freely available web-based application providing access to chemistry, toxicity, and exposure data for ∼900 000 chemicals. Data, search functionality, and prediction models within the Dashboard can help identify chemicals found in environmental analyses and human biomonitoring. It was designed to deliver data generated to support computational toxicology, to reduce chemical testing on animals, and to provide access to new approach methodologies including prediction models. The inclusion of mass and formula-based searches, together with relevant ranking approaches, allows for the identification and prioritization of exogenous (environmental) chemicals from high resolution mass spectrometry in need of further evaluation. The Dashboard includes chemicals that can be detected by liquid chromatography, gas chromatography-mass spectrometry (GC-MS) and direct-MS analyses, and chemical lists have been added that highlight breath-borne volatile and semi-volatile organic compounds. The Dashboard can be searched using various chemical identifiers (e.g. chemical synonyms, CASRN and InChIKeys), chemical formula, MS-ready formulae, monoisotopic mass, consumer product categories and assays/genes associated with high-throughput screening data. An integrated search at the chemical level queries PubMed to identify relevant published literature. This article describes specific procedures using the Dashboard as a first-stop tool for exploring both targeted and non-targeted results from GC-MS analyses of chemicals found in breath, exhaled breath condensate, and associated aerosols.
Affiliation(s)
- Joachim D Pleil
- Gillings School of Public Health, University of North Carolina, Chapel Hill, NC, United States of America
|
23
|
Bhuyan DJ, Alsherbiny MA, Low MN, Zhou X, Kaur K, Li G, Li CG. Broad-spectrum pharmacological activity of Australian propolis and metabolomic-driven identification of marker metabolites of propolis samples from three continents. Food Funct 2021; 12:2498-2519. [PMID: 33683257 DOI: 10.1039/d1fo00127b] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
Propolis is a by-product of honeybee farming known for its broad therapeutic benefits around the world and is extensively used in the health food and beverage industry. Despite Australia being one of the world's megadiverse countries with rich flora and fauna, Australian propolis samples have not been explored adequately with most in vitro and in vivo studies centred on their Brazilian and Chinese counterparts. In view of this, our study was designed to investigate the chemical composition and anti-proliferative, antibacterial, antifungal, anti-inflammatory and antioxidant properties of Australian propolis (AP-1) extract to draw a comparison with Brazilian (BP-1) and Chinese propolis (CP-1) extracts. The AP-1 extract displayed significantly greater anti-proliferative activity against the MCF7 and the MDA-MB-231 metastatic breast adenocarcinoma cell lines compared to BP-1 and CP-1 (p < 0.05). Similar trends were also observed in the antibacterial (Escherichia coli and Staphylococcus aureus), anti-inflammatory (lipopolysaccharide-induced RAW264.7 macrophages) and antioxidant assays (ABTS, DPPH and CUPRAC) with AP-1 exhibiting more potent activity than BP-1 and CP-1. The ultra-high performance liquid chromatography (UPLC) coupled with quadrupole high-resolution time of flight mass spectrometry (qTOF-MS) and chemometrics implementing unsupervised PCA and supervised OPLS-DA analyses of the propolis samples from Australia, China and Brazil revealed 67 key discriminatory metabolites belonging to seven main chemical classes including flavonoids, triterpenes, acid derivatives, stilbenes, steroid derivatives, diterpenes and miscellaneous compounds. Additionally, seven common phenolic compounds were quantified in the samples. Further mechanistic studies are necessary to elucidate the modes of action of Australian propolis for its prospective use in the food, nutraceutical and pharmaceutical industries.
Affiliation(s)
- Deep Jyoti Bhuyan
- NICM Health Research Institute, Western Sydney University, Penrith, NSW, Australia.
|
24
|
Data processing strategies for non-targeted analysis of foods using liquid chromatography/high-resolution mass spectrometry. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116188] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/17/2022]
|
25
|
Perkons I, Rusko J, Zacs D, Bartkevics V. Rapid determination of pharmaceuticals in wastewater by direct infusion HRMS using target and suspect screening analysis. Sci Total Environ 2021; 755:142688. [PMID: 33059144 DOI: 10.1016/j.scitotenv.2020.142688] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/29/2020] [Revised: 09/11/2020] [Accepted: 09/26/2020] [Indexed: 06/11/2023]
Abstract
Wide-scope screening of active pharmaceutical ingredients (APIs) and their transformation products (TPs) in wastewater can yield valuable insights and pinpoint emerging contaminants that have not been previously reported. Such information is relevant for investigating their occurrence and fate in various environmental compartments. In this study, we explored the applicability of direct infusion high resolution mass spectrometry (DI-HRMS) for comprehensive and rapid detection of APIs and their TPs in wastewater samples. The method was developed using a Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) system and incorporated both wide-scope suspect screening and semi-quantitative determination of selected analytes. The identification strategy was based on the following criteria: a narrow accurate mass window (±1.25 ppm) for the two most abundant full-MS signals, isotopic pattern fit, and additional confirmation on the basis of MS2 spectra at three fragmentation levels. The tentative identification of suspects and target compounds relied on an in-house database containing more than 500 different APIs and TPs. The measured fragment spectra were matched against experimental MS2 patterns obtained from a publicly available spectral library (MassBank of North America) and in silico generated fragmentation features (from the CFM-ID algorithm). In total, 79 suspects were identified and 24 target compounds were semi-quantified in 72 wastewater samples. The highest detection frequencies in treated wastewater effluents were observed for diclofenac, metoprolol and telmisartan, while hydroxydiclofenac, dextrorphan, and carbamazepine metabolites were the most frequently detected TPs. The obtained API profiles were in accordance with the national consumption statistics and the origin of wastewater samples. The developed method is suitable for rapid screening of APIs in wastewater and can be used as a complementary tool to characterize API emissions from wastewater treatment facilities and to identify problematic compounds that require more rigorous monitoring.
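The first identification criterion above, an accurate mass window of ±1.25 ppm, can be sketched as follows. This is only an illustration of the ppm arithmetic: the suspect names and m/z values are hypothetical, the helper names are not taken from the study, and the isotopic pattern and MS2 criteria are not covered.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_mass_window(observed_mz: float, theoretical_mz: float,
                       tolerance_ppm: float = 1.25) -> bool:
    """True if the absolute mass error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


def match_suspects(observed_mz: float, suspects: dict,
                   tolerance_ppm: float = 1.25) -> list:
    """Return the names of suspects whose theoretical m/z lies inside the window."""
    return [
        name
        for name, theoretical_mz in suspects.items()
        if within_mass_window(observed_mz, theoretical_mz, tolerance_ppm)
    ]


# Hypothetical suspect list: name -> theoretical m/z of the expected ion.
suspects = {"suspect_A": 296.02396, "suspect_B": 296.02500}
print(match_suspects(296.02410, suspects))  # ['suspect_A']
```

At m/z 296, a ±1.25 ppm window is only about ±0.0004 wide, which is why the second suspect, roughly 3 ppm away, is rejected.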
Affiliation(s)
- Ingus Perkons
- Institute of Food Safety, Animal Health and Environment "BIOR", Lejupes iela 3, Riga LV-1076, Latvia; University of Latvia, Faculty of Chemistry, Jelgavas iela 1, Riga LV-1004, Latvia
- Janis Rusko
- Institute of Food Safety, Animal Health and Environment "BIOR", Lejupes iela 3, Riga LV-1076, Latvia; University of Latvia, Faculty of Chemistry, Jelgavas iela 1, Riga LV-1004, Latvia
- Dzintars Zacs
- Institute of Food Safety, Animal Health and Environment "BIOR", Lejupes iela 3, Riga LV-1076, Latvia
- Vadims Bartkevics
- Institute of Food Safety, Animal Health and Environment "BIOR", Lejupes iela 3, Riga LV-1076, Latvia; University of Latvia, Faculty of Chemistry, Jelgavas iela 1, Riga LV-1004, Latvia
|
26
|
Lowe CN, Williams AJ. Enabling High-Throughput Searches for Multiple Chemical Data Using the U.S.-EPA CompTox Chemicals Dashboard. J Chem Inf Model 2021; 61:565-570. [PMID: 33481596 DOI: 10.1021/acs.jcim.0c01273] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 01/09/2023]
Abstract
The core goal of cheminformatics is to efficiently store robust and accurate chemical information and make it accessible for drug discovery, environmental analysis, and the development of prediction models including quantitative structure-activity relationships (QSAR). The U.S. Environmental Protection Agency (EPA) has developed a web-based application, the CompTox Chemicals Dashboard, which provides access to a compilation of data generated within the agency and sourced from public databases and literature and to utilities for real-time QSAR prediction and chemical read-across. While the vast majority of online tools only allow interrogation of chemicals one at a time, the Dashboard provides a batch search feature that allows for the sourcing of data based on thousands of chemical inputs at one time, by chemical identifier (e.g., names, Chemical Abstract Service registry numbers, or InChIKeys), or by mass or molecular formulas. Chemical information that can then be sourced via the batch search includes chemical identifiers and structures; intrinsic, physicochemical and fate and transport properties; in vitro and in vivo toxicity data; and the presence in environmentally relevant lists. We outline how to use the batch search feature and provide an overview regarding the type of information that can be sourced by considering a series of typical-use questions.
Affiliation(s)
- Charles N Lowe
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina 27711, United States
- Antony J Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina 27711, United States
|
27
|
Piovesana S, Cavaliere C, Cerrato A, Montone CM, Laganà A, Capriotti AL. Developments and pitfalls in the characterization of phenolic compounds in food: From targeted analysis to metabolomics-based approaches. Trends Analyt Chem 2020. [DOI: 10.1016/j.trac.2020.116083] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
|
28
|
McEachran AD, Chao A, Al-Ghoul H, Lowe C, Grulke C, Sobus JR, Williams AJ. Revisiting Five Years of CASMI Contests with EPA Identification Tools. Metabolites 2020; 10:E260. [PMID: 32585902 PMCID: PMC7345619 DOI: 10.3390/metabo10060260] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/03/2020] [Revised: 06/03/2020] [Accepted: 06/17/2020] [Indexed: 01/02/2023]
Abstract
Software applications for high resolution mass spectrometry (HRMS)-based non-targeted analysis (NTA) continue to enhance chemical identification capabilities. Given the variety of available applications, determining the most fit-for-purpose tools and workflows can be difficult. The Critical Assessment of Small Molecule Identification (CASMI) contests were initiated in 2012 to provide a means to evaluate compound identification tools on a standardized set of blinded tandem mass spectrometry (MS/MS) data. Five CASMI contests have resulted in recommendations, publications, and invaluable datasets for practitioners of HRMS-based screening studies. The US Environmental Protection Agency's (EPA) CompTox Chemicals Dashboard is now recognized as a valuable resource for compound identification in NTA studies. However, this application was too new and immature in functionality to participate in the five previous CASMI contests. In this work, we performed compound identification on all five CASMI contest datasets using Dashboard tools and data in order to critically evaluate Dashboard performance relative to that of other applications. CASMI data was accessed via the CASMI webpage and processed for use in our spectral matching and identification workflow. Relative to applications used by former contest participants, our tools, data, and workflow performed well, placing more challenge compounds in the top five of ranked candidates than did the winners of three contest years and tying in a fourth. In addition, we conducted an in-depth review of the CASMI structure sets and made these reviewed sets available via the Dashboard. Our results suggest that Dashboard data and tools would enhance chemical identification capabilities for practitioners of HRMS-based NTA.
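The "top five of ranked candidates" metric used to compare workflows can be sketched as a simple top-k count. The challenge IDs and candidate identifiers below are hypothetical placeholders, and the function name is illustrative; the actual CASMI scoring involved more than this single count.

```python
def top_k_count(ranked_candidates: dict, truth: dict, k: int = 5) -> int:
    """Count challenges whose correct compound appears among the first k candidates.

    ranked_candidates maps a challenge ID to its candidate list, best first.
    truth maps a challenge ID to the identifier of the correct compound.
    """
    hits = 0
    for challenge_id, candidates in ranked_candidates.items():
        if truth[challenge_id] in candidates[:k]:
            hits += 1
    return hits


# Hypothetical results for three challenges.
ranked = {
    "challenge_1": ["cmpd_A", "cmpd_B", "cmpd_C"],
    "challenge_2": ["cmpd_X", "cmpd_Y", "cmpd_Z", "cmpd_W", "cmpd_V", "cmpd_Q"],
    "challenge_3": ["cmpd_M"],
}
truth = {"challenge_1": "cmpd_B", "challenge_2": "cmpd_Q", "challenge_3": "cmpd_N"}

print(top_k_count(ranked, truth, k=5))  # 1
```

Here only the first challenge counts: the correct compound for the second is ranked sixth, and the third challenge never lists it.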
Affiliation(s)
- Andrew D. McEachran
- Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Alex Chao
- Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Hussein Al-Ghoul
- Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Charles Lowe
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Christopher Grulke
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Jon R. Sobus
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
- Antony J. Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
|