1
Das S, Merz KM. Molecular Gas-Phase Conformational Ensembles. J Chem Inf Model 2024; 64:749-760. [PMID: 38206321 DOI: 10.1021/acs.jcim.3c01309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Accurately determining the global minima of a molecular structure is important in diverse scientific fields, including drug design, materials science, and chemical synthesis. Conformational search engines serve as valuable tools for exploring the extensive conformational space of molecules and for identifying energetically favorable conformations. In this study, we present a comparison of Auto3D, CREST, Balloon, and ETKDG (from RDKit), which are freely available conformational search engines, to evaluate their effectiveness in locating global minima. These engines employ distinct methodologies, including machine learning (ML) potential-based, semiempirical, and force field-based approaches. To validate these methods, we propose the use of collisional cross-section (CCS) values obtained from ion mobility-mass spectrometry studies. We hypothesize that experimental gas-phase CCS values can provide experimental evidence that we likely have the global minimum for a given molecule. To facilitate this effort, we used our gas-phase conformation library (GPCL) which currently consists of the full ensembles of 20 small molecules and can be used by the community to validate any conformational search engine. Further members of the GPCL can be readily created for any molecule of interest using our standard workflow used to compute CCS values, expanding the ability of the GPCL in validation exercises. These innovative validation techniques enhance our understanding of the conformational landscape and provide valuable insights into the performance of conformational generation engines. Our findings shed light on the strengths and limitations of each search engine, enabling informed decisions for their utilization in various scientific fields, where accurate molecular structure determination is crucial for understanding biological activity and designing targeted interventions. 
By facilitating the identification of reliable conformations, this study significantly contributes to enhancing the efficiency and accuracy of molecular structure determination, with particular focus on metabolite structure elucidation. The findings of this research also provide valuable insights for developing effective workflows for predicting the structures of unknown compounds with high precision.
Affiliation(s)
- Susanta Das
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
2
Zhang H, Luo M, Wang H, Ren F, Yin Y, Zhu ZJ. AllCCS2: Curation of Ion Mobility Collision Cross-Section Atlas for Small Molecules Using Comprehensive Molecular Representations. Anal Chem 2023; 95:13913-13921. [PMID: 37664900 DOI: 10.1021/acs.analchem.3c02267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
The development of ion mobility-mass spectrometry (IM-MS) has revolutionized the analysis of small molecules, such as metabolomics, lipidomics, and exposome studies. The curation of comprehensive reference collision cross-section (CCS) databases plays a pivotal role in the successful application of IM-MS for small-molecule analysis. In this study, we presented AllCCS2, an enhanced version of AllCCS, designed for the universal prediction of the ion mobility CCS values of small molecules. AllCCS2 incorporated newly available experimental CCS data, including 10,384 records and 7713 unified values, as training data. By leveraging a neural network trained on diverse molecular representations encompassing mass spectrometry features, molecular descriptors, and graph features extracted using a graph convolutional network, AllCCS2 achieved exceptional prediction accuracy. AllCCS2 achieved median relative error (MedRE) values of 0.31, 0.72, and 1.64% in the training, validation, and testing sets, respectively, surpassing existing CCS prediction tools in terms of accuracy and coverage. Furthermore, AllCCS2 exhibited excellent compatibility with different instrument platforms (DTIMS, TWIMS, and TIMS). The prediction uncertainties in AllCCS2 from the training data and the prediction model were comprehensively investigated by using representative structure similarity and model prediction variation. Notably, small molecules with high structural similarities to the training set and lower model prediction variation exhibited improved accuracy and lower relative errors. In summary, AllCCS2 serves as a valuable resource to support applications of IM-MS technologies. The AllCCS2 database and tools are freely accessible at http://allccs.zhulab.cn/.
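The median relative error (MedRE) figures quoted in this abstract can be illustrated with a short calculation. This is a minimal sketch, not the AllCCS2 code: it assumes MedRE is the median of |predicted - measured| / measured expressed as a percentage, and the CCS values below are made up for illustration.

```python
from statistics import median


def median_relative_error(predicted, measured):
    """Return the median of |pred - meas| / meas, in percent (assumed definition)."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not predicted:
        raise ValueError("at least one value pair is required")
    errors = [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]
    return median(errors)


# Hypothetical CCS values in square angstroms, for illustration only.
measured = [150.0, 200.0, 250.0]
predicted = [151.5, 198.0, 250.0]

medre = median_relative_error(predicted, measured)
print(f"MedRE = {medre:.2f}%")
```

Using the median rather than the mean keeps a few badly predicted compounds from dominating the summary, which is one plausible reason CCS prediction papers report it.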
Affiliation(s)
- Haosong Zhang
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Mingdu Luo
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Hongmiao Wang
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Fandong Ren
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
- Yandong Yin
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
- Zheng-Jiang Zhu
  - Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  - Shanghai Key Laboratory of Aging Studies, Shanghai 201210, China
3
Kartowikromo KY, Olajide OE, Hamid AM. Collision cross section measurement and prediction methods in omics. JOURNAL OF MASS SPECTROMETRY : JMS 2023; 58:e4973. [PMID: 37620034 PMCID: PMC10530098 DOI: 10.1002/jms.4973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 06/26/2023] [Accepted: 07/20/2023] [Indexed: 08/26/2023]
Abstract
Omics studies such as metabolomics, lipidomics, and proteomics have become important for understanding the mechanisms in living organisms. However, the compounds detected are structurally different and contain isomers, with each structure or isomer leading to a different result in terms of the role they play in the cell or tissue in the organism. Therefore, it is important to detect, characterize, and elucidate the structures of these compounds. Liquid chromatography and mass spectrometry have been utilized for decades in the structure elucidation of key compounds. While prediction models of parameters (such as retention time and fragmentation pattern) have also been developed for these separation techniques, they have some limitations. Moreover, ion mobility has become one of the most promising techniques to give a fingerprint to these compounds by determining their collision cross section (CCS) values, which reflect their shape and size. Obtaining accurate CCS enables its use as a filter for potential analyte structures. These CCS values can be measured experimentally using calibrant-independent and calibrant-dependent approaches. Identification of compounds based on experimental CCS values in untargeted analysis typically requires CCS references from standards, which are currently limited and, if available, would require a large amount of time for experimental measurements. Therefore, researchers use theoretical tools to predict CCS values for untargeted and targeted analysis. In this review, an overview of the different methods for the experimental and theoretical estimation of CCS values is given where theoretical prediction tools include computational and machine modeling type approaches. Moreover, the limitations of the current experimental and theoretical approaches and their potential mitigation methods were discussed.
Affiliation(s)
- Orobola E Olajide
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama, USA
- Ahmed M Hamid
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama, USA
4
Li X, Wang H, Jiang M, Ding M, Xu X, Xu B, Zou Y, Yu Y, Yang W. Collision Cross Section Prediction Based on Machine Learning. Molecules 2023; 28:4050. [PMID: 37241791 DOI: 10.3390/molecules28104050] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/13/2023] [Revised: 05/10/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023]
Abstract
Ion mobility-mass spectrometry (IM-MS) is a powerful technique that provides an additional dimension of separation, supporting the enhanced separation and characterization of complex components from the tissue metabolome and medicinal herbs. Integrating machine learning (ML) with IM-MS can overcome the barrier posed by the lack of reference standards and promotes the creation of a large number of proprietary collision cross section (CCS) databases, which help to achieve rapid, comprehensive, and accurate characterization of the chemical components present. In this review, advances in CCS prediction using ML over the past two decades are summarized. The advantages of ion mobility-mass spectrometers and the commercially available ion mobility technologies with different principles (e.g., time dispersive, confinement and selective release, and space dispersive) are introduced and compared. The general procedures involved in ML-based CCS prediction (acquisition and optimization of the independent and dependent variables, model construction and evaluation, etc.) are highlighted. Quantum chemistry, molecular dynamics, and theoretical CCS calculations are also described. Finally, applications of CCS prediction in metabolomics, natural products, foods, and other research fields are reviewed.
Affiliation(s)
- Xiaohang Li
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Hongda Wang
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Meiting Jiang
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Mengxiang Ding
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Xiaoyan Xu
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Bei Xu
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Yadan Zou
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Yuetong Yu
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
- Wenzhi Yang
  - State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Tianjin 301617, China
5
Rainey MA, Watson CA, Asef CK, Foster MR, Baker ES, Fernández FM. CCS Predictor 2.0: An Open-Source Jupyter Notebook Tool for Filtering Out False Positives in Metabolomics. Anal Chem 2022; 94:17456-17466. [PMID: 36473057 PMCID: PMC9772062 DOI: 10.1021/acs.analchem.2c03491] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 12/12/2022]
Abstract
Metabolite annotation continues to be the widely accepted bottleneck in nontargeted metabolomics workflows. Annotation of metabolites typically relies on a combination of high-resolution mass spectrometry (MS) with parent and tandem measurements, isotope cluster evaluations, and Kendrick mass defect (KMD) analysis. Chromatographic retention time matching with standards is often used at the later stages of the process, which can also be followed by metabolite isolation and structure confirmation utilizing nuclear magnetic resonance (NMR) spectroscopy. The measurement of gas-phase collision cross-section (CCS) values by ion mobility (IM) spectrometry also adds an important dimension to this workflow by generating an additional molecular parameter that can be used for filtering unlikely structures. The millisecond timescale of IM spectrometry allows the rapid measurement of CCS values and allows easy pairing with existing MS workflows. Here, we report on a highly accurate machine learning algorithm (CCSP 2.0) in an open-source Jupyter Notebook format to predict CCS values based on linear support vector regression models. This tool allows customization of the training set to the needs of the user, enabling the production of models for new adducts or previously unexplored molecular classes. CCSP produces predictions with accuracy equal to or greater than existing machine learning approaches such as CCSbase, DeepCCS, and AllCCS, while being better aligned with FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Another unique aspect of CCSP 2.0 is its inclusion of a large library of 1613 molecular descriptors via the Mordred Python package, further encoding the fine aspects of isomeric molecular structures. CCS prediction accuracy was tested using CCS values in the McLean CCS Compendium with median relative errors of 1.25, 1.73, and 1.87% for the 170 [M - H]-, 155 [M + H]+, and 138 [M + Na]+ adducts tested. 
For superclass-matched data sets, CCS predictions via CCSP allowed filtering of 36.1% of incorrect structures while retaining a total of 100% of the correct annotations using a ΔCCS threshold of 2.8% and a mass error of 10 ppm.
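The filtering step described here, keeping only candidate structures whose mass and predicted CCS both agree with the measured feature, can be sketched as follows. This is a minimal illustration and not the CCSP 2.0 code: the function names, the dictionary layout, and the example values are hypothetical, and it assumes ΔCCS is the percent difference relative to the predicted CCS. Only the 2.8% and 10 ppm thresholds come from the abstract.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Mass error of a measurement relative to a theoretical m/z, in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def ccs_percent_error(measured_ccs, predicted_ccs):
    """CCS difference relative to the predicted value, in percent (assumed definition)."""
    return (measured_ccs - predicted_ccs) / predicted_ccs * 100.0


def filter_candidates(feature, candidates, max_ppm=10.0, max_dccs=2.8):
    """Keep candidates that agree with the feature in both m/z and CCS."""
    kept = []
    for cand in candidates:
        mass_ok = abs(ppm_error(feature["mz"], cand["mz"])) <= max_ppm
        ccs_ok = abs(ccs_percent_error(feature["ccs"], cand["ccs"])) <= max_dccs
        if mass_ok and ccs_ok:
            kept.append(cand)
    return kept


# Hypothetical measured feature and candidate annotations.
feature = {"mz": 300.0000, "ccs": 170.0}
candidates = [
    {"name": "A", "mz": 300.0015, "ccs": 171.0},  # about 5 ppm and 0.6% off: kept
    {"name": "B", "mz": 300.0015, "ccs": 180.0},  # CCS about 5.6% off: rejected
    {"name": "C", "mz": 300.0060, "ccs": 170.5},  # about 20 ppm off: rejected
]

kept_names = [c["name"] for c in filter_candidates(feature, candidates)]
print(kept_names)
```

Applying both criteria as a logical AND means CCS can only remove candidates that already passed the mass filter, which matches the abstract's framing of CCS as an additional filter for unlikely structures.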
Affiliation(s)
- Markace A. Rainey
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Chandler A. Watson
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Carter K. Asef
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Makayla R. Foster
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Erin S. Baker
  - Department of Chemistry and Comparative Medicine Institute, North Carolina State University, Raleigh, North Carolina 27695, United States
- Facundo M. Fernández
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
  - Petit Institute of Bioengineering and Biotechnology, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
6
Shaver AO, Garcia BM, Gouveia GJ, Morse AM, Liu Z, Asef CK, Borges RM, Leach FE, Andersen EC, Amster IJ, Fernández FM, Edison AS, McIntyre LM. An anchored experimental design and meta-analysis approach to address batch effects in large-scale metabolomics. Front Mol Biosci 2022; 9:930204. [PMID: 36438654 PMCID: PMC9682135 DOI: 10.3389/fmolb.2022.930204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 10/10/2022] [Indexed: 11/27/2022]
Abstract
Untargeted metabolomics studies are unbiased but identifying the same feature across studies is complicated by environmental variation, batch effects, and instrument variability. Ideally, several studies that assay the same set of metabolic features would be used to select recurring features to pursue for identification. Here, we developed an anchored experimental design. This generalizable approach enabled us to integrate three genetic studies consisting of 14 test strains of Caenorhabditis elegans prior to the compound identification process. An anchor strain, PD1074, was included in every sample collection, resulting in a large set of biological replicates of a genetically identical strain that anchored each study. This enables us to estimate treatment effects within each batch and apply straightforward meta-analytic approaches to combine treatment effects across batches without the need for estimation of batch effects and complex normalization strategies. We collected 104 test samples for three genetic studies across six batches to produce five analytical datasets from two complementary technologies commonly used in untargeted metabolomics. Here, we use the model system C. elegans to demonstrate that an augmented design combined with experimental blocks and other metabolomic QC approaches can be used to anchor studies and enable comparisons of stable spectral features across time without the need for compound identification. This approach is generalizable to systems where the same genotype can be assayed in multiple environments and provides biologically relevant features for downstream compound identification efforts. All methods are included in the newest release of the publicly available SECIMTools based on the open-source Galaxy platform.
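The abstract does not say which meta-analytic model was used to combine per-batch treatment effects. As one common choice, a fixed-effect inverse-variance combination could look like the sketch below. The model choice, the function name, and all numbers are assumptions made for illustration; they are not taken from the paper or from SECIMTools.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Combine per-batch effect estimates with inverse-variance weights.

    Returns the pooled effect and its standard error.
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect, and at least one batch")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effects of one test strain relative to the anchor strain in
# three batches, for a single spectral feature.
batch_effects = [0.8, 1.2, 1.0]
batch_std_errors = [0.2, 0.4, 0.2]

pooled_effect, pooled_se = fixed_effect_meta(batch_effects, batch_std_errors)
print(f"pooled effect = {pooled_effect:.3f} +/- {pooled_se:.3f}")
```

Because each effect is estimated against the anchor strain within its own batch, the batch offset cancels before pooling, which is the property the anchored design relies on.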
Affiliation(s)
- Amanda O. Shaver
  - Department of Genetics, University of Georgia, Athens, GA, United States
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Brianna M. Garcia
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Chemistry, University of Georgia, Athens, GA, United States
- Goncalo J. Gouveia
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Biochemistry, University of Georgia, Athens, GA, United States
- Alison M. Morse
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Zihao Liu
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Carter K. Asef
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Ricardo M. Borges
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Franklin E. Leach
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Environmental Health Science, University of Georgia, Athens, GA, United States
- Erik C. Andersen
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL, United States
- I. Jonathan Amster
  - Department of Chemistry, University of Georgia, Athens, GA, United States
- Facundo M. Fernández
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Arthur S. Edison
  - Department of Genetics, University of Georgia, Athens, GA, United States
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Biochemistry, University of Georgia, Athens, GA, United States
- Lauren M. McIntyre
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, United States
  - *Correspondence: Lauren M. McIntyre
7
Foster M, Rainey M, Watson C, Dodds JN, Kirkwood KI, Fernández FM, Baker ES. Uncovering PFAS and Other Xenobiotics in the Dark Metabolome Using Ion Mobility Spectrometry, Mass Defect Analysis, and Machine Learning. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:9133-9143. [PMID: 35653285 PMCID: PMC9474714 DOI: 10.1021/acs.est.2c00201] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Indexed: 05/15/2023]
Abstract
The identification of xenobiotics in nontargeted metabolomic analyses is a vital step in understanding human exposure. Xenobiotic metabolism, transformation, excretion, and coexistence with other endogenous molecules, however, greatly complicate the interpretation of features detected in nontargeted studies. While mass spectrometry (MS)-based platforms are commonly used in metabolomic measurements, deconvoluting endogenous metabolites from xenobiotics is also often challenged by the lack of xenobiotic parent and metabolite standards as well as the numerous isomers possible for each small molecule m/z feature. Here, we evaluate a xenobiotic structural annotation workflow using ion mobility spectrometry coupled with MS (IMS-MS), mass defect filtering, and machine learning to uncover potential xenobiotic classes and species in large metabolomic feature lists. Xenobiotic classes examined included those of known high toxicities, including per- and polyfluoroalkyl substances (PFAS), polycyclic aromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs), polybrominated diphenyl ethers (PBDEs), and pesticides. Specifically, when the workflow was applied to identify PFAS in the NIST SRM 1957 and 909c human serum samples, it greatly reduced the hundreds of detected liquid chromatography (LC)-IMS-MS features by utilizing both mass defect filtering and m/z versus IMS collision cross sections relationships. These potential PFAS features were then compared to the EPA CompTox entries, and while some matched within specific m/z tolerances, there were still many unknowns illustrating the importance of nontargeted studies for detecting new molecules with known chemical characteristics. Additionally, this workflow can also be utilized to evaluate other xenobiotics and enable more confident annotations from nontargeted studies.
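Mass defect filtering, one of the steps named in this abstract, can be illustrated with a small sketch. It assumes the mass defect is the measured m/z minus the nearest-integer nominal mass, and it uses a hypothetical acceptance window of -0.10 to 0.00; the abstract does not give the window the authors used. Highly fluorinated ions tend to have negative mass defects, while typical endogenous metabolites rich in hydrogen have positive ones. The m/z values are illustrative: the first two approximate the [M-H]- ions of PFOA and PFOS.

```python
def mass_defect(mz):
    """Measured m/z minus the nearest-integer nominal mass (assumed definition)."""
    return mz - round(mz)


def mass_defect_filter(mz_values, low=-0.10, high=0.00):
    """Keep features whose mass defect falls inside [low, high].

    The default window is a hypothetical choice for illustration.
    """
    return [mz for mz in mz_values if low <= mass_defect(mz) <= high]


# Illustrative feature list: two fluorinated ions and two lipid-like ions.
features = [412.9664, 498.9302, 413.2662, 255.2330]

kept = mass_defect_filter(features)
print(kept)
```

In the workflow the abstract describes, a filter of this kind is combined with m/z versus CCS relationships, so it narrows the feature list rather than making the final assignment.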
Affiliation(s)
- MaKayla Foster
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Markace Rainey
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, 901 Atlantic Drive NW, Atlanta, Georgia 30332, United States
- Chandler Watson
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, 901 Atlantic Drive NW, Atlanta, Georgia 30332, United States
- James N Dodds
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Kaylie I Kirkwood
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
- Facundo M Fernández
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, 901 Atlantic Drive NW, Atlanta, Georgia 30332, United States
- Erin S Baker
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina 27695, United States
  - Comparative Medicine Institute, North Carolina State University, Raleigh, North Carolina 27695, United States
8
Ross D, Seguin RP, Krinsky AM, Xu L. High-Throughput Measurement and Machine Learning-Based Prediction of Collision Cross Sections for Drugs and Drug Metabolites. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2022; 33:1061-1072. [PMID: 35548857 PMCID: PMC9165597 DOI: 10.1021/jasms.2c00111] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 05/14/2023]
Abstract
Drug metabolite identification is a bottleneck of drug metabolism studies due to the need for time-consuming chromatographic separation and structural confirmation. Ion mobility-mass spectrometry (IM-MS), on the other hand, separates analytes on a rapid (millisecond) time scale and enables the measurement of collision cross section (CCS), a unique physical property related to an ion's gas-phase size and shape, which can be used as an additional parameter for identification of unknowns. A current limitation to the application of IM-MS to the identification of drug metabolites is the lack of reference CCS values. In this work, we assembled a large-scale database of drug and drug metabolite CCS values using high-throughput in vitro drug metabolite generation and a rapid IM-MS analysis with automated data processing. Subsequently, we used this database to train a machine learning-based CCS prediction model, employing a combination of conventional 2D molecular descriptors and novel 3D descriptors, achieving high prediction accuracies (0.8-2.2% median relative error on test set data). The inclusion of 3D information in the prediction model enables the prediction of different CCS values for different protomers, conformers, and positional isomers, which is not possible using conventional 2D descriptors. The prediction models, dmCCS, are available at https://CCSbase.net/dmccs_predictions.
Affiliation(s)
- Libin Xu
  - Tel: (206) 543-1080. Fax: (206) 685-3252
9
te Brinke E, Arrizabalaga-Larrañaga A, Blokland MH. Insights of ion mobility spectrometry and its application on food safety and authenticity: A review. Anal Chim Acta 2022; 1222:340039. [DOI: 10.1016/j.aca.2022.340039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 06/01/2022] [Accepted: 06/03/2022] [Indexed: 11/01/2022]
10
Rose BS, May JC, Picache JA, Codreanu SG, Sherrod SD, McLean JA. Improving confidence in lipidomic annotations by incorporating empirical ion mobility regression analysis and chemical class prediction. Bioinformatics 2022; 38:2872-2879. [PMID: 35561172 PMCID: PMC9306740 DOI: 10.1093/bioinformatics/btac197] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/04/2022] [Revised: 03/22/2022] [Accepted: 03/29/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION: Mass spectrometry-based untargeted lipidomics aims to globally characterize the lipids and lipid-like molecules in biological systems. Ion mobility increases coverage and confidence by offering an additional dimension of separation and a highly reproducible metric for feature annotation, the collision cross-section (CCS).
RESULTS: We present a data processing workflow to increase confidence in molecular class annotations based on CCS values. This approach uses class-specific regression models built from a standardized CCS repository (the Unified CCS Compendium) in a parallel scheme that combines a new annotation filtering approach with a machine learning class prediction strategy. In a proof-of-concept study using murine brain lipid extracts, 883 lipids were assigned higher confidence identifications using the filtering approach, which reduced the tentative candidate lists by over 50% on average. An additional 192 unannotated compounds were assigned a predicted chemical class.
AVAILABILITY AND IMPLEMENTATION: All relevant source code is available at https://github.com/McLeanResearchGroup/CCS-filter.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bailey S Rose
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
- Jody C May
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
- Jaqueline A Picache
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
- Simona G Codreanu
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
- Stacy D Sherrod
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
- John A McLean
  - Department of Chemistry, Center for Innovative Technology, Vanderbilt-Ingram Cancer Center, Vanderbilt Institute of Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, TN 37235, USA
11
Song XC, Dreolin N, Damiani T, Canellas E, Nerin C. Prediction of Collision Cross Section Values: Application to Non-Intentionally Added Substance Identification in Food Contact Materials. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2022; 70:1272-1281. [PMID: 35041428 PMCID: PMC8815070 DOI: 10.1021/acs.jafc.1c06989] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/03/2021] [Revised: 12/31/2021] [Accepted: 01/05/2022] [Indexed: 05/24/2023]
Abstract
The synthetic chemicals in food contact materials can migrate into food and endanger human health. In this study, the traveling wave collision cross section values in nitrogen of more than 400 chemicals in food contact materials were experimentally derived by traveling wave ion mobility spectrometry. A support vector machine-based collision cross section (CCS) prediction model was developed based on CCS values of food contact chemicals and a series of molecular descriptors. More than 92% of protonated and 81% of sodiated adducts showed a relative deviation below 5%. Median relative errors for protonated and sodiated molecules were 1.50 and 1.82%, respectively. The model was then applied to the structural annotation of oligomers migrating from polyamide adhesives. The identification confidence of 11 oligomers was improved by the direct comparison of the experimental data with the predicted CCS values. Finally, the challenges and opportunities of current machine-learning models on CCS prediction were also discussed.
Affiliation(s)
- Xue-Chao Song
- Department of Analytical Chemistry, Aragon Institute of Engineering Research I3A, CPS-University of Zaragoza, Maria de Luna 3, 50018 Zaragoza, Spain
| | - Nicola Dreolin
- Waters Corporation, Altrincham Road, SK9 4AX Wilmslow, U.K.
| | - Tito Damiani
- Institute of Organic Chemistry and Biochemistry, Flemingovo náměstí 542/2, 160 00 Prague, Czech Republic
| | - Elena Canellas
- Department of Analytical Chemistry, Aragon Institute of Engineering Research I3A, CPS-University of Zaragoza, Maria de Luna 3, 50018 Zaragoza, Spain
| | - Cristina Nerin
- Department of Analytical Chemistry, Aragon Institute of Engineering Research I3A, CPS-University of Zaragoza, Maria de Luna 3, 50018 Zaragoza, Spain
| |
|
12
|
Target, suspect and non-target screening analysis from wastewater treatment plant effluents to drinking water using collision cross section values as additional identification criterion. Anal Bioanal Chem 2021; 414:425-438. [PMID: 33768366 PMCID: PMC8748347 DOI: 10.1007/s00216-021-03263-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/08/2020] [Revised: 02/18/2021] [Accepted: 03/01/2021] [Indexed: 12/13/2022]
Abstract
The anthropogenic entry of organic micropollutants into the aquatic environment leads to a potential risk for drinking water resources and the drinking water itself. Therefore, sensitive screening analysis methods are needed to monitor the raw and drinking water quality continuously. Non-target screening analysis has been shown to allow for a more comprehensive investigation of drinking water processes compared to target analysis alone. However, non-target screening is challenging due to the many features that can be detected. Thus, data processing techniques to reduce the high number of features are necessary, and prioritization techniques are important to find the features of interest for identification, as identification of unknown substances is challenging as well. In this study, a drinking water production process, where drinking water is supplied by a water reservoir, was investigated. Since the water reservoir provides surface water, which is anthropogenically influenced by wastewater treatment plant (WWTP) effluents, substances originating from WWTP effluents and reaching the drinking water were investigated, because this indicates that they cannot be removed by the drinking water production process. For this purpose, ultra-performance liquid chromatography coupled with an ion-mobility high-resolution mass spectrometer (UPLC-IM-HRMS) was used in a combined approach including target, suspect and non-target screening analysis to identify known and unknown substances. Additionally, the role of ion-mobility-derived collision cross sections (CCS) in identification is discussed. To that end, six samples (two WWTP effluent samples, a surface water sample that received the effluents, a raw water sample from a downstream water reservoir, a process sample and the drinking water) were analyzed. Positive findings for a total of 60 substances in at least one sample were obtained through quantitative screening. 
Sixty-five percent (15 out of 23) of the identified substances in the drinking water sample were pharmaceuticals and transformation products of pharmaceuticals. Using suspect screening, a further 33 substances were tentatively identified in one or more samples; for 19 of these substances, CCS values could be compared with CCS values from the literature, which supported the tentative identification. Eight substances were identified by reference standards. In the non-target screening, a total of ten features detected in all six samples were prioritized, whereby metoprolol acid/atenolol acid (a transformation product of the two β-blockers metoprolol and atenolol) and 1,3-benzothiazol-2-sulfonic acid (a transformation product of the vulcanization accelerator 2-mercaptobenzothiazole) were identified with reference standards. Overall, this study demonstrates the added value of a comprehensive water monitoring approach based on UPLC-IM-HRMS analysis.
|
13
|
Gionfriddo E, Gómez-Ríos GA. Analysis of food samples made easy by microextraction technologies directly coupled to mass spectrometry. JOURNAL OF MASS SPECTROMETRY : JMS 2021; 56:e4665. [PMID: 33098354 DOI: 10.1002/jms.4665] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/05/2020] [Revised: 09/18/2020] [Accepted: 09/25/2020] [Indexed: 06/11/2023]
Abstract
Because of the complexity and diversity of food matrices, their chemical analysis often entails several analytical challenges to attain accurate and reliable results, especially for multiresidue analysis and ultratrace quantification. Nonetheless, microextraction technology, such as solid-phase microextraction (SPME), has revolutionized the concept of sample preparation for complex matrices because of its nonexhaustive, yet quantitative extraction approach and its amenability to coupling to multiple analytical platforms. In recent years, microextraction devices directly interfaced with mass spectrometry (MS) have redefined the analytical workflow by providing faster screening and quantitative methods for complex matrices. This review will discuss the latest developments in the field of food analysis by means of microextraction approaches directly coupled to MS. One key feature that differentiates SPME-MS approaches from other ambient MS techniques is the use of matrix-compatible extraction phases that prevent biofouling, which could drastically affect the ionization process, while remaining capable of selective extraction of the targeted analytes from the food matrix. Furthermore, the review examines the most significant applications of SPME-MS for various ionization techniques such as direct analysis in real time, dielectric barrier desorption ionization, and some unique SPME geometries, for example, transmission mode SPME and coated blade spray, that facilitate the interface to MS instrumentation.
Affiliation(s)
- Emanuela Gionfriddo
- Department of Chemistry and Biochemistry, College of Natural Sciences and Mathematics, The University of Toledo, Toledo, Ohio, 43606, USA
- School of Green Chemistry and Engineering, The University of Toledo, Toledo, Ohio, 43606, USA
- Dr. Nina McClelland Laboratory for Water Chemistry and Environmental Analysis, The University of Toledo, Toledo, Ohio, 43606, USA
| | | |
|