1. Ren M, Rigele A, Davaasambuu S, Shun N, Natsagdorj N, Purev N. Study on Gas Chromatography Retention Time Variation of Acetic Acid Combined with Quantum Chemical Calculation. Chromatographia 2022. [DOI: 10.1007/s10337-022-04220-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
2. Comparative Prediction of Gas Chromatographic Retention Indices for GC/MS Identification of Chemicals Related to Chemical Weapons Convention by Incremental and Machine Learning Methods. Separations 2022. [DOI: 10.3390/separations9100265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
During on-site verification activities conducted by the Technical Secretariat of the Organization for the Prohibition of Chemical Weapons, identification by gas chromatography retention index (RI) data, in addition to mass spectrometry data, increases the reliability of factual findings. However, reference RIs do not cover all possible chemical structures, which is why models that predict RIs are important. Although it is applicable only to narrow data sets of chemicals with a fixed scaffold (G- and V-series gases, for example), the non-learning incremental method demonstrated predictive median absolute and percentage errors of 2–4 units and 0.1–0.2%; these are comparable with the experimental bias in RI measurements made in the same laboratory under the same GC conditions. It is more accurate than two reported machine learning methods, which gave median absolute and percentage errors of 11–52 units and 0.5–2.8%. However, for the whole Chemical Weapons Convention (CWC) data set of chemicals, where a fixed scaffold is absent, the incremental method is not applicable; machine learning methods are then essential, and they achieved median absolute and percentage errors of 29–33 units and 0.5–2.2%, depending on the method. In addition, we have developed a homology tree approach as a convenient method for visualizing the CWC chemical space. We conclude that non-learning incremental methods may be more accurate than state-of-the-art machine learning techniques in particular cases, such as predicting the RIs of homologues and isomers of chemicals related to the CWC.
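The two error measures quoted in this abstract can be illustrated with a minimal sketch. The function name and the retention index values are hypothetical and chosen only for illustration; they are not data from the paper, and the paper may compute its statistics differently.

```python
from statistics import median


def median_errors(predicted, reference):
    """Return (median absolute error, median percentage error) for paired RI values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Absolute deviation of each prediction from its reference value, in RI units
    abs_err = [abs(p - r) for p, r in zip(predicted, reference)]
    # The same deviation expressed as a percentage of the reference value
    pct_err = [100.0 * e / r for e, r in zip(abs_err, reference)]
    return median(abs_err), median(pct_err)


# Hypothetical retention indices (illustrative only, not from the paper)
reference = [1000.0, 1200.0, 1500.0]
predicted = [1003.0, 1198.0, 1506.0]
mae, mpe = median_errors(predicted, reference)
print(f"median absolute error: {mae} units, median percentage error: {mpe:.2f}%")
```

Medians are less sensitive to a few badly predicted compounds than means, which may be one reason the abstract reports them.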
3. Machine Learning-Based Retention Time Prediction of Trimethylsilyl Derivatives of Metabolites. Biomedicines 2022; 10:879. [PMID: 35453629 PMCID: PMC9024754 DOI: 10.3390/biomedicines10040879] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/17/2022] [Revised: 04/04/2022] [Accepted: 04/06/2022] [Indexed: 11/16/2022]
Abstract
In gas chromatography–mass spectrometry-based untargeted metabolomics, metabolites are identified by comparing mass spectra and chromatographic retention time with reference databases or standard materials. Machine learning has therefore been used to predict the retention time of metabolites lacking reference data. However, the retention time prediction of trimethylsilyl derivatives of metabolites, typically analyzed in untargeted metabolomics using gas chromatography, has been poorly explored. Here, we provide a rationalized framework for machine learning-based retention time prediction of trimethylsilyl derivatives of metabolites in gas chromatography. We compared different machine learning paradigms and explored the influence of the computational molecular structure representation used to train the prediction models: fingerprint class and fingerprint calculation software. Our study challenged predicted retention time when using chemical ionization and electron impact ionization sources in simulated and real cases, demonstrating a good correct identity ranking capability by machine learning, despite a limited false identity filtering power in cases where a spectrum or a monoisotopic mass matches multiple candidates. Specifically, machine learning prediction yielded median absolute and relative retention index (relative retention time) errors of 37.1 retention index units and 2%, respectively. In addition, fingerprint class and fingerprint calculation software, as well as the molecular structural similarity between the training and test or real case sets, proved to be critical modulators of the prediction performance. Finally, we leveraged the structural similarity between the training and test or real case set to determine the probability that the prediction error is below a specific threshold.
Overall, our study demonstrates that predicted retention time can provide insights into the true structure of unknown metabolites by ranking from the most to the least plausible molecular identity, and sets the guidelines to assess the confidence in metabolite identification using predicted retention time data.
4. Deep Learning Based Prediction of Gas Chromatographic Retention Indices for a Wide Variety of Polar and Mid-Polar Liquid Stationary Phases. Int J Mol Sci 2021; 22:9194. [PMID: 34502099 PMCID: PMC8430916 DOI: 10.3390/ijms22179194] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/30/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 01/12/2023]
Abstract
Prediction of gas chromatographic retention indices based on compound structure is an important task for analytical chemistry. The predicted retention indices can be used as a reference in a mass spectrometry library search, even though they are less accurate than experimental reference values. In the last few years, deep learning has been applied to this task and has drastically improved the accuracy of retention index prediction for non-polar stationary phases. In this work, we demonstrate for the first time the use of deep learning for retention index prediction on polar (e.g., polyethylene glycol, DB-WAX) and mid-polar (e.g., DB-624, DB-210, DB-1701, OV-17) stationary phases. The mean absolute error lies in the range of 16–50 for several stationary phases and test data sets. We also demonstrate that our approach can be directly applied to the prediction of second dimension retention times (GC × GC) if a large enough data set is available. The achieved accuracy is considerably better than previous results obtained using linear quantitative structure-retention relationships and ACD ChromGenius software. The source code and pre-trained models are available online.
5. Qu C, Schneider BI, Kearsley AJ, Keyrouz W, Allison TC. Predicting Kováts Retention Indices Using Graph Neural Networks. J Chromatogr A 2021; 1646:462100. [PMID: 33892256 DOI: 10.1016/j.chroma.2021.462100] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/01/2020] [Revised: 03/16/2021] [Accepted: 03/22/2021] [Indexed: 11/16/2022]
Abstract
The Kováts retention index is a dimensionless quantity that characterizes the rate at which a compound is processed through a gas chromatography column. This quantity is independent of many experimental variables and, as such, is considered a near-universal descriptor of retention time on a chromatography column. The Kováts retention indices of a large number of molecules have been determined experimentally. The "NIST 20: GC Method/Retention Index Library" database has collected and, more importantly, curated retention indices of a subset of these compounds, resulting in a highly valued reference database. The experimental data in the library form an ideal data set for training machine learning models for the prediction of retention indices of unknown compounds. In this article, we describe the training of a graph neural network model to predict the Kováts retention index for compounds in the NIST library and compare this approach with previous work [1]. We predict the Kováts retention index with a mean unsigned error of 28 index units, compared with 44 for the putative best result using a convolutional neural network [1]. The NIST library also incorporates an estimation scheme based on a group contribution approach that achieves a mean unsigned error of 114 compared to the experimental data. Our method uses the same input data source as the group contribution approach, making it straightforward and convenient to apply to existing libraries. Our results convincingly demonstrate the predictive power of systematic, data-driven approaches that apply deep learning methodologies to chemical data; for the data in the NIST 20 library, our model outperforms previous models.
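For orientation, the textbook isothermal definition of the Kováts index interpolates logarithmically between the two n-alkanes that bracket the analyte. The sketch below is only that standard definition; the function name, argument names, and retention times are illustrative assumptions, and it is not the graph neural network model described in the abstract, which predicts the index from molecular structure.

```python
import math


def kovats_index(t_x, t_n, t_n1, n):
    """Isothermal Kováts retention index from adjusted retention times.

    t_x  : adjusted retention time of the analyte
    t_n  : adjusted retention time of the n-alkane with n carbon atoms
    t_n1 : adjusted retention time of the n-alkane with n + 1 carbon atoms
    """
    if not 0 < t_n < t_n1:
        raise ValueError("alkane retention times must satisfy 0 < t_n < t_n1")
    if not t_n <= t_x <= t_n1:
        raise ValueError("analyte must elute between the two bracketing alkanes")
    # Logarithmic interpolation between the bracketing alkanes, scaled by 100
    fraction = (math.log10(t_x) - math.log10(t_n)) / (
        math.log10(t_n1) - math.log10(t_n)
    )
    return 100.0 * (n + fraction)


# Hypothetical adjusted retention times in minutes (illustrative only)
print(kovats_index(t_x=8.0, t_n=4.0, t_n1=16.0, n=8))
```

By construction an n-alkane with n carbon atoms has an index of 100n, which is why the index depends far less on column length or flow rate than a raw retention time does.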
Affiliation(s)
- Chen Qu
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.
- Barry I Schneider
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.
- Anthony J Kearsley
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.
- Walid Keyrouz
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.
- Thomas C Allison
  - National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.
6. Zagreb-Type Indices of R-Vertex Join and R-Edge Join of Graphs. J CHEM-NY 2020. [DOI: 10.1155/2020/9767128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Various methods are available for searching large chemical databases and predicting the physicochemical properties of molecular structures. Using molecular descriptors is the simplest of these methods. The Zagreb indices are amongst the oldest molecular descriptors, and their properties have been extensively studied and applied in QSAR/QSPR studies. The Zagreb coindices were recently introduced, attracting the attention of researchers in mathematical chemistry. In this paper, we study Zagreb indices and several other Zagreb-type indices, including the general Randić index, sum-connectivity index, F-index, and Zagreb coindices, of the R-vertex and R-edge join of two arbitrary graphs.
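The degree-based descriptors named in this abstract are simple to compute from an edge list. The following sketch uses their standard definitions (first and second Zagreb indices, F-index, general Randić index, sum-connectivity index); the function name and the example graph are assumptions made for illustration, and the R-vertex and R-edge join constructions studied in the paper are not implemented here.

```python
from collections import defaultdict


def zagreb_type_indices(edges, alpha=-0.5):
    """Compute several degree-based indices of a simple undirected graph."""
    edges = list(edges)
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {
        # First Zagreb index: sum of squared vertex degrees
        "M1": sum(d ** 2 for d in deg.values()),
        # Second Zagreb index: sum of degree products over edges
        "M2": sum(deg[u] * deg[v] for u, v in edges),
        # F-index (forgotten index): sum of cubed vertex degrees
        "F": sum(d ** 3 for d in deg.values()),
        # General Randić index with exponent alpha
        "general_randic": sum((deg[u] * deg[v]) ** alpha for u, v in edges),
        # Sum-connectivity index
        "sum_connectivity": sum((deg[u] + deg[v]) ** -0.5 for u, v in edges),
    }


# Carbon skeleton of n-butane as a path on four vertices (illustrative example)
butane = [(0, 1), (1, 2), (2, 3)]
indices = zagreb_type_indices(butane)
print(indices["M1"], indices["M2"], indices["F"])
```

With the default `alpha=-0.5`, the general Randić index reduces to the classical Randić connectivity index.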
7. Matyushin DD, Sholokhova AY, Buryak AK. A deep convolutional neural network for the estimation of gas chromatographic retention indices. J Chromatogr A 2019; 1607:460395. [DOI: 10.1016/j.chroma.2019.460395] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/10/2019] [Revised: 06/15/2019] [Accepted: 07/22/2019] [Indexed: 10/26/2022]
8. Mladenović MZ, Radulović NS. A synthetic library of allylmethoxyphenyl esters: spectral characterization and gas chromatographic behavior. FLAVOUR FRAG J 2019. [DOI: 10.1002/ffj.3529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Marko Z. Mladenović
  - Department of Chemistry, Faculty of Sciences and Mathematics, University of Niš, Niš, Serbia
- Niko S. Radulović
  - Department of Chemistry, Faculty of Sciences and Mathematics, University of Niš, Niš, Serbia
9. Du Z, Ali A, Trinajstić N. Alkanes with the First Three Maximal/Minimal Modified First Zagreb Connection Indices. Mol Inform 2019; 38:e1800116. [PMID: 30614630 DOI: 10.1002/minf.201800116] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 09/15/2018] [Accepted: 11/01/2018] [Indexed: 11/11/2022]
Abstract
The modified first Zagreb connection index ($ZC_1^{*}$) is a molecular descriptor that first appeared in 1972 within a formula for the total electron energy of alternant hydrocarbons. In a recent paper [A. Ali, N. Trinajstić, A novel/old modification of the first Zagreb index, Mol. Inform. 37 (2018) 1800008], it was observed that the molecular descriptor $ZC_1^{*}$ correlates well with the entropy and acentric factor of octane isomers. In this article, the molecules with the first three maximal $ZC_1^{*}$ values as well as the first three minimal $ZC_1^{*}$ values are determined from the family of all alkanes with $n$ carbon atoms, for $n \geq 6$. This extends the main results of the aforementioned paper.
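The abstract does not spell out the descriptor, so the sketch below rests on an assumption: that the modified first Zagreb connection index is the sum over all vertices of the vertex degree times its connection number, where the connection number is the count of vertices at distance exactly two. The function name and the example molecules are likewise illustrative.

```python
from collections import defaultdict


def modified_first_zagreb_connection_index(edges):
    """Sum of degree * connection number over all vertices (assumed definition)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for v, neighbours in adj.items():
        # Connection number: vertices reachable in two steps that are
        # neither v itself nor direct neighbours of v
        second = set()
        for u in neighbours:
            second |= adj[u]
        second -= neighbours
        second.discard(v)
        total += len(neighbours) * len(second)
    return total


# Carbon skeletons of two pentane isomers (illustrative examples)
n_pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]   # unbranched chain
neopentane = [(0, 1), (0, 2), (0, 3), (0, 4)]  # central carbon with four methyl groups
print(modified_first_zagreb_connection_index(n_pentane))
print(modified_first_zagreb_connection_index(neopentane))
```

Under this assumed definition, the two isomers receive different values, which is the kind of discrimination between alkane isomers that the extremal results in the abstract concern.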
Affiliation(s)
- Zhibin Du
  - School of Mathematics and Statistics, Zhaoqing University, Zhaoqing, 526061, Guangdong, P.R. China
  - Institute of Mathematics, Academia Sinica, Taipei, 10617, Taiwan
- Akbar Ali
  - Knowledge Unit of Science, University of Management & Technology, Sialkot, Pakistan
- Nenad Trinajstić
  - The Rugjer Bošković Institute, P. O. Box 180, HR-10002, Zagreb, Croatia
10. Wolfender JL, Nuzillard JM, van der Hooft JJJ, Renault JH, Bertrand S. Accelerating Metabolite Identification in Natural Product Research: Toward an Ideal Combination of Liquid Chromatography–High-Resolution Tandem Mass Spectrometry and NMR Profiling, in Silico Databases, and Chemometrics. Anal Chem 2018; 91:704-742. [DOI: 10.1021/acs.analchem.8b05112] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Indexed: 02/07/2023]
Affiliation(s)
- Jean-Luc Wolfender
  - School of Pharmaceutical Sciences, EPGL, University of Geneva, University of Lausanne, CMU, 1 Rue Michel Servet, 1211 Geneva 4, Switzerland
- Jean-Marc Nuzillard
  - Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
- Jean-Hugues Renault
  - Institut de Chimie Moléculaire de Reims, UMR CNRS 7312, Université de Reims Champagne Ardenne, 51687 Reims Cedex 2, France
- Samuel Bertrand
  - Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, 44035 Nantes, France
  - ThalassOMICS Metabolomics Facility, Plateforme Corsaire, Biogenouest, 44035 Nantes, France