1. Geer LY, Stein SE, Mallard WG, Slotta DJ. AIRI: Predicting Retention Indices and Their Uncertainties Using Artificial Intelligence. J Chem Inf Model 2024; 64:690-696. [PMID: 38230885 DOI: 10.1021/acs.jcim.3c01758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024]
Abstract
The Kováts retention index (RI) is a quantity measured using gas chromatography and is commonly used in the identification of chemical structures. Creating libraries of observed RI values is a laborious task, so we explore the use of a deep neural network for predicting RI values from structure for standard semipolar columns. This network generated predictions with a mean absolute error of 15.1 and, in a quantification of the tail of the error distribution, a 95th percentile absolute error of 46.5. Because of the Artificial Intelligence Retention Indices (AIRI) network's accuracy, it was used to predict RI values for the NIST EI-MS spectral libraries. These RI values are used to improve chemical identification methods and the quality of the library. Estimating uncertainty is an important practical need when using prediction models. To quantify the uncertainty of our network for each individual prediction, we used the outputs of an ensemble of 8 networks to calculate a predicted standard deviation for each RI value prediction. This predicted standard deviation was corrected to follow the error between the observed and predicted RI values. The Z scores using these predicted standard deviations had a standard deviation of 1.52 and a 95th percentile absolute Z score corresponding to a mean RI value of 42.6.
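The ensemble-based uncertainty estimate described in this abstract can be illustrated with a minimal sketch. It assumes the predicted standard deviation is the sample standard deviation of the ensemble members' outputs; the `scale` parameter is a hypothetical stand-in for the correction step, which the abstract does not specify, and the numbers are made up for illustration.

```python
from statistics import mean, stdev


def ensemble_prediction(member_predictions, scale=1.0):
    """Return (mean RI, predicted standard deviation) for one compound.

    `scale` is a hypothetical calibration factor; the abstract only says
    the predicted standard deviation was corrected to follow the error.
    """
    ri = mean(member_predictions)
    sigma = scale * stdev(member_predictions)  # sample standard deviation
    return ri, sigma


def z_score(observed_ri, predicted_ri, sigma):
    """Signed error between observed and predicted RI, in units of sigma."""
    return (observed_ri - predicted_ri) / sigma


# Made-up outputs of 8 ensemble members for a single compound.
members = [1190.0, 1200.0, 1210.0, 1195.0, 1205.0, 1200.0, 1198.0, 1202.0]
ri, sigma = ensemble_prediction(members)
z = z_score(1215.0, ri, sigma)
print(ri, sigma, z)
```

If the predicted standard deviations were perfectly calibrated, the Z scores over a test set would have a standard deviation close to 1, which is the yardstick for the 1.52 reported in the abstract.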
Affiliation(s)
- Lewis Y Geer: National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Stephen E Stein: National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- William Gary Mallard: National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Douglas J Slotta: National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
2. Gheidari D, Mehrdad M, Ghahremani M. Azole Compounds as Inhibitors of Candida albicans: QSAR Modelling. Front Chem 2021; 9:774416. [PMID: 34912782 PMCID: PMC8667819 DOI: 10.3389/fchem.2021.774416] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/11/2021] [Accepted: 11/03/2021] [Indexed: 01/13/2023] Open
Abstract
Candida albicans is a pathogenic opportunistic yeast found in the human gut flora. It may also live outside of the human body, causing diseases ranging from minor to deadly. Candida albicans begins as a budding yeast that can become hyphae in response to a variety of environmental or biological triggers. The hyphal form is responsible for the development of multidrug-resistant biofilms, although both forms have been associated with virulence. Here, we propose linear and SPA-linear quantitative structure-activity relationship (QSAR) modeling and prediction of Candida albicans inhibitors. A data set of 60 derivatives of benzoxazoles, benzimidazoles, and oxazolo(4,5-b)pyridines was used. After the leverage analysis method was applied to detect outlier molecules, 55 compounds remained. The SPA-MLR model outperforms multiple linear regression (MLR), accounting for 90% of the Q2 of the antifungal derivatives' activity. This paper focuses on investigating the role of SPA-MLR in developing the model. The accuracy of the SPA-MLR model was assessed using leave-one-out (LOO) cross-validation. The mean effect of descriptors and sensitivity analysis show that RDF090u is the most important parameter affecting the behavior of the inhibitors of Candida albicans.
Affiliation(s)
- Davood Gheidari: Department of Chemistry, Faculty of Science, University of Guilan, Rasht, Iran
- Morteza Mehrdad: Department of Chemistry, Faculty of Science, University of Guilan, Rasht, Iran
- Mahboubeh Ghahremani: Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, TX, United States
3. Vahedi N, Mohammadhosseini M, Nekoei M. QSAR Study of PARP Inhibitors by GA-MLR, GA-SVM and GA-ANN Approaches. CURR ANAL CHEM 2020. [DOI: 10.2174/1573411016999200518083359] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
Background: The poly(ADP-ribose) polymerases (PARP) are a nuclear enzyme superfamily present in eukaryotes.
Methods: In the present report, linear and non-linear methods including multiple linear regression (MLR), support vector machine (SVM) and artificial neural networks (ANN) were used to develop quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set and selection of the training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset of descriptors that contribute most significantly to the overall inhibitory activity.
Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation through an external test set and Y-randomization (chance correlation) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that non-linear modeling approaches, including SVM and ANN, provide considerably better predictive capability.
Conclusion: Among the constructed models, and in terms of root mean square error of prediction (RMSEP), cross-validation coefficients (Q2 LOO and Q2 LGO), as well as R2 and F-statistic value for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were better with the GA-ANN model.
Affiliation(s)
- Nafiseh Vahedi: Department of Chemistry, College of Basic Sciences, Shahrood Branch, Islamic Azad University, Shahrood, Iran
- Majid Mohammadhosseini: Department of Chemistry, College of Basic Sciences, Shahrood Branch, Islamic Azad University, Shahrood, Iran
- Mehdi Nekoei: Department of Chemistry, College of Basic Sciences, Shahrood Branch, Islamic Azad University, Shahrood, Iran
4. Matyushin DD, Sholokhova AY, Buryak AK. A deep convolutional neural network for the estimation of gas chromatographic retention indices. J Chromatogr A 2019; 1607:460395. [DOI: 10.1016/j.chroma.2019.460395] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 03/10/2019] [Revised: 06/15/2019] [Accepted: 07/22/2019] [Indexed: 10/26/2022]
5. Wijit N, Prasitwattanaseree S, Mahatheeranont S, Wolschann P, Jiranusornkul S, Nimmanpipug P. Estimation of Retention Time in GC/MS of Volatile Metabolites in Fragrant Rice Using Principle Components of Molecular Descriptors. ANAL SCI 2018; 33:1211-1217. [PMID: 29129857 DOI: 10.2116/analsci.33.1211] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/23/2022]
Abstract
A quantitative structure-retention relationship (QSRR) study was applied for an estimation of retention times of secondary volatile metabolites in Thai jasmine rice. In this study, chemical components in rice seed were extracted using solvent extraction, then separated and identified by gas chromatography-mass spectrometry (GC-MS). A set of molecular descriptors was generated for these substances obtained from GC-MS analysis to numerically represent the molecular structure of such compounds. Principal component analysis (PCA) and principal component regression analysis (PCR) were used to model the retention times of these compounds as a function of the theoretically derived descriptors. The best fitted regression model was obtained with R-squared of 0.900. The informative chemical properties related to retention time were elucidated. The results of this study demonstrate clearly that the combination of molecular weight and autocorrelation functions of two dimensional interatomic distance, which are molecular polarizability, atom identity, sigma charge, sigma electronegativity and polarizability, can be considered as comprehensive factors for predicting the retention times of volatile compounds in rice.
Affiliation(s)
- Nataporn Wijit: Department of Chemistry and Center of Excellence for Innovation in Chemistry, Faculty of Science and Graduate School, Chiang Mai University
- Sugunya Mahatheeranont: Department of Chemistry and Center of Excellence for Innovation in Chemistry, Faculty of Science and Graduate School, Chiang Mai University
- Peter Wolschann: Department of Pharmaceutical Technology and Biopharmaceutics, University of Vienna; Institute of Theoretical Chemistry, University of Vienna
- Piyarat Nimmanpipug: Department of Chemistry and Center of Excellence for Innovation in Chemistry, Faculty of Science and Graduate School, Chiang Mai University
6. Kelly K, Bell S. Evaluation of the reproducibility and repeatability of GCMS retention indices and mass spectra of novel psychoactive substances. Forensic Chem 2018. [DOI: 10.1016/j.forc.2017.11.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 01/23/2023]
7. QSRR prediction of gas chromatography retention indices of essential oil components. CHEMICAL PAPERS 2017. [DOI: 10.1007/s11696-017-0257-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 10/19/2022]
8. Barfeii H, Garkani-Nejad Z. A Comparative QSRR Study on Enantioseparation of Ethanol Ester Enantiomers in HPLC Using Multivariate Image Analysis, Quantum Mechanical and Structural Descriptors. J CHIN CHEM SOC-TAIP 2016. [DOI: 10.1002/jccs.201600253] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Affiliation(s)
- Hamideh Barfeii: Chemistry Department, Faculty of Science; Shahid Bahonar University of Kerman; Kerman, 7616914111 Iran
- Zahra Garkani-Nejad: Chemistry Department, Faculty of Science; Shahid Bahonar University of Kerman; Kerman, 7616914111 Iran
9. Dossin E, Martin E, Diana P, Castellon A, Monge A, Pospisil P, Bentley M, Guy PA. Prediction Models of Retention Indices for Increased Confidence in Structural Elucidation during Complex Matrix Analysis: Application to Gas Chromatography Coupled with High-Resolution Mass Spectrometry. Anal Chem 2016; 88:7539-47. [DOI: 10.1021/acs.analchem.6b00868] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 01/07/2023]
Affiliation(s)
- Eric Dossin: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Elyette Martin: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Pierrick Diana: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Antonio Castellon: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Aurelien Monge: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Pavel Pospisil: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Mark Bentley: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
- Philippe A. Guy: Philip Morris International R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland
10. Genetic programming based quantitative structure–retention relationships for the prediction of Kovats retention indices. J Chromatogr A 2015; 1420:98-109. [DOI: 10.1016/j.chroma.2015.09.086] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/04/2015] [Revised: 09/25/2015] [Accepted: 09/25/2015] [Indexed: 11/20/2022]
11. Nekoei M, Mohammadhosseini M. Application of HS-SPME, SDME and Cold-Press Coupled to GC/MS to Analysis the Essential Oils of Citrus sinensis CV. Thomson Navel and QSRR Study for Prediction of Retention Indices by Stepwise and Genetic Algorithm-Multiple Linear Regression Approaches. ACTA ACUST UNITED AC 2014. [DOI: 10.1080/22297928.2013.770670] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
12. Knorr A, Monge A, Stueber M, Stratmann A, Arndt D, Martin E, Pospisil P. Computer-Assisted Structure Identification (CASI)—An Automated Platform for High-Throughput Identification of Small Molecules by Two-Dimensional Gas Chromatography Coupled to Mass Spectrometry. Anal Chem 2013; 85:11216-24. [DOI: 10.1021/ac4011952] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Affiliation(s)
- Arno Knorr: Philip Morris International R&D, Philip Morris Products S.A., 2000 Neuchâtel, Switzerland
- Aurelien Monge: Philip Morris International R&D, Philip Morris Products S.A., 2000 Neuchâtel, Switzerland
- Markus Stueber: Philip Morris International R&D, Philip Morris Research Laboratories GmbH, 51149 Köln, Germany
- André Stratmann: Philip Morris International R&D, Philip Morris Research Laboratories GmbH, 51149 Köln, Germany
- Daniel Arndt: Philip Morris International R&D, Philip Morris Products S.A., 2000 Neuchâtel, Switzerland
- Elyette Martin: Philip Morris International R&D, Philip Morris Products S.A., 2000 Neuchâtel, Switzerland
- Pavel Pospisil: Philip Morris International R&D, Philip Morris Products S.A., 2000 Neuchâtel, Switzerland
13. Ulrich N, Schüürmann G, Brack W. Prediction of gas chromatographic retention indices as classifier in non-target analysis of environmental samples. J Chromatogr A 2013; 1285:139-47. [DOI: 10.1016/j.chroma.2013.02.037] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/14/2012] [Revised: 02/11/2013] [Accepted: 02/12/2013] [Indexed: 10/27/2022]
14. Garkani-Nejad Z, Ahmadi-Roudi B. Investigating the role of weight update functions in developing artificial neural network modeling of retention times of furan and phenol derivatives. CAN J CHEM 2013. [DOI: 10.1139/cjc-2012-0372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
A quantitative structure−retention relationship study has been carried out on the retention times of 63 furan and phenol derivatives using artificial neural networks (ANNs). First, a large number of descriptors were calculated using the HyperChem, Mopac, and Dragon software packages. Then, a suitable number of these descriptors were selected using a multiple linear regression technique. This paper focuses on investigating the role of weight update functions in developing ANNs. Therefore, the selected descriptors were used as inputs for ANNs with six different weight update functions: the Levenberg−Marquardt back-propagation network, scaled conjugate gradient back-propagation network, conjugate gradient back-propagation with Powell−Beale restarts network, one-step secant back-propagation network, resilient back-propagation network, and gradient descent with momentum back-propagation network. Comparison of the results indicates that the Levenberg−Marquardt back-propagation network has better predictive power than the other methods.
Affiliation(s)
- Zahra Garkani-Nejad: Chemistry Department, Faculty of Science, Shahid Bahonar University of Kerman, Kerman, Iran
- Behzad Ahmadi-Roudi: Chemistry Department, Faculty of Science, Vali-e-Asr University, Rafsanjan, Iran
15. Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. J Cheminform 2013; 5:12. [PMID: 23453222 PMCID: PMC3648359 DOI: 10.1186/1758-2946-5-12] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.0] [Received: 11/23/2012] [Accepted: 02/01/2013] [Indexed: 12/29/2022] Open
Abstract
The identification of small molecules from mass spectrometry (MS) data remains a major challenge in the interpretation of MS data. This review covers the computational aspects of identifying small molecules, from the identification of a compound by searching a reference spectral library to the structural elucidation of unknowns. In detail, we describe the basic principles and pitfalls of searching mass spectral reference libraries. Determining the molecular formula of the compound can serve as a basis for subsequent structural elucidation; consequently, we cover different methods for molecular formula identification, focussing on isotope pattern analysis. We then discuss automated methods to deal with mass spectra of compounds that are not present in spectral libraries, and provide an insight into de novo analysis of fragmentation spectra using fragmentation trees. In addition, this review briefly covers the reconstruction of metabolic networks using MS data. Finally, we list available software for different steps of the analysis pipeline.
Affiliation(s)
- Kerstin Scheubert: Chair of Bioinformatics, Friedrich Schiller University, Ernst-Abbe-Platz 2, Jena, Germany
16. QSRR Study on Flavor Compounds of Diverse Structures on Different Columns with the Help of New Chemometric Methods. Chromatographia 2012. [DOI: 10.1007/s10337-012-2349-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
17. Yan J, Cao DS, Guo FQ, Zhang LX, He M, Huang JH, Xu QS, Liang YZ. Comparison of quantitative structure–retention relationship models on four stationary phases with different polarity for a diverse set of flavor compounds. J Chromatogr A 2012; 1223:118-25. [DOI: 10.1016/j.chroma.2011.12.020] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 10/05/2011] [Revised: 12/03/2011] [Accepted: 12/05/2011] [Indexed: 01/22/2023]
18. Kumari S, Stevens D, Kind T, Denkert C, Fiehn O. Applying in-silico retention index and mass spectra matching for identification of unknown metabolites in accurate mass GC-TOF mass spectrometry. Anal Chem 2011; 83:5895-902. [PMID: 21678983 PMCID: PMC3146571 DOI: 10.1021/ac2006137] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 12/22/2022]
Abstract
One of the major obstacles in metabolomics is the identification of unknown metabolites. We tested constraints for reidentifying the correct structures of 29 known metabolite peaks from GCT Premier accurate mass chemical ionization GC-TOF mass spectrometry data without any use of mass spectral libraries. Correct elemental formulas were retrieved within the top-3 hits for most molecular ion adducts using the "Seven Golden Rules" algorithm. An average of 514 potential structures per formula was downloaded from the PubChem chemical database and in-silico-derivatized using the ChemAxon software package. After chemical curation, Kovats retention indices (RI) were predicted for up to 747 potential structures per formula using the NIST MS group contribution algorithm and corrected for the contribution of trimethylsilyl groups using the Fiehnlib RI library. When matching the range of predicted RI values against the experimentally determined peak retention, all but three incorrect formulas were excluded. For all remaining isomeric structures, accurate mass electron ionization spectra were predicted using the MassFrontier software and scored against experimental spectra. Using a mass error window of 10 ppm for fragment ions, 89% of all isomeric structures were removed, and the correct structure was reported within the top-5 hits in 73% of the cases.
Affiliation(s)
- Sangeeta Kumari: UC Davis Genome Center, University of California-Davis, Davis, California 95616, United States
19. iMatch: a retention index tool for analysis of gas chromatography-mass spectrometry data. J Chromatogr A 2011; 1218:6522-30. [PMID: 21813131 DOI: 10.1016/j.chroma.2011.07.039] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 01/29/2011] [Revised: 06/28/2011] [Accepted: 07/10/2011] [Indexed: 11/20/2022]
Abstract
A method was developed to employ National Institute of Standards and Technology (NIST) 2008 retention index database information for molecular retention matching via constructing a set of empirical distribution functions (DFs) of the absolute retention index deviation to its mean value. The effects of different experimental parameters on the molecules' retention indices were first assessed. The column class, the column type, and the data type have significant effects on the retention index values acquired on capillary columns. However, the normal alkane retention index (I(norm)) with the ramp condition is similar to the linear retention index (I(T)), while the I(norm) with the isothermal condition is similar to the Kováts retention index (I). As for the I(norm) with the complex condition, these data should be treated as an additional group, because the mean I(norm) value of the polar column is significantly different from the I(T). Based on this analysis, nine DFs were generated from the grouped retention index data. The DF information was further implemented into a software program called iMatch. The performance of iMatch was evaluated using experimental data of a mixture of standards and metabolite extract of rat plasma with spiked-in standards. About 19% of the molecules identified by ChromaTOF were filtered out by iMatch from the identification list of electron ionization (EI) mass spectral matching, while all of the spiked-in standards were preserved. The analysis results demonstrate that using the retention index values, via constructing a set of DFs, can improve the spectral matching-based identifications by reducing a significant portion of false-positives.
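The idea of filtering spectral-match candidates with an empirical distribution function of absolute retention index deviations can be sketched as follows. This is only an illustration under stated assumptions, not the iMatch implementation: the function names, the 0.95 cutoff, the reference deviations, and the candidate list are all hypothetical, and the sketch compares a library RI against an observed RI rather than reproducing the grouping into nine DFs.

```python
from bisect import bisect_right


def empirical_cdf(deviations):
    """Return F(x), the fraction of reference deviations <= x."""
    ordered = sorted(deviations)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def filter_candidates(candidates, observed_ri, cdf, max_cdf=0.95):
    """Keep candidates whose absolute RI deviation is not in the upper tail.

    `max_cdf` is a hypothetical cutoff chosen for illustration.
    """
    kept = []
    for name, library_ri in candidates:
        if cdf(abs(library_ri - observed_ri)) <= max_cdf:
            kept.append(name)
    return kept


# Made-up absolute RI deviations standing in for one group of library data.
reference_deviations = [1, 2, 2, 3, 4, 5, 6, 8, 10, 15,
                        20, 25, 30, 40, 50, 60, 80, 100, 150, 200]
cdf = empirical_cdf(reference_deviations)

# Hypothetical candidates from a spectral match, with their library RI values.
candidates = [("compound A", 1205.0), ("compound B", 1450.0)]
kept = filter_candidates(candidates, observed_ri=1200.0, cdf=cdf)
print(kept)
```

A candidate is dropped when its deviation is larger than almost all deviations seen in the reference data, which is one plausible way such a filter could reduce false positives while keeping true matches.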
20. Zhu X, Ding G, Levy W, Jakobi G, Schramm KW. Relationship of air sampling rates of semipermeable membrane devices with the properties of organochlorine pesticides. J Environ Sci (China) 2011; 23 Suppl:S40-S44. [PMID: 25084591 DOI: 10.1016/s1001-0742(11)61074-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]
Abstract
The organochlorine pesticides (OCP) in Eastern Bavaria at Haidel (1160 m a.s.l.) were monitored with a low volume active air sampler and semi-permeable membrane devices (SPMD). The air sampling rates (Rair) of SPMD for OCP were calculated. Quantitative structure-property relationship (QSPR) models of Rair of SPMD were developed for OCP with partial least squares (PLS) regression. Quantum chemical descriptors computed by the semi-empirical PM6 method were used as predictor variables. The cumulative variance of the dependent variable explained by the PLS components and determined by cross-validation (Q(2)cum), for the optimal models, is 0.637, indicating that the model has good predictive ability and robustness, and could be used to estimate Rair values of OCP. The main factors governing Rair of OCP are intermolecular interactions and the energy required for cavity formation in the dissolution of OCP into the triolein of SPMD.
Affiliation(s)
- Xiuhua Zhu: School of Environmental and Chemical Engineering, Dalian Jiaotong University, Dalian 116028, China
- Guanghui Ding: College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026, China
- Walkiria Levy: Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Ecological Chemistry, Ingolstadter Landstr. 1, D-85764 Neuherberg, Munich, Germany
- Gert Jakobi: Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Ecological Chemistry, Ingolstadter Landstr. 1, D-85764 Neuherberg, Munich, Germany
- Karl-Werner Schramm: Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Ecological Chemistry, Ingolstadter Landstr. 1, D-85764 Neuherberg, Munich, Germany; TUM, Wissenschaftszentrum Weihenstephan fuer Ernaehrung und Landnutzung, Department fuer Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
21. Garkani-Nejad Z, Ahmadvand M. Investigation of Linear and Nonlinear Chemometrics Methods in Modeling of Retention Time of Phenol Derivatives Based on Molecular Descriptors. SEP SCI TECHNOL 2011. [DOI: 10.1080/01496395.2010.539587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
22. A quantitative structure-retention relationship for the prediction of retention indices of the essential oils of Ammoides atlantica. JOURNAL OF THE SERBIAN CHEMICAL SOCIETY 2011. [DOI: 10.2298/jsc100219076a] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/27/2022]
Abstract
A simple, descriptive and interpretable model, based on a quantitative structure-retention relationship (QSRR), was developed using the genetic algorithm-multiple linear regression (GA-MLR) approach for the prediction of the retention indices (RI) of essential oil components. By molecular modeling, three significant descriptors related to the RI values of the essential oils were identified. A data set was selected consisting of the retention indices for 32 essential oil molecules with a range of more than 931 compounds. Then, a suitable set of the molecular descriptors was calculated and the important descriptors were selected with the aid of the genetic algorithm and multiple regression method. A model with a low prediction error and a good correlation coefficient was obtained. This model was used for the prediction of the RI values of some essential oil components which were not used in the modeling procedure.
23. Brack W, Ulrich N, Bataineh M. Separation Techniques in Effect-Directed Analysis. THE HANDBOOK OF ENVIRONMENTAL CHEMISTRY 2011. [DOI: 10.1007/978-3-642-18384-3_5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 02/04/2023]
24. Ghavami R, Faham S. QSRR Models for Kováts' Retention Indices of a Variety of Volatile Organic Compounds on Polar and Apolar GC Stationary Phases Using Molecular Connectivity Indexes. Chromatographia 2010; 72:893-903. [PMID: 21088689 PMCID: PMC2965364 DOI: 10.1365/s10337-010-1741-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 04/26/2010] [Revised: 07/13/2010] [Accepted: 08/02/2010] [Indexed: 11/29/2022]
Abstract
Quantitative structure-retention relationship (QSRR) approaches based on molecular connectivity indices are useful for predicting the gas chromatographic Kováts relative retention indices (GC-RRIs) of 132 volatile organic compounds (VOCs) on 12 different (4 apolar and 8 polar) stationary phases (C67, C103, C78, C∞, POH, TTF, MTF, PCL, PBR, TMO, PSH and PCN) at 130 °C. Full geometry optimization based on the Austin Model 1 semi-empirical molecular orbital method was carried out. Sets of 30 molecular descriptors were derived directly from the topological structures of the compounds using the DRAGON program. By means of the final variable selection method, an elimination selection stepwise regression algorithm, three optimal descriptors were selected to develop a QSRR model to predict the RRI of organic compounds on each stationary phase, with a correlation coefficient between 0.9378 and 0.9673 and a leave-one-out cross-validation correlation coefficient between 0.9325 and 0.9653. The root mean square errors over the 12 phases were within the range of 0.0333–0.0458. Furthermore, the accuracy of all developed models was confirmed using Y-randomization, external validation through an odd–even number and division of the entire dataset into training and test sets. A successful interpretation of the complex relationship between the GC RRIs of VOCs and their chemical structures was achieved by QSRR. The three connectivity indexes in the models are also rationally interpreted, indicating that the RRIs of all the organic compounds are precisely represented by molecular connectivity indexes.
Affiliation(s)
- Raouf Ghavami: Department of Chemistry, Faculty of Science, University of Kurdistan, P.O. Box 416, Sanandaj, Iran
25. Katritzky AR, Kuanar M, Slavov S, Hall CD, Karelson M, Kahn I, Dobchev DA. Quantitative Correlation of Physical and Chemical Properties with Chemical Structure: Utility for Prediction. Chem Rev 2010; 110:5714-89. [DOI: 10.1021/cr900238d] [Citation(s) in RCA: 386] [Impact Index Per Article: 25.7] [Indexed: 11/28/2022]
Affiliation(s)
- Alan R. Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
| | - Minati Kuanar
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
| | - Svetoslav Slavov
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
| | - C. Dennis Hall
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
| | - Mati Karelson
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
| | - Iiris Kahn
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
| | - Dimitar A. Dobchev
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
|
26
|
Acevedo-Martínez J, Zenkevich IG, Carrasco-Velar R. Use of a Simple Additive Scheme to Predict the GC Retention Indices of Aromatic Compounds with Different Structures. Chromatographia 2010. [DOI: 10.1365/s10337-010-1587-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
|
27
|
Garkani-Nejad Z. Use of Self-Training Artificial Neural Networks in a QSRR Study of a Diverse Set of Organic Compounds. Chromatographia 2009. [DOI: 10.1365/s10337-009-1241-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
|
28
|
Theoretical characterization of gas–liquid chromatographic stationary phases with quantum chemical descriptors. J Chromatogr A 2009; 1216:2540-7. [DOI: 10.1016/j.chroma.2009.01.026] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 11/24/2008] [Revised: 01/07/2009] [Accepted: 01/12/2009] [Indexed: 11/19/2022]
|
29
|
Mihaleva VV, Verhoeven HA, de Vos RCH, Hall RD, van Ham RCHJ. Automated procedure for candidate compound selection in GC-MS metabolomics based on prediction of Kovats retention index. Bioinformatics 2009; 25:787-94. [DOI: 10.1093/bioinformatics/btp056] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 01/20/2023]
|
30
|
Asadpour-Zeynali K, Jalili-Jahani N. Modeling GC-ECD retention times of pentafluorobenzyl derivatives of phenol by using artificial neural networks. J Sep Sci 2008; 31:3788-95. [DOI: 10.1002/jssc.200800418] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
|
31
|
Quantitative structure–retention relationships of pesticides in reversed-phase high-performance liquid chromatography based on WHIM and GETAWAY molecular descriptors. Anal Chim Acta 2008; 628:162-72. [DOI: 10.1016/j.aca.2008.09.018] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 07/09/2008] [Revised: 09/05/2008] [Accepted: 09/08/2008] [Indexed: 11/24/2022]
|
32
|
Riahi S, Ganjali MR, Pourbasheer E, Norouzi P. QSRR Study of GC Retention Indices of Essential-Oil Compounds by Multiple Linear Regression with a Genetic Algorithm. Chromatographia 2008. [DOI: 10.1365/s10337-008-0608-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 11/05/2022]
|
33
|
Prediction Partial Molar Heat Capacity at Infinite Dilution for Aqueous Solutions of Various Polar Aromatic Compounds over a Wide Range of Conditions Using Artificial Neural Networks. Bull Korean Chem Soc 2007. [DOI: 10.5012/bkcs.2007.28.9.1477] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
|
34
|
Liu F, Liang Y, Cao C, Zhou N. Theoretical prediction of the Kovat's retention index for oxygen-containing organic compounds using novel topological indices. Anal Chim Acta 2007; 594:279-89. [PMID: 17586126 DOI: 10.1016/j.aca.2007.05.023] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 11/17/2006] [Revised: 05/14/2007] [Accepted: 05/16/2007] [Indexed: 02/03/2023]
Abstract
For polar compounds, the polar groups in the molecules participate in polar interactions between eluents and stationary phases and are therefore expected to make large and separate contributions to the total retention index (RI). Characterizing these structural features helps to elucidate the quantitative structure-retention relationship (QSRR). In this paper, on the basis of the PEI index previously developed by Cao, two novel molecular polarizability effect indices, the modified molecular polarizability index (MPEI(m)) and the modified inner molecular polarizability index (IMPEI(m)), are proposed to predict the GC retention of a variety of oxygen-containing organic compounds with diverse chemical structures on OV-1 and SE-54 stationary phases. The sets of molecular descriptors were derived directly from the structures of the compounds on the basis of graph theory. Simple linear regression equations between the RI and the topological indices were established for each stationary phase separately (R>0.99). Statistical analysis showed that the QSRR models have high internal stability and good predictive ability for external groups. The molecular properties known to be relevant for GC retention data, such as molecular size, branching and polar functional groups, were well covered by the generated descriptors. The models with topological indices were compared with those based on quantum-chemical descriptors; the topological indices produced better correlations with the Kováts retention index. The results indicate the efficiency of the presented indices in structure-retention index correlations of complex compounds with polar multi-functional groups.
Affiliation(s)
- Fengping Liu
- School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan 411201, PR China
|
35
|
Zhu XH, Wang W, Schramm KW, Niu W. Prediction of the Kováts Retention Indices of Thiols by Use of Quantum Chemical and Physicochemical Descriptors. Chromatographia 2007. [DOI: 10.1365/s10337-007-0237-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
|
36
|
Hemmateenejad B, Javadnia K, Elyasi M. Quantitative structure-retention relationship for the Kovats retention indices of a large set of terpenes: a combined data splitting-feature selection strategy. Anal Chim Acta 2007; 592:72-81. [PMID: 17499073 DOI: 10.1016/j.aca.2007.04.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/27/2007] [Revised: 04/04/2007] [Accepted: 04/05/2007] [Indexed: 10/23/2022]
Abstract
A data set consisting of a large number of terpenoids, compounds that are widely distributed in nature and abundant in higher plants, has been used to develop a quantitative structure-property relationship (QSPR) for their Kovats retention index. QSPR models are usually obtained by splitting the data into two sets, calibration (or training) and prediction (or validation). All model-building steps, especially the feature selection procedure, are performed using this initial split, so the performance of the resulting models depends strongly on the initial data splitting. To investigate the effect of data splitting on feature selection, in the current article we propose a combined data splitting-feature selection (CDFS) methodology for QSPR model development, which produces several different training/validation/test sets and repeats all of the model-building steps. In this method, data splitting is performed many times, and feature selection is performed in each case. The resulting models are compared for similarity and dissimilarity between the selected descriptors. The final model is the one whose descriptors are the variables common to all of the resulting models. The method was applied to a QSPR study of a large data set containing the Kovats retention indices of 573 terpenoids. A final 8-parameter multilinear model with constitutional and topological indices was obtained. Cross-validation indicated that the model could reproduce more than 90% of the variance in the Kovats retention data. The relative error of prediction for an external test set of 50 compounds was 3.2%. Finally, to improve the results, the structure-retention relationships were modeled with a nonlinear approach using artificial neural networks, and better results were obtained.
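The combined data splitting-feature selection idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the feature selection algorithm, so a simple correlation ranking stands in for it, the data are synthetic, and the function names (`select_top_k`, `cdfs_common_descriptors`) are hypothetical.

```python
import numpy as np

def select_top_k(X, y, k):
    """Stand-in feature selector: the k descriptors most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return set(np.argsort(corr)[-k:].tolist())

def cdfs_common_descriptors(X, y, n_splits=20, train_fraction=0.8, k=5, seed=0):
    """Repeat random splitting plus feature selection; keep descriptors chosen every time."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(train_fraction * n)
    common = None
    for _ in range(n_splits):
        train_idx = rng.permutation(n)[:n_train]      # a fresh random training subset
        chosen = select_top_k(X[train_idx], y[train_idx], k)
        common = chosen if common is None else common & chosen
    return sorted(common)

# Synthetic data only: descriptors 0 and 4 carry the signal, the rest are noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 12))
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=300)

common = cdfs_common_descriptors(X, y)
print("Descriptors selected in every split:", common)
```

Taking the intersection over splits discards descriptors whose selection depends on one particular partition of the data, which is the dependence the abstract sets out to remove.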
|
37
|
Héberger K. Quantitative structure-(chromatographic) retention relationships. J Chromatogr A 2007; 1158:273-305. [PMID: 17499256 DOI: 10.1016/j.chroma.2007.03.108] [Citation(s) in RCA: 276] [Impact Index Per Article: 15.3] [Received: 01/25/2007] [Revised: 03/13/2007] [Accepted: 03/19/2007] [Indexed: 01/30/2023]
Abstract
Since the pioneering works of Kaliszan (R. Kaliszan, Quantitative Structure-Chromatographic Retention Relationships, Wiley, New York, 1987; and R. Kaliszan, Structure and Retention in Chromatography. A Chemometric Approach, Harwood Academic, Amsterdam, 1997) no comprehensive summary has been available in the field. The present review covers the period from 1996 to August 2006. The sources are grouped according to the special properties of the kinds of chromatography: quantitative structure-retention relationships in gas chromatography, in planar chromatography, in column liquid chromatography, in micellar liquid chromatography and in affinity chromatography, and quantitative structure-enantioselective retention relationships. General tendencies, misleading practices and conclusions, validation of the models and suggestions for future work are summarized for each sub-field. Some straightforward applications are emphasized but standard ones. The sources and the model compounds, descriptors, predicted retention data, modeling methods and indicators of their performance, validation of models, and stationary phases are collected in the tables. Some important conclusions are: Not all physicochemical descriptors correlate strongly with the retention data; the heat of formation is not related to the chromatographic retention. It is not appropriate to give the errors of Kovats indices in percentages. The apparently low values (1-3%) can disorient the reviewers and readers. The contemporary mean interlaboratory reproducibility of Kovats indices is about 5-10 i.u. for standard nonpolar phases and 10-25 i.u. for standard polar phases. The predictive performance of QSRR models deteriorates as the polarity of the GC stationary phase increases. The correlation coefficient alone is not a particularly good indicator of model performance. Residuals are more useful than plots of measured and calculated values. There is no need to give the retention data in the form of an equation if the number of compounds is small. The domain of applicability of the models should be given in all cases.
Affiliation(s)
- Károly Héberger
- Chemical Research Center, Hungarian Academy of Sciences, P.O. Box 17, H-1525 Budapest, Hungary.
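The abstract's warning about percentage errors can be made concrete with a short calculation. The sketch below assumes a hypothetical compound with a Kovats index of 1500 (a value chosen purely for illustration) and converts the "apparently low" 1-3% errors into index units; the function name is hypothetical.

```python
def percent_to_index_units(retention_index, percent_error):
    """Convert a relative error in percent into an absolute error in index units (i.u.)."""
    return retention_index * percent_error / 100.0

# Hypothetical retention index, chosen only for illustration.
example_ri = 1500

for pct in (1.0, 3.0):
    abs_error = percent_to_index_units(example_ri, pct)
    print(f"{pct:.0f}% of RI {example_ri} = {abs_error:.0f} i.u.")
```

For this assumed index, 1% and 3% correspond to 15 and 45 i.u., which exceeds the 5-10 i.u. interlaboratory reproducibility the abstract quotes for standard nonpolar phases.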
|
38
|
Prieto JJ, Talevi A, Bruno-Blanch LE. Application of linear discriminant analysis in the virtual screening of antichagasic drugs through trypanothione reductase inhibition. Mol Divers 2006; 10:361-75. [PMID: 17031538 DOI: 10.1007/s11030-006-9044-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 01/10/2006] [Accepted: 05/17/2006] [Indexed: 10/24/2022]
Abstract
We have performed virtual screening to identify new lead compounds that inhibit trypanothione reductase, an enzyme present in Trypanosoma cruzi, the agent responsible for Chagas disease. From a training set of 58 compounds, linear discriminant analysis (LDA) was performed using 2D and 3D descriptors as discriminating variables in order to find out which function of descriptors characterizes the active trypanothione reductase inhibitors (TRIs). The values of the statistical parameters F-Snedecor and Wilk's lambda for the discriminant function (DF) showed good statistical significance, as did the rate of success in the prediction for the training and the test sets: 91.38% and 88.63%, respectively. Internal validation through the Leave-Group-Out methodology was performed with good results, confirming the stability of the DF. Afterwards, the DF was applied in the virtual screening of 422,367 compounds. The optimum range of values of the octanol-water partition coefficient for a compound to develop trypanothione reductase inhibition was applied as a second filtering criterion. 739 structurally heterogeneous drugs of the virtual library were selected as promising TRIs.
Affiliation(s)
- Julián J Prieto
- Medicinal Chemistry, Department of Biological Sciences, Exact Sciences College, La Plata National University, La Plata, Buenos Aires, Argentina
|
39
|
Mjøs SA. Prediction of equivalent chain lengths from two-dimensional fatty acid retention indices. J Chromatogr A 2006; 1122:249-54. [PMID: 16701676 DOI: 10.1016/j.chroma.2006.04.067] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 03/07/2006] [Revised: 04/20/2006] [Accepted: 04/25/2006] [Indexed: 10/24/2022]
Abstract
A recently introduced two-dimensional fatty acid retention index system (2D-FARI) was used as the basis for predicting equivalent chain lengths (ECL) of fatty acid methyl esters (FAME) on a BPX-70 stationary phase. Models for the relationship between 2D-FARI data and ECL values of a calibration sample with 30 common fatty acids were established by simple multivariate regression. The models were then applied to 2D-FARI data for other FAMEs and used to predict the ECLs of these compounds. The 2D-FARI values for the fatty acids in the calibration sample are given by definition. Thus, the only information necessary to calculate the ECL value of a compound run under the same conditions as the calibration sample is the 2D-FARI values of the compound, which can be acquired from literature data. The method was validated with test sets analysed with different temperature and flow programs. ECLs of various marine FAME and trans isomers of eicosapentaenoic and docosahexaenoic acid were predicted with a root mean squared error of prediction from 0.002 to 0.012 ECL units.
Affiliation(s)
- Svein A Mjøs
- Norwegian Institute of Fisheries and Aquaculture Research, Department SFF, Kjerreidviken 16, N-5141 Fyllingsdalen, Bergen, Norway.
|
40
|
Mjøs SA, Grahl-Nielsen O. Prediction of gas chromatographic retention of polyunsaturated fatty acid methyl esters. J Chromatogr A 2006; 1110:171-80. [PMID: 16460747 DOI: 10.1016/j.chroma.2006.01.092] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 11/30/2005] [Revised: 01/17/2006] [Accepted: 01/19/2006] [Indexed: 11/26/2022]
Abstract
Multivariate regression models were applied to predict retention indices as equivalent chain lengths (ECL) for methylene-interrupted polyunsaturated fatty acids. Simple molecular descriptors, namely the chain length, the number of double bonds and the position of the double bond system, were used as predictors. The merits of different variable combinations were evaluated. For general models, it was necessary to include the distance from the double bond system to both the carbonyl group (Delta-position) and the methyl end of the fatty acid (n-position). The best accuracy was found for models including higher order terms of Delta and n. For models restricted to n-3 and n-6 isomers, it was not necessary to include the n-position among the variables. The highest residuals for the most accurate models were below 0.06 ECL units, and the root mean square error of prediction was below 0.030. The ECL data were obtained with three different temperature programs on a cyanopropyl column.
Affiliation(s)
- Svein A Mjøs
- Fiskeriforskning, Kjerreidviken 16, N-5141 Fyllingsdalen, Norway.
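The descriptors named in the abstract above are linked by simple arithmetic for methylene-interrupted fatty acids, where consecutive double bonds are three carbons apart. The sketch below computes the n-position from the chain length, the number of double bonds and the Delta-position; the function names are hypothetical and the paper's own regression terms are not reproduced here.

```python
def n_position(chain_length, double_bonds, delta):
    """n-position (distance from the methyl end) of a methylene-interrupted fatty acid.

    delta is the position of the first double bond counted from the carbonyl carbon.
    Consecutive double bonds in a methylene-interrupted system are 3 carbons apart.
    """
    if double_bonds < 1:
        raise ValueError("at least one double bond is required")
    last_double_bond = delta + 3 * (double_bonds - 1)
    n = chain_length - last_double_bond
    if n < 1:
        raise ValueError("double bond system does not fit in the chain")
    return n

def descriptors(chain_length, double_bonds, delta):
    """Collect the simple molecular descriptors listed in the abstract."""
    return {
        "chain_length": chain_length,
        "double_bonds": double_bonds,
        "delta": delta,
        "n": n_position(chain_length, double_bonds, delta),
    }

# Well-known examples: eicosapentaenoic acid (20:5, Delta-5),
# docosahexaenoic acid (22:6, Delta-4) and linoleic acid (18:2, Delta-9).
for name, args in [("EPA", (20, 5, 5)), ("DHA", (22, 6, 4)), ("linoleic", (18, 2, 9))]:
    print(name, descriptors(*args))
```

Because n follows from the other three descriptors in this class of compounds, a purely linear model cannot use all four independently, which is consistent with the abstract's emphasis on higher order terms of Delta and n.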
|
41
|
Prediction Acidity Constant of Various Benzoic Acids and Phenols in Water Using Linear and Nonlinear QSPR Models. Bull Korean Chem Soc 2005. [DOI: 10.5012/bkcs.2005.26.12.2007] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
|
42
|
Habibi-Yangjeh A, Danandeh-Jenagharad M, Nooshyar M. Application of artificial neural networks for predicting the aqueous acidity of various phenols using QSAR. J Mol Model 2005; 12:338-47. [PMID: 16344950 DOI: 10.1007/s00894-005-0050-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Received: 04/25/2005] [Accepted: 07/22/2005] [Indexed: 10/25/2022]
Abstract
Artificial neural networks (ANNs) have been successfully trained to model and predict the acidity constants (pK(a)) of 128 phenols with diverse chemical structures using a quantitative structure-activity relationship. An ANN with a 6-14-1 architecture was generated using the six molecular descriptors that appear in the multi-parameter linear regression (MLR) model. The polarizability term (pi (I)), the most positive charge of the acidic hydrogen atom (q+), the molecular weight (MW), the most negative charge of the phenolic oxygen atom (q-), the hydrogen-bond accepting ability (epsilon(B)) and the partial-charge weighted topological electronic (PCWTE) descriptor are the inputs, and the output is pK(a). It was found that a properly selected neural network trained with 106 phenols could represent the dependence of the acidity constant on the molecular descriptors fairly well. To evaluate the predictive power of the ANN, an optimized network was used to predict the pK(a)s of the 22 compounds in the prediction set, which were not used in the optimization procedure. The squared correlation coefficient (R2) and root mean square error (RMSE) of 0.8950 and 0.5621 obtained for the prediction set by the MLR model should be compared with the values of 0.99996 and 0.0114 obtained by the ANN model. These improvements are due to the fact that the pK(a) of phenols shows non-linear correlations with the molecular descriptors.
Affiliation(s)
- Aziz Habibi-Yangjeh
- Department of Chemistry, Faculty of Science, University of Mohaghegh Ardebili, P. O. Box 179, Ardebil, Iran.
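The 6-14-1 architecture mentioned in the abstract above can be sketched as a single-hidden-layer feed-forward network. This is a minimal illustration under assumptions the abstract does not confirm: bias terms in both layers, a sigmoid hidden activation and a linear output unit. All names are hypothetical, and the weights are random, not trained.

```python
import numpy as np

def init_params(n_in=6, n_hidden=14, n_out=1, seed=0):
    """Random (untrained) weights for a 6-14-1 network with bias terms (assumed)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, X):
    """Forward pass: sigmoid hidden layer (assumed), linear output for pK(a)."""
    hidden = 1.0 / (1.0 + np.exp(-(X @ params["W1"] + params["b1"])))
    return hidden @ params["W2"] + params["b2"]

def count_parameters(params):
    """Total number of adjustable weights and biases."""
    return sum(p.size for p in params.values())

params = init_params()
# Placeholder input: 22 rows of 6 descriptors, matching the size of the prediction set.
X_demo = np.zeros((22, 6))
print("output shape:", forward(params, X_demo).shape)
print("adjustable parameters:", count_parameters(params))
```

Under these assumptions the layout has 6×14 + 14 + 14×1 + 1 = 113 adjustable parameters.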
|
43
|
Franke S, Heinzel N, Specht M, Francke W. Identification of Organic Pollutants in Waters and Sediments from the Lower Mulde River Area. Acta Hydrochim Hydrobiol 2005. [DOI: 10.1002/aheh.200400588] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
44
|
|
45
|
Affiliation(s)
- T A Brettell
- Office of Forensic Sciences, New Jersey State Police, New Jersey Forensic Science and Technology Complex, 1200 Negron Road, Horizon Center, Hamilton, New Jersey 08691, USA
|
46
|
Artificial neural network prediction of quantitative structure: Retention relationships of polycyclic aromatic hydrocarbons in gas chromatography. J Serb Chem Soc 2005. [DOI: 10.2298/jsc0511291s] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Abstract
A feed-forward artificial neural network (ANN) model was used to link molecular structures (boiling points, connectivity indices and molecular weights) and retention indices of polycyclic aromatic hydrocarbons (PAHs) in linear temperature-programmed gas chromatography. A randomly taken subset of the PAH retention data reported by Lee et al. [Anal. Chem. 51 (1979) 768], containing retention index data for 30 PAHs, was used to build the ANN model. The predictive ability of the trained ANN was tested on unseen data for 18 PAHs from the same article, as well as on retention data for 7 PAHs obtained experimentally in this work. In addition, two different data sets with known retention indices taken from the literature were analyzed with the same ANN model. The relative accuracy, defined as the degree of agreement between the measured and the predicted retention indices, was within the experimental error margins (±3%) in all testing sets for most of the studied PAHs.
|