1
Darsaraee M, Kaveh S, Mani-Varnosfaderani A, Neiband MS. General structure-activity/selectivity relationship patterns for the inhibitors of the chemokine receptors (CCR1/CCR2/CCR4/CCR5) with application for virtual screening of PubChem database. J Biomol Struct Dyn 2023:1-19. [PMID: 37599469 DOI: 10.1080/07391102.2023.2248255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 08/08/2023] [Indexed: 08/22/2023]
Abstract
CC chemokine receptors (CCRs) form a crucial subfamily of G protein-linked receptors that play a distinct role in the onset and progression of various life-threatening diseases. The main aim of this research is to derive general structure-activity relationship (SAR) patterns to describe the selectivity and activity of CCR inhibitors. To this end, a total of 7332 molecules related to the inhibition of CCR1, CCR2, CCR4, and CCR5 were collected from the Binding Database and analyzed using machine learning techniques. A diverse set of 450 molecular descriptors was calculated for each molecule, and the molecules were classified based on their therapeutic targets and activities. The variable importance in the projection (VIP) approach was used to select discriminatory molecular features, and classification models were developed using supervised Kohonen networks (SKN) and counter-propagation artificial neural networks (CPANN). The reliability and predictability of the models were estimated using 10-fold cross-validation, an external validation set, and an applicability domain approach. We were able to identify different sets of molecular descriptors for discriminating between active and inactive molecules and model the selectivity of inhibitors towards different CCRs. The sensitivities of the predictions for the external test set for the SKN models ranged from 0.827 to 0.873. Finally, the developed classification models were used to screen approximately 2 million random molecules from the PubChem database, with average values for areas under the receiver operating characteristic curves ranging from 0.78 to 0.96 for SKN models and from 0.75 to 0.89 for CPANN models. Communicated by Ramaswamy H. Sarma.
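For readers unfamiliar with the metric, the sensitivity quoted in this abstract is the fraction of truly active molecules that a classifier labels as active, TP / (TP + FN). The following is a minimal Python sketch of that calculation; the function name `sensitivity` and the label vectors are illustrative assumptions and are not taken from the study.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the chosen positive class."""
    # True positives: positive samples predicted as positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: positive samples predicted as anything else.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive samples")
    return tp / (tp + fn)


# Hypothetical labels (1 = active, 0 = inactive), not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
print(sensitivity(y_true, y_pred))  # 4 of the 5 actives are recovered
```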
Affiliation(s)
- M Darsaraee
  - Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran
- S Kaveh
  - Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran
- A Mani-Varnosfaderani
  - Chemometrics and Cheminformatics Laboratory, Department of Analytical Chemistry, Tarbiat Modares University, Tehran, Iran
- M S Neiband
  - Department of Chemistry, Payame Noor University (PNU), Tehran, Iran
2
Király P, Kiss R, Kovács D, Ballaj A, Tóth G. The Relevance of Goodness-of-fit, Robustness and Prediction Validation Categories of OECD-QSAR Principles with Respect to Sample Size and Model Type. Mol Inform 2022; 41:e2200072. [PMID: 35773201 PMCID: PMC9787734 DOI: 10.1002/minf.202200072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2022] [Accepted: 06/30/2022] [Indexed: 12/30/2022]
Abstract
We investigated the relevance of the validation principles for Quantitative Structure-Activity Relationship models issued by the Organisation for Economic Co-operation and Development. We checked the goodness-of-fit, robustness and predictivity categories in linear and nonlinear models using benchmark datasets. Most of our conclusions are drawn from the sample size dependence of the different validation parameters. We found that the goodness-of-fit parameters misleadingly overestimate the models on small samples. In the case of neural network and support vector models, the feasibility of the goodness-of-fit parameters might often be questioned. We propose using the simplest y-scrambling method to estimate chance correlation. We found that the leave-one-out and leave-many-out cross-validation parameters can be rescaled to each other in all models, and the computationally feasible method should be chosen depending on the model type. We assessed the interdependence of the validation parameters by calculating their rank correlations. Goodness of fit and robustness correlate quite well above a certain sample size for linear models, so one of the approaches might be redundant. In the rank correlation between internal and external validation parameters, we found that the assignment of well and poorly modellable data to the training or the test set causes negative correlations.
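The y-scrambling check mentioned in this abstract can be sketched as follows: the response values are randomly permuted, the model is refitted, and the resulting fit quality is compared with that of the unscrambled data. This is a minimal Python illustration under stated assumptions, not the authors' procedure: it uses a one-descriptor least-squares line, a synthetic dataset, and the hypothetical helper names `r_squared` and `y_scrambling`.

```python
import random


def r_squared(x, y):
    """R^2 of a one-descriptor least-squares line fitted to (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def y_scrambling(x, y, n_rounds=200, seed=0):
    """Return (R^2 on real data, mean R^2 over randomly permuted responses)."""
    rng = random.Random(seed)
    scrambled_scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the x-y relationship
        scrambled_scores.append(r_squared(x, shuffled))
    return r_squared(x, y), sum(scrambled_scores) / n_rounds


# Synthetic example: a nearly linear response with small alternating noise.
x = [float(i) for i in range(1, 21)]
y = [2.0 * a + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, a in enumerate(x)]
real_r2, scrambled_r2 = y_scrambling(x, y)
print(real_r2, scrambled_r2)
```

A real R² far above the scrambled average suggests the fit is not a chance correlation; comparable values would indicate the opposite.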
Affiliation(s)
- Péter Király
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Ramóna Kiss
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Dániel Kovács
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Amine Ballaj
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
- Gergely Tóth
  - Institute of Chemistry, Loránd Eötvös University, Pázmány S. 1/A, 1117 Budapest, Hungary
3
Kovács D, Király P, Tóth G. Sample-size dependence of validation parameters in linear regression models and in QSAR. SAR and QSAR in Environmental Research 2021; 32:247-268. [PMID: 33749419 DOI: 10.1080/1062936x.2021.1890208] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/09/2020] [Accepted: 02/10/2021] [Indexed: 06/12/2023]
Abstract
The dependence of statistical validation parameters on the sample size was investigated in fits of multivariate linear curves. We observed that R2 and related internal parameters were misleading, as they overestimated the goodness-of-fit of models at small sample sizes. Cross-validation metrics showed correct trends. It was possible to scale the leave-one-out and the leave-many-out results close to identical by correcting the degrees of freedom of the models. y- and x-randomized validation parameters were calculated, and the methods provided close to identical results. We suggest using the simplest methods in both cases. The external parameters followed correct trends with respect to the sample size, but their sensitivity differed. We plotted the Roy-Ojha metrics in 2D and coloured them with respect to other external parameters to provide an easy classification of models. The rank correlations were calculated between the performance parameters. Up to a certain sample size, goodness-of-fit and robustness were distinguishable, but above it the parameters were redundant. The external-internal pairs were weakly correlated. Our data show that all three aspects of validation are necessary at small sample sizes, but the internal check of robustness is not informative above a given sample size.
Affiliation(s)
- D Kovács
  - Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
- P Király
  - Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
- G Tóth
  - Institute of Chemistry, Loránd Eötvös University, Budapest, Hungary
4
El-Harbawi M, Samir BB, El blidi L, Ben Ghanem O. Highly accurate prediction of flammability limits of chemical compounds using novel integrated hybrid models. PLoS One 2019; 14:e0224807. [PMID: 31725738 PMCID: PMC6855467 DOI: 10.1371/journal.pone.0224807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/22/2019] [Accepted: 10/22/2019] [Indexed: 11/18/2022]
Abstract
Two novel and highly accurate hybrid models were developed for the prediction of the flammability limits (lower flammability limit (LFL) and upper flammability limit (UFL)) of pure compounds using a quantitative structure-property relationship approach. The two models were developed using a dataset obtained from the DIPPR Project 801 database, which comprises 1057 and 515 literature data points for the LFL and UFL, respectively. Multiple linear regression (MLR), logarithmic, and polynomial models were used to develop the models according to an algorithm and code written using the MATLAB software. The results indicated that the proposed models were capable of predicting LFL and UFL values with accuracies that were among the best (i.e. most optimised) reported in the literature (LFL: R2 = 99.72%, with an average absolute relative deviation (AARD) of 0.8%; UFL: R2 = 99.64%, with an AARD of 1.41%). These hybrid models are unique in that they were developed using a modified mathematical technique that combined three conventional methods. These models afford good practicability and can be used as cost-effective alternatives to experimental measurements of LFL and UFL values for a wide range of pure compounds.
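The average absolute relative deviation (AARD) reported in this abstract is commonly defined as the mean of |predicted − experimental| / |experimental|, expressed as a percentage. The sketch below assumes that common definition; the function name `aard_percent` and the numbers are hypothetical and are not LFL or UFL values from the paper.

```python
def aard_percent(experimental, predicted):
    """Average absolute relative deviation, in percent."""
    if not experimental or len(experimental) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative deviation of each prediction from its experimental value.
    deviations = [abs(p - e) / abs(e) for e, p in zip(experimental, predicted)]
    return 100.0 * sum(deviations) / len(deviations)


# Hypothetical values for illustration only.
experimental = [1.0, 2.0, 4.0, 5.0]
predicted = [1.1, 1.9, 4.0, 5.5]
print(round(aard_percent(experimental, predicted), 2))  # 6.25
```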
Affiliation(s)
- Mohanad El-Harbawi
  - Department of Chemical Engineering, King Saud University, Riyadh, Saudi Arabia
- Brahim Belhaouari Samir
  - Division of Information & Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Lahssen El blidi
  - Department of Chemical Engineering, King Saud University, Riyadh, Saudi Arabia
- Ouahid Ben Ghanem
  - Department of process plant operations, Qatar Technical, Doha, Qatar
  - Chemical Engineering Department, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, Tronoh, Perak, Malaysia
5
Wang D, He G, Chen H. Prediction for the detonation velocity of the nitrogen-rich energetic compounds based on quantum chemistry. Russian Journal of Physical Chemistry A 2014. [DOI: 10.1134/s0036024414130032] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
6
Funar-Timofei S, Iliescu S, Suzuki T. Correlations of limiting oxygen index with structural polyphosphoester features by QSPR approaches. Struct Chem 2014. [DOI: 10.1007/s11224-014-0474-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
7
Mathieu D. Power Law Expressions for Predicting Lower and Upper Flammability Limit Temperatures. Ind Eng Chem Res 2013. [DOI: 10.1021/ie4002348] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/28/2022]
8
Gharagheizi F, Sattari M, Ilani-Kashkouli P, Mohammadi AH, Ramjugernath D, Richon D. Quantitative structure-property relationship for thermal decomposition temperature of ionic liquids. Chem Eng Sci 2012. [DOI: 10.1016/j.ces.2012.08.036] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
9
Quintero FA, Patel SJ, Muñoz F, Sam Mannan M. Review of Existing QSAR/QSPR Models Developed for Properties Used in Hazardous Chemicals Classification System. Ind Eng Chem Res 2012. [DOI: 10.1021/ie301079r] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 02/05/2023]
Affiliation(s)
- Flor A. Quintero
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
  - Departamento de Ingeniería Química, Universidad de los Andes, Cr.1 Este #19 A-40, Bogotá D.C., Colombia
- Suhani J. Patel
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
- Felipe Muñoz
  - Departamento de Ingeniería Química, Universidad de los Andes, Cr.1 Este #19 A-40, Bogotá D.C., Colombia
- M. Sam Mannan
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
10
Fayet G, Rotureau P, Prana V, Adamo C. Global and local quantitative structure-property relationship models to predict the impact sensitivity of nitro compounds. Process Safety Progress 2012. [DOI: 10.1002/prs.11499] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
11
Gharagheizi F, Ilani-Kashkouli P, Mohammadi AH. Corresponding States Method for Estimation of Upper Flammability Limit Temperature of Chemical Compounds. Ind Eng Chem Res 2012. [DOI: 10.1021/ie300375k] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- Farhad Gharagheizi
  - Department of Chemical Engineering, Buinzahra Branch, Islamic Azad University, Buinzahra, Iran
- Amir H. Mohammadi
  - MINES ParisTech, CEP/TEP-Centre Énergétique et Procédés, 35 Rue Saint Honoré, 77305 Fontainebleau, France
  - Thermodynamics Research Unit, School of Chemical Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa
12
Gharagheizi F, Ilani-Kashkouli P, Mirkhani SA, Mohammadi AH. Computation of Upper Flash Point of Chemical Compounds Using a Chemical Structure-Based Model. Ind Eng Chem Res 2012. [DOI: 10.1021/ie202868v] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Affiliation(s)
- Farhad Gharagheizi
  - Department of Chemical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Seyyed Alireza Mirkhani
  - Department of Chemical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Amir H. Mohammadi
  - MINES ParisTech, CEP/TEP - Centre Énergétique et Procédés, 35 Rue Saint Honoré, 77305 Fontainebleau, France
  - Thermodynamics Research Unit, School of Chemical Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa
13
Gharagheizi F, Ilani-Kashkouli P, Mirkhani SA, Farahani N, Mohammadi AH. QSPR Molecular Approach for Estimating Henry’s Law Constants of Pure Compounds in Water at Ambient Conditions. Ind Eng Chem Res 2012. [DOI: 10.1021/ie202646u] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
Affiliation(s)
- Amir H. Mohammadi
  - MINES ParisTech, CEP/TEP - Centre Énergétique et Procédés, 35 Rue Saint Honoré, 77305 Fontainebleau, France
  - Thermodynamics Research Unit, School of Chemical Engineering, University of KwaZulu-Natal, Howard College Campus, King George V Avenue, Durban 4041, South Africa
14
Mirkhani SA, Gharagheizi F, Sattari M. A QSPR model for prediction of diffusion coefficient of non-electrolyte organic compounds in air at ambient condition. Chemosphere 2012; 86:959-966. [PMID: 22189378 DOI: 10.1016/j.chemosphere.2011.11.021] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 09/17/2011] [Revised: 11/09/2011] [Accepted: 11/13/2011] [Indexed: 05/31/2023]
Abstract
Evaluation of diffusion coefficients of pure compounds in air is of great interest for many diverse industrial and air quality control applications. In this communication, a QSPR method is applied to predict the molecular diffusivity of chemical compounds in air at 298.15 K and atmospheric pressure. Four thousand five hundred and seventy-nine organic compounds from a broad spectrum of chemical families have been investigated to propose a comprehensive and predictive model. The final model is derived by Genetic Function Approximation (GFA) and contains five descriptors. Using this dedicated model, we obtain satisfactory results, quantified by the following statistics for the predicted properties relative to existing experimental values: Squared Correlation Coefficient = 0.9723, Standard Deviation Error = 0.003 and Average Absolute Relative Deviation = 0.3%.
15
Mirkhani SA, Gharagheizi F. Predictive Quantitative Structure–Property Relationship Model for the Estimation of Ionic Liquid Viscosity. Ind Eng Chem Res 2012. [DOI: 10.1021/ie2025823] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 11/29/2022]
Affiliation(s)
- Seyyed Alireza Mirkhani
  - Department of Chemical Engineering, Buinzahra Branch, Islamic Azad University, Buinzahra, Iran
- Farhad Gharagheizi
  - Department of Chemical Engineering, Buinzahra Branch, Islamic Azad University, Buinzahra, Iran
16
Nonlinear molecular based modeling of the flash point for application in inherently safer design. J Loss Prev Process Ind 2012. [DOI: 10.1016/j.jlp.2011.06.025] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/21/2022]
17
Gharagheizi F, Eslamimanesh A, Mohammadi AH, Richon D. QSPR approach for determination of parachor of non-electrolyte organic compounds. Chem Eng Sci 2011. [DOI: 10.1016/j.ces.2011.03.039] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 10/18/2022]
18
Prediction of flammability limit temperatures from molecular structures using a neural network–particle swarm algorithm. J Taiwan Inst Chem Eng 2011. [DOI: 10.1016/j.jtice.2010.08.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 10/18/2022]
19
Reyes OJ, Patel SJ, Mannan MS. Quantitative Structure Property Relationship Studies for Predicting Dust Explosibility Characteristics (Kst, Pmax) of Organic Chemical Dusts. Ind Eng Chem Res 2011. [DOI: 10.1021/ie1013663] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Affiliation(s)
- Olga J. Reyes
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Suhani J. Patel
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- M. Sam Mannan
  - Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
20
Gharagheizi F, Sattari M. Prediction of Triple-Point Temperature of Pure Components Using their Chemical Structures. Ind Eng Chem Res 2009. [DOI: 10.1021/ie901029m] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Affiliation(s)
- Farhad Gharagheizi
  - Department of Chemical Engineering, Faculty of Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran, and Division of Polymer Science and Technology, Research Institute of Petroleum Industry (RIPI), P.O. Box 14665-1998, Tehran, Iran
- Mehdi Sattari
  - Department of Chemical Engineering, Faculty of Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran, and Division of Polymer Science and Technology, Research Institute of Petroleum Industry (RIPI), P.O. Box 14665-1998, Tehran, Iran
21
Gharagheizi F, Sattari M. Prediction of the θ(UCST) of Polymer Solutions: A Quantitative Structure−Property Relationship Study. Ind Eng Chem Res 2009. [DOI: 10.1021/ie9000426] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Affiliation(s)
- Farhad Gharagheizi
  - Department of Chemical Engineering, Faculty of Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran, and Division of Polymer Science and Technology, Research Institute of Petroleum Industry (RIPI), P.O. Box 14665-1998, Tehran, Iran
- Mehdi Sattari
  - Department of Chemical Engineering, Faculty of Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran, and Division of Polymer Science and Technology, Research Institute of Petroleum Industry (RIPI), P.O. Box 14665-1998, Tehran, Iran