1
Petrova VV, Domnin AV, Porozov YB, Kuliaev PO, Solovev YV. Implementation of machine learning protocols to predict the hydrolysis reaction properties of organophosphorus substrates using descriptors of electron density topology. J Comput Chem 2023. [PMID: 37772443 DOI: 10.1002/jcc.27227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 07/25/2023] [Accepted: 08/28/2023] [Indexed: 09/30/2023]
Abstract
Prediction of catalytic reaction efficiency is one of the most intriguing and challenging applications of machine learning (ML) algorithms in chemistry. In this study, we demonstrated a strategy for utilizing ML protocols applied to Quantum Theory of Atoms In Molecules (QTAIM) parameters to predict the ability of the A17 L47K catalytic antibody to covalently capture organophosphate pesticides. We found that the novel "composite" DFT functional B97-3c could be effectively employed for fast and accurate initial geometry optimization, aligning well with the input dataset creation. QTAIM descriptors proved to be well-established in describing the examined dataset using density-based and hierarchical clustering algorithms. The obtained clusters exhibited correlations with the chemical classes of the input compounds. The precise physical interpretation of the QTAIM properties simplifies the explanation of feature impact for both supervised and unsupervised ML protocols. It also enables acceleration in the search for entries with desired properties within large databases. Furthermore, our findings indicated that Ridge Regression with Laplacian kernel and CatBoost Regressor algorithms demonstrated suitable performance in handling small datasets with non-trivial dependencies. They were able to predict the actual reaction barrier values with a high level of accuracy. Additionally, the CatBoost Classifier proved reliable in discriminating between "active" and "inactive" compounds.
Affiliation(s)
- Vlada V Petrova
- M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, Russia
- Quantum Chemistry Department, Institute of Chemistry, St. Petersburg State University, Saint Petersburg, Russia
- Anton V Domnin
- Quantum Chemistry Department, Institute of Chemistry, St. Petersburg State University, Saint Petersburg, Russia
- Yuri B Porozov
- St. Petersburg School of Physics, Mathematics, and Computer Science, HSE University, Saint Petersburg, Russia
- The Center of Bio- and Chemoinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Pavel O Kuliaev
- Independent Researcher from Saint Petersburg, Saint Petersburg, Russia
- Yaroslav V Solovev
- M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, Russia
2
Li X, Liu G, Wang Z, Zhang L, Liu H, Ai H. Ensemble multiclassification model for aquatic toxicity of organic compounds. Aquatic Toxicology (Amsterdam, Netherlands) 2023; 255:106379. [PMID: 36587517 DOI: 10.1016/j.aquatox.2022.106379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 12/04/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
With environmental pollution becoming increasingly serious, organic compounds have become the main hazard of environmental pollution and exert substantial negative impacts on aquatic organisms. In research pertaining to the acute toxicity of organic compounds, traditional biological experimental methods are time-consuming and expensive. In addition, computer-aided binary classification models cannot accurately classify acute toxicity. Therefore, a multiclassification model is necessary for more accurate classification of acute toxicity. In this study, median lethal concentrations of 373 organic compounds in the environmental toxicology datasets ECOTOX and EAT5 were used. These chemicals were classified into four categories based on the European Economic Community criteria. Then the random forest, support vector machine, extreme gradient boosting, adaptive gradient boosting, and C5.0 decision tree algorithms and eight molecular fingerprints were used to build multiclassification base models for the acute toxicity of organic compounds. The base models were repeated 100 times with fivefold cross-validation and external validation. The ensemble model was obtained by the voting method. The best base classifier was ExtendFP-C5.0, which had accuracy, sensitivity and specificity values of 87.30%, 87.32% and 95.76% for external validation, while the voting ensemble model achieved 96.92%, 96.93% and 98.97%, respectively. The ensemble model achieved a higher accuracy than previously reported studies. Our study will help to further classify the acute toxicity of organic compounds to aquatic organisms and predict the hazard classes of organic compounds.
Affiliation(s)
- Xinran Li
- College of Life Science, Liaoning University, Shenyang, 110036, China
- Gaohua Liu
- College of Life Science, Liaoning University, Shenyang, 110036, China
- Zhibo Wang
- College of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
- College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China
- Hongsheng Liu
- China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China; College of Pharmacy, Liaoning University, Shenyang, 110036, China
- Haixin Ai
- College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China.
3
Du J, Qi S, Fan T, Yang Y, Wang C, Shu Q, Zhuo S, Zhu C. Nitrogen and copper-doped carbon quantum dots with intrinsic peroxidase-like activity for double-signal detection of phenol. Analyst 2021; 146:4280-4289. [PMID: 34105526 DOI: 10.1039/d1an00796c] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Herein, a simple and facile one-step hydrothermal carbonization synthesis procedure for the fabrication of N, Cu-doped carbon quantum dots (N, Cu-CQDs) as a peroxidase-mimicking enzyme was reported. The peroxidase-like performance of N, Cu-CQDs was assessed based on the oxidative coupling reaction of phenol with 4-aminoantipyrine (4-AAP) in the presence of hydrogen peroxide (H2O2). The N, Cu-CQDs/4-AAP/H2O2 system was applied to sensing phenol based on double signals of absorption spectra (or colorimetric visualization) as well as fluorescence spectra. The obtained limits of detection (LODs) were as low as 0.12 μM and 0.02 μM, respectively. Moreover, the proposed method was successfully applied to the determination of phenol in sewage with satisfactory recovery. Our results demonstrate that the N, Cu-CQDs/4-AAP/H2O2/phenol sensing system has a great potential prospect for applications in environmental chemistry and biotechnology.
Affiliation(s)
- Jinyan Du
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Shuangqing Qi
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Tingting Fan
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Ying Yang
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Chaofeng Wang
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Qin Shu
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Shujuan Zhuo
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
- Changqing Zhu
- Anhui Key Laboratory of Chemo-Biosensing, Key Laboratory of Functional Molecular Solids, Ministry of Education, College of Chemistry and Materials Science, Anhui Normal University, Wuhu, 241000, P R China.
4
Cerruela García G, García-Pedrajas N. Boosted feature selectors: a case study on prediction P-gp inhibitors and substrates. J Comput Aided Mol Des 2018; 32:1273-1294. [PMID: 30367310 DOI: 10.1007/s10822-018-0171-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/24/2018] [Accepted: 10/18/2018] [Indexed: 01/11/2023]
Abstract
Feature selection is commonly used as a preprocessing step to machine learning for improving learning performance, lowering computational complexity and facilitating model interpretation. This paper proposes the application of boosting feature selection to improve the classification performance of standard feature selection algorithms evaluated for the prediction of P-gp inhibitors and substrates. Two well-known classification algorithms, decision trees and support vector machines, were used to classify the chemical compounds. The experimental results showed better performance for boosting feature selection with respect to the standard feature selection algorithms while maintaining the capability for feature reduction.
Affiliation(s)
- Gonzalo Cerruela García
- Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, Albert Einstein Building, 14071, Córdoba, Spain.
- Nicolás García-Pedrajas
- Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, Albert Einstein Building, 14071, Córdoba, Spain.
5
Xue H, Yan Y, Hou Y, Li G, Hao C. Novel carbon quantum dots for fluorescent detection of phenol and insights into the mechanism. New J Chem 2018. [DOI: 10.1039/c8nj01611a] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/21/2022]
Abstract
Phenol is considered as one of the most important pollutants in the water environment, and thus its detection plays a cardinal role in environmental assessment and treatment.
Affiliation(s)
- Hong Xue
- State Key Laboratory of Fine Chemicals, Dalian University of Technology, China
- Yang Yan
- State Key Laboratory of Fine Chemicals, Dalian University of Technology, China
- Yong Hou
- State Key Laboratory of Fine Chemicals, Dalian University of Technology, China
- Guanglan Li
- State Key Laboratory of Fine Chemicals, Dalian University of Technology, China
- Ce Hao
- State Key Laboratory of Fine Chemicals, Dalian University of Technology, China
6
Chen YS. A comprehensive identification-evidence based alternative for HIV/AIDS treatment with HAART in the healthcare industries. Computer Methods and Programs in Biomedicine 2016; 131:111-126. [PMID: 27265053 DOI: 10.1016/j.cmpb.2016.04.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/29/2015] [Revised: 03/03/2016] [Accepted: 04/01/2016] [Indexed: 06/05/2023]
Abstract
BACKGROUND AND OBJECTIVE: The HIV/AIDS-related issue has given rise to a priority concern in which potential new therapies are increasingly highlighted to lessen the negative impact of highly active anti-retroviral therapy (HAART) in the healthcare industry. With the motivation of "medical applications," this study focuses on the main advanced feature selection techniques and classification approaches that reflect a new architecture, and a trial to build a hybrid model for interested parties. METHODS: This study first uses an integrated linear-nonlinear feature selection technique to identify the determinants influencing HAART medication and utilizes organizations of different condition-attributes to generate a hybrid model based on a rough set classifier to study evolving HIV/AIDS research in order to improve classification performance. RESULTS: The proposed model makes use of a real data set from Taiwan's specialist medical center. The experimental results show that the proposed model yields a satisfactory result that is superior to the listed methods, and the core condition-attributes PVL, CD4, Code, Age, Year, PLT, and Sex were identified in the HIV/AIDS data set. In addition, the decision rule set created can be referenced as a knowledge-based healthcare service system as the best of evidence-based practices in the workflow of current clinical diagnosis. CONCLUSIONS: This study highlights the importance of these key factors and provides the rationale that the proposed model is an effective alternative to analyzing sustained HAART medication in follow-up studies of HIV/AIDS treatment in practice.
Affiliation(s)
- You-Shyang Chen
- Department of Information Management, Hwa Hsia University of Technology, 111, Gongzhuan Rd., Zhonghe Dist., New Taipei City 235, Taiwan.
7
Geometry optimization method versus predictive ability in QSPR modeling for ionic liquids. J Comput Aided Mol Des 2016; 30:165-76. [DOI: 10.1007/s10822-016-9894-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 10/16/2015] [Accepted: 01/13/2016] [Indexed: 01/03/2023]
8
Matta CF. Modeling biophysical and biological properties from the characteristics of the molecular electron density, electron localization and delocalization matrices, and the electrostatic potential. J Comput Chem 2014; 35:1165-98. [PMID: 24777743 PMCID: PMC4368384 DOI: 10.1002/jcc.23608] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 12/24/2013] [Revised: 03/16/2014] [Accepted: 03/21/2014] [Indexed: 11/11/2022]
Abstract
The electron density and the electrostatic potential are fundamentally related to the molecular hamiltonian, and hence are the ultimate source of all properties in the ground- and excited-states. The advantages of using molecular descriptors derived from these fundamental scalar fields, both accessible from theory and from experiment, in the formulation of quantitative structure-to-activity and structure-to-property relationships, collectively abbreviated as QSAR, are discussed. A few such descriptors encode for a wide variety of properties including, for example, electronic transition energies, pK(a)'s, rates of ester hydrolysis, NMR chemical shifts, DNA dimers binding energies, π-stacking energies, toxicological indices, cytotoxicities, hepatotoxicities, carcinogenicities, partial molar volumes, partition coefficients (log P), hydrogen bond donor capacities, enzyme-substrate complementarities, bioisosterism, and regularities in the genetic code. Electronic fingerprinting from the topological analysis of the electron density is shown to be comparable and possibly superior to Hammett constants and can be used in conjunction with traditional bulk and liposolubility descriptors to accurately predict biological activities. A new class of descriptors obtained from the quantum theory of atoms in molecules' (QTAIM) localization and delocalization indices and bond properties, cast in matrix format, is shown to quantify transferability and molecular similarity meaningfully. Properties such as "interacting quantum atoms (IQA)" energies which are expressible into an interaction matrix of two body terms (and diagonal one body "self" terms, as IQA energies) can be used in the same manner. The proposed QSAR-type studies based on similarity distances derived from such matrix representatives of molecular structure necessitate extensive investigation before their utility is unequivocally established.
Affiliation(s)
- Chérif F Matta
- Department of Chemistry and Physics, Mount Saint Vincent University, Halifax, Nova Scotia, Canada, B3M 2J6
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada, B3H 4J3
- Department of Chemistry, Saint Mary's University, Halifax, Nova Scotia, Canada, B3H 3C3
9
Chakraborty A, Pan S, Chattaraj PK. Biological Activity and Toxicity: A Conceptual DFT Approach. Structure and Bonding 2013. [DOI: 10.1007/978-3-642-32750-6_5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/01/2022]
10
Keshavarz MH, Gharagheizi F, Shokrolahi A, Zakinejad S. Accurate prediction of the toxicity of benzoic acid compounds in mice via oral without using any computer codes. Journal of Hazardous Materials 2012; 237-238:79-101. [PMID: 22959133 DOI: 10.1016/j.jhazmat.2012.07.048] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/22/2011] [Revised: 03/30/2012] [Accepted: 07/25/2012] [Indexed: 06/01/2023]
Abstract
Most benzoic acid derivatives are toxic, which may cause serious public health and environmental problems. Two novel simple and reliable models are introduced for desk calculations of the toxicity of benzoic acid compounds in mice via oral LD(50) with more reliance on their answers as one could attach to the more complex outputs. They require only elemental composition and molecular fragments without using any computer codes. The first model is based on only the number of carbon and hydrogen atoms, which can be improved by several molecular fragments in the second model. For 57 benzoic compounds, where the computed results of quantitative structure-toxicity relationship (QSTR) were recently reported, the predicted results of the two simple models of the present method are more reliable than QSTR computations. The present simple method is also tested with a further 324 benzoic acid compounds including complex molecular structures, which confirms the good forecasting ability of the second model.
Affiliation(s)
- Mohammad Hossein Keshavarz
- Department of Chemistry, Malek-ashtar University of Technology, Shahin-shahr P.O. Box 83145/115, Isfahan, Islamic Republic of Iran.
11
Roy K, Das RN. QSTR with extended topochemical atom (ETA) indices. 14. QSAR modeling of toxicity of aromatic aldehydes to Tetrahymena pyriformis. Journal of Hazardous Materials 2010; 183:913-922. [PMID: 20739120 DOI: 10.1016/j.jhazmat.2010.07.116] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/17/2010] [Revised: 07/27/2010] [Accepted: 07/27/2010] [Indexed: 05/29/2023]
Abstract
Aldehydes are a toxic class of chemicals causing severe health hazards. Against this background, quantitative structure-toxicity relationship (QSTR) models have been developed in the present study using Extended Topochemical Atom (ETA) indices for a large group of 77 aromatic aldehydes for their acute toxicity against the protozoan ciliate Tetrahymena pyriformis. The ETA models have been compared with those developed using various non-ETA topological indices. An attempt was also made to include the n-octanol/water partition coefficient (log K(o/w)) as an additional descriptor considering the importance of hydrophobicity in toxicity prediction. Thirty different models were developed using different chemometric tools. All the models have been validated using internal validation and external validation techniques. The statistical quality of the ETA models was found to be comparable to that of the non-ETA models. The ETA models have shown the important effects of steric bulk, lipophilicity, presence of electronegative atom containing substituents and functionality of the aldehydic oxygen on the toxicity of the aldehydes. The best ETA model (without using log K(o/w)) shows encouraging statistical quality (Q²(int) = 0.709, Q²(ext) = 0.744). It is interesting to note that some of the topological models reported here are better in statistical quality than previously reported models using quantum chemical descriptors.
Affiliation(s)
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
12
Novel amino acids indices based on quantum topological molecular similarity and their application to QSAR study of peptides. Amino Acids 2010; 40:1169-83. [DOI: 10.1007/s00726-010-0741-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/04/2010] [Accepted: 08/31/2010] [Indexed: 10/19/2022]
13
3D-QSAR studies on caspase-mediated apoptosis activity of phenolic analogues. J Mol Model 2010; 17:1-8. [DOI: 10.1007/s00894-010-0689-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/28/2009] [Accepted: 02/11/2010] [Indexed: 10/19/2022]
14
Kar S, Harding AP, Roy K, Popelier PLA. QSAR with quantum topological molecular similarity indices: toxicity of aromatic aldehydes to Tetrahymena pyriformis. SAR and QSAR in Environmental Research 2010; 21:149-168. [PMID: 20373218 DOI: 10.1080/10629360903568697] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 05/29/2023]
Abstract
Extensive production and utilization of aromatic aldehydes and their derivatives without proper certification is alarming with regard to environmental safety. This concern motivated our construction of predictive quantitative structure-activity relationship (QSAR) models for the toxicity of aldehydes to the ecologically important species Tetrahymena pyriformis. Quantum topological molecular similarity (QTMS) descriptors, along with the lipid-water partition coefficient (log K(o/w)), were used as predictor variables. The QTMS descriptors were calculated at different levels of theory including AM1, HF/3-21G(d), HF/6-31G(d), B3LYP/6-31+G(d,p), B3LYP/6-311+G(2d,p) and MP2/6-311+G(2d,p). The data set of 77 aromatic aldehydes was divided into a training set (n = 58) and a test set (n = 19), and 58 models were developed using partial least squares (PLS) and genetic partial least squares (G/PLS). We evaluated the overall predictive capacity of the models based on leave-one-out predictions for the training set compounds and model derived predictions for the test set compounds. For both PLS and G/PLS, the models built at the HF/6-31G(d) level show better predictivity (based on overall prediction) than the models developed at any of the other five levels. Further validation was also performed utilizing (process and model) randomization tests. We show that improved predictive QSAR models for aldehydic toxicity to Tetrahymena pyriformis can be generated using QTMS descriptors along with log K(o/w).
Affiliation(s)
- S Kar
- Drug Theoretics and Cheminformatics Lab, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
15
Roy K, Ghosh G. QSTR with extended topochemical atom (ETA) indices. 12. QSAR for the toxicity of diverse aromatic compounds to Tetrahymena pyriformis using chemometric tools. Chemosphere 2009; 77:999-1009. [PMID: 19709717 DOI: 10.1016/j.chemosphere.2009.07.072] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 05/05/2009] [Accepted: 07/30/2009] [Indexed: 05/28/2023]
Abstract
We have developed QSTR models for the toxicity of 384 diverse aromatic compounds to Tetrahymena pyriformis with recently introduced extended topochemical atom (ETA) indices and compared the ETA models with those derived from various non-ETA topological descriptors and also from a combined set of descriptors encompassing the ETA and non-ETA parameters. The data set was split into test (25% of the total data points) and training (remaining 75%) sets based on the K-means clustering technique. Different statistical analyses (factor analysis followed by multiple linear regression (FA-MLR), stepwise regression and partial least squares (PLS)) were performed with the training set compounds to develop QSTR models using the topological descriptors. All the developed models were cross-validated using the leave-one-out (LOO) technique. The best models were selected on the basis of predicted R² values for test set compounds. The best models (based on external validation) developed from different techniques came from the combined set of descriptors. The above results indicate that the use of ETA descriptors with non-ETA descriptors improved the statistical quality of the non-ETA models. From the best models involving ETA parameters, it is observed that functionality of halogen atoms (hydrophobicity), volume parameter (bulk) and nitrogen containing functionalities (polarity) are important for developing QSTR models for the current data set. This study suggests that ETA parameters have sufficient power to encode chemical information contributing significantly to the toxicity of diverse aromatic compounds to T. pyriformis.
Affiliation(s)
- Kunal Roy
- Drug Theoretics and Cheminformatics Lab, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
16
Xue C, Popelier PLA. Prediction of interaction energies of substituted hydrogen-bonded Watson-Crick cytosine:guanine(8X) base pairs. J Phys Chem B 2009; 113:3245-50. [PMID: 19260717 DOI: 10.1021/jp8071926] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
We investigated the variation in the interaction energy between the Watson-Crick hydrogen-bonded DNA base pairs guanine and cytosine (G(8X):C), where guanine is substituted in the C8 position by 37 different functional groups. Base pairs were optimized at the B3LYP/6-311+G(2d,p) level. A base pair complex containing a more strongly electron-withdrawing group remarkably forms a more stable base pair with C. Multivariate linear regression provided a quantitative relationship between the interaction energies and descriptors generated by the quantum chemical topology (QCT) approach. The descriptors were sampled from the monomers only, not the supermolecular base pair complexes. A model with r² = 0.96 and a root-mean-square (rms) value of 0.6 kJ/mol was obtained for a training set of 28 base pair complexes. The model was tested by an external test set of 9 complexes, yielding r² = 0.99 and an rms value of 0.2 kJ/mol. The results indicated that the bonds C6=O6 and N2-H2 at the hydrogen-bonded frontier of the guanine derivatives play an important role in transmitting the substituent effects. A linear correlation between substitution energies and Hammett constants (σm) was also obtained for all 37 substituents, yielding r² = 0.82 and an rms value of 1.2 kJ/mol. The model based on QCT descriptors can therefore be used for the prediction of the interaction energy of the base pair G(8X):C, strictly based on data for the G(8X) monomers only.
Affiliation(s)
- Chunxia Xue
- Manchester Interdisciplinary Biocentre (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain
17
Harding AP, Wedge DC, Popelier PLA. pKa Prediction from “Quantum Chemical Topology” Descriptors. J Chem Inf Model 2009; 49:1914-24. [DOI: 10.1021/ci900172h] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Affiliation(s)
- A. P. Harding
- Manchester Interdisciplinary Biocentre (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain, and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
- D. C. Wedge
- Manchester Interdisciplinary Biocentre (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain, and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
- P. L. A. Popelier
- Manchester Interdisciplinary Biocentre (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain, and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
18
Zvinavashe E, Murk AJ, Rietjens IMCM. Promises and pitfalls of quantitative structure-activity relationship approaches for predicting metabolism and toxicity. Chem Res Toxicol 2009; 21:2229-36. [PMID: 19548346 DOI: 10.1021/tx800252e] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 11/28/2022]
Abstract
The description of quantitative structure-activity relationship (QSAR) models has been a topic for scientific research for more than 40 years and a topic within the regulatory framework for more than 20 years. At present, efforts on QSAR development are increasing because of their promise for supporting reduction, refinement, and/or replacement of animal toxicity experiments. However, their acceptance in risk assessment seems to require a more standardized and scientific underpinning of QSAR technology to avoid possible pitfalls. For this reason, guidelines for QSAR model development recently proposed by the Organization for Economic Cooperation and Development (OECD) [Organization for Economic Cooperation and Development (OECD) (2007) Guidance document on the validation of (quantitative) structure-activity relationships [(Q)SAR] models. OECD Environment Health and Safety Publications: Series on Testing and Assessment No. 69, Paris] are expected to help increase the acceptability of QSAR models for regulatory purposes. The guidelines recommend that QSAR models should be associated with (i) a defined end point, (ii) an unambiguous algorithm, (iii) a defined domain of applicability, (iv) appropriate measures of goodness-of-fit, robustness, and predictivity, and (v) a mechanistic interpretation, if possible [Organization for Economic Cooperation and Development (OECD) (2007) Guidance document on the validation of (quantitative) structure-activity relationships [(Q)SAR] models]. The present perspective provides an overview of these guidelines for QSAR model development and their rationale, as well as the promises and pitfalls of using QSAR approaches and these guidelines for predicting metabolism and toxicity of new and existing chemicals.
Affiliation(s)
- Elton Zvinavashe
- Division of Toxicology, Wageningen University, Tuinlaan 5, 6703 HE Wageningen, The Netherlands
19
Roy K, Popelier PLA. Predictive QSPR modeling of the acidic dissociation constant (pKa) of phenols in different solvents. J Phys Org Chem 2009. [DOI: 10.1002/poc.1447] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 11/11/2022]
20
Roy K, Popelier P. Exploring Predictive QSAR Models Using Quantum Topological Molecular Similarity (QTMS) Descriptors for Toxicity of Nitroaromatics to Saccharomyces cerevisiae. QSAR & Combinatorial Science 2008. [DOI: 10.1002/qsar.200810028] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]