1
Aakash A, Kulsoom R, Khan S, Siddiqui MS, Nabi D. Novel Models for Accurate Estimation of Air-Blood Partitioning: Applications to Individual Compounds and Complex Mixtures of Neutral Organic Compounds. J Chem Inf Model 2023; 63:7056-7066. [PMID: 37956246 PMCID: PMC10685450 DOI: 10.1021/acs.jcim.3c01288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/23/2023] [Accepted: 10/25/2023] [Indexed: 11/15/2023]
Abstract
The air-blood partition coefficient (Kab) is extensively employed in human health risk assessment for chemical exposure. However, current Kab estimation approaches either require an extensive number of parameters or lack precision. In this study, we present two novel and parsimonious models to accurately estimate Kab values for individual neutral organic compounds, as well as their complex mixtures. The first model, termed the GC×GC model, was developed based on the retention times of nonpolar chemical analytes on comprehensive two-dimensional gas chromatography (GC×GC). This model is unique in its ability to estimate the Kab values for complex mixtures of nonpolar organic chemicals. The GC×GC model successfully accounted for the Kab variance (R² = 0.97) and demonstrated strong prediction power (RMSE = 0.31 log unit) for an independent set of nonpolar chemical analytes. Overall, the GC×GC model can be used to estimate Kab values for complex mixtures of neutral organic compounds. The second model, termed the partition model (PM), is based on two types of partition coefficients: octanol to water (Kow) and air to water (Kaw). The PM was able to effectively account for the variability in Kab data (n = 344), yielding an R² value of 0.93 and root-mean-square error (RMSE) of 0.34 log unit. The predictive power and explanatory performance of the PM were found to be comparable to those of the parameter-intensive Abraham solvation models (ASMs). Additionally, the PM can be integrated into the software EPI Suite, which is widely used in chemical risk assessment for initial screening. The PM provides quick and reliable estimation of Kab compared to ASMs, while the GC×GC model is uniquely suited for estimating Kab values for complex mixtures of neutral organic compounds. In summary, our study introduces two novel and parsimonious models for the accurate estimation of Kab values for both individual compounds and complex mixtures.
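The abstract above reports model quality as a coefficient of determination and an RMSE in log units. As a minimal sketch of how such figures are commonly computed, the block below evaluates both on log-scale values. The data are invented for illustration and are not taken from the paper, the function names are hypothetical, and the definition R² = 1 - SS_res/SS_tot is an assumption, since the abstract does not say which convention the authors used.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the same (log) units as the inputs."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical log10 Kab values, invented for illustration only.
observed_log_kab = [0.5, 1.0, 1.5, 2.0, 2.5]
predicted_log_kab = [0.6, 0.9, 1.7, 1.9, 2.4]

print("R^2 :", round(r_squared(observed_log_kab, predicted_log_kab), 3))
print("RMSE:", round(rmse(observed_log_kab, predicted_log_kab), 3), "log unit")
```

Because both quantities are computed on log-transformed coefficients, an RMSE of about 0.3 log unit corresponds to a typical error of roughly a factor of two in Kab itself.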
Affiliation(s)
- Ahmad Aakash
- Institute of Environmental Science and Engineering (IESE), School of Civil and Environmental Engineering (SCEE), National University of Sciences and Technology (NUST), H-12, 48000 Islamabad, Pakistan
- Ramsha Kulsoom
- Institute of Environmental Science and Engineering (IESE), School of Civil and Environmental Engineering (SCEE), National University of Sciences and Technology (NUST), H-12, 48000 Islamabad, Pakistan
- Saba Khan
- Institute of Environmental Science and Engineering (IESE), School of Civil and Environmental Engineering (SCEE), National University of Sciences and Technology (NUST), H-12, 48000 Islamabad, Pakistan
- Musab Saeed Siddiqui
- Institute of Environmental Science and Engineering (IESE), School of Civil and Environmental Engineering (SCEE), National University of Sciences and Technology (NUST), H-12, 48000 Islamabad, Pakistan
- Deedar Nabi
- Institute of Environmental Science and Engineering (IESE), School of Civil and Environmental Engineering (SCEE), National University of Sciences and Technology (NUST), H-12, 48000 Islamabad, Pakistan
- GEOMAR Helmholtz Center for Ocean Research, Wischhofstrasse 1-3, 24148 Kiel, Germany
2
Jia C, Yu X, Masiak W. Blood/air distribution of volatile organic compounds (VOCs) in a nationally representative sample. The Science of the Total Environment 2012; 419:225-232. [PMID: 22285084 DOI: 10.1016/j.scitotenv.2011.12.055] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 07/08/2011] [Revised: 12/22/2011] [Accepted: 12/22/2011] [Indexed: 05/31/2023]
Abstract
Volatile organic compounds (VOCs) in human blood are an effective biomarker of environmental exposure and are closely linked to health outcomes. Unlike VOC concentrations in air, which are routinely collected, blood VOC data are not as readily available. This study aims to develop the quantitative relationship between air and blood VOCs by deriving population-based blood/air distribution coefficients (popKs) of ten common VOCs in the general U.S. population. Air and human blood samples were collected from 364 adults aged 20-59 years in the 1999-2000 National Health and Nutrition Examination Survey (NHANES). Determinants of popKs were identified using weighted multivariate regression models. In the non-smoking population, median popKs ranged from 3.1 to 77.3, comparable to values obtained in the laboratory. PopKs decreased with increasing airborne VOC concentrations. Smoking elevated popKs by 1.5-3.5 times for aromatic compounds, but did not affect the popKs for methyl tert-butyl ether (MTBE) or chlorinated compounds. Drinking water concentration was a modifier of MTBE's popK. Neither age, gender, body composition, nor ethnicity affected popKs. PopKs were predictable using linear models with air concentration as the independent variable for both adults and children. This is the first study to estimate blood/air distribution coefficients using simultaneous environmental and biological monitoring on a national population sample. This study was also the first to determine the blood/air distribution coefficient of p-dichlorobenzene, a compound frequently found in indoor environments. These results have applications in exposure assessment, pharmacokinetic analysis, physiologically-based pharmacokinetic (PBPK) modeling, and uncertainty analysis.
Affiliation(s)
- Chunrong Jia
- School of Public Health, University of Memphis, Memphis, TN 38152, USA.
3
Buist HE, Wit-Bos LD, Bouwman T, Vaes WH. Predicting blood:air partition coefficients using basic physicochemical properties. Regul Toxicol Pharmacol 2012; 62:23-8. [DOI: 10.1016/j.yrtph.2011.11.019] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Received: 06/25/2011] [Revised: 10/25/2011] [Accepted: 11/30/2011] [Indexed: 11/24/2022]
4
Basak SC, Mills D, Hawkins DM. Characterization of Dihydrofolate Reductases from Multiple Strains of Plasmodium falciparum Using Mathematical Descriptors of Their Inhibitors. Chem Biodivers 2011; 8:440-53. [DOI: 10.1002/cbdv.201000111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/15/2022]
5
Peyret T, Krishnan K. QSARs for PBPK modelling of environmental contaminants. SAR and QSAR in Environmental Research 2011; 22:129-169. [PMID: 21391145 DOI: 10.1080/1062936x.2010.548351] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 05/30/2023]
Abstract
Physiologically-based pharmacokinetic (PBPK) models are increasingly finding use in risk assessment applications of data-rich compounds. However, it is a challenge to determine the chemical-specific parameters for these models, particularly in time- and resource-limiting situations. In this regard, SARs, QSARs and QPPRs are potentially useful for computing the chemical-specific input parameters of PBPK models. Based on the frequency of occurrence of molecular fragments (CH₃, CH₂, CH, C, C=C, H, benzene ring and H in benzene ring structure) and exposure conditions, the available QSAR-PBPK models facilitate the simulation of tissue and blood concentrations for some inhaled volatile organic chemicals. The application domain of existing QSARs for developing PBPK models is limited, due to lack of relevant data for diverse chemicals and mechanisms. Even though this approach is conceptually applicable to non-volatile and high molecular weight organics as well, it is more challenging to predict the other PBPK model parameters required for modelling the kinetics of these chemicals (particularly tissue diffusion coefficients, association constants for binding and oral absorption rates). As the level of our understanding of the mechanistic basis of toxicokinetic processes improves, QSARs to provide a priori predictions of key chemical-specific PBPK parameters can be developed to expedite the internal dose-based health risk assessments in data-poor situations.
Affiliation(s)
- T Peyret
- Département de santé environnementale et santé au travail, Université de Montréal, Montréal, Canada
6
Basak SC. Role of mathematical chemodescriptors and proteomics-based biodescriptors in drug discovery. Drug Dev Res 2010. [DOI: 10.1002/ddr.20428] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
7
Basak SC, Mills D. Quantitative structure-activity relationships for cycloguanil analogs as PfDHFR inhibitors using mathematical molecular descriptors. SAR and QSAR in Environmental Research 2010; 21:215-229. [PMID: 20544548 DOI: 10.1080/10629361003770951] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/29/2023]
Abstract
Computed molecular descriptors were used to develop quantitative structure-activity relationships (QSARs) for binding affinities (Ki) for a set of 58 cycloguanil (2,4-diamino-1,6-dihydro-1,3,5-triazine) analogues for dihydrofolate reductase (DHFR) enzyme extracted from the wild type and the A16V+S108T mutant type (a double mutation) of the malaria parasite Plasmodium falciparum (Pf). High-quality models were obtained in both cases. The results of statistical analyses show that ridge regression (RR) outperformed the two other modelling methods, principal component regression (PCR) and partial least squares (PLS). For both enzymes, recognition of the inhibitors was based on four broad categories of descriptors encoding information on: (1) the electronic character of the various atoms in the molecule, (2) the size and shape of the structure, (3) the degree of branching in the molecular skeleton, and (4) two to five atom molecular fragments with aliphatic carbon at one end and aliphatic or aromatic carbon or nitrogen at the other end. The subsets of influential descriptors underlying the QSARs for the wild versus the mutant DHFR are largely non-overlapping. This indicates that the two enzymes recognize the inhibitor molecules on the basis of mutually distinct structural attributes. Such differential QSARs can be useful in the design of novel drugs active against malaria parasites that are growing resistant to existing chemotherapeutic agents.
Affiliation(s)
- S C Basak
- Center for Water and the Environment, Natural Resources Research Institute, University of Minnesota Duluth, Duluth, USA.
8
Basak S, Mills D, Hawkins D, Kraker J. Quantitative Structure-Activity Relationship (QSAR) Modeling of Human Blood:Air Partitioning with Proper Statistical Methods and Validation. Chem Biodivers 2009; 6:487-502. [DOI: 10.1002/cbdv.200800111] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 01/13/2023]
9
Basak SC, Mills D. Predicting the vapour pressure of chemicals from structure: a comparison of graph theoretic versus quantum chemical descriptors. SAR and QSAR in Environmental Research 2009; 20:119-132. [PMID: 19343587 DOI: 10.1080/10629360902726007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/27/2023]
Abstract
In this paper a set of graph theoretic molecular descriptors was used to predict the normal vapour pressure of a collection of 121 chlorinated organic chemicals. The easily calculated topological descriptors resulted in a robust quantitative structure-property relationship (QSPR) model with q² of 0.988, which is comparable to a previously published model developed using the computationally expensive density functional theory (DFT) method at the B3LYP level (Becke three-parameter exchange, Lee-Yang-Parr correlation). The addition of computer-intensive quantum chemical descriptors, including polarizability, to the set of topological descriptors did not improve the predictive ability of the model.
Affiliation(s)
- S C Basak
- University of Minnesota Duluth, Natural Resources Research Institute, Center for Water and the Environment, Duluth, MN 55811, USA.
10
Basak SC, Mills D, Hawkins DM. Predicting allergic contact dermatitis: a hierarchical structure-activity relationship (SAR) approach to chemical classification using topological and quantum chemical descriptors. J Comput Aided Mol Des 2008; 22:339-43. [PMID: 18338224 DOI: 10.1007/s10822-008-9202-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 09/25/2007] [Accepted: 02/20/2008] [Indexed: 11/27/2022]
Abstract
A hierarchical classification study was carried out based on a set of 70 chemicals: 35 that produce allergic contact dermatitis (ACD) and 35 that do not. This approach was implemented using a regular ridge regression computer code, followed by conversion of regression output to binary data values. The hierarchical descriptor classes used in the modeling include topostructural (TS), topochemical (TC), and quantum chemical (QC), all of which are based solely on chemical structure. The concordance, sensitivity, and specificity are reported. The model based on the TC descriptors was found to be the best, while the TS model was extremely poor.
Affiliation(s)
- Subhash C Basak
- Natural Resources Research Institute, Center for Water and Environment, University of Minnesota, Duluth, 5013 Miller Trunk Hwy, Duluth, MN, 55811, USA.
11
Nandi S, Vracko M, Bagchi MC. Anticancer activity of selected phenolic compounds: QSAR studies using ridge regression and neural networks. Chem Biol Drug Des 2008; 70:424-36. [PMID: 17949360 DOI: 10.1111/j.1747-0285.2007.00575.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022]
Abstract
Phenol and its congeners are known to induce caspase-mediated apoptosis activity and cytotoxicity on various cancer cell lines. Apoptosis, scavenging of radicals, antioxidant, and pro-oxidant characteristics are primarily responsible for the antitumor activities of phenolic compounds. Quantitative structure-activity relationship studies on the cellular apoptosis and cytotoxicity of phenolic compounds have been investigated recently by Selassie and colleagues (J Med Chem; 48:7234, 2005) wherein models were developed for various carcinogenic cell lines. These quantitative structure-activity relationship models are based on a few experimentally obtained physicochemical parameters such as Verloop's sterimol descriptor, hydrophobicity, Hammett electronic parameter, and octanol/water partition coefficient. The paper deals with structure-activity relationships of phenols and their derivatives for the development of predictive models from the standpoint of theoretical structural parameters and ridge regression methodology. The quantitative structure-activity relationship studies developed here for the caspase-mediated apoptosis activity and cytotoxicity on murine leukemia cell line (L1210), human promyelocytic cell line (HL-60), human breast cancer cell line (MCF-7), parental human acute lymphoblastic cells (CCRF-CEM), and the multidrug-resistant subline of CCRF-CEM resistant to vinblastine (CEM/VLB) utilize physicochemical molecular descriptors calculated solely from the structure of phenolic compounds under investigation along with the descriptors used by Selassie and group. It is seen that such quantitative structure-activity relationships can provide a better quality predictive model for the phenolic compounds. The biological activities of the nine sets of phenolic compounds have been calculated based on ridge regression analysis that clearly gives a more significant correlation compared to the activities predicted by Selassie and co-workers. 
Counter-propagation artificial neural network studies have been introduced in the present investigation for a better understanding of multidimensional rational patterns in more complex data sets. The counter-propagation artificial neural network studies were performed on the same data set and with the same descriptors as have been carried out in developing ridge regression models and the result of counter-propagation neural network models produces very interesting findings in terms of leave-one-out test. Finally, an attempt has been made for a comparative study of the relative effectiveness of linear statistical methods versus nonlinear techniques, such as counter-propagation neural networks in modeling structure-activity studies of the phenolic compounds.
Affiliation(s)
- Sisir Nandi
- Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, 4 Raja S.C. Mullick Road, Jadavpur, Calcutta, India
12
13
Basak SC, Mills D, Mumtaz MM. A quantitative structure-activity relationship (QSAR) study of dermal absorption using theoretical molecular descriptors. SAR and QSAR in Environmental Research 2007; 18:45-55. [PMID: 17365958 DOI: 10.1080/10629360601033671] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Indexed: 05/14/2023]
Abstract
Quantitative structure-activity relationship (QSAR) models were developed for the prediction of dermal absorption based on experimental log Kp data for a diverse set of 101 chemicals obtained from the literature. Molecular descriptors including topostructural (TS), topochemical (TC), shape or three-dimensional (3D) and quantum chemical (QC) indices were calculated. Based on this information, a generic predictive model was created using the diverse set of 101 compounds. In addition, two submodels were prepared for subsets of 79 cyclic and 22 acyclic chemicals. A modified Gram-Schmidt variable reduction algorithm for descriptor thinning was followed by regression analyses using ridge regression (RR), principal components regression (PCR) and partial least squares regression (PLS). The RR results were found to be superior to PLS and PCR regressions. The cross-validated correlation coefficients for the full set and subsets were 0.67-0.87. Computational methods such as QSAR modelling can be used to augment existing data to prioritise chemicals that need to be studied further for toxicological evaluation and risk assessment.
Affiliation(s)
- S C Basak
- University of Minnesota Duluth, Natural Resources Research Institute, 5013 Miller Trunk Hwy, Duluth, MN 55811, USA.
14
Basak SC, Mills D, Gute BD. Prediction of tissue:air partition coefficients: theoretical vs. experimental methods. SAR and QSAR in Environmental Research 2006; 17:515-32. [PMID: 17050189 DOI: 10.1080/10629360600934093] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/12/2023]
Abstract
Predictive QSAR models for rat and human tissue:air partition coefficients, namely blood:air, fat:air, brain:air, liver:air, muscle:air, and kidney:air, were developed utilizing experimentally determined partition coefficients for 131 chemicals obtained from the literature and molecular descriptors based solely on chemical structure. The descriptors were partitioned into four hierarchical classes, including topostructural, topochemical, 3-dimensional, and ab initio quantum chemical. Three types of regression methodologies (ridge regression, principal components regression, and partial least squares regression) were used comparatively in the development of the structure-based models. In addition to the structure-based models, ordinary least squares regression was used to develop comparative models based on experimentally determined properties including saline:air and olive oil:air partition coefficients. The results of the study indicate that many of the structure-based models are comparable or superior to their respective property-based models. This is an important result considering that structural descriptors can be calculated quickly and inexpensively for both existing chemicals and those not yet synthesized. It was also found that ridge regression outperformed principal components regression and partial least squares regression, with respect to the structure-based models, and that generally the topochemical descriptors alone produced models of good predictive ability.
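Ridge regression recurs throughout the entries in this list. As a minimal sketch of what the method does, the block below works through the one-predictor special case, where the penalized slope has the closed form Sxy / (Sxx + λ) and the intercept is left unpenalized. The data, the function name `ridge_slope`, and the penalty values are all hypothetical. The studies listed here fit many descriptors at once, which this sketch does not attempt to reproduce.

```python
def ridge_slope(x, y, lam):
    """Ridge slope for a single predictor with an unpenalized intercept.

    Minimizes sum((y - a - b*x)^2) + lam * b^2, which gives
    b = Sxy / (Sxx + lam) on mean-centered data.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / (sxx + lam)


# Hypothetical descriptor values and responses, invented for illustration.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.0, 4.0, 6.0, 8.0, 10.0]

for penalty in (0.0, 1.0, 10.0):
    print(penalty, ridge_slope(descriptor, response, penalty))
```

With a zero penalty the estimate equals the ordinary least squares slope, and larger penalties shrink it toward zero. That shrinkage is what stabilizes the fit when descriptors are numerous or collinear.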
Affiliation(s)
- S C Basak
- Natural Resources Research Institute, University of Minnesota Duluth, 5013 Miller Trunk Hwy, Duluth, MN 55811, USA.
15
Bagchi MC, Mills D, Basak SC. Quantitative structure-activity relationship (QSAR) studies of quinolone antibacterials against M. fortuitum and M. smegmatis using theoretical molecular descriptors. J Mol Model 2006; 13:111-20. [PMID: 16932890 DOI: 10.1007/s00894-006-0133-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 02/15/2006] [Accepted: 06/28/2006] [Indexed: 11/30/2022]
Abstract
The incidence of tuberculosis infections that are resistant to conventional drug therapy has risen steadily in the last decade. Several of the quinolone antibacterials have been examined as inhibitors of M. tuberculosis infection as well as other mycobacterial infections. However, not much has been done to examine specific structure-activity relationships of the quinolone antibacterials against mycobacteria. The present paper describes quantitative structure-activity relationship modeling for a series of antimycobacterial compounds. Most of the antimycobacterial compounds do not have sufficient physicochemical data, and thus predictive methods based on experimental data are of limited use in this situation. Hence, there is a need for the development of quantitative structure-activity relationship (QSAR) models utilizing theoretical molecular descriptors that can be calculated directly from molecular structures. Descriptors associated with chemical structures of N-1 and C-7 substituted quinolone derivatives as well as 8-substituted quinolone derivatives with good antimycobacterial activities against M. fortuitum and M. smegmatis have been evaluated. Ridge regression (RR), principal component regression (PCR), and partial least squares (PLS) regression were used, comparatively, to develop predictive models for antibacterial activity, based on the activities of the above compounds. The independent variables include topostructural, topochemical and 3-D geometrical indices, which were used in a hierarchical fashion in the model-development process. The predictive ability of the models was assessed by the cross-validated R². Comparison of the relative effectiveness of the various classes of molecular descriptors in the regression models shows that the easily calculable topological indices explain most of the variance in the data.
Affiliation(s)
- Manish C Bagchi
- Drug Design Development and Molecular Modelling Division, Indian Institute of Chemical Biology, 4 Raja S.C. Mullick Road, Calcutta, 700032, Jadavpur, India.
16
Ghosh P, Thanadath M, Bagchi MC. On an aspect of calculated molecular descriptors in QSAR studies of quinolone antibacterials. Mol Divers 2006; 10:415-27. [PMID: 16896544 DOI: 10.1007/s11030-006-9018-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 10/25/2005] [Accepted: 01/18/2006] [Indexed: 10/24/2022]
Abstract
The re-emergence of tuberculosis infections, which are resistant to conventional drug therapy, has steadily risen in the last decade and as a result of that, fluoroquinolone drugs are being used as the second line of action. But there is hardly any study to examine specific structure activity relationships of quinolone antibacterials against mycobacteria. In this paper, an attempt has been made to establish a quantitative structure activity relationship modeling for a series of quinolone compounds against Mycobacterium fortuitum and Mycobacterium smegmatis. Due to lack of sufficient physicochemical data for the anti-mycobacterial compounds, it becomes very difficult to develop predictive methods based on experimental data. The present paper is an effort for the development of QSARs from the standpoint of physicochemical, constitutional, geometrical, electrostatic and topological indices. Molecular descriptors have been calculated solely from the chemical structure of N-1, C-7 and 8 substituted quinolone compounds and ridge regression models have been developed which can explain a better structure-activity relationship. Consideration of an intermolecular similarity analysis approach that led to a successful computer program development in PERL language has been used for comparing the influence of various molecular descriptors in different data subsets. The comparison of relative effectiveness of the calculated descriptors in our ridge regression model gives rise to some interesting results.
Affiliation(s)
- Payel Ghosh
- Drug Design, Development and Molecular Modelling Division, Indian Institute of Chemical Biology, Jadavpur, Calcutta, India
17
Basak SC, Natarajan R, Mills D, Hawkins DM, Kraker JJ. Quantitative structure-activity relationship modeling of insect juvenile hormone activity of 2,4-dienoates using computed molecular descriptors. SAR and QSAR in Environmental Research 2005; 16:581-606. [PMID: 16428133 DOI: 10.1080/10659360500468526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/06/2023]
Abstract
Juvenile hormone (JH) activity of one hundred and eighty 2,4-dienoates reported for the larvae/pupae of six insect species was modeled using 915 atom pairs and 258 global molecular descriptors (topological and geometrical). Ridge regression, principal component regression and partial least square regression methods were used to model each of the JH activities. The use of all of the available parameters did not yield any good models, and extensive predictor trimming was necessary to improve the models. Ridge regression was found to give the best results among the three statistical tools used. The top ten molecular descriptors selected based on the t-statistic for each of the six models were found to be mostly atom pairs containing heteroatoms and topochemical descriptors. This suggests the importance of the chemical nature of the ligand rather than mere space-filling as the basis of the JH bioactivity. The residual plots indicate the existence of some non-linear relations, and recursive partitioning was used to capture any nonlinear relation between the bioassays and the molecular descriptors.
Affiliation(s)
- S C Basak
- Natural Resources Research Institute, Center for Water and Environment, University of Minnesota Duluth, 5013 Miller Trunk Hwy, Duluth, MN 55811, USA.
18
Katritzky AR, Kuanar M, Fara DC, Karelson M, Acree WE, Solov'ev VP, Varnek A. QSAR modeling of blood:air and tissue:air partition coefficients using theoretical descriptors. Bioorg Med Chem 2005; 13:6450-63. [PMID: 16202613 DOI: 10.1016/j.bmc.2005.06.066] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 04/27/2005] [Revised: 06/29/2005] [Accepted: 06/30/2005] [Indexed: 11/21/2022]
Abstract
Human blood:air, human and rat tissue (fat, brain, liver, muscle, and kidney):air partition coefficients of a diverse set of organic compounds were correlated and predicted using structural descriptors by employing CODESSA-PRO and ISIDA programs. Four and five descriptor regression models developed using CODESSA-PRO were validated on three different test sets. Overall, these models have reasonable values of correlation coefficients (R²) and leave-one-out correlation coefficients (R²cv): R² = 0.881-0.983; R²cv = 0.826-0.962. Calculations with ISIDA resulted in models based on atom/bond sequences involving two to three atoms with statistical parameters that were similar to those of models obtained with CODESSA-PRO (R² = 0.911-0.974; R²cv = 0.831-0.936). A mixed pool of molecular and fragment descriptors did not lead to significant improvement of the models.
Affiliation(s)
- Alan R Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, 32611, USA.
19
Basak SC, Natarajan R, Mills D, Hawkins DM, Kraker JJ. Quantitative Structure−Activity Relationship Modeling of Juvenile Hormone Mimetic Compounds for Culex Pipiens Larvae, with a Discussion of Descriptor-Thinning Methods. J Chem Inf Model 2005; 46:65-77. [PMID: 16426041 DOI: 10.1021/ci050215y] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 01/13/2023]
Abstract
Quantitative structure-activity relationship (QSAR) modelers often encounter the problem of multicollinearity owing to the availability of large numbers of computable molecular descriptors. Sparsity of the variables while using descriptors such as atom pairs increases the complexity. Three different predictor-thinning methods, namely, a modified Gram-Schmidt algorithm, a marginal soft thresholding algorithm, and LASSO (least absolute shrinkage and selection operator), were utilized to reduce the number of descriptors prior to developing linear models. Juvenile hormone (JH) activity of 304 compounds on Culex pipiens larvae was taken as the model data set, and predictor trimming of a large number of diverse descriptors comprising 268 global molecular descriptors (topostructural, topochemical, and geometrical), 13 quantum chemical descriptors, and 915 atom pairs (substructural counts) was applied prior to linear regression by the ridge regression method. The data set (N = 304) was split into five calibration data sets of random samples of sizes 60/110/160/210/260, and the remaining 244/194/144/94/44 compounds were used for validations. LASSO was not found to be a very effective method in handling a large set of descriptors because the number of predictors retained could not exceed the number of observations. The results indicated that the modified Gram-Schmidt algorithm could be used to trim the number of predictors in the global molecular descriptor set where collinearity of the descriptors was the major concern. On the contrary, the soft thresholding approach was found to be an effective tool in subset selection from a diverse set of descriptors having both sparsity and multicollinearity, as in the case of the combined set of atom pairs and global molecular descriptors. The final model developed after variable selection was dominated more by atom pairs, which indicated the important structural moieties that affect JH activity of the compounds. 
The success of the method reiterates the fact that QSAR or quantitative structure-property relationship (QSPR) models can be developed for a diverse set of compounds using properly parametrized and diverse sets of descriptors, of course, with the selection of the appropriate statistical tools.
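The abstract above describes splitting the 304-compound data set into calibration samples of five sizes, with the remaining compounds held out for validation. The sketch below reproduces only that bookkeeping. The seeds, the function name `split_indices`, and the choice to draw each sample independently are assumptions, since the abstract says only that the calibration sets were random samples.

```python
import random

N_COMPOUNDS = 304                           # data set size from the abstract
CALIBRATION_SIZES = [60, 110, 160, 210, 260]  # calibration sizes from the abstract


def split_indices(n_total, n_calibration, seed):
    """Randomly pick calibration indices; everything else is validation."""
    rng = random.Random(seed)  # the seed is an arbitrary choice for repeatability
    calibration = sorted(rng.sample(range(n_total), n_calibration))
    calibration_set = set(calibration)
    validation = [i for i in range(n_total) if i not in calibration_set]
    return calibration, validation


# One independent random split per calibration size (an assumption).
splits = {
    size: split_indices(N_COMPOUNDS, size, seed=size)
    for size in CALIBRATION_SIZES
}

for size, (calibration, validation) in splits.items():
    print(f"calibration={len(calibration)} validation={len(validation)}")
```

The validation counts that result are the complements of the calibration sizes, matching the 244/194/144/94/44 figures quoted in the abstract.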
Affiliation(s)
- Subhash C Basak
- Natural Resources Research Institute, Center for Water and Environment, University of Minnesota-Duluth, 55811, USA.