1
|
Kotli M, Piir G, Maran U. Pesticide effect on earthworm lethality via interpretable machine learning. JOURNAL OF HAZARDOUS MATERIALS 2024; 461:132577. [PMID: 37793249 DOI: 10.1016/j.jhazmat.2023.132577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 09/15/2023] [Accepted: 09/16/2023] [Indexed: 10/06/2023]
Abstract
Earthworms are among the most important animals (invertebrates) for soil health. Many chemical substances released into nature for agricultural development, such as pesticides, may have unwanted effects on those organisms. However, it is essential to understand the extent of the impact of chemicals on soil health first and then make the proper decisions for regulatory or commercial purposes. We hypothesize that there is an expressible quantitative structure-activity relationship (QSAR) between the structure of pesticide compounds and their acute toxicity to the earthworm species Eisenia fetida. The description of this relationship allows for a better assessment of the impact of chemicals on the said earthworm. To describe this relationship, a dataset of chemicals was collected from open-access sources to develop a mathematical model. A novel approach, combining a genetic algorithm and Bayesian optimization, was used to select structural features into the model and to optimize model parameters. The final QSAR classification model was created with the Random Forest algorithm and exhibited good prediction accuracy of 0.78 on the training set and 0.80 on the test set. The model representation follows FAIR principles and is available on QsarDB.org.
Affiliation(s)
- Mihkel Kotli: University of Tartu, Institute of Chemistry, Tartu, Estonia
- Geven Piir: University of Tartu, Institute of Chemistry, Tartu, Estonia
- Uko Maran: University of Tartu, Institute of Chemistry, Tartu, Estonia
|
2
|
Yu X. Global classification models for predicting acute toxicity of chemicals towards Daphnia magna. ENVIRONMENTAL RESEARCH 2023; 238:117239. [PMID: 37778597 DOI: 10.1016/j.envres.2023.117239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 09/10/2023] [Accepted: 09/18/2023] [Indexed: 10/03/2023]
Abstract
Molecular descriptors reflecting structural information on hydrophobicity, reactivity, polarizability, hydrogen bonding and charged groups were used to predict the toxicity (pLC50) of chemicals towards Daphnia magna with global quantitative structure-activity/toxicity relationship (QSAR/QSTR) models. A sufficiently large dataset of 1517 chemical toxicity values for Daphnia magna was divided into a training set (758 pLC50) and a test set (759 pLC50). By applying the random forest algorithm, two classification models, Class Model A and Class Model B, were developed, having prediction accuracy, sensitivity and specificity above 85% for Class 1 (with pLC50 ≤ 4.48) and Class 2 (with pLC50 > 4.48). Class Model A was based on nine molecular descriptors and RF parameters of nodesize = 1, ntree = 80 and mtry = 2, and yielded accuracy of 92.3% (training set), 85.6% (test set) and 88.9% (total data set). Class Model B was based on ten descriptors and parameters of nodesize = 1, ntree = 90 and mtry = 2, and produced accuracy of 88.3% (training set), 86.8% (test set) and 87.5% (total data set). The two classification models were satisfactory compared with other classification models reported in the literature, although the classification models in this work dealt with more samples. Thus, the two classification models with a larger applicability domain provided efficient tools for assessing chemical aquatic toxicity towards Daphnia magna.
Affiliation(s)
- Xinliang Yu: Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan, 411104, China
|
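The two-class scheme described in the abstract of entry 2 (classes split at pLC50 = 4.48 and judged by accuracy, sensitivity and specificity) can be illustrated with a minimal Python sketch. The toy labels, the function names and the choice of Class 2 as the "positive" class are assumptions made for illustration only; they are not taken from the paper, and no random forest is trained here.

```python
THRESHOLD = 4.48  # class boundary quoted in the abstract


def to_class(plc50):
    """Return 1 if pLC50 <= 4.48, otherwise 2 (class definition from the abstract)."""
    return 1 if plc50 <= THRESHOLD else 2


def _ratio(numerator, denominator):
    """Safe division: NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(observed, predicted, positive=2):
    """Accuracy, sensitivity and specificity for two-class labels.

    Treating Class 2 as the positive class is an assumption of this sketch.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == positive and p == positive)
    tn = sum(1 for o, p in pairs if o != positive and p != positive)
    fp = sum(1 for o, p in pairs if o != positive and p == positive)
    fn = sum(1 for o, p in pairs if o == positive and p != positive)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
    }


# Hypothetical experimental pLC50 values and hypothetical predicted classes.
observed_classes = [to_class(v) for v in [3.2, 4.48, 5.1, 6.0, 4.0, 4.9]]
predicted_classes = [1, 2, 2, 2, 1, 1]
print(binary_metrics(observed_classes, predicted_classes))
```

With these toy values, four of the six compounds are classified correctly, and sensitivity and specificity are each two out of three.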
3
|
Chakravarti S. Augmenting Expert Knowledge-Based Toxicity Alerts by Statistically Mined Molecular Fragments. Chem Res Toxicol 2023. [PMID: 37207298 DOI: 10.1021/acs.chemrestox.2c00368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Structural alerts are molecular substructures assumed to be associated with molecular initiating events in various toxic effects and an integral part of in silico toxicology. However, alerts derived using the knowledge of human experts often suffer from a lack of predictivity, specificity, and satisfactory coverage. In this work, we present a method to build hybrid QSAR models by combining expert knowledge-based alerts and statistically mined molecular fragments. Our objective was to find out if the combination is better than the individual systems. Lasso regularization-based variable selection was applied on combined sets of knowledge-based alerts and molecular fragments, but the variable elimination was only allowed to happen on the molecular fragments. We tested the concept on three toxicity end points, i.e., skin sensitization, acute Daphnia toxicity, and Ames mutagenicity, which covered both classification and regression problems. Results showed the predictive performance of such hybrid models is, indeed, better than the models based solely on expert alerts or statistically mined fragments alone. The method also enables the discovery of activating and mitigating/deactivating features for toxicity alerts and the identification of new alerts, thereby reducing false positive and false negative outcomes commonly associated with generic alerts and alerts with poor coverage, respectively.
Affiliation(s)
- Suman Chakravarti: MultiCASE Inc., 23811 Chagrin Blvd, Suite 305, Beachwood, Ohio 44122, United States
|
4
|
Toots KM, Sild S, Leis J, Acree WE, Maran U. Machine Learning Quantitative Structure–Property Relationships as a Function of Ionic Liquid Cations for the Gas-Ionic Liquid Partition Coefficient of Hydrocarbons. Int J Mol Sci 2022; 23:ijms23147534. [PMID: 35886881 PMCID: PMC9323540 DOI: 10.3390/ijms23147534] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Revised: 06/27/2022] [Accepted: 06/30/2022] [Indexed: 02/01/2023]
Abstract
Ionic liquids (ILs) are known for their unique characteristics as solvents and electrolytes. Therefore, new ILs are being developed and adapted as innovative chemical environments for different applications in which their properties need to be understood on a molecular level. Computational data-driven methods provide means for understanding of properties at molecular level, and quantitative structure–property relationships (QSPRs) provide the framework for this. This framework is commonly used to study the properties of molecules in ILs as an environment. The opposite situation where the property is considered as a function of the ionic liquid does not exist. The aim of the present study was to supplement this perspective with new knowledge and to develop QSPRs that would allow the understanding of molecular interactions in ionic liquids based on the structure of the cationic moiety. A wide range of applications in electrochemistry, separation and extraction chemistry depends on the partitioning of solutes between the ionic liquid and the surrounding environment that is characterized by the gas-ionic liquid partition coefficient. To model this property as a function of the structure of a cationic counterpart, a series of ionic liquids was selected with a common bis-(trifluoromethylsulfonyl)-imide anion, [Tf2N]−, for benzene, hexane and cyclohexane. MLR, SVR and GPR machine learning approaches were used to derive data-driven models and their performance was compared. The cross-validation coefficients of determination in the range 0.71–0.93 along with other performance statistics indicated a strong accuracy of models for all data series and machine learning methods. The analysis and interpretation of descriptors revealed that generally higher lipophilicity and dispersion interaction capability, and lower polarity in the cations induces a higher partition coefficient for benzene, hexane, cyclohexane and hydrocarbons in general. 
The applicability domain analysis of models concluded that there were no highly influential outliers and the models are applicable to a wide selection of cation families with variable size, polarity and aliphatic or aromatic nature.
Affiliation(s)
- Karl Marti Toots: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- Sulev Sild: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- Jaan Leis: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- William E. Acree: Department of Chemistry, University of North Texas, 1155 Union Circle Drive #305070, Denton, TX 76203, USA
- Uko Maran: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
|
5
|
Li X, Gu W, Zhang B, Xin X, Kang Q, Yang M, Chen B, Li Y. Insights into toxicity of polychlorinated naphthalenes to multiple human endocrine receptors: Mechanism and health risk analysis. ENVIRONMENT INTERNATIONAL 2022; 165:107291. [PMID: 35609500 DOI: 10.1016/j.envint.2022.107291] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 05/06/2022] [Accepted: 05/09/2022] [Indexed: 06/15/2023]
Abstract
This study explored the combined disruption mechanism of polychlorinated naphthalenes (PCNs) on the three key receptors (estrogen receptor, thyroid receptor, and adrenoceptor) of the human endocrine system. The intensity of PCN endocrine disruption on these receptors was first determined using a molecular docking method. A comprehensive index of PCN endocrine disruption to humans was quantified by analytic hierarchy process and fuzzy analysis. The mode of action between PCNs and the receptors was further identified to screen the molecular characteristics influencing PCN endocrine disruption through molecular docking and fractional factorial design. Quantitative structure-activity relationship (QSAR) models were established to investigate the toxic mechanism due to PCN endocrine disruption. The results showed that the lowest unoccupied molecular orbital energy (ELUMO) was the most important factor contributing to the toxicity of PCNs on the endocrine receptors, followed by the orbital energy difference (ΔE) and positive Mulliken charge (q+). Furthermore, strategies were formulated through adjusting the nutritious diet to reduce health risk for the workers in PCN contaminated sites, and their effectiveness and feasibility were assessed by molecular dynamic simulation. The simulation results indicated that the human health risk caused by PCN endocrine disruption could be effectively decreased by nutritional supplementation. The binding ability between PCNs and endocrine receptors significantly declined (up to -16.45%) with the supplementation of vitamins (A, B2, B12, C, and E) and carotene. This study provided new insights into the toxic mechanism of PCNs on human endocrine systems and recommendations on nutritional supplements for health risk reduction.
The methodology and findings could serve as valuable references for screening of potential endocrine disruptors and developing appropriate strategies for PCN or other persistent organic pollution control and health risk management.
Affiliation(s)
- Xixi Li: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Wenwen Gu: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada; MOE Key Laboratory of Resources and Environmental Systems Optimization, North China Electric Power University, Beijing 102206, China
- Baiyu Zhang: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Xiaying Xin: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Qiao Kang: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Min Yang: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Bing Chen: Northern Region Persistent Organic Pollution Control (NRPOP) Laboratory, Faculty of Engineering and Applied Science, Memorial University, St. John's, NL A1B 3X5, Canada
- Yu Li: MOE Key Laboratory of Resources and Environmental Systems Optimization, North China Electric Power University, Beijing 102206, China
|
6
|
Jillella GK, Roy K. QSAR modelling of organic dyes for their acute toxicity in Daphnia magna using 2D-descriptors. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2022; 33:111-139. [PMID: 35156472 DOI: 10.1080/1062936x.2022.2033318] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/25/2021] [Accepted: 01/20/2022] [Indexed: 06/14/2023]
Abstract
The present study reports quantitative structure-activity relationship (QSAR) models for 22 organic dyes spanning a broad chemical domain to predict their toxicity in Daphnia magna [log (1/EC50)]. Only two-dimensional descriptors with clear physicochemical meaning were used to construct the QSAR models. The process of development, validation, and interpretation of models adheres to the stringent recommendations of the Organization for Economic Cooperation and Development (OECD) guidelines. In this study, the multi-layered stepwise regression method and linear discriminant analysis (LDA) method were employed for the deployment of regression - and classification-based models respectively; however, the final regression-based QSAR models were obtained through the partial least squares (PLS) regression. Additionally, the applicability domain of the developed models was verified. The constructed models should be applicable in the absence of toxicity data of new or untested dye structures, particularly when the compounds fall within the developed models' scope, and also implementable to develop more environmentally friendly alternatives.
Affiliation(s)
- G K Jillella: Department of Pharmacoinformatics, National Institute of Pharmaceutical Educational and Research (NIPER), Kolkata, India
- K Roy: Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
|
7
|
Toots KM, Sild S, Leis J, Acree Jr. WE, Maran U. The quantitative structure-property relationships for the gas-ionic liquid partition coefficient of a large variety of organic compounds in three ionic liquids. J Mol Liq 2021. [DOI: 10.1016/j.molliq.2021.117573] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
|
8
|
Gajewicz-Skretna A, Gromelski M, Wyrzykowska E, Furuhama A, Yamamoto H, Suzuki N. Aquatic toxicity (Pre)screening strategy for structurally diverse chemicals: global or local classification tree models? ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2021; 208:111738. [PMID: 33396066 DOI: 10.1016/j.ecoenv.2020.111738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2020] [Revised: 11/23/2020] [Accepted: 11/25/2020] [Indexed: 06/12/2023]
Abstract
With an ever-increasing number of synthetic chemicals being manufactured, it is unrealistic to expect that they will all be subjected to comprehensive and effective risk assessment. A shift from conventional animal testing to computer-aided methods is therefore an important step towards advancing the environmental risk assessments of chemicals. The aims of this study are two-fold: firstly, it examines the relationships between structural and physicochemical features of a diverse set of organic chemicals, and their acute aquatic toxicity towards Daphnia magna and Oryzias latipes using a classification tree approach. Secondly, it compares the efficiency and accuracy of the predictions of two modeling schemes: local models that are inherently restricted to a smaller subset of structurally-related substances, and a global model that covers a wider chemical space and a number of modes of toxic action. The classification tree-based models differentiate the organic chemicals into either 'highly toxic' or 'low to non-toxic' classes, based on internal and external validation criteria. These mechanistically-driven models, which demonstrate good performance, reveal that the key factors driving acute aquatic toxicity are lipophilicity, electrophilic reactivity, molecular polarizability and size. A comparative analysis of the performance of the two modeling schemes indicates that the local models, trained on homogeneous data sets, are less error prone, and therefore superior to the global model. Although the global models showed worse performance metrics compared to the local ones, their applicability domain is much wider, thereby significantly increasing their usefulness in practical applications for regulatory purposes. This demonstrates their advantage over local models and shows they are an invaluable tool for modeling heterogeneous chemical data sets.
Affiliation(s)
- Agnieszka Gajewicz-Skretna: Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Maciej Gromelski: Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Ewelina Wyrzykowska: Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Ayako Furuhama: Division of Genetics and Mutagenesis, National Institute of Health Sciences (NIHS), 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa 210-9501, Japan; Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
- Hiroshi Yamamoto: Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
- Noriyuki Suzuki: Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
|
9
|
Zukić S, Maran U. Modelling of antiproliferative activity measured in HeLa cervical cancer cells in a series of xanthene derivatives. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2020; 31:905-921. [PMID: 33236957 DOI: 10.1080/1062936x.2020.1839131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2020] [Accepted: 10/15/2020] [Indexed: 06/11/2023]
Abstract
Cancer remains one of the leading causes of death in humans, and new drug substances are therefore being developed. Thus, the anti-cancer activity of xanthene derivatives has become an important topic in the development of new and potent anti-cancer drug substances. Previously published novel series of xanthen-3-one and xanthen-1,8-dione derivatives have been synthesized in one of our laboratories and showed anti-proliferative activity in HeLa cancer cell lines. This series serves as a good basis to develop quantitative structure-activity relationship (QSAR), to study the relations between anti-proliferative activity and chemical structures. A QSAR model has been derived that relies only on two-dimensional molecular descriptors, providing mechanistic insight into the anti-proliferative activity of xanthene derivatives. The model is validated internally and externally and additionally with the set of inactive compounds of the original data, confirming model applicability for the design and discovery of novel xanthene derivatives. The QSAR model is available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.237).
Affiliation(s)
- S Zukić: Department of Pharmaceutical Chemistry, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- U Maran: Department of Chemistry, University of Tartu, Tartu, Estonia
|
10
|
Viira B, García-Sosa AT, Maran U. Chemical structure and correlation analysis of HIV-1 NNRT and NRT inhibitors and database-curated, published inhibition constants with chemical structure in diverse datasets. J Mol Graph Model 2017; 76:205-223. [PMID: 28738270 DOI: 10.1016/j.jmgm.2017.06.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/24/2017] [Revised: 06/18/2017] [Accepted: 06/19/2017] [Indexed: 01/26/2023]
Abstract
Human immunodeficiency virus (HIV-1) reverse transcriptase is a major target for designing anti-HIV drugs. Developed inhibitors are divided into non-nucleoside analog reverse-transcriptase inhibitors (NNRTIs) and nucleoside analog reverse-transcriptase inhibitors (NRTIs) depending on their mechanism. Given that many inhibitors have been studied and for many of them binding affinity constants have been calculated, it is beneficial to analyze the chemical landscape of these families of inhibitors and correlate these inhibition constants with molecular structure descriptors. For this, the HIV-1 RT data was retrieved from the ChEMBL database, carefully curated, and original literature verified, grouped into NRTIs and NNRTIs, analyzed using a hierarchical scaffold classification method and modelled with best multi-linear regression approach. Analysis of the HIV-1 NNRTIs subset results in ten different common structural parent types of oxazepanone, piperazinone, pyrazine, oxazinanone, diazinanone, pyridine, pyrrole, diazepanone, thiazole, and triazine. The same analysis for HIV-1 NRTIs groups structures into four different parent types of uracil, pyrimide, pyrimidione, and imidazole. Each scaffold tree corresponding to the parent types has been carefully analyzed and examined, and changes in chemical structure favorable to potency and stability are highlighted. For both subsets, descriptive and predictive QSAR models are derived, discussed and externally validated, revealing general trends in relationships between molecular structure and binding affinity constants in structurally diverse datasets. Data and QSAR models are available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.202).
Affiliation(s)
- Birgit Viira: Institute of Chemistry, University of Tartu, Tartu 50411, Estonia
- Uko Maran: Institute of Chemistry, University of Tartu, Tartu 50411, Estonia
|
11
|
Basant N, Gupta S. QSAR modeling for predicting mutagenic toxicity of diverse chemicals for regulatory purposes. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2017; 24:14430-14444. [PMID: 28435990 DOI: 10.1007/s11356-017-8903-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/01/2017] [Accepted: 03/20/2017] [Indexed: 06/07/2023]
Abstract
The safety assessment process of chemicals requires information on their mutagenic potential. The experimental determination of mutagenicity of a large number of chemicals is tedious and time and cost intensive, thus calling for alternative methods. We have established local and global QSAR models for discriminating low and high mutagenic compounds and predicting their mutagenic activity in a quantitative manner in Salmonella typhimurium (TA) bacterial strains (TA98 and TA100). The decision treeboost (DTB)-based classification QSAR models discriminated among two categories with accuracies of >96% and the regression QSAR models precisely predicted the mutagenic activity of diverse chemicals yielding high correlations (R²) between the experimental and model-predicted values in the respective training (>0.96) and test (>0.94) sets. The test set root mean squared error (RMSE) and mean absolute error (MAE) values emphasized the usefulness of the developed models for predicting new compounds. Relevant structural features of diverse chemicals that are responsible for and influence the mutagenic activity were identified. The applicability domains of the developed models were defined. The developed models can be used as tools for screening new chemicals for their mutagenicity assessment for regulatory purposes.
Affiliation(s)
- Shikha Gupta: CSIR-National Botanical Research Institute, Rana Pratap Marg, Lucknow, 226001, India
|
12
|
Aalizadeh R, von der Ohe PC, Thomaidis NS. Prediction of acute toxicity of emerging contaminants on the water flea Daphnia magna by Ant Colony Optimization-Support Vector Machine QSTR models. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2017; 19:438-448. [PMID: 28234392 DOI: 10.1039/c6em00679e] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 06/06/2023]
Abstract
According to the European REACH Directive, the acute toxicity towards Daphnia magna should be assessed for any industrial chemical with a market volume of more than 1 t/a. Therefore, it is highly recommended to determine the toxicity at a certain confidence level, either experimentally or by applying reliable prediction models. To this end, a large dataset was compiled, with the experimental acute toxicity values (pLC50) of 1353 compounds in Daphnia magna after 48 h of exposure. A novel quantitative structure-toxicity relationship (QSTR) model was developed, using Ant Colony Optimization (ACO) to select the most relevant set of molecular descriptors, and Support Vector Machine (SVM) to correlate the selected descriptors with the toxicity data. The proposed model showed high performance (Q²(LOO) = 0.695, R²(fitting) = 0.920 and R²(test) = 0.831) with low root mean square errors of 0.498 and 0.707 for the training and test set, respectively. It was found that, in addition to hydrophobicity, polarizability and summation of solute-hydrogen bond basicity affected toxicity positively, while minimum atom-type E-state of -OH influenced toxicity values in Daphnia magna inversely. The applicability domain of the proposed model was carefully studied, considering the effect of chemical structure and prediction error in terms of leverage values and standardized residuals. In addition, a new method was proposed to define the chemical space failure for a compound with unknown toxicity to avoid using these prediction results. The resulting ACO-SVM model was successfully applied on an additional evaluation set and the prediction results were found to be very accurate for those compounds that fall inside the defined applicability domain.
In fact, compounds commonly found to be difficult to predict, such as quaternary ammonium compounds or organotin compounds were outside the applicability domain, while five representative homologues of LAS (non-ionic surfactants) were, on average, well predicted within one order of magnitude.
Affiliation(s)
- Reza Aalizadeh: Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
- Nikolaos S Thomaidis: Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
|
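The abstract of entry 12 describes an applicability domain based on leverage values and standardized residuals. A minimal Python sketch of the leverage part is given here for a model with an intercept and a single descriptor, where the leverage has a closed form. The warning threshold h* = 3(p + 1)/n is a convention often used in QSAR work and is an assumption here, since the abstract does not state the cutoff used; the training values and function names are hypothetical. Standardized residuals are not covered.

```python
def leverage(x_train, x_query):
    """Leverage of x_query for a linear model with intercept and one descriptor.

    Closed form of the hat-matrix diagonal: h = 1/n + (x - mean)^2 / Sxx.
    """
    n = len(x_train)
    mean = sum(x_train) / n
    sxx = sum((v - mean) ** 2 for v in x_train)
    return 1.0 / n + (x_query - mean) ** 2 / sxx


def warning_leverage(n_train, n_descriptors):
    """Assumed cutoff h* = 3(p + 1)/n; not stated in the abstract."""
    return 3.0 * (n_descriptors + 1) / n_train


def inside_domain(x_train, x_query, n_descriptors=1):
    """True when the query leverage does not exceed the warning leverage."""
    return leverage(x_train, x_query) <= warning_leverage(len(x_train), n_descriptors)


# Hypothetical descriptor values for ten training compounds.
train = [float(v) for v in range(1, 11)]
print(inside_domain(train, 6.0))   # True: close to the training mean
print(inside_domain(train, 13.0))  # False: far outside the training range
```

For several descriptors the same idea applies with the full hat matrix, which requires a matrix inverse and is left out of this sketch.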
13
|
Levet A, Bordes C, Clément Y, Mignon P, Morell C, Chermette H, Marote P, Lantéri P. Acute aquatic toxicity of organic solvents modeled by QSARs. J Mol Model 2016; 22:288. [DOI: 10.1007/s00894-016-3156-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 04/26/2016] [Accepted: 10/13/2016] [Indexed: 11/28/2022]
|
14
|
Exploring the role of quantum chemical descriptors in modeling acute toxicity of diverse chemicals to Daphnia magna. J Mol Graph Model 2015; 61:89-101. [PMID: 26188798 DOI: 10.1016/j.jmgm.2015.06.009] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 04/07/2015] [Revised: 06/04/2015] [Accepted: 06/20/2015] [Indexed: 11/18/2022]
Abstract
Various quantum-mechanically computed molecular and thermodynamic descriptors along with physico-chemical, electrostatic and topological descriptors are compared while developing quantitative structure-activity relationships (QSARs) for the acute toxicity of 252 diverse organic chemicals towards Daphnia magna. QSAR models based on the quantum-chemical descriptors, computed with routinely employed advanced semi-empirical and ab-initio methods, along with the electron-correlation contribution (CORR) of the descriptors, are analyzed for the external predictivity of the acute toxicity. The models with reliable internal stability and external predictivity are found to be based on the HOMO energy along with the physico-chemical, electrostatic and topological descriptors. Besides this, the total energy and electron-correlation energy are also observed as highly reliable descriptors, suggesting that the intra-molecular interactions between the electrons play an important role in the origin of the acute toxicity, which is in fact an unexplored phenomenon. The models based on quantum-chemical descriptors such as chemical hardness, absolute electronegativity, standard Gibbs free energy and enthalpy are also observed to be reliable. A comparison of the robust models based on the quantum-chemical descriptors computed with various quantum-mechanical methods suggests that the advanced semi-empirical methods such as PM7 can be more reliable than the ab-initio methods which are computationally more expensive.
|
15
|
Cassotti M, Consonni V, Mauri A, Ballabio D. Validation and extension of a similarity-based approach for prediction of acute aquatic toxicity towards Daphnia magna. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2014; 25:1013-1036. [PMID: 25482581 DOI: 10.1080/1062936x.2014.977818] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 07/17/2014] [Accepted: 09/15/2014] [Indexed: 06/04/2023]
Abstract
Quantitative structure-activity relationship (QSAR) models for predicting acute toxicity to Daphnia magna often perform poorly and need improvement to meet REACH requirements. The aim of this study was to evaluate the accuracy, stability and reliability of a previously published QSAR model by means of further external validation and to optimize its performance by extension to new data as well as a consensus approach. The previously published model was validated with a large set of new molecules and then compared with the ChemProp model, from which most of the validation data were taken. Results showed better performance of the proposed model in terms of accuracy and percentage of molecules outside the applicability domain. The model was re-calibrated on all the available data to confirm the efficacy of the similarity-based approach. The extended dataset was also used to develop a novel model based on the same similarity approach but using binary fingerprints to describe the chemical structures. The fingerprint-based model gave lower regression statistics, but also fewer unpredicted compounds. Finally, consensus modelling was successfully used to enhance the accuracy of the predictions and to halve the percentage of molecules outside the applicability domain.
Affiliation(s)
- M Cassotti
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
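The similarity-based idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function names (`tanimoto`, `knn_predict`, `consensus`), the neighbour count `k`, the similarity cut-off `min_sim` used as a stand-in applicability domain, and the fallback rule in `consensus` are all assumptions made for illustration.

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto similarity between two boolean fingerprint vectors."""
    both = np.sum(a & b)
    either = np.sum(a | b)
    return both / either if either else 0.0

def knn_predict(query, train_fps, train_y, k=3, min_sim=0.3):
    """Similarity-weighted mean response of the k most similar training compounds.

    Returns None when the nearest neighbour is less similar than min_sim,
    i.e. the query falls outside the (hypothetical) applicability domain.
    """
    sims = np.array([tanimoto(query, fp) for fp in train_fps])
    top = np.argsort(sims)[::-1][:k]      # indices of the k highest similarities
    if sims[top[0]] < min_sim:
        return None
    weights = sims[top]
    return float(np.sum(weights * train_y[top]) / np.sum(weights))

def consensus(pred_a, pred_b):
    """Average two predictions; use the available one if the other is None."""
    preds = [p for p in (pred_a, pred_b) if p is not None]
    return sum(preds) / len(preds) if preds else None
```

Under these assumptions, a compound is left unpredicted by the consensus only when both models place it outside their domains, which is one simple way a consensus can reduce the share of unpredicted compounds.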
16
Golbamaki A, Cassano A, Lombardo A, Moggio Y, Colafranceschi M, Benfenati E. Comparison of in silico models for prediction of Daphnia magna acute toxicity. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2014; 25:673-694. [PMID: 24911142 DOI: 10.1080/1062936x.2014.923041] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 06/03/2023]
Abstract
Eight in silico modelling packages were evaluated and compared for the prediction of Daphnia magna acute toxicity from the viewpoint of the European legislation on chemicals, REACH. We tested the following models: Discovery Studio (DS) TOPKAT, ACD/Tox Suite, ADMET Predictor, ECOSAR (Ecological Structure Activity Relationships), TerraQSAR, T.E.S.T. (Toxicity Estimation Software Tool) and two models implemented in VEGA on 480 industrial compounds for 48-h median lethal concentrations (LC50) to D. magna, matching them with experimental values. The quality of the estimates was compared using a standard statistical review and an additional classification approach in which the hazard predictions were grouped using well-defined regulatory criteria. The regression parameters, with the correlation coefficient being the most influential, showed that four models (ADMET Predictor, DS TOPKAT, TerraQSAR and VEGA DEMETRA) had similar reliability. These performed better than the others, but the coefficient of determination was still low (r² around 0.6), considering that at least half the predicted compounds were inside the training sets. Additionally, we grouped the results in four defined toxicity classes. TerraQSAR™ gave 60% correct classifications, followed by DS TOPKAT, ADMET Predictor™ and VEGA DEMETRA, with 56%, 54% and 48%, respectively. These results highlight the challenges associated with developing reliable and easily applied acceptability criteria for the regulatory use of QSAR models for D. magna acute toxicity.
Affiliation(s)
- A Golbamaki
- Laboratory of Chemistry and Environmental Toxicology, Istituto di Ricerche Farmacologiche Mario Negri - IRCCS, Via La Masa 19, 20156 Milano, Italy
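The classification step described in this abstract, grouping predicted and experimental LC50 values into four toxicity classes and counting agreements, might be sketched as follows. The abstract does not state the class boundaries, so the cut-offs of 1, 10 and 100 mg/L below are an assumption (GHS-style acute aquatic thresholds), and the names `BOUNDS`, `toxicity_class` and `percent_correct` are hypothetical.

```python
import numpy as np

# Assumed class boundaries in mg/L; the abstract does not say which were used.
BOUNDS = np.array([1.0, 10.0, 100.0])

def toxicity_class(lc50_mg_per_l):
    """Map LC50 values to class indices 0 (most toxic) to 3 (least toxic).

    right=True puts a value equal to a boundary into the more toxic class,
    e.g. exactly 1.0 mg/L falls in class 0.
    """
    return np.digitize(lc50_mg_per_l, BOUNDS, right=True)

def percent_correct(experimental, predicted):
    """Percentage of compounds whose predicted class matches the experimental one."""
    exp_cls = toxicity_class(np.asarray(experimental, dtype=float))
    pred_cls = toxicity_class(np.asarray(predicted, dtype=float))
    return 100.0 * float(np.mean(exp_cls == pred_cls))
```

A class-based score of this kind can diverge from r² because a prediction that is numerically close but on the wrong side of a boundary counts as a miss.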
17
Stoyanova-Slavova IB, Slavov SH, Pearce B, Buzatu DA, Beger RD, Wilkes JG. Partial least square and k-nearest neighbor algorithms for improved 3D quantitative spectral data-activity relationship consensus modeling of acute toxicity. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY 2014; 33:1271-1282. [PMID: 24464801 DOI: 10.1002/etc.2534] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/06/2013] [Revised: 12/20/2013] [Accepted: 01/14/2014] [Indexed: 06/03/2023]
Abstract
A diverse set of 154 chemicals that included US Food and Drug Administration-regulated compounds tested for their aquatic toxicity in Daphnia magna was modeled by a 3-dimensional quantitative spectral data-activity relationship (3D-QSDAR). Two distinct algorithms, partial least squares (PLS) and Tanimoto similarity-based k-nearest neighbors (KNN), were used to process bin occupancy descriptor matrices obtained after tessellation of the 3D-QSDAR space into regularly sized bins. The performance of models utilizing bins ranging in size from 2 ppm × 2 ppm × 0.5 Å to 20 ppm × 20 ppm × 2.5 Å was explored. Rigorous quality-control criteria were imposed: 1) 100 randomized 20% hold-out test sets were generated and the average R²(test) of the respective models was used as a measure of their performance, and 2) a Y-scrambling procedure was used to identify chance correlations. A consensus between the best-performing composite PLS model using 0.5 Å × 14 ppm × 14 ppm bins and 10 latent variables (average R²(test) = 0.770) and the best composite KNN model using 0.5 Å × 8 ppm × 8 ppm bins and 2 neighbors (average R²(test) = 0.801) offered an improvement of about 7.5% (consensus R²(test) = 0.845). Projection of the most frequently occurring bins on the standard coordinate space indicated that the presence of primary or secondary amino group-substituted aromatic systems would result in an increased toxic effect in Daphnia. The presence of a second aromatic ring with highly electronegative substituents 5 Å to 7 Å apart from the first ring would lead to a further increase in toxicity.
Affiliation(s)
- Iva B Stoyanova-Slavova
- Division of Systems Biology, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas
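The two quality-control checks named in this abstract, repeated randomized 20% hold-out sets and Y-scrambling, can be sketched generically. This is an illustration under stated assumptions, not the 3D-QSDAR workflow: ordinary least squares stands in for the PLS and KNN models, and the names `r2`, `fit_predict` and `mean_holdout_r2` as well as the `seed` parameter are hypothetical.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination of predictions against observed values."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with intercept (an assumed stand-in for PLS or KNN)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def mean_holdout_r2(X, y, n_splits=100, test_frac=0.2, scramble=False, seed=0):
    """Average test-set R^2 over randomized hold-out splits.

    With scramble=True the training responses are permuted before fitting
    (Y-scrambling); a score near or below zero then suggests that the
    unscrambled model is not a chance correlation.
    """
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_frac * len(y))))
    scores = []
    for _ in range(n_splits):
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        y_train = rng.permutation(y[train]) if scramble else y[train]
        scores.append(r2(y[test], fit_predict(X[train], y_train, X[test])))
    return float(np.mean(scores))
```

Calling `mean_holdout_r2(X, y)` and `mean_holdout_r2(X, y, scramble=True)` on the same data gives the pair of numbers that such a check compares.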
18
Moosus M, Hiob R, Maran U. Quantitative relationship between rate constants and molecular structure descriptors for the gas phase hydrogen abstraction reactions. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:501-518. [PMID: 23724929 DOI: 10.1080/1062936x.2013.792869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
The abstraction of hydrogen by general radicals has a wide role in environmental and also in technological processes because it results in reactive free radicals that play a vital role in atmospheric chemistry and also in biochemical processes. In addition to experimental studies, the theoretical modelling of this elementary reaction has been important for understanding and predicting respective rate constants. In this paper, molecular descriptors in the context of a QSAR approach are used to codify the relationship between molecular structure and rate constants. Unique experimental data is collected from the literature for the reaction Rᵢ• + RⱼH → RᵢH + Rⱼ•, where Rᵢ• = H• and Rⱼ• are diverse radicals. The four-parameter QSAR model (n = 34, r² = 0.81, r²(CV) = 0.74, r²(scr) = 0.12, s² = 0.19) is presented for the bimolecular rate constants, accompanied by model diagnostics and analysis of descriptors in the model.
Affiliation(s)
- Maikki Moosus
- Institute of Chemistry, University of Tartu, Tartu, Estonia
19
Doucet JP, Doucet-Panaye A, Devillers J. Structure-activity relationship study of trifluoromethylketones: inhibitors of insect juvenile hormone esterase. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:481-499. [PMID: 23721304 DOI: 10.1080/1062936x.2013.792499] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
The juvenile hormone esterase (JHE) regulates juvenile hormone titre in insect hemolymph during larval development. It has been suggested that JHE could be targeted for use in insect control. This enzyme can also be considered as involved in the phenomenon of endocrine disruption by xenobiotics in beneficial insects. Consequently, there is a need to know the characteristics of the molecules able to act on the JHE. Trifluoromethylketones (TFKs) are the most potent JHE inhibitors found to date and different quantitative structure-activity relationships (QSARs) have been derived for this group of chemicals. In this context, a set of 181 TFKs (118 active and 63 inactive compounds), tested on Trichoplusia ni for their JHE inhibition activity and described by physico-chemical descriptors, was split into different training and test sets to derive structure-activity relationship (SAR) models from support vector classification (SVC). An SVC model including 88 descriptors and derived from a Gaussian kernel was selected for its predictive performance. Another model computed with only 13 descriptors was also selected due to its mechanistic interpretability. This study clearly illustrates the difficulty in capturing the essential structural characteristics of the TFKs explaining their JHE inhibitory activity.
Affiliation(s)
- J P Doucet
- ITODYS, UMR 7086, Université Paris 7, Paris, France.
20
Devillers J, Pandard P, Richard B. External validation of structure-biodegradation relationship (SBR) models for predicting the biodegradability of xenobiotics. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:979-993. [PMID: 24313438 DOI: 10.1080/1062936x.2013.848632] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/02/2023]
Abstract
Biodegradation is an important mechanism for eliminating xenobiotics by biotransforming them into simple organic and inorganic products. Faced with the ever growing number of chemicals available on the market, structure-biodegradation relationship (SBR) and quantitative structure-biodegradation relationship (QSBR) models are increasingly used as surrogates for biodegradation tests. Such models have great potential for a quick and cheap estimation of the biodegradation potential of chemicals. The Estimation Programs Interface (EPI) Suite™ includes different models for predicting the potential aerobic biodegradability of organic substances. They are based on different endpoints, methodologies and/or statistical approaches. Among them, Biowin 5 and 6 appeared the most robust, being derived from the largest biodegradation database with results obtained only from the Ministry of International Trade and Industry (MITI) test. The aim of this study was to assess the predictive performance of these two models on a set of 356 chemicals extracted from notification dossiers including compatible biodegradation data. Another set of molecules with no more than four carbon atoms and substituted by various heteroatoms and/or functional groups was also included in the validation exercise. Comparisons were made with the predictions obtained with START (Structural Alerts for Reactivity in Toxtree). Biowin 5 and Biowin 6 gave satisfactory prediction results except for the prediction of readily degradable chemicals. A consensus model built with Biowin 1 reduced this tendency.
21
Furuhama A, Aoki Y, Shiraishi H. Development of ecotoxicity QSAR models based on partial charge descriptors for acrylate and related compounds. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2012; 23:731-749. [PMID: 22967373 DOI: 10.1080/1062936x.2012.719542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Using Gasteiger's partial equalization of orbital electronegativity (PEOE) method, we constructed ecotoxicity prediction equations based on two-dimensional descriptors for α,β-unsaturated carbonyl compounds. After examining electrostatic effects on the calculated ecotoxicities of 10 α,β-unsaturated ketones and aldehydes (A-group compounds) by using the Mulliken atomic charges on the carbonyl oxygen atoms, we investigated the efficacy of the PEOE descriptors for the same 10 compounds and the correlation between the PEOE descriptors and the Mulliken charge. We then constructed QSAR models for acute fish and Daphnia toxicities by using the PEOE descriptors for acrylic acids and compounds with acrylate-like substructures (CH-group compounds). In the constructed models, the adjusted squared correlation coefficients between measured and calculated toxicities with the lowest Akaike information criterion were 0.77 and 0.79, respectively. The applicability of the constructed models was then evaluated for various methacrylates and similar compounds (CH₃-group compounds). Both the fish and the Daphnia toxicities of some of the CH₃-group compounds were underestimated by these models. Nevertheless, we concluded that the QSAR models based on the PEOE descriptors were practical for predicting acute toxicity, especially for α,β-unsaturated carbonyl compounds with an α-hydrogen. Combining hydrophobicity and PEOE descriptors led to accurate predictions for fish toxicity.
Affiliation(s)
- A Furuhama
- Center for Environmental Risk Research, National Institute for Environmental Studies (NIES), Tsukuba, Japan.
22
Devillers J, Doucet JP, Doucet-Panaye A, Decourtye A, Aupinel P. Linear and non-linear QSAR modelling of juvenile hormone esterase inhibitors. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2012; 23:357-369. [PMID: 22443267 DOI: 10.1080/1062936x.2012.664562] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
Tight control of juvenile hormone (JH) titre is crucial during the life cycle of a holometabolous insect. JH metabolism occurs through the action of enzymes, particularly the juvenile hormone esterase (JHE). Trifluoromethylketones (TFKs) are able to inhibit this enzyme to disrupt the endocrine function of the targeted insect. In this context, a set of 96 TFKs, tested on Trichoplusia ni for their JHE inhibition, was split into a training set (n = 77) and a test set (n = 19) to derive a QSAR model. TFKs were initially described by 42 CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis) descriptors, but a feature selection process allowed us to consider only five descriptors encoding the structural characteristics of the TFKs and their reactivity. A classical and spline regression analysis, a three-layer perceptron, a radial basis function network and a support vector regression were tested as statistical tools. The best results were obtained with the support vector regression (r² and r²(test) = 0.91). The model provides information on the structural features and properties responsible for the high JHE inhibition activity of TFKs.