1
Lotfi S, Ahmadi S, Azimi A, Kumar P. In silico aquatic toxicity prediction of chemicals toward Daphnia magna and fathead minnow using Monte Carlo approaches. Toxicol Mech Methods 2024:1-13. [PMID: 39397353 DOI: 10.1080/15376516.2024.2416226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 09/05/2024] [Accepted: 10/08/2024] [Indexed: 10/15/2024]
Abstract
The rapidly increasing use of chemicals has led to large numbers of compounds entering the aquatic environment, raising concerns about their potential effects on ecosystems. Assessing the ecotoxicological properties of organic compounds toward aquatic organisms is therefore very important. Daphnia magna and fathead minnow are two aquatic species commonly used as standard test organisms for aquatic risk assessment and are typically chosen as biological models for ecotoxicological investigations of chemical pollutants. Herein, global quantitative structure-toxicity relationship (QSTR) models have been developed to predict the toxicity (pEC(LC)50) of a large dataset comprising 2106 chemicals toward Daphnia magna and fathead minnow. The optimal descriptor of correlation weights (DCW) is calculated from the simplified molecular input line entry system (SMILES) notation and is used to construct the QSTR models. Three target functions, TF1, TF2, and TF3, are used to generate 12 QSTR models from four splits, and their statistical characteristics are compared. The QSTR models are validated using both internal and external validation criteria and are found to be reliable, robust, and highly predictive. Among the models, those generated using TF3 show the best statistical quality, with R2 values ranging from 0.9467 to 0.9607, Q2 values ranging from 0.9462 to 0.9603, and RMSE values ranging from 0.3764 to 0.4413 for the validation set. The applicability domain and the mechanistic interpretation of the generated models are also discussed.
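The R2 and RMSE statistics quoted in this abstract can be illustrated with a minimal sketch. It assumes R2 is the coefficient of determination between observed and predicted values (the abstract does not state the formula), and the toy numbers are hypothetical, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical toxicity values for four compounds (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(observed, predicted), 4), round(rmse(observed, predicted), 4))
```

Reporting both statistics on a held-out validation set, as the abstract does, separates goodness of fit (R2) from the typical size of the prediction error (RMSE).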
Affiliation(s)
- Shahram Lotfi
- Department of Chemistry, Payame Noor University (PNU), Tehran, Iran
- Shahin Ahmadi
- Department of Pharmaceutical Chemistry, Faculty of Pharmaceutical Chemistry, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Ali Azimi
- Department of Chemistry, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Parvin Kumar
- Department of Chemistry, Kurukshetra University, Kurukshetra, India
2
Gajewicz-Skretna A, Furuhama A, Yamamoto H, Suzuki N. Generating accurate in silico predictions of acute aquatic toxicity for a range of organic chemicals: Towards similarity-based machine learning methods. CHEMOSPHERE 2021; 280:130681. [PMID: 34162070 DOI: 10.1016/j.chemosphere.2021.130681] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/02/2021] [Revised: 04/21/2021] [Accepted: 04/22/2021] [Indexed: 06/13/2023]
Abstract
There has been an increase in the use of non-animal approaches, such as in silico and/or in vitro methods, for assessing the risks of hazardous chemicals. A number of machine learning algorithms link molecular descriptors that interpret chemical structural properties with their biological activity. These computer-aided methods encounter several challenges, the most significant being the heterogeneity of datasets; more efficient and inclusive computational methods that are able to process large and heterogeneous chemical datasets are needed. In this context, this study verifies the utility of similarity-based machine learning methods in predicting the acute aquatic toxicity of diverse organic chemicals on Daphnia magna and Oryzias latipes. Two similarity-based methods were tested that employ a limited training dataset, most similar to a given fitting point, instead of using the entire dataset that encompasses a wide range of chemicals. The kernel-weighted local polynomial approach had a number of advantages over the distance-weighted k-nearest neighbor (k-NN) algorithm. The results highlight the importance of lipophilicity, electrophilic reactivity, molecular polarizability, and size in determining acute toxicity. The rigorous model validation ensures that this approach is an important tool for estimating toxicity in new or untested chemicals.
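The distance-weighted k-nearest neighbor idea mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance in descriptor space and inverse-distance weights are choices made here, not details taken from the study, and the descriptor values are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3, eps=1e-12):
    """Predict a value as the inverse-distance-weighted mean of the k nearest neighbours.

    train_x: list of descriptor vectors (tuples of floats)
    train_y: list of endpoint values, one per training vector
    query:   descriptor vector of the compound to predict
    """
    # Pair each training point's distance to the query with its endpoint value,
    # then keep the k closest points.
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    # Inverse-distance weights (assumed weighting scheme); eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Hypothetical one-dimensional descriptor (e.g. a lipophilicity-like value).
train_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]

print(knn_predict(train_x, train_y, (0.5,), k=2))
```

Only the k most similar training compounds contribute to each prediction, which is the sense in which such methods use a limited, local training set rather than the entire dataset.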
Affiliation(s)
- Agnieszka Gajewicz-Skretna
- Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308, Gdansk, Poland.
- Ayako Furuhama
- Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba, 305-8506, Japan; Division of Genetics and Mutagenesis, National Institute of Health Sciences (NIHS), 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki City, Kanagawa, 210-9501, Japan
- Hiroshi Yamamoto
- Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba, 305-8506, Japan
- Noriyuki Suzuki
- Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba, 305-8506, Japan
3
Jain S, Siramshetty VB, Alves VM, Muratov EN, Kleinstreuer N, Tropsha A, Nicklaus MC, Simeonov A, Zakharov AV. Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods. J Chem Inf Model 2021; 61:653-663. [PMID: 33533614 DOI: 10.1021/acs.jcim.0c01164] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 01/01/2023]
Abstract
Computational methods to predict molecular properties regarding safety and toxicology represent alternative approaches to expedite drug development, screen environmental chemicals, and thus significantly reduce associated time and costs. There is a strong need and interest in the development of computational methods that yield reliable predictions of toxicity, and many approaches, including the recently introduced deep neural networks, have been leveraged towards this goal. Herein, we report on the collection, curation, and integration of data from the public data sets that were the source of the ChemIDplus database for systemic acute toxicity. These efforts generated the largest publicly available such data set comprising > 80,000 compounds measured against a total of 59 acute systemic toxicity end points. This data was used for developing multiple single- and multitask models utilizing random forest, deep neural networks, convolutional, and graph convolutional neural network approaches. For the first time, we also reported the consensus models based on different multitask approaches. To the best of our knowledge, prediction models for 36 of the 59 end points have never been published before. Furthermore, our results demonstrated a significantly better performance of the consensus model obtained from three multitask learning approaches that particularly predicted the 29 smaller tasks (less than 300 compounds) better than other models developed in the study. The curated data set and the developed models have been made publicly available at https://github.com/ncats/ld50-multitask, https://predictor.ncats.io/, and https://cactus.nci.nih.gov/download/acute-toxicity-db (data set only) to support regulatory and research applications.
Affiliation(s)
- Sankalp Jain
- National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Vishal B Siramshetty
- National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Vinicius M Alves
- UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Eugene N Muratov
- UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Nicole Kleinstreuer
- Division of Intramural Research, Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, 111 T.W. Alexander Drive, Durham, North Carolina 27709, United States; National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, 111 T.W. Alexander Drive, Durham, North Carolina 27709, United States
- Alexander Tropsha
- UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Marc C Nicklaus
- Computer-Aided Drug Design (CADD) Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, NCI-Frederick, 376 Boyles Street, Frederick, Maryland 21702, United States
- Anton Simeonov
- National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Alexey V Zakharov
- National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
4
Gajewicz-Skretna A, Gromelski M, Wyrzykowska E, Furuhama A, Yamamoto H, Suzuki N. Aquatic toxicity (Pre)screening strategy for structurally diverse chemicals: global or local classification tree models? ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2021; 208:111738. [PMID: 33396066 DOI: 10.1016/j.ecoenv.2020.111738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2020] [Revised: 11/23/2020] [Accepted: 11/25/2020] [Indexed: 06/12/2023]
Abstract
With an ever-increasing number of synthetic chemicals being manufactured, it is unrealistic to expect that they will all be subjected to comprehensive and effective risk assessment. A shift from conventional animal testing to computer-aided methods is therefore an important step towards advancing the environmental risk assessments of chemicals. The aims of this study are two-fold: firstly, it examines the relationships between structural and physicochemical features of a diverse set of organic chemicals, and their acute aquatic toxicity towards Daphnia magna and Oryzias latipes using a classification tree approach. Secondly, it compares the efficiency and accuracy of the predictions of two modeling schemes: local models that are inherently restricted to a smaller subset of structurally-related substances, and a global model that covers a wider chemical space and a number of modes of toxic action. The classification tree-based models differentiate the organic chemicals into either 'highly toxic' or 'low to non-toxic' classes, based on internal and external validation criteria. These mechanistically-driven models, which demonstrate good performance, reveal that the key factors driving acute aquatic toxicity are lipophilicity, electrophilic reactivity, molecular polarizability and size. A comparative analysis of the performance of the two modeling schemes indicates that the local models, trained on homogeneous data sets, are less error prone, and therefore superior to the global model. Although the global models showed worse performance metrics compared to the local ones, their applicability domain is much wider, thereby significantly increasing their usefulness in practical applications for regulatory purposes. This demonstrates their advantage over local models and shows they are an invaluable tool for modeling heterogeneous chemical data sets.
Affiliation(s)
- Agnieszka Gajewicz-Skretna
- Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland.
- Maciej Gromelski
- Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Ewelina Wyrzykowska
- Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Ayako Furuhama
- Division of Genetics and Mutagenesis, National Institute of Health Sciences (NIHS), 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa 210-9501, Japan; Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
- Hiroshi Yamamoto
- Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
- Noriyuki Suzuki
- Center for Health and Environmental Risk Research, National Institute for Environmental Studies (NIES), 16-2 Onogawa, Tsukuba 305-8506, Japan
5
Jiao Z, Hu P, Xu H, Wang Q. Machine Learning and Deep Learning in Chemical Health and Safety: A Systematic Review of Techniques and Applications. ACS CHEMICAL HEALTH & SAFETY 2020. [DOI: 10.1021/acs.chas.0c00075] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Indexed: 12/17/2022]
Affiliation(s)
- Zeren Jiao
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Pingfan Hu
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Hongfei Xu
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
- Qingsheng Wang
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States
6
Lunghini F, Marcou G, Azam P, Enrici MH, Van Miert E, Varnek A. Consensus QSAR models estimating acute toxicity to aquatic organisms from different trophic levels: algae, Daphnia and fish. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2020; 31:655-675. [PMID: 32799684 DOI: 10.1080/1062936x.2020.1797872] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/28/2020] [Accepted: 07/15/2020] [Indexed: 06/11/2023]
Abstract
We report new consensus models estimating acute toxicity for algae, Daphnia and fish endpoints. We assembled a large collection of 3680 unique public compounds, each annotated with at least one experimental value for the given endpoint. Support Vector Machine models were internally and externally validated following the OECD principles. Reasonable predictive performance was achieved (RMSEext = 0.56-0.78), in line with that of state-of-the-art models. Known structural alerts are compared with an analysis of the atomic contributions to these models obtained using the ISIDA/ColorAtom utility. Benchmarking against existing tools was carried out on a set of compounds considered more representative of, and relevant to, the chemical space of the current chemical industry. Our model was among the best in both accuracy and data coverage. Nevertheless, performance on industrial data was noticeably lower than on public data, indicating that existing models fail to meet industrial needs. The final models were therefore updated with new industrial compounds, extending their applicability domain and their relevance in an industrial context. The generated models and collected public data are made freely available.
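A consensus prediction and its external RMSE can be sketched in a few lines. The abstract does not say how the individual models are combined, so the plain arithmetic mean used here is an assumption, and all numbers are hypothetical.

```python
import math


def consensus(predictions_per_model):
    """Average the predictions of several models, compound by compound (assumed scheme)."""
    n_models = len(predictions_per_model)
    return [sum(values) / n_models for values in zip(*predictions_per_model)]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical predictions from two individual models for three external compounds.
model_a = [1.0, 2.0, 3.0]
model_b = [2.0, 2.0, 5.0]
observed_ext = [1.5, 2.5, 4.0]

consensus_pred = consensus([model_a, model_b])
print(consensus_pred, round(rmse(observed_ext, consensus_pred), 4))
```

Averaging tends to cancel the uncorrelated errors of the individual models, which is one common motivation for consensus modeling.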
Affiliation(s)
- F Lunghini
- Laboratory of Chemoinformatics, University of Strasbourg, Strasbourg, France
- Toxicological and Environmental Risk Assessment Unit, Solvay S.A., St. Fons, France
- G Marcou
- Laboratory of Chemoinformatics, University of Strasbourg, Strasbourg, France
- P Azam
- Toxicological and Environmental Risk Assessment Unit, Solvay S.A., St. Fons, France
- M H Enrici
- Toxicological and Environmental Risk Assessment Unit, Solvay S.A., St. Fons, France
- E Van Miert
- Toxicological and Environmental Risk Assessment Unit, Solvay S.A., St. Fons, France
- A Varnek
- Laboratory of Chemoinformatics, University of Strasbourg, Strasbourg, France
7
Wang Y, Chen X. A joint optimization QSAR model of fathead minnow acute toxicity based on a radial basis function neural network and its consensus modeling. RSC Adv 2020; 10:21292-21308. [PMID: 35518745 PMCID: PMC9054390 DOI: 10.1039/d0ra02701d] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/24/2020] [Accepted: 05/24/2020] [Indexed: 01/07/2023] Open
Abstract
Acute toxicity to the fathead minnow (Pimephales promelas) is an important indicator for evaluating the hazards and risks of compounds in aquatic environments. The aim of our study is to explore the predictive power of a quantitative structure-activity relationship (QSAR) model based on a radial basis function (RBF) neural network with a joint optimization method, to study the acute toxicity mechanism, and to develop a potential acute toxicity prediction model for fathead minnow. To ensure the symmetry and fairness of the data splitting and to generate multiple chemically diverse training and validation sets, we used a self-organizing mapping (SOM) neural network to split the modeling dataset (containing 955 compounds) characterized by PaDEL descriptors. After preliminary selection of descriptors via the mean decrease impurity method, a hybrid quantum particle swarm optimization (HQPSO) algorithm was used to jointly optimize the parameters of the RBF network and select the key descriptors. We established 20 RBF-based QSAR models, and the statistical results showed that the 10-fold cross-validation results (R²cv10) and the adjusted coefficients of determination (R²adj) were all greater than 0.7 and 0.8, respectively. The Q²ext of these models was between 0.6480 and 0.7317, and the R²ext was between 0.6563 and 0.7318. Combining the frequency and importance of the descriptors used in the RBF-based models with the correlation between the descriptors and acute toxicity, we concluded that the water distribution coefficient, molar refractivity, and first ionization potential are important factors affecting acute toxicity to fathead minnow. A consensus QSAR model was built from the RBF-based models; it showed good performance, with R² = 0.9118, R²cv10 = 0.7632, and Q²ext = 0.7430. A frequency weighted and distance (FWD)-based application domain (AD) definition method was proposed, and the outliers were analyzed carefully. Compared with previous studies, the method proposed in this paper has clear advantages, and its robustness and external predictive power are also better than those of an XGBoost-based model. It is an effective QSAR modeling method.
Affiliation(s)
- Yukun Wang
- School of Chemical Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
- Xuebo Chen
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
8
Takata M, Lin BL, Xue M, Zushi Y, Terada A, Hosomi M. Predicting the acute ecotoxicity of chemical substances by machine learning using graph theory. CHEMOSPHERE 2020; 238:124604. [PMID: 31450113 DOI: 10.1016/j.chemosphere.2019.124604] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/21/2019] [Revised: 08/13/2019] [Accepted: 08/15/2019] [Indexed: 06/10/2023]
Abstract
Accurate in silico prediction of chemical substance ecotoxicity has become an important issue in recent years. Most conventional methods, such as the Ecological Structure-Activity Relationship (ECOSAR) model, cluster chemical substances empirically based on structural information and then predict toxicity by employing a log P linear regression model. Because the classification is empirical, the prediction accuracy does not improve even if new ecotoxicity test data are added. In addition, most of the conventional methods are not appropriate for predicting the ecotoxicity of inorganic and/or ionized compounds. Furthermore, a user faces difficulty in handling multiple Quantitative Structure-Activity Relationship (QSAR) formulas for one chemical substance. To overcome the flaws of the conventional methods, in this study a new method was developed that applies unsupervised machine learning and graph theory to predict acute ecotoxicity. The proposed machine learning technique is based on the large ecotoxicity test dataset of AIST-MeRAM, a software program developed by the National Institute of Advanced Industrial Science and Technology for Multi-purpose Ecological Risk Assessment and Management, and on the Molecular ACCess System (MACCS) keys, which vectorize a chemical structure into 166-bit binary information. The acute toxicity to fish, daphnids, and algae can be predicted with good accuracy, without requiring the log P and linear regression models used in existing methods. Results from the new method were cross-validated and compared with ECOSAR predictions and show that the new method provides better accuracy for a wider range of chemical substances, including inorganic and ionized compounds.
Affiliation(s)
- Michiyoshi Takata
- Department of Chemical Engineering, Tokyo University of Agriculture and Technology, Japan
- Bin-Le Lin
- Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), Japan.
- Mianqiang Xue
- Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), Japan
- Yasuyuki Zushi
- Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), Japan
- Akihiko Terada
- Department of Chemical Engineering, Tokyo University of Agriculture and Technology, Japan
- Masaaki Hosomi
- Department of Chemical Engineering, Tokyo University of Agriculture and Technology, Japan
9
Chen X, Dang L, Yang H, Huang X, Yu X. Machine learning-based prediction of toxicity of organic compounds towards fathead minnow. RSC Adv 2020; 10:36174-36180. [PMID: 35517078 PMCID: PMC9056962 DOI: 10.1039/d0ra05906d] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/07/2020] [Accepted: 09/14/2020] [Indexed: 01/19/2023] Open
Abstract
Predicting the acute toxicity of a large dataset of diverse chemicals against fathead minnows (Pimephales promelas) is challenging. In this paper, 963 organic compounds with acute toxicity towards fathead minnows were split into a training set (482 compounds) and a test set (481 compounds) with an approximate ratio of 1 : 1. Only six molecular descriptors were used to establish the quantitative structure–activity/toxicity relationship (QSAR/QSTR) model for 96 hour pLC50 through a support vector machine (SVM) along with genetic algorithm. The optimal SVM model (R2 = 0.756) was verified using both internal (leave-one-out cross-validation) and external validations. The validation results (qint2 = 0.699 and qext2 = 0.744) were satisfactory in predicting acute toxicity in fathead minnows compared with other models reported in the literature, although our SVM model has only six molecular descriptors and a large data set for the test set consisting of 481 compounds. A quantitative structure–toxicity relationship of 963 chemicals against fathead minnow was developed by using support vector machine and genetic algorithm.
Affiliation(s)
- Xingmei Chen
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- Limin Dang
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- Hai Yang
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- Xianwei Huang
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
- Xinliang Yu
- Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, China
10
Sosnin S, Karlov D, Tetko IV, Fedorov MV. Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space. J Chem Inf Model 2019; 59:1062-1072. [PMID: 30589269 DOI: 10.1021/acs.jcim.8b00685] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 02/07/2023]
Abstract
Acute toxicity is one of the most challenging properties to predict purely with computational methods due to its direct relationship to biological interactions. Moreover, toxicity can be represented by different end points: it can be measured for different species using different types of administration, etc., and it is questionable whether knowledge transfer between end points is possible. We performed a comparative study of multitask toxicity prediction for a broad chemical space using different descriptors and modeling algorithms and applied multitask learning to a large toxicity data set extracted from the Registry of Toxic Effects of Chemical Substances (RTECS). We demonstrated that multitask modeling provides significant improvement over single-output models and other machine learning methods. Our research reveals that multitask learning can be very useful for improving the quality of acute toxicity modeling and raises a discussion about the usage of multitask approaches for regulatory purposes. Our MultiTox models are freely available on the OCHEM platform (ochem.eu/multitox) under a CC-BY-NC license.
Affiliation(s)
- Sergey Sosnin
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- Dmitry Karlov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- Igor V Tetko
- Helmholtz Zentrum München-Research Center for Environmental Health (GmbH), Institute of Structural Biology and BIGCHEM GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
- Maxim V Fedorov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia; University of Strathclyde, Department of Physics, John Anderson Building, 107 Rottenrow East, Glasgow, U.K. G40NG
11
Khan K, Kar S, Sanderson H, Roy K, Leszczynski J. Ecotoxicological Modeling, Ranking and Prioritization of Pharmaceuticals Using QSTR and i‐QSTTR Approaches: Application of 2D and Fragment Based Descriptors. Mol Inform 2018; 38:e1800078. [DOI: 10.1002/minf.201800078] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/18/2018] [Accepted: 11/01/2018] [Indexed: 12/22/2022]
Affiliation(s)
- Kabiruddin Khan
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Supratik Kar
- Interdisciplinary Center for Nanotoxicity, Department of Chemistry, Physics and Atmospheric Sciences, Jackson State University, Jackson, MS-39217, USA
- Hans Sanderson
- Department of Environmental Science, Section for Toxicology and Chemistry, Aarhus University, Frederiksborgvej 399, DK-4000 Roskilde, Denmark
- Kunal Roy
- Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
- Jerzy Leszczynski
- Interdisciplinary Center for Nanotoxicity, Department of Chemistry, Physics and Atmospheric Sciences, Jackson State University, Jackson, MS-39217, USA
12
|
|
13
Toropov AA, Toropova AP, Marzo M, Dorne JL, Georgiadis N, Benfenati E. QSAR models for predicting acute toxicity of pesticides in rainbow trout using the CORAL software and EFSA's OpenFoodTox database. ENVIRONMENTAL TOXICOLOGY AND PHARMACOLOGY 2017; 53:158-163. [PMID: 28599185 DOI: 10.1016/j.etap.2017.05.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/26/2017] [Revised: 04/21/2017] [Accepted: 05/18/2017] [Indexed: 06/07/2023]
Abstract
Optimal (flexible) descriptors were used to establish quantitative structure-activity relationships (QSAR) for the toxicity of pesticides (n=116) towards rainbow trout. A heterogeneous set of hundreds of pesticides was used, taken from EFSA's chemical hazards database, OpenFoodTox. Optimal descriptors are prepared from the simplified molecular input-line entry system (SMILES). So-called correlation weights of different SMILES fragments are calculated by a Monte Carlo optimization procedure in which the correlation coefficient between the endpoint and the optimal descriptor plays the role of the target function. Once the correlation coefficient has been maximized for the training set, one can suggest that the optimal descriptor calculated with these correlation weights will also correlate with the endpoint for the external validation set. This approach was checked with three different distributions into the training (≈85%) set and the external validation (≈15%) set. The statistical characteristics of these models are: (i) for the training set, the correlation coefficient (r2) ranges from 0.72 to 0.81 and the root mean squared error (RMSE) from 0.54 to 1.25; (ii) for the external validation set, r2 ranges from 0.74 to 0.84 and RMSE from 0.64 to 0.75. Computational experiments have shown that the presence of chlorine, fluorine, sulfur, and aromatic fragments promotes an increase in toxicity.
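The Monte Carlo optimization of correlation weights described in this abstract can be sketched as below. This is a simplified illustration, not the CORAL implementation: it treats each SMILES character as one fragment (so "Cl" is split into "C" and "l"), uses a greedy accept-if-better random search, and runs on hypothetical SMILES and endpoint values. The function names and parameters are introduced here for illustration only.

```python
import random


def dcw(smiles, weights):
    """Optimal descriptor: sum of the correlation weights of the SMILES characters."""
    return sum(weights[ch] for ch in smiles)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def optimize_weights(smiles_list, endpoints, n_steps=2000, step=0.1, seed=0):
    """Random search that keeps a perturbed weight only if the correlation improves."""
    rng = random.Random(seed)
    symbols = sorted({ch for s in smiles_list for ch in s})
    weights = {ch: 1.0 for ch in symbols}  # start from equal weights
    best = pearson([dcw(s, weights) for s in smiles_list], endpoints)
    for _ in range(n_steps):
        ch = rng.choice(symbols)
        old = weights[ch]
        weights[ch] = old + rng.uniform(-step, step)
        r = pearson([dcw(s, weights) for s in smiles_list], endpoints)
        if r > best:
            best = r           # keep the improved weight
        else:
            weights[ch] = old  # revert the perturbation
    return weights, best


# Hypothetical training set (SMILES strings with made-up endpoint values).
smiles_list = ["CCO", "CCCl", "c1ccccc1", "CCN", "ClCCl"]
endpoints = [0.5, 1.8, 2.1, 0.7, 2.6]

weights, best_r = optimize_weights(smiles_list, endpoints)
print(round(best_r, 3))
```

Because the weights are tuned only on the training set, a high training correlation says little by itself; this is why the abstract checks the resulting descriptor on external validation sets.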
Affiliation(s)
- Andrey A Toropov
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy
- Alla P Toropova
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy.
- Marco Marzo
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy
- Jean Lou Dorne
- Scientific Committee and Emerging Risks Unit, European Food Safety Authority, Via Carlo Magno 1A, 43126 Parma, Italy
- Nikolaos Georgiadis
- Scientific Committee and Emerging Risks Unit, European Food Safety Authority, Via Carlo Magno 1A, 43126 Parma, Italy
- Emilio Benfenati
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy
14
|
Toropova AP, Toropov AA, Raskova M, Raska I. Improved building up a model of toxicity towards Pimephales promelas by the Monte Carlo method. Environmental Toxicology and Pharmacology 2016; 48:278-285. [PMID: 27863338 DOI: 10.1016/j.etap.2016.11.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/13/2016] [Revised: 11/04/2016] [Accepted: 11/10/2016] [Indexed: 06/06/2023]
Abstract
Quantitative structure-activity relationships (QSAR) for toxicity towards Pimephales promelas are established by optimizing the so-called correlation weights of attributes of the simplified molecular input-line entry system (SMILES). A new SMILES attribute has been utilized in this work. This attribute is a molecular descriptor that reflects (i) the presence of different kinds of bonds (double, triple, and stereochemical bonds); (ii) the presence of nitrogen, oxygen, sulphur, and phosphorus atoms; and (iii) the presence of fluorine, chlorine, bromine, and iodine atoms. The statistical characteristics of the best model are the following: n=226, r²=0.7630, RMSE=0.654 (training set); n=114, r²=0.7024, RMSE=0.766 (calibration set); n=226, r²=0.6292, RMSE=0.870 (validation set). A new criterion for selecting a preferable split into the training and validation sets is suggested and discussed.
Affiliation(s)
- Alla P Toropova: IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milan, Italy
- Andrey A Toropov: IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milan, Italy
- Maria Raskova: Third Department of Medicine-Department of Endocrinology and Metabolism, First Faculty of Medicine, Charles University in Prague and General University Hospital in Prague, U Nemocnice 1, 12808 Prague 2, Czechia
- Ivan Raska: Third Department of Medicine-Department of Endocrinology and Metabolism, First Faculty of Medicine, Charles University in Prague and General University Hospital in Prague, U Nemocnice 1, 12808 Prague 2, Czechia
|
15
|
Drgan V, Župerl Š, Vračko M, Como F, Novič M. Robust modelling of acute toxicity towards fathead minnow (Pimephales promelas) using counter-propagation artificial neural networks and genetic algorithm. SAR and QSAR in Environmental Research 2016; 27:501-519. [PMID: 27322761 DOI: 10.1080/1062936x.2016.1196388] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/13/2016] [Accepted: 05/28/2016] [Indexed: 06/06/2023]
Abstract
Large worldwide use of chemicals has caused great concern about their possible adverse effects on human health, flora and fauna. Increased production of new chemicals has also increased demand for their risk assessment. Traditionally, results from animal tests have been used to assess toxicity of chemicals. However, such methods are ethically questionable since they involve killing and causing suffering of the test animals. Therefore, new in silico methods are being sought to replace the traditional in vivo and in vitro testing methods. In this article we report on one method that can be used to build robust models for the prediction of compounds' properties from their chemical structure. The method has been developed by combining a genetic algorithm, a counter-propagation artificial neural network and cross-validation. It has been tested using existing data on toxicity to fathead minnow (Pimephales promelas). The results show that the method may give reliable results for chemicals belonging to the applicability domain of the developed models. Therefore, it can aid the risk assessment of chemicals and consequently reduce demand for animal tests.
Affiliation(s)
- V Drgan: National Institute of Chemistry, Ljubljana, Slovenia
- Š Župerl: National Institute of Chemistry, Ljubljana, Slovenia
- M Vračko: National Institute of Chemistry, Ljubljana, Slovenia
- F Como: Istituto di Ricerche Farmacologiche 'Mario Negri', Milan, Italy
- M Novič: National Institute of Chemistry, Ljubljana, Slovenia
|
16
|
Chen S, Zhang P, Liu X, Qin C, Tao L, Zhang C, Yang SY, Chen YZ, Chui WK. Towards cheminformatics-based estimation of drug therapeutic index: Predicting the protective index of anticonvulsants using a new quantitative structure-index relationship approach. J Mol Graph Model 2016; 67:102-10. [PMID: 27262528 DOI: 10.1016/j.jmgm.2016.05.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/08/2016] [Revised: 05/17/2016] [Accepted: 05/18/2016] [Indexed: 02/05/2023]
Abstract
The overall efficacy and safety profile of a new drug is partially evaluated by the therapeutic index in clinical studies and by the protective index (PI) in preclinical studies. In-silico predictive methods may facilitate the assessment of these indicators. Although QSAR and QSTR models can be used for predicting PI, their predictive capability has not been evaluated. To test this capability, we developed QSAR and QSTR models for predicting the activity and toxicity of anticonvulsants at accuracy levels above the literature-reported threshold (LT) of good QSAR models as tested by both the internal 5-fold cross validation and external validation method. These models showed significantly compromised PI predictive capability due to the cumulative errors of the QSAR and QSTR models. Therefore, in this investigation a new quantitative structure-index relationship (QSIR) model was devised and it showed improved PI predictive capability that superseded the LT of good QSAR models. The QSAR, QSTR and QSIR models were developed using support vector regression (SVR) method with the parameters optimized by using the greedy search method. The molecular descriptors relevant to the prediction of anticonvulsant activities, toxicities and PIs were analyzed by a recursive feature elimination method. The selected molecular descriptors are primarily associated with the drug-like, pharmacological and toxicological features and those used in the published anticonvulsant QSAR and QSTR models. This study suggested that QSIR is useful for estimating the therapeutic index of drug candidates.
Affiliation(s)
- Shangying Chen: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Peng Zhang: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Xin Liu: Shanghai Applied Protein Technology Co. Ltd, Research Center for Proteome Analysis, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200233, China
- Chu Qin: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Lin Tao: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Cheng Zhang: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Sheng Yong Yang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Sichuan, China
- Yu Zong Chen: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
- Wai Keung Chui: Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543, Singapore
|
17
|
Wu X, Zhang Q, Hu J. QSAR study of the acute toxicity to fathead minnow based on a large dataset. SAR and QSAR in Environmental Research 2016; 27:147-164. [PMID: 26911563 DOI: 10.1080/1062936x.2015.1137353] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 06/05/2023]
Abstract
Acute fathead minnow toxicity is an important basis of hazard and risk assessment for compounds in the aquatic environment. In this paper, a large dataset consisting of 963 organic compounds with acute toxicity towards fathead minnow was studied with a QSAR approach. All molecular structures of compounds were optimized by the hybrid density functional theory method. Dragon molecular descriptors and log Kow were selected to describe molecular information. Genetic algorithm and multiple linear regression analysis were combined to develop models. A global prediction model for compounds without known mode of action and two local models for organic compounds that exhibit narcosis toxicity and excess toxicity were developed, respectively. For all developed models, internal validations were performed by cross-validation and external validations were implemented by the setting of validation set. In addition, applicability domains of models were evaluated using a leverage method and outliers were listed and checked using toxicological knowledge.
Affiliation(s)
- X Wu: Environment Research Institute, Shandong University, Jinan, P.R. China
- Q Zhang: Environment Research Institute, Shandong University, Jinan, P.R. China
- J Hu: Environment Research Institute, Shandong University, Jinan, P.R. China
|
18
|
In Silico Predictions of Human Skin Permeability using Nonlinear Quantitative Structure–Property Relationship Models. Pharm Res 2015; 32:2360-71. [DOI: 10.1007/s11095-015-1629-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 08/20/2014] [Accepted: 01/13/2015] [Indexed: 10/24/2022]
|
19
|
Cassotti M, Ballabio D, Todeschini R, Consonni V. A similarity-based QSAR model for predicting acute toxicity towards the fathead minnow (Pimephales promelas). SAR and QSAR in Environmental Research 2015; 26:217-243. [PMID: 25780951 DOI: 10.1080/1062936x.2015.1018938] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 06/04/2023]
Abstract
The REACH regulation demands information about the acute toxicity of chemicals towards fish and supports the use of QSAR models, provided they comply with OECD principles. Existing models present some drawbacks that may limit their regulatory application. In this study, a dataset of 908 chemicals was used to develop a QSAR model to predict the 96-hour LC50 for the fathead minnow. Genetic algorithms combined with the k-nearest-neighbour method were applied to the training set (726 chemicals) and resulted in a model based on six molecular descriptors. An automated assessment of the applicability domain (AD) was carried out by comparing the average distance of each molecule from its nearest neighbours with a fixed threshold. The model had good and balanced performance in internal and external validation (182 test molecules), at the expense of a percentage of molecules falling outside the AD. Principal Component Analysis showed apparent correlations between model descriptors and toxicity.
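The applicability-domain check mentioned in this abstract (average distance to the nearest neighbours compared with a fixed threshold) can be sketched as follows. This is a hedged illustration only: the Euclidean metric, the value of `k`, the threshold, and the function names `mean_knn_distance` and `in_domain` are assumptions made here, since the abstract does not specify them.

```python
import math


def mean_knn_distance(query, training, k=3):
    """Average Euclidean distance from `query` to its k nearest training points."""
    dists = sorted(math.dist(query, x) for x in training)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def in_domain(query, training, k=3, threshold=1.0):
    """A molecule counts as inside the AD if its mean k-NN distance is small enough."""
    return mean_knn_distance(query, training, k) <= threshold


# Hypothetical 2-D descriptor vectors, invented for illustration.
toy_training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inside = in_domain((0.5, 0.5), toy_training)   # close to the training points
outside = in_domain((5.0, 5.0), toy_training)  # far from every training point
```

A stricter threshold excludes more molecules from prediction, which is the trade-off the abstract describes as performance gained at the expense of molecules outside the AD.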
Affiliation(s)
- M Cassotti: Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milano, Italy
|
20
|
Lyakurwa FS, Yang X, Li X, Qiao X, Chen J. Development of in silico models for predicting LSER molecular parameters and for acute toxicity prediction to fathead minnow (Pimephales promelas). Chemosphere 2014; 108:17-25. [PMID: 24875907 DOI: 10.1016/j.chemosphere.2014.02.076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 12/25/2013] [Revised: 02/22/2014] [Accepted: 02/23/2014] [Indexed: 06/03/2023]
Abstract
Many chemicals with toxic effects on aquatic species are produced every year. To date, linear solvation energy relationship (LSER) models for toxicity prediction to aquatic species are limited to non-polar and polar narcotic compounds. In this study, the Verhaar scheme was used to classify chemicals into five modes of toxic action. LSER models for predicting acute toxicity to fathead minnow were developed by identifying chemical functional groups that influence the toxicity prediction of reactive chemicals. Moreover, predictive models that can be used to estimate LSER molecular parameters have been developed using quantum chemical and Dragon descriptors. All the predictive models were developed following the OECD guidelines for QSAR model development and validation, with satisfactory goodness-of-fit, robustness and predictive ability. The McGowan volume was the most significant descriptor in the toxicity models. This study also inferred that compounds with a carbonyl group behave differently, in that some can biodegrade in the organism while others cannot, which might be the reason for the difficulties in modeling the acute toxicity of reactive chemicals.
Affiliation(s)
- Felichesmi Selestini Lyakurwa: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Xianhai Yang: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Xuehua Li: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Xianliang Qiao: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
|
21
|
Doucet JP, Doucet-Panaye A. Structure-activity relationship study of trifluoromethylketone inhibitors of insect juvenile hormone esterase: comparison of several classification methods. SAR and QSAR in Environmental Research 2014; 25:589-616. [PMID: 24884820 DOI: 10.1080/1062936x.2014.919959] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/03/2023]
Abstract
Juvenile hormone esterase (JHE) plays a key role in the development and metamorphosis of holometabolous insects. Its inhibitors could possibly be targeted for insect control. Conversely, JHE may also be involved in endocrine disruption by xenobiotics, resulting in detrimental effects in beneficial insects. There is therefore a need to know the structural characteristics of the molecules able to monitor JHE activity, and to develop SAR and QSAR studies to estimate their effectiveness. For a large diverse population of 181 trifluoromethylketones (TFKs) - the most potent JHE inhibitors known to date - we recently proposed a binary classification (active/inactive) using a support vector machine and Codessa structural descriptors. We have now examined, using the same data set and with the same descriptors, the applicability and performance of five other machine learning approaches. These have been shown able to handle high dimensional data (with descriptors possibly irrelevant or redundant) and to cope with complex mechanisms, but without delivering explicit directly exploitable models. Splitting the data into five batches (training set 80%, test set 20%) and carrying out leave-one-out cross-validation, led to good results of comparable performance, consistent with our previous support vector classifier (SVC) results. Accuracy was greater than 0.80 for all approaches. A reduced set of 15 descriptors common to all the investigated approaches showed good predictive ability (confirmed using a three-layer perceptron) and gives some clues regarding a mechanistic interpretation.
Affiliation(s)
- J P Doucet: Itodys, Université Paris-Diderot, UMR 7086, Paris, France
|
22
|
Lyakurwa F, Yang X, Li X, Qiao X, Chen J. Development and validation of theoretical linear solvation energy relationship models for toxicity prediction to fathead minnow (Pimephales promelas). Chemosphere 2014; 96:188-194. [PMID: 24216263 DOI: 10.1016/j.chemosphere.2013.10.039] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/22/2013] [Revised: 10/10/2013] [Accepted: 10/11/2013] [Indexed: 06/02/2023]
Abstract
Predictive models of acute toxicity are vitally important sources of the toxicological information used in ecological risk assessments. In this study, we used the Verhaar classification scheme to group compounds into five modes of toxic action. Quantum chemical descriptors that characterize the electron donor-acceptor property of the compounds were introduced into the theoretical linear solvation energy relationship (TLSER) models. The predictive models have relatively large data sets, which implies that they cover a wide applicability domain (AD). All models were developed following the Organization for Economic Co-operation and Development (OECD) guidelines for QSAR model development and validation. The adjusted determination coefficient (R²adj) and external explained variance (Q²EXT) of the models ranged from 0.707 to 0.903 and from 0.660 to 0.858, respectively, indicating high goodness-of-fit, robustness and predictive capacity. The cavity term (McGowan volume) was the most significant descriptor in the models. Moreover, the electron donor-acceptor (E-TLSER) models are comparable to the TLSER models for toxicity prediction to fathead minnow. Thus, the E-TLSER models developed can be used to predict the acute toxicity of new compounds within the AD.
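The two statistics quoted in this abstract can be computed as in the sketch below. It uses the standard adjusted R² formula; for the external explained variance it assumes one common definition (prediction error on the external set relative to deviations from the training-set mean), since the abstract does not state which variant the authors used. The function names are illustrative.

```python
def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - y_mean) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R2 for n compounds and p descriptors (intercept not counted in p)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def q2_ext(y_ext, y_ext_pred, y_train_mean):
    """External explained variance, referenced to the training-set mean (assumed variant)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_ext, y_ext_pred))
    ss_tot = sum((o - y_train_mean) ** 2 for o in y_ext)
    return 1 - ss_res / ss_tot
```

The adjustment penalizes extra descriptors: with few compounds and many descriptors, the adjusted value falls noticeably below the raw R².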
Affiliation(s)
- Felichesmi Lyakurwa: Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
|
23
|
Singh KP, Gupta S. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches. Toxicol Appl Pharmacol 2014; 275:198-212. [PMID: 24463095 DOI: 10.1016/j.taap.2014.01.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 08/05/2013] [Revised: 01/04/2014] [Accepted: 01/13/2014] [Indexed: 02/03/2023]
Abstract
Ensemble-learning-based decision treeboost (DTB) and decision tree forest (DTF) models are introduced to establish a quantitative structure-toxicity relationship (QSTR) for predicting the toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. The structural diversity of the chemicals was evaluated using the Tanimoto similarity index. DTB and DTF models, supplemented with stochastic gradient boosting and bagging algorithms, were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was paid to the predictive ability and robustness of the models, investigated in both external and 10-fold cross-validation processes. In complete data, the optimal DTB and DTF models rendered accuracies of 98.90% and 98.83% in two-category and 98.14% and 98.14% in four-category toxicity classifications. Both models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The regression models (DTB and DTF) constructed using five descriptors yielded correlation coefficients (R²) of 0.945 and 0.944 between the measured and predicted toxicities, with mean squared errors (MSEs) of 0.059 and 0.064, in complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting the toxicity of new chemicals for regulatory purposes. These approaches provide a useful strategy and robust tools for screening the ecotoxicological risk or environmental hazard potential of chemicals.
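The Tanimoto similarity index used in this abstract to assess structural diversity is, for binary fingerprints, the size of the intersection divided by the size of the union of the "on" features. The sketch below represents each fingerprint as a set of feature indices; how the fingerprints are generated is outside its scope, and returning 1.0 for two empty fingerprints is a convention chosen here, not something taken from the paper.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty fingerprints are identical
    return len(a & b) / len(union)


# Hypothetical fingerprints given as indices of set bits, for illustration only.
fp_a = {1, 2, 3}
fp_b = {2, 3, 4}
similarity = tanimoto(fp_a, fp_b)  # 2 shared features out of 4 distinct ones
```

Low pairwise similarities across a dataset indicate a structurally diverse (non-congeneric) set, which is the property the authors check before building global models.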
Affiliation(s)
- Kunwar P Singh: Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi 110 001, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India
- Shikha Gupta: Academy of Scientific and Innovative Research, Anusandhan Bhawan, Rafi Marg, New Delhi 110 001, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India
|
24
|
Peng J, Lu J, Shen Q, Zheng M, Luo X, Zhu W, Jiang H, Chen K. In silico site of metabolism prediction for human UGT-catalyzed reactions. Bioinformatics 2013; 30:398-405. [DOI: 10.1093/bioinformatics/btt681] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 01/28/2023] Open
|
25
|
Levet A, Bordes C, Clément Y, Mignon P, Chermette H, Marote P, Cren-Olivé C, Lantéri P. Quantitative structure-activity relationship to predict acute fish toxicity of organic solvents. Chemosphere 2013; 93:1094-1103. [PMID: 23866172 DOI: 10.1016/j.chemosphere.2013.06.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 02/07/2013] [Revised: 05/30/2013] [Accepted: 06/02/2013] [Indexed: 06/02/2023]
Abstract
The REACH regulation requires ecotoxicological data to characterize industrial chemicals. To limit in vivo testing, Quantitative Structure-Activity Relationships (QSARs) are advocated to predict the toxicity of a molecule. In this context, the aim of this work was to develop a reliable QSAR explaining the experimental acute toxicity of organic solvents at the fish trophic level. Toxicity was expressed as log(LC50), where LC50 is the concentration in mmol/L producing 50% death of fish. The 141 chemically heterogeneous solvents of the dataset were described by physico-chemical descriptors and quantum theoretical parameters calculated via Density Functional Theory. The best subsets of solvent descriptors for LC50 prediction were chosen both through the Kubinyi function associated with the Enhanced Replacement Method and through stepwise forward multiple linear regression. The four parameters selected in the model were the octanol-water partition coefficient, LUMO energy, dielectric constant and surface tension. The predictive power and robustness of the QSAR developed were assessed by internal and external validations. Several techniques for training set selection were evaluated: a random selection, an LC50-based selection, a balanced selection in terms of toxic and non-toxic solvents, a solvent profile-based selection with a space filling technique and a D-optimality onions-based selection. A comparison with fish LC50 values predicted by the ECOSAR model validated for neutral organics confirmed the value of the QSAR developed for predicting organic solvent aquatic toxicity regardless of the mechanism of toxic action involved.
Affiliation(s)
- A Levet: Université de Lyon, F-69622 Villeurbanne, France; Université Claude Bernard Lyon 1, Institut des Sciences Analytiques, UMR CNRS 5280, F-69622 Villeurbanne, France
|
26
|
Singh KP, Gupta S, Rai P. Predicting acute aquatic toxicity of structurally diverse chemicals in fish using artificial intelligence approaches. Ecotoxicology and Environmental Safety 2013; 95:221-233. [PMID: 23764236 DOI: 10.1016/j.ecoenv.2013.05.017] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 02/09/2013] [Revised: 05/15/2013] [Accepted: 05/16/2013] [Indexed: 06/02/2023]
Abstract
The research aims to develop global modeling tools capable of categorizing structurally diverse chemicals into various toxicity classes according to the EEC and European Community directives, and to predict their acute toxicity in fathead minnow using a set of selected molecular descriptors. Accordingly, classification and regression models based on artificial intelligence approaches, namely probabilistic neural networks (PNN), generalized regression neural networks (GRNN), multilayer perceptron neural network (MLPN), radial basis function neural network (RBFN), support vector machines (SVM), gene expression programming (GEP), and decision tree (DT), were constructed using the experimental toxicity data. Diversity and non-linearity in the chemicals' data were tested using the Tanimoto similarity index and Brock-Dechert-Scheinkman statistics. The predictive and generalization abilities of the various models constructed here were compared using several statistical parameters. PNN and GRNN models performed relatively better than MLPN, RBFN, SVM, GEP, and DT. In both two- and four-category classifications, PNN yielded a considerably high accuracy of classification in training (95.85 percent and 90.07 percent) and validation data (91.30 percent and 86.96 percent), respectively. GRNN rendered a high correlation between the measured and model-predicted -log LC50 values for both the training (0.929) and validation (0.910) data, and low prediction errors (RMSE) of 0.52 and 0.49 for the two sets. The efficiency of the selected PNN and GRNN models in predicting the acute toxicity of new chemicals was adequately validated using external datasets of different fish species (fathead minnow, bluegill, trout, and guppy). The PNN and GRNN models showed good predictive and generalization abilities and can be used as tools for predicting the toxicities of structurally diverse chemical compounds.
Affiliation(s)
- Kunwar P Singh: Academy of Scientific and Innovative Research, CSIR-Indian Institute of Toxicology Research (Council of Scientific & Industrial Research), Lucknow, Uttar Pradesh, India
|
27
|
Gharaghani S, Khayamian T, Ebrahimi M. Molecular dynamics simulation study and molecular docking descriptors in structure-based QSAR on acetylcholinesterase (AChE) inhibitors. SAR and QSAR in Environmental Research 2013; 24:773-794. [PMID: 23863115 DOI: 10.1080/1062936x.2013.792877] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 06/02/2023]
Abstract
In this study we present an approach for predicting the inhibitory activity of acetylcholinesterase (AChE) inhibitors by combining molecular dynamics (MD) simulation and docking studies in a structure-based quantitative structure-activity relationship (QSAR) model. The MD simulation was performed on AChE to obtain the enzyme conformation in a water environment. The resulting conformation of the enzyme was used for docking with the most potent inhibitor (26a). Docking analysis revealed that hydrophobic interactions play important roles in the AChE-inhibitor complex. Then, all inhibitors that could bind simultaneously at the catalytic site and at the peripheral anionic site of AChE were docked into the enzyme, and their interactions with AChE were used as new interpretable descriptors in a structure-based QSAR model. A least squares support vector regression model was constructed using the four most relevant docking descriptors and one molecular structure descriptor. The Q² value of the model was found to be 0.790. Furthermore, to study the stability of the enzyme conformation, a second MD simulation was performed on the AChE-inhibitor 26a complex. In the MD simulation, the topological parameters of the inhibitor were derived from the PRODRG server, and partial atomic charges were modified using the B3LYP/6-31G level of theory. The radius of gyration for the complex showed that the AChE conformation did not change in the presence of the inhibitors.
Affiliation(s)
- S Gharaghani: Department of Chemistry, Isfahan University of Technology, Isfahan, Iran
|
28
|
Singh KP, Gupta S, Rai P. Predicting carcinogenicity of diverse chemicals using probabilistic neural network modeling approaches. Toxicol Appl Pharmacol 2013; 272:465-75. [PMID: 23856075 DOI: 10.1016/j.taap.2013.06.029] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 06/11/2013] [Accepted: 06/22/2013] [Indexed: 01/31/2023]
Abstract
Robust global models capable of discriminating positive and non-positive carcinogens and of predicting the carcinogenic potency of chemicals in rodents were developed. A dataset of 834 structurally diverse chemicals extracted from the Carcinogenic Potency Database (CPDB) was used, containing 466 positive and 368 non-positive carcinogens. Twelve non-quantum mechanical molecular descriptors were derived. The structural diversity of the chemicals and nonlinearity in the data were evaluated using the Tanimoto similarity index and Brock-Dechert-Scheinkman statistics. Probabilistic neural network (PNN) and generalized regression neural network (GRNN) models were constructed for classification and function optimization problems using the carcinogenicity end point in rat. Validation of the models was performed using internal and external procedures employing a wide series of statistical checks. The PNN constructed using five descriptors rendered a classification accuracy of 92.09% in complete rat data. The PNN model rendered classification accuracies of 91.77%, 80.70% and 92.08% in mouse, hamster and pesticide data, respectively. The GRNN constructed with nine descriptors yielded a correlation coefficient of 0.896 between the measured and predicted carcinogenic potency, with a mean squared error (MSE) of 0.44, in complete rat data. The rat carcinogenicity model (GRNN) applied to the mouse and hamster data yielded correlation coefficient and MSE values of 0.758, 0.71 and 0.760, 0.46, respectively. The results suggest wide applicability of the inter-species models in predicting the carcinogenic potency of chemicals. Both the PNN and GRNN (inter-species) models constructed here can be useful tools in predicting the carcinogenicity of new chemicals for regulatory purposes.
Collapse
Affiliation(s)
- Kunwar P Singh
- Academy of Scientific and Innovative Research, Council of Scientific & Industrial Research, New Delhi, India; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001, India.
|
29
|
Doucet JP, Doucet-Panaye A, Devillers J. Structure-activity relationship study of trifluoromethylketones: inhibitors of insect juvenile hormone esterase. SAR and QSAR in Environmental Research 2013; 24:481-499. [PMID: 23721304 DOI: 10.1080/1062936x.2013.792499] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
The juvenile hormone esterase (JHE) regulates juvenile hormone titre in insect hemolymph during larval development. It has been suggested that JHE could be targeted for use in insect control. This enzyme can also be considered to be involved in endocrine disruption by xenobiotics in beneficial insects. Consequently, there is a need to know the characteristics of the molecules able to act on the JHE. Trifluoromethylketones (TFKs) are the most potent JHE inhibitors found to date, and different quantitative structure-activity relationships (QSARs) have been derived for this group of chemicals. In this context, a set of 181 TFKs (118 active and 63 inactive compounds), tested on Trichoplusia ni for their JHE inhibition activity and described by physico-chemical descriptors, was split into different training and test sets to derive structure-activity relationship (SAR) models from support vector classification (SVC). An SVC model including 88 descriptors and derived from a Gaussian kernel was selected for its predictive performance. Another model, computed with only 13 descriptors, was also selected for its mechanistic interpretability. This study clearly illustrates the difficulty in capturing the essential structural characteristics of the TFKs that explain their JHE inhibitory activity.
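The Gaussian kernel named in the abstract above can be sketched as follows. This shows only the kernel function that an SVC evaluates between two descriptor vectors, not the study's model; the `gamma` value and the three-element descriptor vectors are hypothetical.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical, already-scaled descriptor vectors for three compounds.
compound_a = [0.2, 1.0, -0.5]
compound_b = [0.3, 0.9, -0.4]   # close to compound_a
compound_c = [2.0, -1.5, 1.0]   # far from compound_a

print(gaussian_kernel(compound_a, compound_b))  # near 1: similar compounds
print(gaussian_kernel(compound_a, compound_c))  # near 0: dissimilar compounds
```

The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, which is what lets a support vector classifier draw non-linear boundaries in descriptor space.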
Affiliation(s)
- J P Doucet
- ITODYS, UMR 7086, Université Paris 7, Paris, France.
|
30
|
Devillers J, Doucet JP, Doucet-Panaye A, Decourtye A, Aupinel P. Linear and non-linear QSAR modelling of juvenile hormone esterase inhibitors. SAR and QSAR in Environmental Research 2012; 23:357-369. [PMID: 22443267 DOI: 10.1080/1062936x.2012.664562] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
Tight control of juvenile hormone (JH) titre is crucial during the life cycle of a holometabolous insect. JH is metabolized through the action of enzymes, particularly the juvenile hormone esterase (JHE). Trifluoromethylketones (TFKs) are able to inhibit this enzyme and thereby disrupt the endocrine function of the targeted insect. In this context, a set of 96 TFKs, tested on Trichoplusia ni for their JHE inhibition, was split into a training set (n = 77) and a test set (n = 19) to derive a QSAR model. TFKs were initially described by 42 CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis) descriptors, but a feature selection process allowed us to consider only five descriptors encoding the structural characteristics of the TFKs and their reactivity. Classical and spline regression analyses, a three-layer perceptron, a radial basis function network and a support vector regression were tested as statistical tools. The best results were obtained with the support vector regression (r(2) and r(test)(2) = 0.91). The model provides information on the structural features and properties responsible for the high JHE inhibition activity of TFKs.
|
31
|
Classification of 5-HT(1A) receptor agonists and antagonists using GA-SVM method. Acta Pharmacol Sin 2011; 32:1424-30. [PMID: 21963891 DOI: 10.1038/aps.2011.112] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/21/2023]
Abstract
AIM: To construct a reliable computational model for the classification of agonists and antagonists of the 5-HT(1A) receptor. METHODS: Support vector machine (SVM), a well-known machine learning method, was employed to build a prediction model, and a genetic algorithm (GA) was used to select the most relevant descriptors and to optimize two important parameters, C and r, of the SVM model. The overall dataset used in this study comprised 284 structurally diverse ligands of the 5-HT(1A) receptor reported in the literature. RESULTS: An SVM model was successfully developed that could be used to predict the probability of a ligand being an agonist or antagonist of the 5-HT(1A) receptor. The predictive accuracy for the training and test sets was 0.942 and 0.865, respectively. For compounds with a probability estimate higher than 0.7, the predictive accuracy of the model for the training and test sets was 0.954 and 0.927, respectively. To further validate the model, the receiver operating characteristic (ROC) curve was plotted, and the Area-Under-the-ROC-Curve (AUC) value was calculated to be 0.883 for the training set and 0.906 for the test set. CONCLUSION: A reliable SVM model was successfully developed that could effectively distinguish agonists and antagonists among the ligands of the 5-HT(1A) receptor. To our knowledge, this is the first effort to classify 5-HT(1A) receptor agonists and antagonists based on a diverse dataset. This method may be used to classify the ligands of other members of the GPCR family.
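The two evaluation measures reported in the abstract above, AUC and accuracy restricted to confident predictions, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' procedure: the labels and scores are made up, label 1 stands for one ligand class and 0 for the other, and "probability estimate higher than 0.7" is interpreted here as the probability of the predicted class exceeding the cutoff.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confident_accuracy(labels, scores, cutoff=0.7):
    """Accuracy over predictions whose predicted-class probability exceeds cutoff."""
    kept = [(l, s) for l, s in zip(labels, scores) if max(s, 1 - s) > cutoff]
    if not kept:
        return None  # no prediction was confident enough
    correct = sum(1 for l, s in kept if (s >= 0.5) == (l == 1))
    return correct / len(kept)


# Hypothetical data: true classes and predicted probabilities of class 1.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print(roc_auc(labels, scores))             # 8 of 9 positive/negative pairs ranked correctly
print(confident_accuracy(labels, scores))  # the two uncertain predictions are excluded
```

In this toy data the two misclassified ligands both have predicted-class probabilities of 0.6, so discarding low-confidence predictions raises the accuracy, the same direction of effect the abstract reports.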
|