1
Collins SP, Mailloux B, Kulkarni S, Gagné M, Long AS, Barton-Maclaren TS. Development and application of consensus in silico models for advancing high-throughput toxicological predictions. Front Pharmacol 2024; 15:1307905. [PMID: 38333007 PMCID: PMC10850302 DOI: 10.3389/fphar.2024.1307905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 01/02/2024] [Indexed: 02/10/2024] Open
Abstract
Computational toxicology models have been successfully implemented to prioritize and screen chemicals. There are numerous in silico (quantitative) structure-activity relationship ([Q]SAR) models for the prediction of a range of human-relevant toxicological endpoints, but for a given endpoint and chemical, not all predictions are identical due to differences in their training sets, algorithms, and methodology. This poses an issue for high-throughput screening of a large chemical inventory as it necessitates several models to cover diverse chemistries but will then generate data conflicts. To address this challenge, we developed a consensus modeling strategy to combine predictions obtained from different existing in silico (Q)SAR models into a single predictive value while also expanding chemical space coverage. This study developed consensus models for nine toxicological endpoints relating to estrogen receptor (ER) and androgen receptor (AR) interactions (i.e., binding, agonism, and antagonism) and genotoxicity (i.e., bacterial mutation, in vitro chromosomal aberration, and in vivo micronucleus). Consensus models were created by combining different (Q)SAR models using various weighting schemes. As a multi-objective optimization problem, there is no single best consensus model, and therefore, Pareto fronts were determined for each endpoint to identify the consensus models that optimize the multiple-criterion decisions simultaneously. Accordingly, this work presents sets of solutions for each endpoint that contain the optimal combination, regardless of the trade-off, with the results demonstrating that the consensus models improved both the predictive power and chemical space coverage. These solutions were further analyzed to find trends between the best consensus models and their components. 
Here, we demonstrate the development of a flexible and adaptable approach for in silico consensus modeling and its application across nine toxicological endpoints related to ER activity, AR activity, and genotoxicity. These consensus models are developed to be integrated into a larger multi-tier NAM-based framework to prioritize chemicals for further investigation and support the transition to a non-animal approach to risk assessment in Canada.
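The Pareto-front selection described in this abstract can be illustrated with a minimal sketch. It assumes only two objectives, balanced accuracy and chemical space coverage, both to be maximized. The candidate names and scores are invented for illustration and are not results from the study, and the study's actual criteria and weighting schemes may differ.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str                 # hypothetical label for one consensus model
    balanced_accuracy: float  # objective 1 (higher is better)
    coverage: float           # objective 2: fraction of chemicals predicted


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = (a.balanced_accuracy >= b.balanced_accuracy
                and a.coverage >= b.coverage)
    better = (a.balanced_accuracy > b.balanced_accuracy
              or a.coverage > b.coverage)
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative (made-up) scores for four candidate consensus models.
candidates = [
    Candidate("A", 0.80, 0.60),
    Candidate("B", 0.75, 0.90),
    Candidate("C", 0.70, 0.85),  # worse than B on both objectives
    Candidate("D", 0.85, 0.40),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Every model on the front is a defensible choice; which one to use depends on how accuracy is traded against coverage for a given screening task.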
Affiliation(s)
- Sean P. Collins
- Existing Substances Risk Assessment Bureau, Healthy Environments and Consumer Safety Branch, Health Canada, Ottawa, ON, Canada
2
Zhang R, Wang B, Li L, Li S, Guo H, Zhang P, Hua Y, Cui X, Li Y, Mu Y, Huang X, Li X. Modeling and insights into the structural characteristics of endocrine-disrupting chemicals. Ecotoxicology and Environmental Safety 2023; 263:115251. [PMID: 37451095 DOI: 10.1016/j.ecoenv.2023.115251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 07/03/2023] [Accepted: 07/09/2023] [Indexed: 07/18/2023]
Abstract
Endocrine-disrupting chemicals (EDCs) can cause serious harm to human health and the environment; therefore, it is important to rapidly and correctly identify EDCs. Different computational models have been proposed for the prediction of EDCs over the past few decades, but the reported models are not always easily available, and few studies have investigated the structural characteristics of EDCs. In the present study, we have developed a series of artificial intelligence models targeting EDC receptors: the androgen receptor (AR); estrogen receptor (ER); and pregnane X receptor (PXR). The consensus models achieved good predictive results for validation sets with balanced accuracy values of 87.37%, 90.13%, and 79.21% for AR, ER, and PXR binding assays, respectively. Analysis of the physical-chemical properties suggested that several chemical properties were significantly (p < 0.05) different between EDCs and non-EDCs. We also identified structural alerts that can indicate an EDC, which were integrated into the web server SApredictor. These models and structural characteristics can provide useful tools and information in the discrimination and mechanistic understanding of EDCs in drug discovery and environmental risk assessment.
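For readers unfamiliar with the metric reported above, balanced accuracy is commonly defined as the mean of sensitivity and specificity. The sketch below uses that standard definition with invented labels; it is not the authors' evaluation code.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up imbalanced example: 2 actives, 8 inactives, one active missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

On imbalanced sets such as this one, plain accuracy stays high while balanced accuracy exposes the missed active, which is why it is often preferred for receptor-binding data.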
Affiliation(s)
- Ruiqiu Zhang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Bailun Wang
- Department of Anesthesiology and perioperative medicine, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Institute of Anesthesia and Respiratory Intensive Care Medicine, Jinan 250014, China
- Ling Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Shengjie Li
- Department of Neurosurgery, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, China
- Huizhu Guo
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Pei Zhang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yuqing Hua
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xueyan Cui
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yan Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yan Mu
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xin Huang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xiao Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China.
3
Wang B, Guo J, Liu X, Yu Y, Wu J, Wang Y. Prediction of the effects of small molecules on the gut microbiome using machine learning method integrating with optimal molecular features. BMC Bioinformatics 2023; 24:338. [PMID: 37697256 PMCID: PMC10496404 DOI: 10.1186/s12859-023-05455-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 08/25/2023] [Indexed: 09/13/2023] Open
Abstract
BACKGROUND: The human gut microbiome (HGM), consisting of trillions of microorganisms, is crucial to human health. Adverse drug use is one of the most important causes of HGM disorder. Thus, it is necessary to identify drugs or compounds with anti-commensal effects on the HGM at the early drug discovery stage. This study proposes a novel classification of anti-commensal effects using a machine learning method and optimal molecular features. To improve the prediction performance, we explored combinations of six fingerprints and three descriptors to select the best molecular representation. RESULTS: The final consensus model based on the optimal features yielded an F1-score of 0.725 ± 0.014, an ACC of 82.9 ± 0.7%, and an AUC of 0.791 ± 0.009 in five-fold cross-validation. In addition, this novel model outperformed prior studies using the same algorithm. Furthermore, the important chemical descriptors and the misclassified anti-commensal compounds were analyzed to better understand and interpret the model. Finally, seven structural alerts responsible for the chemical anti-commensal effect were identified, providing valuable information for drug design. CONCLUSION: Our study offers a promising tool for screening anti-commensal compounds in the early stage of drug discovery and for assessing the potential risks of these drugs in vivo.
Affiliation(s)
- Binyou Wang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China
- School of Pharmacy, Southwest Medical University, Luzhou, 646000, China
- Jianmin Guo
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China
- Xiaofeng Liu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China
- Yang Yu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China
- Key Laboratory of Medical Electrophysiology, Ministry of Education and Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou, 646000, China
- Jianming Wu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China.
- School of Pharmacy, Southwest Medical University, Luzhou, 646000, China.
- Key Laboratory of Medical Electrophysiology, Ministry of Education and Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou, 646000, China.
- Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, School of Pharmacy, Southwest Medical University, Luzhou, 646000, China.
- Yiwei Wang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, 646000, China.
- School of Pharmacy, Southwest Medical University, Luzhou, 646000, China.
- Key Laboratory of Medical Electrophysiology, Ministry of Education and Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou, 646000, China.
4
Zhang R, Chen Z, Wang B, Li Y, Mu Y, Li X. Modeling and Insights into the Structural Characteristics of Chemical Mitochondrial Toxicity. ACS Omega 2023; 8:31675-31682. [PMID: 37692239 PMCID: PMC10483523 DOI: 10.1021/acsomega.3c01725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 08/11/2023] [Indexed: 09/12/2023]
Abstract
Mitochondria are the energy metabolism center of cells and are involved in a number of other processes, such as cell differentiation and apoptosis, signal transduction, and regulation of cell cycle and cell proliferation. It is of great significance to evaluate the mitochondrial toxicity of drugs and other chemicals. In the present study, we aimed to propose easily available artificial intelligence (AI) models for the prediction of chemical mitochondrial toxicity and investigate the structural characteristics with the analysis of molecular properties and structural alerts. The consensus model achieved good predictive results with high total accuracy at 87.21% for validation sets. The models can be accessed freely via https://ochem.eu/article/158582. Besides, several commonly used chemical properties were significantly different between chemicals with and without mitochondrial toxicity. We also detected the structural alerts (SAs) responsible for mitochondrial toxicity and integrated them into the web-server SApredictor (www.sapredictor.cn). The study may provide useful tools for in silico estimation of mitochondrial toxicity and be helpful to understand the mechanisms of mitochondrial toxicity.
Affiliation(s)
- Ruiqiu Zhang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Zhaoyang Chen
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Baobao Wang
- Department of Nephrology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, China
- Yan Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yan Mu
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xiao Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
5
Singh R, Kumar P, Sindhu J, Devi M, Kumar A, Lal S, Singh D, Kumar H. Thiazolidinedione-triazole conjugates: design, synthesis and probing of the α-amylase inhibitory potential. Future Med Chem 2023; 15:1273-1294. [PMID: 37551699 DOI: 10.4155/fmc-2023-0144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023] Open
Abstract
Aim: The primary objective of this investigation was the synthesis, spectral interpretation and evaluation of the α-amylase inhibition of rationally designed thiazolidinedione-triazole conjugates (7a-7aa). Materials & methods: The designed compounds were synthesized by stirring a mixture of thiazolidine-2,4-dione, propargyl bromide, cinnamaldehyde and azide derivatives in polyethylene glycol-400. The α-amylase inhibitory activity of the synthesized conjugates was examined by integrating in vitro and in silico studies. Results: The investigated derivatives exhibited promising α-amylase inhibitory activity, with IC50 values ranging between 0.028 and 0.088 μmol ml⁻¹. Various computational approaches were employed to get detailed information about the inhibition mechanism. Conclusion: The thiazolidinedione-triazole conjugate 7p, with IC50 = 0.028 μmol ml⁻¹, was identified as the best hit for inhibiting α-amylase.
Affiliation(s)
- Rahul Singh
- Department of Chemistry, Kurukshetra University, Kurukshetra, 136119, India
- Parvin Kumar
- Department of Chemistry, Kurukshetra University, Kurukshetra, 136119, India
- Jayant Sindhu
- Department of Chemistry, COBS&H, CCS Haryana Agricultural University, Hisar, 125004, India
- Meena Devi
- Department of Chemistry, Kurukshetra University, Kurukshetra, 136119, India
- Ashwani Kumar
- Department of Pharmaceutical Sciences, GJUS&T, Hisar, 125001, India
- Sohan Lal
- Department of Chemistry, Kurukshetra University, Kurukshetra, 136119, India
- Devender Singh
- Department of Chemistry, Maharshi Dayanand University, Rohtak, 124001, India
- Harish Kumar
- Department of Chemistry, School of Basic Sciences, Central University Haryana, Mahendergarh, 123029, India
6
Swirog M, Mikolajczyk A, Jagiello K, Jänes J, Tämm K, Puzyn T. Predicting electrophoretic mobility of TiO2, ZnO, and CeO2 nanoparticles in natural waters: The importance of environment descriptors in nanoinformatics models. The Science of the Total Environment 2022; 840:156572. [PMID: 35710003 DOI: 10.1016/j.scitotenv.2022.156572] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/01/2022] [Revised: 05/16/2022] [Accepted: 06/05/2022] [Indexed: 06/15/2023]
Abstract
Natural and engineered nanoparticles (NPs) entering the environment are influenced by many physicochemical processes and behave differently in different systems (e.g., natural waters with different characteristics). Determining the physicochemical characteristics and predicting the behavior of nanoparticles that end up in the natural aquatic environment are key aspects of their risk assessment. Here, we show that the quantitative structure-property relationship modeling method used in nanoinformatics (nano-QSPR) can be successfully applied to predict an environmental fate-relevant property (electrophoretic mobility) of TiO2, ZnO, and CeO2 nanoparticles. However, in contrast to previous works, we propose using, in parallel, (i) the nanoparticles' structure descriptors (S-descriptors) and (ii) the environment descriptors (E-descriptors) as the input variables. Thus, the method should be abbreviated more precisely as nano-QSEPR ("E" stands for "environment"). As a proof of concept, we have developed a group of models (including MLR, GA-PLS, PCR, and Meta-Consensus models) with high predictive capabilities (Q²EXT = 0.931 for the GA-PLS model), where the S-descriptors are represented by the core-shell model descriptor and the E-descriptors by different ambient water features (including ion concentrations and ionic strength). The newly proposed nano-QSEPR modeling scheme can be efficiently used to design safe and sustainable nanomaterials.
Affiliation(s)
- Marta Swirog
- Laboratory of Environmental Chemoinformatics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Alicja Mikolajczyk
- Laboratory of Environmental Chemoinformatics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdańsk, Poland; QSAR Lab Sp. Z o. o., Trzy Lipy 3, 80-172, Poland
- Karolina Jagiello
- Laboratory of Environmental Chemoinformatics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdańsk, Poland; QSAR Lab Sp. Z o. o., Trzy Lipy 3, 80-172, Poland
- Jaak Jänes
- Institute of Chemistry, University of Tartu, Ravila 14A, Tartu 50411, Estonia
- Kaido Tämm
- Institute of Chemistry, University of Tartu, Ravila 14A, Tartu 50411, Estonia
- Tomasz Puzyn
- Laboratory of Environmental Chemoinformatics, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdańsk, Poland; QSAR Lab Sp. Z o. o., Trzy Lipy 3, 80-172, Poland
7
Liang L, Liu Y, Kang B, Wang R, Sun MY, Wu Q, Meng XF, Lin JP. Large-scale comparison of machine learning algorithms for target prediction of natural products. Brief Bioinform 2022; 23:6675751. [PMID: 36007240 DOI: 10.1093/bib/bbac359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 07/26/2022] [Accepted: 07/31/2022] [Indexed: 11/13/2022] Open
Abstract
Natural products (NPs) and their derivatives are important resources for drug discovery. Many in silico target prediction methods have been reported; however, very few of them distinguish NPs from synthetic molecules. Because NPs and synthetic molecules differ in many characteristics, it is necessary to build target prediction models specific to NPs. Therefore, we collected the activity data of NPs and their derivatives from public databases and constructed four datasets: the NP dataset, the NPs and their first-class derivatives dataset, the NPs and all their derivatives dataset, and the ChEMBL26 compounds dataset. Conditions, including activity thresholds and input features, were explored to assess the performance of eight machine learning methods for target prediction of NPs: support vector machines (SVM), extreme gradient boosting, random forests, K-nearest neighbor, naive Bayes, feedforward neural networks (FNN), convolutional neural networks and recurrent neural networks. As a result, the NPs and all their derivatives dataset was selected to build the best NP-specific models. Furthermore, consensus models, as well as voting models, were additionally applied to improve the prediction performance. Further evaluations on the external validation set demonstrated that (1) the NP-specific model performed better on the target prediction of NPs than the traditional models trained on all ChEMBL26 compounds, and (2) the consensus model of FNN + SVM had the best overall performance, while the voting model can significantly improve recall and specificity.
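The abstract does not define its consensus and voting models, so the following is only a sketch of one common way the two differ: a consensus call averages the per-model probabilities before thresholding, while a voting call thresholds each model first and then counts votes. The function names, the 0.5 threshold, and the example probabilities are assumptions for illustration.

```python
from typing import Optional, Sequence


def consensus_predict(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Average the per-model probabilities, then apply one threshold."""
    return sum(probs) / len(probs) >= threshold


def vote_predict(probs: Sequence[float], threshold: float = 0.5,
                 min_votes: Optional[int] = None) -> bool:
    """Threshold each model separately, then require enough positive votes."""
    if min_votes is None:
        min_votes = len(probs) // 2 + 1  # simple majority by default
    votes = sum(1 for p in probs if p >= threshold)
    return votes >= min_votes


# Made-up probabilities from three hypothetical models for one compound.
probs = [0.9, 0.45, 0.4]
print(consensus_predict(probs), vote_predict(probs))
```

Here one confident model pulls the average above the threshold, but it is outvoted two to one. Lowering `min_votes` makes the voting rule more permissive, which is one way such a scheme can trade recall against specificity.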
Affiliation(s)
- Lu Liang
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Ye Liu
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Bo Kang
- National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Ru Wang
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Meng-Yu Sun
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China
- Qi Wu
- National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Xiang-Fei Meng
- National Supercomputer Center in Tianjin, 10 Xinhuanxi Road, Tianjin Binhai New Area, Tianjin 300457, China
- Jian-Ping Lin
- State Key Laboratory of Medicinal Chemical Biology, College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin 300353, China; Biodesign Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin Airport Economic Area, Tianjin 300308, China; Platform of Pharmaceutical Intelligence, Tianjin International Joint Academy of Biomedicine, Tianjin 300457, China
8
|
|
9
Štekláč M, Breza M. DFT Studies of Substituted Phenols Cytotoxicity I. Para‐substituted Phenols. ChemistrySelect 2021. [DOI: 10.1002/slct.202101568] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/11/2023]
Affiliation(s)
- Marek Štekláč
- Department of Physical Chemistry, Faculty of Chemical and Food Technology, Slovak Technical University, SK-81237 Bratislava, Slovakia
- Martin Breza
- Department of Physical Chemistry, Faculty of Chemical and Food Technology, Slovak Technical University, SK-81237 Bratislava, Slovakia
10
Hua Y, Shi Y, Cui X, Li X. In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods. Mol Divers 2021; 25:1585-1596. [PMID: 34196933 DOI: 10.1007/s11030-021-10255-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/13/2021] [Accepted: 06/14/2021] [Indexed: 12/15/2022]
Abstract
Chemical-induced hematotoxicity is an important concern in drug discovery, since it can often be fatal when it occurs. It is therefore useful to give special attention to chemicals that can cause hematotoxicity. In the present study, we focused on in silico prediction of chemical-induced hematotoxicity with machine learning (ML) and deep learning (DL) methods. We collected a large data set containing 632 hematotoxic chemicals and 1525 approved drugs without hematotoxicity. Computational models were built using several different machine learning and deep learning algorithms integrated on the Online Chemical Modeling Environment (OCHEM). Based on the three best individual models, a consensus model was developed. It yielded a prediction accuracy of 0.83 and a balanced accuracy of 0.77 on external validation. The consensus model and the best individual model, developed with the random forest regression and classification algorithm (RFR) and QNPR descriptors, were made available at https://ochem.eu/article/135149. The relevance of 8 commonly used molecular properties to chemical-induced hematotoxicity was also investigated. Several molecular properties have an obvious differentiating effect on chemical-induced hematotoxicity. In addition, 12 structural alerts responsible for chemical hematotoxicity were identified using frequency analysis of substructures from the Klekota-Roth fingerprint. These results should provide meaningful knowledge and useful tools for hematotoxicity evaluation in drug discovery and environmental risk assessment.
Affiliation(s)
- Yuqing Hua
- School of Pharmacy, Shandong First Medical University, Taian, 271000, China; Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Yinping Shi
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Xueyan Cui
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Xiao Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China; Department of Clinical Pharmacy, Shandong Provincial Qianfoshan Hospital, Shandong University, Jinan, 250014, China
11
Ding X, Cui C, Wang D, Zhao J, Zheng M, Luo X, Jiang H, Chen K. Bioactivity Prediction Based on Matched Molecular Pair and Matched Molecular Series Methods. Curr Pharm Des 2021; 26:4195-4205. [PMID: 32338210 DOI: 10.2174/1381612826666200427111309] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2019] [Accepted: 04/08/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Enhancing a compound's biological activity is the central task of lead optimization in small-molecule drug discovery. However, it is laborious to perform many iterative rounds of compound synthesis and bioactivity tests. To address this issue, there is strong demand for high-quality in silico bioactivity prediction approaches that prioritize more active compound derivatives and reduce the trial-and-error process. METHODS: Two kinds of bioactivity prediction models based on a large-scale structure-activity relationship (SAR) database were constructed. The first is based on the similarity of substituents and realized by matched molecular pair analysis, including SA, SA_BR, SR, and SR_BR. The second is based on SAR transferability and realized by matched molecular series analysis, including Single MMS pair, Full MMS series, and Multi single MMS pairs. Moreover, we also defined the applicability domain of the models by using a distance-based threshold. RESULTS: Among the seven individual models, the Multi single MMS pairs bioactivity prediction model showed the best performance (R² = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) produced the lowest prediction accuracy (R² = 0.798, MAE = 0.446, RMSE = 0.637). The predictive accuracy could be further improved by consensus modeling (R² = 0.842, MAE = 0.397 and RMSE = 0.563). CONCLUSION: An accurate prediction model for bioactivity was built with a consensus method, which was superior to all individual models. Our model should be a valuable tool for lead optimization.
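The three regression metrics quoted above, and the way averaging can reduce error, can be sketched as follows. This assumes R² means the coefficient of determination (the paper may instead report a squared correlation) and that the consensus is a plain mean of the individual predictions. The data are contrived so that the two hypothetical models err in opposite directions; real gains are usually much smaller.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R^2 (coefficient of determination), MAE and RMSE."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Contrived activities and predictions from two hypothetical models.
y_true = [5.0, 6.0, 7.0, 8.0]
model_a = [5.5, 6.5, 7.5, 8.5]   # consistently 0.5 too high
model_b = [4.5, 5.5, 6.5, 7.5]   # consistently 0.5 too low
consensus = [(a + b) / 2 for a, b in zip(model_a, model_b)]

metrics_a = regression_metrics(y_true, model_a)
metrics_consensus = regression_metrics(y_true, consensus)
print(metrics_a, metrics_consensus)
```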
Affiliation(s)
- Xiaoyu Ding
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Chen Cui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Dingyan Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Jihui Zhao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| |
Collapse
|
12
|
Wu Z, Zhu M, Kang Y, Leung ELH, Lei T, Shen C, Jiang D, Wang Z, Cao D, Hou T. Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets. Brief Bioinform 2020; 22:6032614. [PMID: 33313673 DOI: 10.1093/bib/bbaa321] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2020] [Revised: 10/09/2020] [Accepted: 10/19/2020] [Indexed: 12/18/2022] Open
Abstract
Although a wide variety of machine learning (ML) algorithms have been utilized to learn quantitative structure-activity relationships (QSARs), there is no agreed single best algorithm for QSAR learning. Therefore, a comprehensive understanding of the performance characteristics of popular ML algorithms used in QSAR learning is highly desirable. In this study, five linear algorithms [linear function Gaussian process regression (linear-GPR), linear function support vector machine (linear-SVM), partial least squares regression (PLSR), multiple linear regression (MLR) and principal component regression (PCR)], three analogizers [radial basis function support vector machine (rbf-SVM), K-nearest neighbor (KNN) and radial basis function Gaussian process regression (rbf-GPR)], six symbolists [extreme gradient boosting (XGBoost), Cubist, random forest (RF), multiple adaptive regression splines (MARS), gradient boosting machine (GBM), and classification and regression tree (CART)] and two connectionists [principal component analysis artificial neural network (pca-ANN) and deep neural network (DNN)] were employed to learn the regression-based QSAR models for 14 public data sets comprising nine physicochemical properties and five toxicity endpoints. The results show that rbf-SVM, rbf-GPR, XGBoost and DNN generally illustrate better performances than the other algorithms. The overall performances of different algorithms can be ranked from the best to the worst as follows: rbf-SVM > XGBoost > rbf-GPR > Cubist > GBM > DNN > RF > pca-ANN > MARS > linear-GPR ≈ KNN > linear-SVM ≈ PLSR > CART ≈ PCR ≈ MLR. In terms of prediction accuracy and computational efficiency, SVM and XGBoost are recommended to the regression learning for small data sets, and XGBoost is an excellent choice for large data sets. We then investigated the performances of the ensemble models by integrating the predictions of multiple ML algorithms. 
The results illustrate that the ensembles of two or three algorithms in different categories can indeed improve the predictions of the best individual ML algorithms.
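The idea of integrating the predictions of several regression models can be illustrated with a minimal sketch. It assumes the simplest possible combination rule, an unweighted mean of per-compound predictions, which is not necessarily the scheme used in the study; the function names and all numbers are hypothetical.

```python
import math
from typing import List, Sequence


def average_ensemble(*model_predictions: Sequence[float]) -> List[float]:
    """Average the predictions of several models, compound by compound."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed values and predictions from two models.
# The errors are constructed to have opposite signs so that they cancel.
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]  # systematically too high
model_b = [0.5, 1.5, 2.5, 3.5]  # systematically too low

ensemble = average_ensemble(model_a, model_b)
print(rmse(y_true, model_a), rmse(y_true, model_b), rmse(y_true, ensemble))
```

The toy data are deliberately idealized: errors of real models rarely cancel exactly, so any gain from averaging is usually smaller and depends on how uncorrelated the individual errors are.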
Affiliation(s)
- Zhenxing Wu
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Minfeng Zhu
- Xiangya School of Pharmaceutical Sciences, Central South University, P. R. China
- Yu Kang
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Elaine Lai-Han Leung
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, P. R. China
- Tailong Lei
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Chao Shen
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Dejun Jiang
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Zhe Wang
- College of Pharmaceutical Sciences, Hangzhou Institute of Innovative Medicine, Zhejiang University, P. R. China
- Tingjun Hou
- Peking University, China. He is currently a professor in the College of Pharmaceutical Sciences, Zhejiang University, China
|
13
|
Molecular Docking Reveals the Binding Modes of Anticancer Alkylphospholipids and Lysophosphatidylcholine within the Catalytic Domain of Cytidine Triphosphate: Phosphocholine Cytidyltransferase. EUR J LIPID SCI TECH 2020. [DOI: 10.1002/ejlt.201900422] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
14
|
Wang Y, Chen X. A joint optimization QSAR model of fathead minnow acute toxicity based on a radial basis function neural network and its consensus modeling. RSC Adv 2020; 10:21292-21308. [PMID: 35518745 PMCID: PMC9054390 DOI: 10.1039/d0ra02701d] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Accepted: 05/24/2020] [Indexed: 01/07/2023] Open
Abstract
Acute toxicity to the fathead minnow (Pimephales promelas) is an important indicator for evaluating the hazards and risks of compounds in aquatic environments. The aim of our study is to explore the predictive power of a quantitative structure-activity relationship (QSAR) model based on a radial basis function (RBF) neural network with a joint optimization method, to study the acute toxicity mechanism, and to develop a potential acute toxicity prediction model for the fathead minnow. To ensure the symmetry and fairness of the data splitting and to generate multiple chemically diverse training and validation sets, we used a self-organizing mapping (SOM) neural network to split the modeling dataset (containing 955 compounds) characterized by PaDEL descriptors. After preliminary selection of descriptors via the mean decrease impurity method, a hybrid quantum particle swarm optimization (HQPSO) algorithm was used to jointly optimize the parameters of the RBF network and select the key descriptors. We established 20 RBF-based QSAR models, and the statistical results showed that the 10-fold cross-validation results ($R^2_{cv10}$) and the adjusted coefficients of determination ($R^2_{adj}$) were all greater than 0.7 and 0.8, respectively. The $Q^2_{ext}$ of these models was between 0.6480 and 0.7317, and the $R^2_{ext}$ was between 0.6563 and 0.7318. Combining the frequency and importance of the descriptors used in the RBF-based models with the correlation between the descriptors and acute toxicity, we concluded that the water distribution coefficient, molar refractivity, and first ionization potential are important factors affecting acute toxicity to the fathead minnow. A consensus QSAR model with RBF-based models was established; this model showed good performance with $R^2$ = 0.9118, $R^2_{cv10}$ = 0.7632, and $Q^2_{ext}$ = 0.7430. A frequency weighted and distance (FWD)-based applicability domain (AD) definition method was proposed, and the outliers were analyzed carefully. Compared with previous studies, the method proposed in this paper has obvious advantages, and its robustness and external predictive power are also better than those of an XGBoost-based model. It is an effective QSAR modeling method.
Affiliation(s)
- Yukun Wang
- School of Chemical Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
- Xuebo Chen
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
|
15
|
Valsecchi C, Grisoni F, Consonni V, Ballabio D. Consensus versus Individual QSARs in Classification: Comparison on a Large-Scale Case Study. J Chem Inf Model 2020; 60:1215-1223. [PMID: 32073844 PMCID: PMC7997107 DOI: 10.1021/acs.jcim.9b01057] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]
Abstract
Consensus strategies have been widely applied in many different scientific fields, based on the assumption that the fusion of several sources of information increases the outcome reliability. Despite the widespread application of consensus approaches, their advantages in quantitative structure–activity relationship (QSAR) modeling have not been thoroughly evaluated, mainly due to the lack of appropriate large-scale data sets. In this study, we evaluated the advantages and drawbacks of consensus approaches compared to single classification QSAR models. To this end, we used a data set of three properties (androgen receptor binding, agonism, and antagonism) for approximately 4000 molecules with predictions performed by more than 20 QSAR models, made available in a large-scale collaborative project. The individual QSAR models were compared with two consensus approaches, majority voting and the Bayes consensus with discrete probability distributions, in both protective and nonprotective forms. Consensus strategies proved to be more accurate and to better cover the analyzed chemical space than individual QSARs on average, thus motivating their widespread application for property prediction. Scripts and data to reproduce the results of this study are available for download.
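As an illustration of the majority-voting idea named in this abstract, the following is a minimal sketch and not the authors' published scripts. It assumes that a model with no prediction for a molecule (for example, because the molecule is outside its applicability domain) is encoded as `None`, and that ties yield no consensus; both choices, the function names, and the data are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def majority_vote(predictions: Sequence[Optional[int]]) -> Optional[int]:
    """Return the most frequent class label among the models that voted.

    Returns None when no model produced a prediction or when the two
    most frequent labels are tied (assumed abstention rule).
    """
    votes = [p for p in predictions if p is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def coverage(consensus: Sequence[Optional[int]]) -> float:
    """Fraction of molecules that received a consensus prediction."""
    return sum(c is not None for c in consensus) / len(consensus)


# Hypothetical outputs of three models (columns) for four molecules (rows);
# 1 = active, 0 = inactive, None = no prediction from that model.
model_outputs: List[List[Optional[int]]] = [
    [1, 1, None],
    [0, 1, 0],
    [None, None, None],
    [1, 0, None],
]

consensus = [majority_vote(row) for row in model_outputs]
print(consensus, coverage(consensus))
```

Because a molecule is covered as soon as a clear majority exists among the models that did predict it, a consensus of this kind can cover chemicals that individual models leave unpredicted.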
Affiliation(s)
- Cecile Valsecchi
- Milano Chemometrics and QSAR Research Group, University of Milano Bicocca, P.za della Scienza 1, 20126 Milano, Italy
- Francesca Grisoni
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 4, 8049 Zurich, Switzerland
- Viviana Consonni
- Milano Chemometrics and QSAR Research Group, University of Milano Bicocca, P.za della Scienza 1, 20126 Milano, Italy
- Davide Ballabio
- Milano Chemometrics and QSAR Research Group, University of Milano Bicocca, P.za della Scienza 1, 20126 Milano, Italy
|
16
|
Vukovic K, Gadaleta D, Benfenati E. Methodology of aiQSAR: a group-specific approach to QSAR modelling. J Cheminform 2019; 11:27. [PMID: 30945010 PMCID: PMC6446381 DOI: 10.1186/s13321-019-0350-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2019] [Accepted: 03/25/2019] [Indexed: 12/26/2022] Open
Abstract
Background Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One approach that seems under-explored is at-the-runtime generation of local models specific to individual compounds. This approach was quite likely limited by the computational requirements, but with current increases in processing power and the widespread availability of cluster-computing infrastructure, this limitation is no longer that severe. Results We propose a new QSAR methodology: aiQSAR, whose aim is to generate endpoint predictions directly from the input dataset by building an array of local models generated at-the-runtime and specific for each compound in the dataset. The local group of each compound is selected on the basis of fingerprint similarities and the final prediction is calculated by integrating the results of a number of autonomous mathematical models. The method is applicable to regression, binary classification and multi-class classification and was tested on one dataset for each endpoint type: bioconcentration factor (BCF) for regression, Ames test for binary classification and Environmental Protection Agency (EPA) acute rat oral toxicity ranking for multi-class classification. As part of this method, the applicability domain of each prediction is assessed through the applicability domain measure, calculated on the basis of the fingerprint similarities in each local group of compounds. Conclusions We outline the methodology for a new QSAR-based predictive tool whose advantages are automation, group-specific approach to modelling and simplicity of execution. Our aim now will be to develop this method into a stand-alone software tool. We hope that eventual adoption of our tool would make QSAR modelling more accessible and transparent. 
Our methodology could be used as an initial modelling step, to predict new compounds by simply loading the training dataset as an input. Predictions could then be further evaluated and refined either by other tools or through optimization of aiQSAR parameters.
Affiliation(s)
- Kristijan Vukovic
- Istituto di Ricerche Farmacologiche Mario Negri-IRCCS, Via Mario Negri 2, 20156, Milan, Italy; Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000, Ljubljana, Slovenia
- Domenico Gadaleta
- Istituto di Ricerche Farmacologiche Mario Negri-IRCCS, Via Mario Negri 2, 20156, Milan, Italy
- Emilio Benfenati
- Istituto di Ricerche Farmacologiche Mario Negri-IRCCS, Via Mario Negri 2, 20156, Milan, Italy
|
17
|
Grisoni F, Consonni V, Vighi M. Acceptable-by-design QSARs to predict the dietary biomagnification of organic chemicals in fish. INTEGRATED ENVIRONMENTAL ASSESSMENT AND MANAGEMENT 2019; 15:51-63. [PMID: 30447095 DOI: 10.1002/ieam.4106] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/12/2018] [Revised: 09/20/2018] [Accepted: 11/08/2018] [Indexed: 06/09/2023]
Abstract
This work presents the first QSAR approach to predict the laboratory-based fish biomagnification factor (BMF) of organic chemicals, to be used as a supporting tool for assessing bioaccumulation at the regulatory level. The developed strategy is based on 2 levels of prediction, with a varying trade-off between interpretability and performance according to the user's needs. We designed our models to be intrinsically acceptable at the regulatory level (in what we defined as an "acceptable-by-design" strategy), by (i) complying with OECD principles directly in the approach development phase, (ii) choosing easy-to-apply modeling techniques, (iii) preferring simple descriptors when possible, and (iv) striving to provide data-driven mechanistic insights. Our novel tool has an error comparable to the observed experimental inter- and intraspecies variability and is stable on borderline compounds (root mean square error [RMSE] ranging from RMSE = 0.45 to RMSE = 0.45 log units on test data). Additionally, the models' molecular descriptors are carefully described and interpreted, allowing us to gather additional mechanistic insights into the structural features controlling the dietary bioaccumulation of chemicals in fish. To improve the transparency and promote the application of the model, the data set and the stand-alone prediction tool are provided free of charge at https://github.com/grisoniFr/bmf_qsar. Integr Environ Assess Manag 2019;15:51-63. © 2018 SETAC.
Affiliation(s)
- Francesca Grisoni
- Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Department of Earth and Environmental Sciences, Milano, Italy
- Viviana Consonni
- Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Department of Earth and Environmental Sciences, Milano, Italy
- Marco Vighi
- IMDEA Water Institute, Alcalá de Henares, Madrid, Spain
|
18
|
Ballabio D, Grisoni F, Consonni V, Todeschini R. Integrated QSAR Models to Predict Acute Oral Systemic Toxicity. Mol Inform 2018; 38:e1800124. [PMID: 30549437 DOI: 10.1002/minf.201800124] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2018] [Accepted: 11/26/2018] [Indexed: 11/07/2022]
Abstract
The ICCVAM Acute Toxicity Workgroup (U.S. Department of Health and Human Services), in collaboration with the U.S. Environmental Protection Agency (U.S. EPA, National Center for Computational Toxicology), coordinated the "Predictive Models for Acute Oral Systemic Toxicity" collaborative project to develop in silico models to predict acute oral systemic toxicity for filling regulatory needs. In this framework, new Quantitative Structure-Activity Relationship (QSAR) models for the prediction of very toxic (LD50 lower than 50 mg/kg) and nontoxic (LD50 greater than or equal to 2,000 mg/kg) endpoints were developed, as described in this study. Models were developed on a large set of chemicals (8992), provided by the project coordinators, considering the five OECD principles for QSAR applicability to regulatory endpoints. A Bayesian consensus approach integrating three different classification QSAR algorithms was applied as the modelling method. For both endpoints considered, the proposed approach proved to be robust and predictive, as determined by a blind validation on a set of external molecules provided at a later stage by the coordinators of the collaborative project. Finally, the integration of predictions obtained for the very toxic and nontoxic endpoints allowed the identification of compounds associated with medium toxicity, as well as the analysis of consistency between the predictions obtained for the two endpoints on the same molecules. Predictions of the proposed consensus approach will be integrated with those originating from models proposed by the participants of the collaborative project to facilitate the regulatory acceptance of in silico predictions and thus reduce or replace experimental tests for acute toxicity.
Affiliation(s)
- Davide Ballabio
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1, 20126, Milano, Italy
- Francesca Grisoni
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1, 20126, Milano, Italy
- Viviana Consonni
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1, 20126, Milano, Italy
- Roberto Todeschini
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1, 20126, Milano, Italy
|
19
|
Barrett R, Jiang S, White AD. Classifying antimicrobial and multifunctional peptides with Bayesian network models. Pept Sci (Hoboken) 2018. [DOI: 10.1002/pep2.24079] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]
Affiliation(s)
- Rainier Barrett
- Department of Chemical Engineering, University of Rochester, Rochester, New York
- Shaoyi Jiang
- Department of Chemical Engineering, University of Washington, Seattle, Washington
- Andrew D. White
- Department of Chemical Engineering, University of Rochester, Rochester, New York
|
20
|
Jagiello K, Makurat S, Pereć S, Rak J, Puzyn T. Molecular features of thymidine analogues governing the activity of human thymidine kinase. Struct Chem 2018. [DOI: 10.1007/s11224-018-1124-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/16/2022]
|
21
|
Mansouri K, Grulke CM, Judson RS, Williams AJ. OPERA models for predicting physicochemical properties and environmental fate endpoints. J Cheminform 2018. [PMID: 29520515 PMCID: PMC5843579 DOI: 10.1186/s13321-018-0263-1] [Citation(s) in RCA: 271] [Impact Index Per Article: 45.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
Abstract
The collection of chemical structure information and associated experimental data for quantitative structure–activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models highly depends on the quality of the data and modeling methodology used. This study aims to develop robust QSAR/QSPR models for chemical properties of environmental interest that can be used for regulatory purposes. This study primarily uses data from the publicly available PHYSPROP database consisting of a set of 13 common physicochemical and environmental fate properties. These datasets have undergone extensive curation using an automated workflow to select only high-quality data, and the chemical structures were standardized prior to calculation of the molecular descriptors. The modeling procedure was developed based on the five Organization for Economic Cooperation and Development (OECD) principles for QSAR models. A weighted k-nearest neighbor approach was adopted using a minimum number of required descriptors calculated using PaDEL, an open-source software. The genetic algorithms selected only the most pertinent and mechanistically interpretable descriptors (2–15, with an average of 11 descriptors). The sizes of the modeled datasets varied from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75%) and validated using fivefold cross-validation (CV) and test sets (25%). The CV Q2 of the models varied from 0.72 to 0.95, with an average of 0.86 and an R2 test value from 0.71 to 0.96, with an average of 0.82. Modeling and performance details are described in QSAR model reporting format and were validated by the European Commission’s Joint Research Center to be OECD compliant. 
All models are freely available as an open-source, command-line application called OPEn structure–activity/property Relationship App (OPERA). OPERA models were applied to more than 750,000 chemicals to produce freely available predicted data on the U.S. Environmental Protection Agency’s CompTox Chemistry Dashboard.
Affiliation(s)
- Kamel Mansouri
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA; Oak Ridge Institute for Science and Education, 1299 Bethel Valley Road, Oak Ridge, TN, 37830, USA; ScitoVation LLC, 6 Davis Drive, Research Triangle Park, NC, 27709, USA
- Chris M Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Richard S Judson
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Antony J Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
|
22
|
Patlewicz G, Helman G, Pradeep P, Shah I. Navigating through the minefield of read-across tools: A review of in silico tools for grouping. Comput Toxicol 2017; 3:1-18. [PMID: 30221211 DOI: 10.1016/j.comtox.2017.05.003] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]
Abstract
Read-across is a popular data gap filling technique used within analogue and category approaches for regulatory purposes. In recent years there have been many efforts focused on the challenges involved in read-across development, its scientific justification and documentation. Tools have also been developed to facilitate read-across development and application. Here, we describe a number of publicly available read-across tools in the context of the category/analogue workflow and review their respective capabilities, strengths and weaknesses. No single tool addresses all aspects of the workflow. We highlight how the different tools complement each other and some of the opportunities for their further development to address the continued evolution of read-across.
Affiliation(s)
- Grace Patlewicz
- National Center for Computational Toxicology (NCCT), Office of Research and Development, US Environmental Protection Agency, 109 TW Alexander Dr, Research Triangle Park (RTP), NC 27711, USA
- George Helman
- National Center for Computational Toxicology (NCCT), Office of Research and Development, US Environmental Protection Agency, 109 TW Alexander Dr, Research Triangle Park (RTP), NC 27711, USA; Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN, USA
- Prachi Pradeep
- National Center for Computational Toxicology (NCCT), Office of Research and Development, US Environmental Protection Agency, 109 TW Alexander Dr, Research Triangle Park (RTP), NC 27711, USA; Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN, USA
- Imran Shah
- National Center for Computational Toxicology (NCCT), Office of Research and Development, US Environmental Protection Agency, 109 TW Alexander Dr, Research Triangle Park (RTP), NC 27711, USA
|
23
|
Global versus local QSAR models for predicting ionic liquids toxicity against IPC-81 leukemia rat cell line: The predictive ability. J Mol Liq 2017. [DOI: 10.1016/j.molliq.2017.02.025] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
24
|
Kizhedath A, Wilkinson S, Glassey J. Applicability of predictive toxicology methods for monoclonal antibody therapeutics: status Quo and scope. Arch Toxicol 2016; 91:1595-1612. [PMID: 27766364 PMCID: PMC5364268 DOI: 10.1007/s00204-016-1876-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2016] [Accepted: 10/12/2016] [Indexed: 12/31/2022]
Abstract
Biopharmaceuticals, monoclonal antibody (mAb)-based therapeutics in particular, have positively impacted millions of lives. MAbs and related therapeutics are highly desirable from a biopharmaceutical perspective as they are highly target specific and well tolerated within the human system. Nevertheless, several mAbs have been discontinued or withdrawn based either on their inability to demonstrate efficacy and/or due to adverse effects. Approved monoclonal antibodies and derived therapeutics have been associated with adverse effects such as immunogenicity, cytokine release syndrome, progressive multifocal leukoencephalopathy, intravascular haemolysis, cardiac arrhythmias, abnormal liver function, gastrointestinal perforation, bronchospasm, intraocular inflammation, urticaria, nephritis, neuropathy, birth defects, fever and cough to name a few. The advances made in this field are also impeded by a lack of progress in bioprocess development strategies as well as increasing costs owing to attrition, wherein the lack of efficacy and safety accounts for nearly 60 % of all factors contributing to attrition. This reiterates the need for smarter preclinical development using quality by design-based approaches encompassing carefully designed predictive models during early stages of drug development. Different in vitro and in silico methods are extensively used for predicting biological activity as well as toxicity during small molecule drug development; however, their full potential has not been utilized for biological drug development. The scope of in vitro and in silico tools in early developmental stages of monoclonal antibody-based therapeutics production and how it contributes to lower attrition rates leading to faster development of potential drug candidates has been evaluated. The applicability of computational toxicology approaches in this context as well as the pitfalls and promises of extending such techniques to biopharmaceutical development has been highlighted.
Affiliation(s)
- Arathi Kizhedath
- Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne, NE17RU, UK; Medical Toxicology Centre, Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, NE2 4AA, UK
- Simon Wilkinson
- Medical Toxicology Centre, Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, NE2 4AA, UK
- Jarka Glassey
- Chemical Engineering and Advanced Materials, Newcastle University, Newcastle upon Tyne, NE17RU, UK
|
25
|
Sosnowska A, Barycki M, Gajewicz A, Bobrowski M, Freza S, Skurski P, Uhl S, Laux E, Journot T, Jeandupeux L, Keppner H, Puzyn T. Towards the Application of Structure-Property Relationship Modeling in Materials Science: Predicting the Seebeck Coefficient for Ionic Liquid/Redox Couple Systems. Chemphyschem 2016; 17:1591-600. [DOI: 10.1002/cphc.201600080] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2016] [Indexed: 11/09/2022]
Affiliation(s)
- Anita Sosnowska
- Laboratory of Environmental Chemometrics; Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
- Maciej Barycki
- Laboratory of Environmental Chemometrics; Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
- Agnieszka Gajewicz
- Laboratory of Environmental Chemometrics; Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
- Maciej Bobrowski
- Department of Technical Physics and Applied Mathematics; Gdansk University of Technology; Narutowicza 11/12 80-233 Gdansk Poland
- Sylwia Freza
- Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
- Piotr Skurski
- Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
- Stefanie Uhl
- HES-SO Arc; Institut des Microtechnologies Appliquees; La Chaux-de-Fonds Switzerland
- Edith Laux
- HES-SO Arc; Institut des Microtechnologies Appliquees; La Chaux-de-Fonds Switzerland
- Tony Journot
- HES-SO Arc; Institut des Microtechnologies Appliquees; La Chaux-de-Fonds Switzerland
- Laure Jeandupeux
- HES-SO Arc; Institut des Microtechnologies Appliquees; La Chaux-de-Fonds Switzerland
- Herbert Keppner
- HES-SO Arc; Institut des Microtechnologies Appliquees; La Chaux-de-Fonds Switzerland
- Tomasz Puzyn
- Laboratory of Environmental Chemometrics; Department of Chemistry; University of Gdansk; Wita Stwosza 63 80-308 Gdansk Poland
|
26
|
Cassotti M, Ballabio D, Todeschini R, Consonni V. A similarity-based QSAR model for predicting acute toxicity towards the fathead minnow (Pimephales promelas). SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2015; 26:217-243. [PMID: 25780951 DOI: 10.1080/1062936x.2015.1018938] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
REACH regulation demands information about acute toxicity of chemicals towards fish and supports the use of QSAR models, provided compliance with OECD principles. Existing models present some drawbacks that may limit their regulatory application. In this study, a dataset of 908 chemicals was used to develop a QSAR model to predict the LC50 96 hours for the fathead minnow. Genetic algorithms combined with k nearest neighbour method were applied on the training set (726 chemicals) and resulted in a model based on six molecular descriptors. An automated assessment of the applicability domain (AD) was carried out by comparing the average distance of each molecule from the nearest neighbours with a fixed threshold. The model had good and balanced performance in internal and external validation (182 test molecules), at the expense of a percentage of molecules outside the AD. Principal Component Analysis showed apparent correlations between model descriptors and toxicity.
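The applicability-domain check described in this abstract (average distance to the nearest neighbours compared with a fixed threshold) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Euclidean metric, the value of `k`, the threshold and the toy descriptor values are all assumptions, and the function names are hypothetical.

```python
import math

def mean_knn_distance(train, query, k):
    """Mean Euclidean distance from the query to its k nearest training molecules."""
    dists = sorted(math.dist(row, query) for row in train)
    return sum(dists[:k]) / k

def in_applicability_domain(train, query, k=3, threshold=1.0):
    """True when the query lies close enough to the training set (assumed rule)."""
    return mean_knn_distance(train, query, k) <= threshold

# Hypothetical training descriptors (two descriptors per molecule).
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
inside = in_applicability_domain(train, [0.5, 0.5])   # close to every training point
outside = in_applicability_domain(train, [5.0, 5.0])  # far from the training set
```

Raising the threshold admits more molecules into the domain at the likely cost of less reliable predictions, which is the trade-off the abstract alludes to.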
Affiliation(s)
- M Cassotti
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milano, Italy
27
Sun L, Zhang C, Chen Y, Li X, Zhuang S, Li W, Liu G, Lee PW, Tang Y. In silico prediction of chemical aquatic toxicity with chemical category approaches and substructural alerts. Toxicol Res (Camb) 2015. [DOI: 10.1039/c4tx00174e] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/21/2022] Open
Abstract
Aquatic toxicity is an important endpoint in the evaluation of chemically adverse effects on ecosystems.
Affiliation(s)
- Lu Sun
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Chen Zhang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yingjie Chen
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Xiao Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Shulin Zhuang
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
- Weihua Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Philip W. Lee
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
28
Cassotti M, Consonni V, Mauri A, Ballabio D. Validation and extension of a similarity-based approach for prediction of acute aquatic toxicity towards Daphnia magna. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2014; 25:1013-1036. [PMID: 25482581 DOI: 10.1080/1062936x.2014.977818] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 07/17/2014] [Accepted: 09/15/2014] [Indexed: 06/04/2023]
Abstract
Quantitative structure-activity relationship (QSAR) models for predicting acute toxicity to Daphnia magna are often associated with poor performance, urging the need for improvement to meet REACH requirements. The aim of this study was to evaluate the accuracy, stability and reliability of a previously published QSAR model by means of further external validation and to optimize its performance by means of extension to new data as well as a consensus approach. The previously published model was validated with a large set of new molecules and then compared with the ChemProp model, from which most of the validation data were taken. Results showed better performance of the proposed model in terms of accuracy and percentage of molecules outside the applicability domain. The model was re-calibrated on all the available data to confirm the efficacy of the similarity-based approach. The extended dataset was also used to develop a novel model based on the same similarity approach but using binary fingerprints to describe the chemical structures. The fingerprint-based model gave lower regression statistics, but also fewer unpredicted compounds. Eventually, consensus modelling was successfully used to enhance the accuracy of the predictions and to halve the percentage of molecules outside the applicability domain.
Affiliation(s)
- M Cassotti
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
29
Doucet JP, Doucet-Panaye A. Structure-activity relationship study of trifluoromethylketone inhibitors of insect juvenile hormone esterase: comparison of several classification methods. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2014; 25:589-616. [PMID: 24884820 DOI: 10.1080/1062936x.2014.919959] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/03/2023]
Abstract
Juvenile hormone esterase (JHE) plays a key role in the development and metamorphosis of holometabolous insects. Its inhibitors could possibly be targeted for insect control. Conversely, JHE may also be involved in endocrine disruption by xenobiotics, resulting in detrimental effects in beneficial insects. There is therefore a need to know the structural characteristics of the molecules able to monitor JHE activity, and to develop SAR and QSAR studies to estimate their effectiveness. For a large diverse population of 181 trifluoromethylketones (TFKs) - the most potent JHE inhibitors known to date - we recently proposed a binary classification (active/inactive) using a support vector machine and Codessa structural descriptors. We have now examined, using the same data set and with the same descriptors, the applicability and performance of five other machine learning approaches. These have been shown able to handle high dimensional data (with descriptors possibly irrelevant or redundant) and to cope with complex mechanisms, but without delivering explicit directly exploitable models. Splitting the data into five batches (training set 80%, test set 20%) and carrying out leave-one-out cross-validation, led to good results of comparable performance, consistent with our previous support vector classifier (SVC) results. Accuracy was greater than 0.80 for all approaches. A reduced set of 15 descriptors common to all the investigated approaches showed good predictive ability (confirmed using a three-layer perceptron) and gives some clues regarding a mechanistic interpretation.
Affiliation(s)
- J P Doucet
- Itodys, Université Paris-Diderot, UMR 7086, Paris, France
30
Estimation of acute oral toxicity in rat using local lazy learning. J Cheminform 2014; 6:26. [PMID: 24959207 PMCID: PMC4047767 DOI: 10.1186/1758-2946-6-26] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/28/2014] [Accepted: 05/06/2014] [Indexed: 01/19/2023] Open
Abstract
Background: Acute toxicity means the ability of a substance to cause adverse effects within a short period following dosing or exposure, and its assessment is usually the first step in the toxicological investigation of unknown substances. The median lethal dose, LD50, is frequently used as a general indicator of a substance's acute toxicity, and there is high demand for developing non-animal-based prediction of LD50. Unfortunately, it is difficult to accurately predict compound LD50 using a single QSAR model, because acute toxicity may involve complex mechanisms and multiple biochemical processes. Results: In this study, we reported the use of local lazy learning (LLL) methods, which could capture subtle local structure-toxicity relationships around each query compound, to develop LD50 prediction models: (a) local lazy regression (LLR): a linear regression model built using k neighbors; (b) SA: the arithmetical mean of the activities of k nearest neighbors; (c) SR: the weighted mean of the activities of k nearest neighbors; (d) GP: the projection point of the compound on the line defined by its two nearest neighbors. We defined the applicability domain (AD) to decide to what extent and under what circumstances the prediction is reliable. In the end, we developed a consensus model based on the predicted values of individual LLL models, yielding a correlation coefficient R² of 0.712 on a test set containing 2,896 compounds. Conclusion: Encouraged by the promising results, we expect that our consensus LLL model of LD50 would become a useful tool for predicting acute toxicity. All models developed in this study are available via http://www.dddc.ac.cn/admetus.
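Two of the local lazy learning variants listed above, SA (arithmetic mean of the k nearest neighbours) and SR (weighted mean of the k nearest neighbours), can be illustrated with a short sketch. Several details are assumptions rather than the paper's method: Euclidean distance on toy descriptors, inverse-distance weights for SR, and a plain average as the consensus rule. All function names are hypothetical.

```python
import math

def k_nearest(train, query, k):
    """Return the k (distance, activity) pairs closest to the query."""
    pairs = sorted((math.dist(x, query), y) for x, y in train)
    return pairs[:k]

def predict_sa(train, query, k):
    """SA: arithmetic mean of the activities of the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return sum(y for _, y in neighbours) / len(neighbours)

def predict_sr(train, query, k, eps=1e-9):
    """SR: weighted mean of the activities (inverse-distance weights are assumed)."""
    neighbours = k_nearest(train, query, k)
    weights = [1.0 / (d + eps) for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)

def consensus(predictions):
    """Assumed consensus rule: unweighted mean of the individual predictions."""
    return sum(predictions) / len(predictions)

# Hypothetical training data: (descriptor vector, activity value).
train = [([0.0], 1.0), ([1.0], 2.0), ([3.0], 4.0)]
sa = predict_sa(train, [0.25], k=2)
sr = predict_sr(train, [0.25], k=2)
combined = consensus([sa, sr])
```

Because every prediction is built only from the neighbours of the query compound, nothing is fitted ahead of time, which is what makes the approach "lazy".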
31
Sheridan RP. Global Quantitative Structure–Activity Relationship Models vs Selected Local Models as Predictors of Off-Target Activities for Project Compounds. J Chem Inf Model 2014; 54:1083-92. [DOI: 10.1021/ci500084w] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/23/2023]
Affiliation(s)
- Robert P. Sheridan
- Cheminformatics Department, RY800-D133, Merck Research Laboratories, Rahway, New Jersey 07065, United States
32
Direct QSPR: the most efficient way of predicting organic carbon/water partition coefficient (log KOC) for polyhalogenated POPs. Struct Chem 2014. [DOI: 10.1007/s11224-014-0419-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 10/25/2022]
33
Slavov SH, Pearce BA, Buzatu DA, Wilkes JG, Beger RD. Complementary PLS and KNN algorithms for improved 3D-QSDAR consensus modeling of AhR binding. J Cheminform 2013; 5:47. [PMID: 24257141 PMCID: PMC3843526 DOI: 10.1186/1758-2946-5-47] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 08/21/2013] [Accepted: 11/15/2013] [Indexed: 11/10/2022] Open
Abstract
Multiple validation techniques (Y-scrambling, complete training/test set randomization, determination of the dependence of R2test on the number of randomization cycles, etc.) aimed to improve the reliability of the modeling process were utilized and their effect on the statistical parameters of the models was evaluated. A consensus partial least squares (PLS)-similarity based k-nearest neighbors (KNN) model utilizing 3D-SDAR (three dimensional spectral data-activity relationship) fingerprint descriptors for prediction of the log(1/EC50) values of a dataset of 94 aryl hydrocarbon receptor binders was developed. This consensus model was constructed from a PLS model utilizing 10 ppm x 10 ppm x 0.5 Å bins and 7 latent variables (R2test of 0.617), and a KNN model using 2 ppm x 2 ppm x 0.5 Å bins and 6 neighbors (R2test of 0.622). Compared to individual models, improvement in predictive performance of approximately 10.5% (R2test of 0.685) was observed. Further experiments indicated that this improvement is likely an outcome of the complementarity of the information contained in 3D-SDAR matrices of different granularity. For similarly sized data sets of Aryl hydrocarbon (AhR) binders the consensus KNN and PLS models compare favorably to earlier reports. The ability of 3D-QSDAR (three dimensional quantitative spectral data-activity relationship) to provide structural interpretation was illustrated by a projection of the most frequently occurring bins on the standard coordinate space, thus allowing identification of structural features related to toxicity.
Affiliation(s)
- Richard D Beger
- Division of Systems Biology, National Center for Toxicological Research, US Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.
34
Low Y, Sedykh A, Fourches D, Golbraikh A, Whelan M, Rusyn I, Tropsha A. Integrative chemical-biological read-across approach for chemical hazard classification. Chem Res Toxicol 2013; 26:1199-208. [PMID: 23848138 DOI: 10.1021/tx400110f] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Indexed: 12/23/2022]
Abstract
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here, we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from both chemical and biological analogues whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models.
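The Tanimoto coefficient mentioned in this abstract, together with a simple similarity-weighted read-across, can be sketched as below. This is only an illustration under stated assumptions: fingerprints are treated as sets of "on" feature indices, the weighting rule is a generic similarity-weighted average rather than the published CBRA formula, and the function names and toy data are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two binary fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def read_across(query_fp, analogues, k=2):
    """Similarity-weighted mean label of the k most similar analogues.

    analogues: list of (fingerprint, label) with label 1 = toxic, 0 = non-toxic.
    Returns None when no analogue shares any feature with the query.
    """
    scored = sorted(
        ((tanimoto(query_fp, fp), label) for fp, label in analogues),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None
    return sum(sim * label for sim, label in scored) / total

# Hypothetical fingerprints and hazard labels.
analogues = [({1, 2, 3, 4}, 1), ({2, 3}, 0), ({7, 8}, 1)]
score = read_across({1, 2, 3}, analogues, k=2)
```

In a chemical-biological setting, the same routine could in principle be run once on structural fingerprints and once on binarised assay profiles, with the two sets of neighbours pooled before weighting; that pooling step is an assumption here, not a detail given in the abstract.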
Affiliation(s)
- Yen Low
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, USA
35
Mansouri K, Ringsted T, Ballabio D, Todeschini R, Consonni V. Quantitative Structure–Activity Relationship Models for Ready Biodegradability of Chemicals. J Chem Inf Model 2013; 53:867-78. [DOI: 10.1021/ci4000213] [Citation(s) in RCA: 126] [Impact Index Per Article: 11.5] [Indexed: 01/20/2023]
Affiliation(s)
- Kamel Mansouri
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano Bicocca, Milano, Italy
- Tine Ringsted
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano Bicocca, Milano, Italy
- Davide Ballabio
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano Bicocca, Milano, Italy
- Roberto Todeschini
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano Bicocca, Milano, Italy
- Viviana Consonni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano Bicocca, Milano, Italy
36
Piir G, Sild S, Maran U. Comparative analysis of local and consensus quantitative structure-activity relationship approaches for the prediction of bioconcentration factor. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:175-199. [PMID: 23410132 DOI: 10.1080/1062936x.2012.762426] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/01/2023]
Abstract
Quantitative structure-activity relationships (QSARs) are broadly classified as global or local, depending on their molecular constitution. Global models use large and diverse training sets covering a wide range of chemical space. Local models focus on smaller structurally or chemically similar subsets that are conventionally selected by human experts or alternatively using clustering analysis. The current study focuses on the comparative analysis of different clustering algorithms (expectation-maximization, K-means and hierarchical) for seven different descriptor sets as structural characteristics and two rule-based approaches to select subsets for designing local QSAR models. A total of 111 local QSAR models are developed for predicting bioconcentration factor. Predictions from local models were compared with corresponding predictions from the global model. The comparison of coefficients of determination (r²) and standard deviations for local models with similar subsets from the global model shows improved prediction quality in 97% of cases. The descriptor content of derived QSARs is discussed and analyzed. Local QSAR models were further consolidated within the framework of a consensus approach. All different consensus approaches increased performance over the global and local models. The consensus approach reduced the number of strongly deviating predictions by evening out prediction errors, which were produced by some local QSARs.
Affiliation(s)
- G Piir
- Institute of Chemistry, University of Tartu, Tartu, Estonia
37
Benchmarking ligand-based virtual High-Throughput Screening with the PubChem database. Molecules 2013; 18:735-56. [PMID: 23299552 PMCID: PMC3759399 DOI: 10.3390/molecules18010735] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 09/26/2012] [Revised: 10/11/2012] [Accepted: 12/17/2012] [Indexed: 01/04/2023] Open
Abstract
With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a TPR cutoff of 25% are observed.
38
Quantitative structure-activity relationships for organophosphates binding to acetylcholinesterase. Arch Toxicol 2012; 87:281-9. [PMID: 22990135 DOI: 10.1007/s00204-012-0934-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 04/25/2012] [Accepted: 08/28/2012] [Indexed: 10/27/2022]
Abstract
Organophosphates are a group of pesticides and chemical warfare nerve agents that inhibit acetylcholinesterase, the enzyme responsible for hydrolysis of the excitatory neurotransmitter acetylcholine. Numerous structural variants exist for this chemical class, and data regarding their toxicity can be difficult to obtain in a timely fashion. At the same time, their use as pesticides and military weapons is widespread, which presents a major concern and challenge in evaluating human toxicity. To address this concern, a quantitative structure-activity relationship (QSAR) was developed to predict pentavalent organophosphate oxon human acetylcholinesterase bimolecular rate constants. A database of 278 three-dimensional structures and their bimolecular rates was developed from 15 peer-reviewed publications. A database of simplified molecular input line entry notations and their respective acetylcholinesterase bimolecular rate constants is listed in Supplementary Material, Table I. The database was quite diverse, spanning 7 log units of activity. In order to describe their structure, 675 molecular descriptors were calculated using AMPAC 8.0 and CODESSA 2.7.10. Orthogonal projection to latent structures regression, bootstrap leave-random-many-out cross-validation and y-randomization were used to develop an externally validated consensus QSAR model. The domain of applicability was assessed by the Williams plot. Six external compounds were outside the warning leverage, indicating potential model extrapolation. A number of compounds had residuals >2 or <-2, indicating potential outliers or activity cliffs. The results show that the HOMO-LUMO energy gap contributed most significantly to the binding affinity. A mean training R² of 0.80, a mean test set R² of 0.76 and a consensus external test set R² of 0.66 were achieved using the QSAR. The training and external test set RMSE values were found to be 0.76 and 0.88.
The results suggest that this QSAR model can be used in physiologically based pharmacokinetic/pharmacodynamic models of organophosphate toxicity to determine the rate of acetylcholinesterase inhibition.
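The leverage check behind a Williams plot can be illustrated with a deliberately simplified sketch. It assumes a one-descriptor linear model with an intercept, for which the leverage has the closed form h = 1/n + (x - mean)² / Sxx, and it uses the commonly quoted warning leverage h* = 3(p + 1)/n; neither the model form nor the cutoff is stated in the abstract, and the data and function names are hypothetical.

```python
def leverages(train_x, query_x):
    """Leverage of each query value for a one-descriptor model with intercept."""
    n = len(train_x)
    mean = sum(train_x) / n
    sxx = sum((x - mean) ** 2 for x in train_x)
    return [1.0 / n + (x - mean) ** 2 / sxx for x in query_x]

def warning_leverage(n_train, n_descriptors):
    """Commonly used cutoff h* = 3(p + 1)/n (an assumed convention)."""
    return 3.0 * (n_descriptors + 1) / n_train

# Hypothetical descriptor values for training and external compounds.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
external_x = [3.0, 5.0, 10.0]

h_star = warning_leverage(len(train_x), n_descriptors=1)
h_values = leverages(train_x, external_x)
extrapolated = [h > h_star for h in h_values]
```

A compound whose leverage exceeds h* sits outside the descriptor range spanned by the training set, so its prediction is an extrapolation; the second axis of a Williams plot, standardized residuals with cutoffs such as ±2, is not covered by this sketch.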
39
Rusyn I, Sedykh A, Low Y, Guyton KZ, Tropsha A. Predictive modeling of chemical hazard by integrating numerical descriptors of chemical structures and short-term toxicity assay data. Toxicol Sci 2012; 127:1-9. [PMID: 22387746 DOI: 10.1093/toxsci/kfs095] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 12/17/2022] Open
Abstract
Quantitative structure-activity relationship (QSAR) models are widely used for in silico prediction of in vivo toxicity of drug candidates or environmental chemicals, adding value to candidate selection in drug development or in a search for less hazardous and more sustainable alternatives for chemicals in commerce. The development of traditional QSAR models is enabled by numerical descriptors representing the inherent chemical properties that can be easily defined for any number of molecules; however, traditional QSAR models often have limited predictive power due to the lack of data and complexity of in vivo endpoints. Although it has been indeed difficult to obtain experimentally derived toxicity data on a large number of chemicals in the past, the results of quantitative in vitro screening of thousands of environmental chemicals in hundreds of experimental systems are now available and continue to accumulate. In addition, publicly accessible toxicogenomics data collected on hundreds of chemicals provide another dimension of molecular information that is potentially useful for predictive toxicity modeling. These new characteristics of molecular bioactivity arising from short-term biological assays, i.e., in vitro screening and/or in vivo toxicogenomics data can now be exploited in combination with chemical structural information to generate hybrid QSAR-like quantitative models to predict human toxicity and carcinogenicity. Using several case studies, we illustrate the benefits of a hybrid modeling approach, namely improvements in the accuracy of models, enhanced interpretation of the most predictive features, and expanded applicability domain for wider chemical space coverage.
Affiliation(s)
- Ivan Rusyn
- Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, North Carolina 27599, USA.
40
Krasowski MD, Hopfinger AJ. The discovery of new anesthetics by targeting GABAA receptors. Expert Opin Drug Discov 2011; 6:1187-201. [DOI: 10.1517/17460441.2011.627324] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022]
41
Hewitt M, Cronin MTD, Rowe PH, Schultz TW. Repeatability analysis of the Tetrahymena pyriformis population growth impairment assay. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2011; 22:621-637. [PMID: 21830879 DOI: 10.1080/1062936x.2011.604100] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/31/2023]
Abstract
Assessments necessary to ensure the safety of both humans and the environment are challenged by the sheer number of chemicals in use today. Chemical legislation, such as REACH, aims to use alternative methods to reduce the reliance on in vivo animal testing. Consequently, databases such as the TETRATOX database, containing data from the Tetrahymena pyriformis population growth impairment assay, have been used extensively to develop computational models which aid in priority setting and initial hazard assessments. To use any toxicological data, an assessment of quality is required. One important aspect of quality is the repeatability of the assay. This study considered TETRATOX assay data for 85 structurally and mechanistically diverse compounds. The repeatability of replicate determinations was assessed and factors relating to repeatability are discussed. Despite the majority of compounds demonstrating excellent repeatability, it was found that the mechanism of action is likely to be a modulating factor, with compounds acting via electrophilic mechanisms being more likely to exhibit reduced repeatability than those acting via narcotic mechanisms. It is evident from this study that the TETRATOX assay is a robust and highly repeatable assay, suitable for use in toxicological modelling studies and priority setting.
Affiliation(s)
- M Hewitt
- School of Pharmacy and Chemistry, Liverpool John Moores University, Liverpool, UK
42
Lagunin A, Zakharov A, Filimonov D, Poroikov V. QSAR Modelling of Rat Acute Toxicity on the Basis of PASS Prediction. Mol Inform 2011; 30:241-50. [PMID: 27466777 DOI: 10.1002/minf.201000151] [Citation(s) in RCA: 187] [Impact Index Per Article: 14.4] [Received: 10/31/2010] [Accepted: 02/23/2011] [Indexed: 11/07/2022]
Abstract
A method for QSAR modelling of rat acute toxicity based on the combination of QNA (Quantitative Neighbourhoods of Atoms) descriptors, PASS (Prediction of Activity Spectra for Substances) predictions and self-consistent regression (SCR) is presented. PASS-predicted biological activity profiles are used as independent input variables for QSAR modelling with SCR. QSAR models were developed using LD50 values for compounds tested on rats with four types of administration (oral, intravenous, intraperitoneal, subcutaneous). The proposed method was evaluated on the set of compounds tested for acute rat toxicity with oral administration (7286 compounds) used for testing the known QSAR methods in the T.E.S.T. 3.0 program (U.S. EPA). Several other sets of compounds tested for acute rat toxicity by different routes of administration, selected from the SYMYX MDL Toxicity Database, were also used. The method was compared with the results of prediction of acute rodent toxicity for noncongeneric sets obtained by ACD/Labs Inc. The test sets were predicted with regard to the applicability domain. Comparison of the accuracy of QSAR models obtained separately using QNA descriptors, PASS predictions and nearest neighbours' assessment with that of consensus models clearly demonstrated the benefits of consensus prediction. A freely available web service for prediction of LD50 values of rat acute toxicity was developed: http://www.pharmaexpert.ru/GUSAR/AcuToxPredict/.
Affiliation(s)
- Alexey Lagunin
- Department for Bioinformatics, Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, Pogodinskaya Str., 10, Moscow, 119121, Russia phone/fax: +7 499 2553029/+7499 2450857.
- Alexey Zakharov
- Department for Bioinformatics, Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, Pogodinskaya Str., 10, Moscow, 119121, Russia phone/fax: +7 499 2553029/+7499 2450857
- Dmitry Filimonov
- Department for Bioinformatics, Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, Pogodinskaya Str., 10, Moscow, 119121, Russia phone/fax: +7 499 2553029/+7499 2450857
- Vladimir Poroikov
- Department for Bioinformatics, Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, Pogodinskaya Str., 10, Moscow, 119121, Russia phone/fax: +7 499 2553029/+7499 2450857
43
Rodgers SL, Davis AM, Tomkinson NP, van de Waterbeemd H. Predictivity of Simulated ADME AutoQSAR Models over Time. Mol Inform 2011; 30:256-66. [DOI: 10.1002/minf.201000160] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 11/07/2010] [Accepted: 01/24/2011] [Indexed: 11/08/2022]
44
Global versus local QSPR models for persistent organic pollutants: balancing between predictivity and economy. Struct Chem 2011. [DOI: 10.1007/s11224-011-9764-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 10/18/2022]
45
Nasonov AF. Computational methods and software in computer-aided combinatorial library design. RUSS J GEN CHEM+ 2011. [DOI: 10.1134/s1070363210120248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
46
Abstract
This introductory chapter gives a brief overview of the history of cheminformatics, and then summarizes some recent trends in computing, cultures, open systems, chemical structure representation, docking, de novo design, fragment-based drug design, molecular similarity, quantitative structure-activity relationships (QSAR), metabolite prediction, the use of pharmacophores in drug discovery, data reduction and visualization, and text mining. The aim is to set the scene for the more detailed exposition of these topics in the later chapters.
Affiliation(s)
- Wendy A Warr
- Wendy Warr & Associates, Holmes Chapel, Cheshire, UK
47
Lozano S, Halm-Lemeille MP, Lepailleur A, Rault S, Bureau R. Consensus QSAR Related to Global or MOA Models: Application to Acute Toxicity for Fish. Mol Inform 2010; 29:803-13. [DOI: 10.1002/minf.201000104] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/03/2010] [Accepted: 10/26/2010] [Indexed: 11/06/2022]
|
48
|
Hewitt M, Ellison CM. Developing the Applicability Domain of In Silico Models: Relevance, Importance and Methods. IN SILICO TOXICOLOGY 2010. [DOI: 10.1039/9781849732093-00301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/29/2023]
Abstract
The past two decades have seen rapid growth in the development and utilisation of computational technologies to predict the toxicity of chemicals. Most notably, widespread pressure to both reduce and replace current animal testing regimes has led to in silico modelling becoming a widely utilised tool in toxicological screening. Unfortunately, given that computational models are open to misuse, there has been, and still is, significant reluctance to accept them for regulatory use. In an effort to combat this, the validation of both model and predictions is now at the forefront of research, with the concept of applicability domain being central to the validation process.
In this chapter the applicability domain concept is defined and numerous methods for its characterisation are detailed and explored with the aid of a case study example. These approaches are shown to span from relatively simple descriptor-based methods to more complex approaches based upon structural similarity or mechanism of action. Given the wealth of differing approaches available and the different information each method yields about the model, a stepwise scheme which considers numerous methods is recommended. With appreciation of model architecture and subsequent utilisation, this chapter shows that a robust and multifaceted applicability domain can be generated. Once defined, the applicability domain serves as a critical screening stage ensuring that a model is fit-for-purpose and predictions are made with maximal confidence.
Affiliation(s)
- M. Hewitt
- School of Pharmacy and Chemistry, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK
- C. M. Ellison
- School of Pharmacy and Chemistry, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK
|
49
|
Abstract
Expert systems offer the facility to predict a toxicity endpoint, and sometimes additional relevant information as well, simply by inputting the chemical structure of a compound. There are now a number of expert systems available, mostly on a commercial basis, although a few are free to use or download. This chapter discusses nineteen currently available expert systems, and their performances (if known). Published studies of consensus predictions with these expert systems indicate that these give better results than do individual expert systems.
A test set of compounds with Tetrahymena pyriformis toxicities has been run through the two expert systems known to predict these toxicities; the predictions were quite good, with standard errors of prediction of 0.395 and 0.433 log units. A further test set of compounds with local lymph node assay skin sensitisation data has been run through seven expert systems, and it was found that consensus predictions were better than those from any individual expert system.
Affiliation(s)
- J. C. Dearden
- School of Pharmacy and Chemistry, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK
|
50
|
Ghafourian T, Bozorgi AHA. Estimation of drug solubility in water, PEG 400 and their binary mixtures using the molecular structures of solutes. Eur J Pharm Sci 2010; 40:430-40. [DOI: 10.1016/j.ejps.2010.04.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 01/04/2010] [Revised: 03/30/2010] [Accepted: 04/29/2010] [Indexed: 10/19/2022]
|