1. Ortega-Vallbona R, Palomino-Schätzlein M, Tolosa L, Benfenati E, Ecker GF, Gozalbes R, Serrano-Candelas E. Computational Strategies for Assessing Adverse Outcome Pathways: Hepatic Steatosis as a Case Study. Int J Mol Sci 2024; 25:11154. [PMID: 39456937 PMCID: PMC11508863 DOI: 10.3390/ijms252011154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Revised: 10/10/2024] [Accepted: 10/11/2024] [Indexed: 10/28/2024] Open access
Abstract
The evolving landscape of chemical risk assessment is increasingly focused on developing tiered, mechanistically driven approaches that avoid the use of animal experiments. In this context, adverse outcome pathways have gained importance for evaluating various types of chemical-induced toxicity. Using hepatic steatosis as a case study, this review explores the use of diverse computational techniques, such as structure-activity relationship models, quantitative structure-activity relationship models, read-across methods, omics data analysis, and structure-based approaches to fill data gaps within adverse outcome pathway networks. Emphasizing the regulatory acceptance of each technique, we examine how these methodologies can be integrated to provide a comprehensive understanding of chemical toxicity. This review highlights the transformative impact of in silico techniques in toxicology, proposing guidelines for their application in evidence gathering for developing and filling data gaps in adverse outcome pathway networks. These guidelines can be applied to other cases, advancing the field of toxicological risk assessment.
Affiliation(s)
- Rita Ortega-Vallbona
  - ProtoQSAR S.L., Calle Nicolás Copérnico 6, Parque Tecnológico de Valencia, 46980 Paterna, Spain
- Martina Palomino-Schätzlein
  - ProtoQSAR S.L., Calle Nicolás Copérnico 6, Parque Tecnológico de Valencia, 46980 Paterna, Spain
- Laia Tolosa
  - Unidad de Hepatología Experimental, Instituto de Investigación Sanitaria La Fe (IIS La Fe), Av. Fernando Abril Martorell 106, 46026 Valencia, Spain
  - Biomedical Research Networking Center on Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Instituto de Salud Carlos III, C/Monforte de Lemos, 28029 Madrid, Spain
- Emilio Benfenati
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milan, Italy
- Gerhard F. Ecker
  - Department of Pharmaceutical Sciences, University of Vienna, Josef-Holaubek Platz 2, 1090 Wien, Austria
- Rafael Gozalbes
  - ProtoQSAR S.L., Calle Nicolás Copérnico 6, Parque Tecnológico de Valencia, 46980 Paterna, Spain
  - MolDrug AI Systems S.L., Olimpia Arozena Torres 45, 46108 Valencia, Spain
- Eva Serrano-Candelas
  - ProtoQSAR S.L., Calle Nicolás Copérnico 6, Parque Tecnológico de Valencia, 46980 Paterna, Spain

2. Torres LHM, Arrais JP, Ribeiro B. Combining graph neural networks and transformers for few-shot nuclear receptor binding activity prediction. J Cheminform 2024; 16:109. [PMID: 39334272 PMCID: PMC11429188 DOI: 10.1186/s13321-024-00902-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 09/05/2024] [Indexed: 09/30/2024] Open access
Abstract
Nuclear receptors (NRs) play a crucial role as biological targets in drug discovery. However, determining which compounds can act as endocrine disruptors and modulate the function of NRs with a reduced amount of candidate drugs is a challenging task. Moreover, the computational methods for NR-binding activity prediction mostly focus on a single receptor at a time, which may limit their effectiveness. Hence, the transfer of learned knowledge among multiple NRs can improve the performance of molecular predictors and lead to the development of more effective drugs. In this research, we integrate graph neural networks (GNNs) and Transformers to introduce a few-shot GNN-Transformer, Meta-GTNRP, to predict the binding activity of compounds using the combined information of different NRs and identify potential NR-modulators with limited data. The Meta-GTNRP model captures the local information in graph-structured data and preserves the global-semantic structure of molecular graph embeddings for NR-binding activity prediction. Furthermore, a few-shot meta-learning approach is proposed to optimize model parameters for different NR-binding tasks and leverage the complementarity among multiple NR-specific tasks to predict binding activity of compounds for each NR with just a few labeled molecules. Experiments with a compound database containing annotations on the binding activity for 11 NRs show that Meta-GTNRP outperforms other graph-based approaches.
The data and code are available at: https://github.com/ltorres97/Meta-GTNRP.
Scientific contribution: The proposed few-shot GNN-Transformer model, Meta-GTNRP, captures the local structure of molecular graphs and preserves the global-semantic information of graph embeddings to predict the NR-binding activity of compounds with limited available data; a few-shot meta-learning framework adapts model parameters across NR-specific tasks for different NRs in a joint learning procedure to predict the binding activity of compounds for each NR with just a few labeled molecules in highly imbalanced data scenarios; Meta-GTNRP is a data-efficient approach that combines the strengths of GNNs and Transformers to predict the NR-binding properties of compounds through an optimized meta-learning procedure and deliver robust results valuable to identify potential NR-based drug candidates.
Affiliation(s)
- Luis H M Torres
  - Department of Informatics Engineering, Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Coimbra, 3030-790, Portugal
- Joel P Arrais
  - Department of Informatics Engineering, Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Coimbra, 3030-790, Portugal
- Bernardete Ribeiro
  - Department of Informatics Engineering, Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Coimbra, 3030-790, Portugal

3. Ryzhkov FV, Ryzhkova YE, Elinson MN. Python tools for structural tasks in chemistry. Mol Divers 2024:10.1007/s11030-024-10889-7. [PMID: 38744790 DOI: 10.1007/s11030-024-10889-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 04/27/2024] [Indexed: 05/16/2024]
Abstract
In recent decades, the use of computational approaches and artificial intelligence in the scientific environment has become more widespread. In this regard, the popular and versatile programming language Python has attracted considerable attention from scientists in the field of chemistry. It is used to solve a variety of chemical and structural problems, including calculating descriptors, molecular fingerprints, graph construction, and computing chemical reaction networks. Python offers high-quality visualization tools for analyzing chemical spaces and compound libraries. This review is a list of tools for the above tasks, including scripts, libraries, ready-made programs, and web interfaces. Inevitably this manuscript does not claim to be an all-encompassing handbook including all the existing Python-based structural chemistry codes. The review serves as a starting point for scientists wishing to apply automatization or optimization to routine chemistry problems.
Affiliation(s)
- Fedor V Ryzhkov
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Yuliya E Ryzhkova
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia
- Michail N Elinson
  - N. D. Zelinsky Institute of Organic Chemistry Russian Academy of Sciences, 47 Leninsky Prospekt, Moscow, 119991, Russia

4. Temizer AB, Uludoğan G, Özçelik R, Koulani T, Ozkirimli E, Ulgen KO, Karali N, Özgür A. Exploring data-driven chemical SMILES tokenization approaches to identify key protein-ligand binding moieties. Mol Inform 2024; 43:e202300249. [PMID: 38196065 DOI: 10.1002/minf.202300249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/13/2023] [Accepted: 01/06/2024] [Indexed: 01/11/2024]
Abstract
Machine learning models have found numerous successful applications in computational drug discovery. A large body of these models represents molecules as sequences since molecular sequences are easily available, simple, and informative. The sequence-based models often segment molecular sequences into pieces called chemical words, analogous to the words that make up sentences in human languages, and then apply advanced natural language processing techniques for tasks such as de novo drug design, property prediction, and binding affinity prediction. However, the chemical characteristics and significance of these building blocks, chemical words, remain unexplored. To address this gap, we employ data-driven SMILES tokenization techniques such as Byte Pair Encoding, WordPiece, and Unigram to identify chemical words and compare the resulting vocabularies. To understand the chemical significance of these words, we build a language-inspired pipeline that treats high affinity ligands of protein targets as documents and selects key chemical words making up those ligands based on tf-idf weighting. The experiments on multiple protein-ligand affinity datasets show that despite differences in words, lengths, and validity among the vocabularies generated by different subword tokenization algorithms, the identified key chemical words exhibit similarity. Further, we conduct case studies on a number of targets to analyze the impact of key chemical words on binding. We find that these key chemical words are specific to protein targets and correspond to known pharmacophores and functional groups. Our approach elucidates chemical properties of the words identified by machine learning models and can be used in drug discovery studies to determine significant chemical moieties.
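The tf-idf selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the ligands of each target have already been tokenized into chemical words; the function name `key_chemical_words`, the target names, and the toy tokens are hypothetical and are not taken from the paper, whose exact weighting scheme may differ.

```python
import math
from collections import Counter


def key_chemical_words(documents, top_k=3):
    """Rank chemical words per target by a plain tf-idf score.

    documents: dict mapping a target name to the list of chemical words
    pooled from its high-affinity ligands (one "document" per target).
    """
    n_docs = len(documents)
    # Document frequency: number of target documents containing each word.
    doc_freq = Counter()
    for words in documents.values():
        doc_freq.update(set(words))

    ranked = {}
    for target, words in documents.items():
        counts = Counter(words)
        total = len(words)
        # tf = relative frequency in the document, idf = log(N / df).
        scores = {
            w: (c / total) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ordered = sorted(scores, key=lambda w: (-scores[w], w))
        ranked[target] = ordered[:top_k]
    return ranked


# Hypothetical toy data: SMILES fragments standing in for chemical words.
docs = {
    "kinase_A": ["c1ccccc1", "C(=O)N", "c1ccncc1", "c1ccncc1"],
    "protease_B": ["c1ccccc1", "C(=O)N", "S(=O)(=O)N"],
    "gpcr_C": ["c1ccccc1", "CCN", "CCN", "C(=O)N"],
}
top = key_chemical_words(docs, top_k=1)
```

Words shared by every target (here the benzene ring and the amide) receive an idf of zero, so only target-specific words surface as key words.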
Affiliation(s)
- Asu Busra Temizer
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
  - Department of Pharmaceutical Chemistry, Institute of Health Sciences, İstanbul University, İstanbul, Turkey
- Gökçe Uludoğan
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
- Rıza Özçelik
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
- Taha Koulani
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
  - Department of Pharmaceutical Chemistry, Institute of Health Sciences, İstanbul University, İstanbul, Turkey
- Elif Ozkirimli
  - Science and Research Informatics, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Kutlu O Ulgen
  - Department of Chemical Engineering, Boğaziçi University, İstanbul, Turkey
- Nilgun Karali
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
- Arzucan Özgür
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey

5. Mittal A, Ahuja G. Advancing chemical carcinogenicity prediction modeling: opportunities and challenges. Trends Pharmacol Sci 2023; 44:400-410. [PMID: 37183054 DOI: 10.1016/j.tips.2023.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 04/11/2023] [Accepted: 04/18/2023] [Indexed: 05/16/2023]
Abstract
Carcinogenicity assessment of any compound is a laborious and expensive exercise with several associated ethical and practical concerns. While artificial intelligence (AI) offers promising solutions, unfortunately, it is contingent on several challenges concerning the inadequacy of available experimentally validated (non)carcinogen datasets and variabilities within bioassays, which contribute to the compromised model training. Existing AI solutions that leverage classical chemistry-driven descriptors do not provide adequate biological interpretability involved in imparting carcinogenicity. This highlights the urgency to devise alternative AI strategies. We propose multiple strategies, including implementing data-driven (integrated databases) and known carcinogen-characteristic-derived features to overcome these apparent shortcomings. In summary, these next-generation approaches will continue facilitating robust chemical carcinogenicity prediction, concomitant with deeper mechanistic insights.
Affiliation(s)
- Aayushi Mittal
  - Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, 110020, India
- Gaurav Ahuja
  - Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), Okhla, Phase III, New Delhi, 110020, India

6. Kong W, Huang W, Peng C, Zhang B, Duan G, Ma W, Huang Z. Multiple machine learning methods aided virtual screening of NaV1.5 inhibitors. J Cell Mol Med 2022; 27:266-276. [PMID: 36573431 PMCID: PMC9843531 DOI: 10.1111/jcmm.17652] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/28/2022] [Revised: 10/30/2022] [Accepted: 12/06/2022] [Indexed: 12/28/2022] Open access
Abstract
NaV1.5 sodium channels contribute to the generation of the rapid upstroke of the myocardial action potential and thereby play a central role in the excitability of myocardial cells. At present, the patch clamp method is the gold standard for ion channel inhibitor screening. However, this method has disadvantages such as high technical difficulty, high cost and low speed. In this study, novel machine learning models to screen chemical blockers were developed to overcome these shortcomings. The data from the ChEMBL Database were employed to establish the machine learning models. Firstly, six molecular fingerprints together with five machine learning algorithms were used to develop 30 classification models to predict effective inhibitors. A validation and a test set were used to evaluate the performance of the models. Subsequently, the privileged substructures tightly associated with the inhibition of the NaV1.5 ion channel were extracted using the bioalerts Python package. In the validation set, the RF-Graph model performed best. Similarly, RF-Graph produced the best result in the test set, in which the prediction accuracy (Q) was 0.9309 and the Matthews correlation coefficient was 0.8627, further indicating the model had high classification ability. The results of the privileged substructures indicated that sulfa structures and fragments with large steric hindrance tend to block NaV1.5. In the unsupervised learning task of identifying sulfa drugs, MACCS and Graph fingerprints had good results. In summary, effective machine learning models have been constructed which help to screen potential inhibitors of the NaV1.5 ion channel, and key privileged substructures with high affinity were also extracted.
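The two metrics reported in this abstract, prediction accuracy (Q) and the Matthews correlation coefficient, are both computed from a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and do not reproduce the paper's data.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for a balanced test set of 100 compounds.
q, mcc = accuracy_and_mcc(tp=45, tn=45, fp=5, fn=5)
```

Unlike accuracy, the Matthews coefficient stays informative on imbalanced sets, because it uses all four cells of the confusion matrix.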
Affiliation(s)
- Weikaixin Kong
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
  - Institute Sanqu Technology (Hangzhou) Co., Ltd., Hangzhou, China
- Weiran Huang
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
- Chao Peng
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
- Bowen Zhang
  - ComMedX (Computational Medicine Beijing Co., Ltd.), Beijing, China
- Guifang Duan
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
- Weining Ma
  - Department of Neurology, Shengjing Hospital affiliated to China Medical University, Shenyang, China
- Zhuo Huang
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, Beijing, China

7. Yang Y, Wu Z, Yao X, Kang Y, Hou T, Hsieh CY, Liu H. Exploring Low-Toxicity Chemical Space with Deep Learning for Molecular Generation. J Chem Inf Model 2022; 62:3191-3199. [PMID: 35713712 DOI: 10.1021/acs.jcim.2c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Creating a wide range of new compounds that not only have ideal pharmacological properties but also easily pass long-term toxicity evaluation is still a challenging task in current drug discovery. In this study, we developed a conditional generative model by combining a semisupervised variational autoencoder (SSVAE) with an MGA toxicity predictor. Our aim is to generate molecules with low toxicity, good drug-like properties, and structural diversity. For multiobjective optimization, we have developed a method with hierarchical constraints on the toxicity space of small molecules to generate drug-like small molecules, which can also minimize the effect on the diversity of generated results. The evaluation results of the metrics indicate that the developed model has good effectiveness, novelty, and diversity. The generated molecules by this model are mainly distributed in low-toxicity regions, which suggests that our model can efficiently constrain the generation of toxic structures. In contrast to simply filtering toxic ones after generation, the low-toxicity molecular generative model can generate molecules with structural diversity. Our strategy can be used in target-based drug discovery to improve the quality of generated molecules with low-toxicity, drug-like, and highly active properties.
Affiliation(s)
- Yuwei Yang
  - School of Pharmacy, Lanzhou University, Lanzhou 730000, China
- Zhenxing Wu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Xiaojun Yao
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory, Tencent, Shenzhen 518000, China
- Huanxiang Liu
  - School of Pharmacy, Lanzhou University, Lanzhou 730000, China
  - Faculty of Applied Science, Macao Polytechnic University, Macao, SAR 999078, China

8. Liu Q, Jiang Y, Zhang L, Du J. A computational toolbox for molecular property prediction based on quantum mechanics and quantitative structure-property relationship. Front Chem Sci Eng 2021. [DOI: 10.1007/s11705-021-2060-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]

9. Chikowe I, Phiri AC, Mbewe KP, Matekenya D. In-silico evaluation of Malawi essential medicines and reactive metabolites for potential drug-induced toxicities. BMC Pharmacol Toxicol 2021; 22:36. [PMID: 34134770 PMCID: PMC8207713 DOI: 10.1186/s40360-021-00499-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/10/2020] [Accepted: 05/10/2021] [Indexed: 11/10/2022] Open access
Abstract
BACKGROUND Drug-induced toxicity is one of the problems that have negatively impacted on the well-being of populations throughout the world, including Malawi. It results in unnecessary hospitalizations, retarding the development of the country. This study assessed the Malawi Essential Medicines List (MEML) for structural alerts and reactive metabolites with the potential for drug-induced toxicities. METHODS This in-silico screening study used StopTox, ToxAlerts and LD-50 values toxicity models to assess the MEML drugs. A total of 296 drugs qualified for the analysis (those that had defined chemical structures) and were screened in each software programme. Each model had its own toxicity endpoints and the models were compared for consensus of their results. RESULTS In the StopTox model, 86% of the drugs had potential to cause at least one toxicity including 55% that had the potential of causing eye irritation and corrosion. In ToxAlerts, 90% of the drugs had the potential of causing at least one toxicity and 72% were found to be potentially reactive, unstable and toxic. In LD-50, 70% of the drugs were potentially toxic. Model consensus evaluation results showed that the highest consensus was observed between ToxAlerts and StopTox (80%). The overall consensus amongst the three models was 57% and statistically significant (p < 0.05). CONCLUSIONS A large number of drugs had the potential to cause various systemic toxicities. But the results need to be interpreted cautiously since the clinical translation of QSAR-based predictions depends on many factors. In addition, inconsistencies have been reported between screening results amongst different models.
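The abstract does not define how consensus between models was calculated. One plausible reading, sketched here purely as an assumption, is the fraction of drugs on which the models give the same toxic/non-toxic call. The function names and the toy calls are hypothetical; only the three model names come from the abstract.

```python
from itertools import combinations


def pairwise_consensus(calls):
    """Fraction of drugs with identical calls, for every pair of models.

    calls: dict mapping a model name to a list of booleans (True = flagged
    as potentially toxic), with all lists in the same drug order.
    """
    result = {}
    for a, b in combinations(sorted(calls), 2):
        agree = sum(x == y for x, y in zip(calls[a], calls[b]))
        result[(a, b)] = agree / len(calls[a])
    return result


def overall_consensus(calls):
    """Fraction of drugs on which all models give the same call."""
    rows = list(zip(*calls.values()))
    return sum(len(set(row)) == 1 for row in rows) / len(rows)


# Hypothetical calls for four drugs; not data from the study.
toy_calls = {
    "StopTox": [True, True, False, True],
    "ToxAlerts": [True, True, True, True],
    "LD50": [True, False, False, True],
}
pairs = pairwise_consensus(toy_calls)
overall = overall_consensus(toy_calls)
```

Because each model in the study has its own toxicity endpoints, a real comparison would first need to map those endpoints onto a common binary call, which this sketch takes as given.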
Affiliation(s)
- Ibrahim Chikowe
  - Pharmacy Department, College of Medicine, University of Malawi, Blantyre, Malawi
- Kirios Patrick Mbewe
  - Pharmacy Department, College of Medicine, University of Malawi, Blantyre, Malawi

10. Yang ZY, Yang ZJ, Zhao Y, Yin MZ, Lu AP, Chen X, Liu S, Hou TJ, Cao DS. PySmash: Python package and individual executable program for representative substructure generation and application. Brief Bioinform 2021; 22:6168498. [PMID: 33709154 DOI: 10.1093/bib/bbab017] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/07/2020] [Revised: 01/06/2021] [Accepted: 01/12/2021] [Indexed: 01/23/2023] Open access
Abstract
BACKGROUND Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of more experimental data, data-driven computational systems which can derive representative substructures from large chemical libraries attract more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed. RESULTS In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and an individual executable program, which achieves ease of operation and pipeline integration. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. Besides, PySmash provides the function for external data screening. CONCLUSION PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.
Affiliation(s)
- Zi-Yi Yang
  - Department of Pharmacy, Xiangya Hospital, Central South University and the Xiangya School of Pharmaceutical Sciences, Central South University, Sichuan, China
- Zhi-Jiang Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Hunan, China
- Yue Zhao
  - Xiangya School of Pharmaceutical Sciences, Central South University (Changsha), Sichuan, China
- Ming-Zhu Yin
  - Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Hunan
- Ai-Ping Lu
  - Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong
- Xiang Chen
  - Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Hunan
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Hunan
- Ting-Jun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, China

11. Sedykh AY, Shah RR, Kleinstreuer NC, Auerbach SS, Gombar VK. Saagar-A New, Extensible Set of Molecular Substructures for QSAR/QSPR and Read-Across Predictions. Chem Res Toxicol 2020; 34:634-640. [PMID: 33356152 DOI: 10.1021/acs.chemrestox.0c00464] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/22/2022]
Abstract
Molecular structure-based predictive models provide a proven alternative to costly and inefficient animal testing. However, due to a lack of interpretability of predictive models built with abstract molecular descriptors they have earned the notoriety of being black boxes. Interpretable models require interpretable descriptors to provide chemistry-backed predictive reasoning and facilitate intelligent molecular design. We developed a novel set of extensible chemistry-aware substructures, Saagar, to support interpretable predictive models and read-across protocols. Performance of Saagar in chemical characterization and search for structurally similar actives for read-across applications was compared with four publicly available fingerprint sets (MACCS (166), PubChem (881), ECFP4 (1024), ToxPrint (729)) in three benchmark sets (MUV, ULS, and Tox21) spanning ∼145 000 compounds and 78 molecular targets at 1%, 2%, 5%, and 10% false discovery rates. In 18 of the 20 comparisons, interpretable Saagar features performed better than the publicly available, but less interpretable and fixed-bit length, fingerprints. Examples are provided to show the enhanced capability of Saagar in extracting compounds with higher scaffold similarity. Saagar features are interpretable and efficiently characterize diverse chemical collections, thus making them a better choice for building interpretable predictive in silico models and read-across protocols.
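The read-across search described in this abstract depends on ranking library compounds by similarity of their substructure features to a query. The sketch below uses the Tanimoto coefficient on feature sets, which is a common choice for fingerprint similarity but is an assumption here, since the abstract does not name the measure. The feature labels and compound names are hypothetical and are not actual Saagar features.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_analogues(query, library, k=2):
    """Return the k most similar library compounds as (score, name) pairs."""
    scored = [(tanimoto(query, feats), name) for name, feats in library.items()]
    # Highest similarity first; ties broken by name for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:k]


# Hypothetical substructure features for a query and three candidates.
query = {"aromatic_ring", "sulfonamide", "primary_amine"}
library = {
    "cmpd_1": {"aromatic_ring", "sulfonamide"},
    "cmpd_2": {"aromatic_ring", "carboxylic_acid"},
    "cmpd_3": {"aromatic_ring", "sulfonamide", "primary_amine", "halogen"},
}
hits = nearest_analogues(query, library, k=2)
```

With named substructures as features, each shared or missing feature behind a similarity score can be read off directly, which is the kind of interpretability the abstract emphasizes.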
Affiliation(s)
- Ruchir R Shah
  - Sciome LLC, Research Triangle Park, North Carolina 27709, United States
- Nicole C Kleinstreuer
  - National Institute of Environmental Health Sciences (NIEHS), National Toxicology Program (NTP), Research Triangle Park, North Carolina 27709, United States
- Scott S Auerbach
  - National Institute of Environmental Health Sciences (NIEHS), National Toxicology Program (NTP), Research Triangle Park, North Carolina 27709, United States
- Vijay K Gombar
  - Sciome LLC, Research Triangle Park, North Carolina 27709, United States

12. Kong W, Tu X, Huang W, Yang Y, Xie Z, Huang Z. Prediction and Optimization of NaV1.7 Sodium Channel Inhibitors Based on Machine Learning and Simulated Annealing. J Chem Inf Model 2020; 60:2739-2753. [DOI: 10.1021/acs.jcim.9b01180] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Affiliation(s)
- Weikaixin Kong
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China
- Xinyu Tu
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China
- Weiran Huang
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China
- Yang Yang
  - Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907, United States
- Zhengwei Xie
  - Peking University International Cancer Institute and Department of Pharmacology, School of Basic Medical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China
- Zhuo Huang
  - Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Molecular and Cellular Pharmacology, School of Pharmaceutical Sciences, Peking University Health Science Center, 38 Xueyuan Lu, Haidian district, Beijing 100191, China

13. Lee YO, Kim YJ. The Effect of Resampling on Data-imbalanced Conditions for Prediction towards Nuclear Receptor Profiling Using Deep Learning. Mol Inform 2020; 39:e1900131. [DOI: 10.1002/minf.201900131] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/24/2019] [Accepted: 01/25/2020] [Indexed: 11/11/2022]
Affiliation(s)
- Yong Oh Lee
  - Smart Convergence Group, KIST Europe, Saarbrücken 66123, Germany
- Young Jun Kim
  - Environmental Safety Group, KIST Europe, Saarbrücken 66123, Germany

14. Hemmerich J, Ecker GF. In silico toxicology: From structure–activity relationships towards deep learning and adverse outcome pathways. Wiley Interdiscip Rev Comput Mol Sci 2020; 10:e1475. [PMID: 35866138 PMCID: PMC9286356 DOI: 10.1002/wcms.1475] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 01/07/2020] [Revised: 03/09/2020] [Accepted: 03/10/2020] [Indexed: 12/18/2022]
Abstract
In silico toxicology is an emerging field. It is gaining importance as research aims to decrease the use of animal experiments, as suggested in the 3R principles by Russell and Burch. In silico toxicology is a means to identify hazards of compounds before synthesis, and thus in very early stages of drug development. For chemical industries, as well as regulatory agencies, it can aid in gap-filling and guide risk minimization strategies. Techniques such as structural alerts, read-across, quantitative structure–activity relationship, machine learning, and deep learning allow in silico toxicology to be used in many cases, some even when data are scarce. In particular, the concept of adverse outcome pathways puts all techniques into a broader context and can elucidate predictions by mechanistic insights. This article is categorized under: Structure and Mechanism > Computational Biochemistry and Biophysics; Data Science > Chemoinformatics
Affiliation(s)
- Jennifer Hemmerich
  - Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
- Gerhard F. Ecker
  - Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
|
15
|
Drakakis G, Cortés-Ciriano I, Alexander-Dann B, Bender A. Elucidating Compound Mechanism of Action and Predicting Cytotoxicity Using Machine Learning Approaches, Taking Prediction Confidence into Account. Curr Protoc Chem Biol 2020; 11:e73. [PMID: 31483099 DOI: 10.1002/cpch.73] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]
Abstract
The modes of action (MoAs) of drugs frequently are unknown, because many are small molecules initially identified from phenotypic screens, giving rise to the need to elucidate their MoAs. In addition, the high attrition rate for candidate drugs in preclinical studies due to intolerable toxicity has motivated the development of computational approaches to predict drug candidate (cyto)toxicity as early as possible in the drug-discovery process. Here, we provide detailed instructions for capitalizing on bioactivity predictions to elucidate the MoAs of small molecules and infer their underlying phenotypic effects. We illustrate how these predictions can be used to infer the underlying antidepressive effects of marketed drugs. We also provide the necessary functionalities to model cytotoxicity data using single and ensemble machine-learning algorithms. Finally, we give detailed instructions on how to calculate confidence intervals for individual predictions using the conformal prediction framework. © 2019 by John Wiley & Sons, Inc.
Affiliation(s)
- Georgios Drakakis
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Isidro Cortés-Ciriano
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Ben Alexander-Dann
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
|
16
|
Yang H, Lou C, Li W, Liu G, Tang Y. Computational Approaches to Identify Structural Alerts and Their Applications in Environmental Toxicology and Drug Discovery. Chem Res Toxicol 2020; 33:1312-1322. [DOI: 10.1021/acs.chemrestox.0c00006] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/18/2022]
Affiliation(s)
- Hongbin Yang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Chaofeng Lou
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
17
|
Cortés-Ciriano I, Firth NC, Bender A, Watson O. Discovering Highly Potent Molecules from an Initial Set of Inactives Using Iterative Screening. J Chem Inf Model 2018; 58:2000-2014. [PMID: 30130102 DOI: 10.1021/acs.jcim.8b00376] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 02/07/2023]
Abstract
The versatility of similarity searching and quantitative structure-activity relationships to model the activity of compound sets within given bioactivity ranges (i.e., interpolation) is well established. However, their relative performance in the common scenario in early stage drug discovery where lots of inactive data but no active data points are available (i.e., extrapolation from the low-activity to the high-activity range) has not been thoroughly examined yet. To this aim, we have designed an iterative virtual screening strategy which was evaluated on 25 diverse bioactivity data sets from ChEMBL. We benchmark the efficiency of random forest (RF), multiple linear regression, ridge regression, similarity searching, and random selection of compounds to identify a highly active molecule in the test set among a large number of low-potency compounds. We use the number of iterations required to find this active molecule to evaluate the performance of each experimental setup. We show that linear and ridge regression often outperform RF and similarity searching, reducing the number of iterations to find an active compound by a factor of 2 or more. Even simple regression methods seem better able to extrapolate to high-bioactivity ranges than RF, which only provides output values in the range covered by the training set. In addition, examination of the scaffold diversity in the data sets used shows that in some cases similarity searching and RF require two times as many iterations as random selection depending on the chemical space covered in the initial training data. Lastly, we show using bioactivity data for COX-1 and COX-2 that our framework can be extended to multitarget drug discovery, where compounds are selected by concomitantly considering their activity against multiple targets. 
Overall, this study provides an approach for iterative screening where only inactive data are present in early stages of drug discovery in order to discover highly potent compounds and the best experimental set up in which to do so.
Affiliation(s)
- Isidro Cortés-Ciriano
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Nicholas C Firth
  - Centre for Medical Image Computing, Department of Computer Science, UCL, London WC1E 6BT, United Kingdom; Evariste Technologies Ltd, Goring on Thames RG8 9AL, United Kingdom
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Oliver Watson
  - Evariste Technologies Ltd, Goring on Thames RG8 9AL, United Kingdom
|
18
|
Yang H, Sun L, Li W, Liu G, Tang Y. Identification of Nontoxic Substructures: A New Strategy to Avoid Potential Toxicity Risk. Toxicol Sci 2018; 165:396-407. [DOI: 10.1093/toxsci/kfy146] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022] [Open Access]
Affiliation(s)
- Hongbin Yang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Lixia Sun
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
19
|
Zheng S, Jiang M, Zhao C, Zhu R, Hu Z, Xu Y, Lin F. e-Bitter: Bitterant Prediction by the Consensus Voting From the Machine-Learning Methods. Front Chem 2018; 6:82. [PMID: 29651416 PMCID: PMC5885771 DOI: 10.3389/fchem.2018.00082] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 12/17/2017] [Accepted: 03/12/2018] [Indexed: 11/25/2022] [Open Access]
Abstract
In-silico bitterant prediction received the considerable attention due to the expensive and laborious experimental-screening of the bitterant. In this work, we collect the fully experimental dataset containing 707 bitterants and 592 non-bitterants, which is distinct from the fully or partially hypothetical non-bitterant dataset used in the previous works. Based on this experimental dataset, we harness the consensus votes from the multiple machine-learning methods (e.g., deep learning etc.) combined with the molecular fingerprint to build the bitter/bitterless classification models with five-fold cross-validation, which are further inspected by the Y-randomization test and applicability domain analysis. One of the best consensus models affords the accuracy, precision, specificity, sensitivity, F1-score, and Matthews correlation coefficient (MCC) of 0.929, 0.918, 0.898, 0.954, 0.936, and 0.856 respectively on our test set. For the automatic prediction of bitterant, a graphic program “e-Bitter” is developed for the convenience of users via the simple mouse click. To our best knowledge, it is for the first time to adopt the consensus model for the bitterant prediction and develop the first free stand-alone software for the experimental food scientist.
Affiliation(s)
- Suqing Zheng
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China; Chemical Biology Research Center, Wenzhou Medical University, Wenzhou, China
- Mengying Jiang
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China
- Chengwei Zhao
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China
- Rui Zhu
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China
- Zhicheng Hu
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China
- Yong Xu
  - Center of Chemical Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
- Fu Lin
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, China
|
20
|
Yang H, Sun L, Li W, Liu G, Tang Y. In Silico Prediction of Chemical Toxicity for Drug Design Using Machine Learning Methods and Structural Alerts. Front Chem 2018; 6:30. [PMID: 29515993 PMCID: PMC5826228 DOI: 10.3389/fchem.2018.00030] [Citation(s) in RCA: 108] [Impact Index Per Article: 18.0] [Received: 01/11/2018] [Accepted: 02/05/2018] [Indexed: 12/17/2022] [Open Access]
Abstract
During drug development, safety is always the most important issue, including a variety of toxicities and adverse drug effects, which should be evaluated in preclinical and clinical trial phases. This review article at first simply introduced the computational methods used in prediction of chemical toxicity for drug design, including machine learning methods and structural alerts. Machine learning methods have been widely applied in qualitative classification and quantitative regression studies, while structural alerts can be regarded as a complementary tool for lead optimization. The emphasis of this article was put on the recent progress of predictive models built for various toxicities. Available databases and web servers were also provided. Though the methods and models are very helpful for drug design, there are still some challenges and limitations to be improved for drug safety assessment in the future.
Affiliation(s)
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
|
21
|
Yang H, Li J, Wu Z, Li W, Liu G, Tang Y. Evaluation of Different Methods for Identification of Structural Alerts Using Chemical Ames Mutagenicity Data Set as a Benchmark. Chem Res Toxicol 2017; 30:1355-1364. [DOI: 10.1021/acs.chemrestox.7b00083] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 11/28/2022]
Affiliation(s)
- Hongbin Yang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Jie Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zengrui Wu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
22
|
Abstract
The success of molecular modeling and computational chemistry efforts are, by definition, dependent on quality software applications. Open source software development provides many advantages to users of modeling applications, not the least of which is that the software is free and completely extendable. In this review we categorize, enumerate, and describe available open source software packages for molecular modeling and computational chemistry. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io.
|